If you’ve implemented sales automation tools at any kind of scale, you’ve likely run into the limitations of the Salesforce Activity object.
The ones we hear about the most are:
- Limit to 100 custom fields: this means that you have to audit and ration your field usage every time you install a new sales tool (since many of them try to create 20+ fields each.
- Lack of support for custom LONG TEXT fields: this prevents you from storing long form text or metadata on the activity object itself.
- Only a single ‘WhatId’ and ‘WhoId’ value to store relationship mappings to people and objects: this means you can’t construct complex queries across objects.
- Inability to do JOINS between Activities, Events and Other Objects: this means that in order to construct complex queries, you need to do the joins yourself in an ETL.
Why Does Salesforce Do This?
To understand why Salesforce imposes these constraints, you have to know how it works under the hood.
Relational Database Architecture
Salesforce’s entire object and reporting model is built on Oracle’s Relational Database. Relational DBs are the most popular type of database on the planet because they come with a lot of inherent advantages.
These include:
- The ability to define a clear and highly customizable schema.
- The ability to enforce this schema (eg: don’t put text in a number field, insert two records at once to prevent incomplete data, etc.)
- The ability to build foreign keys that map different records together, allowing for easy querying and reporting.
Indeed, this is what makes Salesforce so great – so much of any given application is the data model, and by giving you such strong control over the data layer, Salesforce gives non-coders the ability to build enterprise grade applications that are highly reliable.
Performance Tradeoffs
BUT, like all things in life, this database architecture comes with some limitations. With relational DBs, these primarily live in the realm of performance:
- DB Size: more records tend to slow down reporting queries
- JOINS: the ability to combine data across multiple tables (eg: show me Activities with Opportunities) requires the reporting engine to essentially perform a VLOOKUP on the fly every single time you run a report, which consumes a lot of compute resources and potentially slows down you instance.
So, in giving you all of this data/reporting flexibility, Salesforce introduced the possibility that you could significantly degrade the performance of your instance if you created a schema and reporting system that was not configured right (which isn’t good for them, or you).
What does ‘degraded performance’ mean in this case? Potentially locking up the entire database so your users couldn’t access the system, and causing all reports to fail.
In order to prevent this, they introduced a few concepts in their architecture
First, Activities and Events are stored in a different DB than other objects like Accounts and Opportunities. This is because there are often 10-100x more Activity records than there are related accounts, and this means that there needs to be a different level of performance optimization. By segregating this data into a different DB, Salesforce’s engineers are able to perform a different level of tuning on that record type.
Second, Activities have limitations on the data they can contain. Every custom field means more data. More data means a larger database and more information to query/return for any given report. More data to manage means lower performance. Therefore, Salesforce decided to cap the number of custom fields they supported, and also made sure that the data each field supported was limited. This is why the only LONG/TEXT fields that are supported on the Activity is the native Comments field.
Third, Activities have limited foreign keys out of the box. A foreign key is a custom field that links two objects to each other. This is what allows you to run an Activity report and see the related Account as a clickable link (the Activity object knows that the Account is actually another object, and gives you the ability to connect these objects).
What If I Want to Get Around These Limitations?
Some limitations, you can’t get around. For example, no matter what you do, Salesforce just won’t let you add a TEXT field to the Activity object, or add more than 100 fields.
This means you need to have a strategy for how you think about rationing/allocating these fields to the many different use cases you support, whether it’s an internal requirement or one imposed on you by one of your many sales tools.
Field Mapping
One strategy you can employ is field mapping. For example, let’s say that you are using one dialing tool for SDRs, and another for AEs. It’s likely that both of these products will offer the ability to push “call duration” as a metric into SFDC. If your software vendors offer the ability to create a ‘field mapping’ between their call duration field and a field name that you specify in Salesforce, you can get multiple systems writing to the same field.
NOTE: the downside in this case is going to be that different systems may calculate metrics differently (eg: is does “call duration” include the time that the phone is ringing, or only the time after the phone is picked up?)
Use the Native Salesforce fields
Speaking of field mapping, Salesforce’s Activity object has a bunch of native fields that aren’t exposed by default in the Activity Page Layout. Examples of this are fields for Call_Duration and Task Type. If you can store information in these fields, you can forego creating a custom field altogether.
Abstracting Activity Tracking From Workflow Tools
Many sales tool vendors offer a bundled solution that combines workflow and activity tracking. This is partially because the vendors want to capture share of wallet, and partially because they need to prove the value of the workflow in SFDC (and need an activity logging mechanism to do so).
Most sales engagement platforms now allow you to turn CRM activity logging off, in order to delegate logging of activities to some other centralized tool in the stack. For example, our partner SalesLoft makes it possible to create a Call Task in a cadence, but delegates the creation of the activity to Truly’s phone system application. Since everyone in the company uses Truly as its voice solution, the company can track activity in a single dimension, with a single set of metric definitions.
Enhancing the Salesforce Data Model
note: this part is A LOT of work, and before we glance over it, we have to let you know there is an easier way to do this – check out Truly’s Unified Analytics for Salesforce Package to see how we automate all of this.
As we mentioned, the issue of scalability with relational databases is not a novel concept. Application developers deal with this all the time – sometimes, they do this by paying hundreds of thousands of dollars to create a Data Lake to delay the need to structure/organize their data.
However, more often than not, they build aggregate tables (aka custom objects) that precompute data across a large number of rows and compress them into one.
For example, let’s say you wanted to compute Connect Rate by Day. With SFDC’s default architecture, you would need to build a report that performs the following operation every single time the report runs:
- Fetch every single record that meets your criteria
- JOIN every record to a related object to get the relevant information (eg: Account Name)
- Perform a computation on every record to determine if it was a connect or not (maybe it’s call duration, maybe it’s comparing against a list of dispositions)
- Group all the records by day
- Divide the total number of connected records over the number of records for each day.
That is A LOT OF WORK
The alternative approach some developers would take if they were building an application outside of Salesforce would be performing a daily aggregation, where they take all of the calls from the last 24 hours, perform all this computation once, and then create a new aggregate record that contains this information in a single row.
Then, when someone runs a report the next day to see connect rate by day, they are querying a handful of aggregate records, instead of the thousands of records that are contained in the raw data.