What is Salesforce Data Cloud?

What is Salesforce Data Cloud - David Palencia

The Data Problem

Today, data-driven companies have a significant advantage over their competition.

However, consumer data comes from more sources than ever.
And it keeps growing exponentially.

– Identity Data
– Purchase Data
– Customer Service Case Data
– Web and Mobile Data
– Health Data
– Engagement Data
– Advertising Data

About 100 ZB of data is expected to be in the cloud by 2025. The average company uses about 976 different applications, which can mean hundreds of versions of the same customer spread across systems.

Collecting, modelling, connecting and activating this data for marketing activities is a big challenge for businesses.

Deduplication, multiple profiles of the same customer and integrations are just a few examples.

Salesforce Data Cloud was built to solve this issue. It is, in Salesforce’s words, “a hyperscale data engine inside Salesforce”.

Data Cloud allows you to:

  1. Ingest data from all your customer data sources
  2. Map and Harmonize that data into a structured data model
  3. Unify the data into a single 360 view of each customer
  4. Gather insights & analytics from your customers
  5. Segment & Activate this data (e.g. in marketing activities)
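
As a rough illustration of steps 1, 2 and 5, here is a minimal sketch in plain Python. It does not use the Data Cloud API. The source names, the field mappings and the `harmonize` and `segment` helpers are all hypothetical, chosen only to show how records with different field names can be mapped into one shared model and then filtered into a segment.

```python
# Hypothetical raw records from two source systems (step 1: ingest).
ecommerce_rows = [
    {"cust_email": "Jane@Example.com", "order_total": "120.50"},
    {"cust_email": "sam@example.com", "order_total": "15.00"},
]
service_rows = [
    {"EmailAddress": "jane@example.com", "OpenCases": 2},
]

# Hypothetical mapping from each source's field names to one shared model (step 2).
FIELD_MAPS = {
    "ecommerce": {"cust_email": "email", "order_total": "total_spend"},
    "service": {"EmailAddress": "email", "OpenCases": "open_cases"},
}


def harmonize(source, row):
    """Rename source fields to the shared model and normalize values."""
    mapped = {FIELD_MAPS[source][key]: value for key, value in row.items()}
    mapped["email"] = mapped["email"].strip().lower()
    if "total_spend" in mapped:
        mapped["total_spend"] = float(mapped["total_spend"])
    mapped["source"] = source
    return mapped


def segment(records, min_spend):
    """Return the sorted emails of customers whose spend reaches min_spend (step 5)."""
    return sorted(
        {r["email"] for r in records if r.get("total_spend", 0) >= min_spend}
    )


harmonized = [harmonize("ecommerce", r) for r in ecommerce_rows]
harmonized += [harmonize("service", r) for r in service_rows]
high_value = segment(harmonized, min_spend=100)
print(high_value)
```

Once both sources share the same `email` field in the same format, the segment can be built without knowing which system each record came from.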

Salesforce Data Cloud Terminology

Salesforce Data Cloud has been through a lot of different product naming phases and announcements.

Genie.
Salesforce CDP.
Data Cloud.

This can be confusing, especially since Salesforce Data Cloud builds upon different technologies.

Below is a detailed breakdown of these technologies and their key differences, which helps explain why Salesforce decided to shift its strategy with this product.

Data Lake

A Data Lake is a centralised repository to store and process large amounts of raw unstructured, semi-structured and structured data, in a non-relational way.

  • Structured data examples: an Excel sheet, a table.
  • Unstructured data examples: PDF files, emails.
  • Semi-structured data examples: CSV files, JSON.

Data Lakes are often used by different users on different layers to power big data analytics, machine learning, data visualization, etc. However, they require data science resources and time before the data can be actioned or used by other systems.

Some examples of popular Data Lakes are Google Cloud Storage, Azure Data Lake and Amazon S3.

Data Warehouse

Enterprise Data Warehouses (EDW) are similar to Data Lakes, but they store relational data from different sources, which is modelled and treated for a specific purpose.

The business defines the requirements and the data schema, and the stored data is optimized for querying and for use in a specific way. The data in Data Warehouses normally goes through ETL (extract, transform, load) or ELT (extract, load, transform) processes to ensure data quality when importing, integrating, etc.
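
ETL (extract, transform, load) and ELT (extract, load, transform) differ only in when the transformation happens. The sketch below is a toy illustration of that ordering, with a plain dictionary standing in for the warehouse. The `extract`, `transform`, `run_etl` and `run_elt` functions and the sample rows are hypothetical, not part of any real warehouse product.

```python
def extract():
    """Return hypothetical raw rows from a source system."""
    return [{"name": " Jane ", "amount": "10"}, {"name": "Sam", "amount": "5"}]


def transform(rows):
    """Clean rows: trim names and convert amounts to integers."""
    return [{"name": r["name"].strip(), "amount": int(r["amount"])} for r in rows]


def run_etl(warehouse):
    # ETL: transform before loading, so only cleaned rows reach the warehouse.
    warehouse["sales"] = transform(extract())


def run_elt(warehouse):
    # ELT: load the raw rows first, then transform them inside the warehouse.
    warehouse["raw_sales"] = extract()
    warehouse["sales"] = transform(warehouse["raw_sales"])


etl_warehouse, elt_warehouse = {}, {}
run_etl(etl_warehouse)
run_elt(elt_warehouse)
```

Both runs end with the same cleaned `sales` table. The ELT run also keeps the raw rows, which is useful when the same raw data has to be transformed again later for a different purpose.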

Data Warehouses, on top of having optimized and connected data, store both current and historical records of that data. Therefore, they are used for core Business reporting and Business Intelligence (BI).

Some examples of popular Enterprise Data Warehouses are Google BigQuery and Snowflake.

Data Lakehouse

A Data Lakehouse mixes the best of both worlds.

It is built as a hybrid: partly a Data Lake and partly a Data Warehouse, without being fully either one.

This way, a business can benefit from both capabilities without having to integrate and connect two disparate data systems.

Salesforce Data Cloud is built like a Data Lakehouse.

For example, it can store large amounts of data from multiple sources and serve as an ecosystem for data lakes and AI models. At the same time, it allows you to harmonize, query and activate that data for Reporting, Marketing, etc.

Unlike those two other systems (Data Lakes and Data Warehouses), Data Cloud is very open and extensible, to simplify integration of all those capabilities and connect the data with 3rd parties (e.g. Advertising platforms, Salesforce ecosystem products, etc).

Finally, the hybrid model of Salesforce Data Cloud brings another benefit, related to the concept of the final technology in our analysis, the CDP (Customer Data Platform).

Customer Data Platform (CDP)

CDPs have become very popular in recent years. They are today considered a central component of the MarTech stack of many enterprise companies.

What does a CDP do?

A Customer Data Platform (CDP) is a collection of software where the data from multiple sources is ingested, cleaned and harmonized to provide a single customer view. This data is then available to other marketing systems.

A company might have several data sources for a customer, Jane:

  • Jane’s social media accounts
  • Jane’s 3 email addresses
  • Jane’s cookies
  • Jane’s purchase information on the business’s ecommerce site

The CDP pulls the data from all these sources, combines it, and provides all the marketing systems with a single view of Jane. They can then, for example, email her on her preferred or most used address, or include her in the right marketing segments based on her purchase history.
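
The "combine" step can be sketched as follows. This is a simplified illustration, not how any particular CDP implements it: it assumes records belong to the same person whenever they share an exact identifier (an email address or a cookie ID), and the `unify` function and sample records are hypothetical. Real platforms typically apply more elaborate matching rules.

```python
def unify(records):
    """Group records that share at least one identifier into one profile."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group representative, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    seen = {}  # identifier -> index of the first record that carried it
    for index, record in enumerate(records):
        for identifier in record["ids"]:
            if identifier in seen:
                # Same identifier seen before: merge the two groups.
                parent[find(index)] = find(seen[identifier])
            else:
                seen[identifier] = index

    profiles = {}
    for index, record in enumerate(records):
        profile = profiles.setdefault(find(index), {"ids": set(), "purchases": []})
        profile["ids"].update(record["ids"])
        profile["purchases"].extend(record.get("purchases", []))
    return list(profiles.values())


# Hypothetical records for Jane (three emails, one cookie) and another customer.
records = [
    {"ids": {"jane@work.example", "cookie-1"}},
    {"ids": {"jane@home.example", "cookie-1"}, "purchases": ["shoes"]},
    {"ids": {"jane@home.example", "jane@old.example"}, "purchases": ["bag"]},
    {"ids": {"sam@example.com"}, "purchases": ["hat"]},
]
profiles = unify(records)
print(len(profiles))
```

The first three records collapse into one profile for Jane because they are chained together by the shared cookie and the shared home address, while the fourth record stays a separate profile.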

Some examples of CDP software are Tealium and mParticle.

Hybrid CDP & AI Data Lakehouse

Now that we’ve covered all the different technologies, we can try to define what Data Cloud is and why Salesforce is making it their flagship product on top of their CRM clouds (Salesforce Sales and Service Cloud):

Salesforce Data Cloud is a hybrid product that mixes the functionalities of a CDP and a Data Lakehouse.

It therefore has two end goals to help a business:

1. CDP Goal

First, to provide a 360 Customer profile with all your data, as a single source of truth for first-party data, so you can leverage the right data at the right time to maximize ROI.

Some examples across industries:

  • Unified Health Scoring for Health industries
  • Automation of marketing engagements based on data
  • Proactive Customer Service case management
  • Personalized campaigns
  • Real-time Financial profile of a customer

2. Data Lakehouse goal

Second, to provide a safe and trusted LLM layer across all your company data.

Data Cloud is built on top of the Salesforce Platform, so it is connected to Salesforce metadata and it can leverage Einstein AI capabilities throughout all your business apps (e.g. Salesforce Marketing Cloud, Service Cloud, etc).

Examples:

  • Generate marketing segments and lookalike audiences
  • Dynamic Personalization in near real-time
  • Anticipate customer needs from Service cases
  • Automate prospecting for Sales teams

As Salesforce says:

“AI without Data is pointless…
Data without AI is worthless”

Key Benefits of Salesforce Data Cloud

Although the hybrid model can seem rather confusing and complex, the combination of CDP and AI Data Lakehouse capabilities can be key if an organization is already invested in the Salesforce ecosystem.

The way I see it, there are 4 key benefits:

Reduced Costs

Near real-time unified customer data can be leveraged by Salesforce Einstein’s AI capabilities to make predictions and recommendations, build segments, provide insights, etc., in a low-code way. Salesforce is moving increasingly toward low-code so that users can leverage the full potential of their cloud apps.

Increased Productivity

Once the system is fully integrated, having near real-time data across the entire organization (Marketing, Sales, Service, etc.) can save a lot of manual work and time, provided this data is linked to Salesforce flows and automated processes.

360 Actionable Data

Having unified enterprise data across the Salesforce cloud infrastructure can reduce “time to next best action” in all business flows and areas, and it can also enable personalization anywhere. Instead of siloed data with complex flows between systems, Salesforce Data Cloud can provide a seamless underlying layer of insights for every department.

Data-driven AI infrastructure

The near future belongs to data-driven enterprises. According to VML, there is a “23% average ROI improvement from applied data and AI”. What’s more, “consumers become increasingly concerned with how their data is captured and used”. In just a few years, enterprises will use data in every decision, process and business interaction. Salesforce is building the foundational system to enable this shift.

Further Reading

https://www.vml.com/expertise/data
https://www.vml.com/insight/protecting-customer-privacy-and-increasing-trust
https://www.pwc.com/us/en/technology/alliances/library/salesforce-data-cloud-and-cdp.html
https://www.databricks.com/glossary/data-lakehouse
https://en.wikipedia.org/wiki/Customer_data_platform
https://www.salesforce.com/data/
https://www.salesforce.com/marketing/data/what-is-a-customer-data-platform/