An IT manager's guide to building or buying a CDP.

Sponsored by: Adobe, 601 Townsend Ave, San Francisco, CA 94103
Published by: Customer Data Platform Institute, 231 Second Avenue, Milford, CT 06460, www.cdpinstitute.org

Customer Data Platforms (CDPs) have been gaining popularity in the marketing world for nearly a decade. The idea is simple: put all customer data in one place so that marketers and all other users can easily use it to build better customer relationships. In a world where actionable customer data is critical to business success, this is a powerful promise to make.

Yet many IT departments are skeptical of the concept. To them, the CDP often sounds like another name for their existing data warehouse, data lake, or CRM system. The last thing any IT manager needs is one more system to manage, especially one that holds sensitive customer data. So the first question they ask is, Can the goals of a CDP be better met by extending their existing
systems? It's an important question that deserves careful consideration. The right answer for any company will depend on the state of existing systems, exact user requirements, and available resources, among other considerations. This paper will identify the information needed to make a sound decision.

Needs addressed by a CDP.

The Customer Data Platform Institute defines a CDP as “packaged software that builds a unified, persistent customer database that is accessible to other systems.” That definition implies a number of specific capabilities, encapsulated in the Institute's RealCDP certification. But the term Customer Data Platform is often understood more loosely to describe any system that delivers unified, actionable customer profiles. Let's accept the looser definition for the moment, since it brings us directly to the most important question: What does it take to deliver the unified, actionable customer profiles that are the core CDP benefit?

Business Users

Data scientists, business analysts, and other advanced users traditionally pull data for ad hoc projects from data lakes or directly from source systems, and rely on data warehouses for information they need on a regular basis. The unified profiles
in a CDP offer an appealing middle ground of data that's ready for use even though the specific purpose is not known in advance. This is especially important for modern machine learning and artificial intelligence models that can take advantage of more inputs than human analysts find practical.

Needs for analytical users include:

• Easy access to a wide variety of data sources, including the ability to quickly add new sources as these appear.
• Metadata and discovery tools that make it easy for users to understand what's available.
• Retain original details provided by source systems, including data elements or values that are not defined in advance.
• Data quality and cleaning processes performed on data before it is presented for use, to save user effort.
• Create, store, document, govern, share, and reuse derived variables so users can reduce redundant effort and take advantage of complex standard calculations.
• Access to historical values so predictive models can be based on data as it appeared at different points in the past.
• Add calculated values and model scores to customer profiles, including options to recalculate scores in batch processes and on demand.

Governance Teams. These include privacy, compliance, security, and legal departments. All must deal with increasing government regulations on how consumer data can be used, as well as increasing demands for control over their data from the consumers themselves. The result is a greater need to govern how customer data is used throughout the company and
to ensure the data does not leak to unauthorized users. The CDP offers a single point of access to customer data, making it easier to apply security controls and enforce compliance with usage restrictions.

Needs for governance teams include:

• Access policies that reflect permissible use of data, based on data type, purpose, consumer consents, regulatory jurisdictions, and other variables.
• Record-keeping to track how data has been used and the legal authority for each use.
• Security controls that monitor access to customer data and detect any questionable or unauthorized use.
• Integration with consent management, consumer preference centers, data subject requests, and other systems that help to manage privacy compliance.

IT Teams. The IT department is responsible for providing other departments with the data and processing support that they need, while ensuring security and managing costs. They bring an enterprise-wide perspective that enables them to identify opportunities to use shared resources to meet several user needs at once. Data lakes and warehouses are prime examples of shared resources, and their presence often leads IT departments to question the apparent redundancy of a separate CDP. However, the CDP often adds new, specialized capabilities that cost-effectively complement rather than duplicate data lake and warehouse functionality.

Key considerations for IT teams include:

• Minimizing total cost, including software licenses, cloud service fees, development expense, and staff costs for system maintenance and support.
• Ensuring adequate performance, including the ability to meet agreed-upon service levels for processing volumes, response times, and availability.
• Maintaining adequate system security as well as compliance with legal requirements and internal policies for responsible data management.
• Building in the flexibility to meet future needs without major revisions to current system architectures.

How a CDP differs from a data warehouse.

From a corporate perspective, all these needs are important. If they're met by a company's existing systems, then the organization probably doesn't need a CDP. But
most data lakes and warehouses do not meet these requirements because they were designed for other purposes. In particular, data lakes and warehouses have been designed as analytical systems. Data lakes give analysts easy access to raw data, without requiring them to access the original source systems directly. Data warehouses contain carefully cleaned and formatted data for predefined analytical tasks such as business performance measurement and financial reporting. Even a customer-focused data warehouse is highly structured and rigid.

These goals are very different from the primary use case of a CDP, which is to activate unified customer profiles in application systems. In other words, the CDP isn't just an analytical system that's more structured than a data lake but more flexible than a warehouse. Rather, the CDP is an entirely different type of system that connects existing data stores to application systems.

What About Cloud Databases?

A data warehouse built on a cloud database such as Snowflake, Databricks or Google BigQuery is sometimes offered as an alternative to a CDP. It's true that those systems are more flexible than older data management systems, and can in fact be used as the primary data store for a CDP system. But this only works if they are deployed with a suitable data model and supporting technologies for data collection, preparation, identity matching and profile access. If these were not built into the existing data warehouse, it will not be an adequate substitute for a CDP,
regardless of the data engine it uses.

Let's look more closely at how traditional data warehouses diverge from CDP requirements.

Data sources. Most warehouses are limited to structured data sources, and will not necessarily include the sources needed to build complete customer profiles. Adding a new source to a warehouse can be difficult because the warehouse data model is finely tuned for specific analytical purposes. By contrast, the CDP needs to easily adapt to new sources as these become available. This often means it stores inputs in something like a data lake, and then lets users load selected items into the unified profiles as they are needed. CDP vendors also provide libraries of prebuilt connectors to common customer data source systems, while each connector for a warehouse must often be custom-built.

Data preparation. Data is added to a warehouse with a specific purpose in mind, and runs through preparation processes designed to support that purpose. Since these purposes usually relate to business or financial analysis, the preparation processes generally do not include specialized features required for customer data, such as address standardization. In particular, a warehouse not expressly
designed to unify customer data will lack identity matching and persistent identity management, two complex processes that are essential for a CDP. Similarly, CDPs will require processing to extract structured information from unstructured data, something that is rarely needed for a standard data warehouse.

Derived variables. A data warehouse will often contain calculated values needed for its own purposes. However, setting these up is often a technical task that is optimized for efficiency. The CDP will need different calculated variables, and probably many more of them. Since in-database calculations
can be expensive at scale, the CDP may have special features to make this practical. It will also need other features to let non-technical users create their own variables, features to document and share user-created variables, and support for complex calculations such as predictive model scores. Precalculated derived variables can be critical for reducing query complexity and improving system performance.

Business user self-service. Access to data warehouse contents is often highly constrained through user interfaces that limit what questions can be asked. Even if free-form SQL queries are allowed,
warehouse administrators often limit which users are authorized to run them in order to avoid performance problems. CDPs must support a much wider variety of queries against their contents, including many that cannot be anticipated by system administrators. To meet this requirement, they offer a variety of access mechanisms including direct SQL queries, sophisticated query builders, and API connections. These can be governed by security procedures and other guardrails to ensure that data is used properly and efficiently. In addition to making users more effective, they reduce the demand for support from technical staff.

Predictive modeling support. CDPs also support predictive modeling capabilities, including data preparation for technical users and self-service modeling by business users. Often the self-service features rely on machine learning and other types of artificial intelligence. As with queries and
derived variables, models may be built in large numbers for a wide variety of unforeseeable purposes. The CDP must support these with low-effort deployment and high-volume scoring calculations as well as model creation. Not all CDPs provide self-service predictive modeling, although it is increasingly common.

Privacy support. Consumer privacy isn't an issue with many data warehouses, since they don't hold sensitive customer data. Even warehouses that do hold such data often provide tightly constrained access that enables them to avoid the risk of privacy violations such as letting users see personal data for
an individual consumer. By contrast, individual-level data is the lifeblood of the CDP, and its contents are intended to be used for sensitive purposes such as personalized treatments. As a result, CDP systems must contain a rich set of privacy-supporting capabilities, including constraints on how data is used, validation that a particular piece of data is legally available for a particular customer for a particular purpose, and records of how protected data is read and processed. Similarly, the CDP needs tight integration with other privacy-related systems such as consent managers and preference centers.

Real-time processing. Data warehouses are nearly always loaded and updated in batch processes, and generally limit real-time processing to analytical queries against a large data set. They are not designed to return details of a single individual in real time. The ability of CDPs to provide real-time
updates is often limited by the capabilities of source systems, which often cannot themselves provide real-time inputs. But all CDPs have some ability to ingest and react to real-time data streams, and to return the full profile of a single individual on demand. These are critical requirements for key CDP use cases such as website personalization and retargeting. Some CDPs support this type of real-time processing directly from their primary data store, while others synchronize their primary data store with a separate file that is optimized for real-time processing.

Does extending the warehouse make sense?

Any given company's data warehouse and related systems may not have all the limitations listed above. But at most companies, most of these requirements will, in fact, be gaps that must be filled to meet CDP user needs. Filling those gaps requires decisions relating to architecture and applications.

Architecture. The architecture decision determines where data resides. Meeting user requirements for flexibility and real-time access nearly always requires copying data from the existing data lake and warehouse structures into a new data store that is optimized for those purposes. This is exactly the outcome that IT managers often hoped to avoid by extending the existing systems rather than purchasing a separate CDP.

This doesn't mean the CDP is totally independent of the warehouse. A well-designed
CDP will leverage existing warehouse capabilities wherever possible. In particular, many CDPs ingest data from existing data lakes or warehouses, rather than setting up a duplicate data collection process to read the source systems directly. This approach also lets the CDP take advantage of data cleaning and preparation processes developed for the warehouse, assuming they don't throw away details that the CDP needs to retain. Using this processed data both saves effort and improves consistency between warehouse and CDP results.

A CDP may even go further and rely on the existing data lakes and warehouses as on-demand data sources. This lets the CDP avoid loading and storing selected data elements, while still making them available to CDP users as part of an actionable customer profile. This approach is increasingly practical, as data lakes and warehouses are built on data stores that can respond to real-time data access requests, even if they don't meet the full CDP requirement for real-time updates. Even when the existing data lakes and warehouses do not support on-demand access, knowing that historical data is stored in those systems lets the CDP rely on them rather than storing its own copies.

Application. The CDP may be able to use tools already implemented in the warehouse for data ingestion, cleaning, query, segmentation, business intelligence, machine learning, and other functions. But this will often leave gaps to support loading data from sources that are not captured in existing systems, running
specialized data preparation processes, managing derived variables and predictive modeling functions, implementing complex privacy and regulatory compliance, providing self-service data access to business users, and sharing profiles with external systems in batch and in real time.

Acquiring a CDP with these functions already available will usually be much easier, faster, and less expensive than developing custom versions or buying and integrating separate components for each of these functions. Companies can also expect lower support costs for a purchased CDP than for custom-built systems, and can take advantage of vendors' regular product updates rather than continually investing new development resources to create similar enhancements of their own.

What ultimately emerges from this approach is an optimal division of labor: use the existing data lake and warehouse for what they do best, which is largely to collect, refine, and store data, and use the CDP for what it does best, which is to construct profiles and activate them in application systems. There will still be a need to copy some data into the CDP for special processing and access speed. But other data can remain in the lake and warehouse so long as those systems can effectively access them as needed.

Conclusion

The broad description of a Customer Data Platform sounds very similar to existing data lakes and warehouses, so it's natural for IT managers to wonder whether a separate CDP is really needed. But closer examination finds that the user needs being met by a CDP are very different from the needs met by most data lakes and warehouses. In simplest terms, the CDP acts as a connector between a store of unified customer profiles and systems that activate those profiles for marketing, advertising, service, support, and other activities. This
is not a service that data lakes or warehouses are designed to deliver. The CDP also includes many applications that were not needed to meet data lake and data warehouse requirements. As a result, many functions needed for a CDP are not found in existing data lake and warehouse systems.

Total costs can be reduced by reading data directly from the existing lake and warehouse data stores, so long as the data is formatted appropriately and easily accessible, and by repurposing some data lake and warehouse processes. But extending those systems to provide the same services as a CDP requires substantial new development. Companies facing this choice should consider carefully whether this is a good investment when suitable commercial products can be purchased.

About Adobe Experience Cloud: Adobe Experience Cloud is the most comprehensive suite of customer experience management tools on the market. With
solutions for data, content delivery, commerce, personalization, and more, this marketing stack is created with the world's first platform designed specifically to create engaging customer experiences. Each product has built-in artificial intelligence and works seamlessly with other Adobe products. And they integrate with your existing technology and future innovations, so you can consistently deliver the right experience every time.

About Adobe Real-Time Customer Data Platform: Adobe Real-Time Customer Data Platform collects, normalizes, and unifies known and unknown individual and company data into robust customer and account profiles that automatically update in real time. Marketers use these profiles to deliver timely, relevant, and personalized experiences to any channel, at scale. And with best-in-class usage governance, brands can use data more responsibly and transparently, so consumers have greater control over their information.

Contact: Adobe | 601 Townsend Ave | San Francisco, CA 94103

About the CDP Institute

The Customer Data Platform Institute educates marketers and marketing technologists about customer data management. The mission of the Institute is to provide vendor-neutral information about
issues, methods, and technologies for creating unified, persistent customer databases. Activities include publishing of educational materials, news about industry developments, best practice guides and benchmarks, directories of industry vendors, and consulting on related issues. The Institute is managed by Raab Associates, a consultancy specializing in marketing technology and analysis. Raab Associates identified the Customer Data Platform category in 2013. Funding is provided by a consortium of CDP vendors.

Contact: Customer Data Platform Institute | 231 Second Avenue | Milford, CT 06460 | www.cdpinstitute.org | info@cdpinstitute.org

© 2023 Adobe. All rights reserved. Adobe, the Adobe logo, Adobe Real-time Customer Data Platform and Adobe Experience Platform are either registered trademarks or trademarks of Adobe in the United States and/or other countries.