Customer Data Integration (CDI)

Customer Data Integration (CDI) is the combination of procedures, automation, and skills required to standardize and integrate customer data that originates from different sources. CDI addresses the situation in which two or more sources contain records referring to an overlapping set of real-world customers but lack unique identifiers that show how the records in one source correspond to those in another. Furthermore, records representing the same entity might carry differing information. For example, one record might have a misspelled address, another might be missing some fields, and so on.[1]
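
The matching problem described above can be illustrated with a minimal sketch. It assumes approximate string comparison from Python's standard difflib module; the field names, sample records, and the 0.85 threshold are hypothetical choices for illustration, not part of any particular CDI product.

```python
from difflib import SequenceMatcher


def normalize(value):
    """Lowercase and collapse whitespace; missing values become an empty string."""
    return " ".join((value or "").lower().split())


def similarity(a, b):
    """Return a 0..1 similarity ratio between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_score(rec_a, rec_b, fields=("name", "address", "email")):
    """Average similarity over the fields that both records actually populate."""
    scores = [
        similarity(rec_a[f], rec_b[f])
        for f in fields
        if rec_a.get(f) and rec_b.get(f)
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical records from two sources that share no common identifier.
crm_record = {"name": "Jane Doe", "address": "12 Main Street", "email": "jane@example.com"}
billing_record = {"name": "Jane Doe", "address": "12 Main Stret", "email": None}
other_record = {"name": "Robert Smith", "address": "98 Oak Avenue", "email": None}

THRESHOLD = 0.85  # assumed cut-off; a real deployment would tune this value

same_customer = match_score(crm_record, billing_record) >= THRESHOLD
different_customer = match_score(crm_record, other_record) >= THRESHOLD
```

Here the misspelled address and the missing email do not prevent the first two records from being treated as the same customer, while the third record falls well below the threshold. Production matching typically adds field-specific rules, weighting, and manual review of borderline scores.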

Properly conducted, CDI ensures that all relevant departments in the company have constant access to the most current and complete view of customer information available. As such, CDI is an essential element of customer relationship management (CRM). Although many companies have been gathering customer data for a good number of years, it hasn't always been managed very effectively. As a result, companies may maintain outdated, redundant, and inconsistent customer data. According to a Forrester Research report, although 92% of companies surveyed believe having an integrated view of customer data is either "critical" or "very important," only 2% have actually managed to achieve that goal.[2]

CDI is technically a subset of Master Data Management (MDM), which comprises a set of processes and tools that consistently define and manage the nontransactional data entities of an organization. CDI and MDM nevertheless share a common logical approach. Both integrate data from across different sources. Both document data lineage and data evolution over time. Both strive to achieve single “golden” records that consolidate data and eliminate duplication of information. MDM is often perceived as covering a broader spectrum of data. In reality, however, although CDI solutions focus initially on customer data, they can cover much of the same ground. The essential construct is the same: a truly robust CDI solution can be readily expanded into larger MDM applications by moving beyond customer data to include data about other key parties.

What CDI Is Not[3]
There are really two separate definitions of the term “customer data integration.” On its surface, the term refers to a set of tried-and-true methods and technologies for integrating customer data from different data sources, a classic problem for both operational systems and reporting. But the emerging definition of CDI is more specific to the evolving technological capabilities that, when combined, help to automate the reconciliation and synchronization of data from disparate systems in order to propagate it to systems across the enterprise for a range of uses and processing. As people learn more about CDI, they relate it to their existing paradigms and often color it with long-held technology biases. In defining CDI, it’s helpful to discuss how it differs from existing technology solutions. CDI is not:

  • A CRM tool
  • A solution to a technical problem
  • A replacement for a data warehouse
  • An “application”
  • An analysis tool
  • An Operational Data Store (ODS)
  • The automation of a customer data model

The Value and Uses of Customer Data Integration (CDI)[4]
Customer data integration may involve between six and 12 fields of data for every individual customer, such as name prefix, first name, last name, middle name or initial, name suffix, nickname, maiden name and professional or academic title. Complicating the data management further, much of this data changes frequently and becomes obsolete. For example, customers may change their names, move, get divorced or die. The value of the data is divided into five categories:

  • Completeness: Organizations may lack all the required data to make sound business decisions.
  • Latency: If data is not used quickly enough, it can become obsolete.
  • Accuracy
  • Management: Data integration, governance, stewardship, operations and distribution all combine to make or break the value of the data.
  • Ownership: The more dissimilar customers are, the more difficult it will be to use the data to make decisions.
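
Two of these categories, completeness and latency, lend themselves to simple measurement. The sketch below is illustrative only: the required field list, the last_verified field, and the one-year staleness limit are assumptions made for the example.

```python
from datetime import date

REQUIRED_FIELDS = ("first_name", "last_name", "address", "email")  # assumed field list
MAX_AGE_DAYS = 365  # assumed staleness limit


def completeness(record):
    """Fraction of required fields that hold a non-empty value."""
    filled = sum(1 for f in REQUIRED_FIELDS if record.get(f))
    return filled / len(REQUIRED_FIELDS)


def is_stale(record, today):
    """True when the record was last verified more than MAX_AGE_DAYS ago."""
    return (today - record["last_verified"]).days > MAX_AGE_DAYS


# Hypothetical customer record with a blank address.
record = {
    "first_name": "Jane",
    "last_name": "Doe",
    "address": "",
    "email": "jane@example.com",
    "last_verified": date(2022, 1, 15),
}

score = completeness(record)                  # 3 of 4 required fields are filled
stale = is_stale(record, date(2024, 1, 15))   # last verified two years earlier
```

Scores like these can be aggregated per source system to show where data stewardship effort is most needed.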

Accurate and comprehensive customer data retrieved through CDI has many uses and applications. These include:

  • Providing raw data for various service providers
  • Optimizing product assortment, promotion, pricing and inventory rotation (merchandising)
  • Reducing waste
  • Choosing the best locations for branch offices or outlets
  • Supporting customer relationship management
  • Supporting master data management
  • Differentiating customers and their needs

[Figure: Customer Data Integration. Source: Xoriant]

The Importance of Customer Data Integration[5]
Customer Data Integration is a key part of Customer Relationship Management. If conducted properly, CDI ensures that all departments in the company have constant access to the most current and complete view of customer information. This access and singular, complete view of your customers across the organisation helps to drive:

  • Improved customer retention, by enabling personalised customer service in which customer-facing employees have a detailed view of each customer's interactions with the organisation.
  • A more tailored journey for marketing and sales communications, increasing the chance that prospects convert.

CDI Styles[6]
Currently, there are four "styles" of CDI implementation. Before committing to a specific style, organizations need to consider the fundamental business purpose of their current or future CDI hub, including: frequency of business change, scope and latency of unified views, legacy IT environment, operational versus analytical application needs, types of data sources, and data governance policies. Beyond these characteristics, an organization should choose a CDI style that is future-proof, i.e., one that can adapt to mergers and acquisitions, reorganizations, and other systemic organizational changes. This involves four critical factors. First, the CDI hub architecture must adapt easily to changes over time, such as the addition of new business processes, data sources, and applications. Second, it must allow for ongoing data stewardship and governance by both business and IT teams. Third, it must be an extensible IT platform on which new data views and services can be built. Finally, it should be able to deliver these views in multiple modes (real-time and batch) to other systems at the performance and scale the business requires. The four CDI styles are:

  • Custom-Built Data Hub: This style reflects the historical way of building customer hubs using software tools and custom-coded rules. Building such a hub requires the use of an Extract, Transform, Load (ETL) tool to bring data into the hub's data model from multiple sources and Data Quality (DQ) tools to cleanse and match similar records within the hub. These matched records are then merged based on simple, custom-coded rules, which generally results in the system choosing one record over another. Typically, ongoing code development is needed to integrate this custom hub with downstream systems, to make changes to match and merge rules, and to add new data sources.

This tools-based approach is the least adaptable to business changes and has severely limited extensibility to new data sources.

  • Fixed Transaction Hub: Several application vendors, such as Siebel and Oracle, offer this style of CDI hub, which is built on a comprehensive but fixed data model and developed to support a specific set of applications. Despite its richness, this data model may not encapsulate all the relevant attributes required for unified customer views and needed by disparate downstream systems. To build the hub, all data from contributing sources is conformed to this data model, which may require extending the existing model and creating extensive ETL scripts to bring in data sources outside of the vendor's model. This makes it difficult to upgrade the hub to the next version of the software product and may require specific tools to systematically update the normalized data model, which results in longer data load times. Once data is cleansed, matched and merged, the hub becomes an operational data store (ODS) that serves up-to-date customer data to multiple applications. All the customer data required for a unified view (reference, relationship and transaction) is persisted in advance and tied to the hub's fixed data model. As a result, this style is best suited to cases where only a few operational applications, identified in advance, are supported by the hub's fixed data model and where the business does not anticipate significant changes (i.e., additions of new applications or data sources).
  • Match and Link Cross-Reference Hub: Also referred to as the "registry" style, this approach is offered by certain best-of-breed CDI vendors that historically provided matching tools. As a result, the data model of the customer hub contains only the selected attributes needed to match similar records across multiple data sources. The hub matches against these attributes and links the matched records to create a customer identity master store. This hub physically stores only the global ID, the cross-references ("links") back to source systems, and any mappings or transformations necessary to achieve semantic reconciliation. With this style, standard DQ and ETL tools may be used to cleanse the source data and transfer it from the sources to the system area for matching. Since there is neither any resolution of matched yet conflicting records nor any history of past data states, this style cannot offer the best, resolved view of customer master data. While this style offers beguilingly fast performance with a low product "footprint," it does so at the expense of critical functionality, primarily because little data is being stored or managed. Similarly, this style is highly scalable as long as functionality remains limited to persisting cross-reference links and no other data (i.e., transaction data or metadata) is stored or accessed. Data stewardship is also limited, since there are no merged master records to manage. In short, this style does not create a full transactional hub for serving complete customer views, or an IT platform for developing and delivering data services to downstream systems.
  • Adaptive Transaction Hub: This style emerged most recently to address the limitations of the above approaches. With the adaptive style, the hub is built as a platform for consolidating data from disparate third-party and internal sources and for serving unified customer views to operational applications, analytical systems or both. This hub is data-model-neutral and uses templates, tailored to each industry, which allow enterprise-specific data models to be implemented quickly. With the data model defined, data is loaded using standard ETL tools and cleansed via integrated third-party DQ tools. Beyond just matching similar records, the hub merges matched records to build a "best-of-breed" master record that reflects the best version of the truth, at the cell level, across multiple source systems. In effect, it becomes the customer master or "system of record" for all systems. This approach delivers a real-time hub that has a reliable, persistent foundation of master reference and relationship data, along with all the history and lineage of data changes needed for audit and compliance tracking. On top of this persistent master data foundation, the hub can dynamically aggregate transaction data, on demand, from different source systems to compose and deliver unified customer views to downstream systems. The scalability and performance characteristics of an adaptive transaction hub can also be altered, at multiple levels, to fit business requirements. Once built, it delivers unified views to portals or embeds them within applications. Data can also be accessed through batch interfaces, published to a message bus or served through a real-time services layer. Because the hub is a platform, new data sources can be readily added by extending the data model and configuring the new source mappings, meaning that all legacy data hubs (CIF systems, identity and data cleansing hubs) can be leveraged to contribute their records and rules to the new transaction hub.
Finally, through rich user interfaces for data stewardship, it allows exception handling by business analysts to keep it current with business rules and practices while maintaining the reliability of best-of-breed master records. Designed as a manageable, extensible and scalable platform, the adaptive transaction hub serves as the most reliable foundation for delivering trusted unified customer views to all systems.
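
The difference between choosing one whole record and merging at the cell level can be shown with a small sketch. It assumes a simple source-priority rule for picking each field value; the function name, record layout, and source names are hypothetical, and real hubs generally apply richer survivorship rules (recency, data quality scores, steward overrides).

```python
def build_master_record(source_records, source_priority):
    """Merge matched records field by field.

    For each field, keep the non-empty value from the most trusted source
    and remember which source supplied it (a simple form of lineage).
    """
    rank = {name: i for i, name in enumerate(source_priority)}
    ordered = sorted(source_records, key=lambda r: rank[r["source"]])
    master, lineage = {}, {}
    for rec in ordered:
        for field, value in rec["data"].items():
            # Only fill a field that no higher-priority source has supplied.
            if value and field not in master:
                master[field] = value
                lineage[field] = rec["source"]
    return master, lineage


# Hypothetical matched records for one customer from two source systems.
matched = [
    {"source": "billing", "data": {"name": "J. Doe", "phone": "555-0100", "email": ""}},
    {"source": "crm", "data": {"name": "Jane Doe", "phone": "", "email": "jane@example.com"}},
]

master, lineage = build_master_record(matched, source_priority=["crm", "billing"])
```

In this example the name and email come from the CRM record and the phone number from the billing record, so the master record is more complete than either source alone. A rule that picks one whole record over another, as in the custom-built style, would lose either the phone number or the email address.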

Customer Data Integration Challenges[7]
Undertaking a CDI program involves several challenges, and the CIO needs to carry out due diligence before investing in one.

  • Unintentional data inaccuracies in the customer data hub: An accidental update to a customer record in the hub would result in an inaccurate version of the data being used by all departments and subsystems that depend on it.
  • Single point of failure: In the absence of a backup and recovery mechanism for the customer data hub, the organization’s business can be adversely impacted.
  • Data model rigidity: Vendors design data models to be best suited for their applications, thereby restricting the scope for changes that the organization may need.

Benefits of Customer Data Integration[8]

  • CDI Reduces Customer Churn: CDI enables organizations to enact programs that enhance customer loyalty and improve retention. Across the organization, CDI enables you to:
    • Determine high-value customers and apply the appropriate level of service
    • Identify “hot prospects” for additional products/services
    • Personalize and individualize customer communications across all points of contact
  • CDI Increases Agility and Reduces Costs: By reducing data and systems redundancies, CDI makes data processing and application systems more efficient, while helping to significantly reduce costs. That, however, is just the beginning. In the process, CDI provides efficiencies that often translate to dramatically faster time-to-market for new products and services. Once an organization integrates customer data well, and continues to follow up on the integration to keep records current and complete, they avoid time-consuming project by project integration efforts. Coordination is simplified, and the organization becomes far more nimble in addressing the ever-changing needs of their marketplaces. CDI efficiencies range from IT to customer care/service/call center, to marketing as well as across other staff functions. With CDI, day-to-day operations run more smoothly, and organizations can shift their IT focus toward other systems development.

While the primary reason most companies pursue a CDI strategy is profitability related, CDI often provides other important benefits as well.

  • Compliance: Companies are subject, to varying degrees, to recent regulatory and homeland security initiatives such as Sarbanes-Oxley, the Patriot Act, the regulations of the Department of the Treasury’s Office of Foreign Assets Control (OFAC), and the Health Insurance Portability and Accountability Act (HIPAA). All of these initiatives require a solid data foundation. These more stringent requirements mean that businesses often need to retain more extensive customer data and to have better data access and control. Today’s requirements often mean that companies need to increase:
    • Data accuracy and timeliness
    • Traceability of transactions for audit trails
    • Point-in-time accountability
  • Fraud Detection: CDI enables improved customer analysis, and with that, the potential to better protect the organization from fraud. By using an actively managed, central store of customer data, organizations can gain real-time insight into the identity of new applicants — and better detect behavior patterns indicative of possible issues.
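
The audit-trail and point-in-time requirements listed under compliance can be sketched as an append-only change history with an "as of" lookup. The class and method names below are hypothetical, and the sketch assumes changes are recorded in chronological order for a single attribute.

```python
from bisect import bisect_right
from datetime import date


class CustomerHistory:
    """Append-only history of one customer attribute, for as-of lookups."""

    def __init__(self):
        self._dates = []    # effective dates, kept in ascending order
        self._entries = []  # (value, changed_by) pairs aligned with self._dates

    def record_change(self, effective, value, changed_by):
        """Append a change; earlier entries are never modified or removed."""
        if self._dates and effective < self._dates[-1]:
            raise ValueError("changes must be recorded in chronological order")
        self._dates.append(effective)
        self._entries.append((value, changed_by))

    def as_of(self, when):
        """Return the (value, changed_by) pair in force on the given date, or None."""
        idx = bisect_right(self._dates, when)
        return self._entries[idx - 1] if idx else None


# Hypothetical address history for one customer.
address = CustomerHistory()
address.record_change(date(2020, 3, 1), "12 Main Street", "crm-import")
address.record_change(date(2023, 7, 15), "98 Oak Avenue", "call-center")

in_2021 = address.as_of(date(2021, 1, 1))
before_first_record = address.as_of(date(2019, 1, 1))
```

Because entries are only ever appended, the history shows who changed a value and when, and an auditor can reconstruct what the organization knew about a customer on any given date.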

See Also

Customer Data Integration (CDI) is a process that consolidates data about customers from various sources to provide a single, comprehensive view. This unified data can then be used across the organization to improve customer service, enhance decision-making, and tailor marketing efforts effectively. CDI combines data from disparate systems such as CRM, sales, marketing, and customer support platforms, ensuring data consistency, accuracy, and accessibility. Effective CDI strategies enable businesses to better understand customer behaviors, preferences, and trends, leading to improved customer experiences and business outcomes.

  • Data Warehouse: Discussing centralized repositories that store data from multiple sources in a single location for reporting, analysis, and data mining. Data warehousing is a foundational technology for CDI, enabling the consolidation and management of customer data.
  • Master Data Management (MDM): Covering the process of defining and managing an organization's critical data (such as customer and product data) to provide, with the help of a single point of reference, an authoritative view of this data. MDM is closely related to CDI, focusing on creating a single master record for each customer.
  • Data Quality: Explaining the importance of accuracy, completeness, reliability, and relevance of data. Data quality initiatives are crucial for successful CDI, ensuring that integrated customer data is accurate and actionable.
  • Data Governance: Discussing the overall management of the availability, usability, integrity, and security of data used in an organization. Data governance frameworks support CDI by establishing policies and procedures for data management and use.
  • Customer Relationship Management (CRM): Covering the strategies and technologies used by companies to manage and analyze customer interactions and data throughout the customer lifecycle. CDI enhances CRM systems by providing a unified view of customer data across different touchpoints.
  • Business Intelligence (BI): Discussing the technologies, applications, strategies, and practices used to collect, integrate, analyze, and present business information. BI benefits from CDI through access to comprehensive and integrated customer data for analysis and decision-making.
  • Data Integration Tools and Technologies: Covering the software solutions used to aggregate, cleanse, and consolidate data from disparate sources. These tools are critical for implementing CDI strategies.
  • Big Data Analytics: Discussing the process of examining large and varied data sets to uncover hidden patterns, customer preferences, and other insights. CDI feeds into big data analytics by providing a unified data source.
  • Customer Data Platform (CDP): Explaining platforms that create a persistent, unified customer database accessible to other systems. CDPs are a solution for achieving CDI by aggregating and organizing customer data across channels.
  • Real-Time Processing: Covering the capabilities for processing data as it becomes available, enabling immediate analysis and action. Real-time data processing is increasingly important in CDI for delivering up-to-date customer insights.
  • Privacy Policy and Data Protection Regulations: Discussing the legal frameworks such as GDPR (General Data Protection Regulation) and CCPA (California Consumer Privacy Act) that impact the collection, storage, and use of customer data. CDI must comply with these regulations to ensure customer data is handled ethically and legally.
  • Personalization and Customer Experience (CX): Explaining how integrated customer data can be used to tailor interactions and services to individual customer needs and preferences, enhancing the overall customer experience.


  1. Defining Customer Data Integration (CDI) IBM
  2. What is Customer Data Integration (CDI)? Techtarget
  3. What CDI Is Not BI Best Practices
  4. The Value and Uses of Customer Data Integration (CDI) Techopedia
  5. Why is Customer Data Integration important? Experian
  6. CDI Styles CRM Buyer
  7. Customer Data Integration Project Challenges Computer Weekly
  8. Benefits of Customer Data Integration PBInsight

Further Reading

  • Customer Data Integration (CDI) – for a single view of your customer Athena
  • 10 best Practices for Integrating your Customer Data Scribe
  • Benefit from customer data integration (CDI) on cloud platform Computer Weekly
  • Customer Data Integration: Reaching a Single Version of the Truth Jill Dyche, Evan Levy