Data Governance

According to DAMA International, "Data Governance is the exercise of authority and control (planning, monitoring, and enforcement) over the management of data assets."[1] Data governance is the discipline of cataloging and defining important data, assigning ownership of data, and incorporating governance of data into everyday business processes.

Figure: Data Governance (source: DAMA.Org)

What Data Governance is Not

Data Governance is frequently confused with other closely related terms and concepts, including data management, master data management, and data stewardship.[2]

  • Data Governance is Not Data Management

Data Management refers to the management of the full data lifecycle needs of an organization. Data governance is the core component of data management, tying together nine other disciplines, such as data quality, reference and master data management, data security, database operations, metadata management, and data warehousing.

  • Data Governance is Not Master Data Management

Master Data Management (MDM) focuses on identifying an organization's key entities and then improving the quality of the data about them. It ensures you have the most complete and accurate information available about key entities such as customers, suppliers, and medical providers. Because those entities are shared across the organization, master data management is about reconciling fragmented views of those entities into a single view, a discipline that goes beyond data governance. However, there is no successful MDM without proper governance. For example, a data governance program will define the master data models (what is the definition of a customer, a product, etc.), detail the retention policies for data, and define roles and responsibilities for data authoring, data curation, and access.
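
The reconciliation of fragmented views into a single view can be illustrated with a minimal sketch. The field names, the sample records, and the survivorship rule (the most recently updated non-empty value wins) are assumptions made for illustration; in practice the rule would come from the governance program's master data model.

```python
from datetime import date

# Hypothetical fragments of the same customer held by different systems.
fragments = [
    {"source": "crm", "updated": date(2023, 1, 10),
     "name": "Acme Corp.", "phone": "", "email": "info@acme.example"},
    {"source": "billing", "updated": date(2023, 6, 2),
     "name": "ACME Corporation", "phone": "+1 555 0100", "email": ""},
]

def build_golden_record(records, fields):
    """Merge fragments field by field: the newest non-empty value survives."""
    newest_first = sorted(records, key=lambda r: r["updated"], reverse=True)
    golden = {}
    for field in fields:
        # Take the first non-empty value in newest-first order, else None.
        golden[field] = next((r[field] for r in newest_first if r[field]), None)
    return golden

golden = build_golden_record(fragments, ["name", "phone", "email"])
print(golden)
```

Here the name and phone number come from the newer billing record, while the email falls back to the older CRM record because the billing system holds none.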

  • Data Governance is Not Data Stewardship

Data governance ensures that the right people are assigned the right data responsibilities. Data Stewardship refers to the activities necessary to make sure that the data is accurate, in control, and easy to discover and process by the appropriate parties. Data governance is mostly about strategy, roles, organization, and policies, while data stewardship is all about execution and operationalization. Data stewards take care of data assets, making certain that the actual data is consistent with the data governance plan, linked with other data assets, and in control in terms of data quality, compliance, or security.

Data Governance Process

The Process Stages of Data Governance[3]
To truly manage data as a valued enterprise asset, data governance must be managed as a business function like finance or human resources. The primary business processes that enable data governance and stewardship (illustrated in the figure below) include processes that cleanse, repair, mask, secure, reconcile, escalate, and approve data discrepancies, policies, and standards. There are over twenty distinct processes segmented into four core process stages, all of which are iterative and encompass many parallel activities depending on the stage of maturity:

Figure: The Process Stages of Data Governance (source: Informatica)

  • Discover processes capture the current state of an organization’s data lifecycle, dependent business processes, supporting organizational and technical capabilities, as well as the state of the data itself. Insights derived from these steps are used to define the data governance strategy, priorities, business case, policies, standards, architecture, and the ultimate future-state vision. This stage runs in parallel with the Define stage and the two are iterative: Discovery drives Definition, and Definition drives more targeted focus for Discovery.
  • Define processes document data definitions and the business context associated with business terminology, taxonomies, and relationships, as well as the policies, rules, standards, processes, and measurement strategy that must be defined to operationalize data governance efforts. This stage runs in parallel with, and iterates with, the Discover stage, as mentioned above.
  • Apply processes aim to operationalize and ensure compliance with all the data governance policies, business rules, stewardship processes, workflows, and cross-functional roles and responsibilities captured through the Discover and Define process stages.
  • Measure and Monitor processes i) capture and measure the effectiveness and value generated from data governance and stewardship efforts, ii) monitor compliance with, and exceptions to, defined policies and rules, and iii) enable transparency and auditability into data assets and their life cycle.

A data governance initiative must build competencies, assign roles and responsibilities, and invest in technologies to enable these core processes, no matter the scope and scale of your business objectives. A pilot data governance project focusing on improving the quality or security of a single data item, such as a phone number, should follow the same approach as a holistic data governance function that’s managing all business-critical data assets.
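
As a rough illustration of how a single-item pilot might pass through the four stages, the sketch below applies them to phone numbers. The sample values and the format rule (a leading "+", then digit groups separated by single spaces) are assumptions made for the example, not a standard prescribed by the source.

```python
import re

# Hypothetical sample of phone numbers pulled from a source system.
phone_numbers = ["+1 555 0100", "555-0101", "", "+44 20 7946 0958", "n/a"]

# Discover: profile the current state of the data.
def profile(values):
    return {"total": len(values), "empty": sum(1 for v in values if not v.strip())}

# Define: document the standard as an executable rule (assumed format:
# a leading "+", then digit groups separated by single spaces).
PHONE_RULE = re.compile(r"^\+\d+( \d+)*$")

# Apply: check every value against the defined rule.
def apply_rule(values):
    return [(v, bool(PHONE_RULE.match(v))) for v in values]

# Measure and Monitor: report compliance and list exceptions for stewards.
def measure(results):
    exceptions = [v for v, ok in results if not ok]
    compliance = (len(results) - len(exceptions)) / len(results)
    return compliance, exceptions

stats = profile(phone_numbers)
compliance, exceptions = measure(apply_rule(phone_numbers))
print(stats, f"{compliance:.0%}", exceptions)
```

The exception list is what a data steward would work through, and the compliance ratio is the kind of figure that could be tracked over time as the rule is refined.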

The Importance of Data Governance

Why Data Governance Matters[4]
Most companies already have some form of data governance for individual applications or business departments, although it is not necessarily comprehensively institutionalized. The systematic introduction of data governance is therefore often an evolution from informal rules to formal control.

Formal data governance is normally implemented once a company has reached a size at which cross-functional tasks can no longer be implemented efficiently.

Data governance is a prerequisite for numerous tasks or projects and has many clear benefits:

  • Consistent, uniform data and processes across the organization are a prerequisite for better and more comprehensive decision support;
  • Increasing the scalability of the IT landscape at a technical, business and organizational level through clear rules for changing processes and data;
  • Central control mechanisms offer potential to optimize the cost of data management (increasingly important in the age of exploding data sets);
  • Increased efficiency through the use of synergies (e.g. by reusing processes and data);
  • Higher confidence in data through quality-assured and certified data as well as complete documentation of data processes;
  • Achieving compliance with regulatory guidelines, such as Basel III and Solvency II;
  • Security for internal and external data by monitoring and reviewing privacy policies;
  • Increased process efficiency by reducing long coordination processes (e.g. through clear requirements management);
  • Clear and transparent communication through standardization. This is the prerequisite for enterprise-wide data-centric initiatives;
  • Further, specific benefits result from the specific nature of each data governance program.

More than ever, data governance is vital for companies to remain responsive. It is also important for opening up new and innovative fields of business, for example through big data analyses, which do not permit the persistence of backward thinking and outdated structures.

Data Governance Framework

Framing Data Governance[5]
Data governance is a combination of strategy and execution (Jill Dyché and Evan Levy, Customer Data Integration: Reaching a Single Version of the Truth). It’s an approach that requires one to be both holistic and pragmatic:

  • Holistic. All aspects of data usage and maintenance are taken into account in establishing the vision.
  • Pragmatic. Political challenges and cross-departmental struggles are part of the equation. So, the tactical deployment must be delivered in phases to provide quick “wins” and avert organizational fatigue from a larger, more monolithic exercise.

To accomplish this, data governance must touch all internal and external IT systems and establish decision-making mechanisms that transcend organizational silos. It must also provide accountability for data quality at the enterprise level. The SAS Data Governance framework (illustrated below) is a comprehensive framework for data governance that includes all the components needed to achieve a holistic, pragmatic data governance approach.

Figure: Data Governance Framework (source: SAS)

The top portion of the framework – Corporate Drivers – deals with more strategic aspects of governance, including the corporate drivers and strategies that point to the need for data governance. The Data Governance and Methods sections refer to the organizing framework for developing and monitoring the policies that drive data management outcomes such as data quality, definition, architecture and security.

The Data Management, Solutions and Data Stewardship sections focus on the tactical execution of the governance policies, including the day-to-day processes required to proactively manage data and the technology required to execute those processes.

While the framework can be implemented incrementally, there are significant benefits in establishing a strategy to deploy additional capabilities as the organization matures and the business needs require new components. It’s important to develop a strategy that can address short-term needs while establishing a more long-term governance capability.

Benefits of Data Governance

Benefits of Data Governance Include:[6]

  • Decrease the costs associated with other areas of Data Management.
  • Ensure accurate procedures around regulation and compliance activities.
  • Increase transparency within any data-related activities.
  • Help with instituting better training and educational practices around the management of data assets.
  • Increase the value of an organization’s data.
  • Provide standardized data systems, data policies, data procedures, and data standards.
  • Aid in the resolution of past and current data issues.
  • Facilitate improved monitoring and tracking mechanisms for Data Quality and other data-related activities.
  • Increase overall enterprise revenue.

Data Governance Drivers and Regulations

The Drivers and Regulatory Requirements of Data Governance[7]
While data governance initiatives can be driven by a desire to improve data quality, they are more often driven by C-level leaders responding to external regulations. In a recent report conducted by the CIO WaterCooler community, 54% of respondents stated that the key driver was efficiencies in processes, 39% regulatory requirements, and only 7% customer service. Examples of these regulations include the Sarbanes–Oxley Act, Basel I, Basel II, HIPAA, GDPR, cGMP, and a number of data privacy regulations. To achieve compliance with these regulations, business processes and controls require formal management processes to govern the data subject to these regulations. Successful programs identify drivers meaningful to both supervisory and executive leadership.

Common themes among the external regulations center on the need to manage risk. The risks can be financial misstatement, inadvertent release of sensitive data, or poor data quality for key decisions. Methods to manage these risks vary from industry to industry. Examples of commonly referenced best practices and guidelines include COBIT, ISO/IEC 38500, and others. The proliferation of regulations and standards creates challenges for data governance professionals, particularly when multiple regulations overlap the data being managed. Organizations often launch data governance initiatives to address these challenges.

Data Governance Success Factors

Successful Efforts in Data Governance[8]
The following factors are critical to the success of your data governance efforts:

  • Recognize that IT governance is the real goal. Data governance is just one part of your IT governance program, and it's highly coupled to other aspects such as development governance, security governance, and so on. Focusing just on data governance puts you at risk of optimizing data governance in such a way that it doesn't work well with the rest of your governance efforts, putting the entire governance program at risk.
  • The governance effort must be owned. If someone is not responsible for the IT governance effort it will very likely die a quick death in your organization. The people most suited to be the owners of IT governance are the executive business stakeholders. When it comes to data governance, the people least suited to govern are data management professionals as they're the ones being governed (amongst others). In short, don't let your data governance efforts devolve into yet another political ploy of your data management group.
  • Have clear, quantifiable goals. What are you trying to achieve? Improved quality? Improved productivity? Improved time to value? Improved stakeholder satisfaction? Combinations thereof?
  • Measure and honestly report the results. It's easy to talk about quantifiable goals, but it takes a fair bit of integrity to live up to promises. It's easy to manage direct costs such as the salaries of the people involved with the governance effort, but a bit more difficult to measure indirect costs such as the opportunity cost of potentially lengthening decision and development life cycles due to increased governance (this issue is particularly acute for traditional governance strategies). Measuring the benefits can also be challenging, although as Doug Hubbard points out in How to Measure Anything: Finding the Value of Intangibles in Business, it is possible if you think outside the box a bit. Automating metrics collection is an important aspect of lean governance.
  • Less is more. You need a lot less governance than what the pro-governance people believe, although probably a bit more governance than what the anti-governance people think. If you find that you need more governance it's a lot easier to add it than it is to remove unnecessary governance activities once they're in place.
  • Educate the people affected. If the people involved, including those being governed, don't understand what you're trying to achieve or don't believe that any additional effort on their part is worthwhile, then your governance effort will quickly fall apart. Furthermore, this sort of education must be ongoing.

Data Governance Challenges

The Challenges of Data Governance[9]
Data Governance is one of the biggest challenges that companies face and failing to get it right can not only result in financial implications, but reputational damage too.

  • Data Volumes Are Growing: On one hand, the fact that businesses are generating more and more data is a great thing; it shows that they are expanding and becoming more complex. However, it also means that data governance becomes more complex, as you must consider each individual piece of data, its sensitivity, and its storage and distribution needs. In order for your data governance strategy to be effective, you must maintain an inventory of all your data, which is naturally more challenging the more data you have.
  • Encouraging Employee Compliance: One of the biggest challenges in data governance is ensuring that your employees comply with your overall data strategy. Non-compliance is often born of a lack of understanding rather than a lack of desire. To combat this, you can present your data governance strategy in a centralised and easy-to-access location, and you might also invite your employees to give their own opinions and suggestions to improve your strategy and make it more accessible. Employees are more likely to comply with procedures they have helped to create, as these will not feel like such a restriction.
  • Ensuring Accountability: One of the most important aspects of data governance is assigning accountability. Under the GDPR, it is compulsory for certain businesses to appoint a Data Protection Officer, who will be accountable for data governance as a whole. However, even if your business is not legally required to do so, it may be prudent to establish a data governance council (with representatives from each department) to define data procedures and policies. This will help to individualise accountability, as procedure and policy can be implemented on a smaller scale, to ensure the business’ overarching data governance strategy is effective.
  • Dealing with Redundant Data: It is only natural that a business will accumulate data that it no longer needs. However, it is important that you have a policy for dealing with this redundant data, as storing it unnecessarily will just make the management of your most valuable data even more of a challenge. Identifying redundant data is only one step; perhaps the most important step is to ensure that your data governance strategy is followed and that the data is disposed of securely.
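
The inventory and redundant-data points above can be sketched together. The example below is a minimal illustration, assuming a hypothetical `DataAsset` record with an owner, a sensitivity label, and a retention period in days; the asset names and retention values are invented for the example.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class DataAsset:
    name: str
    owner: str            # accountable person or department
    sensitivity: str      # e.g. "public", "internal", "confidential"
    last_used: date
    retention_days: int   # how long unused data may be kept

# Hypothetical inventory entries.
inventory = [
    DataAsset("customer_master", "Sales", "confidential", date(2024, 5, 1), 3650),
    DataAsset("campaign_2015_leads", "Marketing", "internal", date(2016, 2, 1), 730),
    DataAsset("web_logs_2019", "IT", "internal", date(2020, 1, 15), 365),
]

def redundant_assets(assets, today):
    """Return assets that have been unused for longer than their retention period."""
    return [a for a in assets if (today - a.last_used).days > a.retention_days]

for asset in redundant_assets(inventory, today=date(2024, 6, 1)):
    print(f"{asset.name}: flag for secure disposal, notify {asset.owner}")
```

Recording an owner on every entry ties the inventory back to accountability: each flagged asset has someone responsible for confirming and carrying out its disposal.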

Data Governance Best Practices

Essentially, there are 4 data governance best practices when launching a data governance program:[10]
1. Focus on the operating model: The operating model is the basis for any data governance program. It includes activities such as defining enterprise roles and responsibilities across the different lines of business. The idea is to establish an enterprise governance structure. Depending on the type of organization, the structure could be centralized (if a central authority manages everything), decentralized (if operated by a group of authorities), or federated (if controlled by independent or multiple groups with little or no shared ownership).
2. Identify Data Domains: After establishing the data governance structure, the next step is to determine the data domains for each line of business. The most famous examples include customer, vendor, and product data domains. Depending on the type of industry, we come across different kinds of domains. But everything boils down to identifying domains and capturing information about a business and its consumers.
3. Identify critical data elements within the data domains: Once the data domains are defined, it becomes clear that they touch tens, hundreds, or even thousands of systems and applications containing key reports, critical data elements, business processes, and more. We don’t want to boil the ocean by focusing on all the data artifacts at once. Instead, we should identify only what’s critical to the business.
4. Define control measurements: The next step is to set and maintain controls to sustain the data governance program. Data governance is not a one-time project. It is an ongoing program to fuel data-driven decision making and create opportunities for the business. It prepares an organization to meet business standards. Control measurements include the following key activities:

  • Define automated workflow processes and thresholds for approval, escalation, review, voting, issue management and more
  • Apply workflow processes to the governance structure, data domains, and critical data elements
  • Develop reporting on the progress of steps 1 through 4
  • Capture feedback through automated workflow processes
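
One way to picture the thresholds mentioned in the control measurements is as a small routing function. The sketch below is illustrative only: the threshold values, the element names, and the three actions (approve, review, escalate) are assumptions, not part of the cited best practices.

```python
# Hypothetical minimum quality scores shared by all critical data elements.
THRESHOLDS = {"approve": 0.98, "review": 0.90}

def route(quality_score, thresholds=THRESHOLDS):
    """Map a data quality score (0.0 to 1.0) to a workflow action."""
    if quality_score >= thresholds["approve"]:
        return "approve"
    if quality_score >= thresholds["review"]:
        return "review"       # a steward checks the exceptions
    return "escalate"         # goes to the governance council

# Hypothetical scores per critical data element.
scores = {"customer.email": 0.99, "customer.phone": 0.93, "vendor.tax_id": 0.71}
actions = {element: route(score) for element, score in scores.items()}
print(actions)
```

Keeping the thresholds in one place makes them easy to report on and to adjust as feedback from the workflow comes in.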

See Also

IT Governance
IT Governance Framework
Corporate Governance
Big Data
Master Data
Master Data Management (MDM)
Data Mining
Data Management
Data Warehouse
Customer Data Management (CDM)
Business Intelligence
Data Analysis
Data Analytics
Predictive Analytics


  1. Definition: What is Data Governance? DAMA International
  2. What Data Governance is Not Talend
  3. The Process Stages of Data Governance Rob Karel
  4. Why Data Governance Matters
  5. Framing Data Governance SAS
  6. What are the Benefits of Data Governance? Dataversity
  7. The Drivers and Regulatory Requirements of Data Governance Wikipedia
  8. Data Governance Success Factors Agile Data
  9. What are the Challenges of Data Governance EOL
  10. What’s the data governance best practices you should start with when kicking off a governance program? Collibra

Further Reading

  • Effective Data Governance Infosys
  • An Overview of Data Governance Zhang Ning
  • Data Governance is Imperative for Big Data Analytics Ahima
  • The Role Of Data Governance In An Effective Compliance Program Thomas Sehested
  • The Ethics Of Data Governance - 'Data Comes With Benefits And Liabilities' Charles Towers-Clark