Data Access

What is Data Access?

Data Access refers to a user's ability to access or retrieve data stored within a database or other repository. Users who have data access can store, retrieve, move, or manipulate stored data, which can be stored on a wide range of hard drives and external devices.[1] Data access works with complementary technologies, including data virtualization and master data management, to put your data to work on-premises, in the cloud, and everywhere in between.

Types of Data Access

Data access is one of the main outputs of effective data governance programs. Organizations should ideally have well-thought-out, structured means of granting data access to different users. This is reinforced by the permissions and levels of security required for data access. Frequently, these permissions are based on organizational roles or responsibilities, which are structured according to data governance policies. When data is at rest in a repository, there are two basic ways of accessing it, sequential access and random access:

  • Sequential access moves through the data on a disk in order until the requested data is found. Each data segment is read (in sequential order) until the sought-after data is found, which can tax computational resources. Still, this method is often faster than random access when much of the data has to be read anyway, because it requires fewer seek operations than random access does.
  • Random access stores or retrieves data from anywhere on the disk. The advantage of this approach is that not all data has to be read in sequential order to find what a user’s looking for. Also, the data is located in constant time, which means there’s an upper limit to how long it will take for it to be retrieved. When that limit is less than how long it could take to sequentially read and retrieve data, random access is preferable.[2]
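
The difference between the two access methods can be sketched with a small file of fixed-size records. This is an illustration only: the record layout (a 4-byte id plus a 12-byte name), the function names, and the assumption that each record's id equals its position in the file are all hypothetical and not taken from any particular storage system.

```python
import struct

# Hypothetical record layout: 4-byte unsigned id + 12-byte name (16 bytes total).
RECORD = struct.Struct("<I12s")


def write_records(path, count):
    """Write `count` fixed-size records with ids 0..count-1."""
    with open(path, "wb") as f:
        for i in range(count):
            f.write(RECORD.pack(i, f"user{i}".encode()))


def find_sequential(path, wanted_id):
    """Sequential access: read every record from the start until the id matches.

    Returns (name, number_of_records_read).
    """
    reads = 0
    with open(path, "rb") as f:
        while chunk := f.read(RECORD.size):
            reads += 1
            rec_id, name = RECORD.unpack(chunk)
            if rec_id == wanted_id:
                return name.rstrip(b"\x00").decode(), reads
    return None, reads


def find_random(path, wanted_id):
    """Random access: seek directly to the record's byte offset.

    This only works because records are fixed-size and, by assumption,
    a record's id equals its position in the file.
    """
    with open(path, "rb") as f:
        f.seek(wanted_id * RECORD.size)
        chunk = f.read(RECORD.size)
    if len(chunk) < RECORD.size:
        return None, 1
    _, name = RECORD.unpack(chunk)
    return name.rstrip(b"\x00").decode(), 1
```

With 100 records written, looking up id 42 reads 43 records sequentially but only one record after a single seek. If every record had to be processed anyway, the sequential loop would do the job with no seeks at all.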

Data Access Use Cases

Data access use cases include:

  • Reporting: Business users often need to compile information from legacy, relational, and non-relational data sources. With data access tools, end users can get the data they need, so IT can reduce internal support requests and free up valuable time. Users can run queries and create customized reports based on their own needs, saving them for distribution in a wide range of file formats, including Excel®, PDF, HTML, image, and delimited.
  • Web access: Most companies have mission-critical data running to and from their website, customer portal, or intranet for services that include e-commerce, customer support, service chain, and supply chain management. All of these web-based applications require a data access platform if they have legacy systems in place to store their information.
  • Application modernization: Modernizing data access with the web enables companies to engage with customers, suppliers, partners, and employees in more far-reaching and powerful ways, putting data "in the trenches" where it can be used effectively to keep operations moving forward.[3]

Data Access Control

Data access control is a fundamental security tool that enables you to restrict access based on a set of policies. By implementing robust policies for your data access, you help keep personally identifiable information (PII), intellectual property, and other confidential information from getting into the wrong hands, whether internally or externally. Data access control works by first verifying the user’s identity to ensure they are who they claim to be (authentication), and then confirming that the user has the right to access the data (authorization).
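
The two checks described above, verifying identity and then verifying the right to access, can be sketched as follows. The user names, tokens, and policy table are hypothetical in-memory stand-ins; a real deployment would typically delegate these to an identity provider and a policy store.

```python
import hmac

# Hypothetical credential and policy stores, for illustration only.
API_TOKENS = {"alice": "token-a", "bob": "token-b"}
POLICIES = {
    "alice": {"customers": {"read", "write"}},
    "bob": {"customers": {"read"}},
}


def authenticate(user, token):
    """Step 1: is the user who they claim to be?"""
    expected = API_TOKENS.get(user)
    if expected is None:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected.encode(), token.encode())


def authorize(user, resource, action):
    """Step 2: does this user's policy allow the action on the resource?"""
    return action in POLICIES.get(user, {}).get(resource, set())


def access(user, token, resource, action):
    """Run both checks in order and report the outcome."""
    if not authenticate(user, token):
        return "denied: authentication failed"
    if not authorize(user, resource, action):
        return "denied: not authorized"
    return "granted"
```

In this sketch, `access("bob", "token-b", "customers", "write")` is rejected at the second step: the identity is valid, but the policy grants bob read access only.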

How Organizations Approach Data Access

When organizations are putting together data governance frameworks, they have to manage a variety of concerns, some of which seem to contradict each other. There are three primary concerns that organizations always have to consider when forming data access policies. This is the CIA Triad:

  • Confidentiality: No one should be able to access data unless authorized to do so.
  • Integrity: The system should never allow a data operation that will cause errors or data loss.
  • Availability: Whenever someone has a legitimate business need, they should have unfettered access to data.

Addressing these concerns is often a delicate balancing act. There has to be a careful equilibrium between security and ease of use.

Each organization devises a data access policy that suits its specific needs. These policies often emerge over time, and they can evolve as the business grows. When building a data access policy, companies will typically walk through the following steps:

  • Categorize the Data: Not all data is the same. Data access is rarely one-size-fits-all. Typically, organizations build flexible policies that suit various circumstances. There are different categories, and each of these categories will have its own data access policy. The main categories are:
    • PII: Personally Identifiable Information is sensitive information about actual people, such as their names, addresses, or social security numbers. Companies may have a regulatory responsibility to protect this data. As such, this data requires tight access controls.
    • Sensitive business information: Leaked internal information can threaten a company’s position. This might include unpublished financial records or analytics results. Such data calls for a very strict access policy.
    • Low-risk data: Some data may not present a major security risk, such as pseudonymized customer data or publicly available company information. A more relaxed access policy might be appropriate here.
    • System information: This is data that other systems generate automatically, such as network logs and error reports. This information rarely requires strong access controls.
  • Review Compliance Requirements: Regulations can have a tremendous impact on data access policy. For instance, Europe’s GDPR rules specify that employees can only access PII when they have a legitimate business purpose. The law also restricts international data transfers, which could affect data access if an organization uses a cloud service based abroad. In general, companies need to contemplate the following questions:
    • What local laws impact our data access policy? (e.g., CCPA)
    • What industry laws impact data access? (e.g., HIPAA)
    • Are we transacting in areas with tougher laws? (for example, GDPR applies to American companies that do business with European customers)
    • How can we anticipate new laws and future-proof our data access policy?
  • Build a Centralized Data Structure: Data access management is tricky, but it’s easier with centralized data. For instance, picture a company with a dozen discrete systems. That company may need to establish a dozen individual data access policies to cover each one of those systems. Alternatively, companies can store all data in a central repository, such as a data warehouse. They would usually implement this by using an Extract, Transform, Load (ETL) process. The ETL will pull data from each source, integrate it, and then load it to a central location. This approach is perhaps the best way to tackle the CIA Triad. Centralized data is of good quality and easily available. Companies can also limit access to the data warehouse, which helps ensure confidentiality.
  • Grant Role-Based Access: So who gets to access the data? This is the biggest question concerning data access, and it gets harder to answer as an organization grows. If a company has five staff members, then a database administrator can set personal access levels. When there are 5,000 employees, that’s not possible. Role-based access is the most elegant solution to this problem. The administrators create a set of roles based on job title, seniority, and other factors. They then assign each user to one of these roles. If the user changes positions, the admins don’t manually reconfigure their permissions. Instead, they just assign them a new role. This helps get the data access balance right. An example of this is customer data. Salespeople and service agents will both need to see this data, but they might look at different subsets. With role-based access, you can configure data access so that service agents can view active customers but not view unconverted leads.
  • Log Data Transactions: The last element of data access is logging. Organizations should build their data infrastructure in a way that offers visibility and accountability. When someone performs an unauthorized data action, the system should have a log of exactly what happened. This is another area where ETL makes a difference. ETL can power a data pipeline that connects each database. It can also help to keep track of transactions that occur within the ETL pipeline so that admins have a paper trail if something goes wrong. This is an important step in ensuring a strong, functional data access policy.[4]
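
The role-based access and logging steps above can be sketched together. This is a minimal illustration, assuming two hypothetical roles and two hypothetical datasets modeled on the customer-data example; the class and field names are invented for this sketch and do not come from any specific product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical role definitions: which datasets each role may read.
ROLE_PERMISSIONS = {
    "sales": {"active_customers", "unconverted_leads"},
    "service_agent": {"active_customers"},
}


@dataclass
class AccessManager:
    user_roles: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def assign_role(self, user, role):
        """Changing a user's position only requires assigning a new role."""
        if role not in ROLE_PERMISSIONS:
            raise ValueError(f"unknown role: {role}")
        self.user_roles[user] = role

    def can_read(self, user, dataset):
        """Check the user's role and record the attempt, allowed or not."""
        role = self.user_roles.get(user)
        allowed = dataset in ROLE_PERMISSIONS.get(role, set())
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "dataset": dataset,
            "allowed": allowed,
        })
        return allowed
```

A service agent can read `active_customers` but not `unconverted_leads`. After `assign_role` moves that user to `sales`, both reads succeed without any per-user permission edits. Every attempt, including denied ones, leaves an entry in `audit_log` for administrators to review.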

Best Practices in Secure Data Access

Although data access control is essential for protecting your organization's sensitive data, it isn’t sufficient on its own to keep that data from being compromised. The following best practices help secure data access.

  1. Encrypt Data at Rest and in Transit: Data encryption is one of the best ways to ensure better data access control, but it has to be done both when data is at rest and when it is in transit. Data in transit is encrypted before it is transmitted, and the system endpoints are authenticated. The data is decrypted when access is granted and the user retrieves it. For encryption at rest, the data stored in the warehouse or repository is encrypted.
  2. Use Proper Authentication: It is also important to maintain strong, proper authentication for all of your data. Otherwise, your data access protocols won’t be able to protect your systems from data theft and breaches.
  3. Set Network Access Control: Network access control combines endpoint security, user authentication, and network security to keep unauthorized users and devices from penetrating or compromising a private network. When the network is protected, it is easier for administrators to enforce stricter access control policies.
  4. Have Clear Security Policies: The key to secure data access is to have clear and defined security policies that govern and protect the access and transmission of data. In this regard, you can also apply the principle of least privilege, which dictates that if a user doesn’t require access to certain data or functions, then they shouldn’t be granted access to them.[5]
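
As one illustration of the authentication practice above, the sketch below stores a salted password hash instead of the password itself and verifies later login attempts against it. It uses PBKDF2 from the Python standard library. The function names and the iteration count are assumptions made for this example, not recommendations from the sources cited here.

```python
import hashlib
import hmac
import os

# Assumed work factor for this sketch; tune it for your own hardware and policy.
ITERATIONS = 200_000


def hash_password(password, salt=None):
    """Derive a salted hash; only the salt and digest are stored."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

A login check would call `verify_password` with the salt and digest saved at registration. Because only the derived digest is stored, a leaked credentials table does not directly reveal user passwords.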

See Also

Data access refers to the methods and processes involved in retrieving and manipulating data from databases, data warehouses, or other data storage systems. It's a foundational concept in fields like database management, software development, and information technology, playing a crucial role in ensuring that data is available, secure, and usable for applications and end-users.

  • Database Management System (DBMS): Exploring the software tools that enable users to store, modify, and extract information from a database. Discussions can include types of DBMS, such as relational, NoSQL, and NewSQL databases.
  • SQL (Structured Query Language): Covering the standard language used to query and manage data in relational databases. SQL topics might include data query, data manipulation, and database schema creation commands.
  • Data Warehouse: Discussing large repositories of corporate data, collected from different sources and optimized for analysis and reporting. This includes topics on data warehouse design, ETL (extract, transform, load) processes, and data mart.
  • Data Privacy and Data Security: Highlighting the importance of protecting data from unauthorized access and ensuring that personal and sensitive information is handled in compliance with legal and ethical standards.
  • APIs (Application Programming Interfaces): Exploring how APIs enable different software applications to communicate with each other and access data services, including RESTful APIs and SOAP (Simple Object Access Protocol) for web services.
  • Data Governance: Covering the process of managing the availability, usability, integrity, and security of the data in enterprise systems, based on internal data standards and policies that also control data usage.
  • Big Data Technologies: Discussing the tools and techniques used to process and analyze large volumes of data that cannot be handled by traditional database systems, such as Hadoop and Spark.
  • Cloud Storage and Databases: Exploring how data is stored, managed, and accessed in cloud environments, including discussions on cloud database services like Amazon RDS, Google Cloud SQL, and Microsoft Azure SQL Database.
  • Data Integration: Covering the processes involved in combining data from different sources to provide a unified view, including data consolidation, data federation, and data virtualization techniques.
  • OLAP (Online Analytical Processing): Discussing technologies that allow users to analyze multidimensional data from multiple perspectives, including concepts like OLAP cubes and dimensions.
  • ORM (Object-Relational Mapping): Exploring tools and techniques that facilitate the conversion of data between incompatible type systems in object-oriented programming languages, simplifying data access in software development.
  • Transaction Management: Covering the principles and mechanisms for managing database transactions to ensure data integrity and consistency, including concepts like ACID (Atomicity, Consistency, Isolation, Durability) properties.
  • Data Access Control