Business Dictionary defines Data Analysis as "the process of evaluating data using analytical and logical reasoning to examine each component of the data provided". This form of analysis is just one of the many steps that must be completed when conducting a research experiment. Data from various sources is gathered, reviewed, and then analyzed to form a finding or conclusion. There are a variety of specific data analysis methods, including data mining, text analytics, business intelligence, and data visualization.[1]
Types of Data Analysis[2]
Categories of Data Analysis
- Descriptive Analysis: This is usually the first type of data analysis conducted. It describes the main aspects of the data being analyzed. For example, it may describe how well a football player is performing by looking at the number of touchdowns, which allows comparisons among different athletes.
- Exploratory Analysis: This looks for previously unknown relationships in the data. It is a useful way to find new connections and to inform future recommendations.
- Inferential Analysis: This uses a small sample to draw conclusions about a larger population. For instance, a researcher might look at the grades of first graders to explain how well the entire elementary school is doing.
- Predictive Analysis: This predicts future events by examining current and past facts.
- Causal Analysis: This is used to find out what happens to one variable when another variable is changed. For example, if police give out tickets for texting while driving, this may cause fewer accidents to occur.
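The difference between descriptive and inferential analysis can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the function names `describe` and `estimate_population_mean`, the touchdown counts, and the grades are all hypothetical, and the interval uses a simple normal approximation.

```python
import math
import statistics

def describe(values):
    """Descriptive analysis: summarize the main aspects of a data set."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }

def estimate_population_mean(sample, z=1.96):
    """Inferential analysis: estimate a population mean from a sample.

    Returns (mean, low, high) using a normal approximation;
    z=1.96 corresponds to roughly 95% confidence.
    """
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, mean - z * standard_error, mean + z * standard_error

# Hypothetical touchdowns per season for two players (illustrative only).
player_a = [10, 12, 8, 15, 11]
player_b = [6, 9, 7, 5, 8]
summary_a = describe(player_a)
summary_b = describe(player_b)

# Hypothetical sample of grades drawn from a larger school population.
sample_grades = [72, 85, 90, 66, 78, 88, 95, 70]
mean, low, high = estimate_population_mean(sample_grades)
print(summary_a, summary_b)
print(f"estimated mean {mean:.1f}, interval ({low:.1f}, {high:.1f})")
```

The summaries only describe the data at hand, while the interval makes a hedged claim about the wider population the sample was drawn from. With a sample this small, a t-distribution would usually be preferred over the normal approximation.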
Considerations/Issues in Data Analysis[5]
- Quantitative Data Analysis: In quantitative data analysis, you are expected to turn raw numbers into meaningful data through the application of rational and critical thinking. The same figure within a data set can be interpreted in many different ways; therefore, it is important to apply fair and careful judgement. For example, questionnaire findings of a research project titled “A study into the impacts of informal management-employee communication on the levels of employee motivation: a case study of Agro Bravo Enterprise” may indicate that the majority (52%) of respondents assess the communication skills of their immediate supervisors as inadequate. This specific piece of primary data needs to be critically analyzed and objectively interpreted by comparing it to other findings within the framework of the same research, such as the organizational culture of Agro Bravo Enterprise, the leadership styles exercised, and the frequency of management-employee communications.[3]
- Qualitative Data Analysis: Qualitative data analysis is the process of moving from the raw data collected as part of a research study to explanations, understanding, and interpretation of the phenomena, people, and situations being studied. The aim of analyzing qualitative data is to examine its meaningful and symbolic content. The goal is to identify and understand concepts, situations, and ideas such as:
>A person’s interpretation of the world/situation in which they find themselves at any given moment.
>How they come to have that point of view of their situation or environment in which they find themselves.
>How they relate to others within their world.
>How they cope within their world.
>Their own view of their history and the history of others who share their own experiences and situations.
>How they identify and see themselves and others who share their own experiences and situations.[4]
Figure 1. A Diagrammatic Illustration of the Data Analysis Process
Researchers should be aware of a number of issues with respect to data analysis. These include:
- Having the necessary skills to analyze
- Concurrently selecting data collection methods and appropriate analysis
- Drawing unbiased inference
- Inappropriate subgroup analysis
- Following acceptable norms for disciplines
- Determining statistical significance
- Lack of clearly defined and objective outcome measurements
- Providing honest and accurate analysis
- Manner of presenting data
- Environmental/contextual issues
- Data recording method
- Partitioning ‘text’ when analyzing qualitative data
- Training of staff conducting analyses
- Reliability and Validity
- Extent of analysis
source: Barbara Fusinska
Phases in the Data Analysis Process (Figure 2.)[6]
The data analysis process consists of the following phases, which are iterative in nature:
- Data Requirements Specification: The data required for analysis is based on a question or an experiment. Based on the requirements of those directing the analysis, the data necessary as inputs to the analysis is identified (e.g., Population of people). Specific variables regarding a population (e.g., Age and Income) may be specified and obtained. Data may be numerical or categorical.
- Data Collection: Data Collection is the process of gathering information on targeted variables identified as data requirements. The emphasis is on ensuring accurate and honest collection of data. Data Collection ensures that data gathered is accurate such that the related decisions are valid. Data Collection provides both a baseline to measure and a target to improve.
- Data Processing: The data that is collected must be processed or organized for analysis. This includes structuring the data as required for the relevant Analysis Tools. For example, the data might have to be placed into rows and columns in a table within a Spreadsheet or Statistical Application. A Data Model might have to be created.
- Data Cleaning: The processed and organized data may be incomplete, contain duplicates, or contain errors. Data Cleaning is the process of preventing and correcting these errors. There are several types of Data Cleaning, depending on the type of data. For example, when cleaning financial data, certain totals might be compared against reliable published numbers or defined thresholds. Likewise, quantitative methods can be used to detect outliers, which are subsequently excluded from the analysis.
- Data Analysis: Data that has been processed, organized, and cleaned is ready for analysis. Various data analysis techniques are available to understand, interpret, and derive conclusions based on the requirements. Data Visualization may also be used to examine the data in graphical format and to obtain additional insight into the messages within the data. Statistical data models such as correlation and regression analysis can be used to identify the relations among the data variables. These models, which are descriptive of the data, help to simplify the analysis and communicate results. The process might require additional Data Cleaning or additional Data Collection, and hence these activities are iterative in nature.
- Communication: The results of the data analysis are reported in the format required by the users, to support their decisions and further action. Feedback from the users might result in additional analysis. Data analysts can choose data visualization techniques, such as tables and charts, that help communicate the message clearly and efficiently to the users. Analysis tools make it possible to highlight the required information with color codes and formatting in tables and charts.
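The cleaning and analysis phases above can be sketched with Python's standard library. This is only an illustrative sketch under stated assumptions: the records, the helper names `clean`, `remove_outliers`, and `pearson`, and the choice of the interquartile range (IQR) rule for outliers are not taken from the source, which does not prescribe a specific technique.

```python
import math
import statistics

def clean(rows):
    """Data Cleaning: drop incomplete rows and exact duplicates, keeping order."""
    seen = set()
    cleaned = []
    for row in rows:
        key = (row["age"], row["income"])
        if None in key or key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned

def remove_outliers(rows, field, k=1.5):
    """Exclude rows whose `field` lies outside the IQR fences (an assumed rule)."""
    values = [row[field] for row in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [row for row in rows if low <= row[field] <= high]

def pearson(xs, ys):
    """Data Analysis: Pearson correlation between two variables."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Data Processing: hypothetical records arranged as rows with named columns.
raw = [
    {"age": 25, "income": 30000},
    {"age": 32, "income": 42000},
    {"age": 32, "income": 42000},   # duplicate
    {"age": 41, "income": None},    # incomplete
    {"age": 47, "income": 58000},
    {"age": 52, "income": 61000},
    {"age": 38, "income": 50000},
    {"age": 29, "income": 900000},  # probable entry error
]

cleaned = clean(raw)
kept = remove_outliers(cleaned, "income")
r = pearson([row["age"] for row in kept], [row["income"] for row in kept])
print(f"{len(raw)} raw rows, {len(kept)} analyzed, correlation {r:.2f}")
```

In practice the outlier rule and the handling of incomplete rows depend on the data and the question being asked. An excluded record may also prompt further data collection, which is why the phases are described as iterative.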
source: Tutorials Point
Benefits and Challenges of Data Analysis[7]
Data analysis is a proven way for organizations and enterprises to gain the information they need to make better decisions, serve their customers, and increase productivity and revenue. Its benefits are numerous; some of the most rewarding include getting the right information for the business, getting more value out of IT departments, creating more effective marketing campaigns, and gaining a better understanding of customers. However, so much data is available today that analyzing it is a challenge. In particular, handling and presenting all of the data are two of the most challenging aspects of data analysis. Traditional architectures and infrastructures are not able to handle the sheer amount of data being generated today, and decision makers find that it takes longer than anticipated to get actionable insight from the data. Data management solutions and customer experience management solutions can give enterprises the ability to listen to customer interactions, learn from behavior and contextual information, create more effective actionable insights, and execute more intelligently on those insights in order to optimize and engage targets and improve business practices.
Barriers to Effective Analysis[8]
Barriers to effective analysis may exist among the analysts performing the data analysis or among the audience. Distinguishing fact from opinion, cognitive biases, and innumeracy are all challenges to sound data analysis.
- Confusing Fact and Opinion: Effective analysis requires obtaining relevant facts to answer questions, support a conclusion or formal opinion, or test hypotheses. Facts by definition are irrefutable, meaning that any person involved in the analysis should be able to agree upon them. For example, in August 2010, the Congressional Budget Office (CBO) estimated that extending the Bush tax cuts of 2001 and 2003 for the 2011-2020 time period would add approximately $3.3 trillion to the national debt. Everyone should be able to agree that indeed this is what CBO reported; they can all examine the report. This makes it a fact. Whether persons agree or disagree with the CBO is their own opinion. As another example, the auditor of a public company must arrive at a formal opinion on whether financial statements of publicly traded corporations are "fairly stated, in all material respects." This requires extensive analysis of factual data and evidence to support their opinion. When making the leap from facts to opinions, there is always the possibility that the opinion is erroneous.
- Cognitive Biases: There are a variety of cognitive biases that can adversely affect analysis. For example, confirmation bias is the tendency to search for or interpret information in a way that confirms one's preconceptions. In addition, individuals may discredit information that does not support their views. Analysts may be trained specifically to be aware of these biases and how to overcome them. In his book Psychology of Intelligence Analysis, retired CIA analyst Richards Heuer wrote that analysts should clearly delineate their assumptions and chains of inference and specify the degree and source of the uncertainty involved in the conclusions. He emphasized procedures to help surface and debate alternative points of view.
- Innumeracy: Effective analysts are generally adept with a variety of numerical techniques. However, audiences may not have such literacy with numbers, or numeracy; they are said to be innumerate. Persons communicating the data may also be attempting to mislead or misinform, deliberately using bad numerical techniques. For example, whether a number is rising or falling may not be the key factor. More important may be the number relative to another number, such as the size of government revenue or spending relative to the size of the economy (GDP) or the amount of cost relative to revenue in corporate financial statements. This numerical technique is referred to as normalization[8] or common-sizing. There are many such techniques employed by analysts, whether adjusting for inflation (i.e., comparing real vs. nominal data) or considering population increases, demographics, etc. Analysts apply a variety of techniques to address the various quantitative messages and may also analyze data under different assumptions or scenarios. For example, when analysts perform financial statement analysis, they will often recast the financial statements under different assumptions to help arrive at an estimate of future cash flow, which they then discount to present value based on some interest rate, to determine the valuation of the company or its stock. Similarly, the CBO analyzes the effects of various policy options on the government's revenue, outlays and deficits, creating alternative future scenarios for key measures.
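The numerical techniques mentioned under innumeracy (common-sizing, adjusting nominal figures for inflation, and discounting future cash flows to present value) can be illustrated with a short Python sketch. The function names and all figures below are hypothetical and chosen for easy arithmetic; real analyses would use published price indices and a justified discount rate.

```python
def common_size(amount, base):
    """Normalization: express an amount as a share of a base figure,
    e.g. cost relative to revenue or spending relative to GDP."""
    return amount / base

def to_real(nominal, price_index, base_index=100.0):
    """Convert a nominal amount to real terms using a price index."""
    return nominal * base_index / price_index

def present_value(cash_flows, rate):
    """Discount future cash flows (one per year, starting in year 1)
    to present value at a constant annual interest rate."""
    return sum(
        cash_flow / (1 + rate) ** year
        for year, cash_flow in enumerate(cash_flows, start=1)
    )

# Hypothetical figures for illustration only.
cost_share = common_size(60.0, 200.0)      # cost is 30% of revenue
real_amount = to_real(110.0, 110.0)        # 110 nominal at index 110 -> 100 real
value = present_value([110.0, 121.0], 0.10)  # two yearly cash flows at 10%

print(f"cost share {cost_share:.0%}, real {real_amount:.1f}, PV {value:.1f}")
```

Recasting a statement under different assumptions then amounts to calling `present_value` with alternative cash-flow estimates or rates and comparing the results.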
See Also
Big Data, Business Intelligence, Data Analytics, Predictive Analytics, Data Cleansing, Data Mining, Data Management, Data Warehouse, Customer Data Management (CDM)
- Data Analysis and the Future of Health Care WSJ
- Data Is Useless Without the Skills to Analyze It HBR
- My Nine 'Truths' of Data Analysis Ronald S Thomas