What is Data Analytics?
By Helen Lee
Analyzing data is an integral part of life. People make multiple decisions and come to many conclusions daily using information (data) they have at hand. In the simplest terms, data analytics requires the gathering of information, using a method to organize it, and making a decision based on the output of this process. Whether or not it’s formally referred to as data analysis or data analytics, it happens every day.
At a very broad level, Data Analytics involves making decisions or coming to conclusions based on data that has been systematically organized. A more formal definition would include the stages of the Data Analytics process: Planning, Collecting, Cleansing, Organizing, and Interpreting/Communicating. A deeper dive into the process will reveal how each stage is integral to Data Analytics that produces value.
The Planning Phase is where the details of the Data Analytics project are articulated. This includes any specific questions that need to be answered and the methodologies that will be used. This phase is essential as it maps out the details of the subsequent steps. For example, during the Planning Phase, analysts will decide what data sources to use for the project. They will also determine the time periods of analysis. In other words, from what time period will data be pulled for use in the analysis? Most importantly, careful planning will help ensure that the Data Analytics project provides value that justifies the costs involved.
The Collecting Phase involves gathering the data that will be used in the analysis. This may include data stored internally within an organization, obtained through primary survey research, or gathered from external sources. The data may be quantitative or qualitative. Quantitative data is expressed in numbers as measures or counts, while qualitative data expresses traits or characteristics. The difference between the two is that quantitative data defines, while qualitative data describes (BusinessDictionary.com).
In the Cleansing Phase, the collected data is checked for quality and usefulness. This phase is important as it has implications for the analysis of the data and ultimately the decisions and conclusions drawn from it. Data cleansing checks for relevance, corruption, duplication, and correct formatting. When possible, data should be corrected. If correction is not possible, the data should be excluded to avoid compromising the validity of the analysis. After completing the data cleansing, it is important to run reports to confirm that the necessary changes were successfully executed and that the data is not presenting contradictory information.
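To make the cleansing checks concrete, here is a minimal Python sketch. The records, field names, and date formats are all invented for illustration, and a real project would tailor the rules to its own data. It corrects formatting where it can, excludes records it cannot fix, drops duplicates, and prints a short confirmation report.

```python
from datetime import datetime

# Hypothetical raw records; every field name and value here is made up.
raw_records = [
    {"id": "101", "signup_date": "2021-03-04", "spend": "250.00"},
    {"id": "102", "signup_date": "03/15/2021", "spend": "99.5"},    # inconsistent date format (fixable)
    {"id": "101", "signup_date": "2021-03-04", "spend": "250.00"},  # duplicate of the first record
    {"id": "103", "signup_date": "2021-04-01", "spend": "n/a"},     # corrupted value (not fixable)
]

def parse_date(text):
    """Return the date as an ISO string, or None if no known format matches."""
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None

def cleanse(records):
    """Split records into cleaned rows and rows excluded as uncorrectable."""
    cleaned, excluded, seen_ids = [], [], set()
    for rec in records:
        date = parse_date(rec["signup_date"])
        try:
            spend = float(rec["spend"])
        except ValueError:
            spend = None
        if date is None or spend is None:
            excluded.append(rec)  # cannot be corrected, so keep it out of the analysis
            continue
        if rec["id"] in seen_ids:
            continue  # duplicate of a record already kept
        seen_ids.add(rec["id"])
        cleaned.append({"id": rec["id"], "signup_date": date, "spend": spend})
    return cleaned, excluded

cleaned, excluded = cleanse(raw_records)
duplicates_removed = len(raw_records) - len(cleaned) - len(excluded)

# A simple report to confirm the cleansing did what was intended.
print(f"kept={len(cleaned)} excluded={len(excluded)} duplicates_removed={duplicates_removed}")
```

Keeping the excluded records in their own list, rather than silently discarding them, makes it easier to review what was left out and why.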
The preceding phases are vital and allow the Organizing Phase to run more smoothly. This phase is where the core of Data Analytics takes place. The cleansed data is organized and manipulated to find answers to the questions articulated in the Planning Phase. There are several types of Data Analytics tools that can be used to manipulate the data. Tools can be as simple as spreadsheet software such as Microsoft Excel or more advanced statistical software packages including SPSS or SAS. R, a programming language and software environment, has become a popular option for data analysts.
The final phase is the Interpreting/Communicating Phase. The first step in this phase is to develop a “story” using the answers found in the Organizing Phase. The “story” should aid in the development of actionable steps. Decisions and conclusions made in this phase should be supported by the learnings discovered in the prior phase.
An important part of this final phase is communicating these discoveries, and ultimately the conclusions and recommendations, to the stakeholders involved. In most situations, final decisions are not made by one person, but a group. Often, at least some members in the group will have a limited understanding of Data Analytics and the methodologies involved. Therefore, it will be important to communicate the learnings and recommendations clearly and in an uncomplicated manner.
Data visualization will be an important component as the results of any Data Analytics project are communicated. The Visual Teaching Alliance has reported that 65% of the population are visual learners, while 30% have been found to be auditory learners. Including data visualizations in the mix, along with verbal communication, will help ensure that the learning needs of most of your audience are met. By doing so, the audience will more likely provide valuable feedback, and hopefully buy in to your recommendations.
As noted, each phase of the Data Analytics process is critical and therefore cannot stand alone. It is important that a Data Analyst has a thorough understanding of, and a high level of involvement in, each phase to execute a successful Data Analytics project.
Types of Data Analytics
Just as the “parts” (phases) of the Data Analytics process are important to understand, the types are essential as well. There are four main types of Data Analytics. No one type is superior to another. They each have a role and fulfill a purpose in Data Analytics. The types, as listed, follow an order, with subsequent types building off prior ones.
Descriptive Analytics involves summarizing data into easily understood pieces that are meaningful and useful. This is the simplest type of Data Analytics and is often used in the Cleansing Phase to identify inconsistencies or anomalies in the data. Descriptive Analytics includes basic arithmetic functions such as sum, mean, median, proportions, and percent changes. Organizations will often use this type of Analytics to compare current performance to historical performance. Patterns in historical data will often be used to predict future performance.
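The summary measures named above can be sketched in a few lines of Python using only the standard library. The sales figures are hypothetical and exist only to show the arithmetic.

```python
from statistics import mean, median

# Hypothetical sales for the same four months in two consecutive years.
last_year = [120, 135, 150, 110]
this_year = [130, 140, 180, 150]

total = sum(this_year)        # sum
average = mean(this_year)     # mean
middle = median(this_year)    # median

# Proportion of this year's total contributed by each month.
proportions = [value / total for value in this_year]

# Percent change of each month compared with the same month last year.
pct_change = [(new - old) / old * 100 for old, new in zip(last_year, this_year)]

print(total, average, middle)
print([round(p, 3) for p in proportions])
print([round(c, 1) for c in pct_change])
```

Comparing `this_year` against `last_year` month by month is one simple way to set current performance beside historical performance.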
Diagnostic Analytics aims to find causal relationships in the data being analyzed. While Descriptive Analytics will reveal what is happening, Diagnostic Analytics attempts to answer “why” it is happening and is sometimes referred to as root cause analysis. Data discovery, data mining, and drilling down or through are processes often associated with Diagnostic Analytics. The goal of these processes is to use all the data from available sources to investigate the “why”.
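Drilling down can be illustrated with a small Python sketch. The order records and the `revenue_by` helper are hypothetical. The idea is to start from a top-level change and break it out by a further dimension to see where it originated. Note that this narrows down where to look. It does not by itself establish a cause.

```python
from collections import defaultdict

# Hypothetical order records, invented for illustration.
orders = [
    {"month": "Jan", "region": "East", "revenue": 500},
    {"month": "Jan", "region": "West", "revenue": 480},
    {"month": "Feb", "region": "East", "revenue": 510},
    {"month": "Feb", "region": "West", "revenue": 300},
]

def revenue_by(records, *keys):
    """Total revenue grouped by the given fields (a hypothetical helper)."""
    totals = defaultdict(int)
    for rec in records:
        totals[tuple(rec[k] for k in keys)] += rec["revenue"]
    return dict(totals)

# Top level: revenue fell from January to February.
by_month = revenue_by(orders, "month")

# Drill down: break the same totals out by region to locate the drop.
by_month_region = revenue_by(orders, "month", "region")
change_by_region = {
    region: by_month_region[("Feb", region)] - by_month_region[("Jan", region)]
    for region in ("East", "West")
}

print(by_month)
print(change_by_region)
```

In this made-up data the overall decline is concentrated in one region, which tells the analyst where to investigate next.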
Predictive Analytics uses historical data to find patterns and relationships to predict future behavior. Data models are built using historical data and are fed with current data to make forecasts. These models can be built using statistical algorithms or machine learning. An important thing to note is Predictive Analytics cannot determine if an event will happen, but only the likelihood of the occurrence.
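As a deliberately naive illustration, the Python sketch below "builds" a model from historical data by computing how often an outcome occurred in each group, then feeds it current data to produce a forecast. The customer records, plan names, and the `fit_churn_rates` function are assumptions made for this example. Real predictive models typically rely on statistical algorithms or machine learning libraries rather than simple frequencies.

```python
from collections import Counter

# Hypothetical history: (subscription plan, whether the customer later cancelled).
history = [
    ("monthly", True), ("monthly", True), ("monthly", False), ("monthly", True),
    ("annual", False), ("annual", False), ("annual", True), ("annual", False),
]

def fit_churn_rates(records):
    """Estimate the likelihood of cancellation per plan from past frequencies."""
    totals, churned = Counter(), Counter()
    for plan, did_churn in records:
        totals[plan] += 1
        if did_churn:
            churned[plan] += 1
    return {plan: churned[plan] / totals[plan] for plan in totals}

# "Build" the model from historical data.
model = fit_churn_rates(history)

# Feed it current data to get a forecast for each customer.
current_customers = {"cust_a": "monthly", "cust_b": "annual"}
forecast = {name: model[plan] for name, plan in current_customers.items()}

print(forecast)
```

The output is a likelihood for each customer, not a statement that any particular customer will or will not cancel.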
The goal of Prescriptive Analytics is to aid organizations in planning future actions. Just as Diagnostic and Predictive Analytics build on Descriptive Analytics, this type builds on Predictive Analytics. Prescriptive Analytics helps organizations take advantage of future opportunities or decrease upcoming risks by identifying possible implications of selected decisions. Machine learning models used for this type of Analytics are continually refined with additional data.
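One simple way to picture "identifying possible implications of selected decisions" is to compare the expected outcome of each candidate action, given a predicted likelihood. The Python sketch below does this with made-up numbers. The probability, customer value, action names, costs, and effects are all assumptions for illustration, and real prescriptive systems are considerably more involved.

```python
# Hypothetical inputs: a predicted likelihood of losing a customer,
# and the value of keeping that customer.
churn_probability = 0.75
customer_value = 400.0

# Hypothetical candidate actions, each with a cost and an assumed
# reduction in the likelihood of losing the customer.
actions = {
    "do_nothing": {"cost": 0.0, "churn_reduction": 0.0},
    "discount_offer": {"cost": 50.0, "churn_reduction": 0.40},
    "support_call": {"cost": 20.0, "churn_reduction": 0.10},
}

def expected_value(action):
    """Expected retained value minus the cost of taking the action."""
    p_churn = max(churn_probability - action["churn_reduction"], 0.0)
    return (1 - p_churn) * customer_value - action["cost"]

scores = {name: expected_value(action) for name, action in actions.items()}
best = max(scores, key=scores.get)

print(scores)
print("recommended:", best)
```

The recommendation is only as good as the predicted likelihood and the assumed effect of each action, which is why such models are refined as additional data arrives.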
Difference between Data Analytics and Data Science
Highlighting the ways in which Data Analytics differs from Data Science will help to define the discipline more concretely. Although the terms Data Science and Data Analytics are often used synonymously, there are distinct differences between them. At the most basic level, Data Science has been defined as “the discipline of making data useful”. Taking it a step further, it brings together domain knowledge and statistical knowledge, using machine learning algorithms to build artificial intelligence. Data Analytics is a core component of the Data Science Life Cycle. Its scope is narrower, while that of Data Science is broader.
Data Science includes the identification of problems that, when addressed, could lead to positive changes for an organization. At the Data Analytics stage, the problem has already been defined. Data Analytics looks to articulate and answer more specific questions related to the identified problem. Data Science takes a macro view of an organization’s business problems and builds a roadmap of sorts to ensure they are addressed, while Data Analytics investigates each problem individually.
Why is Data Analytics Important?
The importance of Data Analytics cannot be emphasized enough as it paints a picture of an organization’s current situation and reveals opportunities for growth. It allows organizations to operate more efficiently, resulting in timely reactions to industry changes, cost savings, and more effective use of funds. Marketing Data Analytics, specifically, helps companies find ways to better meet the needs of their customers and remain competitive in their industries. In the science and healthcare industries, Data Analytics helps advance technology, identify treatments and cures for ailments and diseases, and improve the overall quality of life. Data Analytics is critical in identifying viable options and measuring the effectiveness of possible solutions. Without it, organizations are lost, operating without a clear path before them.
Examples of Data Analytics at Work in Different Industries
Practical uses of Data Analytics in a variety of industries further illustrate what it is and its far-reaching importance.
Education – Universities across the country collect a vast amount of data on students including historical data from their high school years to grades earned while on campus. Using this type of data, universities can develop profiles of students who would likely thrive at their institution.
Healthcare – Any medicine used to cure a disease, treat a condition, alleviate pain, or prevent disease was developed using some form of Data Analytics. Experimental data was collected and analyzed to develop and evaluate the effectiveness of varying formulations until the optimal one was identified.
Government – The Internal Revenue Service, to improve collection efforts, redesigned its notices to taxpayers. As a result, there was an increase in payment compliance and a reduction in penalties. In addition, the IRS was able to use resources more efficiently. Data analytics was used to prioritize the issue of payment collection and to measure the effectiveness of the redesigned notices.
Business – Capital One is one of many companies that use Data Analytics to guide their marketing strategies. The financial services company uses Data Analytics to develop and target promotional offers to its clients. It also uses Data Analytics to determine the optimal time at which to extend these offers. This results in higher conversion rates, more efficient use of marketing budgets, and a more loyal client base.
Our industry section highlights at a far greater level of detail how data analytics and data science are being used in big industries throug