What is Predictive Analytics?
Predictive analytics is a term used for analytical and statistical techniques that assist in predicting future changes, events, and behavior across a variety of topics. Methods such as data mining are used to gather data about a specific audience or topic. This data is then used to create a predictive model of the future.
Predictive analytics is particularly useful when trying to determine optimal strategies for an array of objectives: business growth, epidemiological trajectories, economic forecasts, etc. If we can predict behaviors of various factors in our environments, then we are better able to decrease the risk of disastrous outcomes.
Types of Data Analytics
Predictive analytics is part of a larger “family” of data analytics; each analytics approach plays an essential role in all supply chains, whether physical or digital. Whether or not we’re aware of it, our lives are run by the constant hum of analytics processes running quietly in the background.
Data analytics assists in the production and delivery of all goods and services essential to human life. This is particularly true in a global digital supply chain, where instant communications are facilitated by massively parallel computational systems that also need to be monitored via data analytics.
For optimal outcomes and outputs, however, the analytics “type” you choose needs to align with the information you’re seeking. To understand what predictive analytics “is”, you must also know how it differs from the other types of analytics: descriptive, diagnostic, and prescriptive.
Descriptive analytics entails summarizing your data, which makes the data easier to understand. It is the simplest type of data analytics and is often used in the cleansing phase to help identify useless or compromised data.
Descriptive analytics employs basic arithmetic functions such as sum, mean, median, proportions, and percentage values to find an average or a percentage spread of your data. Descriptive analytics answers the question “What is happening?”
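As a rough illustration of these summary functions, here is a minimal Python sketch using only the standard library. The order counts and the describe helper are made up for this example and are not taken from any real dataset or tool.

```python
from statistics import mean, median

# Hypothetical daily order counts for one week (illustrative data only).
daily_orders = [12, 15, 11, 30, 14, 13, 17]


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    total = sum(values)
    return {
        "count": len(values),
        "sum": total,
        "mean": mean(values),
        "median": median(values),
        # Share of the weekly total contributed by each day, in percent.
        "percent_of_total": [round(100 * v / total, 1) for v in values],
    }


summary = describe(daily_orders)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Notice that the mean sits above the median here because one unusually busy day pulls the average up. Spotting that kind of gap is one simple way descriptive statistics can flag values worth a second look during cleansing.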
Diagnostic analytics answers the question “Why is this happening?” Rather than telling you what is happening, the diagnostic method looks for the cause behind it. Combined with descriptive analytics, the two make a potent analytic combination.
Processes associated with diagnostic analytics include data discovery, data mining, and drilling down. These processes are used within diagnostic analysis to understand why a certain statistical relationship is occurring.
Predictive analytics uses past data, patterns, and relationships to predict future behavior. It compares historical data models with present data models to identify shared trends and help make a prediction. If patterns from the present and historical models show a high rate of parallel relationships, then there is some likelihood that future data will follow a pattern similar to the data within the time frame of the historical model.
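One simple way to put a number on how closely a present pattern tracks a historical one is a correlation coefficient. The sketch below assumes Pearson correlation as the similarity measure, which is a choice made for this example rather than something prescribed above, and the weekly sales figures are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly sales for the same eight weeks: last year vs. this year.
historical = [100, 110, 125, 140, 160, 150, 135, 120]
present = [105, 118, 130, 149, 171, 158, 140, 127]

similarity = pearson(historical, present)
print(f"correlation between historical and present trend: {similarity:.3f}")
```

A value close to 1 suggests the present data is following the historical shape. It does not guarantee that the coming weeks will do the same, which is why the result is treated as a likelihood rather than a certainty.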
Predictive models are built using statistical algorithms and machine learning, an automated approach to model building that identifies patterns in the data it is fed. Note that predictive analytics cannot determine whether an event will happen; rather, it is used to estimate the likelihood, or percentage chance, of that event occurring. The question answered via predictive analytics is “What’s the chance that this will happen?”
Both R and Python are frequently used for the predictive analytics process (more on this in the section that follows). Examples of predictive analytics software include SPSS, SAP Predictive Analytics, and SAS Enterprise Miner.
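To show what “likelihood rather than certainty” looks like in practice, here is a minimal Python sketch of a one-feature logistic regression fitted with plain gradient descent. It uses only the standard library so it runs anywhere; in real projects you would more likely reach for an established library. The support-ticket data, the function names, and the learning-rate and epoch settings are all assumptions made for this illustration.

```python
from math import exp


def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit P(y = 1 | x) = sigmoid(w * x + b) with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * sum(errors) / n
    return w, b


# Hypothetical history: support tickets a customer filed last month,
# and whether that customer then cancelled (1) or stayed (0).
tickets = [0, 1, 1, 2, 3, 4, 5, 6, 7, 8]
cancelled = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

w, b = fit_logistic(tickets, cancelled)


def predict_probability(ticket_count):
    """Estimated chance of cancellation for a given ticket count."""
    return sigmoid(w * ticket_count + b)


for count in (0, 3, 8):
    print(f"{count} tickets -> {predict_probability(count):.0%} chance of cancelling")
```

The model never says “this customer will cancel.” It returns a probability, and it is up to you to decide what level of risk is high enough to act on.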
Finally, prescriptive analytics is used to aid the quantitative analysis of decision making. The focus here is on deriving actionable insights into the likely impact of future actions given a selection of possible scenarios. From the options derived through prescriptive analytic models, those using the findings can more accurately and precisely plan the next steps to take. Prescriptive analytics focuses on the question “What’s the best course of action?”
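A very small prescriptive example is choosing the action with the best expected payoff across a set of scenarios. The sketch below is one assumed way to frame that decision in Python; the scenario probabilities, the payoff figures, and the function names are all hypothetical.

```python
# Hypothetical scenario probabilities, for example taken from a predictive model.
scenario_probability = {"low_demand": 0.2, "normal_demand": 0.5, "high_demand": 0.3}

# Hypothetical profit (in thousands) for each action under each scenario.
payoff = {
    "order_small": {"low_demand": 40, "normal_demand": 50, "high_demand": 50},
    "order_medium": {"low_demand": 10, "normal_demand": 80, "high_demand": 90},
    "order_large": {"low_demand": -30, "normal_demand": 70, "high_demand": 150},
}


def expected_payoff(action, payoffs, probabilities):
    """Probability-weighted average payoff of one action."""
    return sum(probabilities[s] * payoffs[action][s] for s in probabilities)


def best_action(payoffs, probabilities):
    """Return the action with the highest expected payoff."""
    return max(payoffs, key=lambda a: expected_payoff(a, payoffs, probabilities))


for action in payoff:
    value = expected_payoff(action, payoff, scenario_probability)
    print(f"{action}: expected payoff {value:.1f}")

recommended = best_action(payoff, scenario_probability)
print("recommended action:", recommended)
```

Expected value is only one possible decision rule. A more cautious planner might instead pick the action with the best worst-case outcome, which would lead to a different recommendation with these same numbers.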
Predictive Analytics Process
While the uses of data analytics techniques vary based on the “what” of the data, predictive analytics follows a common process. The predictive analytics process is separated into six steps: Planning, Collecting, Data Analysis and Statistical Analysis, Building the Model, Deploying the Model, and Monitoring the Model.
It’s hard to do anything without a plan, which is why the first step in the process is deciding what you’re trying to accomplish. This is the time to develop your questions and figure out what you’re trying to predict. Plan which methods you will use and how you will use them. Define what constitutes high-quality data. How will you measure it? What type of data will you need? What are the data sources?
There are times when you need to develop your questions after an initial exploration of the available data. In fact, it’s not an uncommon occurrence. You can add an initial phase to your predictive model building plan where you conduct exploratory data analysis (EDA). However, you still must have a plan for your objectives once you complete an EDA.
The collecting phase is where you gather your data and put all that planning to work. Use your planned methods to extract your data, whether it is internal organizational data, data gathered via survey, or data obtained from external sources.
It is also important to know what type of data you’ll be focusing on. Are you looking for quantity, quality, or a balance of the two? Are you collecting quantitative data, qualitative data, or both? Settling these questions before you start collecting helps you avoid setbacks later.
Data Analysis and Statistical Analysis
Now that you’ve collected data, you need to sort through it and find the data that’s most useful to you. During this phase, you’ll clean the data and perform a quality check. Data cleansing and quality control are important for multiple reasons. Do you have missing or null values? How will you treat these values? There are many approaches, such as removing the affected records or filling in a substitute value like the mean or median.
Cleansing also checks for data relevance. Check whether any data is corrupted, inaccurate, or duplicated, and confirm that your data is formatted correctly. Take the time to correct your data if possible. If you can’t, consider the risk of discarding the inaccurate or useless data so it doesn’t impact your result.
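The cleansing checks described above can be sketched in a few lines of Python. This is only one possible approach under stated assumptions: the records are hypothetical, exact duplicates are dropped, rows whose date cannot be parsed as year-month-day are set aside rather than repaired, and missing or malformed ages are filled with the median of the known ages.

```python
from datetime import datetime
from statistics import median

# Hypothetical raw records as they might arrive from a survey export.
raw_records = [
    {"id": "A1", "age": "34", "signup": "2021-03-05"},
    {"id": "A2", "age": "", "signup": "2021-04-17"},       # missing age
    {"id": "A1", "age": "34", "signup": "2021-03-05"},     # exact duplicate
    {"id": "A3", "age": "forty", "signup": "2021-05-02"},  # malformed age
    {"id": "A4", "age": "51", "signup": "05/09/2021"},     # unexpected date format
    {"id": "A5", "age": "29", "signup": "2021-06-30"},
]


def parse_age(text):
    """Return the age as an int, or None if it is missing or implausible."""
    try:
        age = int(text)
    except (TypeError, ValueError):
        return None
    return age if 0 <= age <= 120 else None


def is_valid_date(text):
    """True if the text parses as a year-month-day date."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except (TypeError, ValueError):
        return False


def clean(records):
    """Drop duplicates, set aside badly dated rows, impute missing ages."""
    seen, kept, rejected = set(), [], []
    for rec in records:
        key = (rec["id"], rec["age"], rec["signup"])
        if key in seen:
            continue  # skip exact duplicates
        seen.add(key)
        if not is_valid_date(rec["signup"]):
            rejected.append(rec)  # keep for manual review instead of guessing
            continue
        kept.append({"id": rec["id"], "age": parse_age(rec["age"]), "signup": rec["signup"]})
    known_ages = [r["age"] for r in kept if r["age"] is not None]
    fill_value = median(known_ages) if known_ages else None
    for r in kept:
        if r["age"] is None:
            r["age"] = fill_value  # median imputation
    return kept, rejected


kept, rejected = clean(raw_records)
print("kept:", kept)
print("rejected:", rejected)
```

Whether to impute, correct, or discard a value is a judgment call that depends on your data. The point of the sketch is that each choice should be made explicitly and recorded, not left to happen by accident.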
Statistical analysis involves quantitatively testing and validating assumptions using appropriate statistical techniques.
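As one example of testing an assumption, the sketch below runs a permutation test to check whether two groups really differ in their average value. The permutation test is a technique chosen for this illustration, not one named above, and the delivery-time measurements, the function name, and the default settings are hypothetical.

```python
import random


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign observations to the two groups
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        # Small tolerance guards against floating-point rounding differences.
        if diff >= observed - 1e-12:
            extreme += 1
    # The add-one correction keeps the estimated p-value above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical delivery times (in days) for two warehouses.
warehouse_a = [12.1, 11.8, 12.5, 12.9, 12.3, 11.9, 12.7, 12.4]
warehouse_b = [13.4, 13.1, 13.8, 12.9, 13.6, 13.3, 13.9, 13.2]

p_value = permutation_test(warehouse_a, warehouse_b)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value suggests the difference between the groups is unlikely to be down to chance alone, which supports treating the grouping as a meaningful factor in your model.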
Build the Model
By now, you should clearly understand which predictive model best fits the numerous impact factors you have identified and analyzed. During the model building phase, you’ll create, test, and validate your predictive model.
A few examples of predictive model types are classification, clustering, and time series models. Machine learning algorithms associated with predictive analytics include Logistic Regression, K-Nearest Neighbors (KNN), and Random Forest.
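To make the create, test, and validate cycle concrete, here is a minimal Python sketch of a K-Nearest Neighbors classifier checked against a holdout set. It is written with the standard library only, and the synthetic customer data, the labels, the 75/25 split, and the function names are assumptions made for this example.

```python
import random
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def make_data(seed=0):
    """Generate two synthetic, well-separated groups of customers."""
    rng = random.Random(seed)
    data = []
    for _ in range(40):
        data.append(((rng.gauss(0, 1), rng.gauss(0, 1)), "stays"))
        data.append(((rng.gauss(4, 1), rng.gauss(4, 1)), "churns"))
    rng.shuffle(data)
    return data


data = make_data()

# Hold out 25% of the data so the model is validated on points it has not seen.
split = int(0.75 * len(data))
train, test = data[:split], data[split:]

correct = sum(knn_predict(train, features) == label for features, label in test)
accuracy = correct / len(test)
print(f"holdout accuracy: {accuracy:.0%} on {len(test)} test points")
```

Real data is rarely this cleanly separated, so expect lower accuracy in practice. The habit worth keeping is the split itself: always measure performance on data the model did not see while it was being built.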
Deploy the Model
After completing your model testing and validation, you’ll move the model into production. In a large enterprise environment, this means the model is lifted and shifted to wherever it is supposed to “live” so that it is available for broader access.
For example, an organization may want to push the predictive analysis results to a dashboard for internal users. This may require placing the predictive model in an analytics layer of the IT infrastructure where internal stakeholders can access the results through an API.
Monitor the Model
Over time, models degrade. As part of the initial predictive analytics process plan, you should have included an ongoing monitoring component. You’ll need to have established threshold metrics or key performance indicators (KPIs). You can design and implement automated model performance analytics so you’ll receive reminders and alerts as the performance fluctuates.
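A threshold-based alert can be as simple as tracking accuracy over the most recent predictions. The sketch below is one hypothetical way to do that in Python; the AccuracyMonitor class, the window size, and the 80% threshold are illustrative assumptions, not recommended values.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over a sliding window of recent predictions."""

    def __init__(self, window=50, threshold=0.8):
        self.results = deque(maxlen=window)  # oldest results drop off automatically
        self.threshold = threshold

    def record(self, predicted, actual):
        """Store whether the latest prediction matched the real outcome."""
        self.results.append(predicted == actual)

    def accuracy(self):
        """Accuracy over the current window, or None if nothing is recorded."""
        if not self.results:
            return None
        return sum(self.results) / len(self.results)

    def needs_attention(self):
        """True once the window is full and accuracy is below the threshold."""
        window_full = len(self.results) == self.results.maxlen
        return window_full and self.accuracy() < self.threshold


monitor = AccuracyMonitor(window=10, threshold=0.8)

# Ten correct predictions: the model looks healthy.
for _ in range(10):
    monitor.record("stays", "stays")
print("accuracy:", monitor.accuracy(), "alert:", monitor.needs_attention())

# Five misses in a row push the windowed accuracy below the threshold.
for _ in range(5):
    monitor.record("stays", "churns")
print("accuracy:", monitor.accuracy(), "alert:", monitor.needs_attention())
```

In a production setting, the alert would typically trigger a notification or a retraining job rather than a print statement.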
Eventually, you may need to retire or revamp the model. This is to be expected. One rule of thumb in model monitoring concerns how fast the model degrades over time: the faster the degradation, the more likely the model is faulty in some capacity.
Continuing Education in Predictive Analytics
Most data science academic programs provide courses in predictive analytics. They may not be specifically titled “predictive analytics,” but it’s nearly impossible not to be exposed to this form of analytics during a data science program. Those who major in statistics will also receive in-depth education in all types of statistical analyses, especially within a graduate degree program.
However, you need not major in data science to get started in self-directed study. A wide variety of online predictive analytics and data analytics courses are offered through Coursera, edX, and many other MOOC providers. Many are free, so you can give them a try before you decide to take a deeper plunge into predictive analytics.
Importance of Predictive Analytics
Without a doubt, being able to gather data, identify patterns, and make predictions gives an organization a distinct advantage when it comes to efficiency and responsiveness. Given the intricate interconnectedness of our essential supply chains, being able to accurately and precisely predict the ebbs and flows of supply vs. demand is critical.
We now have more data available than at any other time in history. Enterprises of all sizes need skilled and knowledgeable analysts to help collect, analyze, and communicate findings abstracted from the data.