What is Predictive Analytics and its Role in Sustainability?

What is predictive analytics? Credit: Harvard Business School
Google, IBM and Harvard Business School show how predictive analytics uses past data and AI to forecast risks and guide smarter sustainability decisions

Sustainability is increasingly intertwined with technology and AI.

Predictive analytics can help forecast future outcomes, helping organisations avoid environmental, climate and sustainability disasters.

Inside the world of predictive analytics

Predictive analytics is the use of previous data to predict future trends and events.

Historical data is used to forecast potential scenarios in order to help drive logical and strategic decisions, according to Harvard Business School (HBS).

“The predictions could be for the near future, for instance, predicting the malfunction of a piece of machinery later that day, or the more distant future, such as predicting your company’s cash flows for the upcoming year,” says HBS.

Predictive analytics can be carried out manually or with machine learning algorithms; both approaches rely on historical data.

One predictive tool is regression analysis, which can “determine the relationship between two variables (single linear regression) or three or more variables (multiple regression),” says HBS.

“Regression allows us to gain insights into the structure of that relationship and provides measures of how well the data fit that relationship,” explains Jan Hammond, the Jesse Philips Professor of Manufacturing at Harvard Business School.

Janice Hammond, Jesse Philips Professor of Manufacturing, at Harvard Business School. Credit: Harvard Business School

“Such insights can prove extremely valuable for analysing historical trends and developing forecasts.”

Predictive analytics techniques

According to Google Cloud, “in general, there are two types of predictive analytics models: classification and regression”.

Classification models attempt to sort data objects into one category or another, whereas regression models attempt to predict continuous data.

Predictive analytics tends to be performed using one of three main types of technique: regression analysis, decision trees or neural networks.

Regression analysis

Regression is a statistical technique used to estimate relationships between variables. 

It helps identify patterns in large datasets and understand how inputs are correlated. 

It is most effective with continuous data that follows a known distribution. 

Regression is commonly used to assess how one or more independent variables influence another, for example how a price increase might affect product sales.
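
The price and sales example can be sketched in a few lines of Python. This is a minimal illustration of single linear regression fitted by ordinary least squares, not a method taken from HBS or Google Cloud. The price and sales figures are hypothetical, and the function names are chosen here purely for illustration.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data: unit price and units sold in five past periods.
prices = [10.0, 12.0, 14.0, 16.0, 18.0]
units_sold = [205.0, 178.0, 161.0, 142.0, 119.0]

slope, intercept = fit_simple_linear_regression(prices, units_sold)
r2 = r_squared(prices, units_sold, slope, intercept)

# Forecast sales at a price that has not been tried yet.
forecast_at_20 = intercept + slope * 20.0
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.3f}")
print(f"forecast units at price 20: {forecast_at_20:.1f}")
```

The slope estimates how many units of sales change for each one-unit change in price, while R² is one of the "measures of how well the data fit that relationship" that Hammond describes.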

Decision trees

Decision trees are classification models that assign data to categories based on specific variables. 

They are particularly useful for understanding individual decisions. 

The model resembles a tree, where each branch represents a possible choice and each leaf represents the resulting outcome. 

Decision trees are generally easy to interpret and perform well even when a dataset contains several missing values.
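
To make the branch and leaf structure concrete, the sketch below grows a very small classification tree using Gini impurity to choose each split. It is an illustrative toy rather than a production algorithm: the sensor readings, the labels and all function names are hypothetical, and it does not handle missing values.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 means every label in the group is the same."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold) that lowers weighted Gini most."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Branches are dicts; leaves are the majority label of their group."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree(*map(list, zip(*left)), depth + 1, max_depth),
        "right": build_tree(*map(list, zip(*right)), depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow branches until a leaf (a plain label) is reached."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Hypothetical machine readings: (temperature, vibration) and the outcome.
rows = [(60, 0.2), (65, 0.9), (70, 0.8), (85, 0.3),
        (90, 0.2), (88, 0.9), (92, 0.8), (86, 0.7)]
labels = ["ok", "ok", "ok", "ok", "ok", "fail", "fail", "fail"]

tree = build_tree(rows, labels)
print(predict(tree, (95, 0.85)))  # hot and vibrating machine
print(predict(tree, (55, 0.10)))  # cool and steady machine
```

Because each branch is a simple threshold test on one variable, the path to any prediction can be read off directly, which is why these models are considered easy to interpret.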

Neural networks

Neural networks are machine learning methods well suited to predictive analytics involving highly complex relationships. 

They act as powerful pattern recognition systems. 

Neural networks are most effective for identifying nonlinear relationships in data, especially when no established mathematical formula is available. 

They can also be used to validate the outputs of decision tree and regression models.
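
A small worked example can show what "nonlinear" means here. The sketch below runs the forward pass of a tiny network with one hidden layer on the classic exclusive-or (XOR) pattern, where the output is high only when exactly one of two inputs is high. No straight line can separate those cases, but a hidden layer can. The weights are hand-picked for illustration rather than learned from data, and all names are hypothetical.

```python
def relu(z):
    """Rectified linear unit: passes positive values, clips negatives to 0."""
    return max(0.0, z)


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: activation(weights . inputs + bias)."""
    outputs = []
    for row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


# Hand-picked weights (an assumption for this sketch, not trained values).
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [0.0, -1.0]
OUTPUT_W = [[1.0, -2.0]]
OUTPUT_B = [0.0]


def network(x1, x2):
    """Two inputs -> two hidden ReLU units -> one linear output."""
    hidden = dense([x1, x2], HIDDEN_W, HIDDEN_B, activation=relu)
    return dense(hidden, OUTPUT_W, OUTPUT_B)[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, network(a, b))
```

In practice the weights would be learned from historical data by an optimisation procedure such as gradient descent, typically with a dedicated machine learning library.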

Linda Rae, Retired Vice President and General Manager of GE Vernova’s Power Generation and Oil & Gas software

“Predictive analytics solutions are designed to empower organizations with the tools they need to make data-driven decisions, optimize asset performance and ultimately achieve their business objectives,” says Linda Rae, previously Vice President and General Manager of GE Vernova’s Power Generation and Oil & Gas software.

Types of predictive modelling

IBM acknowledges that there are many different types of predictive modelling.

The company states that the most popular include classification, clustering and time series models.

Classification models

Classification models are a type of supervised machine learning model. 

They use historical data to learn patterns and categorise new data accordingly. 

For example, they can group customers or prospects into segments, or provide binary outputs such as yes/no or true/false. 

Common applications include fraud detection and credit risk assessment. 

Widely used classification techniques include logistic regression, decision trees, random forest, neural networks and Naïve Bayes.
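
As an illustration of a binary yes/no classifier, the sketch below trains a one-feature logistic regression with plain gradient descent. It is a simplified example, not IBM's implementation: the "risk score" feature is hypothetical and assumed to be already standardised around zero, and the learning rate and number of epochs are arbitrary choices.

```python
import math


def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(xs, ys, learning_rate=0.1, epochs=500):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * sum(errors) / n
    return w, b


def predict_proba(x, w, b):
    """Estimated probability that x belongs to the positive class."""
    return sigmoid(w * x + b)


# Hypothetical historical transactions: standardised risk score and
# whether the transaction later turned out to be fraudulent (1) or not (0).
risk_scores = [-4.0, -3.0, -2.0, -1.0, 1.0, 2.0, 3.0, 4.0]
is_fraud = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic_regression(risk_scores, is_fraud)

for score in (-3.0, 3.0):
    p = predict_proba(score, w, b)
    print(f"score={score:+.1f} fraud probability={p:.2f} flag={p > 0.5}")
```

The model outputs a probability, and the yes/no answer comes from comparing it with a threshold, here 0.5. A real fraud system would choose that threshold according to the cost of false alarms versus missed cases.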

PwC's input on the role of AI in sustainability technology. Credit: PwC

Clustering models

Clustering models are a form of unsupervised learning. 

They group data points based on shared characteristics. 

For example, an e-commerce company can use clustering to divide customers into similar groups according to common behaviours or attributes, then tailor marketing strategies for each group. 

Common clustering algorithms include k-means, mean-shift, density-based spatial clustering of applications with noise (DBSCAN), expectation-maximisation (EM) with Gaussian Mixture Models (GMM) and hierarchical clustering.
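
The e-commerce example can be sketched with k-means, the first algorithm in that list. The version below is deliberately minimal: the customer data is hypothetical, the function name is illustrative, and the starting centroids are simply the first k points, whereas real implementations usually use smarter initialisation.

```python
import math


def k_means(points, k, max_iter=100):
    """Group points into k clusters by repeatedly moving centroids."""
    centroids = [tuple(p) for p in points[:k]]  # naive deterministic start
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # nothing moved, so the result is stable
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical customers: (orders per year, average basket value).
customers = [(2, 20), (3, 25), (1, 22), (20, 80), (22, 85), (19, 78)]

labels, centroids = k_means(customers, k=2)
for customer, label in zip(customers, labels):
    print(customer, "-> segment", label)
```

No labels are supplied in advance, which is what makes this unsupervised: the two segments emerge only from how close the customers are to one another.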

Time series models

Time series models work with data recorded at regular intervals, such as daily, weekly or monthly. 

The dependent variable is typically plotted over time to identify seasonality, trends and cyclical patterns, which can inform the choice of transformations and model type. 

Common time series models include autoregressive (AR), moving average (MA), ARMA and ARIMA. 

For example, a call centre can use a time series model to forecast how many calls it will receive per hour at different times of day.
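
A first-order autoregressive model, AR(1), is the simplest of these: each value is predicted from the one before it. The sketch below estimates its two parameters by least squares and forecasts a few steps ahead. The hourly call counts are synthetic, generated from a known rule so the fit can be checked, and the function names are illustrative.

```python
def fit_ar1(series):
    """Least-squares estimate of (c, phi) in y_t = c + phi * y_(t-1)."""
    prev, curr = series[:-1], series[1:]
    prev_bar = sum(prev) / len(prev)
    curr_bar = sum(curr) / len(curr)
    sxx = sum((p - prev_bar) ** 2 for p in prev)
    sxy = sum((p - prev_bar) * (q - curr_bar) for p, q in zip(prev, curr))
    phi = sxy / sxx
    c = curr_bar - phi * prev_bar
    return c, phi


def forecast(series, c, phi, steps):
    """Roll the fitted model forward from the last observed value."""
    out, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out


# Synthetic hourly call counts built from the rule y_t = 20 + 0.8 * y_(t-1).
calls = [50.0]
for _ in range(9):
    calls.append(20.0 + 0.8 * calls[-1])

c, phi = fit_ar1(calls)
next_hours = forecast(calls, c, phi, steps=3)
print(f"c={c:.2f}, phi={phi:.2f}")
print("forecast for the next three hours:", [round(v, 1) for v in next_hours])
```

Real call volumes also show the daily and weekly seasonality mentioned above, which is one reason analysts often move on to richer models such as ARIMA.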
