The Prediction Business

Businesses are in the prediction and risk business, a bold statement if there ever was one. No matter what, when running a business there are going to be things that are known and things that are unknown. We can control the things that are known to us and act on them in clear ways. The unknowns are trickier precisely because we don’t know what they are. For example, we don’t know how our customers or the market are going to respond to a new product. Nor do we know whether we should be raising a heavy amount of funding based on the customer growth curve, or whether our operations are sustainable enough to keep heading down this path. This is the risky part of the business.

To better predict these unknowns, companies should collect as much data as they can about their businesses and understand it as well as they can. Data is the 21st century oil. Reading it is the 21st century literacy. Big Data is what the press is calling it. Almost all companies collect data now, so it pays to understand how it works, what the information is trying to convey, and how it can be used to make a forecast, tell a story, or just inform. General statistics has always been used to mine and learn from data. But with faster computers and access to almost unlimited bandwidth, it is now much easier to run more advanced algorithms over the data using what is known as machine learning. What is machine learning? It is applying statistical models to the data you or your company has so that smarter predictions can be made about the data you don’t have.

With each new set of data we encounter, new uses for algorithms must be found. And while machine learning has great potential to change the way companies approach their business problems and the way entire industries operate, it can still be thought of as a branch of statistics meant to be used on big data. The tools that machine learning brings are designed to make better use of that data.

How can we approach thinking about big data and machine learning along with it? The enormous scale of data available to firms can be challenging; using machine learning is as much about data analysis as it is about adapting to the sheer size of any particular data set. A great way to think about data is how long and how wide it is. Length is the number of rows in the data set. Let’s say we are analyzing a large company’s data; we can imagine each row being one unique customer. Depending on the size of the company, there could be millions or even billions of customers. So the more customers there are, the longer our data set will be. Width corresponds to the number of columns in the data set. In our case, each column is a unique variable describing our customers. For example, our columns can be purchase and browser history, mouse clicks, and even text. This data set can become rather large and unwieldy, and this is where machine learning offers a tool set to better analyze wide data.
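To make the long and wide picture concrete, here is a minimal sketch in plain Python. The customer records, the column names, and the shape helper are all made up for illustration; they are not taken from any real company’s data.

```python
# Hypothetical customer records: each dict is one row (one customer),
# and each key is one column (one variable describing that customer).
customers = [
    {"customer_id": 1, "purchases": 12, "page_views": 340, "mouse_clicks": 1250},
    {"customer_id": 2, "purchases": 3, "page_views": 85, "mouse_clicks": 310},
    {"customer_id": 3, "purchases": 27, "page_views": 910, "mouse_clicks": 4020},
]


def shape(rows):
    """Return (length, width): the number of rows and of distinct columns."""
    columns = set()
    for row in rows:
        columns.update(row.keys())
    return len(rows), len(columns)


length, width = shape(customers)
print(f"{length} rows long, {width} columns wide")
```

Adding a customer makes the data longer; tracking a new variable for every customer makes it wider.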

We can refine our initial question further by asking what machine learning is used for. The most common application is making predictions, and this is why it is becoming so important to businesses. Being able to make predictions about data that isn’t available can help shape sales, marketing, operations, and financial strategy. Here are a few examples of how it is used in industry today:

  • Personalized recommendations for each customer. (Amazon product recommendations, Spotify and Pandora recommending new music, and Netflix movie recommendations)
  • Forecasting customer loyalty (How often they shop with a company down to the time and what they consistently spend their money on.)
  • Fraud detection and credit card risk (More banks and insurance companies are using their data to make predictions about which customers may pose a moral hazard.)
  • Facial recognition software (Facebook makes great use of this when it recommends who should be tagged in a photo)
  • Advertisements that create their own copy and images (M&C Saatchi partnered with Clear Channel UK and a company called Posterscope to create these ads.)
  • Personalized assistants (Apple’s Siri, Google Now, and Microsoft’s Cortana are just the big name examples of what can be accomplished. There will be many better assistants down the road.)

The common thread is a specific business process and a decision that depends on getting an accurate prediction. Each of these examples comes from a complex environment where a correct decision depends on many different variables (our wide data). And each prediction ultimately leads to an outcome which, whatever it turns out to be, helps the model become continuously better. The business value of machine learning is enormous even when its limitations are taken into consideration. It is focused on prediction, which means a model of the environment might be all that is needed to make the right decision.

So let’s get into how machine learning can be used in practice. Within each machine learning algorithm there are generally three broad concepts. They are:

  • Feature Extraction: This determines what data to use in the model.
  • Regularization: Used to determine how the data are weighted.
  • Cross-Validation: Tests the accuracy of the model.

Each of these concepts helps separate the “signal” from the “noise” found in almost every data set, sorting through the mix to get to better predictions.

Feature extraction is the process of discovering the variables that the model will use. There are times when every available feature is dumped into a model, but more often than not this is avoided because it invites overfitting. Features help aggregate important signals that are spread out over the data. For example, if your company runs an online music store, each feature could correspond to musical genre, record label, or even the artist’s home country. Once these data points are collected, an automated process clusters the features together so the model can use them to make predictions about customers. A very well-known business case is Netflix’s movie recommendation algorithm. The more a customer uses the product, the more data points Netflix is able to collect about that user, and the better the company can predict what movie or television show the customer is interested in watching.
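As a rough illustration of the music store example, the sketch below turns categorical fields such as genre and record label into numeric 0/1 columns a model can consume. This is one common encoding (often called one-hot encoding), not necessarily what any particular company uses; the one_hot_features helper, the track records, and the label names are hypothetical.

```python
def one_hot_features(records, fields):
    """Turn categorical fields into 0/1 indicator columns."""
    # Every (field, value) pair seen in the data becomes one column;
    # sorting keeps the column order stable between runs.
    vocabulary = sorted({(field, rec[field]) for rec in records for field in fields})
    names = [f"{field}={value}" for field, value in vocabulary]
    matrix = [
        [1 if rec[field] == value else 0 for field, value in vocabulary]
        for rec in records
    ]
    return names, matrix


# Hypothetical tracks from an online music store.
tracks = [
    {"genre": "jazz", "label": "Label A"},
    {"genre": "rock", "label": "Label B"},
    {"genre": "jazz", "label": "Label B"},
]

names, matrix = one_hot_features(tracks, ["genre", "label"])
print(names)
for row in matrix:
    print(row)
```

Note how quickly this widens the data: every distinct genre or label adds a column, which is one reason the choice of features matters.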

After we have chosen our features, we must understand whether the data we have been collecting, and the way it is being combined, reflects signal or noise. So we begin by playing it safe with the model using regularization. This is a way to split the difference between a flexible model and a conservative model. For example, one effect is known as “selection”, which happens when the model’s algorithm focuses on a smaller number of features that contain the best signal, discarding all other features. Regularization helps the model stay away from overfitting (when a model learns patterns from the data that ultimately are not helpful and won’t hold up in future cases) and helps it learn from the signal rather than the noise.
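The “selection” effect can be sketched with an L1 penalty (the lasso), one well-known form of regularization that does this. The code below is a minimal, unoptimized coordinate descent in plain Python; the function names, the tiny data set, and the penalty values are assumptions chosen for illustration, and the sketch assumes no feature column is all zeros.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; inside the band it becomes 0.0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_fit(X, y, penalty, sweeps=100):
    """Minimise 0.5 * sum((y - X w)^2) + penalty * sum(|w|) by coordinate descent."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    for _ in range(sweeps):
        for j in range(n_features):
            rho = 0.0   # correlation of feature j with the residual that ignores j
            norm = 0.0  # sum of squares of feature j
            for row, target in zip(X, y):
                partial = sum(
                    w * x for k, (w, x) in enumerate(zip(weights, row)) if k != j
                )
                rho += row[j] * (target - partial)
                norm += row[j] ** 2
            weights[j] = soft_threshold(rho, penalty) / norm
    return weights


# Hypothetical data: the first feature carries the signal,
# the second contributes only a faint wiggle.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2.1, 1.9, -1.9, -2.1]

print("no penalty:  ", lasso_fit(X, y, penalty=0.0))
print("with penalty:", lasso_fit(X, y, penalty=1.0))
```

With no penalty both features get a weight. With the penalty switched on, the weak feature’s weight is pushed to exactly zero and the strong feature’s weight is shrunk a little, which is the conservative behavior described above.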

To test the accuracy of the model’s predictions, a process called cross-validation is used. The goal is to test the model “out-of-sample”, which means making predictions on data we don’t have based on data we do have: our initial definition of machine learning. This is done by splitting the data into two sets called the training and test data. The model is first built using the training data and then evaluated on the test data. Keeping a clear partition between the two sets is instrumental in not overestimating how good the model actually is.
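The split described above can be sketched in a few lines of plain Python. This shows a single training/test split; cross-validation in the stricter sense repeats the same idea over several different partitions and averages the results. The helper names, the synthetic data, and the 25% test fraction are assumptions made for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and cut it into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(points, slope, intercept):
    """Average squared gap between actual and predicted values."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points) / len(points)


# Hypothetical data: spend grows roughly linearly with visits, plus noise.
rng = random.Random(42)
data = [(x, 3.0 * x + 1.0 + rng.gauss(0, 0.5)) for x in range(40)]

train, test = train_test_split(data)
slope, intercept = fit_line(train)  # the model only ever sees the training set
print("train error:", mean_squared_error(train, slope, intercept))
print("test error: ", mean_squared_error(test, slope, intercept))
```

The test error is the honest number: it is measured on rows the model never saw while it was being built.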

There are many examples of machine learning in production that we use on a daily basis. In some cases, we might not even be aware that our technology is using it in the background. Netflix was mentioned earlier as a business that makes great use of its data. Amazon is also extremely data driven, skillfully tailoring its product recommendations to each customer who shops there. However, the company that probably uses machine learning the most right now is Alphabet, Inc. (the parent company of Google). Machine learning not only guides how their search engine works so efficiently, but is also used in Google Translate, Nest, their self-driving cars, Google Now, and many other products they offer. The more data they collect from us, the better they will be able to fine tune the algorithms in their products so that they interact with us seamlessly.

One final intriguing example is how a digital agency is using artificial intelligence to create ‘self-writing’ campaigns in London. The ad is placed at a bus stop and has a camera connected to it. This camera registers commuters’ engagement based on whether they look happy, sad, or neutral. Then an algorithm adjusts the ad based on the commuters’ responses to it. This particular campaign used only a fake coffee brand since it was more of a test than anything. But if the proof of concept works, we may start seeing more interactive billboards out and about.

Making correct forecasts and predictions for your company isn’t something that can be done with 100% accuracy, but businesses that use the data they collect to its fullest potential find they are better able to cope with the uncertainty of variables they cannot control. Forecasting isn’t about getting every answer exactly right, because that isn’t going to happen. Forecasting is about making sound judgements from the data, and the algorithms used to mine that data will help any person or business get a bit closer to the answers they are looking for.