Predicting through Randomness

I wrote this back in January of 2015, but I wanted to wait until closer to the end of the tournament to post it. What I found interesting was that although I didn't answer all of the questions, the ones I did answer gave me a fairly low Brier score, the measure used to show how accurate a forecast turned out to be (lower is better). Sticking with financial markets and indexes seems to suit me, as there is plenty of data available and I have zero issues with accepting that I am completely wrong and need to quickly change my thinking. Political campaigns were a bit harder to tackle and are not something I would want to even try to forecast. In the end, though, forecasting is a great tool to have in the toolkit, but only if the user really understands it.
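
For anyone unfamiliar with the Brier score mentioned above, here is a minimal sketch of the common yes/no version: the average squared gap between the probability you gave and what actually happened (1 if the event occurred, 0 if not). The function name and the sample numbers are made up for illustration, and the tournament's own scoring rules may differ (the original formulation, for example, sums over every answer option and so ranges from 0 to 2).

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    0.0 is a perfect score; always answering 50% on yes/no questions earns 0.25.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need one outcome per forecast, and at least one forecast")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical forecasts for three questions and how each one resolved.
my_forecasts = [0.9, 0.2, 0.7]
what_happened = [1, 0, 1]

print(round(brier_score(my_forecasts, what_happened), 4))
```

Confident forecasts that turn out right pull the score toward zero, while confident forecasts that turn out wrong are punished heavily, which is why changing your mind quickly matters.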

7/3/15 - Just received my feedback report and I placed in the top 5-6% of the tournament, which is on the edge of superforecaster territory. The tournament was an amazing amount of fun but what really struck me was how much I don't know about anything. Still, the experience was great.


Forecasting is hard and there is no way around it. When you are trying to forecast the outcome of a certain idea or situation, it is nothing but back and forth with data points and constant digging to try to find the right answer. Simply put, there is no fixed right answer because the data is constantly in flux. There are people who make entire careers out of forecasting, some more on force of personality (think political pundits) than on hard analysis, and some on deep-dive analysis (those with an incentive to find the answer, think investors). I've been thinking about forecasting for the past year, ever since I was accepted into the Good Judgment Project tournament, and it has been a fantastic way to learn about what is happening in the world around us and to be quickly humbled by the gap between what we know and what we think we know. Forecasting, if done correctly, forces you to look hard at the reality of what is happening and make judgments from there. And many times, we have to revise what we were thinking.

For a quick history lesson, the Good Judgment Project was started in 2011 by Philip Tetlock with funding from IARPA. Over the course of the next few years, the researchers found that proper training plus a strong dose of statistics, psychology, game theory, and interaction among team members helped create highly accurate forecasters. These people were not government trained at all. Most of them were only getting their news from online sources, the same sources that almost anyone has access to. However, these amateur forecasters were making predictions that were 30% more accurate than those of people in the intelligence community with access to classified information. I thought this was pretty exciting and applied quickly, thinking it would be a great way to finally apply some of the game and probability theory I learned in college. But that wasn't the only reason I wanted to join.

Thanks to the influx of more technology-driven businesses, there are larger amounts of data available, not only in open sources but within companies as well. Most companies have been using data to understand their customers for years (the credit card companies can paint an accurate portrait of a consumer and do it with ease), but there are new techniques being used to mine that data more accurately. These techniques come from a branch of artificial intelligence and are focused mainly in an area called machine learning. What machine learning does is build algorithms that study data, evaluate their own results, and learn from them. For example, there is an approach called association rule learning, whose best-known method is the Apriori algorithm. It is generally known as the first algorithm data miners try. What it does is find interesting relationships among items in large datasets. An easy example would be the groceries you purchase from the store. Management can mine each transaction and discover whether there is any correlation between the items being purchased. Perhaps people who buy apples also buy cheese. Management can then market their products to keep that relationship going and perhaps even discover more relationships. Companies that do this can learn quite a bit about their customers. (Some argue this can't be done right now, but with the way technology is growing, it probably will be soon.)
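
To make the grocery example a little more concrete, here is a rough sketch of the idea behind association rules. It is not the full Apriori algorithm (which grows larger and larger itemsets while pruning the infrequent ones); it only counts pairs of items and reports support (how often the pair shows up together) and confidence (how often the second item appears when the first one does). The baskets, the function name, and the threshold are all invented for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5):
    """Return (if_item, then_item, support, confidence) for frequent item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore duplicate items within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), together in pair_counts.items():
        support = together / n
        if support < min_support:
            continue  # the pair is too rare to be interesting
        rules.append((a, b, support, together / item_counts[a]))
        rules.append((b, a, support, together / item_counts[b]))
    return rules


# Hypothetical grocery transactions.
baskets = [
    ["apples", "cheese", "bread"],
    ["apples", "cheese"],
    ["apples", "milk"],
    ["bread", "milk"],
    ["apples", "cheese", "milk"],
]

for if_item, then_item, support, confidence in pair_rules(baskets):
    print(f"{if_item} -> {then_item}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data, every basket containing cheese also contains apples, which is the kind of relationship a store could act on. A high confidence alone does not prove one purchase causes the other, so it is a starting point for questions rather than an answer.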

For me, gaining more confidence in my ability to apply mathematical probabilities to real-life events is something I've been wanting to do, but I never had the proper outlet for it. Having spent the past five years learning all I can about computer programming, this finally feels like an excellent outlet for me to pursue. And sure enough, once I started learning how to apply data mining techniques to large datasets, hours started flying by as I immersed myself in this new world. Gaining the right information and then applying it as knowledge is becoming the new currency in the world today. There is so much raw data that anyone who can sift through it, analyze it correctly, and accurately communicate what it means will easily be a one-eyed king in the kingdom of the blind.

There are a few rules, and although I'm not going to give a rundown of the entire training materials, I will explain a few of them and how they are used.

  • It is foolish to ask for predictions of the fundamentally unpredictable. What I mean by this is that although there is a method and process for accurately forecasting the probability of an event happening, most events still cannot be predicted. The best we can do is ask ourselves the right questions and move forward from there. But the big key is knowing the right questions to ask and when to ask them. Trying to predict where the ball will land in a game of roulette on every play is practically impossible unless there is cheating involved (on an American wheel, the probability of guessing the exact number just three times in a row is about 0.00182%). However, if you know what the odds are against you (trust me, the house has a 5.26% advantage on every spin) and what your long-term expected value for continual play is, then the game becomes much more manageable. The movement of the financial markets could also be thought of as unpredictable, and if you are going to play that game, your best bet is to have a strategy and system in place. (I'm not going to go into details on this, as I don't work in finance and am not a professional trader. If you really are interested, there are mountains of information online and in bookstores. Again, I will say that the house has an advantage.)
  • Forecasters need positive and negative feedback. When there is a question you have been thinking about, the worst thing you can do is make your forecast and then walk away. Forecasting requires constant iteration over the course of the question's lifecycle. The outcomes change constantly, and the best thing the forecaster can do is keep taking in the feedback they are getting and apply it as best they can. Not listening to the feedback, even if it's negative, can be detrimental to the overall outcome. Always take in the feedback and move forward from there.
  • Prove yourself wrong. If there is an event of some sort that you are trying to predict and the evidence seems to point overwhelmingly to a specific answer, try finding evidence that will prove that outcome wrong. Maybe it's just an opinion piece or from a new source that may not be entirely credible, but find it and analyze it anyway. As I stated in the last point, outcomes can change in the blink of an eye, even seemingly predictable ones. But confirmation bias can be even more detrimental to a proper forecast analysis. Knowing all sides of a story is always a good bet.
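
The roulette numbers in the first rule can be checked with a few lines of arithmetic. This sketch assumes an American wheel with 38 pockets (1 through 36 plus 0 and 00) and a single-number bet that pays 35 to 1; the stake and spin count at the end are arbitrary values chosen only to show what the long-term expected value looks like.

```python
from fractions import Fraction

POCKETS = 38      # assumption: American wheel (1-36, 0, and 00)
PAYOUT_TO_1 = 35  # assumption: a single-number bet pays 35 to 1

# Chance of naming the exact pocket once, and three times in a row.
p_hit = Fraction(1, POCKETS)
p_three_in_a_row = p_hit ** 3

# Expected profit per unit staked: win 35 with probability 1/38,
# lose the 1 unit staked with probability 37/38.
ev_per_unit = p_hit * PAYOUT_TO_1 - (1 - p_hit)
house_edge = -ev_per_unit


def expected_loss(stake, spins):
    """Average amount lost after betting `stake` on each of `spins` spins."""
    return float(house_edge * stake * spins)


print(f"three in a row: {float(p_three_in_a_row):.5%}")              # 0.00182%
print(f"house edge:     {float(house_edge):.2%}")                    # 5.26%
print(f"expected loss, 10 per spin, 100 spins: {expected_loss(10, 100):.2f}")  # 52.63
```

No single spin can be predicted, but the average cost of playing can be, and that is the part a forecaster can actually work with.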

It isn't just the financial industry that has been using forecasting successfully. Although everyone from George Soros and Ray Dalio to James Simons's Renaissance Technologies has been incredibly effective, whether in short-term thinking (milliseconds for Ren-Tech) or longer-term thinking (macro thinking in the case of Dalio and Soros), other industries from technology to sports are effectively using these techniques to model better outcomes for their businesses. The big five tech companies, Google, Apple, Facebook, Amazon, and Microsoft, use and offer forecasting techniques in a myriad of ways. Google is by far the most innovative technology-wise (type anything into the search box and watch it "magically" try to predict what you are going to type), but all of these companies use forecasting in ways that weren't even possible 10 years ago. Amazon can predict what you are going to buy and Facebook can predict how you will react to a status update. That is just the basic level of forecasting these companies can accomplish. Then there is sports, and more specifically the "Moneyball" techniques that were used in baseball by the Oakland A's and Boston Red Sox. (It helped that the Red Sox owner, John W. Henry, owned a CTA fund and had successfully applied forecasting and trend techniques to make money in the markets, which is how he could afford to buy the Red Sox in the first place.) I am convinced that other industries will start getting better at collecting data and applying it far more effectively.

Which leads me to how the analytics techniques I mentioned above, along with technological models such as machine learning, can be combined with human judgment to get to effective forecasting. These two approaches aren't mutually exclusive and should be combined. In fact, that combination is probably one of the best ways that businesses (and human beings in particular) can really start to use artificial intelligence effectively: by using machine and human judgment together to attempt to forecast better results. Machine learning algorithms can model data faster and more effectively than a human being can, but a human being can use quick judgment to understand when to use and when not to use certain data. Weak A.I. is still the most prevalent in the world today (think Siri: it's artificial intelligence, but it isn't very intelligent), but we may start to see the emergence of stronger A.I. over the next 10 to 20 years (machines that can learn, apply, and learn some more, like how a child learns by doing). As more and more data becomes available, there will be more and more of a need to accurately forecast outcomes using that data. Data science and analysis, plus applications of new technology in the field of A.I., will play a huge role in pushing this forward.

From everything I have learned since I joined the tournament, I am extremely excited about what will be possible. For me, learning how to apply forecasting techniques has opened my eyes to my own capabilities and what I would like to do with them. I've always been fascinated by the emergence of A.I. and have wanted to somehow work in the field, and I am starting to find ways to become more and more involved. I had already been pushing forward, learning as much as I could about machine learning (yes, I took the Coursera class, but I still want to learn more applications), and since I have started to mine more open datasets, how these algorithms really work is becoming clearer to me. It is a fascinating and exciting field to be in and I can't wait to see what's to come.