Our Way Forward

A common theme among people who write essays about business, economics, tech, and media is the year-end post, where the previous year is analyzed and the upcoming year is forecast in a hopefully illuminating light. This is something I attempted to start a couple of years ago but haven't been consistent with at all. So I thought it would be fun to start it back up again, since one of my focuses for the next year is to start sharing more of my writing in the digital public. This post is going to be as simple as possible; the focus will be on what I think is likely to happen over the next year. I'm hoping to go back over these at the end of the year and see what part of the analysis was close and what was way off. So let's get started.

  • The security and privacy issues that result from hacks of older companies will probably get worse. We are still in a middle ground between 20th-century industrial thinking and 21st-century information and data thinking, and the companies that haven't moved from the former to the latter will suffer the most. Why is this? Because these older companies have never had to deal with problems of this scale before, and malicious actors will do their best to exploit that to their benefit. As a result, more and more people may start wanting to migrate their data to a place where they control it themselves, which brings me to my next bullet point.

  • The crypto space has gone through an intense reckoning over the past year and it may be in for some more this year. I'm not going to forecast the price of any cryptoassets because I really don't know what will happen. What I do think will happen is that more and more of the "projects" and "protocols" that were created in the last couple of years are going to disappear completely. And that is probably a great thing for the space. There are some legitimate projects being built (Blockstack and Mimblewimble come to mind, as well as a few others) but there are many more that add very little value. I still think it's early days for crypto and open networks, and the kinks are still being worked out. So projects and companies that have built a useful product or service will keep pushing forward and adding value to the space while most others will fizzle out. One big idea that I think will be incredibly useful, and that can only really be implemented on open networks, is giving consumers the option to control their own data. The majority of the security and privacy issues come down to users not owning their own data. Which brings me to the next bullet point.

  • The techlash is going to continue relentlessly, and that ire will be directed not only at Facebook but at many other tech companies as well. The reasons will be related but differ from company to company. Take Facebook for example: the leadership team has had to go from one crisis to the next, dealing with Russian interference in U.S. elections, security gripes, and the Cambridge Analytica fiasco. But Facebook isn't the only company dealing with a growing techlash. Google's size, clout, and control over what pops up in its search results (not to mention the algorithmic problems with YouTube) have been making many people uncomfortable, and Amazon may start causing even more ire after its incredibly strange, American Idol-esque search for a second headquarters.

  • AI and automation will be the much bigger story for the economy at large. More and more products will probably have some sort of AI/machine learning algorithms built into them. This will be great for the companies that control and sell those products and services but bad for the jobs that automation will make redundant. The pop culture idea of automation is that there is a robot uprising and suddenly we are at war with the Terminator. It sounds very dramatic and is also very unlikely. What is more apt to happen is that a piece of AI software will automate some aspect of a business that needed 30 people in a department. Now that department may only need 2 people and will let the rest go. A sort of silent job loss. That is something that I think we will start seeing more of in the future.

  • There are a few other subjects that will probably take center stage and that I would like to understand more deeply. First, AI and censorship on the social networks (global communications platforms) and the repercussions of this happening. Second, the subversive hidden networks that are quickly becoming the voice of our culture (an example of these networks can be found through hashtags). And of course, the role of technology companies in our society today and how their size, money, and power are affecting everything from politics on the national stage to housing and pricing on the local level.

For me personally, I am planning on writing more about the role of media in our culture today. This will include everything from music, entertainment, gaming, VR and the like to how media affects our culture in general and the role technology is playing in the culture around us. It's my way of helping us find our way forward.

The Chain Files pt. 2

This is a continuation of The Chain Files, a series of essays about blockchain-related technologies. For starters, these essays will focus on the technical details of Bitcoin and then move into other cryptocurrencies and assets.

Blocks & Header Chains

When transaction data is recorded, it is recorded permanently into what are called blocks. As can be expected, blocks are what make up the blockchain. Each block can be thought of as its own entry in a much larger ledger. For example, when Alice sends Bob 15 bitcoin, that transaction is processed by miners and, if it is confirmed, it is then recorded to a block. Blocks are organized in a linear sequence over time, and this chain of blocks is what makes up the blockchain. As the chain grows longer and longer over time, it becomes much more difficult to reverse any earlier transactions due to Proof-of-Work.

The main way to identify a block in the blockchain is by the hash of its block header. Block headers are 80 bytes long, and the block hash is calculated by running the header through the SHA-256 algorithm twice. The block header hash is not sent through the network but is instead calculated by each node as part of the verification process for each block. Each block header contains the hash of the previous block's header. Since changing any input changes the output of the hash, and the header of block 2 contains the hash of block 1's header, block 1's header cannot be changed without breaking the link to block 2.
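
To make the linking idea concrete, here is a minimal Python sketch. It assumes the usual six-field header layout (version, previous header hash, merkle root, timestamp, difficulty bits, nonce), and every value in it is made up for illustration, so these are toy headers rather than real Bitcoin blocks. The helper names `make_header` and `links_to` are my own.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """Run SHA-256 twice over the input."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def make_header(prev_hash: bytes, merkle_root: bytes, timestamp: int,
                bits: int, nonce: int, version: int = 1) -> bytes:
    """Pack the six fields into 80 bytes (little-endian integers)."""
    return struct.pack("<I32s32sIII", version, prev_hash, merkle_root,
                       timestamp, bits, nonce)


def links_to(child: bytes, parent: bytes) -> bool:
    """True if the child's previous-hash field matches the parent's hash."""
    return child[4:36] == double_sha256(parent)


# Toy values: the merkle roots, timestamps, and nonces are invented.
root_1 = hashlib.sha256(b"transactions of block 1").digest()
root_2 = hashlib.sha256(b"transactions of block 2").digest()

header_1 = make_header(b"\x00" * 32, root_1, 1231006505, 0x1D00FFFF, 1)
header_2 = make_header(double_sha256(header_1), root_2, 1231007105, 0x1D00FFFF, 2)

# Changing anything in block 1's header changes its hash and breaks the link.
tampered_header_1 = make_header(b"\x00" * 32, root_1, 1231006505, 0x1D00FFFF, 99)

print(len(header_1))                          # header size in bytes
print(links_to(header_2, header_1))           # link holds
print(links_to(header_2, tampered_header_1))  # link broken
```

The point of the sketch is only the last two lines: the second header still points at the original first header, so any edit to the first header is immediately visible.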


Bitcoin requires that every block header link to the previous block header, which links to the block header before it, and so on until we reach the very first block. This block is called the Genesis block. If a link between a block header and the previous block header cannot be verified, the block is considered invalid and becomes what is known as an orphan block. Because of this, the Proof-of-Work in Bitcoin is cumulative as it builds the chain of blocks connected to the previous block. For example, the Proof-of-Work that is created by the 2nd block header adds to the Proof-of-Work created by the 1st block header. A 3rd block header that links to the 2nd block header then accumulates that Proof-of-Work. These links between blocks are called header chains and are what extend the chain further. Each Bitcoin client looks for the best header chain, which is the chain of valid block headers that has the greatest amount of cumulative work.
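
Here is a rough Python sketch of what "greatest amount of cumulative work" can mean in practice. It assumes that the work behind a header is estimated from its target as 2^256 / (target + 1), and the `Header` class, the chain names, and the targets are all hypothetical. It also ignores validation entirely and only compares work.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Header:
    name: str
    target: int  # a header hash must be at or below this value to be valid


def work(header: Header) -> int:
    """Expected number of hash attempts needed to meet the target."""
    return 2**256 // (header.target + 1)


def cumulative_work(chain: List[Header]) -> int:
    """Each header adds its own work on top of the headers before it."""
    return sum(work(h) for h in chain)


def best_chain(chains: List[List[Header]]) -> List[Header]:
    """Pick the chain with the most cumulative work, not the most headers."""
    return max(chains, key=cumulative_work)


easy_target = 2**240   # hypothetical, easier to hit
hard_target = 2**238   # hypothetical, four times harder to hit

chain_a = [Header("a1", easy_target), Header("a2", easy_target), Header("a3", easy_target)]
chain_b = [Header("b1", hard_target), Header("b2", hard_target)]

winner = best_chain([chain_a, chain_b])
print([h.name for h in winner])
```

Note that in this toy setup the shorter chain wins, because each of its headers represents more work than the headers in the longer chain.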

Header chain forks

Complications can arise when two miners each create a new and valid header at the same time and both headers refer to the same parent header. This collision is known as a chain fork, and since each header can only refer to one previous header, every Bitcoin miner now needs to choose which of the competing headers to build on in the next header. As further blocks are added, the chain that demonstrates the most Proof-of-Work becomes the main chain. It is at this point that the miner who created the losing header has wasted their Proof-of-Work and missed out on the bitcoin mining reward.

Short chain forks happen quite a bit in Bitcoin and are usually nothing to worry about. Still, a good deal of work goes into minimizing them so that as much Proof-of-Work as possible is added to the header chain.

Blocks of transactions

Ultimately Bitcoin is a transactional system, although there are no transactions in the block headers. Miners instead collect a group of transactions, called the block of transactions, and hash them using a specific formula. That hash of the transactions is called a merkle root. The merkle root is included as 32 of the 80 bytes in the block header and together, the block of transactions and their corresponding header are what make up a block. In the previous post, it was pointed out that hashing some input will produce a specific output. So when anyone hashes the same block of transactions, they will always get the same hash as the output, which proves that the miner chose those particular transactions. And since the block header is protected by Proof-of-Work, it is computationally impractical to change any of the transactions that appeared in that block.


As mentioned in the previous paragraph, the hash of the transactions is called a merkle root. It is the root hash of what is known as a merkle tree, a hash-based tree data structure that generalizes a hash list. Within the tree, each leaf node is the hash of a block of data (here, a transaction) and each non-leaf node is the hash of its children. Merkle trees are primarily used in distributed and peer-to-peer data systems for verification, which works out perfectly for the Bitcoin blockchain.
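
The tree described above can be sketched in a few lines of Python. This is a simplified illustration: it uses double SHA-256 and duplicates the last hash when a level has an odd number of entries, and the "transactions" are just placeholder byte strings rather than serialized Bitcoin transactions.

```python
import hashlib
from typing import List


def double_sha256(data: bytes) -> bytes:
    """Run SHA-256 twice over the input."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(transactions: List[bytes]) -> bytes:
    """Hash the leaves, then hash pairs upward until one hash remains."""
    if not transactions:
        raise ValueError("need at least one transaction")
    level = [double_sha256(tx) for tx in transactions]  # leaf nodes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        # each non-leaf node is the hash of its two children
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


# Placeholder transactions, made up for this example.
txs = [b"alice pays bob", b"bob pays carol", b"carol pays dave"]
root = merkle_root(txs)

# Changing a single transaction changes the root.
altered = [b"alice pays bob", b"bob pays mallory", b"carol pays dave"]
print(root.hex())
print(merkle_root(altered).hex())
```

Because the root is only 32 bytes no matter how many transactions sit underneath it, it fits neatly into the block header while still committing to every transaction in the block.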

The Bitcoin blockchain is an interesting piece of technology that demands further study and research. From a technical perspective, by combining the power of such data structures and algorithms as the hash function, Proof-of-Work, and the merkle root, we now have a better way to define attribution to digital space. Each transaction is just that, digital space that is located on a block in the ledger. From a more philosophical perspective, we can think of each transaction that takes place as its own unique digital moment: an event*** happened at a specific time and there is proof that it took place. Bitcoin itself is more focused on financial transactions, which has been a major contributor to its overall security. This is one of the top reasons to study and understand its technical merits. There are other projects focused less on the financial and more on the digital moment. It will be interesting to see how newer projects in the space will build on the technical considerations from Bitcoin and create their own projects that are more focused on the philosophical sense.

*** In this case, we can think of an event in the abstract and it could refer to most anything. A less abstract and more definite way would be to think of an event as something tangible like a digital photo, recorded music, or a piece of art that is assigned to a transaction and can be found in the ledger, easily giving attribution to the owner via their wallet address. 

Want proof I wrote this post? Take this key and decrypt it using Keybase.

A Technical Dive into Bitcoin (The Chain Files pt. 1)

I haven't written anything in a long while and have been thinking about this subject lately and figured I would do a little research and get my thoughts down on digital paper. You can read more about it from this old post here.

Bitcoin has been around for almost a decade now and since its arrival, has had an incredible impact within the technology and business communities. When the white paper was released in late 2008, the world was in the midst of the financial collapse and its aftereffects. Most essays these days focus solely on the supposed business and economic changes that the Bitcoin blockchain will introduce to the world; those changes have not fully materialized yet, although there are many companies, governments, and individuals around the world doing their best to make that happen. However, in its entire run, the security of Bitcoin has not been compromised. There have been hacks, but those are mainly due to third-party vendors such as exchanges making mistakes. The mathematics and technology underpinning Bitcoin itself still run securely. That made me more curious about it and ultimately led me to decide to learn more about it. I'm hoping that this essay will explain some, but not all, of the basic technical underpinnings of Bitcoin. So let's get started.

Hash Functions

One of the building blocks used most frequently in Bitcoin is the cryptographic hash function. A hash function takes an input of any size and turns it into a fixed-length string or byte array. No matter how large the input data is, the output will always be the same length. Hashing the same data with the same function always produces the same hash, while changing the data even slightly produces a completely different one. You will often see cryptographic hashes referred to simply as "hashes", but keep in mind that there are also non-cryptographic hashes that don't have the same security properties.

The following examples show how hashing can be used to verify documents with text, photos, or other forms of multimedia. I am using text only for an easy-to-understand example.

## Run the following commands in a Linux terminal
# Hash the text directly
echo "this is our secret message. 'the rain in spain falls mainly in seattle'" | sha256sum
# ab9c98ed0d99f6cf8b7f2a8d2e96a1526aae7bf084fcca87edd41029e3adce9e  -
# Write the same text to a file
echo "this is our secret message. 'the rain in spain falls mainly in seattle'" > hash.txt
# Hash the contents of the file
cat hash.txt | sha256sum
# ab9c98ed0d99f6cf8b7f2a8d2e96a1526aae7bf084fcca87edd41029e3adce9e  -

Running the echo command with the given text through the sha256 algorithm, we see that the output is a string of 64 hexadecimal characters that looks completely random. However, both commands end up producing the same 64 characters, because both hash exactly the same input. This is an extremely important part of hashing as it can be used for verification. In this case we have used it for a simple line of text, but it can be used to verify other forms of documentation and identification as well. There are a fair number of companies in the first stages of building an identification layer using blockchains, and hashing is an important characteristic of how they will function.
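
The same check can be sketched in Python using the standard library's hashlib. One detail to keep in mind is that echo appends a newline to the text, so the sketch adds "\n" to hash the same bytes the terminal commands do. The helper name `sha256_hex` is just something I made up for this example.

```python
import hashlib

message = "this is our secret message. 'the rain in spain falls mainly in seattle'"


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of the text as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


with_newline = sha256_hex(message + "\n")  # the bytes that echo sends to sha256sum
without_newline = sha256_hex(message)      # one missing byte, different hash
changed = sha256_hex(message.replace("seattle", "Seattle") + "\n")

print(with_newline)
print(without_newline)
print(changed)
print(len(with_newline), len(sha256_hex("a")), len(sha256_hex("a" * 100000)))
```

Every output is 64 characters long regardless of how much text goes in, and even a single changed letter or a missing newline gives a hash that looks nothing like the original.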

Proof-of-Work

So how do hash functions work within Bitcoin? In order to understand this better we have to travel back to 1997, when cryptographer Adam Back created a technology called HashCash which pioneered something called Proof-of-Work (PoW). HashCash used PoW to limit email spam and denial-of-service attacks, and it did so by requiring the sender to spend a selected amount of computational work to produce a proof that could then be verified efficiently. The algorithm was later slightly tweaked by Hal Finney for his bitcoin precursor project as a way to mine coins. But how does it work within the Bitcoin network? In order to understand this we need to understand the technology that is causing the biggest hype these days, the blockchain.

The Bitcoin blockchain is a ledger that contains the entire record of bitcoin transactions that have taken place since the original genesis block. All of these records are arranged in a sequence of "blocks" in such a way that no user can spend their holdings twice. This record is also public and anyone can verify that these transactions have taken place if they have a specific wallet address or hash. Each block references the hash of the previous block and creates a "chain" to all previous blocks back to the genesis block. These blocks are also computationally impractical to modify because each succeeding block would then have to be regenerated as well, creating completely different hashes. This helped solve the double-spend problem of previous digital currencies and has applications beyond money as well, such as identity and intellectual property.

Hash functions and the Proof-of-Work algorithm make it extremely difficult to alter the blockchain: doing so would require that all subsequent blocks be re-mined, and any attempt to monopolize enough of the network's computing power to do that would be expensive due to the machinery needed to compute the hashes. This has brought us to where we are today and helps explain why the idea behind a blockchain has turned into such a buzzworthy business focus.
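
A toy version of the Proof-of-Work idea can be written in Python. This is only a sketch under my own assumptions: the "difficulty" is expressed as a number of leading zero bits, the nonce is simply appended to the data, and the data is a made-up string rather than a real block header. Finding the nonce takes many attempts, while checking it takes a single hash.

```python
import hashlib
from itertools import count


def pow_hash(data: bytes, nonce: int) -> int:
    """Double SHA-256 of the data plus nonce, read as a 256-bit number."""
    payload = data + nonce.to_bytes(8, "little")
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    return int.from_bytes(digest, "big")


def mine(data: bytes, difficulty_bits: int) -> int:
    """Try nonces in order until the hash falls below the target."""
    target = 1 << (256 - difficulty_bits)
    for nonce in count():
        if pow_hash(data, nonce) < target:
            return nonce


def verify(data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Checking a proof costs one hash, no matter how long mining took."""
    return pow_hash(data, nonce) < (1 << (256 - difficulty_bits))


block_data = b"toy block: alice pays bob 15"  # invented for this example
nonce = mine(block_data, difficulty_bits=12)  # roughly 4,000 attempts on average
print(nonce, verify(block_data, nonce, 12))
```

Each extra difficulty bit doubles the expected number of attempts, which is why re-mining a long run of blocks quickly becomes impractical.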

OP_RETURN

One fascinating way hashes are used within the Bitcoin protocol involves OP_RETURN, a script opcode that marks a transaction output as invalid (provably unspendable) and lets it carry a small amount of arbitrary data, such as the hash of a document. Although the proposal caused a bit of an uproar when first announced, with many in the community arguing that it was irresponsible since Bitcoin's use case was financial applications and this turned the blockchain into a record for arbitrary data, it was eventually accepted. It opened the door to proof-of-ownership for digital assets, and quite a few use cases and apps have been created since its release, most notably proofofexistence.
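
As a rough illustration of how a document hash could be placed behind OP_RETURN, here is a small Python sketch that builds only the output script bytes. The document contents and the helper name `op_return_script` are hypothetical, and the sketch stops well short of building or broadcasting a transaction.

```python
import hashlib

OP_RETURN = 0x6A  # the opcode's byte value in Bitcoin Script


def op_return_script(payload: bytes) -> bytes:
    """Build OP_RETURN <push length> <payload> for small payloads."""
    if not 1 <= len(payload) <= 75:
        # Lengths 1 to 75 can be pushed with a single length byte;
        # anything longer needs other push opcodes that this sketch skips.
        raise ValueError("this sketch only handles payloads of 1 to 75 bytes")
    return bytes([OP_RETURN, len(payload)]) + payload


document = b"contents of a contract, photo, or song"  # made-up document
digest = hashlib.sha256(document).digest()             # 32-byte fingerprint
script = op_return_script(digest)

print(script.hex())
```

Only the 32-byte hash goes on the chain, so the document itself stays private, yet anyone holding the document can hash it again and find the matching output.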

Although the hash function has been a critical component of cryptography and security software in general, it has started to become a subject more people are learning about due to so many newcomers to the cryptocurrency industry. This is a good thing in my opinion for two reasons. First, more people with a better understanding of the security underpinnings of technology will hopefully lead to demand for more robust software being built in the future, whether it is based in the cryptocurrency space or not. Second, aspiring developers who are taking what they learn and applying it are also building those security standards into their applications. I know that many people could argue that this idea is more hopeful than anything, especially with so many hacks taking place among large companies and governments, but only time will tell. We are still in the early stages of attempts by many companies and teams to bring this technology to the mainstream and it is exciting to see many of the results.


Want proof I wrote this post? Take this key and decrypt it using Keybase.

Decentralization


At the moment there is a huge debate raging in the Bitcoin community on whether or not there should be a hard fork into two different currencies along the same lines as what happened to Ethereum in 2016. As can be imagined, the debate is intense with many folks taking extreme sides. My goal for this essay is not to weigh in on that battle as my technical understanding of Bitcoin is limited in this debate compared to many of the developers and other folks who are staking their businesses and focus on it. But there is another idea that is constantly springing up in many of these debates and that is this idea of decentralization and more importantly, building decentralized applications. So why is the concept of building decentralized applications gaining prominence?

First, there are three ways that applications can be built: centralized, decentralized, and distributed. The majority of general applications being built are centralized; that is, there is a unique core node that must be used in order to access what the app is offering (be it data, an API, or your account) and the core node instructs all of the connected nodes as to what to do. If we take a step back and analyze this idea we can see that all information being produced will flow through a single center (or node). Every person who uses these services is dependent directly on this central authority maintaining the power to send and receive information. Google, LinkedIn, Facebook, and Amazon are all built on centralized stacks and this design works powerfully for them both technologically and as businesses.

Then there are distributed and decentralized applications. A distributed system means that computation is spread across a network of multiple nodes, which helps speed up computing and reduce the latency of data access. A company like Google builds distributed software to help speed up their services. A decentralized application means that there is no central node that instructs the other nodes on what to do. Bitcoin is the ultimate decentralized application because if one node fails, it will not have an effect on any of the other nodes and the network will continue to operate. For the purposes of this essay, I am going to skip going into any more detail on distributed systems.

So why should we care about decentralized systems/applications, especially since centralized systems already work so well for these companies? For one, there is a rather large possibility that these applications will be used for their superior incentive structure, resiliency, transparency, and distributed nature. Using a blockchain (peer-to-peer distributed ledger) to form a trustless system, value can be created using cryptographic tokens, which can then be used to access the application. As I stated in the previous paragraph, the premier decentralized app at the moment is Bitcoin (and this could very well change down the road), which simplifies the traditional financial system. In order to access the network, one must own some bitcoin, which can then be used to store value or easily transfer it from one wallet address to another. For example, cross-border payments are made easily since the value isn't being transferred through several financial middlemen.

Another way that decentralized applications are being built is as protocols that use another blockchain, such as Ethereum, and issue their own tokens to function. One interesting example is the Golem Project; it lets users access another user's computer using their tokens as the exchange of value. For example, if I set up some spare computers and put them on the Golem network, anyone with Golem tokens can use my computers in exchange for those tokens. We suddenly have a way to put our spare CPUs to work. This is an idea I have thought about considerably, using bitcoin as the exchange of value instead. Either way would work well and put spare computers to work.

But let's not get ahead of ourselves quite yet. Centralized services still serve the vast majority of users and will continue to do so over the coming years. It may not even be until innovation begins to slow down or companies begin having other problems that eat up their time (this could be anything from internal problems to governments/states coming down hard on them) that this changes. If this does happen, then decentralized systems built on blockchains could start becoming better known as easier and more stable computing platforms. In fact, I am thoroughly convinced that these systems will be incredibly important for businesses, customers, and citizens. But it will be a long road before we get there successfully and my goal is to help pave it along the way.


You can make sure that the author wrote this post by copy-pasting this signature into this Keybase page and decrypting it for proof.

HTTP vs HTTPS


The Basics

HTTP

Security is increasingly important as better online experiences now involve trusted third parties and good encryption. A basic part of understanding how this works is knowing the difference between HTTP and HTTPS.

Hypertext Transfer Protocol (HTTP) is the system used for sending and receiving information across the internet. It's what is known as an "application layer protocol", so its main focus is on how information is presented to the user. The protocol doesn't care how data gets from point A to point B, and it is also "stateless", which means that it doesn't remember anything about the previous web session. There is a benefit to being stateless: there is less data to send, which means increased speed.

The most common use for HTTP is to access HTML pages, which are the backbone of the websites we visit on the internet. However, it is important to remember that other resources can be accessed and utilized through HTTP as well. In fact, this is the most common way that websites that do not house confidential information (such as credit cards and/or usernames and passwords) are set up.

HTTPS

Hypertext Transfer Protocol Secure (HTTPS) is, for all intents and purposes, the same system for sending and receiving information across the internet; it's just the secure version. It was developed to allow for secure authorization and transactions. We don't want malicious actors gaining access to the private information we are creating, and HTTPS adds an extra layer of security to that exchange of confidential information. That extra layer is made possible because HTTPS uses Secure Sockets Layer/Transport Layer Security (SSL/TLS) to move data back and forth. The HTTP requests and responses themselves are unchanged; they simply travel inside the encrypted connection.

Google actually prefers that websites be served over HTTPS because of that guarantee of extra security. When a business owner, developer, or webmaster goes through the motions of obtaining a certificate, the issuer then becomes a trusted third party. The information in the certificate is used to verify that the site is what it claims to be, and finally the user/customer who knows the difference between HTTP and HTTPS can buy with confidence, giving electronic commerce more credibility. For anyone maintaining a site with heavy traffic, Google and the other search engines will put priority on sites with security and keep them boosted in the rankings as long as the multitude of other SEO-related work follows their guidelines.

More Detail

Data sent using HTTPS is secured via the Transport Layer Security (TLS) protocol, which provides three important layers of protection:

  1. Data Integrity - Data cannot be modified or corrupted during transfer without being detected.
  2. Encryption - The exchanged data is encrypted to keep it secure from eavesdroppers.
  3. Authentication - Proves that users/customers are communicating with the intended site.

These three layers are the main motivation behind the HTTPS protocol and help protect against eavesdropping and tampering with the communicated content via man-in-the-middle (MITM) attacks.
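
The data integrity idea can be illustrated with a message authentication code. The Python sketch below uses HMAC with SHA-256 from the standard library. It is not how TLS itself is implemented (TLS negotiates its own keys and has its own record protection); it only shows how a shared secret lets the receiver detect tampering. The key and messages are made up for the example.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an authentication tag over the message with a shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def is_intact(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), received_tag)


shared_key = b"demo key known only to both ends"  # invented for this example
original = b"transfer $10 to bob"
tag = make_tag(shared_key, original)

# A man in the middle changes the message but cannot produce a matching tag.
tampered = b"transfer $10000 to mallory"

print(is_intact(shared_key, original, tag))
print(is_intact(shared_key, tampered, tag))
```

Without the shared key, an attacker who alters the message has no way to produce a tag that passes the check, so the modification is detected.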

How do browsers know who to trust?

Browsers come pre-installed with a list of certificate authorities, meaning they know who to trust. In turn, the browser software is trusting that those authorities will provide valid certificates. A user/customer should be able to trust an HTTPS connection provided the following are all true:

  • The browser software correctly implements HTTPS with the correct pre-installed certificates.
  • The certificate authority vouches only for legitimate websites.
  • The website provides a valid certificate signed by a trusted authority.
  • The certificate correctly identifies the website.
  • The protocol's encryption layer (SSL/TLS) is secure against eavesdroppers.
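
Programs outside the browser follow the same pattern. As a minimal sketch, Python's standard ssl module can build a context that trusts the certificate authorities installed on the system, requires a valid certificate, and checks that the certificate matches the hostname. The function names below are my own, and the hostname in the commented-out usage line is just a placeholder.

```python
import socket
import ssl


def make_verifying_context() -> ssl.SSLContext:
    """Create a context that uses the system's trusted certificate authorities."""
    return ssl.create_default_context()


def fetch_certificate_subject(hostname: str, port: int = 443, timeout: float = 5.0):
    """Connect over TLS and return the subject of the site's certificate.

    The handshake fails with an ssl.SSLError if the certificate is not
    signed by a trusted authority or does not match the hostname.
    """
    context = make_verifying_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["subject"]


context = make_verifying_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # a valid certificate is required
print(context.check_hostname)                    # the certificate must match the site

# Needs network access; replace the placeholder with a site you want to inspect.
# print(fetch_certificate_subject("example.com"))
```

The two printed settings correspond to the third and fourth bullets above: the certificate has to chain to a trusted authority, and it has to identify the website being visited.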

It is becoming increasingly important to use HTTPS over insecure networks such as public Wi-Fi, since anyone on the same local network can discover sensitive information using packet sniffing. The same goes for WLAN networks, whose operators can engage in packet injection to serve their own ads on webpages. This can be exploited in many ways, such as injecting malware onto those webpages to steal users' data and private information.

The case for using HTTPS on your own websites

With each day it seems we learn more and more about global mass surveillance and data being stolen by malicious actors. Because of this, the strongest case for using HTTPS is that you are making your website more secure. There are, however, limits to HTTPS, as it is not 100% secure. It will not prevent your website from getting hacked or stop phishing emails from being sent either. Its importance lies in the fact that if you have users/customers who are logging in with sensitive information (such as passwords, Social Security numbers, etc.), then setting up HTTPS is the absolute minimum price and precaution that should be taken in order to protect them. And with security, you will build trust.



You can make sure that the author wrote this post by copy-pasting this signature into this Keybase page and decrypting it for proof.

The Blockchain Meets Seattle

Earlier this week I had the opportunity to attend a private event here in Seattle regarding Blockchain technology. The event itself took place on the 48th floor of the old Washington Mutual Tower. The fog disrupted what might have been a beautiful view. The event went by rather quickly, but it did a great job of explaining how Blockchain technology can be leveraged in more ways than just creating the killer app "Bitcoin". Some interesting points that were brought up were:

  • Using the Blockchain as a ledger to keep track of IP, land titles, and other assets
  • Goldman Sachs has filed a patent for crypto-security settlement on a blockchain.
  • Microsoft has declared that the Blockchain is one of the "key must win workloads" for their Azure cloud platform and business. They are also collaborating with major U.S. banks using the technology. 
  • Some governments are already beginning to invest in their own local, or private, blockchains.

What was most surprising to me were the actual plans and partners Microsoft has for the Blockchain-as-a-service (BaaS) tech on their Azure cloud platform. Some of those partners include:

  • Bitpay
  • Multichain
  • OpenChain
  • Coinprism
  • Augur
  • Slock.it
  • Ripple

Putting tech like this easily into the hands of developers will be a huge factor in spreading it far and wide. The event was exciting to me as I was able to speak to a number of people about an area in tech that I have been involved with for a few years now, and it also showed how many people are interested in learning more. I hope to be a part of those bringing this technology to the masses over the upcoming years.

Yesterday

"It's no use going back to yesterday, because I was a different person then."

- Lewis Carroll


Life has changed quite a bit for me and I am ready to get back into writing again. I miss it terribly and have made the decision to start posting some of the ideas rattling through my mind as of late. Writing really is the best way for me to clarify some of my thinking.

Looking Forward

As is standard practice in the New Year, many people enjoy writing about what they think will happen in the coming year. It’s not something I’ve ever done before, so instead of just throwing together a quick list and publishing it, I thought it would be better to look at a few important developments taking place in the world and see where they may be heading. So here are my thoughts on the four spaces that could have an impact on the global economy.

  • Emerging markets are going to go through quite a bit of pain due to the price of many commodities falling to earth. China seems to be going through a bit of turmoil within its economy, which can have a negative ripple effect across the global economy as a whole. The country is beginning to really experience a huge amount of pain brought on by the large debt that has been built up in some of its industries. For example, since the prices of steel and iron ore have fallen so much since early 2015, the construction boom has slowed down and claimed many jobs in Europe. In response, fiscal and monetary stimulus will be used as much as possible in order to help bring back demand. Doing this could hurt the exchange rate for the yuan, which is already feeling a large amount of pressure from the outflow of capital. If the yuan is benchmarked against a basket of currencies, its value could decline, which could lead to a serious erosion of market valuations in much of Asia. What should be watched for is just how much China’s economy and market movements affect the rest of the world’s markets. As China grows bigger, the country’s presence will begin to be felt in countries that it has invested in or that have invested in it, for better or for worse.

  • Bitcoin’s presence will also begin to be felt in currency markets and in some products and services in the financial industry. It is the easiest way to store and send monetary value using the internet, and the bigger it gets, the more people will learn about it. Many venture capitalists and industry players still talk about finding the killer app that will bring it further into the mainstream, but it may still be slow going on that front. Many banks and financial institutions, including Visa and Goldman Sachs, are starting to use the underlying technology, called the blockchain, with the help of the many fintech startups that have built a product or service these companies find useful. Their use of the technology will help push it towards a larger consumer base. More large market movements should also be expected, for two reasons: 1) the community is still divided over the block size (with a major core developer leaving rather loudly), and 2) its presence has been felt in China, with the price popping when there are problems with the market and traders reacting by quickly moving resources into Bitcoin. Big things could make some headway this year provided there are some creative uses of the technology and the community does not remain split.

  • Security and privacy will remain on everybody’s radar as important subjects for understanding what is going on. Security, already a huge geopolitical issue, will become a subject that affects more and more people as technology rapidly advances. We walk around with devices that can be tracked every second of the day, so it should be fully expected that more security-related services and products will make their way to the market. Nobody wants their phone hacked, since it carries so many details about our lives, and there will be many companies that answer that concern. But the other side of that coin is privacy, a slippery subject because there are so many differing opinions on how to define it. Right now, most people don’t understand cryptography or the mathematics that makes it work; nor do they know to what extent companies and states can track their every move and search query and place all that information into a larger context about their lives. On the business side, in the case of retail and advertising, I side with giving consumers the choice of whether or not they want to be tracked and sold to. If they do, how much of their privacy are they willing to forgo? And in the case that consumers aren’t given that choice, tools should be made available for anyone to use to give them the privacy they expect. In the case of security, companies and states should work together to reach a middle ground that gives both what they need. This being the year of a presidential election, expect to hear this subject brought up quite a bit.

  • Lastly, artificial intelligence (or machine intelligence) will become a bigger part of our lives, probably even in ways that we aren’t fully expecting. But where it will remain most important to us will be in our working lives. As more and more rote work can be easily automated by computers, we will find them working by our side to an even greater degree than they already are, helping us achieve things we didn’t think possible. The use of machine learning will help with completing sets of tasks and jobs more efficiently. Large tech companies like Google, Facebook, Baidu, Alibaba, Amazon, and Microsoft will help pave the way to a greater use of AI by making their products more intelligent and useful. These same companies will also go a step further (some already have) by open sourcing their code so that more researchers, smaller companies, and independent developers will have access to the same tool sets, thereby producing new products and creating a healthy ecosystem. It is even possible that these smaller companies or products will be purchased by the larger players and seamlessly integrated, since the underlying technology will be the same. It’s exciting to think about what kinds of new products can be made.

These four developments will have a large impact on the global economy and marketplace, whether positive (AI-related products, Bitcoin, privacy) or negative (crashing commodity prices, security), and it’s interesting to think about what lies ahead for us. There are a few developments I left out that will definitely have a large impact on the world, including the price of oil ($28), augmented and virtual reality, and the U.S. presidential race. It should be an interesting year.

The Prediction Business

Businesses are in the prediction and risk business, a bold statement if there ever was one. No matter what, when running a business, there are going to be things that are known and things that are unknown. We can control the things that are known to us and act on them in clear ways. But the unknowns are a bit trickier precisely because they are unknown. We just don’t know what they are. For example, we don’t know how our customers or the market are going to respond to a new product. Nor do we know whether we should be doing a heavy amount of fundraising based on the customer growth curve, or whether we have sustainable operations to continue heading down this path. This is the risky part of the business.

In order to better predict these variables, more companies should be collecting as much data as they can about their businesses and understanding it as best they can. Data is the 21st century oil. Reading it is the 21st century literacy. Big Data is what the press is calling it. Almost all companies collect data now, and because of this it pays to understand how it works, what the information is trying to convey, and how the information can be used to make a forecast, tell a story, or just inform. General statistics has always been used to mine and learn from data. But with faster computers and access to almost unlimited bandwidth, it is now much easier to run much more advanced algorithms over the data using what is known as machine learning. What is machine learning? It is applying statistical models to the data you or your company has so that smarter predictions can be made about the data you don’t have.

With each new set of data we encounter, new uses for algorithms must be found. And while machine learning has a great amount of potential to change the way companies approach their business problems and the way entire industries operate, it can still be thought of as a branch of statistics that is meant to be used on big data. The tools that machine learning brings are designed to make better use of that data.

How can we approach thinking about big data and machine learning along with it? The enormous scale of data available to firms can be challenging; using machine learning is as much about adapting to the sheer size of a particular data set as it is about data analysis. A great way to think about data is in terms of how long and how wide it is. Length is the number of rows in the data set. Let’s say we are analyzing a large company’s data; we can imagine each row being one unique customer. Depending on the size of the company there could be millions or even billions of customers. So the more customers there are, the longer our data set will be. Width corresponds to the number of columns in the data set. In our case, each column is a unique variable recorded for our customers. For example, our columns can be purchase and browser history, mouse clicks, and even text. This data set can become rather large and unwieldy, and this is where machine learning brings a tool set designed to analyze wide data.
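To make the long and wide idea concrete, here is a minimal sketch in Python. The customer table, its column names, and the `shape` helper are all made up for illustration; a real data set would have far more rows and columns.

```python
# Hypothetical customer table: one row per customer, one column per variable.
customers = [
    {"customer_id": 1, "purchases": 12, "mouse_clicks": 340, "last_search": "headphones"},
    {"customer_id": 2, "purchases": 3, "mouse_clicks": 95, "last_search": "vinyl records"},
    {"customer_id": 3, "purchases": 27, "mouse_clicks": 810, "last_search": "turntable"},
]


def shape(rows):
    """Return (length, width): the number of rows and the number of distinct columns."""
    columns = set()
    for row in rows:
        columns.update(row.keys())
    return len(rows), len(columns)


length, width = shape(customers)
print(f"{length} rows long, {width} columns wide")
```

Adding customers makes the table longer; recording more variables about each customer makes it wider.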

We can refine our initial question further by asking what machine learning is used for. The most common application is making predictions, and this is why it is becoming so important to businesses. Being able to make predictions about data that isn’t available can be used to formulate sales, marketing, operations, and financial strategy. Here are a few examples of how it is used in industry today:

  • Personalized recommendations for each customer. (Amazon product recommendations, Spotify and Pandora recommending new music, and Netflix movie recommendations)
  • Forecasting customer loyalty (How often they shop with a company down to the time and what they consistently spend their money on.)
  • Fraud detection and credit card risk (More banks and insurance companies are using their data to make predictions about which customers may be a moral hazard)
  • Facial recognition software (Facebook makes great use of this when it recommends who should be tagged in a photo)
  • Advertisements that create their own copy and images (M&C Saatchi partnered with Clear Channel UK and a company called Posterscope to create these ads.)
  • Personalized assistants (Apple’s Siri, Google Now, and Microsoft’s Cortana are just the big name examples of what can be accomplished. There will be many, better assistants down the road.)

The common thread is a specific business process and a decision that must be acted upon, both of which depend on an accurate prediction. Each of these examples comes from a complex environment where a correct decision depends on many different variables (our wide data). And each prediction ultimately leads to an outcome which, whatever it is, helps the model become continuously better. The business value of machine learning is enormous even when its limitations are taken into consideration. It is focused on prediction, which means a model of the environment might be all that is needed to make the right decision.

So let’s get into how machine learning can be used in practice. Within each machine learning algorithm there are generally three broad concepts. They are:

  • Feature Extraction: This determines what data to use in the model.
  • Regularization: Used to determine how the data are weighted.
  • Cross-Validation: Tests the accuracy of the model.

What each of these concepts does is help separate the “signal” from the “noise” that is present in almost every data set, sorting through the mix to get to better predictions.

Feature extraction is the process where the variables that the model will use are discovered. There are times when all features are dumped into a model and used, but more often than not this is avoided because it leads to overfitting. Features help aggregate important signals that are spread out over the data. For example, if your company runs an online music store, each feature could correspond to musical genre, record label, or even the artist’s home county. Once these data points are collected, they are combined through automation that clusters the features together, and the model can then make predictions about customers. A very well-known business case is Netflix’s movie recommendation algorithm. The more each customer uses the product, the more data points Netflix is able to collect about that user, and the better the company is able to predict what movie or television show the customer is interested in watching.
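Here is a rough sketch of what feature extraction could look like for the online music store example. The purchase log, the customer names, and the `extract_genre_features` helper are hypothetical; the point is only that scattered raw records get aggregated into one row of features per customer.

```python
from collections import Counter

# Hypothetical raw purchase log for an online music store.
purchases = [
    {"customer": "ana", "genre": "jazz", "label": "Blue Note"},
    {"customer": "ana", "genre": "jazz", "label": "Verve"},
    {"customer": "ana", "genre": "rock", "label": "Sub Pop"},
    {"customer": "ben", "genre": "rock", "label": "Sub Pop"},
]


def extract_genre_features(log, genres):
    """Aggregate the log into one feature vector (purchase counts per genre) per customer."""
    counts = {}
    for record in log:
        counts.setdefault(record["customer"], Counter())[record["genre"]] += 1
    # A Counter returns 0 for genres the customer never bought.
    return {customer: [counter[g] for g in genres] for customer, counter in counts.items()}


GENRES = ["jazz", "rock"]
features = extract_genre_features(purchases, GENRES)
print(features)
```

Note that the record label is deliberately left out of the feature vectors here: choosing which variables to keep is part of the extraction step.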

After we have our features chosen, we must understand whether the data we have been collecting, and the way it is being combined, reflects signal or noise. So we begin by playing it safe with the model, using regularization. This is a way to split the difference between a flexible model and a conservative model. For example, one effect is known as “selection”, which happens when the model’s algorithm focuses on a smaller number of features that contain the best signal and discards all other features. Regularization helps the model stay away from overfitting (when a model learns patterns from the data that ultimately are not helpful and won’t hold up in future cases) and helps it learn from the signal rather than the noise.
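The selection effect can be sketched with soft-thresholding, which is how an L1 (lasso) penalty behaves in the simplified case where the features are standardized and uncorrelated. The weights and feature names below are invented for illustration, and the penalty of 0.5 is an arbitrary choice.

```python
def soft_threshold(weight, penalty):
    """Shrink a weight toward zero by `penalty`; weights no larger than the penalty become 0."""
    if weight > penalty:
        return weight - penalty
    if weight < -penalty:
        return weight + penalty
    return 0.0


def regularize(weights, penalty):
    """Apply the same shrinkage to every feature weight."""
    return {name: soft_threshold(w, penalty) for name, w in weights.items()}


# Hypothetical unregularized weights for four features.
raw_weights = {"genre": 2.0, "label": 0.3, "artist_home": -0.1, "price": -1.5}

selected = regularize(raw_weights, penalty=0.5)
print(selected)
```

The two weak features are dropped to exactly zero, while the two strong ones survive in a more conservative, shrunken form. A larger penalty gives a more conservative model; a penalty of zero gives back the fully flexible one.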

In order to test the accuracy of the model’s predictions, a process called cross-validation is used. The goal is to test the model “out-of-sample”, which is when predictions are made on data we don’t have based on data we do have: our initial definition of machine learning. This is done by splitting the data into two sets, called the training data and the test data. The model is first built using the training data and is then evaluated on the test data. Keeping a clear partition between the two sets is instrumental in not overestimating how good the model actually is.
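As a minimal sketch of the training/test idea, the code below does a single split and fits a straight line. Full k-fold cross-validation repeats this with several different splits. The data, the 25% test fraction, and the helper names are all assumptions made for the example.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split them into disjoint training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(rows):
    """Least-squares fit of y = slope * x + intercept on (x, y) pairs."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in rows)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def mean_squared_error(model, rows):
    """Average squared gap between the line's predictions and the actual values."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in rows) / len(rows)


# Hypothetical data: (visits per month, monthly spend) with a little noise.
data = [(x, 3.0 * x + 2.0 + ((-1) ** x) * 0.5) for x in range(1, 21)]

train, test = train_test_split(data)
model = fit_line(train)  # the model only ever sees the training rows
print("train error:", mean_squared_error(model, train))
print("test error:", mean_squared_error(model, test))
```

The test error is the honest number, because it is measured on rows the model never saw while it was being built.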

There are many examples of machine learning being used in production that we use on a daily basis. In some cases, we might not even be aware that our technology is using it in the background. Netflix was used as an example of a business that makes great use of its data. Amazon is also extremely data driven, skillfully tailoring its product recommendations to each customer who shops with them. However, the company that probably uses machine learning the most right now is Alphabet, Inc. (the company formerly known as Google). Machine learning not only guides how their search engine works so efficiently, but is also used in Google Translate, Nest, their self-driving cars, Google Now, and many other products they offer. The more data they collect from us, the better they will be able to fine tune the algorithms in their products so that they interact with us seamlessly.

One final intriguing example is how a digital agency is using artificial intelligence to create ‘self-writing’ campaigns in London. How it works is the ad itself is placed on a bus stop and has a camera connected to it. This camera registers commuters’ engagement based on whether they look happy, sad, or neutral. Then, an algorithm executes various responses based on the commuters’ responses to the ad. This campaign in particular only used a fake coffee brand since it was more of a test than anything. But if the proof of concept works, we may start seeing more interactive billboards out and about.

Making correct forecasts and predictions for your company isn’t something that can be done with 100% accuracy, but businesses that use the data they are collecting to its fullest potential find they are better able to cope with the uncertainty of variables they can’t control. Forecasting isn’t about getting the answers to your questions exactly right, because that isn’t going to happen. Forecasting is about being able to make sound judgements from the data, and the algorithms used to mine that data will help any person or business get a bit closer to the answers they are looking for.

The Creation and Capture of Value

In order for a business to create value after cost, it must create and distribute that value in the most efficient way possible. This can be considered the starting point for any and all businesses, which leads to our first question: how is that value created? Simply put, it is created through work. That work could be anything from administrative tasks (such as filling out the right paperwork for customer orders) to technical work (deploying code to servers) to creative work (marketing copy, product and/or logo design, etc.). The business can then create value through that work, sell or trade it to a customer base, and capture some of that value through profit. Based on this definition, we can begin to see clearly that businesses add value in more ways than just making a product and selling it. Every moving piece of that business should be moving towards the end goal of creating and capturing that value.

Now let’s start thinking about the types of value that can be created. First, let’s remember that not all types of value are created equal. A value that is considered a commodity is easily replaceable in the minds of customers. For example, if your products aren’t distinguishable from your competitor’s, then that competitor will be primed to take your place should your business falter in any way. There are ways around this though; if your company is able to create a new and more efficient process for doing business, or is in possession of unique skills focused on the creation of value, then you will be able to more readily differentiate yourself from those trying to eat your lunch. Having any of these things is a competitive advantage and should be fully utilized towards the goal of value creation.

Measuring value creation is important, if only so that you understand what value your business is creating. The first method is measuring revenue. Revenue tells you that the way your business creates value was worthwhile to your customer base, since they are willing to pay for it. Notice how I didn’t say profit. Many businesses successfully generate revenue but no profit (think Amazon), but most will not be able to do this for very long. A business needs a profit in order to survive, and sooner or later a lack of profits will bring that business down, no matter how great the product or service. (Amazon has been able to successfully navigate this path by shrewdly reinvesting its revenues into future initiatives such as AWS.)

Then there are perceived value and exchange value, which are interrelated. The exchange value is straightforward in that it is the amount of value exchanged between a buyer and seller for a product or service. For example, if you go to the store and purchase a pair of shoes, the price you pay for those shoes is the exchange value. The perceived value is defined by our perceptions of the usefulness of the product or service. In economics there is the consumer surplus (C.S.), the difference between what a product or service is worth to the customer and the price they pay for it; when the consumer surplus is greater than zero (C.S. > 0), the customer is better off making a purchase than not. So value is created when a product or service is perceived as useful, the consumer surplus is greater than zero (C.S. > 0), and value is then exchanged with the seller.
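The C.S. > 0 condition is easy to check with a little arithmetic. In this sketch the consumer surplus is taken as the perceived value minus the price paid, and the dollar amounts for the shoes are made up for illustration.

```python
def consumer_surplus(perceived_value, price):
    """Consumer surplus: what the buyer thinks the item is worth minus what they pay."""
    return perceived_value - price


def worth_buying(perceived_value, price):
    """The purchase leaves the customer better off only when C.S. > 0."""
    return consumer_surplus(perceived_value, price) > 0


# Hypothetical shoes priced at $80 (the exchange value).
print(consumer_surplus(100, 80), worth_buying(100, 80))  # buyer values them at $100
print(consumer_surplus(60, 80), worth_buying(60, 80))    # buyer values them at $60
```

The same pair of shoes at the same price creates value for the first customer and none for the second, which is why discovering perceived value matters so much.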

As we can see, it’s incredibly important for a business to create value. It won’t survive for very long if it doesn’t differentiate itself in the marketplace or doesn’t make a profit in the long run. In order for a business to survive, it must capture a portion of the value it is creating. If businesses ultimately want to succeed, they must think clearly about how they are going to capture that value. Businesses that don’t do this may be leaving money on the table.

There are a number of different ways to capture value, with some being more common than others. For instance, value-based pricing sets the price according to the offering’s worth to the customer. What this means is businesses don’t set their prices based on what their competitors are doing. They might also stop marking up prices based on production costs. Instead, they look at what their customers want and set their prices accordingly. What is important is that the customers’ perception of value must be discovered. This is obviously different for each customer, but there are different models for discovering this missing piece. An obvious example is auctioning, which doesn’t work with all business arrangements but is incredibly effective when it does. The most common example of an auction is online advertising, where each buyer sets the price and the seller can choose whether or not to take that offer. Of course, this is controlled less and less by human interaction and is guided more efficiently by algorithms. There are some downsides, such as prices that may be less than satisfactory for the seller but must be honored regardless.

Another model is known as demand-driven pricing, which lets the price change with fluctuations in demand for a product or service. The most common example to date is Uber and its constantly changing prices. Although there can be a large number of complaints if the price is too high, there are reasons for it, and Uber acts on those reasons with a mechanical efficiency. The business is a money making machine. How it works is the company raises and lowers its prices based on demand for rides. Other factors that are taken into account could be the day of the week, whether it is a holiday or not, the weather, and even the city. Using these variables (and many, many others), the company can maximize its profitability by knowing how many cars to have on the road and in what location at any given time. Demand-driven pricing at its finest.
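To give a feel for the mechanics, here is a toy sketch of demand-driven pricing. This is not Uber’s actual algorithm, which is not public and takes many more variables into account; the ratio rule, the 3x cap, and the function names are assumptions chosen purely for illustration.

```python
def surge_multiplier(ride_requests, available_drivers, cap=3.0):
    """Scale the price with the demand/supply ratio, never below 1x or above `cap`."""
    if available_drivers == 0:
        return cap
    ratio = ride_requests / available_drivers
    return min(cap, max(1.0, ratio))


def ride_price(base_fare, ride_requests, available_drivers):
    """Price of a ride after applying the hypothetical surge multiplier."""
    return round(base_fare * surge_multiplier(ride_requests, available_drivers), 2)


print(ride_price(10.0, ride_requests=50, available_drivers=100))   # quiet period
print(ride_price(10.0, ride_requests=150, available_drivers=100))  # busy period
print(ride_price(10.0, ride_requests=900, available_drivers=100))  # very busy, capped
```

When requests outnumber drivers the price rises, and when there are plenty of drivers it falls back to the base fare. Factors like the day of the week or the weather could be folded in as further adjustments to the multiplier.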

A further value capture model lets customers set their own prices. Although this doesn’t take place in most industries, it is especially prevalent in the travel industry, where buyers can decide what price they want to pay and sellers can take it or leave it. Unlike auctions though, this transaction is mostly kept between the buyer and the seller. Doing this lets sellers maintain their list prices even if they are discounting for incremental sales.

Next up is capturing value by using two-sided market forces to your advantage. Although the name may not ring a bell for most people, it is a model everyone has seen in action and it is used efficiently by media companies. For example, many publications are free for the general public to take (think local periodicals in many major cities) and they make their money, in turn, by charging advertisers to place ads in their pages. The money is then used to subsidize the content that is created for the publications and the process continues. Although Vice Media is a much larger entity now, when it was just a magazine it was free to anyone (who could find it) and the vast majority of the value was captured by charging advertisers (especially American Apparel). Of course they made money with subscriptions, but most people just searched high and low for a free copy.

Then there are those businesses that use what’s known as the “price carrier” in their offerings. This is the thing a business hangs a price tag on, even though customers may not be coming through the doors for that in the first place; they may be coming for something else entirely. Think about it: we will sometimes purchase a product or service to gain access to something else the company is offering that has no price tag on it. Starbucks is a great example of this. The majority of their customers walk through the doors for a quick coffee before work in the morning, but there are those people who come in just for the wi-fi. Using it isn’t for sale, but purchasing a coffee or pastry will give you access to it. So a good question to ask yourself is whether or not your business is hanging the price tag in the right place. What would happen if you moved it to something else?

One of the most prominent examples of value capture is called the razor-and-blades model, which was pioneered by Gillette. It’s a rather simple model: customers get the razor handle for free, but must purchase blades continuously since they get dull rather easily. And charging for the blades is where the value capture happens. This model was also utilized by technology companies who sold printers and printer ink replacement cartridges. The printers are always cheaply priced; the ink is not.
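A quick back-of-the-envelope sketch shows why giving away the handle can pay off. All the figures below are hypothetical and expressed in cents; they are not Gillette’s real costs or prices.

```python
def lifetime_margin_cents(handle_price, handle_cost, blade_price, blade_cost, blades_bought):
    """Total margin from one customer: handle margin (may be negative) plus blade margins."""
    handle_margin = handle_price - handle_cost
    blade_margin = (blade_price - blade_cost) * blades_bought
    return handle_margin + blade_margin


# Hypothetical figures in cents: a handle that costs 500 to make is given away free,
# and each blade costs 50 to make and sells for 300.
print(lifetime_margin_cents(0, 500, 300, 50, blades_bought=0))   # customer never comes back
print(lifetime_margin_cents(0, 500, 300, 50, blades_bought=24))  # customer keeps buying blades
```

With these made-up numbers the seller loses money on every customer who takes the free handle and disappears, and makes it back many times over on every customer who keeps buying blades.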

The final model to talk about is one we all know well since the phone carriers are incredibly efficient at using it; it is known as bundling. How it works is the price of the new phone is subsidized by any extra hardware, software, and data features we purchase with it. The phone is cheap; the options that are bundled with the purchase are where the value capture takes place. Car dealerships excel at this too; when you go to purchase a car, the salespeople will usually bundle in many products or services you may not even need, since those add-ons are where the money is made.

Both value creation and value capture are incredibly important to keeping your business running smoothly over a long period of time, and for longer than your competitors. Both are equally important; both need to be studied rigorously for the best understanding of how they complement one another. Value creation is the work that a business does in order to have something of value to offer to its current and potential customer base; capturing that value is what will keep customers happy and the business running smoothly. Combining the two is what ultimately keeps a business alive. And that is something to constantly be thinking about when focusing on your own business.