Machine Learning - A Brave New World
Exploring today's and tomorrow’s AI landscape and how applying products like GPT-4 will affect startups' product road maps, pitch decks and fundraising strategies.
When I started messing around with machine learning in 2006, the simple predictions we were making were mainly hand-coded, built on some of the most sensitive data there is - financial behaviour.
When I moved into venture capital and funded my first AI startup, we were optimising vast amounts of proprietary training data to build custom models. The most recent data point I can point to was last week's Y Combinator Demo Day, where 50% of the AI startups used OpenAI's GPT-4 APIs to build their products.
As an avid fan of Kim and Mauborgne's seminal work, Blue Ocean Strategy, this post explores today's and tomorrow's AI landscape and how applying products like GPT-4 will affect startups' product road maps, pitch decks and fundraising strategies. Ultimately, we predict the waters for machine learning startups will become increasingly 'red'. Only those sourcing and mining increasingly vertically focused, narrow data sets to create precise predictions will sail the blue waters of this market.
OpenAI's GPT-3 was initially trained on 45 terabytes of text data and has 175 billion trainable parameters, making it one of the largest models ever trained at its release. The model itself has no real intelligence, but it excels at predicting the next written word. GPT-4 is a multi-modal model that can ingest both images and text: fed a sketch of a website, for example, it can return the code needed to build that website. The applications could be endless and seismically influential for many roles humans undertake today, which we touch on later.
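To make the sketch-to-code idea concrete, here is a minimal sketch of how an image and a text instruction can be paired in a single request. It assumes OpenAI's Chat Completions message format; the model name and prompt text are illustrative placeholders, not recommendations.

```python
# Sketch-to-code with a multi-modal model: pair an image and a text
# instruction in one user message. Assumes OpenAI's Chat Completions
# message format; model name and prompt are illustrative placeholders.
import base64


def build_sketch_to_code_request(image_bytes: bytes, model: str = "gpt-4") -> dict:
    """Encode a website sketch and attach it alongside a text instruction."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Write the HTML and CSS for the website in this sketch."},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/png;base64,{b64}"}},
                ],
            }
        ],
    }
```

With a real image file, this payload would be passed to the provider's SDK (e.g. `client.chat.completions.create(**request)` in OpenAI's Python library). The point is the structure: a single message can now carry both modalities.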
To put the current situation in another perspective, we believe platforms like OpenAI and its future competitors will prove as significant for machine learning as Microsoft Azure and Amazon Web Services were for cloud computing.
When it comes to the cloud, unless you are one of the 0.1% who need a more custom solution, there is no point going elsewhere. This isn't a fad. Like cloud technology, there is no turning back the clock on Machine Learning as a Service (MLaaS), and it doesn't require adopting the kind of expensive new hardware that hamstrung virtual reality. It is relatively easy to pick up, and the barriers to entry for incorporating the ChatGPT API are low. The 'ocean' for natural language processing is becoming increasingly red.
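As an illustration of how low that barrier is, a ChatGPT API request boils down to a short payload. This is a sketch assuming the official `openai` Python package and an API key; the model name and prompts are placeholders.

```python
# Minimal sketch of a ChatGPT API request. Assumes the official `openai`
# Python package and an OPENAI_API_KEY; the model name and prompts are
# illustrative placeholders, not recommendations.

def build_chat_request(prompt: str, model: str = "gpt-4") -> dict:
    """Assemble the payload shape the Chat Completions endpoint expects."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    }


if __name__ == "__main__":
    request = build_chat_request("Summarise Blue Ocean Strategy in one sentence.")
    # The live call (network and key required) is roughly:
    #   from openai import OpenAI
    #   client = OpenAI()
    #   response = client.chat.completions.create(**request)
    #   print(response.choices[0].message.content)
    print(request["messages"][1]["content"])
```

A few lines of glue code, rather than a data science team, is now the cost of entry - which is precisely why the ocean is reddening.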
The window of opportunity.
One could say that machine learning has become commoditised. This may be especially true if you are trying to solve a broad problem - what is the best way to get from position A to position B? Why build a custom deep neural network that will require millions of data points to train when GPT-4 can already provide an answer with semi-reliable accuracy?
So what does this mean for AI startups and their VC fundraising strategies? It will get more challenging, particularly if your business model relies on publicly available third-party data sources. Startups that solve specific problems with custom models built on the most dimensional, non-fungible and proprietary data will continue to thrive - at least until the MLaaS providers start crunching more specialist data sets.
We believe investors will need to recognise, if they don't already, that using platforms like OpenAI's or Amazon's equivalents should not be classed as technical debt. It should be viewed in the same way as using AWS. VCs and entrepreneurs will need to factor into their pitches and due diligence models that the use of MLaaS in itself neither boosts nor drops a valuation.
The field of computer vision remains compelling due to the need for highly dimensional training data and bespoke models to make automated decisions. However, we do not expect it to stay immune from MLaaS for long; future iterations will likely target video and other more complex data sources.
The increasing mystification of neural networks.
We need to get accustomed to accepting the “black box” that is AI. As David Beer, professor of sociology at the University of York, commented in his forthcoming paper entitled The Tensions of Algorithmic Thinking: Automation, Intelligence and the Politics of Knowing, “There is a good chance that the greater the impact that artificial intelligence comes to have in our lives the less we will understand how or why”.
We admire the European Union's desire to introduce its AI Act, but it is likely to be folly. Its goal of understanding AI and creating a global standard for "the development of secure, trustworthy and ethical artificial intelligence" will likely be akin to pushing water uphill. Artificial neural networks are so called because their construction is modelled on the human brain, mimicking how biological neurons signal one another. Crucially, scientists still know little about how our brains work, and we expect that opacity to remain a feature of artificial neural networks too. As networks become deeper and more multi-layered, their complexity and hidden layers grow to the point where even their developers are unlikely to fully understand how they reach their outputs.
From a societal perspective, it is easy to fall into some negative thought patterns about the future of humanity with the broader inclusion of machine learning in our lives. It is not science fiction to foresee entire industries and skill sets decimated by artificial intelligence, which could upend financial markets and create global unemployment levels that we have not seen before.
Without wanting to sound alarmist, I suspect the ship has sailed on trying to contain machine learning, and we may be surprised by how quickly we will need to adapt to its impact. Humans are masters of adaptation, and just as we built our lives around the invention of electricity and, later, the automobile, we will evolve to seize the new opportunities that artificial neural networks offer us.
Bringing this post back to an AI startup investment perspective, we predict that due diligence at a code level will become increasingly redundant. The safe place, for now, is to focus on the accuracy, reliability and commerciality of the predictions and insights created by the product. Once again, the more concentrated your solution, the higher the chance of greater performance.
We have been an AI-focused investor for over a decade, leading and funding rounds for AI-first startups that have changed how people buy online, how brands learn more about their stakeholders and how disparate, uncorrelated big data sets can be correlated. We are well equipped for this brave new world with its turbulent blue and red oceans.
Our lens has always been set on early-stage companies solving problems in particular verticals, and this will become ever more important this decade and beyond. As MLaaS becomes inserted into more business models, we encourage entrepreneurs to consider how they can build short-to-medium-term defensibility around their products.
If you are using GPT-4 in your startup tech stack, that's great - but temper your expectations around an investor's enthusiasm for it. If you are building custom models, we recommend optimising for training data quality, dimensionality and rarity.
Blue Ocean Strategy preaches the necessity of differentiation. If you are an AI-first startup, consider how your products can deliver just that.