AI disruptors

Deepseek, Local LLMs, AI Agents, AGI

Just a couple of months ago, many in the US believed AI progress was plateauing.

In the last few weeks, that view was completely reversed, and a solid plan was put in place for US AI dominance.

Then, a Chinese company called DeepSeek released a few models that made that plan moot.

The plan was based on LLM scaling: the idea that making bigger models with more compute and data yields more intelligent systems. This is expensive, but the plan had that covered, with governments, banks, AI labs and compute providers in sync on how it would be executed.

DeepSeek's release essentially showed that the numbers and costs in the plan were wrong: the same capability could be achieved at a much lower cost.

DeepSeek also published its methods openly, allowing anyone anywhere in the world to reproduce what had been closely guarded proprietary tech in the US.

There are other challengers to both LLM scaling and the plan that empowers proprietary US big tech in AI, and they have existed since long before the recent DeepSeek buzz.

These challengers were slow-moving disruptors-in-the-making that were fast catching up with the leaders built on LLM scaling. They are unaffected directly by the DeepSeek news.

One such challenger is smaller, open, local models that run on a Mac, PC, or phone. These keep improving and are getting much closer to much larger state-of-the-art models.

Another major challenger is built on agents. The term has many variants and interpretations, but I am referring to one specific interpretation as the challenger. A sketch of it goes as follows:

  • An agent is something that can react to a request with some response. An agent may in turn call other agents to construct the response.
  • An AI agent is an agent that, at some point, calls an AI model.
  • An agent can also be formed by combining other agents.
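The sketch above can be expressed as a minimal type: an agent is a function from request to response, and composing agents is just delegation. This is an illustrative sketch under my own naming, not any particular framework's API; the "models" here are toy stand-ins for real LLM calls.

```python
from typing import Callable

# An agent maps a request string to a response string.
Agent = Callable[[str], str]

def make_ai_agent(model: Callable[[str], str]) -> Agent:
    """An AI agent: an agent that, at some point, calls an AI model.
    `model` is a toy stand-in for an LLM call."""
    def agent(request: str) -> str:
        return model(request)
    return agent

def compose(router: Callable[[str], Agent]) -> Agent:
    """An agent formed by combining other agents: delegate each
    request to whichever sub-agent the router picks."""
    def agent(request: str) -> str:
        return router(request)(request)
    return agent

# Toy stand-in "models" (assumptions for illustration, not real APIs).
math_agent = make_ai_agent(lambda req: "42")
chat_agent = make_ai_agent(lambda req: "hello!")

combined = compose(lambda req: math_agent if "math" in req else chat_agent)
print(combined("math: meaning of life"))  # prints 42
print(combined("say hi"))                 # prints hello!
```

The key property is that `compose` returns something with the same signature as its parts, so combined agents can themselves be combined further.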

There are many other challengers in the making that aim to solve AGI (Artificial General Intelligence), but these are all considered much longer term: they need years of research before they are directly applicable to the problems that LLMs, local models, and agents can solve today. A few of these are:

  • Yann LeCun's vision: world models, JEPA, cat-level AI, etc.
  • Numenta/Jeff Hawkins' Thousand Brains Theory
  • Symbolic and logic-based AI
  • Reinforcement learning
  • Ilya Sutskever's SSI (Safe Superintelligence)

There are countless ideas being considered, researched, and tried out (just look at the arXiv papers published about AI every day, or the new GitHub repos around AI), but here are the ideas I believe are most relevant to everyone working on building better AI systems:

  • LLM Scale
  • The end user experience matters more than how we get there
  • Scaling homogeneous systems will always win over custom heuristics (from Sutton's "The Bitter Lesson")
  • Sensory learning (vision and hearing being the dominant senses for humans) is better than abstract, language-based learning
  • The choice of latent representation is important
  • Reapplying and recombining past ideas in new ways; for example, multi-agent systems can apply ideas from ML and CS such as random forests, XGBoost, and evolutionary algorithms
  • Unsupervised evaluators/LLM judges are important as human labeling becomes brittle and expensive
  • Relative comparison (like pairwise evaluation) is superior to absolute scoring, whether done by humans or AI
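As a concrete illustration of the last point, pairwise judgments can be aggregated into a ranking with an Elo-style update, similar in spirit to how some LLM leaderboards rank models from pairwise votes. The comparison data below is invented for illustration.

```python
def elo_update(r_a: float, r_b: float, a_wins: bool, k: float = 32.0):
    """Standard Elo update: shift both ratings toward the observed
    outcome, weighted by how surprising that outcome was."""
    exp_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))  # expected score of A
    score_a = 1.0 if a_wins else 0.0
    return r_a + k * (score_a - exp_a), r_b - k * (score_a - exp_a)

# Toy pairwise judgments as (winner, loser) pairs -- made-up data.
comparisons = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B")]

ratings = {m: 1000.0 for m in "ABC"}
for winner, loser in comparisons:
    ratings[winner], ratings[loser] = elo_update(
        ratings[winner], ratings[loser], a_wins=True
    )

print(max(ratings, key=ratings.get))  # prints A
```

The appeal of this approach is that each judge only ever answers "which of these two is better?", a question that is far easier to answer consistently than "score this from 1 to 10".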

Coming soon: my custom analysis and benchmarking of 100 local LLMs on a custom dataset.

If you enjoyed this article, please support my further work via ko-fi - the more support I get, the more I can focus on similar AI related content, research and building. Thank you!