LLM Proliferation Will Challenge Emerging Testing Market

Cover image from Meta’s Llama 3 via a prompt by the author.

The advent of smaller, more efficient LLMs will accelerate LLM proliferation as models become able to run on smaller and smaller devices. At the same time, problems with models, from safety to misinformation and bias, have created a nascent market for testing LLMs. Most testing will be aimed at large platforms and significant enterprise models. Most small models will likely evade testing, which may open new threat vectors as they embed and spread.

Tiny LLM Proliferation

Researchers at UC Santa Cruz found a way to run LLMs on roughly the power it takes to light a light bulb. Microsoft Research Asia also announced energy-efficient 1-bit LLMs.

These small LLMs, and all of the other non-web-based LLMs that can easily be downloaded and run in tools like LM Studio, demonstrate the multiplication of models disconnected from infrastructure.

In other words, these tools will be deployed as local, one-off instances, often by individuals who will trust, at least to some degree, any claims made by those from whom they acquired the code (if any claims are made at all).

Let the testing begin

As these small LLMs start their inevitable spread, companies like Haize Labs are promising to test LLMs into submission, finding all the flaws in their many nooks and crannies. I’m skeptical of that claim, given that even the developers aren’t sure where all the nooks and crannies are located. And new models will change even the known flaws.

**But we don’t want to be tested!**
Image via DALL-E 2 and Microsoft Copilot Designer

For enterprises, an investment in testing requires a fixed target environment, the one employed for mission-critical systems. Test it. Ensure safety. Use it day after day as it is. If it changes, it needs to be retested. Testing will become an ongoing cost associated with enterprise-quality AI systems.
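The "if it changes, it needs to be retested" rule can be made mechanical. The following is a minimal sketch, not a description of how any testing vendor works: it assumes the model is a single file on disk and that a fingerprint was recorded when the model last passed testing. The function names `fingerprint` and `needs_retest` are hypothetical, chosen for illustration.

```python
import hashlib
from pathlib import Path


def fingerprint(model_path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a model file, read in chunks."""
    digest = hashlib.sha256()
    with model_path.open("rb") as f:
        # Read in fixed-size chunks so large weight files do not fill memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_retest(model_path: Path, tested_fingerprint: str) -> bool:
    """True when the file no longer matches the fingerprint recorded at test time.

    Assumption: `tested_fingerprint` was stored by whoever ran the last test pass.
    """
    return fingerprint(model_path) != tested_fingerprint
```

A check like this only says that the bytes changed, not whether the behavior did. It also misses changes outside the weights file, such as a new system prompt or a different runtime, so it is a floor for retesting rather than a substitute for it.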

Testing a target platform, however, is very different than testing hundreds or thousands of small, often open-source LLMs. Sure, some forms of automation will evolve, but testing will be voluntary. Many will likely never be tested before they become obsolete, and even when they do become obsolete, they will still be available for download. Some will avoid testing because they specifically violate the norms that testing firms seek to enforce.

I have pointed out the issue of LLM metadata and management before. Many think of the big platforms as “The AI”: services that can be tested, built by developers who can be held accountable if they violate trust or law. The small LLMs are completely decoupled. They do and will significantly outnumber the large platform services from Microsoft, OpenAI, Anthropic, Google, Apple, and others. Their authors may be hard to identify. They may be stored in places that don’t require metadata about how or if they have been tested.

Testing will only work with AI that stands still long enough to be tested

As the AI community deals with this two-sided conundrum, small LLMs are already common. Very common. While they may safely execute on local machines without much threat to enterprise applications, they still create information that may be copied and pasted into enterprise documents.

AI training programs need to take these tools into account. On one hand, they can be inexpensive tools for learning. On the other hand, they can be risky partners that may offer up incorrect business information or expose unsuspecting end users to disturbing experiences and false information. Businesses and users should assume that most models have not been tested.

These “tiny” or “micro” AI models will appear everywhere over the next few months. I speculate that AI testing certification programs will emerge. These testing services will put a stamp on a model instance. This is likely a new business segment for AI startups. Eventually, revenue models built around tested models may drive some adoption, but model development will likely be wild and rampant over the next several years. Those on the bleeding edge will not want tested models any more than they want models constrained by guardrails.

Innovation in AI will continue to run afoul of safety, and the innovation of tiny, easily downloaded models that run on the smallest devices will make understanding those models’ features, capabilities, and safety a growing source of concern and confusion.

AI icon by Siipkan Creative from Noun Project (CC BY 3.0)
