Large language models (LLMs), or systems that understand and generate text, have been a hot topic in AI lately. The release of LLMs by tech giants such as OpenAI, Google, Amazon, Microsoft and Nvidia, as well as by open-source communities, demonstrates the great potential of the field and represents a major step forward in its development. However, not all language models are created equal.
In this article, we'll look at the key differences between approaches to using LLMs after they're built: open-source products, internal-use products, product platforms, and products built on top of those platforms. We'll also take a closer look at the complexities of each approach and discuss how each is likely to evolve over the next few years. But first, the bigger picture.
What are large language models anyway?
Applications of LLMs range from simple tasks such as question answering, text recognition and text classification to more creative ones such as text or code generation, research into current AI capabilities, and human-like conversational agents. Creative generation is certainly impressive, but the more advanced products based on these models are yet to come.
What’s the problem with LLM technology?
The use of LLMs has exploded in recent years as newer and larger systems are developed. One reason is that a single model can be used for several tasks, such as text generation, sentence completion, classification and translation. In addition, LLMs appear to be able to make reasonable predictions when given only a few labeled examples, so-called few-shot learning.
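To make few-shot learning concrete, here is a minimal sketch of few-shot prompting for sentiment classification. Everything in it is illustrative: the examples and prompt layout are assumptions, and no model is actually called. The point is that the "training" data lives in the prompt itself, with no gradient updates.

```python
# A minimal sketch of few-shot prompting: a handful of labeled examples
# are placed directly in the prompt so the model can infer the task.
# Examples and format are illustrative assumptions.

few_shot_examples = [
    ("The battery dies within an hour.", "negative"),
    ("Setup took thirty seconds and it just works.", "positive"),
    ("Shipping was fine, the product was not.", "negative"),
]

def build_prompt(new_review: str) -> str:
    """Prepend labeled examples so the model can infer the pattern."""
    lines = ["Classify each review as positive or negative.", ""]
    for text, label in few_shot_examples:
        lines.append(f"Review: {text}\nSentiment: {label}\n")
    lines.append(f"Review: {new_review}\nSentiment:")
    return "\n".join(lines)

prompt = build_prompt("I would buy this again without hesitation.")
print(prompt)  # send this string to any text-completion LLM endpoint
```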
Let's take a closer look at three different development paths available for LLMs, evaluate the potential drawbacks each may face in the future, and consider possible solutions.
Open source
Open-source LLMs are created as open collaboration software, where the original source code and models are made freely available for redistribution and modification. This allows AI scientists to work on the high-performance capabilities of the models and use them (for free) for their own projects, rather than limiting model development to a select group of technology companies.
A few examples are BLOOM, YaLM and even Salesforce's CodeGen, which provide environments that enable rapid and scalable AI/ML development. While open-source development is by definition open to contributors, it involves high development costs. Hosting, training and even fine-tuning these models is an additional burden, requiring investment, specialized knowledge and large numbers of dedicated GPUs.
The continued investment in, and open-sourcing of, these technologies by technology companies may be motivated by brand-related goals, such as demonstrating the company's leadership in the field, or by more practical ones, such as discovering alternative sources of added value that the wider community can help uncover.
In other words, investment and human guidance are needed to make these technologies usable for business applications. Model customization can often be achieved by fine-tuning on a certain amount of human-labeled data, or through ongoing interaction between developers and the outputs the models generate.
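As a rough sketch of what such customization can look like in practice, the snippet below fine-tunes a small open model on a human-labeled file using the Hugging Face transformers and datasets libraries. The file name, column names and the compact DistilBERT checkpoint (standing in for a much larger LLM) are all assumptions for illustration.

```python
# A sketch of fine-tuning on human-labeled data with Hugging Face APIs.
# "human_labeled.csv" (columns: text,label) and the model choice are
# hypothetical; a small checkpoint stands in for a full-scale LLM.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("csv", data_files="human_labeled.csv")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Convert raw text into token IDs the model can consume.
    return tokenizer(batch["text"], truncation=True, padding="max_length")

dataset = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetune-out", num_train_epochs=1),
    train_dataset=dataset["train"],
)
trainer.train()
```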
Product
The clear leader here is OpenAI, which has created the most useful models and made some of them available through an API. But many smaller startups, such as CopyAI, JasperAI and Contenda, are kickstarting the development of their own LLM-powered applications on top of the "model-as-a-service" offerings of leaders in the field.
As these smaller companies compete for a share of their respective markets, they harness the power of supercomputer-scale models, tailoring them to the task at hand while using a much smaller amount of data. Their applications are usually trained to solve a single task and target a specific, much narrower market segment.
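To illustrate the model-as-a-service pattern, here is a sketch of calling a hosted completion endpoint, written against the OpenAI Python SDK as it looked around the time of writing. The model name and SDK details change over time, so treat this as illustrative rather than current.

```python
# A sketch of building on a hosted "model-as-a-service" API. Model name
# and SDK shape reflect the era of this article and will vary; the point
# is that the heavy lifting happens on the provider's side.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.Completion.create(
    model="text-davinci-003",
    prompt="Write a two-sentence product description for a solar lantern.",
    max_tokens=80,
    temperature=0.7,
)
print(response["choices"][0]["text"].strip())
```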
Other companies are developing their own models that compete with OpenAI's, helping to advance the science of generative AI. Examples include AI21, Cohere and EleutherAI's GPT-J-6B, whose models generate or classify text.
Another application of language models is code generation. Companies like OpenAI and GitHub (with the GitHub Copilot plugin based on OpenAI Codex), Tabnine and Kite produce tools for automatic code generation.
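For a taste of what such tools do under the hood, the sketch below asks an open-source code model to complete a function from its signature and docstring. The Salesforce CodeGen checkpoint named here is one publicly available option; any causal code model would slot in the same way.

```python
# A sketch of automatic code generation with an open-source code model.
# "Salesforce/codegen-350M-mono" is one public checkpoint; larger models
# produce better completions but need more GPU memory.
from transformers import pipeline

generator = pipeline("text-generation", model="Salesforce/codegen-350M-mono")

prompt = 'def fibonacci(n):\n    """Return the n-th Fibonacci number."""\n'
completion = generator(prompt, max_new_tokens=64)[0]["generated_text"]
print(completion)
```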
Internal use
Tech giants such as Google, DeepMind and Amazon keep their own versions of LLMs, some of which are based on open-source data, in-house. They research and develop their models to advance the field of language AI; to use them as classifiers for business functions such as social media moderation and classification; or to cover the long tail of large collections of written requests, such as generating ads and product descriptions.
What are the limitations of LLMs?
We have already discussed some disadvantages, such as high development and maintenance costs. Let’s take a closer look at the more technical issues and the possible ways to solve them.
According to research, larger models generate false answers, conspiracy theories and unreliable information more often than smaller ones. For example, the 6B-parameter GPT-J model was 17% less accurate than its 125M-parameter counterpart.
Since LLMs are trained on internet data, they can pick up unwanted societal biases regarding race, gender, ideology and religion. In this context, alignment with diverse human values still remains a particular challenge.
Providing open access to these models, as in the recent Galactica case, can also be risky. Without prior human verification, the models can inadvertently produce racist remarks or inaccurate scientific claims.
Is there a solution to improve LLMs?
Model scaling alone seems less likely to improve truthfulness and avoid explicit content than fine-tuning with training objectives other than text imitation.
A bias- or truth-detection system built around a supervised classifier, one that analyzes content to find the parts that fit the definition of "biased" for a given case, may be one way to catch this type of error. But that still leaves the problem of training the model.
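One way to sketch that idea: run candidate text through a separate, supervised detector and flag anything scoring above a threshold. The checkpoint below ("unitary/toxic-bert") and the 0.5 cutoff are illustrative assumptions; in practice, the working definition of "biased" is encoded in the classifier's training labels.

```python
# A sketch of controlled classification for content screening. The model
# and threshold are illustrative; real deployments tune both per use case.
from transformers import pipeline

detector = pipeline("text-classification", model="unitary/toxic-bert")

candidates = [
    "The weather in Lisbon is mild in October.",
    "People from that region are all liars.",
]
for text in candidates:
    result = detector(text)[0]          # top label and its score
    status = "FLAG" if result["score"] > 0.5 else "ok"
    print(f"{status}\t{result['score']:.2f}\t{text}")
```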
The solution is data, or more specifically, a large amount of data that has been labeled by humans. Once the system has been fed enough samples, along with the human annotations that locate explicit content, the parts of the dataset identified as malicious or false can be removed or masked to prevent them from surfacing in the model's output.
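A minimal sketch of that filtering step, with hypothetical field names: given annotations marking which samples are explicit or false, drop them before the data ever reaches training.

```python
# A sketch of masking out annotated samples before training. The sample
# structure and flag vocabulary are hypothetical illustrations.
labeled_samples = [
    {"text": "A neutral sentence about gardening.", "flags": []},
    {"text": "A fabricated scientific claim.", "flags": ["false"]},
    {"text": "An explicit remark.", "flags": ["explicit"]},
]

BLOCKED = {"explicit", "false", "biased"}

# Keep only samples whose human-assigned flags contain nothing blocked.
clean_dataset = [s for s in labeled_samples
                 if not BLOCKED.intersection(s["flags"])]

print(f"kept {len(clean_dataset)} of {len(labeled_samples)} samples")
```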
In addition to detecting biases, human evaluation can be used to assess texts for fluency, readability, naturalness of language, grammatical errors, coherence, logic and relevance.
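In code, such an evaluation pipeline often reduces to aggregating annotator ratings per criterion. The criteria, the 1-5 scale and the plain mean below are illustrative assumptions; production systems typically also weight annotators by reliability.

```python
# A sketch of aggregating human evaluation scores per criterion.
# Criteria and the 1-5 rating scale are illustrative assumptions.
from statistics import mean

ratings = {  # text_id -> criterion -> one score per annotator
    "text_001": {"fluency": [5, 4, 5], "coherence": [4, 4, 3], "relevance": [5, 5, 4]},
    "text_002": {"fluency": [2, 3, 2], "coherence": [2, 2, 3], "relevance": [3, 2, 2]},
}

for text_id, criteria in ratings.items():
    summary = {c: round(mean(scores), 2) for c, scores in criteria.items()}
    print(text_id, summary)
```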
Not quite AGI yet
No doubt some truly impressive advances have been made in AI language models in recent years, and scientists have been able to make strides in some of the most difficult areas of the field. But despite their progress, LLMs still lack some of the most important aspects of intelligence, such as common sense, victim detection, explicit language detection, and intuitive physics.
As a result, some researchers question whether language-only training is the best way to build truly intelligent systems, regardless of how much data is used. Language works well as a compression system for conveying the essence of a message, but it is difficult to learn the specifics and contexts of human experience through language alone.
A system trained on both form and meaning — simultaneously on videos, images, sounds, and text, for example — could help advance the science of natural language understanding. In any case, it will be interesting to see where the development of robust LLM systems will take science. However, one thing is hard to doubt: the potential value of LLMs is still significantly greater than what has been achieved so far.
Fedor Zhdanov is head of ML at Toloka.