For instance, a dedicated SLM could be used to generate dynamic creative assets in real time, focusing solely on that specific function. This contrasts with multimodal LLMs like Microsoft’s Copilot, which are trained to perform multiple tasks, such as writing code or generating images from text.
There are a handful of SLMs on the market, including Microsoft’s Phi 2 and Orca 2 (which is built on Meta’s open-source Llama 2), Google’s T5-Small and BERT, and EleutherAI’s GPT-Neo, an open-source alternative to OpenAI’s GPT models.
These models can also run locally, such as on a mobile device, which is driving much of the interest in SLMs today, said Lawrence.
And while training an LLM can take months, sometimes years, an SLM can be trained in one week, according to Olaye.
What are its use cases?
AT&T began using SLMs late last year for simpler tasks that require less complex reasoning, such as subdocument summarization and classification within portions of its question-and-answer Ask AT&T chat applications for internal documents, said Mark Austin, the company’s vp of data science.
“While there is a cost savings, the main focus was for speed, which is important if you’re using it to build metadata, for example, across hundreds of thousands of documents,” said Austin.
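To make that workflow concrete, here is a minimal sketch of how compact, task-specific models can summarize and tag document chunks in bulk. It is an illustration only: the model choices, category labels and helper function are assumptions, not a description of AT&T’s actual setup.

```python
# Illustrative sketch only: summarizing and tagging document chunks with
# compact, task-focused models via Hugging Face transformers. The model
# names, labels and helper below are assumptions, not AT&T's actual setup.
from transformers import pipeline

# T5-Small, one of the compact models mentioned above, handles summarization.
summarizer = pipeline("summarization", model="t5-small")

# A distilled zero-shot classifier routes chunks into metadata categories.
classifier = pipeline("zero-shot-classification",
                      model="valhalla/distilbart-mnli-12-1")

CATEGORIES = ["billing", "network", "devices", "policy"]  # hypothetical tags


def build_metadata(chunk: str) -> dict:
    """Return a short summary and a category label for one document chunk."""
    summary = summarizer(chunk, max_length=60, min_length=15,
                         do_sample=False)[0]["summary_text"]
    category = classifier(chunk, candidate_labels=CATEGORIES)["labels"][0]
    return {"summary": summary, "category": category}


if __name__ == "__main__":
    sample = ("Customers on legacy unlimited plans can move to 5G service "
              "without a contract change; section 4 covers eligibility "
              "rules and proration details.")
    print(build_metadata(sample))
```

Because each call hits a model that is far smaller than a frontier LLM, the same loop can run across hundreds of thousands of document chunks much faster and more cheaply, which is the speed advantage Austin describes.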
While R/GA’s brands have yet to explore SLMs for consumer-facing campaigns, held back by copyright and privacy concerns, some are using the technology to streamline internal processes.
For example, one brand streamlined its legal review process, which supports the rest of the business and third parties, with an SLM-powered chatbot trained on a small set of the brand’s assets, according to Olaye, who declined to share specifics about the brand.
“[Brands’] legal and business affairs team take a lot of calls from people asking, ‘Can I use this asset?’ ‘Is this the right copy?’” he said. “We went into a process of automating that. Now, the bot can bypass a lot of the questions that you normally pick up the phone to talk to legal about.”
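Olaye doesn’t detail how that bot is built, but one common pattern is to pair a compact embedding model with a small store of approved guidance and retrieve the closest match for each question. The sketch below follows that pattern; the guideline text, model choice and function are hypothetical.

```python
# A minimal retrieval sketch, not the agency's actual system: a compact
# embedding model matches asset-usage questions against a small set of
# pre-approved legal guidelines. All entries and names here are hypothetical.
from sentence_transformers import SentenceTransformer, util

GUIDELINES = [
    "Holiday campaign photography may be reused on owned channels through "
    "December 2025; paid media placements still require legal sign-off.",
    "The brand wordmark must not be recolored or placed on busy backgrounds.",
    "Influencer-supplied footage cannot be reused without a separate release.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # compact embedding model
guideline_vectors = model.encode(GUIDELINES, convert_to_tensor=True)


def answer(question: str) -> str:
    """Return the guideline most relevant to an asset-usage question."""
    query_vector = model.encode(question, convert_to_tensor=True)
    best = util.cos_sim(query_vector, guideline_vectors).argmax().item()
    return GUIDELINES[best]


print(answer("Can I use the holiday campaign photos in a paid social ad?"))
```

In practice a generative SLM would sit on top of a retrieval step like this to phrase the answer conversationally, but even the lookup alone shows how routine “Can I use this asset?” questions can be deflected before anyone picks up the phone to call legal.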
What are the limitations?
The technology is still in its infancy. While SLMs mitigate hallucinations to some degree, hallucinations may still occur, albeit less frequently than with LLMs, said Olaye.
While narrower training data makes SLMs more specialized, it also limits their breadth of knowledge, hindering their ability to handle complex tasks compared with multimodal LLMs.
“There’s a lot of unknown about SLMs and where exactly they fit,” said Lawrence.
Many SLMs are open-source, which raises concerns about data privacy and security and could hinder widespread adoption.
“Responsible AI use means understanding the risks and how to safely navigate them, and that includes only sharing information that is safe to share,” said Lawrence. “Just because a model is customized to train on specific data doesn’t mean it shouldn’t go through the same protections, so the same approach to responsible use should apply regardless of the model size.”