Small language models shine for domain-specific or specialized use cases, while making it easier for enterprises to balance performance, cost, and security concerns.
Since ChatGPT arrived in late 2022, large language models (LLMs) have continued to raise the bar for what generative AI systems can accomplish. GPT-3.5, which powered ChatGPT, achieved 85.5% accuracy on common sense reasoning data sets, while GPT-4 reached around 95% accuracy on the same data sets in 2023. And where GPT-3.5 and GPT-4 focused primarily on text processing, GPT-4o, released in May 2024, is multimodal, handling text, images, audio, and video.
Despite the impressive advances made by the GPT family of models and by open-source large language models, Gartner notes in its 2024 hype cycle for artificial intelligence that “generative AI has passed the peak of inflated expectations, although hype about it continues.” Some reasons for the disillusionment include the high costs associated with the GPT family …