Focusing on transformer-based, decoder-only language models with 100 million to 5 billion parameters, researchers surveyed 59 state-of-the-art open-source models, examining innovations in architecture, training datasets, and training algorithms.
They also evaluated model capabilities in areas such as commonsense reasoning, in-context learning, mathematics, and coding.
To assess on-device performance, they benchmarked inference latency and memory usage.
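To make that kind of measurement concrete, here is a minimal sketch of an inference benchmark in the spirit of what the survey describes, assuming a Hugging Face Transformers checkpoint in PyTorch; the model ID "Qwen/Qwen2-0.5B" is an illustrative stand-in rather than a confirmed member of the 59 surveyed models, and the survey's actual harness, devices, and metrics may differ.

```python
# Rough on-device-style benchmark sketch: prefill latency, per-token decode
# latency, and parameter memory for a small causal LM. Illustrative only.
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Qwen/Qwen2-0.5B"  # hypothetical example of a sub-5B model

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID).eval()

prompt = "Explain why the sky is blue in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    # Prefill latency: one forward pass over the full prompt.
    start = time.perf_counter()
    model(**inputs, use_cache=True)
    prefill_s = time.perf_counter() - start

    # Decode latency: generate a fixed number of new tokens, averaged per token.
    new_tokens = 64
    start = time.perf_counter()
    model.generate(**inputs, max_new_tokens=new_tokens, do_sample=False)
    decode_s = (time.perf_counter() - start) / new_tokens

# Memory footprint: parameter bytes dominate on-device, alongside the KV cache.
param_bytes = sum(p.numel() * p.element_size() for p in model.parameters())

print(f"prefill latency:  {prefill_s * 1000:.1f} ms")
print(f"decode latency:   {decode_s * 1000:.1f} ms/token")
print(f"parameter memory: {param_bytes / 1e9:.2f} GB")
```

A fuller harness would also repeat runs for warm-up, track peak resident memory, and measure the KV-cache growth with context length, but the prefill/decode split above is the core of a latency benchmark.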
The term “small” is inherently subjective and relative, and its meaning may evolve over time as device memory continues to expand, allowing for larger “small language models” in the future.
The study set 5 billion parameters as the upper bound for small language models (SLMs), since, as of September 2024, 7-billion-parameter large language models (LLMs) are still predominantly deployed in the cloud.
SLMs are designed for resource-efficient deployment on devices such as desktops, smartphones, and wearables.
The goal is to make advanced machine intelligence accessible and affordable to everyone, much as human cognition is universally available.
SLMs are already widely integrated into commercial devices. For example, the latest Google and Samsung smartphones ship with built-in LLM services, such as Gemini Nano, that let third-party apps access LLM capabilities through prompts and modular integrations.