AI Agents: Exploring Agentic Applications | by Cobus Greyling | Jul, 2024

29 Jul

Applications based on LLMs are evolving, and the next step in the progression of AI Agents is Agentic Applications. Agentic applications still have a Foundation Model as their backbone, but have more agency.

Agentic applications are AI-driven systems designed to autonomously perform tasks and make decisions based on user inputs and environmental context.

These applications leverage advanced models and tools to plan, execute, and adapt their actions dynamically.

By integrating capabilities like tool access, multi-step reasoning, and real-time adjustments, agentic applications can generate and complete complex workflows and provide intelligent solutions.

I must add that while many theories and future projections are based on speculation, I prioritise prototyping and creating working examples. This approach grounds commentary in practical experience, leading to more accurate future projections.

Generative and language-related AI is moving at a tremendous pace. As recently as 2018, the first notion of prompt engineering was introduced: combining NLP tasks and casting them as a single question-answering problem within a specific context.

As recently as April 2021, the term RAG was coined by a researcher and described in Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.

Only in January 2022 was the chain-of-thought prompting technique proposed by Google researchers.

In September 2022, OpenAI introduced Whisper, an open-source acoustic model that approaches human-level robustness and accuracy on speech recognition.

In 2023 we saw Large Language Models progress beyond a text-only interface with the introduction of image processing and audio.

The term Foundation Model was an apt new reference to Large Language Models which, apart from generating compelling text, can also generate images, videos, speech, music, and more.

The term Foundation Model had already been coined in August 2021 by the Stanford Institute for Human-Centered Artificial Intelligence.

Also in 2023 we saw the rise of Small Language Models (SLMs). Even though SLMs have a small footprint, they have advanced capabilities in reasoning, Natural Language Generation (NLG), context and dialog management, and more.

In 2023 we also saw the rise of Agents. Agents have an LLM as their backbone and also have access to one or more tools to perform specific tasks.

Agents are able to answer highly ambiguous and complex questions…

Agents leverage LLMs to make a decision on which Action to take. After an Action is completed, the Agent enters the Observation step.

From the Observation step, the Agent shares a Thought; if a final answer is not reached, the Agent cycles back to another Action in order to move closer to a Final Answer.
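To ground the cycle described above, here is a minimal sketch of the Action, Observation, Thought loop in Python. It is an illustration under stated assumptions, not how any particular agent framework implements it: `choose_action` and `reflect` are hypothetical, scripted stand-ins for the two places where a real agent would call an LLM, and the `add` and `multiply` tools, the `Step` record and the `max_cycles` limit are names I introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Step:
    action: str        # tool the agent chose
    action_input: str  # input handed to that tool
    observation: str   # what the tool returned
    thought: str       # the agent's comment on the observation


def add(args: str) -> str:
    a, b = (int(x) for x in args.split())
    return str(a + b)


def multiply(args: str) -> str:
    a, b = (int(x) for x in args.split())
    return str(a * b)


# Hypothetical tool set for this example only.
TOOLS: Dict[str, Callable[[str], str]] = {"add": add, "multiply": multiply}


def choose_action(question: str, history: List[Step]) -> Tuple[str, str]:
    """Assumption: scripted stand-in for the LLM call that picks the next Action."""
    if not history:
        return "add", "3 4"
    return "multiply", f"{history[-1].observation} 2"


def reflect(
    question: str, history: List[Step], observation: str
) -> Tuple[str, Optional[str]]:
    """Assumption: scripted stand-in for the LLM's Thought.

    Returns the thought and, once the agent is done, the final answer.
    """
    if not history:
        return f"The sum is {observation}; it still has to be doubled.", None
    return f"The result is {observation}.", observation


def run_agent(question: str, max_cycles: int = 5) -> Tuple[Optional[str], List[Step]]:
    """Cycle Action -> Observation -> Thought until a final answer is reached."""
    history: List[Step] = []
    for _ in range(max_cycles):
        action, action_input = choose_action(question, history)       # Action
        observation = TOOLS[action](action_input)                     # Observation
        thought, final = reflect(question, history, observation)      # Thought
        history.append(Step(action, action_input, observation, thought))
        if final is not None:
            return final, history                                     # Final Answer
    return None, history  # gave up without reaching a final answer


if __name__ == "__main__":
    answer, steps = run_agent("What is (3 + 4) * 2?")
    for step in steps:
        print(step.action, step.action_input, "->", step.observation, "|", step.thought)
    print(answer)
```

The cap on cycles is a design choice on my side: without it, an agent that never reaches a final answer would loop indefinitely.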

Agents are empowered by tools; these can include math libraries, web search, weather APIs, and other integration points.
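One way to picture tool access is as a small registry the agent can describe to the LLM and dispatch into. The sketch below is an assumption-laden illustration rather than any framework's API: the `Tool` record, `REGISTRY`, `describe_tools` and `call_tool` are hypothetical names, and the web search and weather tools are stubs that deliberately call no real service.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Tool:
    name: str
    description: str           # text the LLM would see when choosing a tool
    run: Callable[[str], str]  # takes the tool input, returns an observation


def square_root(text: str) -> str:
    # A real math tool backed by the standard library.
    return str(math.sqrt(float(text)))


def web_search_stub(query: str) -> str:
    # Placeholder: a real agent would call a search service here.
    return f"[no search backend configured for {query!r}]"


def weather_stub(city: str) -> str:
    # Placeholder: a real agent would call a weather API here.
    return f"[no weather API configured for {city!r}]"


REGISTRY: Dict[str, Tool] = {
    tool.name: tool
    for tool in (
        Tool("sqrt", "Square root of a number.", square_root),
        Tool("web_search", "Search the web for a query.", web_search_stub),
        Tool("weather", "Current weather for a city.", weather_stub),
    )
}


def describe_tools(registry: Dict[str, Tool]) -> str:
    """Render one line per tool, suitable for inclusion in a prompt."""
    return "\n".join(f"{t.name}: {t.description}" for t in registry.values())


def call_tool(registry: Dict[str, Tool], name: str, tool_input: str) -> str:
    """Dispatch to a tool by name; unknown names come back as an observation."""
    tool = registry.get(name)
    if tool is None:
        return f"Unknown tool {name!r}. Available: {', '.join(registry)}"
    return tool.run(tool_input)


if __name__ == "__main__":
    print(describe_tools(REGISTRY))
    print(call_tool(REGISTRY, "sqrt", "16"))
```

Returning an unknown-tool message as an ordinary observation, instead of raising, lets the agent read the error and pick a different Action on its next cycle.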

Agentic Applications can be seen as the next step in this progression, where the agent application has more agency because it can browse and interpret the web, has mobile understanding, and can access multiple modalities.
