The principle of Language Model (LM) Assertions is implemented in the DSPy programming framework.
The objective is to make programs more steerable, reliable and accurate by putting a guiding framework in place for the LLM output.
According to the study, across four different text generation tasks, LM Assertions not only helped generative AI applications follow rules better but also improved task results, meeting constraints up to 164% more often and generating up to 37% better responses.
When an assertion constraint fails, the pipeline can backtrack and retry the failing module. LM Assertions provide feedback on retry attempts: the erring output and the error message are injected into the prompt so the LM can introspectively self-refine its output.
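Conceptually, the retried prompt is the original prompt extended with the previous failing output and the assertion's feedback message. A simplified sketch of that augmentation (my own illustration, not DSPy's exact template):

def build_retry_prompt(original_prompt: str, failed_output: str, feedback: str) -> str:
    # Inject the erring output and the error message back into the prompt
    # so the LM can introspectively self-refine its next attempt.
    return (
        f"{original_prompt}\n\n"
        f"Your previous attempt was:\n{failed_output}\n\n"
        f"Instruction: {feedback}\n"
        "Revise your answer so that it satisfies the instruction."
    )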
There are two types of assertions: hard and soft.
Hard assertions represent critical conditions that, when still violated after a maximum number of retries, cause the LM pipeline to halt (if so configured), signalling a non-negotiable breach of requirements.
Suggestions, on the other hand, denote desirable but non-essential properties; their violation triggers the same self-refinement process, but exceeding the maximum number of retries does not halt the pipeline. Instead, the pipeline continues to execute the next module.
DSPy Assert
The use of dspy.Assert is recommended during the development stage, where assertions act as checkers or scanners that ensure the LM behaves as expected. This makes them a very descriptive way of identifying and addressing errors early in the development cycle.
Below is a basic example of how to formulate an Assert and a Suggest.
dspy.Assert(your_validation_fn(model_outputs), "your feedback message", target_module="YourDSPyModuleSignature")
dspy.Suggest(your_validation_fn(model_outputs), "your feedback message", target_module="YourDSPyModuleSignature")
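As a concrete, minimal illustration of this template (the checker, the answer value and the module name are hypothetical stand-ins, not part of DSPy):

import dspy

def is_one_word(answer: str) -> bool:
    # Hypothetical validation function: pass only single-word answers.
    return len(answer.split()) == 1

answer = "Paris"  # stand-in for a model output

# Hard constraint: halts the pipeline if it still fails after retries.
dspy.Assert(is_one_word(answer), "Answer with a single word.",
            target_module="GenerateAnswer")

# Soft constraint: logs the failure after retries and carries on.
dspy.Suggest(is_one_word(answer), "Prefer a single-word answer.",
             target_module="GenerateAnswer")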
When the assertion criterion is not met, resulting in a failure, dspy.Assert triggers a sophisticated retry mechanism that allows the pipeline to adjust. Hence the program or pipeline is not necessarily terminated.
When an Assert fails, the pipeline transitions to a special retry state, allowing it to reattempt the failing LM call while being aware of its previous attempts and the error message raised.
After a maximum number of self-refinement attempts, if the assertion still fails, the pipeline transitions to an error state and raises an AssertionError, terminating the pipeline.
This makes Assert much more powerful than a conventional assert statement, leveraging the LM to conduct retries and adjustments before concluding that an error is irrecoverable.
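In practice, assertions only take effect once the program is wrapped with a backtracking handler. Below is a sketch using the activation pattern from DSPy's assertions module at the time of the study; API details may differ in newer versions, and the module itself is a toy example:

import dspy
from dspy.primitives.assertions import assert_transform_module, backtrack_handler

class OneWordQA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.Predict("question -> answer")

    def forward(self, question):
        prediction = self.generate(question=question)
        # On failure this triggers the retry state with feedback; if it
        # still fails after the retry budget, the pipeline raises and halts.
        dspy.Assert(len(prediction.answer.split()) == 1,
                    "Answer with a single word.")
        return prediction

# Wrap the program so failed assertions trigger backtracking and retries.
qa = assert_transform_module(OneWordQA(), backtrack_handler)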
DSPy Suggest
dspy.Suggest is best utilised as a helper during the evaluation phase, offering guidance and potential corrections without halting the pipeline.
dspy.Suggest(len(unfaithful_pairs) == 0, f"Make sure your output is based on the following context: '{context}'.", target_module=GenerateCitedParagraph)
In contrast to asserts, suggest statements provide gentler recommendations rather than strict enforcement of conditions.
These suggestions guide the LM pipeline towards desired outcomes in specific domains. If a Suggest condition isn’t met, similar to Assert, the pipeline enters a special retry state, enabling retries of the LM call and self-refinement.
However, if the suggestion consistently fails after multiple attempts at self-refinement, the pipeline logs a SuggestionError warning and continues execution.
This flexibility allows the pipeline to adapt its behaviour based on suggestions while remaining resilient to less-than-optimal states or heuristic computational checks.
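To ground the earlier snippet, here is a compressed sketch of how such a Suggest might sit inside a module's forward pass. The faithfulness checker below is a naive hypothetical stand-in for the citation-verification step in the DSPy example:

import dspy

def find_unfaithful_pairs(paragraph: str, context: str) -> list:
    # Hypothetical checker: flag sentences not found in the retrieved
    # context. A real implementation might use an LM judge instead.
    return [s for s in paragraph.split(". ") if s and s not in context]

class GenerateCitedParagraph(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.Predict("context, question -> paragraph")

    def forward(self, context, question):
        prediction = self.generate(context=context, question=question)
        unfaithful_pairs = find_unfaithful_pairs(prediction.paragraph, context)
        # Soft constraint: retried with feedback, but never halts the pipeline.
        dspy.Suggest(len(unfaithful_pairs) == 0,
                     f"Make sure your output is based on the following context: '{context}'.",
                     target_module=GenerateCitedParagraph)
        return prediction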
LM Assertions, a novel programming construct designed to enforce user-specified properties on LM outputs within a pipeline. — Source
Considering the example below, two Suggests are made, as opposed to Asserts. The first suggests a maximum query length, and the second that the query be distinct from previous queries.
dspy.Suggest(
    len(query) <= 100,
    "Query should be short and less than 100 characters",
)
dspy.Suggest(
    validate_query_distinction_local(prev_queries, query),
    "Query should be distinct from: "
    + "; ".join(f"{i+1}) {q}" for i, q in enumerate(prev_queries)),
)
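The validate_query_distinction_local helper is not shown in the excerpt; a plausible local implementation (an assumption on my part, not the paper's code) could be:

def validate_query_distinction_local(prev_queries: list, query: str) -> bool:
    # Treat a query as distinct if it does not duplicate any previously
    # issued query, ignoring case and surrounding whitespace.
    return query.strip().lower() not in {q.strip().lower() for q in prev_queries}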
I believe much can be gleaned from this implementation of guardrails…
- The guardrails can be described in natural language and the LLM can be leveraged to self-check its responses.
- More complicated statements can be created in Python, where values are passed to functions that perform the checks.
- The flexibility of describing the guardrails lends a high degree of freedom to what can be set for specific implementations.
- The division between assertions and suggestions is beneficial, as it allows for a clearer delineation of checks.
- Additionally, the ability to define recourse adds another layer of flexibility and control to the process.
- The study’s language primarily revolves around constraining the LLM and defining runtime retry semantics.
- This approach also serves as an abstraction layer that folds self-refinement methods into arbitrary pipeline steps.
I’m currently the Chief Evangelist @ Kore AI. I explore & write about all things at the intersection of AI & language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces & more.