26Aug

I Built an AI Content Repurposing Machine (Full Code Included) | by Hasan Aboul Hasan | Aug, 2024


Here are both the code we used above and the resources.py file, which contains some of my power prompts:

Foundation Code

The Prompts

Now, it’s time for the real deal, what you’ve been waiting for!

In this strategy, with a single click, you’ll be able to repurpose your content into three different pieces of content, all formatted as a JSON output, which makes them easier to access.

No new resources will be needed to use this method; all you need is the updated code. Here it is:

import json
from SimplerLLM.tools.generic_loader import load_content
from SimplerLLM.language.llm import LLM, LLMProvider
from resources import text_to_x_thread, text_to_summary, text_to_newsletter, format_to_json

llm_instance = LLM.create(provider=LLMProvider.OPENAI, model_name="gpt-4o")

# This can take a YouTube video link, a blog post link, a CSV file, etc... as input too
file = load_content("https://learnwithhasan.com/create-ai-agents-with-python/")

# Building the 3 prompts
x_prompt = text_to_x_thread.format(input=file.content)
newsletter_prompt = text_to_newsletter.format(input=file.content)
summary_prompt = text_to_summary.format(input=file.content)

# Generating the 3 types of social posts
x_thread = llm_instance.generate_response(prompt=x_prompt, max_tokens=1000)
with open("twitter.txt", "w", encoding="utf-8") as f:
    f.write(x_thread)

newsletter_section = llm_instance.generate_response(prompt=newsletter_prompt, max_tokens=1000)
with open("newsletter.txt", "w", encoding="utf-8") as f:
    f.write(newsletter_section)

bullet_point_summary = llm_instance.generate_response(prompt=summary_prompt, max_tokens=1000)
with open("summary.txt", "w", encoding="utf-8") as f:
    f.write(bullet_point_summary)

# Converting them into JSON format
final_prompt = format_to_json.format(input_1=x_thread,
                                     input_2=newsletter_section,
                                     input_3=bullet_point_summary)

response = llm_instance.generate_response(prompt=final_prompt, max_tokens=3000)

# Validate and write JSON with indentation for readability
try:
    json_data = json.loads(response)
    with open("Json_Result.json", "w", encoding="utf-8") as f:
        json.dump(json_data, f, ensure_ascii=False, indent=4)
    print("JSON saved successfully.")
except json.JSONDecodeError as e:
    print("Error in JSON format:", e)
    with open("Json_Result.json", "w", encoding="utf-8") as f:
        f.write(response)

This code’s structure is very similar to the one above; we’re using the same functions here, too.

The main difference is that instead of generating only one type of content, we’ll use OpenAI’s GPT model to repurpose the input of choice into three different types of content. We’ll save each output in a TXT file with its respective name.

💡 Note that you can edit the prompts here, too, in order to get different results, as I showed you above. If you face any obstacles, we’ll be here to help you on the forum!

Then, after we get the three outputs, we’ll merge them into one JSON output using a power prompt and save it in a JSON file.

However, GPT’s response doesn’t always come back in a valid JSON format. That’s why I added a try-except statement: if parsing fails, the script prints the error and saves the response as raw text.

I can’t get into the details of fixing this, but you can check this blog post; it will definitely help you improve the results.
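For example, one simple improvement you can make inside this script is to retry the formatting prompt a few times before falling back to raw text. Here is a minimal, hedged sketch that reuses the llm_instance and final_prompt from the code above (the retry count of 3 is an arbitrary choice):

import json

def generate_valid_json(llm_instance, prompt, retries=3, max_tokens=3000):
    """Ask the model for JSON and retry a few times if parsing fails."""
    for attempt in range(retries):
        response = llm_instance.generate_response(prompt=prompt, max_tokens=max_tokens)
        try:
            return json.loads(response)  # valid JSON: return the parsed object
        except json.JSONDecodeError:
            print(f"Attempt {attempt + 1} returned invalid JSON, retrying...")
    return None  # the caller can then fall back to saving the raw text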

Now, let’s try it and see what we get!

As you can see, four new files are created: three of them contain the individually generated pieces of content, and the fourth is a JSON file containing the final JSON-formatted output.

Play with the prompts as you like, and you’ll get new results that match your preferences. If you face any problems, don’t hesitate to ask us for help on the forum. We’ll always be there to help you.

👉 The Code

The script works perfectly in the terminal, but why don’t we build a simple, user-friendly interface that makes it easier to run the code?

Plus, people who don’t know anything about coding will be able to use it without interacting with the code at all.

This is super simple if we combine streamlit with our power prompt below:

Act as an expert Python programmer specialized in building user-friendly UIs using Streamlit.  Create a Streamlit UI for the provided script. Make sure to comment all the code to enhance understanding, particularly for beginners. 
Choose the most suitable controls for the given script and aim for a professional, user-friendly interface.
The target audience is beginners who are looking to understand how to create user interfaces with Streamlit. The style of the response should be educational and thorough. Given the instructional nature, comments should be used extensively in the code to provide context and explanations. Output: Provide the optimized Streamlit UI code, segmented by comments explaining each part of the code for better understanding. Input: Provided script: {your input script}

This prompt is part of the premium prompt library, which is updated every month with new special prompts.

Anyway, I used the prompt, and in seconds, I created a UI for my tool with Streamlit. Here’s the code it generated:

import streamlit as st
import json
from SimplerLLM.tools.generic_loader import load_content
from SimplerLLM.language.llm import LLM, LLMProvider
from resources import text_to_x_thread, text_to_summary, text_to_newsletter, format_to_json

llm_instance = LLM.create(provider=LLMProvider.OPENAI, model_name="gpt-4o")

st.title("Content Generation With A Single Click")

url = st.text_input("Enter the URL or File Name of your input:")

if st.button("Generate Content"):
    if url:
        try:
            file = load_content(url)

            x_prompt = text_to_x_thread.format(input=file.content)
            newsletter_prompt = text_to_newsletter.format(input=file.content)
            summary_prompt = text_to_summary.format(input=file.content)

            x_thread = llm_instance.generate_response(prompt=x_prompt, max_tokens=1000)
            newsletter_section = llm_instance.generate_response(prompt=newsletter_prompt, max_tokens=1000)
            bullet_point_summary = llm_instance.generate_response(prompt=summary_prompt, max_tokens=1000)

            st.subheader("Generated Twitter Thread")
            st.write(x_thread)
            st.markdown("---")

            st.subheader("Generated Newsletter Section")
            st.write(newsletter_section)
            st.markdown("---")

            st.subheader("Generated Bullet Point Summary")
            st.write(bullet_point_summary)
            st.markdown("---")

            final_prompt = format_to_json.format(
                input_1=x_thread,
                input_2=newsletter_section,
                input_3=bullet_point_summary
            )
            response = llm_instance.generate_response(prompt=final_prompt, max_tokens=3000)

            try:
                json_data = json.loads(response)
                st.markdown("### __Generated JSON Result__")
                st.json(json_data)
                st.download_button(
                    label="Download JSON Result",
                    data=json.dumps(json_data, ensure_ascii=False, indent=4),
                    file_name="Json_Result.json",
                    mime="application/json"
                )
            except json.JSONDecodeError as e:
                st.error(f"Error in JSON format: {e}")
                st.write(response)
        except Exception as e:
            st.error(f"An error occurred: {e}")
    else:
        st.warning("Please enter a valid URL.")

The code above generates the three types of content, displays them, and, at the end, shows the JSON result with a download button so you can download it in one click.

Now, to run the code, you’ll need to save the code as ui.py, open a new terminal and run the following:

streamlit run ui.py

Of course, you can change the file name, but you’ll then need to use the new name in the command when you run it.

Once you run it, the following web page will open:

As you can see, it’s very simple and straightforward to use. You just enter the link or the name of the file and click the generate button to get all the results.

Rather than keeping the tool only for your use, let people use it and charge them for every use.

Let me explain:

If you build a neat user interface for your tool on your WordPress website (one of the easiest things to do), you can build a points system, and people would buy points to use these tools.

This is the technique I use on my Tools Page, where I charge people a certain number of points per use, depending on the tool they’re using.

If you want to learn how to clone the same strategy and business model I’m using, check out this guide. It teaches you how to build a SaaS on WordPress and includes a premium forum where the team will be there to help you whenever needed!

If you’re looking for a free source, I also have you covered!

Here’s a Free Guide that teaches you how to start!

Good Luck!



Source link

25Aug

Feature Extraction for Time Series, from Theory to Practice, with Python | by Piero Paialunga | Aug, 2024


Time series are a special animal.

When I started my Machine Learning career, I did it because I loved Physics (a weird reason to start Machine Learning), and through Physics I understood that I also loved coding and data science a lot. I didn’t really care about the type of data. All I wanted was to be in front of a computer writing 10k lines of code per day.

The truth is that even when you don’t care (I still really don’t), your career will drift you toward some kinds of data rather than others.

If you work at SpaceX, you probably won’t do a lot of NLP but you will do a lot of signal processing. If you work at Netflix, you might end up working with a lot of NLP and recommendation systems. If you work at Tesla you will most definitely be a Computer Vision expert and work with images.

When I started as a Physicist, and then continued with my PhD in Engineering, I was immediately thrown into the world of signals.
This is just the natural world of engineering: every time you have a setup and extract information from it, at the end of the day, you are dealing with a signal. Don’t get me wrong…



Source link

25Aug

Automating ETL to SFTP Server Using Python and SQL | by Mary Ara | Aug, 2024


Learn how to automate a daily data transfer process on Windows, from PostgreSQL database to a remote server

Photo by Shubham Dhage on Unsplash

The process of transferring files from one location to another is obviously a perfect candidate for automation. It can be daunting to do repetitively, especially when you have to perform the entire ETL (Extract, Transform, Load) process for several groups of data.

Imagine your company has its data in its data warehouse, and then it decides to contract out part of its analytics to an external data analytics supplier. This supplier offers bespoke analytics software that will display dashboards and reports for your company’s core production team.

The implication is that you, as the data engineer, will have to transfer data to this supplier daily, hourly, every 30 minutes, or at any other frequency decided upon by the outsourcing contract.

This article explains this ETL process, which includes an SFTP upload, in detail. We will incorporate the Secure File Transfer Protocol (SFTP), a secure means of transferring files between two remote servers that encrypts the transfer using the Secure Shell (SSH) protocol.
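As a rough illustration of the SFTP piece, here is a minimal sketch using the paramiko library; the host, credentials, and file paths are placeholders, not values from this tutorial:

import paramiko

# Hypothetical connection details; replace with your server's values
HOST, PORT = "sftp.example.com", 22
USERNAME, PASSWORD = "analytics_user", "********"

transport = paramiko.Transport((HOST, PORT))
transport.connect(username=USERNAME, password=PASSWORD)
sftp = paramiko.SFTPClient.from_transport(transport)
try:
    # Upload the daily extract produced by the earlier ETL steps
    sftp.put("daily_extract.csv", "/uploads/daily_extract.csv")
finally:
    sftp.close()
    transport.close()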



Source link

21Aug

27 Unique Dev Challenges: A Recent Study Explored the Top Challenges Faced by LLM Developers | by Cobus Greyling | Aug, 2024


This category includes the various error messages developers encounter when working with LLM APIs.

For example, developers might face request errors and data value capacity limit errors during API calls for image editing.

Additionally, issues related to the OpenAI server, such as gateway timeout errors, may arise. This subcategory represents 7.5% of the total challenges identified.

  1. Automating Task Processing: LLMs can automate tasks like text generation and image recognition, unlike traditional software that requires manual coding.
  2. Dealing with Uncertainty: LLMs produce variable and sometimes unpredictable outputs, requiring developers to manage this uncertainty.
  3. Handling Large-Scale Datasets: Developing LLMs involves managing large datasets, necessitating expertise in data preprocessing and resource efficiency.
  4. Data Privacy and Security: LLMs require extensive data for training, raising concerns about ensuring user data privacy and security.
  5. Performance Optimisation: Optimising LLM performance, particularly in output accuracy, differs from traditional software optimisation.
  6. Interpreting Model Outputs: Understanding and ensuring the reliability of LLM outputs can be complex and context-dependent.

The results show that 54% of these questions have fewer than three replies, suggesting that they are typically challenging to address.

The introduction of GPT-3.5 and ChatGPT in November 2022, followed by GPT-4 in March 2023, really accelerated the growth of the LLM developer community, markedly increasing the number of posts and users on the OpenAI developer forum.

The challenges faced by LLM developers are multifaceted and diverse, encompassing 6 categories and 27 distinct subcategories.

Challenges such as API call costs, rate limitations, and token constraints are closely associated with the development and use of LLM services.

Developers frequently raise concerns about API call costs, which are affected by the choice of model and the number of tokens used in each request.

Rate limitations, designed to ensure service stability, require developers to understand and manage their API call frequencies. Token limitations present additional hurdles, particularly when handling large datasets or extensive context.

Therefore, LLM providers should develop tools to help developers accurately calculate and manage token usage, including cost optimisation strategies tailored to various models and scenarios. Detailed guidelines on model selection, including cost-performance trade-offs, will aid developers in making informed decisions within their budget constraints.
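As a hedged illustration of the kind of token accounting this implies, here is a small sketch using OpenAI’s tiktoken library; the price per 1K tokens is a placeholder to be replaced with current published pricing, and the model name assumes a tiktoken release recent enough to know the gpt-4o encoding:

import tiktoken

def estimate_cost(text: str, model: str = "gpt-4o", usd_per_1k_tokens: float = 0.005) -> float:
    """Count the prompt's tokens for a given model and estimate cost with a placeholder rate."""
    encoding = tiktoken.encoding_for_model(model)
    n_tokens = len(encoding.encode(text))
    return n_tokens / 1000 * usd_per_1k_tokens

print(estimate_cost("How many tokens does this prompt use?"))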

Additionally, safety and privacy concerns are critical when using AI services. Developers must ensure their applications adhere to safety standards and protect user privacy.

OpenAI should continue to promote its free moderation API, which helps reduce unsafe content in completions by automatically flagging and filtering potentially harmful outputs.

Human review of outputs is essential in high-stakes domains and code generation to account for system limitations and verify content correctness.

Limiting user input text and restricting the number of output tokens can also help mitigate prompt injection risks and misuse, and these measures should be integrated into best practices for safe AI deployment.

By examining relevant posts on the OpenAI developer forum, it is evident that LLM development is rapidly gaining traction, with developers encountering more complex issues compared to traditional software development.

The study aims to analyse the underlying challenges reflected in these posts, investigating trends and problem difficulty levels on the LLM developer forum.

I’m currently the Chief Evangelist @ Kore AI. I explore & write about all things at the intersection of AI & language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces & more.

LinkedIn



Source link

20Aug

ChatGPT vs. Claude vs. Gemini for Data Analysis (Part 2): Who’s the Best at EDA? | by Yu Dong | Aug, 2024


Five criteria to compare ChatGPT, Claude, and Gemini in tackling Exploratory Data Analysis

· Context
· What is EDA
· Evaluation Criteria
· Problem Setup
· ChatGPT-4o
· Claude 3.5 Sonnet
· Gemini Advanced
· Final Results

Welcome back to the second installment of my series, ChatGPT vs. Claude vs. Gemini for Data Analysis! In this series, I aim to compare these AI tools across various data science and analytics tasks to help fellow data enthusiasts and professionals choose the best AI assistant for their needs. If you missed the first article, I compared their performance in writing and optimizing SQL queries — be sure to check it out!

Although the 2024 Olympics have concluded, our AI competition is just heating up. So far, Claude 3.5 Sonnet has taken the lead! But can it maintain its position, or will ChatGPT and Gemini catch up? 🏆

In this second article, we’ll focus on their ability to independently conduct Exploratory Data Analysis (EDA). As a data scientist, imagine the convenience of having an AI tool that can instantly provide data insights and recommendations for a new…



Source link

18Aug

The Math Behind Keras 3 Optimizers: Deep Understanding and Application | by Peng Qian | Aug, 2024


This is a bit different from what the books say.

The Math Behind Keras 3 Optimizers: Deep Understanding and Application. Image by DALL-E-3

Optimizers are an essential tool for everyone working in machine learning.

We all know optimizers determine how a model converges on the loss function during gradient descent. Thus, using the right optimizer can boost the performance and efficiency of model training.

Besides classic papers, many books explain the principles behind optimizers in simple terms.

However, I recently found that the performance of Keras 3 optimizers doesn’t quite match the mathematical algorithms described in these books, which made me a bit anxious. I worried about misunderstanding something or about updates in the latest version of Keras affecting the optimizers.

So, I reviewed the source code of several common optimizers in Keras 3 and revisited their use cases. Now I want to share this knowledge to save you time and help you master Keras 3 optimizers more quickly.

If you’re not very familiar with the latest changes in Keras 3, here’s a quick rundown: Keras 3 integrates TensorFlow, PyTorch, and JAX, allowing us to use cutting-edge deep learning frameworks easily through Keras APIs.
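As a quick, hedged sketch of that multi-backend design (the tiny model and random data below are purely illustrative): you pick the backend through an environment variable before importing Keras, and the same optimizer API works on top of any of the three frameworks.

import os
os.environ["KERAS_BACKEND"] = "jax"  # or "tensorflow" / "torch"; must be set before importing keras

import keras
import numpy as np

# The same optimizer API works no matter which backend is active
model = keras.Sequential([keras.Input(shape=(4,)), keras.layers.Dense(1)])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3), loss="mse")
model.fit(np.random.rand(32, 4), np.random.rand(32, 1), epochs=1, verbose=0)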



Source link

17Aug

The Azure Landing Zone for a Data Platform in the Cloud | by Mariusz Kujawski | Aug, 2024


Working with sensitive data or within a highly regulated environment requires safe and secure cloud infrastructure for data processing. The cloud might seem like an open environment on the internet and raise security concerns. When you start your journey with Azure and don’t have enough experience with resource configuration, it is easy to make design and implementation mistakes that can impact the security and flexibility of your new data platform. In this post, I’ll describe the most important aspects of designing a cloud adoption framework for a data platform in Azure.

Image by the author

An Azure landing zone is the foundation for deploying resources in the public cloud. It contains essential elements for a robust platform. These elements include networking, identity and access management, security, governance, and compliance. By implementing a landing zone, organizations can streamline the configuration process of their infrastructure, ensuring the utilization of best practices and guidelines.

An Azure landing zone is an environment that follows key design principles to enable application migration, modernization, and development. In Azure, subscriptions are used to isolate and develop application and platform resources. These are categorized as follows:

  • Application landing zones: Subscriptions dedicated to hosting application-specific resources.
  • Platform landing zone: Subscriptions that contain shared services, such as identity, connectivity, and management resources provided for application landing zones.

These design principles help organizations operate successfully in a cloud environment and scale out a platform.

Image by the author

A data platform implementation in Azure involves a high-level architecture design where resources are selected for data ingestion, transformation, serving, and exploration. The first step may require a landing zone design. If you need a secure platform that follows best practices, starting with a landing zone is crucial. It will help you organize the resources within subscriptions and resource groups, define the network topology, and ensure connectivity with on-premises environments via VPN, while also adhering to naming conventions and standards.

Architecture Design

Tailoring an architecture for a data platform requires a careful selection of resources. Azure provides native resources for data platforms such as Azure Synapse Analytics, Azure Databricks, Azure Data Factory, and Microsoft Fabric. The available services offer diverse ways of achieving similar objectives, allowing flexibility in your architecture selection.

For instance:

  • Data Ingestion: Azure Data Factory or Synapse Pipelines.
  • Data Processing: Azure Databricks or Apache Spark in Synapse.
  • Data Analysis: Power BI or Databricks Dashboards.

We may use Apache Spark and Python or low-code drag-and-drop tools. Various combinations of these tools can help us create the most suitable architecture depending on our skills, use cases, and capabilities.

High level architecture (Image by the author)

Azure also allows you to use other components such as Snowflake or create your composition using open-source software, Virtual Machines (VMs), or Kubernetes Service (AKS). We can leverage VMs or AKS to configure services for data processing, exploration, orchestration, AI, or ML.

Typical Data Platform Structure

A typical Data Platform in Azure should comprise several key components:

1. Tools for data ingestion from sources into an Azure Storage Account. Azure offers services like Azure Data Factory, Azure Synapse Pipelines, or Microsoft Fabric. We can use these tools to collect data from sources.

2. Data Warehouse, Data Lake, or Data Lakehouse: Depending on your architecture preferences, we can select different services to store data and a business model.

  • For Data Lake or Data Lakehouse, we can use Databricks or Fabric.
  • For Data Warehouse we can select Azure Synapse, Snowflake, or MS Fabric Warehouse.

3. To orchestrate data processing in Azure we have Azure Data Factory, Azure Synapse Pipelines, Airflow, or Databricks Workflows.

4. Data transformation in Azure can be handled by various services.

  • For Apache Spark: Databricks, Azure Synapse Spark Pool, and MS Fabric Notebooks,
  • For SQL-based transformation we can use Spark SQL in Databricks, Azure Synapse, or MS Fabric, T-SQL in SQL Server, MS Fabric, or Synapse Dedicated Pool. Alternatively, Snowflake offers all SQL capabilities.

Subscriptions

An important aspect of platform design is planning the segmentation of subscriptions and resource groups based on business units and the software development lifecycle. It’s possible to use separate subscriptions for production and non-production environments. With this distinction, we can achieve a more flexible security model, separate policies for production and test environments, and avoid quota limitations.

Subscriptions Organization (Image by the author)

Networking

A virtual network is similar to a traditional network that operates in your data center. Azure Virtual Networks (VNets) provide a foundational layer of security for your platform; disabling public endpoints for resources will significantly reduce the risk of data leaks in the event of lost keys or passwords. Without public endpoints, data stored in Azure Storage Accounts is only accessible when connected to your VNet.

The connectivity with an on-premises network supports a direct connection between Azure resources and on-premises data sources. Depending on the type of connection, the communication traffic may go through an encrypted tunnel over the internet or a private connection.

To improve security within a Virtual Network, you can use Network Security Groups (NSGs) and Firewalls to manage inbound and outbound traffic rules. These rules allow you to filter traffic based on IP addresses, ports, and protocols. Moreover, Azure enables routing traffic between subnets, virtual and on-premise networks, and the Internet. Using custom Route Tables makes it possible to control where traffic is routed.

Network Configuration (Image by the author)

Naming Convention

A naming convention establishes a standardization for the names of platform resources, making them more self-descriptive and easier to manage. This standardization helps in navigating through different resources and filtering them in Azure Portal. A well-defined naming convention allows you to quickly identify a resource’s type, purpose, environment, and Azure region. This consistency can be beneficial in your CI/CD processes, as predictable names are easier to parametrize.

Considering the naming convention, you should account for the information you want to capture. The standard should be easy to follow, consistent, and practical. It’s worth including elements like the organization, business unit or project, resource type, environment, region, and instance number. You should also consider the scope of resources to ensure names are unique within their context. For certain resources, like storage accounts, names must be unique globally.

For example, a Databricks Workspace might be named using the following format:

Naming Convention (Image by the author)

Example Abbreviations:

Image by the author

A comprehensive naming convention typically includes the following format:

  • Resource Type: An abbreviation representing the type of resource.
  • Project Name: A unique identifier for your project.
  • Environment: The environment the resource supports (e.g., Development, QA, Production).
  • Region: The geographic region or cloud provider where the resource is deployed.
  • Instance: A number to differentiate between multiple instances of the same resource.
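To make the convention concrete, here is a small, hypothetical Python helper that assembles a name from those parts; the abbreviations and the hyphen separator are illustrative choices, not Azure requirements:

def resource_name(resource_type: str, project: str, environment: str,
                  region: str, instance: int) -> str:
    """Compose a resource name from the convention's parts."""
    return f"{resource_type}-{project}-{environment}-{region}-{instance:02d}"

print(resource_name("dbw", "sales", "dev", "weu", 1))  # dbw-sales-dev-weu-01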

Implementing infrastructure through the Azure Portal may appear straightforward, but it often involves numerous detailed steps for each resource. A highly secured infrastructure will require resource configuration, networking, private endpoints, DNS zones, etc. Resources like Azure Synapse or Databricks require additional internal configuration, such as setting up Unity Catalog, managing secret scopes, and configuring security settings (users, groups, etc.).

Once you finish with the test environment, you’ll need to replicate the same configuration across the QA and production environments. This is where it’s easy to make mistakes. To minimize potential errors that could impact development quality, it’s recommended to use an Infrastructure as Code (IaC) approach for infrastructure development. IaC allows you to define cloud infrastructure as code in Terraform or Bicep, enabling you to deploy multiple environments with consistent configurations.

In my cloud projects, I use accelerators to quickly initiate new infrastructure setups. Microsoft also provides accelerators that can be used. Storing infrastructure as code in a repository offers additional benefits, such as version control, tracking changes, conducting code reviews, and integrating with DevOps pipelines to manage and promote changes across environments.

If your data platform doesn’t handle sensitive information and you don’t need a highly secured data platform, you can create a simpler setup with public internet access, without Virtual Networks (VNets), VPNs, etc. However, in a highly regulated area, a completely different implementation plan is required. This plan will involve collaboration with various teams within your organization — such as DevOps, Platform, and Networking teams — or even external resources.

You’ll need to establish a secure network infrastructure, resources, and security controls. Only when the infrastructure is ready can you start activities tied to data processing development.

If you found this article insightful, I invite you to express your appreciation by clicking the ‘clap’ button or liking it on LinkedIn. Your support is greatly valued. For any questions or advice, feel free to contact me on LinkedIn.



Source link

16Aug

WeKnow-RAG


This agentic approach to RAG leverages a graph-based method with a robust data topology to enhance the precision of information retrieval. Knowledge Graphs enable searching for things and not strings by maintaining extensive collections of explicit facts structured as accurate, mutable, and interpretable knowledge triples. It employs a multi-stage retrieval process, integrating a self-assessment mechanism to ensure accuracy in responses. Domain-specific queries are handled using Knowledge Graphs, while web-retrieved data is processed through parsing and chunking techniques.

TLDR

  1. This RAG implementation is another good example of an Agentic Approach being followed to create a resilient, knowledge-intensive conversational user interface.
  2. There is also an emergence of a Graph approach from two perspectives. A graph approach is followed for agentic flows; LangChain with LangGraph, LlamaIndex Workflows, and Haystack Studio from Deepset, to name a few. There is also a Graph approach to knowledge, where more time is spent on data discovery and data design to create stronger data topologies.
  3. WeKnow-RAG is also a multi-stage approach to RAG, and it is in keeping with recent developments where complexity is introduced in order to create more resilient and multifaceted solutions.
  4. This approach from WeKnow-RAG includes a self-assessment mechanism.
  5. KGs were used for domain-specific queries, while parsing and chunking were used for web-retrieved data.
  6. This implementation combines a Knowledge Base approach and a RAG/chunking approach, covering instances where data design is possible and where it is not. Technologies are often pitted against each other in an either/or scenario; here it is illustrated that sometimes the answer is somewhere in the middle.

Introduction

This research presents a domain-specific, KG-enhanced RAG system designed to adapt to various query types and domains, enhancing performance in both factual and complex reasoning tasks.

A multi-stage retrieval method for web pages is introduced, utilising both sparse and dense retrieval techniques to effectively balance efficiency and accuracy in information retrieval.

A self-assessment mechanism for LLMs is implemented, enabling them to evaluate their confidence in generated answers, thereby reducing hallucinations and improving overall response quality.

An adaptive framework is presented that intelligently combines KG-based and web-based RAG methods, tailored to the characteristics of different domains and the rate of information change.
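As a toy, hedged sketch of what the multi-stage combination of sparse and dense retrieval mentioned above can look like in code (the scoring functions and the 50/50 weighting are illustrative only, not WeKnow-RAG’s actual method):

def sparse_score(query: str, doc: str) -> float:
    """Crude keyword-overlap score standing in for a sparse retriever such as BM25."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def dense_score(q_vec, d_vec) -> float:
    """Cosine similarity standing in for a dense embedding retriever."""
    dot = sum(a * b for a, b in zip(q_vec, d_vec))
    norm = (sum(a * a for a in q_vec) ** 0.5) * (sum(b * b for b in d_vec) ** 0.5)
    return dot / norm if norm else 0.0

def hybrid_score(query, doc, q_vec, d_vec, alpha=0.5):
    """Weighted blend; in practice the sparse stage often prunes candidates first."""
    return alpha * sparse_score(query, doc) + (1 - alpha) * dense_score(q_vec, d_vec)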

Adaptive & Intelligent Agents

WeKnow-RAG integrates Web search and Knowledge Graphs into a Retrieval-Augmented Generation (RAG) architecture, improving upon traditional RAG methods that typically rely on dense vector similarity search for retrieval.

While these conventional methods segment the corpus into text chunks and use dense retrieval systems exclusively, they often fail to address complex queries effectively.

The Challenges Identified Related To Chunking

RAG chunking implementations face several challenges:

  1. Metadata and Hybrid Search Limitations: Methods that use metadata filtering or hybrid search are restricted by the predefined scope of metadata, limiting the system’s flexibility.
  2. Granularity Issues: Achieving the right level of detail within vector space chunks is difficult, leading to responses that may be relevant but not precise enough for complex queries.
  3. Inefficient Information Retrieval: These methods often retrieve large amounts of irrelevant data, which increases computational costs and reduces the quality and speed of responses.
  4. Over-Retrieval: Excessive data retrieval can overwhelm the system, making it harder to identify the most relevant chunks.

These challenges underscore the need for more refined chunking strategies and retrieval mechanisms to improve relevance and efficiency in RAG systems.

Why KG

An effective RAG system should prioritise retrieving only the most relevant information while minimising irrelevant content.

Knowledge Graphs (KGs) contribute to this goal by offering a structured and precise representation of entities and their relationships.

Unlike vector similarity, KGs organise facts into simple, interpretable knowledge triples (e.g., entity — relationship → entity).
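As a toy illustration of that idea (the triples and the query helper are made up for this example, not part of WeKnow-RAG):

# A tiny in-memory "knowledge graph" as a set of (entity, relation, entity) triples
triples = {
    ("Paris", "capital_of", "France"),
    ("France", "member_of", "European Union"),
    ("Paris", "population", "2.1 million"),
}

def query(entity: str, relation: str):
    """Return every object linked to `entity` by `relation`."""
    return [o for (s, r, o) in triples if s == entity and r == relation]

print(query("Paris", "capital_of"))  # ['France']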

KGs can continuously expand with new data, and experts can develop domain-specific KGs to ensure accuracy and reliability in specialised fields. Many recent developments exemplify this approach, and research is increasingly focused on leveraging graph-based methods in this area.

In all honesty, Knowledge Graphs are an area where I want to improve my skills, as I feel my understanding of the technology is not as strong as it should be.

The following is an excerpt from the function call prompt used for querying knowledge graphs in the Music Domain. The full version includes several additional functions.

"System":"You are an AI agent of linguist talking to a human. ... For all questions you MUST use one of the functions provided. You have access to the following tools":{
"type":"function",
"function":{
"name":"get_artist_info",
"description":"Useful for when you need to get information about an artist, such as singer, band",
"parameters":{
"type":"object",
"properties":{
"artist_name":{
"type":"string",
"description":"the name of artist or band"
},
"artist_information":{
"type":"string",
"description":"the kind of artist information, such as birthplace, birthday, lifespan, all_works, grammy_count, grammy_year, band_members"
}
},
"required":[
"artist_name",
"artist_information"
]
}
}
}"...
To use these tools you must always respond in a Python function
call based on the above provided function definition of the tool! For example":{
"name":"get_artist_info",
"params":{
"artist_name":"justin bieber ",
"artist_information":"birthday"
}
}"User":{
"query"
}
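To make the mechanics concrete, here is a hedged sketch of how such a function-call response could be parsed and dispatched on the application side; get_artist_info below is a stand-in stub, not the paper’s implementation:

import json

def get_artist_info(artist_name: str, artist_information: str) -> str:
    """Stub tool: a real system would query the music Knowledge Graph here."""
    return f"Looking up {artist_information} for {artist_name}"

TOOLS = {"get_artist_info": get_artist_info}

# Example model output following the format requested in the prompt above
model_output = '{"name": "get_artist_info", "params": {"artist_name": "justin bieber", "artist_information": "birthday"}}'

call = json.loads(model_output)
result = TOOLS[call["name"]](**call["params"])
print(result)  # Looking up birthday for justin bieber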

Finally

WeKnow-RAG enhances LLM accuracy and reliability by combining structured knowledge from graphs with flexible dense vector retrieval. This system uses domain-specific knowledge graphs for better performance on factual and complex queries and employs multi-stage web retrieval techniques to balance efficiency and accuracy.

Additionally, it incorporates a self-assessment mechanism to reduce errors in LLM-generated answers. The framework intelligently combines KG-based and web-based methods, adapting to different domains and the pace of information change, ensuring optimal performance in dynamic environments.

According to the study, WeKnow-RAG has shown excellent results in extensive tests, which stands to reason, as this approach aligns with what are considered the most promising technological and architectural approaches for the near future.

✨✨ Follow me on LinkedIn for updates on Large Language Models

I’m currently the Chief Evangelist @ Kore AI. I explore & write about all things at the intersection of AI & language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces & more.

LinkedIn





Source link

15Aug

AI Agent Evaluation Framework From Apple | by Cobus Greyling | Aug, 2024


The notion of a World State is something I find very interesting, where certain ambient or environmental settings need to be accessed to enable certain actions.

This World State alludes to the research Apple did on Ferret-UI and to other research like WebVoyager, where there is a world the agent needs to interact with. This world is currently constituted by surfaces or screens, and the agent needs to navigate browser windows, mobile phone OSs, and more.

Milestones are key points which need to be executed in order to achieve or fulfil the user intent. These can also be seen as potential points of failure, should it not be possible to execute them.

In the example in the image above, the User intent is to send a message, while cellular service is turned off.

The Agent should first understand the User’s intent and prompt for the necessary arguments from the User. After collecting all arguments with the help of the search_contacts tool, the Agent attempted to send the message, realised upon failure that it needed to enable cellular service, and retried.

To evaluate this trajectory, we find the best match for all Milestones against Message Bus and World State in each turn while maintaining topological order.
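A minimal, hypothetical sketch of that kind of milestone check (the event names and matching logic are illustrative, not Apple’s actual evaluation code):

def milestones_satisfied(milestones, events):
    """Check that every milestone appears in the event log in the given order."""
    idx = 0
    for milestone in milestones:
        while idx < len(events) and events[idx] != milestone:
            idx += 1  # skip unrelated events
        if idx == len(events):
            return False  # milestone never observed: the trajectory fails
        idx += 1
    return True

trajectory = ["search_contacts", "enable_cellular_service", "send_message"]
print(milestones_satisfied(["search_contacts", "send_message"], trajectory))  # True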

This is an excellent example of how, for an Agent to be truly autonomous, it needs to be in control of its environment.

Despite the paradigm shift towards a more simplified problem formulation, the stateful, conversational and interactive nature of task oriented dialog remains, and poses a significant challenge for systematic and accurate evaluation of tool-using LLMs.

Stateful

Apple sees state as not only the conversational dialog turns or dialog state, but also the state of the environment in which the agents live.

This includes implicit state dependencies between stateful tools, allowing the agent to track and alter the world state based on its world or common-sense knowledge, which is implicit from the user query.

Something else I find interesting in this study is the notion of a Knowledge Boundary, which informs the user simulator about what it should and should not know, providing partial access to the expected result and combating hallucination. This is analogous to in-domain and out-of-domain questions.

There are also Milestones and Minefields, which define key events that must or must not happen in a trajectory, allowing any trajectory to be evaluated with rich intermediate and final execution signals.

For the conversational user interface, there are two scenarios defined…

Single / Multiple Tool Call

One scenario is where there is a single conversation or dialog/user turn, with multiple tool-calling procedures in the background.

Hence, the user issues a single request that is not demanding from an NLU dialog state management perspective but demands heavy lifting in the background.

Single / Multiple User Turn

In other scenarios there might only be a single tool call event or milestone, but multiple dialog turns are required to establish the user intent, disambiguate where necessary, collect relevant and required information from the user, etc.



Source link

13Aug

What to Study if you Want to Master LLMs | by Ivo Bernardo | Aug, 2024


What foundational concepts should you study if you want to understand Large Language Models?

Image by solenfeyissa @ Unsplash.com

Most of the code we use to interact with LLMs (Large Language Models) is hidden behind several APIs — and that’s a good thing.

But if you are like me and want to understand the ins and outs of these magical models, there’s still hope for you. Currently, apart from the researchers working on developing and training new LLMs, there are mostly two types of people playing with these types of models:

  • Users, that interact via applications such as ChatGPT or Gemini.
  • Data scientists and developers that work with different libraries, such as langchain, llama-index, or even the Gemini or OpenAI APIs, which simplify the process of building on top of these models.

The problem is — and you may have felt it — that there is fundamental knowledge in text mining and natural language processing that is completely hidden away in consumer products or APIs. And don’t get me wrong — they are great for developing cool use cases around these technologies. But if you want to have a deeper knowledge to build complex use cases or manipulate LLMs a bit better, you’ll need to check the fundamentals — particularly when the models behave as you…



Source link
