Implementing the DBCopilot with LangChain

To Nha Notes | March 27, 2025, 10:12 a.m.

we will see agents in action, using the following LangChain components:

  • create_sql_agent: An agent designed to interact with relational databases
  • SQLDatabaseToolkit: A toolkit to provide the agent with the required non-parametric knowledge
  • OpenAI: An LLM to act as the reasoning engine behind the agent, as well as the generative engine to produce conversational results
  1. Initialize all the components and establish the connection to the Chinook database, using the SQLDatabase LangChain component (which uses SQLAlchemy under the hood and is used to connect to our database):
from langchain.agents import create_sql_agent
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.agents.agent_toolkits import SQLDatabaseToolkit
from langchain.sql_database import SQLDatabase
from langchain.llms.openai import OpenAI
from langchain.agents import AgentExecutor
from langchain.agents.agent_types import AgentType
from langchain.chat_models import ChatOpenAI
llm = OpenAI()
db = SQLDatabase.from_uri('sqlite:///chinook.db')
toolkit = SQLDatabaseToolkit(db=db, llm=llm)
agent_executor = create_sql_agent(
    llm=llm,
    toolkit=toolkit,
    verbose=True,
    agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
)
  1. Before running the agent, let’s first inspect its available tools:
    [tool.name for tool in toolkit.get_tools()]
    

Here is the output:

['sql_db_query', 'sql_db_schema', 'sql_db_list_tables', 'sql_db_query_checker']

Those tools have the following capabilities:

  • sql_db_query: This takes as input a detailed and correct SQL query, and it outputs a result from the database. If the query is not correct, an error message will be returned.
  • sql_db_schema: This takes as input a comma-separated list of tables, and it outputs the schema and sample rows for those tables.
  • sql_db_list_tables: This takes as input an empty string, and it outputs a comma-separated list of tables in the database.
  • sql_db_query_checker: This tool double-checks whether the query is correct before executing it.
  1. Let’s now execute our agent with a simple query to describe the playlisttrack table:
    agent_executor.run("Describe the playlisttrack table")
    

As you can see, with a simple question in natural language, our agent was able to understand its semantics, translate it into an SQL query, extract the relevant information, and use it as context to generate the response.

But how was it able to do all of that? Under the hood, the SQL agent comes with a default prompt template, which makes it tailored to this type of activity. Let’s see the default template of the LangChain component:

print(agent_executor.agent.llm_chain.prompt.template)

Here is the output obtained:

You are an agent designed to interact with a SQL database.
Given an input question, create a syntactically correct sqlite query to run, then look at the results of the query and return the answer.
Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most 10 results.
You can order the results by a relevant column to return the most interesting examples in the database.
Never query for all the columns from a specific table, only ask for the relevant columns given the question.
You have access to tools for interacting with the database.
Only use the below tools. Only use the information returned by the below tools to construct your final answer.
You MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.
DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.
If the question does not seem related to the database, just return "I don't know" as the answer.
sql_db_query: Input to this tool is a detailed and correct SQL query, output is a result from the database. If the query is not correct, an error message will be returned. If an error is returned, rewrite the query, check the query, and try again. If you encounter an issue with Unknown column 'xxxx' in 'field list', using sql_db_schema to query the correct table fields.
sql_db_schema: Input to this tool is a comma-separated list of tables, output is the schema and sample rows for those tables. 
Be sure that the tables actually exist by calling sql_db_list_tables first! Example Input: 'table1, table2, table3'
sql_db_list_tables: Input is an empty string, output is a comma separated list of tables in the database.
sql_db_query_checker: Use this tool to double check if your query is correct before executing it. Always use this tool before executing a query with sql_db_query!
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [sql_db_query, sql_db_schema, sql_db_list_tables, sql_db_query_checker]
Action Input: the input to the action
...
Question: {input}
Thought: I should look at the tables in the database to see what I can query.  Then I should query the schema of the most relevant tables.
{agent_scratchpad}

 

We can also see how the agent is able to correlate more than one table within the database:

agent_executor.run('what is the total number of tracks and the average length of tracks by genre?')
References

Referenced in the Building LLM Powered Applications  ebook by Valentina Alto