Build Your First AI Agent: A Step-by-Step Python Tutorial
This tutorial walks you through building a minimal but real AI agent from scratch. You'll write Python code that uses an LLM (via the OpenAI API), a few tools, and a reasoning loop. By the end, you'll have a working agent that can plan, use tools, and answer questions, all driven by a script running on your machine.
Prerequisites: Basic Python, an OpenAI API key, and a terminal. No prior AI agent experience required.
1. Introduction
You will build an AI agent that can:
- Understand natural language requests
- Decide when to call tools (calculator, web search mock, file reader)
- Execute those tools and observe the results
- Continue reasoning until it gives a final answer
Why AI agents matter: They extend LLMs from pure text generators into systems that can act—look up information, perform calculations, interact with APIs, and complete multi‑step tasks. This is the foundation for production assistants, support bots, and autonomous workflows.
2. What Is an AI Agent (Brief)
An AI agent is a system with three core parts:
- LLM (Language Model) – the reasoning brain
- Tools – external capabilities the LLM can invoke (APIs, databases, code execution)
- Loop – a cycle where the LLM decides which tool to call, the tool runs, the result is fed back, and the LLM decides the next step
Think of it as: LLM + Tools + Loop.
A chatbot only produces text; an agent does things.
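The LLM + Tools + Loop idea can be sketched without any API at all. In the minimal sketch below, `fake_llm` is a hypothetical hand-written stand-in for a real model, and `multiply` is a toy tool; neither is part of the agent built later in this tutorial. The point is only to show the shape of the cycle: decide, act, observe, decide again.

```python
from typing import Callable

# Hypothetical stand-in for an LLM: it looks at the history and decides the next step.
def fake_llm(history: list[str]) -> tuple[str, str]:
    """Return ("tool", tool_input) until an observation exists, then ("final", answer)."""
    observations = [h for h in history if h.startswith("observation:")]
    if not observations:
        return ("tool", "6*7")
    value = observations[-1].split(":", 1)[1].strip()
    return ("final", f"The answer is {value}.")

# Toy tool: multiplies two integers written as "a*b".
def multiply(expression: str) -> str:
    left, right = expression.split("*")
    return str(int(left) * int(right))

def agent_loop(
    llm: Callable[[list[str]], tuple[str, str]],
    tool: Callable[[str], str],
    request: str,
    max_steps: int = 5,
) -> str:
    """Loop: ask the LLM, run the tool if requested, feed the observation back."""
    history = [f"user: {request}"]
    for _ in range(max_steps):
        kind, payload = llm(history)
        if kind == "final":
            return payload
        history.append(f"observation: {tool(payload)}")
    return "Stopped: step limit reached."

answer = agent_loop(fake_llm, multiply, "What is 6 times 7?")
print(answer)  # The answer is 42.
```

A real agent replaces `fake_llm` with an API call and the single tool with a registry of tools, but the control flow stays the same.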
3. Architecture Overview
User Input
    ↓
LLM (Reasoning)
    ↓
Tool call requested?
    Yes → Execute Tool → Observation → back to LLM (Reasoning)
    No  → Final Answer → Output
The loop continues until the LLM emits a final answer (no more tool calls). Each tool execution is an observation that enriches the context.
We'll implement this with OpenAI’s function‑calling (tools API), which gives us structured tool calls and makes the loop simple.
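To make "structured tool calls" concrete, here is a rough sketch of what one round looks like as plain Python dictionaries. The ID and values are made up for illustration, and `parse_tool_call` and `tool_result_message` are hypothetical helper names, not SDK functions. Note that the arguments arrive as a JSON-encoded string, so they need to be decoded before use.

```python
import json

# Illustrative assistant message containing one tool call (all values are made up).
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_abc123",  # hypothetical ID
            "type": "function",
            "function": {
                "name": "calculator",
                "arguments": '{"expression": "2+2"}',  # JSON-encoded string
            },
        }
    ],
}

def parse_tool_call(tool_call: dict) -> tuple[str, dict]:
    """Extract the tool name and decode its JSON arguments."""
    function = tool_call["function"]
    return function["name"], json.loads(function["arguments"])

def tool_result_message(tool_call_id: str, content: str) -> dict:
    """Build the message that reports a tool result back to the model."""
    return {"role": "tool", "tool_call_id": tool_call_id, "content": content}

call = assistant_message["tool_calls"][0]
name, args = parse_tool_call(call)
reply = tool_result_message(call["id"], "4")
print(name, args)  # calculator {'expression': '2+2'}
```

The `tool_call_id` in the reply is what lets the model match each observation to the call that produced it.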
4. Tech Stack
- Python 3.10+
- OpenAI Python SDK (openai)
- A few built‑in Python modules (json, math)
You can swap the OpenAI API for any LLM provider that supports function calling (or even raw prompt‑based tool definitions). We'll stick with OpenAI for simplicity.
5. Step 1: Set Up Environment
Install dependencies:
pip install openai
API key:
Set your OpenAI API key as an environment variable:
export OPENAI_API_KEY="sk-..." # Linux/macOS
# or on Windows: set OPENAI_API_KEY=sk-...
Alternatively, you can provide it directly in the code (not recommended for production).
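If you want the script to fail early with a readable message when the key is missing, a small check like the following sketch can help. `get_api_key` is a hypothetical helper name; the optional `env` parameter exists only so the function can be exercised without touching the real environment.

```python
import os

def get_api_key(env=None) -> str:
    """Return the OpenAI API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the agent."
        )
    return key

# Demonstration with a placeholder mapping instead of the real environment.
print(get_api_key({"OPENAI_API_KEY": "sk-example"}))  # sk-example
```

In a real script you would call `get_api_key()` with no arguments once at startup.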
6. Step 2: Create Basic LLM Call
Start with a bare minimum script to verify your API connection and see a response.
import os
from openai import OpenAI

# Requires openai>=1.0; the legacy openai.ChatCompletion interface was removed in 1.0.
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

response = client.chat.completions.create(
    model="gpt-4o-mini",  # a low-cost model that supports function calling
    messages=[{"role": "user", "content": "Hello, what is 2+2?"}],
)
print(response.choices[0].message.content)
Run it. You should see something like "2+2 equals 4." This proves your setup works.
7. Step 3: Add Tools
We'll define three simple tools. Each is a Python function with a clear description, so the LLM knows when to use them.
import json
import math

# ----- Tool definitions -------------------------------------------------------

def calculator(expression: str) -> str:
    """Evaluate a mathematical expression with a restricted eval."""
    # Builtins are removed and only the math module is exposed. This limits
    # what an expression can reach, but it is not a real sandbox.
    allowed_names = {"__builtins__": {}, "math": math}
    try:
        result = eval(expression, allowed_names, {})
        return str(result)
    except Exception as e:
        return f"Error: {e}"

def web_search_mock(query: str) -> str:
    """Simulate a web search (replace with a real search API)."""
    # In a real agent, you'd call SerpAPI, Tavily, etc.
    return (
        f"Mock result for: {query}. "
        "The current weather in London is 15°C, cloudy."
    )

def read_file(filepath: str) -> str:
    """Read content from a local file."""
    try:
        with open(filepath, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return "Error: file not found."
    except Exception as e:
        return f"Error: {e}"
# Tool schemas for the LLM (OpenAI function calling format)
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "calculator",
            "description": (
                "Evaluate a math expression. Pass a string like '2+2'. "
                "Math functions are available as 'math.sqrt(144)'."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {"type": "string", "description": "The expression"}
                },
                "required": ["expression"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "web_search_mock",
            "description": "Search the web for current information.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search query"}
                },
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read the contents of a file from the local filesystem.",
            "parameters": {
                "type": "object",
                "properties": {
                    "filepath": {"type": "string", "description": "Path to file"}
                },
                "required": ["filepath"],
            },
        },
    },
]
# Map tool names to Python functions
AVAILABLE_TOOLS = {
    "calculator": calculator,
    "web_search_mock": web_search_mock,
    "read_file": read_file,
}
These tools are minimal but demonstrate real capabilities. In production you'd add error handling, async calls, and external APIs.
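Because `eval` is hard to lock down, one possible alternative for the calculator is to parse the expression with Python's `ast` module and evaluate only an allow-list of node types. The sketch below is an assumption about how that could look, not part of the tutorial's script; `safe_calculator` and the chosen operators and functions are illustrative. It accepts both `sqrt(144)` and `math.sqrt(144)`.

```python
import ast
import math
import operator

# Allow-lists: anything not named here is rejected.
_BINARY_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.Div: operator.truediv, ast.Mod: operator.mod, ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
_FUNCTIONS = {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "log": math.log}

def _evaluate(node):
    """Recursively evaluate an AST node, rejecting everything not allow-listed."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant):
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
        raise ValueError("only numeric constants are allowed")
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    if isinstance(node, ast.Call) and not node.keywords:
        func = node.func
        if (isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name)
                and func.value.id == "math"):
            name = func.attr  # math.sqrt(...)
        elif isinstance(func, ast.Name):
            name = func.id  # sqrt(...)
        else:
            raise ValueError("unsupported call")
        if name not in _FUNCTIONS:
            raise ValueError(f"function not allowed: {name}")
        return _FUNCTIONS[name](*[_evaluate(arg) for arg in node.args])
    raise ValueError("unsupported expression")

def safe_calculator(expression: str) -> str:
    """Evaluate arithmetic without eval; errors come back as strings for the LLM."""
    try:
        return str(_evaluate(ast.parse(expression, mode="eval")))
    except Exception as e:
        return f"Error: {e}"

print(safe_calculator("2+2"))             # 4
print(safe_calculator("math.sqrt(144)"))  # 12.0
```

This still does not bound computation time (a very large exponent can be slow), so treat it as a starting point rather than a complete sandbox.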
8. Step 4: Build Agent Loop
The agent loop repeatedly:
- Sends messages (user prompt + previous tool results) to the LLM
- If the LLM returns a final answer, print it and exit
- If the LLM returns a tool call, execute the tool and append the result as a new message
- Repeat from the first step until the LLM answers or the iteration limit is reached
from openai import OpenAI

# Client for the openai>=1.0 SDK; reads OPENAI_API_KEY from the environment.
client = OpenAI()

def run_agent(user_query: str, max_iterations: int = 10):
    """Main agent loop: call the LLM, run any requested tools, repeat."""
    messages = [{"role": "user", "content": user_query}]
    iterations = 0
    while iterations < max_iterations:
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=TOOLS,
            tool_choice="auto",  # LLM decides whether to use a tool
        )
        message = response.choices[0].message

        # If no tool calls, we have a final answer
        if not message.tool_calls:
            print("Agent:", message.content)
            return message.content

        # Process each tool call (some responses may contain multiple)
        messages.append(message)  # add LLM message (with tool_calls) to history
        for tool_call in message.tool_calls:
            tool_name = tool_call.function.name
            arguments = json.loads(tool_call.function.arguments)
            print(f"Tool called: {tool_name}({arguments})")

            # Execute the tool
            if tool_name in AVAILABLE_TOOLS:
                result = AVAILABLE_TOOLS[tool_name](**arguments)
            else:
                result = f"Error: unknown tool '{tool_name}'"

            # Append tool result as a message from 'tool' role
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result,
            })
        iterations += 1

    print("Agent: Maximum iterations reached.")
    return None
Key points:
- We use tool_choice="auto" so the model can also answer directly if no tool is needed.
- The LLM can request multiple tools in one response; we handle them all.
- Each tool result goes back into the conversation, allowing the LLM to reconsider.
9. Step 5: Run Your First Agent
Add this entry point to the end of the script, save the complete script as agent.py, and run it with python agent.py:
if __name__ == "__main__":
    query = input("You: ")
    run_agent(query)
Example interaction (exact wording and tool arguments will vary between runs):
You: What is the square root of 144 plus today's temperature in London?
Tool called: calculator({'expression': 'math.sqrt(144)'})
Tool called: web_search_mock({'query': 'current temperature London'})
Agent: The square root of 144 is 12.0, and the current temperature in London is 15°C. So, 12.0 + 15 = 27.0.
Here the agent correctly planned: it needed both a calculation and a web search, then combined them.
Try another:
You: Read the file 'notes.txt' and summarize it.
Tool called: read_file({'filepath': 'notes.txt'})
Agent: The file contains a short note: "Meeting at 3pm, bring laptop." The summary is a single-sentence meeting reminder.
Expected behavior:
The agent automatically decides which tools to call, in what order, and integrates the results into a coherent answer. The loop stops when the LLM doesn't request a tool.
10. Common Mistakes
- Overcomplicating early agent design – Start with a simple loop and 1‑2 tools. Don't build complex memory or multi‑step planning before understanding the basic cycle.
- Confusing agent with chatbot – A chatbot only chats; an agent acts (calls tools). If your system never invokes a function, it's not yet an agent.
- Ignoring tool boundaries – Let the LLM reason, but always validate tool inputs before executing. Never pass raw LLM output directly to dangerous operations (e.g., eval without strict sandboxing).
- No max iterations – LLMs can get stuck in loops; always set a max_iterations guard.
- Not feeding tool errors back – If a tool fails, tell the LLM so it can try a different approach.
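The points about validating inputs and feeding errors back can be combined in one small dispatcher. The following is a sketch under assumptions: `execute_tool_call` and the `shout` tool are hypothetical names used only for illustration. The idea is that every failure mode (unknown tool, malformed JSON, wrong arguments, tool crash) becomes a string the model can read instead of an exception that ends the loop.

```python
import json

def execute_tool_call(name: str, raw_arguments: str, registry: dict) -> str:
    """Run one tool call and always return a string the model can read."""
    if name not in registry:
        return f"Error: unknown tool '{name}'. Available: {sorted(registry)}"
    try:
        arguments = json.loads(raw_arguments)
    except json.JSONDecodeError as e:
        return f"Error: arguments were not valid JSON ({e.msg})"
    if not isinstance(arguments, dict):
        return "Error: arguments must be a JSON object"
    try:
        return str(registry[name](**arguments))
    except TypeError as e:
        # Usually a missing or unexpected keyword argument.
        return f"Error: bad arguments for '{name}': {e}"
    except Exception as e:
        return f"Error: tool '{name}' failed: {e}"

# Hypothetical tool and registry, for demonstration only.
def shout(text: str) -> str:
    return text.upper()

registry = {"shout": shout}
print(execute_tool_call("shout", '{"text": "hi"}', registry))  # HI
print(execute_tool_call("shout", '{"wrong": 1}', registry))    # Error: bad arguments ...
print(execute_tool_call("nope", "{}", registry))               # Error: unknown tool ...
```

Each returned string can be placed in the `content` field of a tool message, so the model sees the failure and can retry with different arguments.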
11. Next Steps
Your agent now handles multiple tools and basic reasoning. To go further:
- Replace mock tools with real APIs: weather, database queries, ticket creation.
- Add memory – persist conversation history or long‑term user preferences.
- Explore multi‑agent systems – multiple agents with different tools collaborating.
- Use the Model Context Protocol (MCP) – standardized tool servers that any agent can talk to, eliminating custom glue code.
- Orchestrate complex workflows – break down multi‑step tasks into reliable pipelines.
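As a starting point for the memory item above, conversation history can be persisted as JSON between runs. This is a minimal sketch that assumes messages are plain dictionaries (SDK response objects would need converting to dictionaries first); `save_history` and `load_history` are hypothetical helper names.

```python
import json
import tempfile
from pathlib import Path

def save_history(messages: list[dict], path: Path) -> None:
    """Write the conversation history to disk as JSON."""
    path.write_text(json.dumps(messages, ensure_ascii=False, indent=2), encoding="utf-8")

def load_history(path: Path) -> list[dict]:
    """Load a saved history, or start fresh if no file exists yet."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))

# Demonstration in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    history_file = Path(tmp) / "history.json"
    history = load_history(history_file)  # [] on first run
    history.append({"role": "user", "content": "Remember that I prefer Celsius."})
    save_history(history, history_file)
    restored = load_history(history_file)

print(restored)
```

Long histories eventually exceed the model's context window, so a real implementation would also need to trim or summarize older messages.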
For more advanced patterns, check out Agent Testing, Agent Monitoring, and Agent Evaluation on AgentDevPro.
This tutorial is a foundation. Build on it, break it, and make it your own.