Build Your First AI Agent: A Step-by-Step Python Tutorial

This tutorial walks you through building a minimal but working AI agent from scratch. You'll write Python code that combines an LLM (accessed through the OpenAI API), a few tools, and a reasoning loop. By the end, you'll have an agent that runs from your own machine, decides when to call tools, and answers questions.

Prerequisites: Basic Python, an OpenAI API key, and a terminal. No prior AI agent experience required.

1. Introduction

You will build an AI agent that can:

Understand natural language requests
Decide when to call tools (calculator, web search mock, file reader)
Execute those tools and observe the results
Continue reasoning until it gives a final answer

Why AI agents matter: They extend LLMs from pure text generators into systems that can act: look up information, perform calculations, interact with APIs, and complete multi‑step tasks. This is the foundation for production assistants, support bots, and autonomous workflows.

2. What Is an AI Agent (Brief)

An AI agent is a system with three core parts:

LLM (Language Model) – the reasoning brain
Tools – external capabilities the LLM can invoke (APIs, databases, code execution)
Loop – a cycle where the LLM decides which tool to call, the tool runs, the result is fed back, and the LLM decides the next step

Think of it as: LLM + Tools + Loop.

A chatbot only produces text; an agent does things.

3. Architecture Overview

User Input
    ↓
LLM (Reasoning)
    ↓
Decision → Tool Call? → Yes → Execute Tool → Observation → LLM (again)
                 ↘ No → Final Answer → Output

The loop continues until the LLM emits a final answer (no more tool calls). Each tool execution is an observation that enriches the context.
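
To make the cycle concrete before any API is involved, here is a minimal sketch of the same loop with a scripted stand-in for the LLM. The names fake_llm, TOY_TOOLS, and run_toy_loop are hypothetical and exist only for this illustration; the real agent in Step 4 replaces fake_llm with an API call.

```python
# Toy illustration of the decide -> act -> observe loop. No network calls.
# fake_llm is a hypothetical stand-in that follows a fixed script.

def fake_llm(history):
    """Request the 'add' tool once, then answer using its observation."""
    observations = [m for m in history if m["role"] == "tool"]
    if not observations:
        return {"tool": "add", "args": {"a": 2, "b": 3}}
    return {"final": f"The sum is {observations[-1]['content']}."}

TOY_TOOLS = {"add": lambda a, b: a + b}

def run_toy_loop(question, max_steps=5):
    history = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        decision = fake_llm(history)
        if "final" in decision:  # no tool requested: we are done
            return decision["final"]
        result = TOY_TOOLS[decision["tool"]](**decision["args"])
        history.append({"role": "tool", "content": str(result)})  # observation
    return None  # step budget exhausted without a final answer

print(run_toy_loop("What is 2 + 3?"))  # → The sum is 5.
```

The structure is the point here: the history grows by one observation per tool call, and the loop ends as soon as the decision contains no tool request.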

We'll implement this with OpenAI’s function‑calling (tools API), which gives us structured tool calls and makes the loop simple.

4. Tech Stack

Python 3.10+
OpenAI Python SDK (openai)
A few built‑in Python modules (json, math)

You can swap the OpenAI API for any LLM provider that supports function calling (or even raw prompt‑based tool definitions). We'll stick with OpenAI for simplicity.

5. Step 1: Setup Environment

Install dependencies:

pip install openai

API key:
Set your OpenAI API key as an environment variable:

export OPENAI_API_KEY="sk-..."   # Linux/macOS
# Windows (cmd): set OPENAI_API_KEY=sk-...
# Windows (PowerShell): $env:OPENAI_API_KEY="sk-..."

Alternatively, you can provide it directly in the code (not recommended for production).
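
A missing key otherwise only shows up as an authentication error on the first request, so it can help to check for it up front. The helper below is a small sketch; require_api_key is a hypothetical name, not part of the OpenAI SDK.

```python
import os

def require_api_key(env=None):
    """Return the OpenAI API key from the environment or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the agent."
        )
    return key
```

Calling require_api_key() at the top of your script turns a confusing API failure into an immediate, readable message.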

6. Step 2: Create Basic LLM Call

Start with a bare minimum script to verify your API connection and see a response.

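import os

import openai

# The 1.x SDK (what `pip install openai` installs today) uses a client object;
# the older openai.ChatCompletion interface was removed in 1.0.
client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

response = client.chat.completions.create(
    model="gpt-4o-mini",  # a small, low-cost model that supports function calling
    messages=[{"role": "user", "content": "Hello, what is 2+2?"}],
)

print(response.choices[0].message.content)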

Run it. You should see something like "2+2 equals 4." This proves your setup works.

7. Step 3: Add Tools

We'll define three simple tools. Each is a Python function with a clear description, so the LLM knows when to use them.

import json
import math

# ----- Tool definitions -------------------------------------------------------

def calculator(expression: str) -> str:
    """Evaluate a math expression with eval(). NOT a real sandbox: demo use only."""
    # Removing builtins blocks casual misuse, but a restricted eval() can still
    # be escaped, so never expose this to untrusted input.
    allowed_names = {"__builtins__": {}, "math": math}
    try:
        result = eval(expression, allowed_names, {})
        return str(result)
    except Exception as e:
        return f"Error: {e}"

def web_search_mock(query: str) -> str:
    """Simulate a web search (replace with a real search API)."""
    # In a real agent, you'd call SerpAPI, Tavily, etc.
    return (
        f"Mock result for: {query}. "
        "The current weather in London is 15°C, cloudy."
    )

def read_file(filepath: str) -> str:
    """Read content from a local file."""
    try:
        with open(filepath, "r") as f:
            return f.read()
    except FileNotFoundError:
        return "Error: file not found."
    except Exception as e:
        return f"Error: {e}"

# Tool schemas for the LLM (OpenAI function calling format)
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "calculator",
            "description": "Evaluate a math expression. Pass a string like '2+2'.",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {"type": "string", "description": "The expression"}
                },
                "required": ["expression"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "web_search_mock",
            "description": "Search the web for current information.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search query"}
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read the contents of a file from the local filesystem.",
            "parameters": {
                "type": "object",
                "properties": {
                    "filepath": {"type": "string", "description": "Path to file"}
                },
                "required": ["filepath"]
            }
        }
    }
]

# Map tool names to Python functions
AVAILABLE_TOOLS = {
    "calculator": calculator,
    "web_search_mock": web_search_mock,
    "read_file": read_file,
}

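These tools are deliberately minimal but demonstrate real capabilities. In production you'd validate inputs more strictly (the eval-based calculator in particular), restrict which files read_file may open, and replace the mock with real external APIs, possibly called asynchronously.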
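
The calculator above relies on eval(), which is hard to lock down. One common alternative is to parse the expression with Python's ast module and evaluate only whitelisted node types. The sketch below is an illustration, not a hardened implementation: safe_calculator and the chosen whitelist are assumptions, and it does not guard against very expensive inputs such as huge exponents.

```python
import ast
import math
import operator

# Whitelisted operators and math functions; anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}
_MATH_FUNCS = {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "log": math.log}

def _eval_node(node):
    """Recursively evaluate a parsed expression, allowing only known nodes."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    if isinstance(node, ast.Call) and not node.keywords:
        func = node.func
        # Accept both sqrt(x) and math.sqrt(x).
        if (isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name)
                and func.value.id == "math"):
            name = func.attr
        elif isinstance(func, ast.Name):
            name = func.id
        else:
            name = None
        if name in _MATH_FUNCS:
            return _MATH_FUNCS[name](*[_eval_node(a) for a in node.args])
    raise ValueError("unsupported expression")

def safe_calculator(expression: str) -> str:
    """Evaluate arithmetic without eval(); errors come back as strings."""
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_eval_node(tree.body))
    except Exception as e:
        return f"Error: {e}"

print(safe_calculator("math.sqrt(144) + 15"))            # → 27.0
print(safe_calculator("__import__('os').system('ls')"))  # → Error: unsupported expression
```

Because it has the same signature as calculator, it could be registered under the "calculator" key in AVAILABLE_TOOLS without changing the schema.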

8. Step 4: Build Agent Loop

The agent loop repeatedly:

1. Sends messages (user prompt + previous tool results) to the LLM
2. If the LLM returns a final answer, prints it and exits
3. If the LLM returns a tool call, executes the tool and appends the result as a new message
4. Goes back to step 1

def run_agent(user_query: str, max_iterations: int = 10):
    """
    Main agent loop.
    """
    client = openai.OpenAI()  # reads OPENAI_API_KEY from the environment
    messages = [{"role": "user", "content": user_query}]
    iterations = 0

    while iterations < max_iterations:
        # openai>=1.0 call style; openai.ChatCompletion was removed in 1.0
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=TOOLS,
            tool_choice="auto",  # LLM decides whether to use a tool
        )
        message = response.choices[0].message

        # If no tool calls, we have a final answer
        if not message.tool_calls:
            print("Agent:", message.content)
            return message.content

        # Process each tool call (some responses may contain multiple)
        messages.append(message)  # add LLM message (with tool_calls) to history
        for tool_call in message.tool_calls:
            tool_name = tool_call.function.name
            arguments = json.loads(tool_call.function.arguments)

            print(f"Tool called: {tool_name}({arguments})")

            # Execute the tool; report failures to the LLM instead of crashing
            if tool_name in AVAILABLE_TOOLS:
                try:
                    result = AVAILABLE_TOOLS[tool_name](**arguments)
                except Exception as e:
                    result = f"Error: {e}"
            else:
                result = f"Error: unknown tool '{tool_name}'"

            # Append tool result as a message from 'tool' role
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result,
            })

        iterations += 1

    print("Agent: Maximum iterations reached.")
    return None

Key points:

We use tool_choice="auto" so the model can also answer directly if no tool is needed.
The LLM can request multiple tools in one response; we handle them all.
Each tool result goes back into the conversation, allowing the LLM to reconsider.
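
The Chat Completions API expects every tool call in an assistant message to be answered by a tool message carrying the same tool_call_id, which is why the loop appends one result per call. The sketch below shows the shape of the history after one round, written as plain dicts, along with a hypothetical checker (unanswered_tool_calls) for debugging. The id "call_1" is made up for the example.

```python
def unanswered_tool_calls(messages):
    """Return ids of tool calls that have no matching 'tool' message."""
    requested, answered = [], set()
    for m in messages:
        if m["role"] == "assistant":
            requested += [c["id"] for c in m.get("tool_calls") or []]
        elif m["role"] == "tool":
            answered.add(m["tool_call_id"])
    return [i for i in requested if i not in answered]

# History after one tool round, as plain dicts (illustrative values only).
history = [
    {"role": "user", "content": "What is 2+2?"},
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "calculator", "arguments": '{"expression": "2+2"}'}},
    ]},
    {"role": "tool", "tool_call_id": "call_1", "content": "4"},
]

print(unanswered_tool_calls(history))      # → []
print(unanswered_tool_calls(history[:2]))  # → ['call_1']
```

Note that run_agent appends the SDK's message object rather than a dict, so this checker assumes you have converted the history to plain dicts first.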

9. Step 5: Run Your First Agent

Save the complete script as agent.py and run it:

if __name__ == "__main__":
    query = input("You: ")
    run_agent(query)

Example interaction:

You: What is the square root of 144 plus today's temperature in London?
Tool called: calculator({'expression': 'math.sqrt(144)'})
Tool called: web_search_mock({'query': 'current temperature London'})
Agent: The square root of 144 is 12.0, and the current temperature in London is 15°C. So, 12.0 + 15 = 27.0.

Here the agent correctly planned: it needed both a calculation and a web search, then combined them.

Try another:

You: Read the file 'notes.txt' and summarize it.
Tool called: read_file({'filepath': 'notes.txt'})
Agent: The file contains a short note: "Meeting at 3pm, bring laptop." The summary is a single-sentence meeting reminder.

Expected behavior:
The agent automatically decides which tools to call, in what order, and integrates the results into a coherent answer. The loop stops when the LLM doesn't request a tool.

10. Common Mistakes

Overcomplicating early agent design – Start with a simple loop and 1‑2 tools. Don't build complex memory or multi‑step planning before understanding the basic cycle.
Confusing agent with chatbot – A chatbot only chats; an agent acts (calls tools). If your system never invokes a function, it's not yet an agent.
Ignoring tool boundaries – Let the LLM reason, but always validate tool inputs before executing. Never pass raw LLM output directly to dangerous operations (e.g., eval without strict sandboxing).
No max iterations – LLMs can get stuck in loops; always set a max_iterations guard.
Not feeding tool errors back – If a tool fails, tell the LLM so it can try a different approach.
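
The last three points can be combined in one small dispatch helper that validates the call and always returns a string the LLM can read. This is a sketch under assumptions: execute_tool and demo_tools are hypothetical names, and the checks shown are a starting point, not a complete validation layer.

```python
import json

def execute_tool(tools, name, raw_arguments):
    """Run a tool and always return a string, even when something fails."""
    if name not in tools:
        return f"Error: unknown tool '{name}'"
    try:
        arguments = json.loads(raw_arguments)
    except json.JSONDecodeError as e:
        return f"Error: arguments were not valid JSON ({e})"
    if not isinstance(arguments, dict):
        return "Error: arguments must be a JSON object"
    try:
        return str(tools[name](**arguments))
    except Exception as e:  # wrong argument names, tool crash, etc.
        return f"Error: {type(e).__name__}: {e}"

demo_tools = {"shout": lambda text: text.upper()}

print(execute_tool(demo_tools, "shout", '{"text": "hi"}'))   # → HI
print(execute_tool(demo_tools, "shout", '{"wrong": "hi"}'))  # an "Error: TypeError: ..." string
print(execute_tool(demo_tools, "missing", "{}"))             # → Error: unknown tool 'missing'
```

In the agent loop, the returned string would go into the content field of the tool message, so the model sees the failure and can try a different approach.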

11. Next Steps

Your agent now handles multiple tools and basic reasoning. To go further:

Replace mock tools with real APIs: weather, database queries, ticket creation.
Add memory – persist conversation history or long‑term user preferences.
Explore multi‑agent systems – multiple agents with different tools collaborating.
Use the Model Context Protocol (MCP) – standardized tool servers that any agent can talk to, eliminating custom glue code.
Orchestrate complex workflows – break down multi‑step tasks into reliable pipelines.
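
As a starting point for the memory item, conversation history can be persisted as JSON between runs. The sketch below assumes the history is a list of plain dicts (SDK message objects would need to be converted first); save_history and load_history are hypothetical helper names.

```python
import json
from pathlib import Path

def save_history(messages, path):
    """Write the conversation history to a JSON file."""
    Path(path).write_text(
        json.dumps(messages, ensure_ascii=False, indent=2), encoding="utf-8"
    )

def load_history(path):
    """Load a saved history, or start empty if the file does not exist."""
    p = Path(path)
    if not p.exists():
        return []
    return json.loads(p.read_text(encoding="utf-8"))
```

A run could then begin with messages = load_history("history.json"), append the new user message, and call save_history once the final answer is produced. The file name here is only an example.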

For more advanced patterns, check out Agent Testing, Agent Monitoring, and Agent Evaluation on AgentDevPro.

This tutorial is a foundation. Build on it, break it, and make it your own.
