Local AI Engineering with Ollama

Run, understand, customize, fine-tune, and build agentic apps on your own hardware

Building Advanced Agents: Introduction

Pass 1: A Bare-Minimum Chat Loop against a Local Ollama Model

In the previous chapter we built a small management CLI that wrapped four endpoints (list, show, pull, ps). That work taught us how the SDK exposes typed responses, how to iterate over a streaming endpoint (the pull progress events), and how to wire a script into uv as a real command. We're now going to reuse the same project, the same config.py, and the same Client, but build something more interactive: a chat REPL we can actually talk to.

A REPL (read-eval-print loop) is the quickest way to develop a feel for a model. You feed it a prompt, you read what comes back, you adjust, you try again. When you're picking between two models for a job, or testing whether a system message changes behavior in the direction you want, a REPL beats a notebook and beats curl.

We'll start with a basic chat loop that simply works, then add features to make it more powerful. We've been using relatively small models so far, but the advanced features we add later will need a larger model, and running a model of that size requires a GPU.

Step 1: Build the Client

Inside pass_1_repl_basic.py, we create the Ollama client by pointing it at a host we configure in config.py:

client = Client(host=OLLAMA_HOST)

That client is the object we'll use to send chat requests.
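For orientation, here is a minimal sketch of what the configuration side might look like. The `load_config` and `make_client` helpers, the `OLLAMA_MODEL` environment variable, and the default model tag are assumptions made for illustration, not the project's actual config.py. The one fixed fact is that a default local Ollama server listens on `http://localhost:11434`.

```python
import os

DEFAULT_HOST = "http://localhost:11434"  # where a default local Ollama server listens
DEFAULT_MODEL = "llama3.2"  # hypothetical default; substitute a model you have pulled


def load_config(env=None):
    """Return (host, model); environment variables override the defaults."""
    env = os.environ if env is None else env
    return (
        env.get("OLLAMA_HOST", DEFAULT_HOST),
        env.get("OLLAMA_MODEL", DEFAULT_MODEL),
    )


def make_client(host):
    """Build the SDK client. Imported lazily so the config loads without the SDK."""
    from ollama import Client

    return Client(host=host)
```

With helpers like these, the script would call `host, model = load_config()` and then `client = make_client(host)`. Constructing the client does not send a request, so a wrong host typically only surfaces on the first chat call.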

Step 2: The REPL Loop

A REPL is just a set of instructions that we wrap in while True so it keeps going until the user asks to quit:

while True:
    user = input("You > ")

    if user.strip() == "/bye":
        break

We pick /bye as the exit word (same convention as the Ollama CLI). When the user types it, we break out of the loop and the program ends.
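The loop above can also be written as a small generator, which makes it easy to test without a keyboard. This is a sketch under a few assumptions: the `read_turns` name and its `read_line` parameter are hypothetical, and treating Ctrl-D, Ctrl-C, and empty lines this way is a design choice of this sketch rather than something the script above does.

```python
def read_turns(read_line=input, prompt="You > "):
    """Yield each user message until the user types /bye or closes the input."""
    while True:
        try:
            user = read_line(prompt)
        except (EOFError, KeyboardInterrupt):
            break  # Ctrl-D or Ctrl-C ends the session, just like /bye
        text = user.strip()
        if text == "/bye":
            break
        if not text:
            continue  # nothing typed; prompt again instead of sending an empty message
        yield text
```

In a script this would be consumed as `for user in read_turns(): ...`. Passing a different `read_line` callable lets you replay a canned conversation in a test.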

Step 3: Send One Message, Print One Reply

This is the heart of the script:

response = client.chat(
    model=OLLAMA_MODEL,
    messages=[{"role": "user", "content": user}],
)
print(response.message.content)
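Putting the three steps together, a minimal sketch of the whole pass might look like the following. The `chat_once` and `run_repl` names and the injectable `read_line` and `write` parameters are assumptions introduced here so the loop can be exercised without a running server. The sketch also assumes the response exposes the reply text as `response.message.content`, in line with the typed responses mentioned earlier.

```python
def chat_once(client, model, user_text):
    """Send a single user message and return the reply text."""
    response = client.chat(
        model=model,
        messages=[{"role": "user", "content": user_text}],
    )
    return response.message.content


def run_repl(client, model, read_line=input, write=print):
    """Read a line, send it, print the reply; stop when the user types /bye."""
    while True:
        user = read_line("You > ")
        if user.strip() == "/bye":
            break
        write(chat_once(client, model, user))
```

Each call sends only the latest message, so the model sees every turn in isolation and has no memory of earlier ones. That is enough for a first pass whose goal is one message in, one reply out.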
