Local AI Engineering with Ollama

Run, understand, customize, fine-tune, and build agentic apps on your own hardware

What Is Ollama?

What Ollama Is, and What It Is Not

This is the part that confuses newcomers, so it is worth pinning down at the start of this book. Local AI is not one tool; it is a layered stack, and Ollama occupies only one of those layers.

The following table lists the layers, what each one does, and examples of tools or formats at each level.

| Layer | What it does | Examples |
| --- | --- | --- |
| Model weights | What the model learned, stored as a file | Llama, Qwen, Gemma, Mistral (distributed as GGUF, safetensors, etc.) |
| Inference engine | Does the actual math on CPU or GPU | llama.cpp, MLX |
| Runtime / server | Manages models, exposes APIs, handles loading and unloading | Ollama, LM Studio, llama-server, vLLM |
| Client / interface | | |
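
To make the boundary between the client layer and the runtime layer concrete, here is a minimal sketch of a client talking to Ollama over HTTP. It assumes Ollama is running locally on its default port (11434) and uses its `/api/generate` endpoint. The helper names `build_generate_request` and `generate` are hypothetical, chosen for this illustration, and the model name in the usage note is only an example of a model you might have pulled.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port, 11434.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Client layer: package a prompt as an HTTP request for the runtime layer."""
    # "stream": False asks for one JSON object instead of a stream of chunks.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the request and return the generated text.

    The client never touches the weights file or the inference engine;
    the runtime handles both behind the API.
    """
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With a server running and a model already pulled, a call such as `generate("llama3.2", "Why is the sky blue?")` would return the model's reply as a string. Note that the client code names a model but never says where its weights live or which engine runs them: that separation is what the runtime layer provides.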
