**LET'S DEFINE THE PIECES**
Everything you need to know falls into one of three categories:
===== 1. MODELS (LLMs) =====
These are the brains.
They are NOT software.
They are NOT programs.
They are NOT plugins.
They are neural-network weights stored as data, usually in a single file (very large models are sometimes split across several), named something like:
model.gguf
model.safetensors
model.bin
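To make the "it's just a file" point concrete, here is a minimal sketch that guesses a model file's format. It relies on one fact, that GGUF files begin with the four ASCII bytes ''GGUF'', and otherwise falls back to the file extension. The function name ''guess_model_format'' is made up for this illustration and is not part of any library.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # GGUF files start with these four ASCII bytes

# Extension fallback for the three file types listed above.
EXTENSION_FORMATS = {
    ".gguf": "gguf",
    ".safetensors": "safetensors",
    ".bin": "bin",
}


def guess_model_format(path: str) -> str:
    """Guess a model file's format (hypothetical helper, for illustration).

    Checks the first four bytes for the GGUF magic when the file exists,
    then falls back to the file extension.
    """
    p = Path(path)
    if p.is_file():
        with p.open("rb") as f:
            if f.read(4) == GGUF_MAGIC:
                return "gguf"
    return EXTENSION_FORMATS.get(p.suffix.lower(), "unknown")
```

Note that nothing here "runs" the model. The file is only read as data, which is exactly the distinction between a model and a runtime.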
Examples of models:
Llama 3 8B
Llama 3 70B
Mistral 7B
Mixtral 8x7B
Qwen 2 7B
Phi-3
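A model file contains:
Weights (billions of learned numbers)
The "neurons" and "synapses" of the network, which are really just those weights
Every learned pattern, encoded in the weights
All the "intelligence"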
A model does NOT:
Have a UI
Open PDFs
Connect to NASA
Run indexing
Provide a chat window
It only does one thing: text in → text out.
===== 2. MODEL RUNTIMES (ENGINES) =====
These are the programs that LOAD and RUN the model's brain.
Think of a “runtime” as the machine that runs a model file.
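The split between model and runtime can be sketched with a toy example. Everything below is invented for illustration: ''TOY_MODEL'' is a tiny table of next-word weights standing in for billions of real parameters, and ''run'' is a stand-in for what a real runtime does (load the data, then loop to produce one token at a time). Real models are neural networks, not lookup tables.

```python
import random

# The "model": pure data, no code. (Toy stand-in for a real weight file.)
TOY_MODEL = {
    "the": {"cat": 0.7, "dog": 0.3},
    "cat": {"sat": 1.0},
    "dog": {"sat": 1.0},
    "sat": {"<end>": 1.0},
}


def run(model: dict, prompt: str, max_tokens: int = 10, seed: int = 0) -> str:
    """The "runtime": the program that actually does the computing."""
    rng = random.Random(seed)
    tokens = prompt.split()
    if not tokens:
        return ""
    for _ in range(max_tokens):
        choices = model.get(tokens[-1])
        if not choices:
            break  # the model has nothing learned for this word
        # Sample the next word in proportion to its learned weight.
        next_token = rng.choices(list(choices), weights=list(choices.values()))[0]
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return " ".join(tokens)
```

Calling ''run(TOY_MODEL, "the")'' continues the prompt word by word. Swapping in a different table changes the output without touching ''run'', which is the same reason one runtime can load many different model files.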
Runtimes include:
Ollama
Terminal-based
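Local HTTP API (by default at http://localhost:11434)
Can customize models with a Modelfile (system prompt, parameters, adapters), but does not fine-tune (train) them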
Good for automation
Good for pipelines
Very flexible
Acts like a backend server
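As a rough sketch of what "acts like a backend server" means in practice, the snippet below sends one prompt to Ollama's local ''/api/generate'' endpoint using only the Python standard library. It assumes Ollama is running on its default port and that a model has already been pulled. The model name ''llama3'' in the usage note and the helper names ''build_request'' and ''generate'' are choices made for this example.

```python
import json
import urllib.request

# Ollama's default local address; change it if your install differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text.

    Needs a running Ollama server; raises urllib.error.URLError otherwise.
    """
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, ''print(generate("llama3", "Say hello in one sentence."))'' should print the model's reply. Because this is plain HTTP, any script, pipeline, or UI can use the runtime the same way.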