Ollama
December 10, 2024


AI · LLM · Local · Privacy · CLI

Run LLMs locally with zero hassle. Ollama makes running Llama, Mistral, and other models as easy as running Docker containers.


Ollama is the easiest way to run large language models on your own machine. No cloud, no API keys, no BS.

Quick Deploy

```bash
# macOS/Linux - one-line install
curl -fsSL https://ollama.com/install.sh | sh

# Windows - download installer from ollama.com

# Pull and run a model
ollama run llama2
ollama run mistral
ollama run codellama
```

API Usage

```bash
# Ollama runs a local API server
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?"
}'
```

Python Integration

```python
import requests

# Set stream to False so the API returns a single JSON object
# (by default Ollama streams newline-delimited JSON chunks)
response = requests.post('http://localhost:11434/api/generate', json={
    'model': 'llama2',
    'prompt': 'Write a haiku about coding',
    'stream': False
})
print(response.json()['response'])
```
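For longer generations you will usually want to stream tokens as they arrive. With streaming left at its default, Ollama returns one JSON object per line; a minimal sketch of consuming that stream with requests:

```python
import json
import requests

# Streaming is the default: each line of the response is a JSON chunk
with requests.post('http://localhost:11434/api/generate', json={
    'model': 'llama2',
    'prompt': 'Write a haiku about coding'
}, stream=True) as response:
    for line in response.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        # Each chunk carries a piece of the generated text in 'response'
        print(chunk.get('response', ''), end='', flush=True)
        if chunk.get('done'):
            print()
            break
```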

Use Cases

Private AI Assistant: Run ChatGPT-like conversations without sending data to the cloud. Perfect for sensitive work.

Local Development: Test AI features without burning API credits.

Offline Coding Help: Get code suggestions even without internet.

Custom Chatbots: Build AI apps that run entirely on-premise (a minimal chat-loop sketch follows this list).

Learning & Experimentation: Try different models without cost.
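As a sketch of the chatbot use case above: Ollama also exposes a chat-style endpoint at /api/chat that accepts a list of role/content messages. The model name and prompts here are placeholders, not a prescribed setup:

```python
import requests

# Bare-bones on-premise chat loop against Ollama's local chat endpoint
messages = []
while True:
    user_input = input("You: ")
    if user_input.strip().lower() in {"exit", "quit"}:
        break
    messages.append({"role": "user", "content": user_input})
    response = requests.post('http://localhost:11434/api/chat', json={
        'model': 'llama2',
        'messages': messages,
        'stream': False
    })
    reply = response.json()['message']['content']
    messages.append({"role": "assistant", "content": reply})
    print("Assistant:", reply)
```

Keeping the full `messages` list between turns is what gives the model conversation memory; everything stays on your machine.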

Popular Models

- `llama2` - Meta's general-purpose model
- `mistral` - Fast and capable
- `codellama` - Optimized for code
- `vicuna` - Great for chat
- `neural-chat` - Intel's optimized model
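To check which of these you have already pulled, run `ollama list`, or query the local API; a quick sketch using the /api/tags endpoint:

```python
import requests

# List models that have been pulled to the local machine
tags = requests.get('http://localhost:11434/api/tags').json()
for model in tags.get('models', []):
    print(model['name'])
```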

Hardware

- 7B models: 8GB RAM minimum
- 13B models: 16GB RAM
- 70B models: 64GB+ RAM