LocalAI
November 30, 2024

AI · API · Self-Hosted · OpenAI · Docker

Drop-in OpenAI API replacement that runs locally. Use your existing code with local models - just change the endpoint.

LocalAI is a self-hosted, OpenAI-compatible API server. It exposes the same REST endpoints as OpenAI, so your existing client code works with local models - just change the base URL.

Quick Deploy

```bash
# Docker (easiest)
docker run -p 8080:8080 localai/localai

# Or with specific models: mount a local models directory
docker run -p 8080:8080 \
  -v $PWD/models:/models \
  localai/localai
```
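Once the container is up, you can sanity-check it from Python by listing the models the server exposes. A minimal sketch using the official openai package (v1+), assuming the server is on localhost:8080 as started above:

```python
from openai import OpenAI

# Point the client at the local server; LocalAI does not require an
# API key by default, but the SDK wants a non-empty string.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

# Print whatever models the server currently knows about.
for model in client.models.list():
    print(model.id)
```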

Use with OpenAI SDK

```python
from openai import OpenAI

# Just change the base URL! LocalAI does not require an API key
# by default, but the SDK wants a non-empty string.
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="not-needed",
)

response = client.chat.completions.create(
    model="llama2",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```
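Streaming goes through the same endpoint. A sketch, assuming the same server and a model named llama2 as above; with stream=True the SDK yields the reply incrementally:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

# stream=True returns an iterator of chunks instead of one final response.
stream = client.chat.completions.create(
    model="llama2",
    messages=[{"role": "user", "content": "Write a haiku about local inference."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:  # the final chunk's delta can be None
        print(delta, end="", flush=True)
print()
```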

Use Cases

Cost Savings: Stop paying for API calls. Run unlimited queries locally.

Privacy: Keep sensitive data on-premise.

Development: Test AI features without burning credits (see the endpoint-switching sketch below).

Offline Apps: Build apps that work without internet.

Custom Models: Use fine-tuned models with familiar APIs.

Enterprise: Self-hosted AI for compliance requirements.
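One way to get the development (and cost-saving) benefit is to make the endpoint a configuration detail. A sketch, assuming the openai v1 SDK, which already reads OPENAI_BASE_URL and OPENAI_API_KEY from the environment - so the same code can target LocalAI in development and OpenAI in production:

```python
import os
from openai import OpenAI

# export OPENAI_BASE_URL=http://localhost:8080/v1   (dev: LocalAI)
# export OPENAI_BASE_URL=https://api.openai.com/v1  (prod: OpenAI)
client = OpenAI(
    base_url=os.environ.get("OPENAI_BASE_URL", "http://localhost:8080/v1"),
    api_key=os.environ.get("OPENAI_API_KEY", "not-needed"),
)
```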

Supported APIs

- Chat completions (/v1/chat/completions)
- Completions (/v1/completions)
- Embeddings (/v1/embeddings)
- Image generation (/v1/images/generations)
- Audio transcription (/v1/audio/transcriptions)
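Each of these maps onto the corresponding method of the OpenAI SDK. For example, embeddings - a sketch where the model name is a placeholder for whichever embedding model you have configured:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

# "text-embedding-ada-002" is a placeholder; use the name of the
# embedding model configured in your LocalAI models directory.
result = client.embeddings.create(
    model="text-embedding-ada-002",
    input=["LocalAI serves embeddings locally too."],
)
print(len(result.data[0].embedding))  # vector dimensionality
```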

Compatible Models

- LLaMA, LLaMA 2
- Mistral, Mixtral
- Falcon
- GPT4All models
- Stable Diffusion (for images)
- Whisper (for audio)
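With a Stable Diffusion model configured, image generation also works through the standard endpoint. A sketch; "stablediffusion" is an assumed model name here, so match it to your own configuration:

```python
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

# "stablediffusion" is an assumed name; use your configured model.
image = client.images.generate(
    model="stablediffusion",
    prompt="a lighthouse at dusk, oil painting",
    size="512x512",
    response_format="b64_json",
)
with open("lighthouse.png", "wb") as f:
    f.write(base64.b64decode(image.data[0].b64_json))
```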

Pro Tips

- Use a GPU for a 10x+ speedup
- Quantized models use less RAM
- Multiple models can run simultaneously
- Check model compatibility before downloading