Embeddable AI
A fast, lightweight (~3 MB) inference server that supercharges your apps with local AI.
OpenAI-Compatible
Nitro is a drop-in replacement for OpenAI's REST API
POST
http://localhost:3928/v1/chat/completions
curl http://localhost:3928/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Who won the world series in 2020?"
      }
    ]
  }'
POST
https://api.openai.com/v1/chat/completions
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Who won the world series in 2020?"
      }
    ]
  }'
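Because Nitro runs models locally, a model has to be loaded into the server before the Nitro request above returns an answer. Here is a minimal sketch of that step, assuming Nitro's llama.cpp loadmodel endpoint; the model path, context length, and GPU layer count are placeholders to adapt to your own setup.

# Load a local GGUF model into Nitro before calling /v1/chat/completions
# (the path and parameter values below are placeholders for your own model)
curl http://localhost:3928/inferences/llamacpp/loadmodel \
  -H "Content-Type: application/json" \
  -d '{
    "llama_model_path": "/path/to/your-model.gguf",
    "ctx_len": 2048,
    "ngl": 100
  }'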
Lightweight
Nitro is an extremely lightweight library built for app developers to run local AI
Nitro: 3 MB
Local AI: 193 MB
Ollama: 332 MB
Cross-Platform
Nitro runs cross-platform on CPU and GPU architectures
GPUs
CPUs
Multi-modal
Nitro integrates best-in-class open-source AI libraries
Think
Llama2, Mistral, CausalML,...
Imagine
Coming Soon
Vision
Coming Soon
Speech
Coming Soon
Build with Nitro
Start running local AI models in your app within 10 seconds. Available as an npm package, a pip package, or a standalone binary.
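As a rough sketch of the quick start (the binary invocation and health-check path below are assumptions; see the Developer Docs for the exact install and run commands for your platform):

# Start the Nitro server; by default it listens on port 3928,
# matching the endpoint used in the examples above
./nitro

# From another terminal, check that the server is up
# (the health-check path is an assumption)
curl http://localhost:3928/healthz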
Developer Docs
Web App

Desktop App
Nitro's Architecture
Nitro is 100% open source, licensed under AGPLv3. We build upon the shoulders of giants at llama.cpp and Drogon.
OpenAI-compatible API: Authentication (Coming Soon), Batching, Multi-threading, Model Management
Model Engines:
LLMs: Llama.cpp, TensorRT-LLM (Coming Soon)
Speech: Whisper.cpp
Vision: StableDiffusion (Coming Soon)
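The Model Management component above is driven over plain HTTP, in the same style as the loadmodel sketch earlier. A hedged sketch follows; these endpoint names mirror Nitro's llama.cpp route prefix and should be verified against the Developer Docs.

# Check whether a model is currently loaded (endpoint name is an assumption)
curl http://localhost:3928/inferences/llamacpp/modelstatus

# Unload the current model to free memory (endpoint name is an assumption)
curl http://localhost:3928/inferences/llamacpp/unloadmodel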