AI Models
Choose from cloud models via OpenRouter or GitHub Copilot, or local models via Ollama.
Omnilib connects to AI models through three providers: OpenRouter for cloud models, GitHub Copilot for Copilot-integrated models, and Ollama for local models running on your own hardware. You can switch models at any time from the chat input or configure a default in Settings > AI.
Token Limits by Tier
Your subscription tier determines how many AI tokens you can use per month across all providers:
| Tier | Monthly token limit |
|---|---|
| Free | 100K tokens |
| Researcher | 1M tokens |
| Pro | 5M tokens |
Token usage is shown in Settings > Billing. If you reach your limit, you can purchase credit packs to continue using AI for the remainder of the month without upgrading your subscription.
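The tier limits above amount to a simple lookup. A minimal sketch of how remaining tokens could be computed (the tier names and amounts come from the table; the helper itself is illustrative, not Omnilib's actual billing code):

```python
# Monthly token limits per tier, from the table above.
TIER_LIMITS = {
    "Free": 100_000,
    "Researcher": 1_000_000,
    "Pro": 5_000_000,
}

def tokens_remaining(tier: str, used: int) -> int:
    """Tokens left this month for the given tier; never negative."""
    return max(TIER_LIMITS[tier] - used, 0)

print(tokens_remaining("Free", 25_000))  # 75_000 tokens left on the Free tier
```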
OpenRouter
OpenRouter is the default provider. It routes your requests to the underlying model you select and handles billing separately from your Omnilib subscription (except for free-tier models, which are covered).
Free Tier Models
These models are available to all Omnilib users at no additional cost:
- openrouter/free — Auto-routes to the best available free model at the time of request
- google/gemini-3-flash-preview — Google's fast, capable preview model
Free models count against your monthly Omnilib token limit.
Paid Models
Paid models require an OpenRouter API key. Enter your key in Settings > AI > OpenRouter. Usage is billed by OpenRouter at their standard rates and does not count against your Omnilib token limit.
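If you want to sanity-check your key outside Omnilib, OpenRouter exposes an OpenAI-compatible chat completions endpoint. A sketch of assembling such a request (the model and prompt here are placeholders; this is not how Omnilib itself calls the API):

```python
import json

def openrouter_request(api_key: str, model: str, prompt: str):
    """Build an OpenAI-compatible chat completion request for OpenRouter.

    Returns (url, headers, body) ready to send with any HTTP client.
    """
    url = "https://openrouter.ai/api/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return url, headers, body

url, headers, body = openrouter_request("sk-or-...", "anthropic/claude-haiku-4.5", "Hello")
```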
Available paid models include:
| Model | Best for |
|---|---|
| anthropic/claude-sonnet-4.5 | Balanced intelligence and speed |
| anthropic/claude-haiku-4.5 | Fast, efficient tasks |
| anthropic/claude-opus-4.6 | Deep reasoning and complex analysis |
| openai/gpt-5.4-nano | Lightweight, high-speed tasks |
| openai/gpt-5.4-mini | Fast general-purpose tasks |
| openai/gpt-5.2 | Strong all-round capability |
| openai/codex | Code generation and completion |
| x-ai/grok-4.1-fast | Speed-optimized reasoning |
| qwen/qwen3-coder | Code-specialized model |
Additional models appear in the selector as OpenRouter adds them.
Connecting OpenRouter
- Go to Settings > AI > Model Providers
- Select OpenRouter
- Paste your OpenRouter API key
- Save — paid models become available in the model selector immediately
GitHub Copilot
Connect GitHub Copilot to use models available through your Copilot subscription (Individual, Business, or Enterprise).
Connecting GitHub Copilot
- Go to Settings > AI > Model Providers
- Select GitHub Copilot
- Click Connect via GitHub — an OAuth device flow opens
- Follow the prompts to authorize Omnilib in your browser
- Return to Omnilib — available models are fetched automatically from the Copilot API
The model list updates dynamically whenever GitHub adds or removes models from the Copilot offering. You do not need to manually update Omnilib to see new models.
Ollama (Local Models)
Desktop only. Ollama integration is not available in the web app.
Run AI models entirely on your own hardware. No data leaves your machine, there is no per-token cost, and models work offline.
Ollama must be running at localhost:11434. Omnilib detects it automatically when the desktop app starts.
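Detection amounts to probing Ollama's local HTTP API. A sketch of an equivalent check you could run yourself, using Ollama's standard /api/tags endpoint (the check itself is illustrative, not Omnilib's internal code):

```python
import urllib.request
import urllib.error

def ollama_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an Ollama server answers at base_url."""
    try:
        # /api/tags lists installed models; any 200 response means Ollama is up.
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```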
Hardware-Aware Recommendations
When you open the model download panel, Omnilib inspects your system's GPU, CPU, and RAM and highlights models that will run well on your hardware. Recommended models appear at the top of the list with an estimated performance rating.
Downloading Models
- Go to Settings > AI > Local Models (Ollama)
- Browse the model list — recommended models are highlighted based on your hardware
- Click Download next to a model to start downloading
- A progress bar shows download progress and estimated time remaining
- Once downloaded, the model appears in the model selector
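Under the hood, Ollama's pull API streams JSON status lines carrying completed and total byte counts, which is the kind of data that drives a progress bar. A sketch of turning one such line into a percentage (the sample line below is illustrative):

```python
import json

def pull_progress(status_line: str):
    """Percent complete from one streamed pull status line, or None if no byte counts."""
    status = json.loads(status_line)
    total = status.get("total")
    if not total:
        return None
    return 100.0 * status.get("completed", 0) / total

# Hypothetical status line mid-download:
line = '{"status": "downloading", "completed": 2250, "total": 4500}'
print(pull_progress(line))  # 50.0
```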
Supported Models
Omnilib's local model panel includes Qwen 3.5 across a range of sizes:
| Model | Size | Best for |
|---|---|---|
| qwen3.5:0.8b | ~500 MB | Very fast, lightweight tasks |
| qwen3.5:1.8b | ~1.1 GB | Quick responses on any hardware |
| qwen3.5:7b | ~4.5 GB | Good balance of quality and speed |
| qwen3.5:14b | ~8.5 GB | High quality, needs 16 GB RAM |
| qwen3.5:35b | ~20 GB | Near-cloud quality, needs 32 GB RAM |
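The hardware-aware recommendation described earlier can be approximated from this table: pick the largest variant whose RAM requirement fits. A sketch (the 14b and 35b figures come from the table; the RAM figures for the smaller variants are rough assumptions):

```python
# Minimum system RAM (GB) per model. The 14b and 35b figures come from the
# table above; the smaller variants' figures are assumed for illustration.
MIN_RAM_GB = {
    "qwen3.5:0.8b": 4,
    "qwen3.5:1.8b": 4,
    "qwen3.5:7b": 8,
    "qwen3.5:14b": 16,
    "qwen3.5:35b": 32,
}

def recommend(ram_gb: int):
    """Largest variant whose RAM requirement fits, or None if nothing fits."""
    fitting = [model for model, need in MIN_RAM_GB.items() if need <= ram_gb]
    return fitting[-1] if fitting else None

print(recommend(16))  # qwen3.5:14b
```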
You can also download any model supported by Ollama by entering a custom tag (e.g., llama3.2:3b) in the custom model field.
Managing Local Models
- Click Remove next to a downloaded model to delete it and free disk space
- Omnilib shows disk usage for each model
- Switch between local models instantly — no restart required
Configuring the Default Model
Go to Settings > AI and select a default model. This model is pre-selected when you open a new chat. You can always override it per-conversation using the model selector in the chat input.
Related
- AI Chat — Use models in conversation
- AI Modes — Chat, Agent, and Plan mode
- Agent Behaviors — Shape how the AI uses any model
- Introduction — Subscription tiers overview