AI Models

Choose from cloud models via OpenRouter or GitHub Copilot, or run local models via Ollama.

Omnilib connects to AI models through three providers: OpenRouter for cloud models, GitHub Copilot for Copilot-integrated models, and Ollama for local models running on your own hardware. You can switch models at any time from the chat input or configure a default in Settings > AI.

Token Limits by Tier

Your subscription tier determines how many AI tokens you can use per month across all providers:

| Tier       | Monthly token limit |
| ---------- | ------------------- |
| Free       | 100K tokens         |
| Researcher | 1M tokens           |
| Pro        | 5M tokens           |

Token usage is shown in Settings > Billing. If you reach your limit, you can purchase credit packs to continue using AI for the remainder of the month without upgrading your subscription.

OpenRouter

OpenRouter is the default provider. It routes your requests to the underlying model you select and handles billing separately from your Omnilib subscription (except for free-tier models, which are covered).

Free Tier Models

These models are available to all Omnilib users at no additional cost:

  • openrouter/free — Auto-routes to the best available free model at the time of request
  • google/gemini-3-flash-preview — Google's fast, capable preview model

Free models count against your monthly Omnilib token limit.

Paid Models

Paid models require an OpenRouter API key. Enter your key in Settings > AI > OpenRouter. Usage is billed by OpenRouter at their standard rates and does not count against your Omnilib token limit.

Available paid models include:

| Model                       | Best for                            |
| --------------------------- | ----------------------------------- |
| anthropic/claude-sonnet-4.5 | Balanced intelligence and speed     |
| anthropic/claude-haiku-4.5  | Fast, efficient tasks               |
| anthropic/claude-opus-4.6   | Deep reasoning and complex analysis |
| openai/gpt-5.4-nano         | Lightweight, high-speed tasks       |
| openai/gpt-5.4-mini         | Fast general-purpose tasks          |
| openai/gpt-5.2              | Strong all-round capability         |
| openai/codex                | Code generation and completion      |
| x-ai/grok-4.1-fast          | Speed-optimized reasoning           |
| qwen/qwen3-coder            | Code-specialized model              |

Additional models appear in the selector as OpenRouter adds them.

Connecting OpenRouter

  1. Go to Settings > AI > Model Providers
  2. Select OpenRouter
  3. Paste your OpenRouter API key
  4. Save — paid models become available in the model selector immediately
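Omnilib validates the key for you, but if paid models do not appear, you can sanity-check the key directly against OpenRouter's chat completions endpoint. This is an optional diagnostic, not something Omnilib requires; the model tag is taken from the table above.

```shell
# Optional: verify an OpenRouter API key outside Omnilib.
# Uses the key you pasted into Settings > AI > OpenRouter.
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-haiku-4.5",
    "messages": [{"role": "user", "content": "Say hello"}]
  }'
# A JSON completion means the key is valid; a 401 means it is not.
```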

GitHub Copilot

Connect GitHub Copilot to use models available through your Copilot subscription (Individual, Business, or Enterprise).

Connecting GitHub Copilot

  1. Go to Settings > AI > Model Providers
  2. Select GitHub Copilot
  3. Click Connect via GitHub — an OAuth device flow opens
  4. Follow the prompts to authorize Omnilib in your browser
  5. Return to Omnilib — available models are fetched automatically from the Copilot API

The model list updates dynamically whenever GitHub adds or removes models from the Copilot offering. You do not need to manually update Omnilib to see new models.
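For the curious, the connection steps above follow GitHub's standard OAuth device flow. The sketch below shows its shape with curl; `CLIENT_ID` and `DEVICE_CODE` are placeholders (Omnilib uses its own registered app ID internally, and handles all of this when you click Connect via GitHub).

```shell
# Illustration only: the OAuth device flow Omnilib runs for you.
# CLIENT_ID is a placeholder for the app's registered GitHub client ID.

# 1. Request a device code and a short user code
curl -X POST https://github.com/login/device/code \
  -H "Accept: application/json" \
  -d "client_id=CLIENT_ID"
# The response includes user_code (shown to you) and verification_uri
# (opened in your browser for authorization).

# 2. After you authorize in the browser, the app polls for a token
curl -X POST https://github.com/login/oauth/access_token \
  -H "Accept: application/json" \
  -d "client_id=CLIENT_ID" \
  -d "device_code=DEVICE_CODE" \
  -d "grant_type=urn:ietf:params:oauth:grant-type:device_code"
```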

Ollama (Local Models)

Desktop only. Ollama integration is not available in the web app.

Run AI models entirely on your own hardware. No data leaves your machine, there is no per-token cost, and models work offline.

Ollama must be running at localhost:11434. Omnilib detects it automatically when the desktop app starts.
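If local models do not appear, you can confirm the Ollama server is reachable before launching the desktop app:

```shell
# Check that Ollama is listening on its default port.
curl http://localhost:11434/api/tags
# Returns a JSON list of installed models if the server is up;
# a connection error means Ollama is not running (start it with `ollama serve`).
```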

Hardware-Aware Recommendations

When you open the model download panel, Omnilib inspects your system's GPU, CPU, and RAM and highlights models that will run well on your hardware. Recommended models appear at the top of the list with an estimated performance rating.

Downloading Models

  1. Go to Settings > AI > Local Models (Ollama)
  2. Browse the model list — recommended models are highlighted based on your hardware
  3. Click Download next to a model to start downloading
  4. A progress bar tracks the download and shows the estimated time remaining
  5. Once downloaded, the model appears in the model selector
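The same download can also be triggered directly against Ollama's pull endpoint, which streams JSON progress objects (roughly what a progress bar renders). The model tag here is one from the table below:

```shell
# Pull a model via Ollama's HTTP API; the response streams
# progress updates as newline-delimited JSON.
curl http://localhost:11434/api/pull -d '{"model": "qwen3.5:7b"}'
```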

Supported Models

Omnilib's local model panel includes Qwen 3.5 across a range of sizes:

| Model        | Size    | Best for                               |
| ------------ | ------- | -------------------------------------- |
| qwen3.5:0.8b | ~500 MB | Very fast, lightweight tasks           |
| qwen3.5:1.8b | ~1.1 GB | Quick responses on any hardware        |
| qwen3.5:7b   | ~4.5 GB | Good balance of quality and speed      |
| qwen3.5:14b  | ~8.5 GB | High quality, needs 16 GB RAM          |
| qwen3.5:35b  | ~20 GB  | Near-cloud quality, needs 32 GB RAM    |

You can also download any model supported by Ollama by entering a custom tag (e.g., llama3.2:3b) in the custom model field.
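Equivalently, you can pull any model from the Ollama CLI; once downloaded it is served by the same local daemon Omnilib talks to:

```shell
# Pull an Ollama model by tag from the command line.
ollama pull llama3.2:3b
```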

Managing Local Models

  • Click Remove next to a downloaded model to delete it and free disk space
  • Omnilib shows disk usage for each model
  • Switch between local models instantly — no restart required
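The same housekeeping is available from the Ollama CLI, since Omnilib and the CLI share one local model store (models removed either way should disappear from both):

```shell
ollama list            # show installed models and their sizes on disk
ollama rm qwen3.5:7b   # delete a model to free disk space
```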

Configuring the Default Model

Go to Settings > AI and select a default model. This model is pre-selected when you open a new chat. You can always override it per-conversation using the model selector in the chat input.