Copilot CLI BYOK and Local Model Support¶

Connect Copilot CLI to your own model provider — Ollama, Azure OpenAI, Anthropic, or any OpenAI-compatible endpoint — for cost control, data residency compliance, and fully air-gapped workflows.

Released April 7, 2026, Copilot CLI BYOK lets you replace GitHub-hosted model routing with your own provider (GitHub Changelog). Four environment variables configure it.

Configuration¶

Set these variables before launching copilot (GitHub Docs):

Variable	Required	Description
`COPILOT_PROVIDER_BASE_URL`	Yes	Base URL of your provider's API endpoint
`COPILOT_MODEL`	Yes	Model identifier (or use `--model` flag)
`COPILOT_PROVIDER_TYPE`	No	`openai` (default), `azure`, or `anthropic`
`COPILOT_PROVIDER_API_KEY`	No	API key — omit for unauthenticated local endpoints

Supported providers¶

OpenAI-compatible providers (Ollama, vLLM, Foundry Local, OpenAI):

export COPILOT_PROVIDER_BASE_URL=http://localhost:11434
export COPILOT_MODEL=llama3.2

Azure OpenAI:

export COPILOT_PROVIDER_BASE_URL=https://YOUR-RESOURCE.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT
export COPILOT_PROVIDER_TYPE=azure
export COPILOT_PROVIDER_API_KEY=YOUR-AZURE-API-KEY
export COPILOT_MODEL=YOUR-DEPLOYMENT-NAME

Anthropic:

export COPILOT_PROVIDER_TYPE=anthropic
export COPILOT_PROVIDER_BASE_URL=https://api.anthropic.com
export COPILOT_PROVIDER_API_KEY=YOUR-ANTHROPIC-API-KEY
export COPILOT_MODEL=claude-opus-4-5

Offline mode¶

COPILOT_OFFLINE=true disables telemetry and restricts the CLI to talk only to the configured provider (GitHub Changelog). Paired with a local Ollama instance, this gives you fully air-gapped workflows.

Isolation is only complete when COPILOT_PROVIDER_BASE_URL also points to a local or on-prem endpoint. A remote URL sends prompts and code context over the network to that provider, even in offline mode (GitHub Docs).

GitHub authentication¶

GitHub login is optional with BYOK. With only provider credentials, you run the full local agent without a Copilot subscription. Adding GitHub authentication re-enables /delegate, GitHub Code Search, and the GitHub MCP server (GitHub Changelog).

Model requirements¶

Any model must support tool calling and streaming. GitHub recommends a 128k context window for complex tasks. All built-in sub-agents (explore, task, code-review) inherit the provider configuration automatically — there is no per-agent routing (GitHub Changelog).

Trade-offs¶

Consideration	Detail
Capability ceiling	Local models have lower reasoning quality than hosted frontier models for complex tasks
Single provider	Only one provider config is active at a time — no native multi-provider routing per task
Failure handling	Invalid config surfaces actionable errors; the CLI never silently falls back to GitHub-hosted models
Discovery	`copilot help providers` prints quick setup instructions in-terminal

Example¶

Air-gapped compliance scenario: a financial services team must keep all code on-premises.

# Pull a capable local model
ollama pull llama3.2

# Configure Copilot CLI for local-only operation
export COPILOT_PROVIDER_BASE_URL=http://localhost:11434
export COPILOT_MODEL=llama3.2
export COPILOT_OFFLINE=true

# Launch — no GitHub auth required, no external network calls
copilot

The team loses /delegate and the GitHub MCP server but gains full data-residency compliance. For tasks that need stronger reasoning, swap COPILOT_MODEL and COPILOT_PROVIDER_BASE_URL to a hosted provider and drop COPILOT_OFFLINE. Those sessions then route externally, with the same CLI experience.

Key Takeaways¶

Four env vars configure any OpenAI-compatible provider, Azure OpenAI, or Anthropic
COPILOT_OFFLINE=true restricts network to the configured provider — full isolation requires a local provider URL
GitHub auth is optional; adding it re-enables cloud features like /delegate and GitHub MCP
All built-in sub-agents inherit provider config; per-task routing requires managing env vars externally
Invalid provider config surfaces clear errors — no silent fallback to GitHub-hosted models