Setting up Cline with Claude, GPT-4o, and OpenRouter as fallback providers
Published 2026-04-22
Cline supports over a dozen model providers out of the box, but the default flow assumes you pick one and stick with it. That’s fine until Anthropic has a partial outage on a Tuesday afternoon, your API call hangs, and you’re staring at a frozen agent. A multi-provider setup with a fallback is a 15-minute config change that’s saved me twice.
The basic config
In VSCode, open Cline’s settings panel (the gear icon in the Cline sidebar) and pick API Provider. The dropdown shows: Anthropic, OpenRouter, AWS Bedrock, GCP Vertex, OpenAI, OpenAI-compatible, Gemini, Mistral, DeepSeek, LM Studio, Ollama, and a few others.
For a daily-driver setup, I keep two providers configured: Anthropic direct, and OpenRouter as a fallback that puts most other models behind one key.
Provider: Anthropic
Model: claude-3-5-sonnet-20241022
Anthropic API Key: sk-ant-...
Save. Test by sending Cline a one-line task. If the response comes back, the primary path works.
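If you want to verify the key outside Cline first, here’s a minimal sketch against Anthropic’s Messages API (TypeScript, Node 18+ run as an ES module; ANTHROPIC_API_KEY is my assumed variable name, use wherever you keep the key):

// Sanity-check the Anthropic key outside Cline.
const res = await fetch("https://api.anthropic.com/v1/messages", {
  method: "POST",
  headers: {
    "x-api-key": process.env.ANTHROPIC_API_KEY ?? "",
    "anthropic-version": "2023-06-01",
    "content-type": "application/json",
  },
  body: JSON.stringify({
    model: "claude-3-5-sonnet-20241022",
    max_tokens: 32,
    messages: [{ role: "user", content: "Say ok." }],
  }),
});
console.log(res.status); // 200: primary path works; 401: bad key; 529: overloaded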
Adding OpenRouter as the fallback
OpenRouter is a single API that proxies to most models — Claude, GPT-4o, Gemini, Llama 3.1, Mistral, DeepSeek, and dozens of others. One account, one key, pay-per-token. For a fallback, this matters because if Anthropic is down, you can switch to GPT-4o through OpenRouter without reconfiguring anything.
Sign up at openrouter.ai, fund your account ($5 minimum), copy your key. Then in Cline settings:
Provider: OpenRouter
Model: anthropic/claude-3.5-sonnet
OpenRouter API Key: sk-or-v1-...
The OpenRouter model identifier mirrors Anthropic’s, but requests route through OpenRouter’s infrastructure. If Anthropic’s own endpoint is degraded, OpenRouter can sometimes still reach the same model through a different upstream route.
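The same smoke test through OpenRouter, which speaks the OpenAI-compatible chat completions format (same Node/TypeScript setup; OPENROUTER_API_KEY is again my assumed variable name):

// Same one-line test, routed through OpenRouter instead of Anthropic direct.
const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.OPENROUTER_API_KEY ?? ""}`,
    "content-type": "application/json",
  },
  body: JSON.stringify({
    model: "anthropic/claude-3.5-sonnet", // same Sonnet, different infrastructure
    messages: [{ role: "user", content: "Say ok." }],
  }),
});
console.log(res.status);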
The trick: save this as a separate Cline configuration profile so you can switch in one click rather than re-typing the key.
Switching profiles in one click
Cline 3.x supports multiple “API configurations” — saved provider+model combos. In the settings panel, click Save as new profile. Name them clearly:
anthropic-direct — primary, Claude 3.5 Sonnet via Anthropic
openrouter-claude — fallback, same model via OpenRouter
openrouter-gpt4o — fallback, GPT-4o for when Anthropic is the problem
local-ollama — last resort, Llama 3.1 70B on your own machine
When the primary fails, you click the profile dropdown at the top of the chat panel and switch. No re-entering keys, no losing the conversation.
When to use which fallback
The realistic failure modes I’ve hit:
Anthropic API rate limit / 529 overload. Switch to openrouter-claude — same model, different infrastructure path. About 70% of the time this works.
Anthropic full outage. Switch to openrouter-gpt4o. Different model, different vendor. Quality drops a little for refactoring tasks (Sonnet is more careful), but you can keep working.
Office Wi-Fi flakiness with cloud providers. Switch to local-ollama. Quality drops a lot, but for boilerplate generation and simple completions, Llama 3.1 70B running on a 64GB Mac is workable.
Hard deadline with a complex task. Stay on Anthropic and add a backup terminal session running with a different account/key — overkill for normal work, but I did this twice in March before a customer demo.
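Cline won’t fail over on its own, so the switch is always manual. When I’m not sure which profile to reach for, a quick triage sketch like this helps (the URLs are each provider’s model-list endpoint; any HTTP response, even a 401, proves the host is up, though a reachable host can still 529 under real load):

// Probe each provider and print which profiles are worth trying.
// Profile names match the ones defined above.
const probes: Array<[string, string]> = [
  ["anthropic-direct", "https://api.anthropic.com/v1/models"],
  ["openrouter-claude", "https://openrouter.ai/api/v1/models"],
  ["local-ollama", "http://localhost:11434/api/tags"],
];
for (const [profile, url] of probes) {
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(3000) });
    console.log(`${profile}: up (HTTP ${res.status})`);
  } catch {
    console.log(`${profile}: unreachable`);
  }
}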
OpenRouter pricing reality
OpenRouter takes a small markup on top of underlying provider rates. For Claude 3.5 Sonnet via OpenRouter, you pay roughly 5% more than direct Anthropic. For some other models the markup is higher.
If you’re using OpenRouter only as fallback (low actual volume), the markup doesn’t matter. If you switch to OpenRouter as your primary because you like the convenience of one key for all models, you’re paying $5–$10/month extra at moderate usage. Worth it for some, not for others.
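For concreteness, the back-of-envelope math behind that range, taking Anthropic’s published Sonnet rates ($3/M input, $15/M output tokens) and the ~5% markup; the monthly volumes are my guess at moderate agentic usage, not measurements:

// Rough monthly markup cost. Token volumes are assumptions.
const inputMTok = 30;  // 30M input tokens/month (Cline sends big contexts)
const outputMTok = 5;  // 5M output tokens/month
const direct = inputMTok * 3 + outputMTok * 15;  // $165 via Anthropic
const markup = direct * 0.05;                    // ~$8.25 via OpenRouter
console.log(`extra cost: ~$${markup.toFixed(2)}/month`);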
Local Ollama setup as a last resort
Install Ollama from ollama.com. Then:
ollama pull llama3.1:70b # ~40GB download
ollama pull qwen2.5-coder:32b # alternative coding model
In Cline:
Provider: Ollama
Base URL: http://localhost:11434
Model: llama3.1:70b
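To confirm the local path works before you’re on the plane, a one-shot test against the same endpoint (assumes the 70B pull finished; stream: false makes Ollama return a single JSON object instead of a stream):

// One-shot completion through Ollama's built-in /api/generate route.
const res = await fetch("http://localhost:11434/api/generate", {
  method: "POST",
  headers: { "content-type": "application/json" },
  body: JSON.stringify({
    model: "llama3.1:70b",
    prompt: "Say ok.",
    stream: false,
  }),
});
const { response } = await res.json();
console.log(response);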
Quality drops for anything beyond simple tasks. Llama 3.1 70B is roughly equivalent to GPT-3.5 for code tasks — useful for “fill in this getter method” but not for “design this system.” Qwen 2.5 Coder 32B is better for coding specifically and runs faster, but is still well below Sonnet for hard tasks.
The reason to have local Ollama configured isn’t quality — it’s that it works on a plane.
What I’d recommend skipping
A few configurations that sound good and aren’t:
Bedrock or Vertex unless you’re at an enterprise. The IAM setup for either is non-trivial. If your company is paying for AWS or GCP and wants you to route through them for billing/audit reasons, fine. Otherwise the direct provider is simpler.
OpenAI direct alongside OpenRouter. Pick one or the other for OpenAI models. Two configs that do the same thing are just clutter.
LM Studio if you have Ollama. They cover the same use case. Ollama is more CLI-friendly; LM Studio has a UI. Use whichever you prefer, not both.
The full settings.json snippet
If you’d rather configure via JSON than the UI (faster to share or version-control):
{
"cline.apiConfigurations": [
{
"id": "anthropic-direct",
"apiProvider": "anthropic",
"apiModelId": "claude-3-5-sonnet-20241022"
},
{
"id": "openrouter-claude",
"apiProvider": "openrouter",
"openRouterModelId": "anthropic/claude-3.5-sonnet"
},
{
"id": "openrouter-gpt4o",
"apiProvider": "openrouter",
"openRouterModelId": "openai/gpt-4o"
},
{
"id": "local-ollama",
"apiProvider": "ollama",
"ollamaModelId": "llama3.1:70b"
}
]
}
API keys go in the secure store, not in this JSON. Cline prompts for them on first use of each profile.
What this gets you
The whole setup is maybe 15 minutes the first time. After that, switching providers is two clicks when something breaks. The amortized payoff: not losing 45 minutes the next time the primary API has a bad afternoon. Two saves in a month cover the setup cost many times over.