
Secrets, sandboxes, and network isolation when using AI coding tools

Published 2026-05-11 by Owner

Using AI coding tools does not automatically compromise your security posture, but it changes the attack surface in ways most developers haven’t mapped yet. Three separate threat axes deserve distinct mitigations. Conflating them leads to either complacency (“I don’t send secrets to the model, so I’m fine”) or paranoia (“I can’t use AI tools at all for work code”). Neither is correct.

The three threat axes

Axis 1 — Exfiltration via logs

When you send a prompt to a cloud-based AI coding tool, the provider receives the text. Whether they log it, train on it, or retain it depends on their tier and policy — and the policy can differ between your personal account and an enterprise subscription.

Claude Code (Anthropic) and Cursor both offer enterprise plans where training on customer data is contractually off by default. On free or individual-tier plans, policy varies. GitHub Copilot Business excludes training data by default; Copilot Individual does not.

The practical risk is not that someone at Anthropic reads your prompt. The practical risk is that a future model trained on retained prompts could surface proprietary logic, internal API shape, or business logic patterns to a completely unrelated user who prompts the right way. The mitigations:

  • Check the privacy tier of your active subscription before using the tool on proprietary code.
  • If you’re on a free tier and working with confidential code, upgrade or use a self-hosted model.
  • Never paste secrets, credentials, or anything from a .env file into a prompt — the tool will include it in the request body regardless of your privacy tier.
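
One way to make that last rule checkable rather than a matter of discipline: scan the working tree for secret-shaped strings before pointing a tool at it. A minimal sketch using gitleaks, assuming it's installed (the --no-git flag scans the directory contents rather than git history):

gitleaks detect --source . --no-git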

Axis 2 — Exfiltration via tool calls

This is the less-understood axis. Modern AI coding tools do more than generate text: they run shell commands, call MCP servers, read files, and write files. When the tool calls an external MCP server, that server receives context. If the MCP server is third-party (a marketplace plugin you installed), you’ve handed context to a party whose data practices you probably haven’t reviewed.

The concrete threat: an MCP server installed for “project search” sends each tool invocation, including the code context it was given, back to a third-party endpoint. Most MCP servers are open source and easy to audit, but most developers don’t audit them.
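
A first pass doesn't require reading every line. Grep the server's source for outbound network calls and check each hit against the documented API. A sketch, assuming the server is a Node package you've checked out locally (the path is illustrative):

grep -rnE "fetch\(|axios|https?\.request" --include="*.js" --include="*.ts" ./mcp-server-src/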

Shell hooks are a similar vector. Claude Code, Cline, and others support pre/post hooks that run shell commands. A malicious hook in a cloned repo — or one introduced by a supply-chain attack on a package in your project — can exfiltrate the code that gets passed to the AI tool’s context window.

Mitigations:

  • Audit MCP servers before installing. Read the source. If it makes outbound HTTP calls that aren’t to the tool’s documented API, investigate.
  • Review shell hooks in .claude/settings.json, .cline/hooks, or wherever your tool stores them. Give them the same scrutiny you’d give a postinstall script.
  • Run tools with minimal permission grants. Claude Code’s permission model lets you allow specific paths and commands; don’t approve Bash(*) unless you understand what that grants.
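
For the hook review specifically, you can list what's configured without opening the file. A sketch assuming jq is installed and your Claude Code version stores hooks under a top-level hooks key:

jq '.hooks // empty' .claude/settings.json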

Axis 3 — Supply chain via generated code

The model doesn’t know what packages currently exist on npm. It knows what packages existed when it was trained. Anything published since its training cutoff is invisible to it, nothing stops it from inventing a name that was never published at all, and a malicious package published under a name that looks like a common utility fits its pattern-matching perfectly. In every case the output is an install command you might run without checking.

This is not theoretical. Researchers have demonstrated that models confidently suggest packages that don’t exist (hallucinated) and that typosquatted names land in install suggestions. The model suggesting npm install node-fetch-v3-compat is not evidence that such a package exists or is safe.

Secrets hygiene

The primary rule is obvious and still frequently violated: never commit .env files. The secondary rule is more nuanced: even if a secret never reaches git, it may reach the model.

The default behavior for most AI coding tools is to read files from the current working directory for context. If you have a .env file in your project root and the tool expands its context window to relevant files, your secrets may be included in the prompt. Check your tool’s context-inclusion behavior:

  • Cursor’s “codebase indexing” will index .env if it’s not in .cursorignore. Add it.
  • Cline’s autonomous file-reading will read any file it decides is relevant. Use .clineignore.
  • Claude Code will ask before reading individual files in interactive mode, but in automated/headless mode, the permission profile controls this.

A .gitignore entry is necessary but not sufficient — it only prevents git from tracking the file. Your tool needs its own ignore list.
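
A starting point for either ignore file, since both follow gitignore pattern syntax (the entries are illustrative; extend them for your project):

# .cursorignore / .clineignore: keep secrets out of the tool's context
.env
.env.*
*.pem
*.key
credentials/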

If you’ve already leaked a secret to a model:

  1. Rotate it immediately. Treat it as compromised the moment it left your machine.
  2. Check the provider’s data retention policy to understand whether you can request deletion of a specific conversation.
  3. Add the path to your tool’s ignore list so it doesn’t recur.

Rotation beats any other remediation step. The question is not whether the provider retained it; the question is whether the credential still works if they did.
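
What rotation looks like in practice, using an AWS IAM access key as the example (the user name is hypothetical; the create-then-revoke sequence applies to most credential systems):

# issue a replacement key first so nothing breaks mid-rotation
aws iam create-access-key --user-name deploy-bot

# update every consumer to the new key, then revoke the old one
aws iam delete-access-key --user-name deploy-bot --access-key-id <old-key-id>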

Environment variable discipline more broadly:

Use separate credentials per environment. A production database credential should never appear in a development context, let alone in a prompt. If your dev setup requires knowing the production connection string, fix the setup before worrying about the AI tool.

Sandbox modes per tool

Several tools now offer explicit sandbox or restricted modes. Knowing what each actually does matters — “sandbox” means different things.

Codex CLI defaults to a sandboxed execution mode where shell commands run inside a network-restricted container. The --full-auto flag disables this. If you’re using Codex for automated tasks, run with the default (sandboxed) rather than --full-auto unless you have a specific need and understand what the tool will execute.
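
In practice that means preferring the plain invocation. Illustrative commands (exact syntax may differ across Codex CLI versions):

codex "add input validation to the upload handler"              # default: sandboxed
codex --full-auto "add input validation to the upload handler"  # sandbox off; avoid by default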

Claude Code has a permission mode system. At first run, it prompts for permission grants. The conservative configuration:

  • Allow specific paths: Read(src/**), Write(src/*) rather than Read(**), Write(**)
  • Allow specific commands: Bash(bun test), Bash(bun run lint) rather than Bash(*)
  • Review the resulting .claude/settings.json before committing it
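
Put together, a conservative .claude/settings.json might look like this (a sketch; the key names assume the current permissions schema, so check your version's docs):

{
  "permissions": {
    "allow": [
      "Read(src/**)",
      "Write(src/*)",
      "Bash(bun test)",
      "Bash(bun run lint)"
    ]
  }
}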

The permissions file gets committed to the repo, which means a PR adding Bash(*) to permissions is a meaningful security event. Review it like you would a change to CI configuration.

Cursor’s Privacy Mode prevents the contents of your files from being sent to Cursor’s servers for training or storage. It does not prevent the current chat context from being sent for inference — it has to be, to generate a response. Privacy Mode is about retention, not transmission.

Understanding this distinction prevents the false confidence of “I turned on Privacy Mode so nothing leaves my machine.” Inference requests leave your machine by definition. The question is what the provider does with them afterward.

Network isolation when possible

For the highest-sensitivity work, the most reliable mitigation is preventing outbound model API calls from reaching external infrastructure entirely.

Self-hosted local models (Ollama with Qwen2.5-Coder, LM Studio with a coding model, llama.cpp) keep all inference on your machine. The quality gap with frontier models is real — local models at typical consumer hardware sizes are meaningfully worse at complex reasoning tasks — but for a highly constrained context (secret codebase, regulatory requirement, no external network), the tradeoff is worth taking.

Aider, Cline, and Claude Code can all be configured to point at a local inference endpoint. For Aider:

OLLAMA_API_BASE=http://localhost:11434 aider --model ollama/qwen2.5-coder:32b
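
If the model isn't already available locally, pull it first (assuming a default Ollama install, which serves on localhost:11434):

ollama pull qwen2.5-coder:32b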

For Claude Code, set the API endpoint to a self-hosted or corporate-proxied endpoint via the config:

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://your-internal-proxy.corp/anthropic"
  }
}

Corporate AI proxy — several enterprise setups route all model API calls through an internal proxy that logs requests (for compliance), strips secrets (via pattern matching on common secret formats), and enforces access control. If your organization has one, use it. If you’re building security tooling and considering one, this is the architecture that makes AI tools viable for regulated industries.
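
The secret-stripping piece is typically a set of regexes over the request body, keyed to well-known credential formats. Illustrative patterns (a real proxy's list would be longer and maintained):

AKIA[0-9A-Z]{16}                                  # AWS access key ID
ghp_[A-Za-z0-9]{36}                               # GitHub personal access token
-----BEGIN( RSA| EC| OPENSSH)? PRIVATE KEY-----   # PEM private key header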

Running through a proxy adds latency but provides the audit trail that lets you answer “did anything sensitive get sent to the model this sprint” from logs rather than from memory.

A near-miss

A few months ago, a side project needed a utility to parse a specific AWS event format. The agent (Claude Code in plan-then-act mode) proposed this at the end of a scaffolding task:

I'll install the helper library that handles this format natively:

npm install aws-event-bridge-utils

It looked plausible. aws-event-bridge-utils follows the naming pattern of a dozen real AWS SDK packages. Before running it, two seconds of checking:

npm info aws-event-bridge-utils
npm error 404 Not Found - GET https://registry.npmjs.org/aws-event-bridge-utils

The package doesn’t exist. The model hallucinated a plausible name. If someone had registered aws-event-bridge-utils on npm with malicious code, the install command would have run it — and depending on the project’s dependencies, that code might have had access to the active AWS credential chain.

What stopped it: habit. npm info <package> before npm install <package> is the single-step gate that catches both hallucinated packages and potential typosquatting. It adds three seconds.
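
The habit is easy to script. A minimal wrapper, assuming bash and npm (the script name is made up):

#!/usr/bin/env bash
# check-install: refuse to install a package the registry has never heard of
set -euo pipefail
pkg="$1"
if npm info "$pkg" > /dev/null 2>&1; then
  npm install "$pkg"
else
  echo "'$pkg' not found on the registry; refusing to install" >&2
  exit 1
fi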

The pattern generalizes: treat every package the model suggests as unverified until you’ve confirmed it exists and has reasonable download counts, publication history, and code content. Models are confident about packages whether or not they exist, and their confidence is not calibrated to reality.

A useful addition to this workflow: check the package’s first publish date. If the model was trained before that date, the model cannot have seen it — which means it either hallucinated a coincidentally correct name or is suggesting something it inferred from patterns rather than remembered from training. Either case warrants closer scrutiny than a package with five years of version history.
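
The date check is one npm command (shown against a placeholder name; substitute the suggested package):

npm view some-suggested-package time.created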

For higher-stakes contexts, add socket.dev or Snyk’s dependency scanning to the install step in CI so that even packages that do exist get checked against known malicious indicators.
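
A sketch of that CI step, assuming Snyk with an auth token already configured in the environment:

npm ci
npx snyk test   # fails the build on dependencies with known malicious or vulnerable versions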

What the security surface actually looks like

Mapping these three axes to everyday practice:

  • Logs: check your subscription tier, add ignore rules for secrets files, rotate anything that was ever pasted into a prompt
  • Tool calls: audit MCP servers before install, review permission grants before committing them, treat hook scripts like install scripts
  • Supply chain: npm info before npm install, lock your dependency files, add dependency scanning to CI

None of these are new security practices. They’re adaptations of existing practices to a changed context. The AI coding tool changes the rate at which you interact with your codebase and the rate at which new code and dependencies appear — it doesn’t change what good security hygiene looks like; it just raises the stakes for having it.

The one genuinely novel element is the MCP/tool-call axis. Before these tools grew plugin ecosystems, the attack surface was limited to what you typed into a chat window. Now the tool executes code, reads files, and calls external services on your behalf. The security perimeter has to expand to cover what the tool does, not just what you explicitly send.

Treat your AI coding tool’s permission file the same way you treat your CI configuration: version it, review changes in PRs, and question additions that expand scope. A change that adds Bash(*) to your Claude Code settings is worth the same scrutiny as a change that adds a new workflow step to your CI file — because both, when misconfigured, can run arbitrary code in your repository’s context.