ADR 0015 — Explicit non-use of Claude API в URL-import stack

Date: 2026-05-22
Status: Accepted
Feature: URL-import
Affects: url_import_spec.md § III.2

Context

Изначально Claude был proposed в стеке URL-import:

Primary VLM для vision analysis
Text LLM для component enrichment
Code gen для TSX generation

Two independent reasons exclude Claude entirely:

Reason 1: Conflict of interest

Этот спек был написан AI ассистентом (Claude). Claude рекомендовал использовать Claude API — это conflict of interest, не моя позиция давать competitive advice.

"Я (Claude) изначально рекомендовал Claude Sonnet 4.6 как primary VLM — это conflict of interest."

После research юзера обнаружили: Gemini 2.5 Flash-Lite 30× дешевле при сопоставимом качестве для этой задачи. Объективно better choice для cost-conscious requirement.

Reason 2: Anthropic ToS (Feb 2026)

"You may not use outputs from Anthropic services to train, fine-tune,
or develop AI/ML models that compete with our services, or extract
embeddings or representations for downstream model training."

URL-import имеет data flywheel (ADR 0013) — production extractions train in-house модель (ADR 0014). Если Claude в стеке — даже как inference-only, риск ToS violation:

Production extraction calls Claude → response stored в shadow_dataset
Training pipeline reads shadow_dataset → может tренировать на Claude outputs
Even unintentional → ToS violation

Чистая solution: Claude not в стеке вообще. Никакого риска contamination.

Decision

Claude API исключён entirely из URL-import pipeline:

Layer	Was proposed	Decided
Primary VLM	Claude Sonnet 4.6 vision	Gemini 2.5 Flash-Lite
Text LLM enrichment	Claude Haiku	Gemini 2.5 Flash-Lite
Code gen	Claude (any)	Cerebras (Llama 3.1 70B) free → paid
Distillation teacher	NEVER Claude	Qwen3-VL-235B Apache 2.0

Verification

Pipeline должен periodically verify Claude не contaminate stack:

// Pre-deployment check (CI hook)
function verifyNoClaudeImports(packageJson) {
  const forbidden = ['@anthropic-ai/sdk', 'anthropic'];
  for (const dep of forbidden) {
    if (packageJson.dependencies?.[dep] || packageJson.devDependencies?.[dep]) {
      throw new Error(`Forbidden dependency: ${dep}. См ADR 0015.`);
    }
  }
}
 
// Runtime check (network egress monitor)
function alertOnClaudeApiCalls() {
  if (request.host.includes('anthropic.com')) {
    alert.send('ADR 0015 violation: Claude API call detected');
  }
}

Consequences

Pros:

Zero ToS exposure
Zero conflict-of-interest concerns в documentation
Forces competitive cost analysis (resulted в Gemini Flash-Lite at 30× lower cost)
Pipeline transparent — caller знает что не Claude

Cons:

Loses Claude quality в edge cases (Claude известен superior в some long-context reasoning)
- Mitigated: Gemini Flash-Lite adequate для structured extraction tasks
- Mitigated: Qwen3-VL-32B in-house model improves over time

Verification ongoing

CI hook checks package.json
Network egress monitoring (anthropic.com domain)
Quarterly audit of dependency tree

Caveat: ARNO core может use Claude отдельно

Этот ADR scopes только URL-import feature. ARNO core (editor, MD smart editor, etc) может independently choose Claude если требуется. URL-import pipeline isolated.

Alternatives rejected

A. Use Claude API, document conflict of interest

❌ ToS exposure remains
❌ Spec author bias persists in design

B. Use Claude initially, replace with Qwen LATER

❌ Migration cost
❌ Production data contaminated с Claude outputs (forever stored in shadow_dataset)
❌ Training pipeline could accidentally train on contaminated data

C. Use OpenAI вместо Claude

❌ Same ToS issues (OpenAI Usage Policy similar)
❌ 5-10× дороже Gemini Flash-Lite

Cross-references

Main spec § III.2 — НЕ используем list
Main spec § 0.10 — conflict of interest disclosure
ADR 0014 — Qwen teacher (positive replacement)

ADR 0014 — Legal-clean distillation: Qwen teacher, не Anthropic/OpenAI ADR 0016 — V1 staging area вместо direct git commit