- Date: 2026-05-22
- Status: Accepted
- Feature: URL-import
- Affects: url_import_spec.md § III.2
Context
Claude was originally proposed for three roles in the URL-import stack:
- Primary VLM for vision analysis
- Text LLM for component enrichment
- Code gen for TSX generation
Two independent reasons exclude Claude entirely:
Reason 1: Conflict of interest
This spec was written by an AI assistant (Claude). Claude recommended using the Claude API, which is a conflict of interest: it is not my place to give competitive advice.
"I (Claude) originally recommended Claude Sonnet 4.6 as the primary VLM, which is a conflict of interest."
The user's own research then found that Gemini 2.5 Flash-Lite is 30× cheaper at comparable quality for this task. It is objectively the better choice under the cost-conscious requirement.
Reason 2: Anthropic ToS (Feb 2026)
"You may not use outputs from Anthropic services to train, fine-tune,
or develop AI/ML models that compete with our services, or extract
embeddings or representations for downstream model training."
URL-import has a data flywheel (ADR 0013): production extractions train the in-house model (ADR 0014). If Claude is in the stack, even as inference-only, there is a risk of a ToS violation:
- A production extraction calls Claude → the response is stored in shadow_dataset
- The training pipeline reads shadow_dataset → it may train on Claude outputs
- Even if unintentional → ToS violation
The clean solution: Claude is not in the stack at all. No risk of contamination.
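The contamination path above can be made checkable with a provenance filter in front of training. This is a minimal sketch, not the actual pipeline: the record shape (`id`, `provider`, `output`) and the `provider` tag are assumptions introduced for illustration, and nothing in the spec confirms that shadow_dataset stores such a field.

```javascript
// Hypothetical record shape: { id, provider, output }.
// `provider` is an assumed provenance tag written at extraction time.
const FORBIDDEN_PROVIDERS = ['anthropic', 'claude'];

function isTrainable(record) {
  // Records without provenance are rejected: unknown origin cannot be proven clean.
  if (typeof record.provider !== 'string' || record.provider === '') {
    return false;
  }
  const provider = record.provider.toLowerCase();
  return !FORBIDDEN_PROVIDERS.some((name) => provider.includes(name));
}

// Returns only the records that are safe to pass to the training pipeline.
function selectTrainingRecords(shadowDataset) {
  return shadowDataset.filter(isTrainable);
}
```

With Claude excluded from the stack the filter should never drop anything for that reason; it serves as a defense-in-depth check rather than the primary control.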
Decision
The Claude API is excluded entirely from the URL-import pipeline:
| Layer | Was proposed | Decided |
|---|---|---|
| Primary VLM | Claude Sonnet 4.6 vision | Gemini 2.5 Flash-Lite |
| Text LLM enrichment | Claude Haiku | Gemini 2.5 Flash-Lite |
| Code gen | Claude (any) | Cerebras (Llama 3.1 70B) free → paid |
| Distillation teacher | NEVER Claude | Qwen3-VL-235B Apache 2.0 |
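One way to keep the table enforceable is to mirror it in a configuration object and assert on it at startup. The sketch below is illustrative only: the key names, the `provider`/`model` fields, and the provider labels are assumptions, while the model names are taken from the "Decided" column.

```javascript
// Layer-to-model mapping taken from the Decision table.
// Key names and provider labels are illustrative assumptions.
const URL_IMPORT_STACK = Object.freeze({
  primaryVlm: { provider: 'google', model: 'Gemini 2.5 Flash-Lite' },
  textEnrichment: { provider: 'google', model: 'Gemini 2.5 Flash-Lite' },
  codeGen: { provider: 'cerebras', model: 'Llama 3.1 70B' },
  distillationTeacher: { provider: 'qwen', model: 'Qwen3-VL-235B' },
});

// Throws if any layer names an Anthropic provider or a Claude model.
function assertNoClaude(stack) {
  for (const [layer, { provider, model }] of Object.entries(stack)) {
    if (/anthropic|claude/i.test(`${provider} ${model}`)) {
      throw new Error(`Layer "${layer}" uses ${model}; see ADR 0015.`);
    }
  }
}

assertNoClaude(URL_IMPORT_STACK);
```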
Verification
The pipeline must periodically verify that Claude does not contaminate the stack:
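// Pre-deployment check (CI hook): fail the build on a direct Anthropic SDK dependency.
function verifyNoClaudeImports(packageJson) {
  const forbidden = ['@anthropic-ai/sdk', 'anthropic'];
  for (const dep of forbidden) {
    if (packageJson.dependencies?.[dep] || packageJson.devDependencies?.[dep]) {
      throw new Error(`Forbidden dependency: ${dep}. See ADR 0015.`);
    }
  }
}

// Runtime check (network egress monitor).
// `request` and `alert` are supplied by the caller; their shapes are assumed here.
function alertOnClaudeApiCalls(request, alert) {
  const host = request.host.toLowerCase();
  // Match the domain itself or any subdomain, not arbitrary substrings.
  if (host === 'anthropic.com' || host.endsWith('.anthropic.com')) {
    alert.send('ADR 0015 violation: Claude API call detected');
  }
}
Consequences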
Pros:
- Zero ToS exposure
- Zero conflict-of-interest concerns in documentation
- Forces competitive cost analysis (resulted in Gemini Flash-Lite at 30× lower cost)
- The pipeline is transparent: the caller knows it is not Claude
Cons:
- Loses Claude quality in edge cases (Claude is known to be superior in some long-context reasoning)
- Mitigated: Gemini Flash-Lite is adequate for structured extraction tasks
- Mitigated: the Qwen3-VL-32B in-house model improves over time
Verification ongoing
- CI hook checks package.json
- Network egress monitoring (anthropic.com domain)
- Quarterly audit of dependency tree
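The CI hook only sees direct dependencies, so the dependency-tree audit also has to cover transitive ones. A minimal sketch follows, assuming an npm `package-lock.json` with lockfileVersion 2 or 3, where the `packages` object is keyed by install path; the function name is hypothetical.

```javascript
// Assumes npm lockfileVersion 2 or 3, where `packages` is keyed by install path,
// e.g. "node_modules/foo/node_modules/@anthropic-ai/sdk".
const FORBIDDEN_PACKAGES = ['@anthropic-ai/sdk', 'anthropic'];
const NODE_MODULES = 'node_modules/';

function findForbiddenInLockfile(lockfile) {
  const hits = [];
  for (const installPath of Object.keys(lockfile.packages ?? {})) {
    // The package name is whatever follows the last "node_modules/" segment.
    const idx = installPath.lastIndexOf(NODE_MODULES);
    if (idx === -1) continue; // root project or workspace entry
    const name = installPath.slice(idx + NODE_MODULES.length);
    if (FORBIDDEN_PACKAGES.includes(name)) {
      hits.push(installPath);
    }
  }
  return hits;
}
```

An audit script could read the lockfile with `fs.readFileSync`, parse it with `JSON.parse`, and fail when the returned list is non-empty. Other package managers use different lockfile formats and would need their own parser.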
Caveat: ARNO core may use Claude separately
This ADR is scoped to the URL-import feature only. ARNO core (editor, MD smart editor, etc.) may independently choose Claude if required. The URL-import pipeline is isolated.
Alternatives rejected
A. Use Claude API, document conflict of interest
- ❌ ToS exposure remains
- ❌ Spec author bias persists in design
B. Use Claude initially, replace with Qwen LATER
- ❌ Migration cost
- ❌ Production data contaminated with Claude outputs (stored in shadow_dataset forever)
- ❌ Training pipeline could accidentally train on contaminated data
C. Use OpenAI instead of Claude
- ❌ Same ToS issues (OpenAI Usage Policy is similar)
- ❌ 5-10× more expensive than Gemini Flash-Lite
Cross-references
- Main spec § III.2: "Do NOT use" list
- Main spec § 0.10: conflict of interest disclosure
- ADR 0014: Qwen teacher (positive replacement)