Every metric below was computed by running actual prompts through the engine and measuring the output. Numbers are committed to the repo.
| Domain | Standard CR | Pro CR | Developer CR | Fact preservation |
|---|---|---|---|---|
| Finance (n=40) | 90% | | | |
| Social / Marketing (n=40) | 90% | | | |
| Programming (n=40) | 90% | | | |
| Legal (n=40) | 89% | | | |
| Medical (n=40) | 89% | | | |
Developer tier applies domain-specific abbreviation packs (MRR, ARR, NDA, IP, SLA, BP…) + spaCy clause pruning + parenthetical removal. All processing is local — no external API calls.
| Tool | Raw JSON tokens | DSL tokens (turn 1) | CR turn 1 |
|---|---|---|---|
| search_web | 111 | 34 | 69.4% |
| get_user | 98 | 30 | 69.4% |
| create_order | 133 | 33 | 75.2% |
| run_code | 119 | 41 | 65.5% |
| query_database | 139 | 40 | 71.2% |
| send_email | 144 | 42 | 70.8% |
| Total (10 tools) | 1,220 | 373 | 69.4% |
Type elision: 61% of parameters have their type annotation dropped because it is inferrable from the name alone (user_id → str, include_orders → bool, limit → int). Enum fields are rendered inline as active|inactive|suspended instead of a full JSON array.
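
The type-elision rule can be illustrated with a small sketch. The rule set in `infer_type_from_name` and the `name:type` output syntax are assumptions made for illustration only; the engine's actual rules and DSL grammar are not shown here.

```python
from typing import Optional, Sequence


def infer_type_from_name(name: str) -> Optional[str]:
    """Guess a parameter's type from its name (hypothetical rules)."""
    if name.endswith("_id"):
        return "str"
    if name.startswith(("include_", "is_", "has_")):
        return "bool"
    if name in {"limit", "offset", "count"}:
        return "int"
    return None  # no confident guess, so the annotation must be kept


def render_param(name: str, declared_type: str,
                 enum: Optional[Sequence[str]] = None) -> str:
    """Render one parameter, dropping the type when the name implies it."""
    if enum:
        # Enum values are written inline, separated by "|".
        return f"{name}:{'|'.join(enum)}"
    if infer_type_from_name(name) == declared_type:
        # The type is inferrable from the name, so it is elided.
        return name
    return f"{name}:{declared_type}"
```

With these assumed rules, `render_param("user_id", "str")` yields `user_id`, while `render_param("query", "str")` keeps its annotation as `query:str` because nothing in the name implies the type.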
All token counts use tiktoken's cl100k_base encoding (OpenAI's tokenizer for GPT-4, used here as an approximation for Claude). Older experiments used a words × 1.3 heuristic and are clearly marked as such.
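
For reference, the two counting methods might look like the sketch below. The function names are hypothetical, and rounding the heuristic to the nearest integer is an assumption; the `tiktoken` calls (`get_encoding`, `encode`) are the library's public API.

```python
def heuristic_token_count(text: str) -> int:
    """Older estimate: whitespace-separated words times 1.3.

    Rounding to the nearest integer is an assumption of this sketch.
    """
    return round(len(text.split()) * 1.3)


def tiktoken_token_count(text: str) -> int:
    """Token count using tiktoken's cl100k_base encoding."""
    import tiktoken  # third-party: pip install tiktoken

    encoding = tiktoken.get_encoding("cl100k_base")
    return len(encoding.encode(text))
```

The heuristic needs no dependencies, which is convenient for quick estimates, but only the tokenizer-based count is comparable with the figures in the tables.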
CR = (1 - compressed_tokens / original_tokens) × 100, measured on the prompt sent to the LLM, not the response.
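
As a quick check of the formula, the following minimal sketch recomputes two rows of the tool table. The function name `compression_ratio` is illustrative and not taken from the benchmark scripts.

```python
def compression_ratio(original_tokens: int, compressed_tokens: int) -> float:
    """Return the compression ratio as a percentage of tokens removed."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return (1 - compressed_tokens / original_tokens) * 100


# Values taken from the tool table: search_web and the 10-tool total.
search_web_cr = round(compression_ratio(111, 34), 1)
total_cr = round(compression_ratio(1220, 373), 1)
```

Both values round to 69.4, matching the table.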
Facts (numbers, URLs, named entities, and quoted strings) are extracted from the original with regexes, and each one is checked for presence in the compressed output (or its decoded form). FPR = retained / total × 100.
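
A simplified sketch of this check follows. The patterns are assumptions written for illustration, not the benchmark's actual regexes, and named-entity extraction is omitted to keep the example short.

```python
import re

# Hypothetical patterns covering URLs, double-quoted strings, and numbers.
FACT_PATTERNS = [
    re.compile(r"https?://[^\s\"'<>]+[^\s\"'<>.,;:)]"),  # URL, no trailing punctuation
    re.compile(r"\"([^\"]+)\""),                          # quoted string (inner text)
    re.compile(r"\b\d+(?:[.,]\d+)*%?"),                   # 40, 12,500, 3.5, 40%
]


def extract_facts(text: str) -> set:
    """Collect the distinct fact strings found in the text."""
    facts = set()
    for pattern in FACT_PATTERNS:
        for match in pattern.finditer(text):
            # Use the captured group when the pattern defines one.
            facts.add(match.group(1) if pattern.groups else match.group(0))
    return facts


def fact_preservation_rate(original: str, compressed: str) -> float:
    """Percentage of facts from the original that appear in the compressed text."""
    facts = extract_facts(original)
    if not facts:
        return 100.0  # nothing to preserve
    retained = sum(1 for fact in facts if fact in compressed)
    return retained / len(facts) * 100


original = ('Revenue grew 40% to 12,500 units; see '
            'https://example.com/report and the "Q3 outlook" memo.')
compressed = "Rev +40% to 12,500 units; https://example.com/report"
fpr = fact_preservation_rate(original, compressed)
```

Here four facts are extracted and the quoted string is lost, so `fpr` is 75.0. If the compressed output uses abbreviations, the caller would pass its decoded form as `compressed`.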
Pro output is passed through: (1) domain pack — 60+ domain-specific abbreviations detected via keyword scoring, (2) spaCy dependency parse — non-restrictive relative clauses and low-information adverbial clauses removed, (3) parenthetical pruner — non-numeric parentheticals ≥4 words removed. Fully local, no external API calls.
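
Of the three stages, the parenthetical pruner is the simplest to illustrate. The sketch below is one possible reading of the stated rule (drop parentheticals of four or more words that contain no digits); the function name and regex are assumptions, and nested parentheses are not handled.

```python
import re

# Matches one non-nested parenthetical plus any whitespace before it.
PARENTHETICAL = re.compile(r"\s*\(([^()]*)\)")


def prune_parentheticals(text: str, min_words: int = 4) -> str:
    """Remove non-numeric parentheticals with at least min_words words."""

    def replace(match):
        inner = match.group(1)
        has_digit = any(ch.isdigit() for ch in inner)
        if not has_digit and len(inner.split()) >= min_words:
            return ""          # long and non-numeric: drop it
        return match.group(0)  # short or numeric: keep unchanged

    return PARENTHETICAL.sub(replace, text)


sample = ("The contract (which was signed by both parties) renews in 2025 "
          "(up 12% from last year) annually (see appendix).")
pruned = prune_parentheticals(sample)
```

Only the first parenthetical is removed: the second contains numbers and the third is shorter than four words, so both survive, which is consistent with keeping numeric facts intact.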
Prompt benchmark: domain_experiment_v4.py · Tool benchmark: tool_compression_experiment.py · Engine: engine_v4.py · API: POST /compress-tools