June 22, 2026 Agentic Workflows

Headroom and Ponytail: Cutting Cloud Tokens Without Dumbing Down Agents

RTK shell filtering, local-offload lanes A–D, and Ponytail lite YAGNI rules — how I keep Cursor usable across 100+ projects.

cursor
ollama
tokens
agents

The bill is the architecture problem

I orchestrate 100+ project folders from one hub repo. Every git status, find, and log dump sent to a cloud model costs money and rots the context. The fix isn’t “use AI less”; it’s classify first, offload locally, and hand the cloud a TLDR.

Headroom + RTK — shrink shell output

RTK wraps everyday commands (git, grep, docker, tests) and filters out the noise, passing through only failures and summaries. Headroom proxies agent traffic and compresses context at the shell boundary. Together they routinely cut 60–90% of tokens that never should have left the machine.

Rule of thumb: if the answer fits in ten lines, the cloud agent shouldn’t see 500.
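
To make the ten-line rule concrete, here is a minimal Python sketch of failure-only filtering. It is not RTK's actual implementation: the function name filter_output, the keyword pattern, and the summary line format are all assumptions for illustration.

```python
import re

# Hypothetical noise rules: RTK's real patterns are not shown in this post.
# Prefix match, so "failed", "failures", and "errors" all count.
FAILURE_PATTERN = re.compile(r"\b(?:error|fail|fatal|traceback)", re.IGNORECASE)


def filter_output(raw: str, max_lines: int = 10) -> str:
    """Keep only failure-looking lines plus a one-line summary."""
    lines = raw.splitlines()
    hits = [line for line in lines if FAILURE_PATTERN.search(line)]
    # Reserve one line of the budget for the summary.
    kept = hits[: max_lines - 1]
    summary = f"[{len(lines)} lines in, {len(kept)} kept]"
    return "\n".join(kept + [summary])


if __name__ == "__main__":
    noisy = "\n".join(f"ok test_{i} passed" for i in range(500))
    noisy += "\nFAILED test_x - assert 1 == 2"
    print(filter_output(noisy))
```

With this sketch, 501 lines of test output shrink to the one failing line and a count, which is all the cloud agent needs to decide what to do next.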

Local-offload lanes A / B / C / D

local-offload.py classify "<task>" routes work:

Lane	Engine	Use
A	scripts only	grep, recipes, no LLM
B	qwen 7b on 4060	fast crunch
C	deepseek-r1 32b	JSON plans
D	LLaDA diffusion	long prose drafts

The cloud Cursor agent reads CLOUD_TLDR.md plus the artifact path; it never re-runs the gather step.
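
The routing idea can be sketched as a small keyword classifier. This is a guess at the shape, not the contents of local-offload.py: the keyword rules, their ordering, and the lane B default are assumptions; only the lane table itself comes from this post.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Lane:
    name: str
    engine: str
    use: str


# Lane table copied from the post.
LANES = {
    "A": Lane("A", "scripts only", "grep, recipes, no LLM"),
    "B": Lane("B", "qwen 7b on 4060", "fast crunch"),
    "C": Lane("C", "deepseek-r1 32b", "JSON plans"),
    "D": Lane("D", "LLaDA diffusion", "long prose drafts"),
}

# Hypothetical keyword rules, checked in order; the first match wins.
RULES = [
    ("D", {"draft", "essay", "prose"}),
    ("C", {"plan", "json", "schema"}),
    ("A", {"grep", "find", "list", "count"}),
]


def classify(task: str) -> Lane:
    """Route a task description to a lane by whole-word keyword match."""
    words = set(re.findall(r"[a-z0-9]+", task.lower()))
    for lane, keywords in RULES:
        if words & keywords:
            return LANES[lane]
    return LANES["B"]  # assumed default: the fast local model


if __name__ == "__main__":
    print(classify("grep for TODO across repos").name)
```

Matching on whole words rather than substrings avoids false hits such as "plan" inside "explanation". A real classifier would likely need richer signals than keywords, but the point stands: the routing decision itself should cost no cloud tokens.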

Ponytail lite — stop over-engineering

~/.cursor/rules/ponytail.mdc enforces YAGNI: no abstractions nobody asked for, deletion over addition, question complex requests before building.

The orchestrator hub gets a Ponytail lite exemption for multi-file agent infrastructure. That’s explicit infrastructure, not scope creep.

Inkwell failed me today (honest note)

I tried generating this post with local 7b and 14b via the Inkwell prompt. Both ignored the assigned topic and wrote generic “AI content creation” essays. I wrote this note manually from real stack docs. Lesson: lane D/heavy needs tighter prompt enforcement or QUILL chronicle input — the pipeline isn’t fire-and-forget yet.

Next step

Wire generate-davidcole-notes.sh to reject responses that don’t mention the POST TOPIC slug — cheap local validation before any MDX lands on the site.
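
A minimal sketch of that check, written in Python so the shell script could call it as a helper. The function name mentions_slug and the hyphenated slug format are assumptions; the post does not show how the POST TOPIC slug is actually stored.

```python
import re


def _normalize(text: str) -> str:
    """Lowercase and collapse every non-alphanumeric run to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def mentions_slug(response: str, slug: str) -> bool:
    """True if the response contains the slug's words as a phrase.

    Hyphens and punctuation are ignored, so the hypothetical slug
    "headroom-and-ponytail" matches "Headroom and Ponytail:".
    """
    phrase = _normalize(slug)
    if not phrase:
        return False  # an empty slug should never pass validation
    # Pad with spaces so the match respects word boundaries.
    return f" {phrase} " in f" {_normalize(response)} "


if __name__ == "__main__":
    draft = "AI content creation is transforming how teams work."
    if not mentions_slug(draft, "headroom-and-ponytail"):
        print("reject: draft never mentions the post topic")
```

A phrase match like this is deliberately strict. It would have rejected both off-topic Inkwell drafts, at the cost of also rejecting an on-topic draft that paraphrases the title, so a looser variant might require only some of the slug's words.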