Documenting 100+ Projects Without Burning Cloud Tokens
How I document an entire workspace locally first — gather scripts, R1 planning, 14b patches — and only hand summaries to the cloud agent.
- documentation
- ollama
- cursor
- local-first
For a solo developer or small studio, documenting a large number of projects is daunting, especially when every cloud token costs money. In my latest workflow overhaul, I documented more than 100 projects with local-first tools and kept cloud token usage to a minimum. Besides saving money, this keeps project details on my own machine, where they stay available offline.
Local-First Documentation Pipeline
To do this, I wrote a Python script, project-doc-gather.py, that collects documentation inputs from workspace domains 01–05. Its output goes to DeepSeek R1 for planning and then to qwen2.5:14b for the JSON patches. The cloud Cursor agent receives only the handoff artifacts, which saves roughly 85-90% of tokens compared with dumping raw grep output into the cloud.
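The two-stage handoff described above can be sketched against Ollama's local HTTP API. This is a minimal sketch under assumptions, not the actual script: the model tag `deepseek-r1`, the prompt wording, and the helper names `build_request`, `call_ollama`, and `plan_then_patch` are all hypothetical. Only the `/api/generate` endpoint and its `model`, `prompt`, `stream`, and `format` fields come from Ollama itself.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt, want_json=False):
    """Build the JSON body for a non-streaming Ollama generate call."""
    body = {"model": model, "prompt": prompt, "stream": False}
    if want_json:
        body["format"] = "json"  # ask Ollama to constrain output to valid JSON
    return body


def call_ollama(body, url=OLLAMA_URL, timeout=600.0):
    """POST the body to a local Ollama server and return the generated text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


def plan_then_patch(gathered, send=call_ollama):
    """Stage 1: the planner writes a plan. Stage 2: the 14b model emits JSON."""
    # Model tags and prompts are assumptions made for illustration.
    plan = send(build_request(
        "deepseek-r1", f"Plan documentation updates for:\n{gathered}"))
    raw = send(build_request(
        "qwen2.5:14b", f"Emit JSON patches for this plan:\n{plan}",
        want_json=True))
    return {"plan": plan, "patches": json.loads(raw)}
```

Passing `send` as a parameter keeps the network call swappable, so the flow can be exercised with a stub before any model is loaded.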
The pipeline operates on three lanes:
- Lane A: Grep and context extraction without LLM involvement.
- Lane B: Ollama with warm-started 7b/14b models, so routine work spends local rather than cloud tokens.
- Lane C: Heavy queue offloading to MSI systems when necessary.
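Lane A is the easiest lane to illustrate, since it involves no model at all. The following is a rough sketch of grep-style context extraction, not the contents of project-doc-gather.py: the function name `gather_context`, the file suffixes, and the two-line context window are assumptions chosen for the example.

```python
import re
from pathlib import Path


def gather_context(root, pattern, suffixes=(".md", ".py", ".sh"), context=2):
    """Return grep-style hits with surrounding lines; no LLM involved."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only scan regular files with a suffix we care about.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        for i, line in enumerate(lines):
            if regex.search(line):
                # Clamp the context window to the file boundaries.
                lo = max(0, i - context)
                hi = min(len(lines), i + context + 1)
                hits.append({
                    "file": str(path.relative_to(root)),
                    "line": i + 1,  # 1-based, as grep -n reports it
                    "snippet": "\n".join(lines[lo:hi]),
                })
    return hits
```

Each hit carries only a few lines of context, which is what keeps the later prompts small compared with pasting whole files.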
On June 15th, I ran a full documentation pass across all workspace domains. It used approximately 72K local prompt tokens and 34 minutes of GPU time. All of that ran locally, so the cloud agent only had to process the resulting handoff artifacts.
What I’d Do Differently
Next pass: auto-apply file_patches from the 14b JSON so cloud turns are handoff-only. I’d also route doc tasks through project-doc-local.sh in classify instead of the generic fast lane.
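Auto-applying the patches could look something like the sketch below. The schema is an assumption: the post only names a `file_patches` key, so the `path`, `find`, and `replace` fields, the function name `apply_file_patches`, and the dry-run default are all hypothetical.

```python
import json
from pathlib import Path


def apply_file_patches(patch_json, root, dry_run=True):
    """Apply find/replace patches under root; return (path, status) pairs."""
    root_path = Path(root).resolve()
    results = []
    for patch in json.loads(patch_json).get("file_patches", []):
        target = (root_path / patch["path"]).resolve()
        # Refuse paths that escape the workspace root.
        if root_path not in target.parents:
            results.append((patch["path"], "rejected: outside root"))
            continue
        if not target.is_file():
            results.append((patch["path"], "skipped: missing file"))
            continue
        text = target.read_text(encoding="utf-8")
        # Require exactly one match so a vague anchor cannot rewrite
        # the wrong part of the file.
        if text.count(patch["find"]) != 1:
            results.append((patch["path"], "skipped: anchor not unique"))
            continue
        if not dry_run:
            target.write_text(
                text.replace(patch["find"], patch["replace"]),
                encoding="utf-8")
        results.append(
            (patch["path"], "would apply" if dry_run else "applied"))
    return results
```

Defaulting to a dry run means the first pass only reports what the 14b output would change. That seems a reasonable safeguard when a local model is writing to disk unattended.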