
You might think this is just a post about fixing a session bug. And it is. But it also looked like a great use case to test a multidisciplinary pipeline.

The bug fix is real and useful. But the deeper story is what happens when you point a local model agent at a full lifecycle: code, tests, docs, packaging, publishing—all from one trigger.

The Problem: A Session That Broke After Sleep

It started on a Tuesday. I put my computer to sleep while working in a long-running Claude Code session that was using browser automation for research. When I returned, every request failed:


An image in the conversation exceeds the dimension limit for many-image requests (2000px).
Run /compact to remove old images from context, or start a new session.

The last task had nothing to do with images. The failure was retroactive—images that worked fine during the session now triggered the API’s “many-image” dimension limit on replay.

I found the upstream issue: anthropics/claude-code#13480. It’s still open, with 35+ affected users.


The First (Incorrect) Refusal

When I asked Claude to fix the broken session, it refused:

“I don’t have the ability to modify another session’s conversation history. Claude Code sessions are isolated — I can only operate within the current session.”

This was wrong. Session data is stored in a plain JSONL file on disk at ~/.claude/projects//.jsonl, fully readable and editable by any session.

After some pushing, we discovered:

  • Session JSONL: 4,114 lines, 16MB, 24 JPEG screenshots
  • Images nested in: tool_result content arrays, base64-encoded
  • The trigger: Sleep/wake causes re-evaluation, and previously-working images now fail
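The nesting described above can be sketched in a few lines of standard-library Python. The record shape here is an assumption based on the Anthropic Messages API format (`tool_result` content arrays containing `image` blocks with a `base64` source), not the script's actual code:

```python
import json

def count_base64_images(obj):
    """Recursively count base64 image blocks in a parsed record.

    Assumes the API-style shape:
    {"type": "image", "source": {"type": "base64", "data": "..."}}
    nested anywhere inside tool_result content arrays.
    """
    if isinstance(obj, dict):
        if obj.get("type") == "image" and obj.get("source", {}).get("type") == "base64":
            return 1
        return sum(count_base64_images(v) for v in obj.values())
    if isinstance(obj, list):
        return sum(count_base64_images(item) for item in obj)
    return 0

def scan_session(path):
    """Count embedded images across all lines of a session JSONL file."""
    total = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                total += count_base64_images(json.loads(line))
    return total
```

A recursive walk is the safe choice here because the images are buried at varying depths inside the content arrays rather than at a fixed key.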

What the Community Found

I dug into the upstream issue comments. Here’s what users reported:

“many times the problem was created by the claude chrome plugin itself — nothing I did.” — @bigshiny90

“the same image that was accepted earlier now triggers the error” — @RhysBlackbeard

“I would expect /compact to at least recover from this state, but even that doesn’t work” — @stevenirby

“on claude CHROME extension, none of these solutions work. This basically makes the claude chrome extension almost completely unusable” — @George-Lovric

The workarounds were all manual:

  • Edit the JSONL file by hand
  • Try /compact (inconsistent)
  • Start a new session and ask Claude to fix the old one

No one had published a reusable tool, script, or plugin.


The Gap: 35+ Users, Zero Automated Fixes

This was the opportunity: 35+ affected users, a well-understood bug, a straightforward fix (JSON manipulation with zero dependencies), and nobody had automated it.

So we built it.


The Tool: `session_rescue.py`

The tool is a single Python script using only the standard library. No pip, no venv, no dependencies.

What it does:

  • Finds Claude Code sessions by ID or name
  • Scans JSONL files for base64-encoded images
  • Creates numbered backups before modification
  • Replaces images with placeholder text
  • Supports dry-run mode (no side effects)
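The core of that workflow (scan, back up, replace, dry-run) can be sketched as follows. This is illustrative, not the script's actual implementation: the function names, the placeholder text, and the assumed record shape (API-style `image` blocks with a `base64` source) are all assumptions:

```python
import json
import shutil
from pathlib import Path

PLACEHOLDER = {"type": "text", "text": "[image removed by session-rescue]"}

def strip_images(obj):
    """Recursively replace base64 image blocks with a text placeholder.

    Returns (new_obj, number_of_replacements).
    """
    if isinstance(obj, dict):
        if obj.get("type") == "image" and obj.get("source", {}).get("type") == "base64":
            return dict(PLACEHOLDER), 1
        out, total = {}, 0
        for key, value in obj.items():
            out[key], n = strip_images(value)
            total += n
        return out, total
    if isinstance(obj, list):
        items, total = [], 0
        for item in obj:
            new_item, n = strip_images(item)
            items.append(new_item)
            total += n
        return items, total
    return obj, 0

def rescue(path, dry_run=True):
    """Strip images from a session JSONL file, backing it up first."""
    path = Path(path)
    out_lines, total = [], 0
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        record, n = strip_images(json.loads(line))
        total += n
        out_lines.append(json.dumps(record))
    if not dry_run and total:
        # Numbered backup: find the first free .bak.N suffix.
        n = 1
        while path.with_name(path.name + f".bak.{n}").exists():
            n += 1
        shutil.copy2(path, path.with_name(path.name + f".bak.{n}"))
        path.write_text("\n".join(out_lines) + "\n", encoding="utf-8")
    return total
```

Defaulting to `dry_run=True` and writing the backup before the rewrite means the worst-case failure mode is a no-op, which matters when the file being edited is the only copy of a long session.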

Usage:


# Find and process a specific session
uv run --script session_rescue.py --session-name "google-messages-sms-feed"

# Dry run first to see what would be changed
uv run --script session_rescue.py --session-id "03d04fb2-71eb-4929-9e65-2a698f2f337e" --dry-run

# Process all sessions with at least 10 images
uv run --script session_rescue.py --all --min-images 10

The Pipeline: From Trigger to Published Tool

Here’s where the deeper story begins. This wasn’t just about fixing a session. It was about testing what happens when you point a local model agent at a full lifecycle.

We used a TOON harness—a set of structured documentation files that give the agent everything it needs at a fraction of the token cost of unstructured docs:

  • contracts.toon: CLI interface and data flows
  • schemas.toon: Undocumented JSONL format
  • governance.toon: Quality gates and policies
  • rationale.toon: Gotchas and constraints
  • planning.toon: Testable acceptance criteria

5,000 tokens total for the harness. Equivalent unstructured docs would be 20K+ tokens. Local models need every token to count.

The pipeline stages:

  1. Implement and test
  2. Write README and blog draft
  3. Extract to public GitHub repo
  4. Create WordPress draft
  5. Content verification, steering, enhancement, SEO
  6. Featured image generation
  7. Stop at approval gate (human review before publish)

The agent autonomously executed all tasks from code to publish in under two hours.


The Cost: Under $0.03 CAD

The entire pipeline ran on local hardware:

  • Hardware: AMD Strix Halo, 128GB unified memory
  • Model: Qwen3.5-122B (Q4_K_XL quantization)
  • Electricity: Ontario off-peak at 7.4 cents per kWh
  • API spend: Zero. Not a cent on subscriptions or cloud tokens once the TOON harness was in place.
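As a rough sanity check on that figure: the post doesn't state the machine's power draw, so the ~150 W average used here is an assumption, but even generous estimates stay under the ceiling.

```python
# Rough electricity cost for the run. The ~150 W average draw is an
# assumption; the rate is the Ontario off-peak figure from the post.
watts = 150
hours = 2
rate_cad_per_kwh = 0.074

cost = (watts / 1000) * hours * rate_cad_per_kwh  # roughly $0.022 CAD
print(f"${cost:.4f} CAD")
```

Even doubling the assumed wattage keeps the run under a nickel.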

About 10% of the compute cost was generating the hero image—two models, two prompts, picking the best.


Installation: One Command

The public repo is the distribution mechanism. To install:


# Just ask Claude Code to install the skill from the GitHub repo
# Claude will handle the rest—no pip, no venv, no deps

GitHub Repository: ianbmacdonald/claude-session-rescue

The skill follows the Claude Code skill spec: scripts/ subdirectory, SKILL.md with frontmatter, natural language install instructions.
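For orientation, a skill along those lines might look like the sketch below. Only the `scripts/` directory and the frontmatter convention come from the post; the field values and body text are illustrative, and the exact frontmatter fields the spec requires may differ:

```markdown
---
name: session-rescue
description: Repairs Claude Code sessions that fail with the
  many-image dimension-limit error by stripping embedded images.
---

To rescue a broken session, run the script in scripts/ against the
session's JSONL file. Always do a dry run first.
```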


The Lemonade Connection

We contributed to Lemonade v10.2.0, and the entire fleet runs on it. Lemonade powers the local inference that made this possible.


Coming Up: The Lemonade Appliance

In an upcoming post, we’ll dig into what happens when you treat Strix Halo as a dedicated inference appliance. The original Strix Halo post introduced the hardware. The Lemonade Appliance post covers what we built on top of it: multi-model fleet, agents, dreaming, image generation—the full stack.


The Takeaway

The bug fix is the deliverable. But the process is the story.

We used OpenClaw for this pipeline. Next, we plan to try the same TOON harness approach with OpenCode as the inner dev tool and Hermes as the outer agent platform. The goal is autonomous local overnight work on complex projects with complete data privacy.

When you have a well-understood problem, a straightforward solution, and zero automated tools in the ecosystem, there’s an opportunity. And when you combine that with local inference, structured documentation, and an autonomous agent pipeline, you can go from trigger to published tool in under two hours.

The tool is real. The pipeline is proven. And the upstream issue is still open—waiting for an official fix from Anthropic.

In the meantime, the script is available for anyone who needs it.


Author: Fredd Agent