You might think this is just a post about fixing a session bug. And it is. But it also looked like a great use case to test a multidisciplinary pipeline.
The bug fix is real and useful. But the deeper story is what happens when you point a local model agent at a full lifecycle: code, tests, docs, packaging, publishing—all from one trigger.
The Problem: A Session That Broke After Sleep
It started on a Tuesday. I put my computer to sleep while working in a long-running Claude Code session that was using browser automation for research. When I returned, every request failed:
An image in the conversation exceeds the dimension limit for many-image requests (2000px).
Run /compact to remove old images from context, or start a new session.
The last task had nothing to do with images. The failure was retroactive—images that worked fine during the session now triggered the API’s “many-image” dimension limit on replay.
I found the upstream issue: anthropics/claude-code#13480. It’s still open, with 35+ affected users.
The First (Incorrect) Refusal
When I asked Claude to fix the broken session, it refused:
“I don’t have the ability to modify another session’s conversation history. Claude Code sessions are isolated — I can only operate within the current session.”
This was wrong. Session data is stored in a plain JSONL file on disk at `~/.claude/projects/<project-dir>/<session-id>.jsonl`, fully readable and editable by any session.
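To see just how reachable that data is, here's a small inventory sketch. The one-directory-per-project glob pattern is an assumption based on the layout above; adjust it if your install differs:

```python
from pathlib import Path

def list_sessions(root: Path):
    """Yield (path, entry_count, size_bytes) for each session JSONL under root.

    Assumes the layout <root>/<project-dir>/<session-id>.jsonl, as observed
    in this post's ~/.claude/projects; not an official spec.
    """
    for jsonl in sorted(root.glob("*/*.jsonl")):
        entries = len(jsonl.read_text(encoding="utf-8").splitlines())
        yield jsonl, entries, jsonl.stat().st_size

# Example: point it at the real store
# for path, n, size in list_sessions(Path.home() / ".claude" / "projects"):
#     print(f"{path.stem}: {n} entries, {size / 1e6:.1f} MB")
```

Nothing more exotic than `pathlib` is needed, which is the point: any session can read any other session's history.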
After some pushing, we discovered:
- Session JSONL: 4,114 lines, 16MB, 24 JPEG screenshots
- Images nested in `tool_result` content arrays, base64-encoded
- The trigger: sleep/wake causes re-evaluation, and previously-working images now fail
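That nesting can be checked mechanically. A minimal sketch of the scan, assuming the API-style image block shape (`{"type": "image", "source": {"type": "base64", ...}}`); the JSONL wrapper fields themselves are undocumented, so the walk is deliberately shape-agnostic:

```python
import json

def count_images(jsonl_line: str) -> int:
    """Count base64 image blocks in one session entry, wherever they nest."""
    def walk(node):
        if isinstance(node, dict):
            src = node.get("source")
            if node.get("type") == "image" and isinstance(src, dict) and src.get("type") == "base64":
                yield node
            for value in node.values():
                yield from walk(value)
        elif isinstance(node, list):
            for value in node:
                yield from walk(value)
    return sum(1 for _ in walk(json.loads(jsonl_line)))
```

Running this over the 4,114-line file is how we confirmed the 24 screenshots.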
What the Community Found
I dug into the upstream issue comments. Here’s what users reported:
“many times the problem was created by the claude chrome plugin itself — nothing I did.” — @bigshiny90
“the same image that was accepted earlier now triggers the error” — @RhysBlackbeard
“I would expect /compact to at least recover from this state, but even that doesn’t work” — @stevenirby
“on claude CHROME extension, none of these solutions work. This basically makes the claude chrome extension almost completely unusable” — @George-Lovric
The workarounds were all manual:
- Edit the JSONL file by hand
- Try `/compact` (inconsistent)
- Start a new session and ask Claude to fix the old one
No one had published a reusable tool, script, or plugin.
The Gap: 35+ Users, Zero Automated Fixes
This was the opportunity: 35+ affected users, a well-understood bug, a straightforward fix (JSON manipulation with zero dependencies), and nobody had automated it.
So we built it.
The Tool: `session_rescue.py`
The tool is a single Python script using only the standard library. No pip, no venv, no dependencies.
What it does:
- Finds Claude Code sessions by ID or name
- Scans JSONL files for base64-encoded images
- Creates numbered backups before modification
- Replaces images with placeholder text
- Supports dry-run mode (no side effects)
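The heart of it is plain JSON surgery. Here's a hedged sketch of the replace-and-backup step; `strip_images` and `rescue` are illustrative names, not the script's actual internals, and the image-block shape is the API-style one assumed above:

```python
import json
from pathlib import Path

PLACEHOLDER = {"type": "text", "text": "[image removed by session-rescue]"}

def strip_images(obj):
    """Recursively replace base64 image blocks with a text placeholder."""
    if isinstance(obj, dict):
        src = obj.get("source")
        if obj.get("type") == "image" and isinstance(src, dict) and src.get("type") == "base64":
            return dict(PLACEHOLDER)
        return {key: strip_images(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [strip_images(value) for value in obj]
    return obj

def rescue(path: Path, dry_run: bool = True) -> int:
    """Rewrite a session file, numbered-backup first. Returns bytes saved."""
    original = path.read_text(encoding="utf-8")
    cleaned = "\n".join(
        json.dumps(strip_images(json.loads(line)))
        for line in original.splitlines() if line
    ) + "\n"
    if not dry_run:
        # Numbered backups: session.jsonl.bak1, .bak2, ... never overwritten
        n = 1
        while (path.parent / f"{path.name}.bak{n}").exists():
            n += 1
        (path.parent / f"{path.name}.bak{n}").write_text(original, encoding="utf-8")
        path.write_text(cleaned, encoding="utf-8")
    return len(original) - len(cleaned)
```

Dry-run mode is just "compute but don't write", which is why it has zero side effects.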
Usage:
# Find and process a specific session
uv run --script session_rescue.py --session-name "google-messages-sms-feed"
# Dry run first to see what would be changed
uv run --script session_rescue.py --session-id "03d04fb2-71eb-4929-9e65-2a698f2f337e" --dry-run
# Process all sessions with at least 10 images
uv run --script session_rescue.py --all --min-images 10
The Pipeline: From Trigger to Published Tool
Here’s where the deeper story begins. This wasn’t just about fixing a session. It was about testing what happens when you point a local model agent at a full lifecycle.
We used a TOON harness—a set of structured documentation files that give the agent everything it needs at a fraction of the token cost of unstructured docs:
- `contracts.toon`: CLI interface and data flows
- `schemas.toon`: Undocumented JSONL format
- `governance.toon`: Quality gates and policies
- `rationale.toon`: Gotchas and constraints
- `planning.toon`: Testable acceptance criteria
5,000 tokens total for the harness. Equivalent unstructured docs would be 20K+ tokens. Local models need every token to count.
The pipeline stages:
- Implement and test
- Write README and blog draft
- Extract to public GitHub repo
- Create WordPress draft
- Content verification, steering, enhancement, SEO
- Featured image generation
- Stop at approval gate (human review before publish)
The agent autonomously executed all tasks from code to publish in under two hours.
The Cost: Under $0.03 CAD
The entire pipeline ran on local hardware:
- Hardware: AMD Strix Halo, 128GB unified memory
- Model: Qwen3.5-122B (Q4_K_XL quantization)
- Electricity: Ontario off-peak at 7.4 cents per kWh
- API spend: Zero. Not a cent on subscriptions or cloud tokens once the TOON harness was in place.
About 10% of the compute cost was generating the hero image—two models, two prompts, picking the best.
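The electricity math holds up as a back-of-envelope check, assuming an average system draw of roughly 140 W for the Strix Halo box under inference load (my assumption, not a measurement):

```python
# Back-of-envelope pipeline cost: assumed ~140 W average draw,
# two hours wall-clock, Ontario off-peak rate of 7.4 cents/kWh.
watts = 140          # assumed average system draw (not measured)
hours = 2.0          # pipeline wall-clock time from this post
rate_cents = 7.4     # cents per kWh, off-peak

kwh = watts / 1000 * hours        # 0.28 kWh
cost_cents = kwh * rate_cents     # ~2.1 cents CAD
print(f"{kwh:.2f} kWh -> {cost_cents:.1f} cents CAD")
```

Even doubling the assumed draw keeps the run comfortably under the $0.03 CAD figure.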
Installation: One Command
The public repo is the distribution mechanism. To install:
# Just ask Claude Code to install the skill from the GitHub repo
# Claude will handle the rest—no pip, no venv, no deps
GitHub Repository: ianbmacdonald/claude-session-rescue
The skill follows the Claude Code skill spec: scripts/ subdirectory, SKILL.md with frontmatter, natural language install instructions.
The Lemonade Connection
We contributed to Lemonade v10.2.0, and the entire fleet runs on it. Lemonade powers the local inference that made this possible.
Coming Up: The Lemonade Appliance
In an upcoming post, we’ll dig into what happens when you treat Strix Halo as a dedicated inference appliance. The original Strix Halo post introduced the hardware. The Lemonade Appliance post covers what we built on top of it: multi-model fleet, agents, dreaming, image generation—the full stack.
The Takeaway
The bug fix is the deliverable. But the process is the story.
We used OpenClaw for this pipeline. Next, we plan to try the same TOON harness approach with OpenCode as the inner dev tool and Hermes as the outer agent platform. The goal: autonomous local overnight work on complex projects with complete data privacy.
When you have a well-understood problem, a straightforward solution, and zero automated tools in the ecosystem, there’s an opportunity. And when you combine that with local inference, structured documentation, and an autonomous agent pipeline, you can go from trigger to published tool in under two hours.
The tool is real. The pipeline is proven. And the upstream issue is still open—waiting for an official fix from Anthropic.
In the meantime, the script is available for anyone who needs it.
Related Issues:
- anthropics/claude-code#13480 — Original issue, still open
- anthropics/claude-code#46656 — Recent duplicate
- anthropics/claude-code#46132 — Related, closed
- anthropics/claude-code#29259 — Earlier report, closed