Ask Claude to Diagnose and Fix Flaky Tests That Pass Sometimes and Fail Randomly

@recombobulate · Mar 30, 2026 · Debugging

advanced productivity testing debugging

A test that sometimes passes and sometimes fails is worse than a test that always fails — it erodes trust in your entire suite. Claude can read a flaky test, identify why it's non-deterministic, and fix the underlying issue.

> this test passes locally but fails randomly in CI —
> read it and figure out why it's flaky, then fix it

Claude reads the test, examines the setup, checks for common flakiness patterns, and identifies the root cause. Then it fixes the test to be deterministic.

The most common causes Claude spots:

> # Time-dependent tests
> The test compares against "now" — but the time between setup
> and assertion can vary, especially under CI load

> # Shared database state
> The test assumes a clean database but another test in the suite
> leaves data behind that affects this one

> # Random ordering
> The test relies on results being in a specific order but the
> database doesn't guarantee order without ORDER BY

> # Race conditions
> The test fires an async operation and checks the result
> immediately without waiting for it to complete
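To make the last pattern concrete, here is a minimal Python sketch of a race condition. The names `start_job`, `read_immediately`, and `read_after_wait` are hypothetical and the amounts are only illustrative; your test will look different, but the shape of the bug is usually the same.

```python
import threading
import time


def start_job(results, delay=0.05):
    """Simulate an async operation that writes its result after a short delay."""
    def work():
        time.sleep(delay)
        results["total"] = 99.99

    worker = threading.Thread(target=work)
    worker.start()
    return worker


def read_immediately():
    # Fragile: reads the result without waiting for the worker.
    # The value seen depends on thread scheduling, which is what makes it flaky.
    results = {"total": 0.0}
    worker = start_job(results)
    seen = results["total"]  # usually still 0.0 at this point
    worker.join()
    return seen


def read_after_wait():
    # Deterministic: block until the worker has finished, then read.
    results = {"total": 0.0}
    worker = start_job(results)
    worker.join(timeout=2)
    return results["total"]
```

The second version gives the same answer on every run because the read happens only after the write is known to be complete.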

Tell Claude about the failure pattern for better diagnosis:

> this test fails about 1 in 5 runs — here's the error output
> from the last failure. It's an assertion on the order total
> being $99.99 but sometimes it's $0.00

> this test only fails when run as part of the full suite but
> passes in isolation — what's leaking between tests?

> this test started flaking after we added the caching layer —
> it passes on first run but fails on subsequent runs
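If you don't know the failure rate yet, one way to get it is to rerun the test in a loop and count non-zero exit codes. The helper below is a rough sketch, not part of any tool; `failure_rate` is a made-up name, and the command you pass in is whatever runs your single test.

```python
import subprocess
import sys


def failure_rate(cmd, runs=20):
    """Run `cmd` repeatedly and return (failures, runs).

    A run counts as a failure when the process exits with a non-zero code,
    which is how most test runners report a failed test.
    """
    failures = 0
    for _ in range(runs):
        result = subprocess.run(cmd, capture_output=True)
        if result.returncode != 0:
            failures += 1
    return failures, runs


# Example with a stand-in command that always succeeds.
# Replace the list with your own test command.
failed, total = failure_rate([sys.executable, "-c", "pass"], runs=3)
print(f"{failed}/{total} runs failed")
```

A number like "4 of 20 runs failed" gives Claude more to work with than "fails sometimes".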

Claude fixes each pattern differently:

Time issues — freezes time with Carbon::setTestNow(), jest.useFakeTimers(), or equivalent
Shared state — adds proper teardown, uses transactions, or isolates the test
Ordering — adds explicit ORDER BY or makes assertions order-independent
Race conditions — adds proper waits, polling, or makes the operation synchronous in tests
Caching — clears caches in setup or uses a test-specific cache store
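As a rough illustration of the time fix, here is a Python sketch that passes a fixed clock into the code under test instead of reading the real one. It is an analogue of what `Carbon::setTestNow()` and `jest.useFakeTimers()` achieve, not their API; `is_expired` and the dates are made up for the example.

```python
from datetime import datetime, timedelta, timezone


def is_expired(expires_at, now=None):
    """Return True once the expiry instant has been reached.

    `now` is injectable so tests can pin the clock; production code omits it.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expires_at


def fragile_check():
    # Fragile: the clock is read twice (here and inside is_expired), so a
    # slow machine can let the 1 ms window elapse between the two reads.
    expires_at = datetime.now(timezone.utc) + timedelta(milliseconds=1)
    return is_expired(expires_at)


def deterministic_check():
    # Deterministic: both sides of the comparison use the same frozen instant.
    frozen = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
    expires_at = frozen + timedelta(milliseconds=1)
    return is_expired(expires_at, now=frozen)
```

The design choice is to make time an input rather than something the code looks up, so the test controls it completely.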

A flaky test almost always has a concrete cause: timing, shared state, ordering, a race, or a stale cache. Claude can often find it by reading the code, sometimes before you manage to reproduce the failure yourself.

via Claude Code
