
Battle-Tested: Fixing Flaky Playwright Tests and Async Race Conditions


Had one of those satisfying debugging sessions today where everything finally clicks into place. Been wrestling with flaky Playwright tests on the honeybun dashboard, and I'm happy to report we're now sitting pretty at 28/28 tests passing consistently.

The Infrastructure Fix

First victory was sorting out the test infrastructure. The webServer was being stubborn about serving files, throwing 404s left and right for JS modules. Turned out I needed to serve from the project root instead of trying to be clever about it. Sometimes the simple solution is the right one.

With the server rooted at the project root, the pages live under /frontend/tools/, so I updated all the page.goto() calls across 4 test files to use that prefix, and suddenly everything started behaving.
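For anyone hitting the same 404s, here's roughly what that setup looks like in playwright.config.js. This is a minimal sketch rather than the exact config from the project: the http-server command, port 8080, the testDir value, and the dashboard.html filename are all stand-ins.

```javascript
// playwright.config.js (sketch; server command, port, and testDir are assumptions)
const config = {
  testDir: './tests', // assumption: wherever the spec files live
  webServer: {
    // Serve the whole project root so JS module imports resolve
    // instead of returning 404. Any static file server would do here.
    command: 'npx http-server . -p 8080',
    url: 'http://localhost:8080',
    reuseExistingServer: !process.env.CI,
  },
  use: {
    // Lets tests call page.goto() with a root-relative path, e.g.
    // page.goto('/frontend/tools/dashboard.html') (hypothetical filename).
    baseURL: 'http://localhost:8080',
  },
};

module.exports = config;
```

Because baseURL points at the server root, every page.goto() path has to spell out the full /frontend/tools/ prefix.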

The Google Integration Rewrite

The real meat of today's work was completely rewriting google-integration.spec.js. The old version was a mess of incorrect selectors and wrong assumptions:

  • Fixed attribute selector to use data-client instead of data-client-id
  • Corrected route pattern to **/analytics/discover/* (was using query params before)
  • This is where it gets interesting: rewrote the expandCard helper to use page.evaluate with JS injection instead of simple clicks

Why the JS injection? Because clicking on .card-company triggers toggleCard, which has the side effect of auto-calling discoverGoogleServices when the panel opens. This was pre-opening panels before my badge click tests could run properly. Sometimes you need to be surgical about how you interact with the UI.
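A rough sketch of the idea, not the actual helper from the repo. It assumes the card is found by its data-client attribute and that adding an "expanded" class is what reveals the panel; the real dashboard may well use a different mechanism.

```javascript
// Expand a card without clicking .card-company, so toggleCard
// (and its discoverGoogleServices side effect) never runs.
// Assumption: an "expanded" class on the card is what reveals its panel.
async function expandCard(page, clientName) {
  // The callback below runs inside the browser page, not in Node.
  const found = await page.evaluate((name) => {
    const card = Array.from(document.querySelectorAll('[data-client]'))
      .find((el) => el.getAttribute('data-client') === name);
    if (!card) return false;
    card.classList.add('expanded');
    return true;
  }, clientName);

  // Fail loudly so a wrong selector doesn't look like a passing test.
  if (!found) {
    throw new Error(`No card found with data-client="${clientName}"`);
  }
}
```

Throwing when no card matches makes a stale selector (like the old data-client-id) show up immediately, rather than as a confusing timeout later in the test.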

Hunting Down the Flaky Tests

Two particularly stubborn flaky tests taught me some lessons:

The Case Sensitivity Gotcha: One test was checking if a dialog message contained 'disconnect' (lowercase), but dialog.message() actually returns 'Disconnect' with a capital D. Added .toLowerCase() to the message before comparing and boom, fixed.
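In sketch form, the check boils down to normalizing both sides before comparing. The dialogMentions name is made up for illustration; the only thing it relies on is that the dialog object exposes message().

```javascript
// Case-insensitive check on a dialog's message.
// `dialog` is anything with a message() method, such as Playwright's Dialog.
// (dialogMentions is a hypothetical helper name.)
function dialogMentions(dialog, word) {
  return dialog.message().toLowerCase().includes(word.toLowerCase());
}

// Usage inside a test (sketch):
// page.once('dialog', async (dialog) => {
//   expect(dialogMentions(dialog, 'disconnect')).toBe(true);
//   await dialog.accept();
// });
```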

The Race Condition From Hell: This one was trickier. I had a route override for /clients that was getting registered before expandCard ran. This caused fetchClients to re-render and remove the showGoogleStatus badge before the test could click it.

The fix? Move the route override inside an if block, after confirming the panel is visible with expect(panel).toBeVisible(). Order matters with async operations!
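The ordering can be sketched like this. To keep the snippet free of imports it gates on locator.waitFor({ state: 'visible' }) rather than expect(panel).toBeVisible(), which is a swap on my part. The function name, the '**/clients' glob, and the mock payload are assumptions too.

```javascript
// Register the /clients override only once the panel is on screen.
// Assumptions: `panel` is a Playwright Locator, `clients` is the mock payload,
// and the '**/clients' glob matches the real endpoint.
async function overrideClientsWhenPanelVisible(page, panel, clients) {
  // Pre-condition first: resolves once the panel is visible, rejects on timeout.
  await panel.waitFor({ state: 'visible' });

  // Only now install the mock, so the mocked response cannot
  // influence rendering before the panel has opened.
  await page.route('**/clients', (route) =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify(clients),
    })
  );
}
```

In a test, this would be called after expandCard and before the action that is supposed to trigger the mocked /clients request.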

The Async Content Fix

Also learned to use expect(locator).toContainText(str, {timeout}) instead of grabbing textContent() directly. The former has built-in retry logic for async DOM content, while the latter is just a one-shot check.
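To make the difference concrete, here's a hand-rolled polling helper that mimics the retry behavior. It is a conceptual illustration only, not how Playwright implements its assertions; waitForText, the badge locator, and the 'Connected' string are all made up for the example.

```javascript
// Poll `readText` until its result contains `expected`, or give up after `timeout` ms.
// Conceptual stand-in for a retrying assertion; not Playwright's implementation.
async function waitForText(readText, expected, { timeout = 5000, interval = 50 } = {}) {
  const deadline = Date.now() + timeout;
  let last = '';
  while (true) {
    // textContent() can resolve to null, so normalize to a string.
    last = (await readText()) ?? '';
    if (last.includes(expected)) return last;
    if (Date.now() >= deadline) {
      throw new Error(`Timed out waiting for "${expected}"; last text was "${last}"`);
    }
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
}

// One-shot read, may run before the async content arrives:
//   const text = await badge.textContent();
// Hand-rolled retry using the helper above:
//   await waitForText(() => badge.textContent(), 'Connected');
// Playwright's built-in equivalent, which is what the tests use:
//   await expect(badge).toContainText('Connected', { timeout: 5000 });
```

In real tests the built-in assertion is the better choice; the helper just shows why a single textContent() call loses the race when the DOM updates a moment later.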

Key Takeaways

  • When mocking routes in Playwright, register post-action mocks AFTER pre-conditions are confirmed
  • JS injection with page.evaluate gives you surgical control when regular clicks have unwanted side effects
  • Always account for async content with proper retry mechanisms
  • Case sensitivity will bite you when you least expect it

Deployed both commits to production and everything's running smoothly. There's something deeply satisfying about seeing that green "28/28 tests passing" status after a good debugging session.

Next up: back to feature work on the dashboard, but now with the confidence that my tests won't randomly fail on me.
