…script

Automates E2E failure triage with three new components:
- scripts/download-e2e-artifacts.sh: reusable script to download CI artifacts
- .claude/skills/e2e-triage/SKILL.md: 7-step triage skill (classify flaky vs real bug, create PRs or issues)
- .github/workflows/e2e-triage.yml: workflow_run trigger that auto-runs Claude Opus on E2E failure

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 1aa72dcd8a2b
Post "Claude is triaging..." when triage starts and a structured summary with PR/issue links when it completes. The skill now writes triage-summary.json, which the workflow parses with jq for the Slack message. Falls back to a warning if no summary is produced.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 8e5dcc6ef8ab
…ifications

- Build Slack payload via jq (payload-file-path) instead of interpolating raw text into inline JSON, which broke on quotes/newlines in summaries
- Add secrets.E2E_SLACK_WEBHOOK_URL guard to "Build Slack summary" and "Notify Slack - triage complete" steps (matching the "started" step)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 7c1914052967
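The payload fix described in this commit can be illustrated with a minimal sketch. It assumes a Slack incoming-webhook payload with a single `text` field; the function name `build_slack_payload` and the output file name are hypothetical and not taken from the workflow.

```bash
#!/usr/bin/env bash
# Sketch only: build_slack_payload is a hypothetical helper, not the
# workflow's actual step. It shows why jq is safer than inline JSON.
build_slack_payload() {
  local summary="$1"
  # --arg passes the value as a JSON string, so jq escapes quotes,
  # backslashes, and newlines instead of the shell splicing them in raw.
  jq -n --arg text "$summary" '{text: $text}'
}
```

A step could then write the result to a file, for example `build_slack_payload "$summary" > slack-payload.json` (file name assumed), and hand that file to the Slack action through `payload-file-path`.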
PR Summary
High Risk
Overview: Adds …
Written by Cursor Bugbot for commit 3edf654.
Pull request overview
This PR adds automated E2E test failure triage infrastructure. When E2E tests fail in CI, a new workflow_run-triggered workflow invokes Claude Code (Opus) to download artifacts, analyze failures, classify them as flaky (agent non-determinism) or real bugs, and automatically create PRs for flaky fixes or GitHub issues for real bugs. It also includes Slack notifications at each stage.
Changes:
- New `e2e-triage.yml` GitHub Actions workflow that auto-triggers on E2E test failures, runs Claude Code to triage, and sends Slack notifications with structured summaries.
- New `download-e2e-artifacts.sh` script to download and restructure E2E test artifacts from GitHub Actions by run ID, URL, or "latest" failed run.
- New `.claude/skills/e2e-triage/SKILL.md` providing structured instructions for Claude to classify and act on E2E failures, plus README updates documenting the new workflow and artifact download process.
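The three input forms accepted by the download script (run ID, URL, or "latest") could be normalized roughly as follows. This is a sketch under stated assumptions, not the script's actual code: `resolve_run_id` is a hypothetical name, the URL is assumed to follow the usual `.../actions/runs/<id>` shape, and the `latest` branch uses `gh run list` without a workflow filter, which the real script may well add.

```bash
#!/usr/bin/env bash
# Hypothetical helper: turn a run ID, a run URL, or "latest" into a run ID.
resolve_run_id() {
  local arg="$1" id
  case "$arg" in
    latest)
      # Assumption: the most recent failed run in the current repository.
      gh run list --status failure --limit 1 \
        --json databaseId --jq '.[0].databaseId'
      ;;
    */actions/runs/*)
      id="${arg##*/actions/runs/}"   # drop everything up to the run ID
      id="${id%%[!0-9]*}"            # drop any trailing /job/... or ?query
      [ -n "$id" ] || return 1
      printf '%s\n' "$id"
      ;;
    *[!0-9]*|'')
      echo "unrecognized run reference: $arg" >&2
      return 1
      ;;
    *)
      printf '%s\n' "$arg"           # already a numeric run ID
      ;;
  esac
}
```

The resolved ID could then be passed to `gh run download <id>` to fetch the artifacts.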
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `.github/workflows/e2e-triage.yml` | New workflow triggered by E2E test failures; downloads artifacts, runs Claude Code triage, sends Slack notifications |
| `scripts/download-e2e-artifacts.sh` | New script to download, restructure, and annotate E2E artifacts from GitHub Actions |
| `.claude/skills/e2e-triage/SKILL.md` | Skill instructions for Claude to analyze failures, classify them, and create PRs/issues |
| `e2e/README.md` | Documents the new triage workflow, skill, and artifact download script |
    # Move contents up: e2e-artifacts-claude-code/* -> claude-code/
    if [ -d "$agent" ]; then
      # Agent dir already exists (shouldn't happen, but be safe)
      cp -r "$wrapper"/* "$agent"/ 2>/dev/null || true
In the fallback cp -r branch (when $agent dir already exists), the original $wrapper directory is not removed after copying its contents. This means both e2e-artifacts-claude-code/ and claude-code/ would coexist, and the wrapper directory would appear in the agents_found listing on line 87, producing incorrect metadata.
Add rm -rf "$wrapper" after the cp -r to clean up the wrapper directory.
Suggested change:

    - cp -r "$wrapper"/* "$agent"/ 2>/dev/null || true
    + cp -r "$wrapper"/* "$agent"/ 2>/dev/null || true
    + rm -rf "$wrapper"
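For context, here is a sketch of how the whole flattening step might look with the cleanup applied. Only the `cp -r`, `mv`, and `rm -rf` lines and the `$wrapper`/`$agent` names come from the diff; the surrounding loop, the `flatten_wrappers` name, and the prefix stripping are assumptions inferred from the `e2e-artifacts-claude-code/* -> claude-code/` comment.

```bash
#!/usr/bin/env bash
# Sketch: move e2e-artifacts-<agent>/ contents up into <agent>/ and make
# sure no wrapper directory survives in either branch.
flatten_wrappers() {
  local root="$1" wrapper name agent
  for wrapper in "$root"/e2e-artifacts-*/; do
    [ -d "$wrapper" ] || continue          # glob matched nothing
    wrapper="${wrapper%/}"
    name="${wrapper##*/}"
    agent="$root/${name#e2e-artifacts-}"
    if [ -d "$agent" ]; then
      # Agent dir already exists: merge, then remove the stale wrapper.
      cp -r "$wrapper"/* "$agent"/ 2>/dev/null || true
      rm -rf "$wrapper"
    else
      mv "$wrapper" "$agent"
    fi
  done
}
```

With the `rm -rf` in place, a later directory listing of the artifact root sees only agent directories, regardless of which branch ran.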
Cursor Bugbot has reviewed your changes and found 1 potential issue.
      cp -r "$wrapper"/* "$agent"/ 2>/dev/null || true
    else
      mv "$wrapper" "$agent"
    fi
Stale wrapper directories left after copy branch
Medium Severity
When the cp -r branch is taken (agent directory already exists), the original e2e-artifacts-* wrapper directory is never removed. This leaves stale wrapper directories that get included in agents_found (via ls -d */) and written into .run-info.json. Downstream, the triage skill iterates "each agent subdirectory in the artifact root," so it would scan these stale wrappers as if they were additional agents, potentially causing duplicate failure reports.
Rewrite SKILL.md with dual-mode support (auto-detected via WORKFLOW_RUN_ID env var):
- Local mode runs tests with mise and re-runs failures up to 3 times.
- CI mode triggers e2e-isolated.yml workflows for re-run verification.

Classification now uses re-run results as the primary signal (all fail = real-bug, mixed results = flaky).

Workflow changes: actions permission upgraded to write for gh workflow run, timeout increased to 60m for re-run polling, Claude prompt updated with CI mode hint and re-run instructions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 9f75c3effd9b
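The mode detection and the re-run rule from this commit can be sketched in a few lines of shell. The function names and the `pass`/`fail` argument convention are hypothetical; the skill itself is a set of instructions for Claude, so this only illustrates the decision logic. One assumption: a re-run set that passes every time is treated as flaky, since the original run failed and the overall results are therefore mixed.

```bash
#!/usr/bin/env bash
# Sketch: CI mode is assumed whenever WORKFLOW_RUN_ID is set and non-empty.
detect_mode() {
  if [ -n "${WORKFLOW_RUN_ID:-}" ]; then echo ci; else echo local; fi
}

# Sketch: classify a failure from its re-run outcomes ("pass" or "fail").
classify_reruns() {
  local failures=0 total=0 result
  for result in "$@"; do
    total=$((total + 1))
    if [ "$result" = "fail" ]; then failures=$((failures + 1)); fi
  done
  if [ "$total" -eq 0 ]; then
    echo unknown            # no re-run data to classify with
  elif [ "$failures" -eq "$total" ]; then
    echo real-bug           # every re-run failed
  else
    echo flaky              # at least one re-run passed
  fi
}
```

For example, `classify_reruns fail pass fail` reports `flaky`, while three consecutive `fail` results report `real-bug`.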
Local mode now presents findings interactively and applies fixes directly in the working tree instead of creating branches/PRs/issues:
- Step 4a: findings report, proposed fixes, user approval gate, in-place fixes
- Step 4b: unchanged CI behavior (batched PR for flaky, issues for real bugs)
- Step 5: local mode gets simpler summary table, no triage-summary.json

Entire-Checkpoint: 4e1d9cf59d52
Consistent test failures can be test infrastructure bugs (e2e/ code), not product bugs (cmd/entire/cli/). Update classification signals, fix lists, and action sections to distinguish the two.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 28c90fcc7266
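One way to picture the distinction is by where the root cause lives. The sketch below maps a file path to a category using the two directories named in the commit message; the function name and the category labels are hypothetical, and the skill may rely on other signals besides paths.

```bash
#!/usr/bin/env bash
# Sketch: categorize a consistent failure by the location of its root cause.
categorize_fix_path() {
  case "$1" in
    e2e/*)            echo test-infra ;;  # bug in the E2E test code itself
    cmd/entire/cli/*) echo product ;;     # bug in the CLI under test
    *)                echo other ;;       # outside both trees; needs a human look
  esac
}
```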
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: ba6877944a6c
Replace duplicated artifact-reading steps in e2e-triage Step 1 with a reference to debug-e2e's Debugging Workflow (steps 2-5), keeping the collect list so classification inputs remain clear. Add Related Skills section to README.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: eb14496bde1e
Entire-Checkpoint: bb778fbab533