CI/CD Workflow Autofix Guide

📅 Published on: 2025-01-05
👤 By: RepoBird Team

Overview

RepoBird's CI/CD Workflow Autofix feature monitors your GitHub Actions workflows and automatically fixes failures on pull requests. When a PR workflow fails, RepoBird analyzes the failure logs, identifies the root cause, and creates a fix PR - all within minutes.

Key Benefits:

  • Save Time: Eliminate 15-20% of developer time spent debugging CI/CD issues
  • Fast Turnaround: Most fixes created within 5 minutes of failure detection
  • Intelligent Analysis: Powered by Claude Sonnet 4.5, the strongest coding model for real software engineering tasks
  • Full Control: Configure which workflows to fix, timeout limits, and retry behavior

Scope: Pull Request Workflows Only

Important: The autofix feature only handles workflow failures triggered by pull_request events. This ensures:

  • Fixes go through code review before merging
  • No direct modifications to production branches
  • Clear audit trail for all automated changes
  • Reduced risk of cascading failures
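
The scope rule above can be sketched as a small predicate over GitHub's workflow_run webhook payload. The fields used here (action, workflow_run.event, workflow_run.conclusion) are part of GitHub's payload; the function name is hypothetical, and this is only an illustration of the rule, not RepoBird's actual implementation.

```python
from typing import Any, Mapping


def is_autofix_candidate(payload: Mapping[str, Any]) -> bool:
    """Return True only for a completed, failed run that a pull_request event triggered.

    Hypothetical helper: it illustrates the documented scope rule and nothing more.
    """
    run = payload.get("workflow_run") or {}
    return (
        payload.get("action") == "completed"
        and run.get("event") == "pull_request"   # push, schedule, etc. are out of scope
        and run.get("conclusion") == "failure"
    )
```

A run triggered by push or schedule returns False here, which matches the triggers listed under Limitations.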

Getting Started

Step 1: Enable CI/CD Autofix

  1. Navigate to your repository page at /repos/[repoId]
  2. Find the "CI/CD Autofix" section
  3. Toggle "Enable CI/CD Autofix" to ON
  4. The secret setup wizard will appear

Step 2: Configure Secrets

RepoBird needs two API keys to function:

Option A: Automatic Setup (Recommended)

  1. Select "Auto-Configure Secrets"
  2. Choose an existing RepoBird API key from the dropdown (or create a new one)
  3. Enter your Anthropic API key
  4. Click "Configure Secrets"
  5. RepoBird adds both secrets to your GitHub repository automatically
  6. Wait for validation (✓ icons appear when secrets detected)

Option B: Manual Setup

  1. Select "Manual Setup Instructions"
  2. Follow the step-by-step instructions
  3. Go to GitHub: Repository → Settings → Secrets → Actions
  4. Add REPOBIRD_API_KEY (value provided with copy button)
  5. Add ANTHROPIC_API_KEY (your own Anthropic API key)
  6. Return to RepoBird - validation auto-refreshes every 10 seconds
  7. Wait for ✓ icons to confirm secrets are configured
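
If you want to double-check the manual setup yourself, GitHub's REST API can list a repository's Actions secrets by name (GET /repos/{owner}/{repo}/actions/secrets; values are never returned). The sketch below assumes you already have that JSON response and only checks which required names are missing. The helper name is hypothetical, and whether RepoBird's own validation uses this endpoint is not stated in this guide.

```python
from typing import Any, Iterable, Mapping

# The two secret names this guide asks you to configure.
REQUIRED_SECRETS = ("REPOBIRD_API_KEY", "ANTHROPIC_API_KEY")


def missing_secrets(
    listing: Mapping[str, Any],
    required: Iterable[str] = REQUIRED_SECRETS,
) -> list[str]:
    """Return the required secret names absent from a 'list repository secrets' response."""
    present = {secret.get("name") for secret in listing.get("secrets", [])}
    return [name for name in required if name not in present]
```

An empty result means both names exist. Secret names are matched exactly, so a typo such as REPOBIRD_APIKEY would still be reported as missing.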

Step 3: Install Workflow File

After secrets are validated:

  1. Click "Create Workflow PR" button
  2. RepoBird creates a PR with .github/workflows/repobird-cicd-autofix.yml
  3. Review the workflow file in GitHub
  4. Merge the PR to activate CI/CD autofix
  5. Feature is now active - next workflow failure triggers autofix

Why PR and not direct commit?

  • Workflow files have elevated privileges and access to secrets
  • PR allows code review (security best practice)
  • Creates transparency and audit trail
  • Follows industry standards (Dependabot, Renovate use PRs)

How PR creation works:

  • The workflow pushes fix changes to a new branch
  • RepoBird uses its GitHub App token to create the PR automatically
  • No manual repository settings required!

Configuration Options

Fix Modes

RepoBird offers two modes for applying fixes:

Separate PR (Default, Recommended)

  • Creates a new PR with the fix targeting the same base branch
  • Pros: Clean separation, clear audit trail, safer, independent review
  • Cons: Additional PR to manage, potential merge conflicts
  • Best for: Most teams - provides transparency and control

Same PR

  • Pushes fix commits directly to the failing PR branch
  • Pros: Single PR workflow, no additional PRs, faster resolution
  • Cons: Mixes original work with autofix changes, less clear attribution
  • Best for: Solo developers or small teams with high trust

How to configure: Select the fix mode from the dropdown in repository settings.

Workflow Filters

Control which workflows trigger autofix:

All Workflows (Default)

  • Monitors all GitHub Actions workflows in the repository
  • Simplest setup - no configuration needed

Selected Workflows

  • Choose specific workflows using include/exclude lists
  • Smart classification helps identify critical workflows:
    • 🔴 Critical: Builds, tests, deployments (typically want autofix)
    • 🟡 Standard: Linting, formatting, security scans
    • 🟢 Optional: Scheduled tasks, notifications

Configuration UI:

  • Toggle between "All workflows" and "Selected workflows"
  • View list of all workflows with last run status
  • Add workflows to include/exclude lists
  • System automatically detects new workflows
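
One way to picture the include/exclude behavior is the sketch below. The guide does not spell out the exact precedence rules, so everything here is an assumption made for illustration: the mode values "all" and "selected", exclusion winning over inclusion, and an empty include list meaning "everything not excluded".

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WorkflowFilter:
    """Hypothetical model of the workflow filter settings."""

    mode: str = "all"  # "all" (default) or "selected"
    include: frozenset[str] = field(default_factory=frozenset)
    exclude: frozenset[str] = field(default_factory=frozenset)

    def allows(self, workflow_file: str) -> bool:
        if self.mode == "all":
            return True
        # Assumption: an explicit exclusion always wins.
        if workflow_file in self.exclude:
            return False
        # Assumption: an empty include list means "everything not excluded".
        return not self.include or workflow_file in self.include
```

For example, WorkflowFilter(mode="selected", include=frozenset({"ci.yml"})) would allow ci.yml and skip every other workflow.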

Timeout and Budget Limits

Timeout (1-60 minutes)

  • Default: 15 minutes
  • Maximum time Claude can spend analyzing and fixing
  • Use longer timeouts for complex issues
  • A fix attempt that times out still creates a partial (draft) PR

Budget ($1-$50)

  • Default: $5 per fix attempt
  • Maximum API cost for a single fix
  • Prevents runaway costs on difficult issues
  • Typical fix uses $0.50-$2 in API calls
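
The documented ranges and defaults can be captured in a small validated settings object. This is a sketch only: the class and field names are hypothetical, and the real settings live in the repository settings UI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixLimits:
    """Hypothetical container for the per-attempt limits described above."""

    timeout_minutes: int = 15  # documented default; allowed range 1-60
    budget_usd: float = 5.0    # documented default; allowed range $1-$50

    def __post_init__(self) -> None:
        # Reject values outside the documented ranges at construction time.
        if not 1 <= self.timeout_minutes <= 60:
            raise ValueError("timeout_minutes must be between 1 and 60")
        if not 1 <= self.budget_usd <= 50:
            raise ValueError("budget_usd must be between 1 and 50")
```

Validating at construction time means a bad value fails immediately instead of surfacing halfway through a fix attempt.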

Retry Limits

  • Default: 1 retry (2 total attempts)
  • Maximum: 3 retries (4 total attempts)
  • Each retry consumes full timeout and budget
  • Example: with a 60-minute timeout and $5 budget, 3 retries add up to 3 hours and $15 on top of the initial attempt (4 attempts, so up to 4 hours and $20 in total)
  • System stops after max retries exceeded
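
To budget for the worst case, it helps to separate what the retries add from the overall total. The sketch below assumes every attempt, including the initial one, can use the full timeout and budget. The function name and the returned keys are hypothetical.

```python
def worst_case(retries: int, timeout_minutes: int, budget_usd: float) -> dict[str, float]:
    """Upper bounds on time and API cost for one failure, assuming full use of each limit."""
    if not 0 <= retries <= 3:
        raise ValueError("retries must be between 0 and 3")
    attempts = 1 + retries  # the initial attempt plus each retry
    return {
        "attempts": attempts,
        "retry_minutes": retries * timeout_minutes,   # added by retries alone
        "retry_usd": retries * budget_usd,
        "total_minutes": attempts * timeout_minutes,  # including the initial attempt
        "total_usd": attempts * budget_usd,
    }
```

With 3 retries, a 60-minute timeout, and a $5 budget, the retries alone account for up to 180 minutes and $15, and the four attempts together for up to 240 minutes and $20.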

Cooldown Period

Purpose: Prevent rapid retry loops on persistently failing workflows

Default: 30 minutes between fix attempts for the same workflow

Configurable: 15-120 minutes

Behavior: If the same workflow fails again during the cooldown, autofix is skipped
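
The cooldown rule amounts to a time comparison per workflow. In this minimal sketch the function name is hypothetical, and treating a failure at exactly the cooldown boundary as eligible again is an assumption, since the guide does not define that edge case.

```python
from datetime import datetime, timedelta
from typing import Optional


def in_cooldown(
    last_attempt: Optional[datetime],
    now: datetime,
    cooldown_minutes: int = 30,  # documented default; configurable 15-120
) -> bool:
    """Return True if a new failure of the same workflow should be skipped."""
    if not 15 <= cooldown_minutes <= 120:
        raise ValueError("cooldown_minutes must be between 15 and 120")
    if last_attempt is None:
        return False  # no previous fix attempt for this workflow
    # Assumption: the cooldown ends at exactly cooldown_minutes after the last attempt.
    return now - last_attempt < timedelta(minutes=cooldown_minutes)
```

Passing now explicitly, rather than reading the clock inside the function, keeps the check easy to test.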

How It Works

Autofix Lifecycle

  1. Triggering Workflow Failure

    • Developer opens/updates PR
    • GitHub Actions workflow runs and fails
    • GitHub sends workflow_run.completed webhook to RepoBird
  2. Non-Reproducible Check

    • RepoBird clones the PR branch to repobird/cicd-fix-{id}
    • Runs clone validation workflow (same workflow, no changes)
    • If passes: Failure was non-reproducible (flaky test, race condition)
      • Deletes cloned branch
      • Posts comment: "⚠️ Could not reproduce failure"
      • Exits without running Claude
    • If fails: Confirms reproducible failure, proceeds to analysis
  3. AI Analysis

    • Claude Code analyzes failure logs, workflow file, and relevant code
    • Identifies root cause (timeout, dependency conflict, syntax error, etc.)
    • Generates fix with explanation
  4. Fix PR Creation

    • Separate PR mode: Creates new PR targeting same base branch
      • Branch: repobird/cicd-fix-{workflow_run_id}
      • Title: "Fix CI/CD: [workflow name] - [brief description]"
    • Same PR mode: Pushes commits to existing PR branch
    • PR includes detailed description explaining the fix
  5. Fix Validation

    • Fix validation workflow runs on the fix PR
    • If passes: Success! Posts comment with link to fix PR
    • If fails: Triggers retry (if within retry limit)
    • Validation results posted as comments on original PR
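
The two decision points in the lifecycle above (after clone validation and after fix validation) can be sketched as small pure functions. All names here are hypothetical and the sketch only mirrors the documented branching. It does not model timeouts, budgets, or cooldowns.

```python
from enum import Enum


class NextStep(Enum):
    REPORT_NON_REPRODUCIBLE = "report_non_reproducible"
    RUN_ANALYSIS = "run_analysis"
    REPORT_SUCCESS = "report_success"
    RETRY = "retry"
    STOP = "stop"


def fix_branch_name(workflow_run_id: int) -> str:
    """Branch name used in Separate PR mode, following the documented pattern."""
    return f"repobird/cicd-fix-{workflow_run_id}"


def after_clone_validation(clone_run_passed: bool) -> NextStep:
    # A passing re-run with no changes means the failure was not reproducible.
    if clone_run_passed:
        return NextStep.REPORT_NON_REPRODUCIBLE
    return NextStep.RUN_ANALYSIS


def after_fix_validation(fix_run_passed: bool, retries_used: int, max_retries: int = 1) -> NextStep:
    if fix_run_passed:
        return NextStep.REPORT_SUCCESS
    # Retry only while the configured retry limit has not been reached.
    return NextStep.RETRY if retries_used < max_retries else NextStep.STOP
```

With the default of 1 retry, a failed validation on the first attempt leads to RETRY, and a second failure leads to STOP.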

GitHub Comments

RepoBird posts informative comments on your PR:

Success:

✅ CI/CD Autofix: Fix PR created

Triggering workflow: ci.yml (Run #123)
Fix PR: #457
Issue: Test timeout in integration tests
Solution: Increased timeout from 30s to 60s

Estimated fix time: 3 minutes

Non-Reproducible:

⚠️ Could not reproduce CI/CD failure

Triggering workflow: ci.yml (Run #123)
Clone validation passed without any changes.

Possible causes:
- Flaky test (intermittent failure)
- Race condition
- External service dependency
- Time-based test logic

Recommendation: Re-run the workflow or investigate test stability.

Timeout:

⏱️ CI/CD Autofix: Timeout exceeded

Triggered by: ci.yml (Run #123)
Time spent: 15 minutes (limit reached)
Partial fix PR: #458 (draft)

Claude ran out of time before completing the fix.
Consider increasing timeout limit in repository settings.

Budget Exceeded:

💰 CI/CD Autofix: Budget exceeded

Triggered by: ci.yml (Run #123)
API cost: $5.00 (limit reached)
Partial fix PR: #459 (draft)

Fix required more API calls than budget allows.
Consider increasing budget limit in repository settings.

Validation Failed:

❌ CI/CD Autofix: Validation failed

Fix PR: #460
Validation workflow: ci.yml (Run #124)
Retry count: 1 / 1

The fix did not resolve the workflow failure.
Retrying with updated analysis...

Understanding Results

Fix Statuses

View fix history on your repository page:

  • Success ✅: Fix PR created and validation passed
  • Timeout ⏱️: Analysis exceeded time limit (partial PR may exist)
  • Budget Exceeded 💰: API costs exceeded limit (partial PR may exist)
  • Validation Failed ❌: Fix PR created but validation workflow failed
  • No Changes ⚠️: Failure was non-reproducible
  • Error 🔴: System error (check logs or contact support)
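
If you track these statuses in your own tooling, an enum keeps the six outcomes explicit. The identifiers below are hypothetical and are not RepoBird API values. The helper simply restates which statuses come with a fix PR and which may come with a partial one.

```python
from enum import Enum
from typing import Optional


class FixStatus(Enum):
    SUCCESS = "success"
    TIMEOUT = "timeout"
    BUDGET_EXCEEDED = "budget_exceeded"
    VALIDATION_FAILED = "validation_failed"
    NO_CHANGES = "no_changes"
    ERROR = "error"


def pull_request_kind(status: FixStatus) -> Optional[str]:
    """Return "fix", "partial" (a partial PR may exist), or None (no PR expected)."""
    if status in (FixStatus.SUCCESS, FixStatus.VALIDATION_FAILED):
        return "fix"
    if status in (FixStatus.TIMEOUT, FixStatus.BUDGET_EXCEEDED):
        return "partial"
    return None
```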

Metrics and Time Savings

Repository Dashboard shows:

  • Total fixes attempted this period
  • Success rate percentage
  • Estimated developer hours saved
  • Most common error types
  • Fix history (last 10 runs)

Drill-Down Details:

  • Click any fix run to see full details
  • View failure logs, fix diff, and validation results
  • Token usage and API cost breakdown
  • Retry history

Tier Limits

Free Tier

  • 3 CI/CD fixes per period (resets monthly)
  • All configuration options available
  • Full feature access

Pro Tier

  • 300 CI/CD fixes per period (resets monthly)
  • All configuration options available
  • Priority support

What happens when limit reached?

  • New workflow failures are not auto-fixed
  • Email notification sent to repository owner
  • Manual fixes still possible
  • Limit resets at start of next billing period

Does retry count against limit?

  • Yes, each retry attempt consumes 1 fix from your quota
  • Example: Original fix + 2 retries = 3 fixes consumed
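
The quota arithmetic can be sketched as follows. The tier limits come from this guide, while the function and constant names are hypothetical.

```python
# Fixes per period, as documented for each tier.
TIER_LIMITS = {"free": 3, "pro": 300}


def fixes_consumed(retries: int) -> int:
    """One fix for the original attempt plus one per retry."""
    if retries < 0:
        raise ValueError("retries cannot be negative")
    return 1 + retries


def remaining_quota(tier: str, used: int) -> int:
    """Fixes left in the current period, never below zero."""
    return max(TIER_LIMITS[tier] - used, 0)
```

On the Free tier, a single failure that needs 2 retries consumes all 3 fixes for the period, which is worth keeping in mind before raising the retry limit.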

Best Practices

When to Use Separate PR Mode

  • Team environments with code review requirements
  • Repositories with strict merge policies
  • When you want clear attribution of autofix changes
  • Compliance requirements for audit trails

When to Use Same PR Mode

  • Solo developer projects
  • Rapid iteration environments
  • When original author wants single PR workflow
  • Low-risk repositories

Workflow Filter Recommendations

  • Start with "All workflows" to see what gets fixed
  • After 1-2 weeks, review fix history
  • Exclude workflows that consistently fail autofix (custom scripts, external dependencies)
  • Include high-value workflows (tests, builds, linting)

Timeout and Budget Guidelines

  • Simple fixes (syntax, formatting): 5-10 min, $1-2
  • Medium complexity (test failures, dependency updates): 15-30 min, $3-5
  • Complex issues (integration failures, configuration): 30-60 min, $5-10
  • Monitor actual usage in metrics dashboard and adjust

Managing Retries

  • Default 1 retry is usually sufficient for transient issues
  • Increase retries for repositories with complex test suites
  • Set cooldown period to prevent rapid retry loops
  • Review validation failures to identify systemic issues

Troubleshooting

Secrets Not Detected

Symptom: ✗ icons persist after adding secrets manually

Solutions:

  • Wait 10 seconds for auto-refresh
  • Click "Refresh" button manually
  • Verify secret names are exact: REPOBIRD_API_KEY and ANTHROPIC_API_KEY
  • Check GitHub repository → Settings → Secrets → Actions
  • Ensure you have admin permissions on the repository

Workflow File PR Not Created

Symptom: "Create Workflow PR" button disabled or errors

Solutions:

  • Verify both secrets validated (✓ ✓)
  • Check GitHub App has Actions write permissions
  • Ensure repository is not archived
  • Check for an existing workflow PR (repobird/add-cicd-autofix-workflow)

Autofix Not Triggering

Symptom: Workflow fails but no fix PR created

Solutions:

  • Verify CI/CD Autofix toggle is ON
  • Check workflow file PR was merged
  • Confirm workflow trigger is pull_request (not push, schedule, etc.)
  • Check tier limits not exceeded
  • Verify workflow is not in exclude list
  • Check cooldown period not active

Non-Reproducible Failures

Symptom: Frequently seeing "Could not reproduce failure" comments

Solutions:

  • Investigate test stability (flaky tests)
  • Check for race conditions in test suite
  • Review external service dependencies (APIs, databases)
  • Consider adding retries to flaky tests
  • Use fixed timestamps for time-based logic and fixed seeds for randomized logic
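
As an illustration of the last point, the sketch below shows two common ways to make tests deterministic: passing the current time in as a parameter, and using a seeded private random generator. The function names and the token-expiry scenario are made up for this example.

```python
import random
from datetime import datetime, timedelta, timezone


def is_token_expired(issued_at: datetime, now: datetime, ttl: timedelta) -> bool:
    """Take `now` as a parameter so tests can pass a fixed timestamp."""
    return now - issued_at >= ttl


def sample_order_ids(seed: int, count: int) -> list[int]:
    """Generate pseudo-random test data that is identical on every run for a given seed."""
    rng = random.Random(seed)  # private generator; leaves the global random state alone
    return [rng.randrange(1_000_000) for _ in range(count)]


# A fixed timestamp for tests instead of datetime.now().
FIXED_NOW = datetime(2025, 1, 5, 12, 0, tzinfo=timezone.utc)
```

A test that calls is_token_expired(FIXED_NOW - timedelta(hours=2), FIXED_NOW, timedelta(hours=1)) gives the same answer on every run, regardless of when CI executes it.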

Fix PRs Not Resolving Issues

Symptom: Fix validation workflows consistently fail

Solutions:

  • Review fix PR content - may need manual intervention
  • Check if issue requires configuration outside code (GitHub settings, secrets, environment variables)
  • Increase timeout for complex issues
  • Provide more context in workflow failure logs
  • Consider manual fix for systemic architectural issues

Budget or Timeout Limits Hit

Symptom: Frequent partial fix PRs with budget/timeout errors

Solutions:

  • Increase limits in repository settings
  • Simplify test suite if possible
  • Check for infinite loops or recursive issues
  • Review if issue requires human architectural decisions
  • Consider excluding problematic workflows

Getting Help

  • Documentation: Visit /docs for more guides
  • Support: Contact support@repobird.com
  • GitHub Issues: Report bugs at our GitHub repository
  • Community: Join our Discord for discussions and tips

Limitations

Workflow Triggers Not Supported (by design):

The following workflow trigger types are intentionally not supported to ensure safety and maintain proper review processes:

  • Push events to main/develop/production branches
  • Scheduled workflows (cron jobs)
  • Manual workflow triggers (workflow_dispatch)
  • Deployment or release workflows

These triggers require manual intervention due to higher risk, production impact, and need for human review.
