CI/CD Workflow Autofix Guide

📅 Published on: 2025-01-05 | 👤 By: RepoBird Team

Overview

RepoBird's CI/CD Workflow Autofix feature monitors your GitHub Actions workflows and automatically fixes failures on pull requests. When a PR workflow fails, RepoBird analyzes the failure logs, identifies the root cause, and creates a fix PR - all within minutes.

Key Benefits:

  • Save Time: Eliminate 15-20% of developer time spent debugging CI/CD issues
  • Fast Turnaround: Most fixes created within 5 minutes of failure detection
  • Intelligent Analysis: Powered by Claude Sonnet 4.5, the strongest coding model for real software engineering tasks
  • Full Control: Configure which workflows to fix, timeout limits, and retry behavior

Scope: Pull Request Workflows Only

Important: The autofix feature only handles workflow failures triggered by pull_request events. This ensures:

  • Fixes go through code review before merging
  • No direct modifications to production branches
  • Clear audit trail for all automated changes
  • Reduced risk of cascading failures
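
As a rough illustration of this scope rule, an eligibility check could look like the sketch below. It assumes the payload shape of GitHub's workflow_run webhook (action, workflow_run.event, workflow_run.conclusion). The function name is_autofix_candidate is hypothetical, and RepoBird's actual check may differ.

```python
from typing import Any, Mapping


def is_autofix_candidate(payload: Mapping[str, Any]) -> bool:
    """Return True only for completed, failed runs triggered by pull_request.

    Hypothetical helper: field names follow GitHub's workflow_run webhook
    payload. Treating only conclusion == "failure" as fixable is an assumption.
    """
    run = payload.get("workflow_run") or {}
    return (
        payload.get("action") == "completed"
        and run.get("event") == "pull_request"
        and run.get("conclusion") == "failure"
    )
```

Under this sketch, a failed run triggered by push, schedule, or workflow_dispatch is ignored because its event field is not "pull_request".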

Getting Started

Step 1: Enable CI/CD Autofix

  1. Navigate to your repository page at /repos/[repoId]
  2. Find the "CI/CD Autofix" section
  3. Toggle "Enable CI/CD Autofix" to ON
  4. The secret setup wizard will appear

Step 2: Configure Secrets

RepoBird needs two API keys to function:

Option A: Automatic Setup (Recommended)

  1. Select "Auto-Configure Secrets"
  2. Choose an existing RepoBird API key from the dropdown (or create a new one)
  3. Enter your Anthropic API key
  4. Click "Configure Secrets"
  5. RepoBird adds both secrets to your GitHub repository automatically
  6. Wait for validation (✓ icons appear when secrets detected)

Option B: Manual Setup

  1. Select "Manual Setup Instructions"
  2. Follow the step-by-step instructions
  3. Go to GitHub: Repository → Settings → Secrets and variables → Actions
  4. Add REPOBIRD_API_KEY (value provided with copy button)
  5. Add ANTHROPIC_API_KEY (your own Anthropic API key)
  6. Return to RepoBird - validation auto-refreshes every 10 seconds
  7. Wait for ✓ icons to confirm secrets are configured

Step 3: Install Workflow File

After secrets are validated:

  1. Click "Create Workflow PR" button
  2. RepoBird creates a PR with .github/workflows/repobird-cicd-autofix.yml
  3. Review the workflow file in GitHub
  4. Merge the PR to activate CI/CD autofix
  5. Feature is now active - next workflow failure triggers autofix

Why PR and not direct commit?

  • Workflow files have elevated privileges and access to secrets
  • PR allows code review (security best practice)
  • Creates transparency and audit trail
  • Follows industry standards (Dependabot, Renovate use PRs)

How PR creation works:

  • The workflow pushes fix changes to a new branch
  • RepoBird uses its GitHub App token to create the PR automatically
  • No manual repository settings required!

Configuration Options

Fix Modes

RepoBird offers two modes for applying fixes:

Separate PR (Default, Recommended)

  • Creates a new PR with the fix targeting the same base branch
  • Pros: Clean separation, clear audit trail, safer, independent review
  • Cons: Additional PR to manage, potential merge conflicts
  • Best for: Most teams - provides transparency and control

Same PR

  • Pushes fix commits directly to the failing PR branch
  • Pros: Single PR workflow, no additional PRs, faster resolution
  • Cons: Mixes original work with autofix changes, less clear attribution
  • Best for: Solo developers or small teams with high trust

How to configure: Select fix mode from dropdown in repository settings

Workflow Filters

Control which workflows trigger autofix:

All Workflows (Default)

  • Monitors all GitHub Actions workflows in the repository
  • Simplest setup - no configuration needed

Selected Workflows

  • Choose specific workflows using include/exclude lists
  • Smart classification helps identify critical workflows:
    • 🔴 Critical: Builds, tests, deployments (typically want autofix)
    • 🟡 Standard: Linting, formatting, security scans
    • 🟢 Optional: Scheduled tasks, notifications

Configuration UI:

  • Toggle between "All workflows" and "Selected workflows"
  • View list of all workflows with last run status
  • Add workflows to include/exclude lists
  • System automatically detects new workflows
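
The filter behavior described above can be sketched as a small decision function. This is only a model under stated assumptions: the class name WorkflowFilter, the mode strings, and the rule that an explicit exclude always wins are all hypothetical and are not taken from RepoBird's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowFilter:
    """Hypothetical model of the workflow filter settings."""

    mode: str = "all"  # "all" or "selected"
    include: set[str] = field(default_factory=set)
    exclude: set[str] = field(default_factory=set)

    def should_autofix(self, workflow_file: str) -> bool:
        # Assumption: an explicit exclude takes precedence in either mode.
        if workflow_file in self.exclude:
            return False
        if self.mode == "all":
            return True
        # "selected" mode: only workflows on the include list qualify.
        return workflow_file in self.include
```

For example, WorkflowFilter(mode="selected", include={"ci.yml"}) would accept ci.yml and skip every other workflow, while the default WorkflowFilter() accepts everything.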

Timeout and Budget Limits

Timeout (1-60 minutes)

  • Default: 15 minutes
  • Maximum time Claude can spend analyzing and fixing
  • Use longer timeouts for complex issues
  • Fixes that hit the timeout still produce a partial (draft) PR

Budget ($1-$50)

  • Default: $5 per fix attempt
  • Maximum API cost for a single fix
  • Prevents runaway costs on difficult issues
  • Typical fix uses $0.50-$2 in API calls

Retry Limits

  • Default: 1 retry (2 total attempts)
  • Maximum: 3 retries (4 total attempts)
  • Each retry consumes full timeout and budget
  • Example: 3 retries with a 60-minute timeout and $5 budget can add up to 3 hours and $15 on top of the initial attempt (4 hours and $20 across all 4 attempts)
  • System stops after max retries exceeded
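
To make the retry arithmetic concrete, here is a minimal sketch that computes the worst case for a single failure. It assumes every attempt uses its full timeout and budget; the function name worst_case and the returned keys are illustrative only.

```python
def worst_case(retries: int, timeout_min: int, budget_usd: float) -> dict:
    """Upper bound on time and API cost for one workflow failure."""
    if not 0 <= retries <= 3:
        raise ValueError("retries must be between 0 and 3")
    attempts = 1 + retries  # initial attempt plus each retry
    return {
        "attempts": attempts,
        # Cost of the retries alone
        "retry_minutes": retries * timeout_min,
        "retry_cost_usd": retries * budget_usd,
        # Cost including the initial attempt
        "total_minutes": attempts * timeout_min,
        "total_cost_usd": attempts * budget_usd,
    }
```

With 3 retries, a 60-minute timeout, and a $5 budget, the retries alone account for 180 minutes and $15; counting the initial attempt brings the ceiling to 240 minutes and $20.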

Cooldown Period

Purpose: Prevent rapid retry loops on persistently failing workflows

Default: 30 minutes between fix attempts for the same workflow

Configurable: 15-120 minutes

Behavior: If the same workflow fails again during the cooldown, autofix is skipped
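
The cooldown rule can be pictured as a simple time comparison. The sketch below is an assumption about how such a check might work, not RepoBird's code: the function name in_cooldown is hypothetical, and treating the exact boundary as "cooldown over" is a guess.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def in_cooldown(
    last_attempt: Optional[datetime],
    now: datetime,
    cooldown_minutes: int = 30,
) -> bool:
    """Return True if a new fix attempt for this workflow should be skipped."""
    if not 15 <= cooldown_minutes <= 120:
        raise ValueError("cooldown must be between 15 and 120 minutes")
    if last_attempt is None:
        return False  # no previous attempt for this workflow
    # Assumption: the cooldown ends exactly when the configured time elapses.
    return now - last_attempt < timedelta(minutes=cooldown_minutes)
```

A failure 10 minutes after the previous attempt would be skipped under the default setting, while one 30 minutes later would be processed.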

How It Works

Autofix Lifecycle

  1. Triggering Workflow Failure

    • Developer opens/updates PR
    • GitHub Actions workflow runs and fails
    • GitHub sends workflow_run.completed webhook to RepoBird
  2. Non-Reproducible Check

    • RepoBird clones the PR branch to repobird/cicd-fix-{id}
    • Runs clone validation workflow (same workflow, no changes)
    • If passes: Failure was non-reproducible (flaky test, race condition)
      • Deletes cloned branch
      • Posts comment: "⚠️ Could not reproduce failure"
      • Exits without running Claude
    • If fails: Confirms reproducible failure, proceeds to analysis
  3. AI Analysis

    • Claude Code analyzes failure logs, workflow file, and relevant code
    • Identifies root cause (timeout, dependency conflict, syntax error, etc.)
    • Generates fix with explanation
  4. Fix PR Creation

    • Separate PR mode: Creates new PR targeting same base branch
      • Branch: repobird/cicd-fix-{workflow_run_id}
      • Title: "Fix CI/CD: [workflow name] - [brief description]"
    • Same PR mode: Pushes commits to existing PR branch
    • PR includes detailed description explaining the fix
  5. Fix Validation

    • Fix validation workflow runs on the fix PR
    • If passes: Success! Posts comment with link to fix PR
    • If fails: Triggers retry (if within retry limit)
    • Validation results posted as comments on original PR
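
Steps 2 through 5 of the lifecycle can be summarized as a short control flow. The sketch below is a simplified model: the callables reproduce and fix_and_validate are hypothetical stand-ins for the clone validation run and the Claude fix plus validation run, and timeout, budget, and cooldown handling are left out.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    NON_REPRODUCIBLE = "no_changes"
    SUCCESS = "success"
    VALIDATION_FAILED = "validation_failed"


def autofix_lifecycle(
    reproduce: Callable[[], bool],
    fix_and_validate: Callable[[int], bool],
    max_retries: int = 1,
) -> Outcome:
    """Model of the lifecycle with injected (hypothetical) callables.

    reproduce() returns True if the failure recurs on the cloned branch.
    fix_and_validate(attempt) returns True if validation passes on the fix.
    """
    if not reproduce():
        # Clone validation passed: exit without running the AI analysis.
        return Outcome.NON_REPRODUCIBLE
    for attempt in range(1 + max_retries):  # initial attempt plus retries
        if fix_and_validate(attempt):
            return Outcome.SUCCESS
    return Outcome.VALIDATION_FAILED
```

Injecting the two steps as callables keeps the control flow testable without touching GitHub or any model API.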

GitHub Comments

RepoBird posts informative comments on your PR:

Success:

✅ CI/CD Autofix: Fix PR created

Triggering workflow: ci.yml (Run #123)
Fix PR: #457
Issue: Test timeout in integration tests
Solution: Increased timeout from 30s to 60s

Estimated fix time: 3 minutes

Non-Reproducible:

⚠️ Could not reproduce CI/CD failure

Triggering workflow: ci.yml (Run #123)
Clone validation passed without any changes.

Possible causes:
- Flaky test (intermittent failure)
- Race condition
- External service dependency
- Time-based test logic

Recommendation: Re-run the workflow or investigate test stability.

Timeout:

⏱️ CI/CD Autofix: Timeout exceeded

Triggered by: ci.yml (Run #123)
Time spent: 15 minutes (limit reached)
Partial fix PR: #458 (draft)

Claude ran out of time before completing the fix.
Consider increasing timeout limit in repository settings.

Budget Exceeded:

💰 CI/CD Autofix: Budget exceeded

Triggered by: ci.yml (Run #123)
API cost: $5.00 (limit reached)
Partial fix PR: #459 (draft)

Fix required more API calls than budget allows.
Consider increasing budget limit in repository settings.

Validation Failed:

❌ CI/CD Autofix: Validation failed

Fix PR: #460
Validation workflow: ci.yml (Run #124)
Retry count: 1 / 1

The fix did not resolve the workflow failure.
Retrying with updated analysis...

Understanding Results

Fix Statuses

View fix history on your repository page:

  • Success ✅: Fix PR created and validation passed
  • Timeout ⏱️: Analysis exceeded time limit (partial PR may exist)
  • Budget Exceeded 💰: API costs exceeded limit (partial PR may exist)
  • Validation Failed ❌: Fix PR created but validation workflow failed
  • No Changes ⚠️: Failure was non-reproducible
  • Error 🔴: System error (check logs or contact support)

Metrics and Time Savings

Repository Dashboard shows:

  • Total fixes attempted this period
  • Success rate percentage
  • Estimated developer hours saved
  • Most common error types
  • Fix history (last 10 runs)

Drill-Down Details:

  • Click any fix run to see full details
  • View failure logs, fix diff, and validation results
  • Token usage and API cost breakdown
  • Retry history

Tier Limits

Free Tier

  • 3 CI/CD fixes per period (resets monthly)
  • All configuration options available
  • Full feature access

Pro Tier

  • 300 CI/CD fixes per period (resets monthly)
  • All configuration options available
  • Priority support

What happens when limit reached?

  • New workflow failures are not auto-fixed
  • Email notification sent to repository owner
  • Manual fixes still possible
  • Limit resets at start of next billing period

Does retry count against limit?

  • Yes, each retry attempt consumes 1 fix from your quota
  • Example: Original fix + 2 retries = 3 fixes consumed
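
The quota accounting above can be sketched as follows. The tier limits come from this section; the function name remaining_fixes and the input format (one retry count per fix run) are assumptions made for illustration.

```python
TIER_LIMITS = {"free": 3, "pro": 300}  # fixes per period, from the tiers above


def remaining_fixes(tier: str, retries_per_run: list[int]) -> int:
    """Quota left after a list of fix runs, each given as its retry count."""
    # Each run consumes one fix for the original attempt plus one per retry.
    consumed = sum(1 + retries for retries in retries_per_run)
    return max(TIER_LIMITS[tier] - consumed, 0)
```

On the Free tier, a single run that needed 2 retries consumes all 3 fixes for the period.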

Best Practices

When to Use Separate PR Mode

  • Team environments with code review requirements
  • Repositories with strict merge policies
  • When you want clear attribution of autofix changes
  • Compliance requirements for audit trails

When to Use Same PR Mode

  • Solo developer projects
  • Rapid iteration environments
  • When original author wants single PR workflow
  • Low-risk repositories

Workflow Filter Recommendations

  • Start with "All workflows" to see what gets fixed
  • After 1-2 weeks, review fix history
  • Exclude workflows that consistently fail autofix (custom scripts, external dependencies)
  • Include high-value workflows (tests, builds, linting)

Timeout and Budget Guidelines

  • Simple fixes (syntax, formatting): 5-10 min, $1-2
  • Medium complexity (test failures, dependency updates): 15-30 min, $3-5
  • Complex issues (integration failures, configuration): 30-60 min, $5-10
  • Monitor actual usage in metrics dashboard and adjust

Managing Retries

  • Default 1 retry is usually sufficient for transient issues
  • Increase retries for repositories with complex test suites
  • Set cooldown period to prevent rapid retry loops
  • Review validation failures to identify systemic issues

Troubleshooting

Secrets Not Detected

Symptom: ✗ icons persist after adding secrets manually

Solutions:

  • Wait 10 seconds for auto-refresh
  • Click "Refresh" button manually
  • Verify secret names are exact: REPOBIRD_API_KEY and ANTHROPIC_API_KEY
  • Check GitHub repository → Settings → Secrets and variables → Actions
  • Ensure you have admin permissions on the repository
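
When checking secret names by hand, a small script can rule out typos. The sketch below assumes the JSON shape returned by GitHub's "List repository secrets" REST endpoint (a secrets array whose entries have a name field); the function name missing_secrets is hypothetical.

```python
REQUIRED_SECRETS = {"REPOBIRD_API_KEY", "ANTHROPIC_API_KEY"}


def missing_secrets(list_secrets_response: dict) -> set[str]:
    """Return the required secret names absent from the API response."""
    present = {s["name"] for s in list_secrets_response.get("secrets", [])}
    # Exact-match comparison: a near miss such as ANTHROPIC_KEY counts as missing.
    return REQUIRED_SECRETS - present
```

An empty result means both names are present; secret values are never returned by the API, so this only verifies names.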

Workflow File PR Not Created

Symptom: "Create Workflow PR" button disabled or errors

Solutions:

  • Verify both secrets validated (✓ ✓)
  • Check GitHub App has Actions write permissions
  • Ensure repository is not archived
  • Check for existing PR: repobird/add-cicd-autofix-workflow

Autofix Not Triggering

Symptom: Workflow fails but no fix PR created

Solutions:

  • Verify CI/CD Autofix toggle is ON
  • Check workflow file PR was merged
  • Confirm workflow trigger is pull_request (not push, schedule, etc.)
  • Check tier limits not exceeded
  • Verify workflow is not in exclude list
  • Check cooldown period not active

Non-Reproducible Failures

Symptom: Frequently seeing "Could not reproduce failure" comments

Solutions:

  • Investigate test stability (flaky tests)
  • Check for race conditions in test suite
  • Review external service dependencies (APIs, databases)
  • Consider adding retries to flaky tests
  • Use fixed timestamps (a controlled clock) and fixed random seeds for time-based or randomized logic
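
One common way to stabilize time-based and randomized tests is to pass the clock and the seed in explicitly. The example below is a generic sketch, not tied to any particular project; is_expired and sample_ids are made-up functions used only to show the pattern.

```python
import random
from datetime import datetime, timezone
from typing import Callable


def is_expired(expires_at: datetime, now: Callable[[], datetime]) -> bool:
    """Time-based logic that receives the clock instead of calling datetime.now()."""
    return now() >= expires_at


def sample_ids(seed: int, count: int = 3) -> list[int]:
    """Randomized logic driven by a local, explicitly seeded generator."""
    rng = random.Random(seed)
    return [rng.randint(1, 100) for _ in range(count)]


# In a test, pin the clock and the seed so every run sees the same inputs.
FIXED_NOW = datetime(2025, 1, 5, tzinfo=timezone.utc)
assert is_expired(datetime(2025, 1, 1, tzinfo=timezone.utc), lambda: FIXED_NOW)
assert sample_ids(42) == sample_ids(42)
```

Because neither function reads the wall clock or global random state, the same test produces the same result on a developer machine and in CI.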

Fix PRs Not Resolving Issues

Symptom: Fix validation workflows consistently fail

Solutions:

  • Review fix PR content - may need manual intervention
  • Check if issue requires configuration outside code (GitHub settings, secrets, environment variables)
  • Increase timeout for complex issues
  • Provide more context in workflow failure logs
  • Consider manual fix for systemic architectural issues

Budget or Timeout Limits Hit

Symptom: Frequent partial fix PRs with budget/timeout errors

Solutions:

  • Increase limits in repository settings
  • Simplify test suite if possible
  • Check for infinite loops or recursive issues
  • Review if issue requires human architectural decisions
  • Consider excluding problematic workflows

Getting Help

  • Documentation: Visit /docs for more guides
  • Support: Contact support@repobird.com
  • GitHub Issues: Report bugs at our GitHub repository
  • Community: Join our Discord for discussions and tips

Limitations

Workflow Triggers Not Supported (by design):

The following workflow trigger types are intentionally not supported to ensure safety and maintain proper review processes:

  • Push events to main/develop/production branches
  • Scheduled workflows (cron jobs)
  • Manual workflow triggers (workflow_dispatch)
  • Deployment or release workflows

These triggers require manual intervention due to higher risk, production impact, and need for human review.
