feat(titan-plugin-git): add get_diff_stat step for structured diff statistics #89

@finxo

Summary

This issue proposes adding a new get_diff_stat step to the titan-plugin-git plugin. The step runs git diff --stat between a base branch and a head branch and parses the output into structured metadata (e.g., the list of changed files and the insertion and deletion counts).

Detailed Description

Currently, the GitClient in titan-plugin-git has methods for full diffs (get_diff, get_branch_diff), but their output can be token-heavy in AI workflows or automations that only need an overview of the changes. Users can run git diff --stat manually from the CLI, but exposing it as a formal workflow step makes the result available to other steps as structured metadata.

The new get_diff_stat step should:

  1. Accept optional base_branch (defaulting to the configured main branch) and head_branch (defaulting to HEAD).
  2. Run git diff base...head --stat.
  3. Parse the raw output into a structured format.
  4. Return a WorkflowResult containing metadata useful for downstream steps (e.g., validating PR size, generating reports, or conditionally triggering AI reviews).
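As a rough illustration of step 2, the sketch below builds and runs the command with the standard library instead of the plugin's GitClient. The names build_diff_stat_command and run_diff_stat are hypothetical and not part of titan-plugin-git. The three-dot range compares head against the merge base of base and head, so only changes made on the head side are counted.

```python
import subprocess
from typing import List, Optional


def build_diff_stat_command(base: str, head: str = "HEAD") -> List[str]:
    """Build the argv for `git diff --stat` over the range base...head."""
    # Three-dot range: diff from the merge base of base and head up to head.
    return ["git", "diff", "--stat", f"{base}...{head}"]


def run_diff_stat(base: str, head: str = "HEAD", cwd: Optional[str] = None) -> str:
    """Run the command and return its raw stdout.

    Raises subprocess.CalledProcessError if git exits with a non-zero status
    (for example, when a branch name does not exist).
    """
    result = subprocess.run(
        build_diff_stat_command(base, head),
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Keeping the command construction separate from its execution makes the defaulting logic easy to unit test without a repository.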

Proposed Metadata Structure:

  • diff_stat: Raw output string.
  • changed_files: List of file paths.
  • files_changed: Integer count.
  • total_insertions: Integer count.
  • total_deletions: Integer count.
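One possible way to pin this structure down is a typed dictionary. The class name DiffStatMetadata is an assumption made for illustration; the issue only specifies the keys and their types.

```python
from typing import List, TypedDict


class DiffStatMetadata(TypedDict):
    """Hypothetical type for the metadata listed above."""

    diff_stat: str            # Raw output of git diff --stat
    changed_files: List[str]  # Paths as they appear in the stat output
    files_changed: int
    total_insertions: int
    total_deletions: int


# Illustrative values only, not taken from a real repository.
example: DiffStatMetadata = {
    "diff_stat": (
        " src/app.py | 4 ++--\n"
        " 1 file changed, 2 insertions(+), 2 deletions(-)\n"
    ),
    "changed_files": ["src/app.py"],
    "files_changed": 1,
    "total_insertions": 2,
    "total_deletions": 2,
}
```

A type like this would let downstream steps rely on key names being checked by a type checker rather than discovered at runtime.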

Code Snippets

Proposed implementation for plugins/titan-plugin-git/titan_plugin_git/steps/diff_stat_step.py:

# plugins/titan-plugin-git/titan_plugin_git/steps/diff_stat_step.py
def get_diff_stat(ctx: WorkflowContext) -> WorkflowResult:
    """
    Get diff statistics between branches.

    Params:
        base_branch: Base branch (default: main_branch from config)
        head_branch: Head branch (default: HEAD)

    Returns metadata:
        diff_stat: Raw output of git diff --stat
        changed_files: List of changed file paths
        files_changed: int
        total_insertions: int
        total_deletions: int
    """
    base = ctx.get("base_branch", ctx.git.main_branch)
    head = ctx.get("head_branch", "HEAD")

    # Get stat output for the three-dot range (changes on head since the merge base)
    stat_output = ctx.git._run_command([
        "git", "diff", f"{base}...{head}", "--stat"
    ])

    # Per-file lines look like " path/to/file | 12 +++--"; the summary line has no "|"
    changed_files = []
    for line in stat_output.splitlines():
        if "|" not in line:
            continue  # Skip the summary line and any blank lines
        file_path = line.rsplit("|", 1)[0].strip()
        changed_files.append(file_path)

    # Parse summary line: "5 files changed, 170 insertions(+), 14 deletions(-)"
    # ... parsing logic ...

    return Success("Diff stat retrieved", metadata={
        "diff_stat": stat_output,
        "changed_files": changed_files,
        "files_changed": len(changed_files),
        # ... more structured data
    })
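The summary-line parsing that the snippet leaves as a placeholder could look roughly like the sketch below. It is a standalone function (the name parse_diff_stat and the regular expression are assumptions, not existing plugin code) that takes the raw --stat text and returns the proposed metadata keys. Git omits the insertions or deletions clause when the count is zero, so both are treated as optional.

```python
import re
from typing import Any, Dict, List

# Matches e.g. "5 files changed, 170 insertions(+), 14 deletions(-)".
# Either count clause may be absent when that count is zero.
SUMMARY_RE = re.compile(
    r"(?P<files>\d+) files? changed"
    r"(?:, (?P<ins>\d+) insertions?\(\+\))?"
    r"(?:, (?P<dels>\d+) deletions?\(-\))?"
)


def parse_diff_stat(stat_output: str) -> Dict[str, Any]:
    """Parse raw `git diff --stat` output into the proposed metadata keys."""
    changed_files: List[str] = []
    insertions = 0
    deletions = 0
    for line in stat_output.splitlines():
        if "|" in line:
            # Per-file line: " path/to/file | 12 +++--" (or "| Bin ..." for binaries).
            changed_files.append(line.rsplit("|", 1)[0].strip())
            continue
        match = SUMMARY_RE.search(line)
        if match:
            insertions = int(match.group("ins") or 0)
            deletions = int(match.group("dels") or 0)
    return {
        "diff_stat": stat_output,
        "changed_files": changed_files,
        "files_changed": len(changed_files),
        "total_insertions": insertions,
        "total_deletions": deletions,
    }


sample = (
    " src/app.py     | 10 +++++-----\n"
    " docs/README.md |  4 ++++\n"
    " 2 files changed, 9 insertions(+), 5 deletions(-)\n"
)
parsed = parse_diff_stat(sample)
```

Two caveats apply to any parser of --stat output: long paths may be abbreviated with a leading "..." to fit the display width, and renames appear in an "old => new" form, so changed_files is not guaranteed to hold exact paths. If exact paths matter, git diff --numstat prints tab-separated added, deleted, and path fields and may be easier to parse.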

Example usage in a workflow:

# Example: validate that a PR changes no more than 20 files
steps:
  - id: check_diff
    plugin: git
    step: get_diff_stat

  - id: validate_size
    plugin: project
    step: validate_pr_size
    params:
      max_files: 20
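The validate_pr_size step referenced in this workflow is not defined in the issue. A minimal sketch of the check it might perform, assuming it receives the metadata produced by get_diff_stat as a plain mapping, is shown here; how metadata actually flows between steps depends on the workflow engine and is not covered.

```python
from typing import Any, Mapping, Tuple


def validate_pr_size(metadata: Mapping[str, Any], max_files: int = 20) -> Tuple[bool, str]:
    """Hypothetical check: pass only if files_changed does not exceed max_files."""
    files_changed = int(metadata.get("files_changed", 0))
    if files_changed > max_files:
        return False, f"PR changes {files_changed} files (limit: {max_files})"
    return True, f"PR size OK: {files_changed} files changed"


ok, message = validate_pr_size({"files_changed": 5}, max_files=20)
```

Because the check only reads an integer, it never needs the full diff, which is the token saving the proposal is after.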

Labels: feature (New feature or functionality)
