Skip to content

RHOAIENG-50709: add ADR for CI/CD on firewalled clusters#793

Open
ktdreyer wants to merge 1 commit intomainfrom
kdreyer/github-ci-runner-adr
Open

RHOAIENG-50709: add ADR for CI/CD on firewalled clusters#793
ktdreyer wants to merge 1 commit intomainfrom
kdreyer/github-ci-runner-adr

Conversation

@ktdreyer
Copy link
Contributor

@ktdreyer ktdreyer commented Mar 4, 2026

Present a phased approach for running GitHub Actions inside firewalled OpenShift.

Phase 1 deploys a standalone runner with no CRDs or cluster-level permissions.

Phase 2 upgrades to Actions Runner Controller (ARC) if IT approves CRDs.

https://issues.redhat.com/browse/RHOAIENG-50709

@ktdreyer ktdreyer requested a review from kahowell March 4, 2026 16:51
@ktdreyer ktdreyer changed the title add ADR for CI/CD on firewalled clusters RHOAIENG-51655: add ADR for CI/CD on firewalled clusters Mar 4, 2026
@ktdreyer ktdreyer changed the title RHOAIENG-51655: add ADR for CI/CD on firewalled clusters RHOAIENG-50709: add ADR for CI/CD on firewalled clusters Mar 4, 2026
@github-actions

This comment has been minimized.

@ktdreyer ktdreyer force-pushed the kdreyer/github-ci-runner-adr branch 2 times, most recently from d761513 to 27a7abe Compare March 4, 2026 17:06
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@ambient-code ambient-code bot added this to the Merge Queue milestone Mar 4, 2026
@ktdreyer
Copy link
Contributor Author

ktdreyer commented Mar 4, 2026

Re: "Missing Status field", I'm addressing that in #797

@ambient-code ambient-code bot removed this from the Merge Queue milestone Mar 5, 2026
@ambient-code
Copy link

ambient-code bot commented Mar 5, 2026

Merge Readiness — Blockers Found

Check Status Detail
CI pass
Merge conflicts pass
Review comments FAIL 1 inline threads on docs/internal/adr/0007-cicd-deployment-strategy.md
Jira hygiene pass
Staleness pass
Diff overlap risk pass

This comment is auto-generated by the PR Overview workflow and will be updated when the PR changes.

@ktdreyer ktdreyer force-pushed the kdreyer/github-ci-runner-adr branch from 27a7abe to ef47bf7 Compare March 5, 2026 21:03
@github-actions
Copy link
Contributor

github-actions bot commented Mar 5, 2026

Test comment

@ktdreyer ktdreyer force-pushed the kdreyer/github-ci-runner-adr branch from ef47bf7 to 974cfc1 Compare March 5, 2026 21:06
@github-actions
Copy link
Contributor

github-actions bot commented Mar 5, 2026

PR 793 review posted separately - see below

@github-actions
Copy link
Contributor

github-actions bot commented Mar 5, 2026

Claude Code Review

Summary

PR 793 adds ADR-0007 documenting the CI/CD deployment strategy for firewalled OpenShift clusters. The content is well-structured and clearly explains the phased approach (standalone runner to ARC). However, the document deviates from the project ADR template and README-defined workflow in several ways that reduce navigability and completeness.


Issues by Severity

Blocker Issues: None.

Critical Issues: None.


Major Issues

M1. Missing Status fielddocs/internal/adr/0007-cicd-deployment-strategy.md (lines 1–6)

The ADR README defines an explicit lifecycle: Proposed → Accepted → Deprecated → Superseded. Every accepted ADR (0001–0006) includes a Status line in its header. ADR-0007 omits this field entirely, making it impossible to determine whether this is a proposal or a ratified decision.

Fix: add **Status:** Proposed (or Accepted if already ratified) to the header block.

M2. Considered Options placed after Decisiondocs/internal/adr/0007-cicd-deployment-strategy.md (lines 55–76 vs. 16–54)

The template and all accepted ADRs (0001–0005) list Considered Options before the Decision Outcome so readers understand the option space before seeing the rationale. Here the decision is presented first and rejected options follow — a structural inversion of the established pattern.

Fix: move the Considered Options section above Decision.

M3. README index not updateddocs/internal/adr/README.md

The README table ends at ADR-0005 (ADR-0006 is also absent). Per the ADR workflow defined in the README, the table should be updated when an ADR is proposed.

Fix: append both ADR-0006 (Ambient Runner SDK Architecture, 2026-02-10) and ADR-0007 (this ADR, 2026-03-04) to the table.


Minor Issues

m4. Missing Technical Story fielddocs/internal/adr/0007-cicd-deployment-strategy.md (lines 1–6)

The template requires a Technical Story field. RHOAIENG-50709 is in the PR description but absent from the ADR header — the link is lost after merge.

Fix: add **Technical Story:** https://issues.redhat.com/browse/RHOAIENG-50709

m5. Missing Decision Drivers sectiondocs/internal/adr/0007-cicd-deployment-strategy.md

The two driving properties (Immediacy and Visibility) are described in the Context prose but not extracted into a dedicated Decision Drivers section, making it harder to evaluate future alternatives against the same criteria.

m6. Consequences is a top-level sectiondocs/internal/adr/0007-cicd-deployment-strategy.md (line 62)

Per the template, Consequences (Positive/Negative/Risks) should be a subsection under Decision Outcome, not a top-level sibling.

m7. Missing Implementation Notes and Validation sectionsdocs/internal/adr/0007-cicd-deployment-strategy.md

These template sections are valuable post-rollout. Add them with [TBD] placeholders now so they get filled in after the runner is deployed.


Positive Highlights

  • Content is excellent. The phased approach (Phase 1 to Phase 2 gated on IT approval) is clearly motivated and trade-offs are honestly assessed. The standalone runner SPOF risk and ARC concurrency problem are both surfaced proactively.
  • The concurrency key recommendation for serializing deploy jobs is a concrete, actionable mitigation — not just an abstract concern flagged and left open.
  • Rejected options are documented with reasons. ArgoCD and VPN/SSH are dismissed with explicit rationale (developer UX, firewall complexity), which is exactly what ADRs are for.
  • Risks of Remaining on Phase 1 section is a thoughtful addition — useful for future decision-makers evaluating whether to invest in Phase 2.

Recommendations

  1. Add **Status:** Proposed to the header (Major — fixes ADR lifecycle tracking gap).
  2. Move Considered Options above Decision (Major — restores logical reading order per template).
  3. Update docs/internal/adr/README.md to include ADR-0006 and ADR-0007 (Major — keeps the index accurate).
  4. Add **Technical Story:** with the RHOAIENG Jira link (Minor — preserves traceability post-merge).
  5. Add stub Implementation Notes and Validation sections with [TBD] placeholders (Minor — follows template).

Reviewed by Claude Code using repository standards from .claude/context/ and .claude/patterns/

Present a phased approach for running GitHub Actions inside firewalled
OpenShift.

Phase 1 deploys a standalone runner with no CRDs or cluster-level
permissions.

Phase 2 replaces that with an upgrade to Actions Runner Controller
(ARC), if IT approves CRDs.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@ktdreyer ktdreyer force-pushed the kdreyer/github-ci-runner-adr branch from 974cfc1 to 59f3703 Compare March 5, 2026 21:08
@github-actions
Copy link
Contributor

github-actions bot commented Mar 5, 2026

Summary

PR adds ADR-0007, a phased CI/CD deployment strategy for firewalled OpenShift clusters using self-hosted GitHub Actions runners. The architectural reasoning is sound and the trade-off analysis is honest. The document is well-written, but deviates from the project ADR template structure and has a security guidance gap around credential storage.

No blocker or critical issues.

Major Issues:

  1. Significant deviations from ADR template structure - docs/internal/adr/template.md requires sections that are entirely missing: Technical Story (should link to RHOAIENG-50709, present in PR body but absent from the doc), Decision Drivers, Implementation Notes, and Validation. ADR-0006 also includes a Status field already flagged by @MPRIC and tracked in issue 797.

  2. No guidance on GitHub App private key storage - the document mentions GitHub App credentials (App ID + private key) but gives no guidance on how the private key will be stored. Per this repo's security standards, secrets must be stored in Kubernetes Secrets - not baked into the Deployment manifest, ConfigMaps, or source control. Suggested addition: "The GitHub App private key is stored as a Kubernetes Secret in the runner namespace and mounted into the runner pod. It is never checked into source control or embedded in image layers."

  3. Serial execution framed as both a Pro and a Risk without disambiguation - Phase 1 Pros lists "Serialized deploys" as a positive, while Risks of Remaining on Phase 1 lists "No concurrency" as a negative. These describe the same property. A brief note - e.g. "This is the same serialization listed as a Pro in Phase 1; the risk here is throughput, not correctness" - would remove the ambiguity.

Minor Issues:

  1. Section heading References vs Links - the template and ADR-0006 use Links. Minor inconsistency worth aligning.

  2. "with Gemini 3 Pro" in Authors field - no other ADR credits an AI tool as co-author. Worth a deliberate team decision.

  3. Monitoring gap without a follow-up action - "We need monitoring to detect this" has no owner, issue reference, or timeline.

Positive Highlights:

  • Honest risk documentation. The "Risks of Remaining on Phase 1" section is unusually candid - single point of failure, manual recovery, no HA - making this a genuinely useful decision record.
  • Phase gate is well-structured. The "Phase 2 depends on IT approving CRDs" condition is explicit and the fallback path is clear.
  • The concurrency: YAML example is exactly right. cancel-in-progress: false and framing it as "a safe starting point, not a long-term solution" prevents a footgun for whoever implements Phase 2.
  • Options rejected with clear reasoning. Both rejected options (ArgoCD, VPN overlay) have concrete reasons tied to the stated requirements.

Recommendations:

  1. Add missing template sections (Decision Drivers, Technical Story, Status, Implementation Notes, Validation). Coordinate the Status field with issue 797.
  2. Add a sentence on GitHub App private key storage referencing Kubernetes Secrets.
  3. Disambiguate the serial execution framing in the Risks section.
  4. Rename the References section to Links for template consistency.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 5, 2026

Claude Code Review - PR 793 adds ADR-0007 documenting a phased CI/CD deployment strategy for firewalled OpenShift clusters.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 5, 2026

Full review below - Part 1 of 3: BLOCKER ISSUES: None. CRITICAL ISSUES: None. MAJOR ISSUES: 1. Missing secure credential storage guidance (docs/internal/adr/0007-cicd-deployment-strategy.md, line 31). The document describes GitHub App credentials (App ID + private key) but does not specify how they must be stored. In Kubernetes context they must be stored as Kubernetes Secrets. Without this guidance an implementer could store the key insecurely (ConfigMap, baked into image, etc). Suggested addition: The GitHub App private key and App ID MUST be stored as a Kubernetes Secret and mounted into the runner pod. They must NOT be stored in a ConfigMap, baked into the container image, or committed to source control. Violates: security-standards.md.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 5, 2026

Part 2 of 3 - MINOR ISSUES: 1. Missing Status field (line 3): Every other ADR (0001-0006) includes Status: Accepted after the title block. Suggested: add Status: Accepted. 2. Missing Technical Story link: RHOAIENG-50709 is in the PR description but not the document. Add: Technical Story: https://issues.redhat.com/browse/RHOAIENG-50709. 3. Temporal language (line 23): 'This week I have already moved forward with trialing Phase 1' will confuse future readers. Replace with a date-anchored statement e.g. 'As of 2026-03-04, Phase 1 has been trialled in the target environment.' 4. Inconsistent bullet formatting (lines 97-98): Two bullets have a leading space before the asterisk, unlike every other bullet. 5. AI co-authorship non-standard (line 5): Authors: Ken Dreyer (with Gemini 3 Pro) - no other ADR attributes an AI assistant. Consider removing the parenthetical.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 5, 2026

Part 3 of 3 - POSITIVE HIGHLIGHTS: Pragmatic phased approach: Phase 1 deploys with zero IT dependencies - exactly the right call. Proactive risk identification: The 'Risks of Remaining on Phase 1' section goes beyond template requirements and candidly enumerates single-point-of-failure, no-alerting, and manual recovery concerns - genuinely useful for future operators. Concurrency addressed: Proactively identifies that ARC reintroduces concurrent deploy races and proposes the GitHub Actions concurrency key, with honest acknowledgment it negates most of ARC's benefit. Security-conscious design: Outbound-only model, namespace-scoped ServiceAccount, and short-lived GitHub App tokens align with project security standards. Clear option rejection: ArgoCD and VPN options rejected with specific technical reasons. RECOMMENDATIONS: 1. (Major) Add guidance that GitHub App credentials must be stored as Kubernetes Secrets. 2. (Minor) Add Status and Technical Story fields to the header. 3. (Minor) Replace 'This week...' with a date-anchored statement. 4. (Minor) Fix leading-space bullets on lines 97-98. 5. (Minor) Consider removing '(with Gemini 3 Pro)' from Authors. --- Reviewed with Claude Code (claude-sonnet-4-6) using amber.review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants