feat: add ambient-control-plane with Kubernetes reconcilers and gRPC watch #815
markturansky wants to merge 8 commits into main
Add SessionWatcher for real-time session event streaming via gRPC, session watch types, and a watch example. Adds google.golang.org/grpc and ambient-api-server proto dependencies. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
- Fix defer cancel() blocker: store timeoutCancel on watcher, called in Stop()
- Replace deprecated grpc.DialContext/WithBlock with grpc.NewClient
- Replace fragile string trimming in deriveGRPCAddress with url.Parse + net.JoinHostPort
- Extract hardcoded port 4434 to grpcDefaultPort constant
- Run go mod tidy to correct indirect markers on direct deps

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
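For readers who have not opened the diff, the url.Parse + net.JoinHostPort change might look roughly like the sketch below. It assumes deriveGRPCAddress takes the API server's HTTP base URL and swaps its port for grpcDefaultPort; the real signature and error handling are not shown in this conversation, so treat both as hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// grpcDefaultPort uses the value named in the commit message (4434).
const grpcDefaultPort = "4434"

// deriveGRPCAddress is a hypothetical reconstruction, not the PR's code.
// It parses the HTTP base URL and returns "host:port" for the gRPC endpoint,
// discarding any port and path the base URL carried.
func deriveGRPCAddress(baseURL string) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", fmt.Errorf("parse base URL %q: %w", baseURL, err)
	}
	host := u.Hostname() // strips the port and IPv6 brackets
	if host == "" {
		return "", fmt.Errorf("base URL %q has no host", baseURL)
	}
	// JoinHostPort re-adds brackets for IPv6 literals.
	return net.JoinHostPort(host, grpcDefaultPort), nil
}

func main() {
	addr, err := deriveGRPCAddress("https://api.example.com:8000/api")
	if err != nil {
		panic(err)
	}
	fmt.Println(addr)
}
```

Compared with string trimming, this handles IPv6 literals and URLs with paths without special cases, and it fails loudly on input that has no scheme or host.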
Introduces a Go-based control plane that watches the ambient-api-server and reconciles state into Kubernetes, following the controller pattern. Supports three modes: kube (K8s reconciliation), local (direct process spawning), and test (tally only).

Key features:
- gRPC watch streams for real-time event processing
- Multi-mode operation with environment-based configuration
- Session, Project, and ProjectSettings reconcilers
- Local mode with AG-UI proxy and process management
- Comprehensive test coverage and documentation
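As an illustration of the environment-based mode selection, here is a minimal sketch. The MODE variable name comes from the review comments in this thread; the Mode type, the parseMode helper, and the choice of kube as the default for an unset variable are assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
)

// Mode is a hypothetical type for the three documented operating modes.
type Mode string

const (
	ModeKube  Mode = "kube"  // K8s reconciliation
	ModeLocal Mode = "local" // direct process spawning
	ModeTest  Mode = "test"  // tally only
)

// parseMode validates the raw MODE value. Defaulting an empty value to
// kube is an assumption, not something the PR states.
func parseMode(raw string) (Mode, error) {
	switch m := Mode(raw); m {
	case ModeKube, ModeLocal, ModeTest:
		return m, nil
	case "":
		return ModeKube, nil
	default:
		return "", fmt.Errorf("invalid MODE %q: want kube, local, or test", raw)
	}
}

func main() {
	mode, err := parseMode(os.Getenv("MODE"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("running in mode:", mode)
}
```

Keeping all three values in one switch makes it harder for the validator and the documentation to drift apart.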
7817469 to 377176e
test
Claude Code Review

Summary

This PR adds the ambient-control-plane component, a new Go service that watches ambient-api-server via gRPC streams and reconciles desired state into Kubernetes CRDs. The architecture is well-designed: informer + cache + dispatcher is sound, exponential-backoff reconnection is correct, and write-back echo detection via UpdatedAt timestamps is elegant. The kubeclient and tally unit tests are thorough. The new Deployment manifest ships without a container SecurityContext, the core reconciler logic lacks unit tests, and the Makefile silently removes image-tag override support.

Issues by Severity

Blocker Issues: None.

Critical Issues
1. Deployment manifest missing container SecurityContext

Major Issues
2. Core reconciler logic has no unit tests
3. MODE validation rejects documented local mode
4. Makefile removes IMAGE_TAG override support
5. PostgreSQL pod-level SecurityContext removed without justification

Minor Issues
6. Duplicate deploy script added without extension
7. WatchEvent.Object uses any - component standard violation
8. Documentation import path mismatch
9. Direct type assertion in test code

Positive Highlights
Recommendations (priority order)
Summary

This PR introduces the

Issues by Severity

Blocker Issues
1. Missing container SecurityContext on ambient-control-plane Deployment

    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]

Critical Issues
2.
3.

Major Issues
4. Removal of IMAGE_TAG override from root Makefile
5. Pod-level SecurityContext removed from PostgreSQL Deployment
6. Direct type assertion in test code
7. New component not wired into

Minor Issues
8. Silent error discard in buildRestConfig
9. No liveness/readiness probes on Deployment
10. Trailing blank lines in RBAC kustomization

Positive Highlights

Recommendations

Review generated by Claude Code (claude-sonnet-4-6) using amber.review
SDK go.mod updated from 1.21 to 1.24.0 (required by gRPC deps); sync CLI module to match. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
…dcoding Generator now parses the base path (e.g. /api/ambient/v1) from the spec's path keys instead of hardcoding /api/ambient-api-server/v1. Updates model, parser, and all three language templates (Go, Python, TypeScript). Regenerates all SDK output with the correct /api/ambient/v1 base path. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
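One way to derive a base path from the spec's path keys is to take the leading path segments that every key shares. The sketch below assumes that approach; the generator's actual parsing logic is not shown here, and basePath is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// basePath returns the longest run of leading path segments shared by all
// keys, e.g. "/api/ambient/v1". Hypothetical helper, not the generator's code.
func basePath(paths []string) string {
	if len(paths) == 0 {
		return ""
	}
	common := strings.Split(strings.Trim(paths[0], "/"), "/")
	for _, p := range paths[1:] {
		segs := strings.Split(strings.Trim(p, "/"), "/")
		n := 0
		for n < len(common) && n < len(segs) && common[n] == segs[n] {
			n++
		}
		common = common[:n] // keep only the segments still shared
	}
	if len(common) == 0 {
		return ""
	}
	return "/" + strings.Join(common, "/")
}

func main() {
	fmt.Println(basePath([]string{
		"/api/ambient/v1/sessions",
		"/api/ambient/v1/sessions/{id}",
		"/api/ambient/v1/projects",
	}))
}
```

A limitation of this approach: with a single path key, or when every key shares a resource segment, the result includes that resource name, so a real generator would need an extra rule such as stopping at the version segment.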
…ace condition
- Add handlePhaseTransition() method to detect "Stopping" phase and set desired-phase=Stopped annotation
- Remove phase write-back in crStatusToStatusPatch() to prevent race conditions
- Add auto-creation of ambient-runner-secrets in operator namespace handler
- Bridge gap between API server /stop endpoint and Kubernetes operator pod deletion

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
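The phase-transition idea in this commit can be sketched as follows. The session struct and the exact annotation key are stand-ins for illustration; only the "Stopping" phase and the desired-phase=Stopped pair come from the commit message.

```go
package main

import "fmt"

// desiredPhaseAnnotation is a hypothetical key; the commit message names
// only "desired-phase=Stopped", without any prefix.
const desiredPhaseAnnotation = "desired-phase"

// session is a stand-in for the real session or CR type.
type session struct {
	Phase       string
	Annotations map[string]string
}

// handlePhaseTransition marks a session for shutdown when the API server
// reports "Stopping". It reports whether the annotations changed, so the
// caller can skip a no-op update.
func handlePhaseTransition(s *session) bool {
	if s.Phase != "Stopping" {
		return false
	}
	if s.Annotations[desiredPhaseAnnotation] == "Stopped" {
		return false // already requested; avoid redundant writes
	}
	if s.Annotations == nil {
		s.Annotations = map[string]string{}
	}
	s.Annotations[desiredPhaseAnnotation] = "Stopped"
	return true
}

func main() {
	s := &session{Phase: "Stopping"}
	fmt.Println(handlePhaseTransition(s), s.Annotations)
}
```

Expressing the stop request as an annotation rather than writing the phase back leaves a single writer for the status phase, which is the race this commit describes removing.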
- Add k8sinformer package for watching AgenticSession CRs across namespaces
- Implement HandleK8sCREvent for operator→API status writeback
- Add WatchAgenticSessions and GetDynamicClient to kubeclient
- Update production kustomization to use OpenShift internal registry
- Add Dockerfile.simple for streamlined container builds
- Update Go dependencies (gRPC 1.79.1, OAuth2, OpenTelemetry)
- Fix phase writeback race condition by avoiding "Stopping" phase sync

Enables real-time synchronization between Kubernetes operator state changes and API server database, completing the control plane integration.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Merge Readiness — Blockers Found
Summary
Adds the ambient-control-plane component that watches the ambient-api-server via gRPC streams and reconciles desired state into Kubernetes. This implements the controller pattern similar to kube-controller-manager.
• Initial list-sync + gRPC watch: Performs paginated list calls via Go SDK, then subscribes to gRPC watch streams for real-time updates
• Resource reconcilers: Session, Project, and ProjectSettings reconcilers for Kubernetes CRD management
• Two operating modes: kube (production K8s reconciliation) and test (tally reconcilers for testing)
• Graceful shutdown: Signal handling with context cancellation propagation
• Thread-safe caching: Mutex-protected resource caches with event synthesis
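To make the caching bullet concrete, here is a minimal sketch of a mutex-protected cache that synthesizes events by comparing each incoming item with what it already holds. The Cache type, the event names, and the use of a version string (for example an UpdatedAt timestamp) are assumptions for illustration, not the component's actual types.

```go
package main

import (
	"fmt"
	"sync"
)

// EventType names are hypothetical; the real component may use others.
type EventType string

const (
	Added    EventType = "ADDED"
	Modified EventType = "MODIFIED"
	Deleted  EventType = "DELETED"
)

// Cache maps a resource ID to its last seen version.
type Cache struct {
	mu    sync.Mutex
	items map[string]string
}

func NewCache() *Cache {
	return &Cache{items: make(map[string]string)}
}

// Upsert stores the item and reports which event, if any, it implies.
// An unchanged version yields no event, which suppresses duplicates.
func (c *Cache) Upsert(id, version string) (EventType, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	old, ok := c.items[id]
	if ok && old == version {
		return "", false
	}
	c.items[id] = version
	if !ok {
		return Added, true
	}
	return Modified, true
}

// Delete removes the item and reports Deleted only if it was present.
func (c *Cache) Delete(id string) (EventType, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.items[id]; !ok {
		return "", false
	}
	delete(c.items, id)
	return Deleted, true
}

func main() {
	c := NewCache()
	fmt.Println(c.Upsert("s1", "v1"))
	fmt.Println(c.Upsert("s1", "v2"))
	fmt.Println(c.Delete("s1"))
}
```

The same comparison lets the initial list-sync and the watch stream feed one code path: items from the list show up as Added, and repeats from the stream are dropped.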
Architecture
Test Plan
• make test
• make fmt && make lint (0 issues)
• make binary

🤖 Generated with Claude Code