Senior Product Manager at Microsoft · Excel Product Group
Build → prototype → eval → production · Hands-on with RAG, agents, triage pipelines
Outcome: 350 PB/day data platforms · 6,000+ issues triaged · 22 Copilot skills in daily production
I build AI agents, triage pipelines, and analytics engines for real workflows — then open-source them.
🌐 kustonaut.github.io/portfolio — live demos, architecture deep-dives, and everything I ship.
Build → AI agents, eval systems, PM tooling
Think → Rules-first → LLM-fallback → eval → iterate
Ship → 3 production tools, 22 Copilot skills, 50+ automations
I own the Add-ins & Copilot extensibility platform in Excel — the surface where third-party developers and ISVs plug into Office. On the side, I build open-source PM tooling that I use every day in production:
| Project | What it does | Why this approach | What I'd improve |
|---|---|---|---|
| brain-os | AI-powered daily OS for PMs — 22 Copilot skills, daily intelligence pipeline, Command Center dashboard | Local-first, config-driven. Every PM tool should be a VS Code skill, not a SaaS subscription. | Multi-model routing (GPT-4o for classification, Claude for synthesis), cross-PM onboarding wizard |
| issue-sentinel | AI issue triage: classify, prioritize, route. Zero manual effort. | Rules-first catches 60% at zero cost. The LLM pass classifies a further 35%. An eval suite tracks accuracy over time. | Fine-tuned classifier to push rules-first coverage to 80%, A/B eval framework for prompt variants |
| github-issue-analytics | 13-metric scorecard from thousands of issues — fix rate, DSAT proxy, SHS, area heatmaps | Because at scale, you need data — not opinions. ETL → classify → score → dashboard. | Streaming ingestion for real-time alerts, anomaly detection on metric trends |
Each repo follows the same pattern: Problem → Architecture → Why this approach → Tradeoffs → Demo → What I'd improve.
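The ETL → classify → score flow in the github-issue-analytics row can be illustrated with one scorecard metric. This is a minimal sketch, not the repo's code: the issue fields (`area`, `state`, `resolution`) and the definition of fix rate (issues closed as fixed divided by all issues in the area) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def fix_rate_by_area(issues: Iterable[Mapping[str, str]]) -> Dict[str, float]:
    """Return the share of issues closed as fixed, grouped by area.

    Assumed (hypothetical) issue shape:
    {"area": ..., "state": "open" | "closed", "resolution": ...}
    """
    totals: Dict[str, int] = defaultdict(int)
    fixed: Dict[str, int] = defaultdict(int)
    for issue in issues:
        area = issue["area"]
        totals[area] += 1
        # Count only issues that were closed AND resolved as fixed.
        if issue["state"] == "closed" and issue.get("resolution") == "fixed":
            fixed[area] += 1
    return {area: fixed[area] / totals[area] for area in totals}


# Hypothetical sample data, not real issue counts.
SAMPLE_ISSUES = [
    {"area": "add-ins", "state": "closed", "resolution": "fixed"},
    {"area": "add-ins", "state": "open"},
    {"area": "copilot", "state": "closed", "resolution": "fixed"},
]
```

Calling `fix_rate_by_area(SAMPLE_ISSUES)` gives one ratio per area, which is the kind of row a scorecard or heatmap could be built from.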
- Retrieval-augmented triage pipelines (rule-based + LLM hybrid classification)
- Agent orchestration with MCP servers and GitHub Actions
- Prompt optimization — temperature routing, CoT/Step-Back, few-shot tuning
- Eval frameworks — golden test suites, decision logging, accuracy tracking
- PM signal aggregation — email, Teams, ADO, GitHub → daily intelligence brief
- Developer platform strategy — Add-in lifecycle, Copilot extensibility, DLP integrations
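The "golden test suite" and "accuracy tracking" items above can be sketched in a few lines. This is an illustrative example under stated assumptions: the golden cases, the labels, and `toy_classifier` are hypothetical stand-ins, not taken from any of the repos.

```python
from typing import Callable, Dict, Iterable, List, Tuple

GoldenCase = Tuple[str, str]  # (issue text, expected label)

# Hypothetical golden cases with hand-assigned labels.
GOLDEN: List[GoldenCase] = [
    ("Excel crashes when I load the add-in", "bug"),
    ("Please add dark mode", "feature-request"),
    ("How do I register a custom function?", "question"),
]


def evaluate(classify: Callable[[str], str], golden: Iterable[GoldenCase]) -> Dict:
    """Run a classifier over the golden set and report accuracy plus misses."""
    total = 0
    failures = []
    for text, expected in golden:
        total += 1
        predicted = classify(text)
        if predicted != expected:
            failures.append(
                {"text": text, "expected": expected, "predicted": predicted}
            )
    accuracy = (total - len(failures)) / total if total else 0.0
    return {"total": total, "accuracy": accuracy, "failures": failures}


def toy_classifier(text: str) -> str:
    """Deliberately naive stand-in so the report contains a failure."""
    return "bug" if "crash" in text.lower() else "question"


report = evaluate(toy_classifier, GOLDEN)
```

Storing each `report` alongside a timestamp or commit hash would be one simple way to track accuracy over time; the failure list shows which cases regressed.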
Issue arrives
│
├─ Rules-first pass (keyword matching, YAML config)
│ └─ 60% classified → zero latency, zero cost
│
├─ LLM pass (few-shot, low temperature)
│ └─ 35% classified → handles ambiguity
│
├─ Urgency scoring (regression signals, escalation patterns)
│
├─ Sentiment analysis (frustrated → neutral → positive)
│
└─ Decision logged → JSONL audit trail → tune rules over time
Rules first. LLM second. Eval always.
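The flow above can be sketched roughly as follows. This is a simplified illustration, not issue-sentinel's implementation: the keyword rules are inlined as a dict (the diagram says the real rules live in YAML config), `llm_classify` is a hypothetical callable standing in for the few-shot LLM pass, and the urgency and sentiment steps are omitted.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Optional

# Hypothetical keyword rules; assumed to mirror what a YAML config might hold.
RULES = {
    "bug": ["crash", "exception", "regression"],
    "feature-request": ["feature request", "please add"],
    "question": ["how do i", "how to"],
}


@dataclass
class Decision:
    issue_id: int
    label: str
    source: str  # "rules", "llm", or "unclassified"


def classify_by_rules(text: str) -> Optional[str]:
    """Return the first label whose keywords appear in the text, else None."""
    lowered = text.lower()
    for label, keywords in RULES.items():
        if any(keyword in lowered for keyword in keywords):
            return label
    return None


def triage(
    issue_id: int,
    text: str,
    llm_classify: Callable[[str], Optional[str]],
    log_path: Optional[str] = None,
) -> Decision:
    """Try the rules first; call the LLM only when no rule matches."""
    label = classify_by_rules(text)
    source = "rules"
    if label is None:
        label = llm_classify(text)
        source = "llm" if label else "unclassified"
        label = label or "needs-human"
    decision = Decision(issue_id, label, source)
    if log_path:
        # Append one JSON object per line (JSONL) as an audit trail.
        with open(log_path, "a", encoding="utf-8") as log_file:
            log_file.write(json.dumps(asdict(decision)) + "\n")
    return decision
```

Because each logged decision records its `source`, the audit trail shows which issues fell through to the LLM, and those are natural candidates for new keyword rules.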
- Brain OS: daily intelligence pipeline
- Pipeline: 9 automated steps, zero manual effort
- 22 Skills: Morning OS, career coach, incident investigator
- Architecture: Rules-first → LLM → eval loop
🎮 Live demos: Brain OS · Issue Sentinel · GitHub Issue Analytics
| Metric | Detail |
|---|---|
| 🏢 Microsoft | 11+ years: Consulting → Azure Data Explorer → Excel Product Group |
| 📊 Data scale | 350 PB/day (1P connectors), 20x OSS growth (40 PB/day) |
| 🎯 Issues triaged | 6,000+ across 5 charter areas |
| 🤖 Copilot skills | 22 production-grade PM skills |
| ⚙️ Automations | 50+ daily pipeline scripts |
| ⭐ 365daysofADX | 37 stars — 365-day public KQL challenge |
| 🧪 Test coverage | 23/23 tests passing across all repos |



