Description
Community Request
"Might be interesting to run this on some of the official 'how to' and tutorials repos of various vendors such as Anthropic, OpenAI, and Microsoft to capture differences due to updates. If, for nothing else, to stay ahead of hype cycles by influencers and hucksters."
The Problem
Official vendor tutorials change frequently but most developers rely on:
- ❌ Outdated blog posts from influencers
- ❌ YouTube tutorials from 6 months ago
- ❌ Stack Overflow answers using deprecated APIs
- ❌ Medium articles teaching old patterns
Meanwhile, the official repos are updated with:
- ✅ Breaking API changes
- ✅ New recommended patterns
- ✅ Deprecation warnings
- ✅ Security fixes
Result: Developers learn outdated/wrong information from "hucksters" instead of official sources.
The Opportunity
Use Skill Seekers to:
- Scrape official tutorial repos (Anthropic, OpenAI, Microsoft, etc.)
- Track changes over time (re-scrape weekly/monthly)
- Auto-detect API changes (diff between versions)
- Generate "what changed" reports
- Stay ahead of influencer hype cycles with real, official information
What You Can Do TODAY
Skill Seekers already supports scraping these repos with the GitHub scraper:
# Scrape Anthropic's official cookbook
python3 cli/github_scraper.py --repo anthropics/anthropic-cookbook --name anthropic-cookbook
# Scrape OpenAI's official cookbook
python3 cli/github_scraper.py --repo openai/openai-cookbook --name openai-cookbook
# Scrape Microsoft's AI tutorials
python3 cli/github_scraper.py --repo microsoft/generative-ai-for-beginners --name microsoft-ai-tutorials
# Create unified skills (docs + code + issues)
python3 cli/unified_scraper.py --config configs/anthropic_unified.json
Result: Claude skill containing official, current examples and patterns.
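To run the three GitHub scrapes above as one batch, a small wrapper could look like the sketch below. It only uses the `--repo` and `--name` flags shown in the commands above; the `REPOS` table, `build_command`, and `scrape_all` are hypothetical names for illustration and are not part of Skill Seekers. It defaults to a dry run so nothing is executed unless asked.

```python
import subprocess
import sys

# Repo -> skill name pairs copied from the commands above.
REPOS = {
    "anthropics/anthropic-cookbook": "anthropic-cookbook",
    "openai/openai-cookbook": "openai-cookbook",
    "microsoft/generative-ai-for-beginners": "microsoft-ai-tutorials",
}


def build_command(repo: str, name: str) -> list[str]:
    """Build the argv for one github_scraper.py run (flags as shown above)."""
    return [sys.executable, "cli/github_scraper.py", "--repo", repo, "--name", name]


def scrape_all(repos: dict[str, str], dry_run: bool = True) -> list[list[str]]:
    """Collect one scrape command per repo and, unless dry_run, execute them."""
    commands = [build_command(repo, name) for repo, name in repos.items()]
    if not dry_run:
        for cmd in commands:
            # check=True stops the batch on the first failing scrape.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    for cmd in scrape_all(REPOS, dry_run=True):
        print(" ".join(cmd))
```

Run it from the repository root so the relative `cli/github_scraper.py` path resolves, and pass `dry_run=False` to actually scrape.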
Official Repos to Track
AI/ML Vendors
| Vendor | Repository | What to Track |
|---|---|---|
| Anthropic | anthropics/anthropic-cookbook | Claude API examples, prompt engineering |
| Anthropic | anthropics/anthropic-sdk-python | SDK changes, new features |
| OpenAI | openai/openai-cookbook | GPT API examples, best practices |
| OpenAI | openai/openai-python | SDK updates, breaking changes |
| Google | google/generative-ai-docs | Gemini API tutorials |
Cloud Providers
| Vendor | Repository | What to Track |
|---|---|---|
| Microsoft | microsoft/generative-ai-for-beginners | Azure OpenAI patterns |
| Microsoft | Azure/azure-sdk-for-python | Azure SDK changes |
| AWS | aws-samples/aws-genai-llm-chatbot | Bedrock examples |
Frameworks
| Vendor | Repository | What to Track |
|---|---|---|
| Vercel | vercel/ai | AI SDK patterns |
| LangChain | langchain-ai/langchain | LangChain updates |
| LlamaIndex | run-llama/llama_index | RAG pattern changes |
Proposed Enhancement: Automated Version Tracking
Related Roadmap: Category F2 (Tasks F2.1-F2.5) - Incremental Updates
Feature: Change Detection & Diffing
# Initial scrape - baseline
python3 cli/github_scraper.py --repo anthropics/anthropic-cookbook --name anthropic-v1
# Re-scrape after 1 month - detect changes
python3 cli/github_scraper.py --repo anthropics/anthropic-cookbook --name anthropic-v2 --compare anthropic-v1
# Output: Changelog report
Generated Report:
# Anthropic Cookbook Changes (v1 → v2)
## New Files (5)
- examples/streaming-responses.py
- examples/function-calling-2024.py
- examples/prompt-caching.md
## Modified Files (3)
- examples/basic-completion.py
- ❌ Removed: `anthropic.Client()` (deprecated)
- ✅ Added: `anthropic.Anthropic()` (new SDK)
- examples/tool-use.py
- ⚠️ Breaking: `tools` parameter format changed
- 📝 Added: Error handling examples
## Deleted Files (2)
- examples/legacy-api.py (deprecated)
- examples/old-prompt-format.md (outdated)
## API Pattern Changes
1. **Client initialization** - Old `Client()` → New `Anthropic()`
2. **Function calling** - Tool format changed (breaking)
3. **Streaming** - New async patterns recommended
Implementation Checklist
Phase 1: Basic Tracking (F2.1-F2.2)
- Track file modification times (git commit dates)
- Store content checksums (SHA-256 hashes)
- Record scrape timestamp metadata
- Save version snapshots in `output/{name}_versions/`
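A minimal sketch of the checksum and snapshot steps in Phase 1, assuming a finished scrape is simply a directory of files. The manifest layout (`scraped_at`, `files`) and the function names are assumptions made for illustration, not an existing Skill Seekers format. Git commit dates are left out here because they would need access to the cloned repository.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def build_manifest(scrape_dir: Path) -> dict:
    """Map each file's relative path to the SHA-256 of its contents."""
    files = {}
    for path in sorted(scrape_dir.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            files[path.relative_to(scrape_dir).as_posix()] = digest
    return {
        # Scrape timestamp metadata, recorded in UTC.
        "scraped_at": datetime.now(timezone.utc).isoformat(),
        "files": files,
    }


def save_snapshot(name: str, scrape_dir: Path, output_root: Path = Path("output")) -> Path:
    """Write the manifest to output/{name}_versions/<timestamp>.json."""
    manifest = build_manifest(scrape_dir)
    versions_dir = output_root / f"{name}_versions"
    versions_dir.mkdir(parents=True, exist_ok=True)
    # Colons are not valid in filenames on every platform.
    stamp = manifest["scraped_at"].replace(":", "-")
    target = versions_dir / f"{stamp}.json"
    target.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return target
```

Storing only hashes keeps each snapshot small; keeping full file copies would be needed later if line-level diffs are wanted.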
Phase 2: Change Detection (F2.3-F2.4)
- Compare new scrape against previous version
- Detect: new files, modified files, deleted files
- Skip unchanged content (efficiency)
- Update only changed sections in skill
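The comparison in Phase 2 can be sketched as a set operation over two path-to-checksum mappings. The function name and the returned keys below are illustrative assumptions; the file names in the example are borrowed from the sample report above and the hash values are placeholders.

```python
def diff_manifests(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Classify paths by comparing two {relative_path: checksum} mappings."""
    old_paths, new_paths = set(old), set(new)
    shared = old_paths & new_paths
    return {
        "new": sorted(new_paths - old_paths),
        "deleted": sorted(old_paths - new_paths),
        "modified": sorted(p for p in shared if old[p] != new[p]),
        # Unchanged files can be skipped when updating the skill.
        "unchanged": sorted(p for p in shared if old[p] == new[p]),
    }


if __name__ == "__main__":
    old = {
        "examples/basic-completion.py": "aaa",
        "examples/legacy-api.py": "bbb",
        "README.md": "ccc",
    }
    new = {
        "examples/basic-completion.py": "ddd",
        "examples/prompt-caching.md": "eee",
        "README.md": "ccc",
    }
    for kind, paths in diff_manifests(old, new).items():
        print(kind, paths)
```

A renamed file shows up as one deleted path plus one new path in this sketch; detecting renames would need an extra pass that matches identical checksums.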
Phase 3: Diff Generation (New)
- Generate human-readable changelog
- Highlight breaking changes (removed functions/patterns)
- Show side-by-side code diffs
- Categorize changes (new, modified, deprecated, removed)
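One way Phase 3 might produce its report, using only the standard library. Note the substitution: this sketch emits unified diffs via `difflib.unified_diff` rather than side-by-side diffs, because they embed cleanly in a markdown changelog; `difflib.HtmlDiff` is the stdlib option if a side-by-side HTML view is preferred. The function names and the `changes` dictionary shape are assumptions for illustration.

```python
import difflib


def file_diff(path: str, old_text: str, new_text: str) -> str:
    """Return a unified diff between two versions of one file."""
    lines = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=f"old/{path}",
        tofile=f"new/{path}",
        lineterm="",
    )
    return "\n".join(lines)


def render_changelog(title: str, changes: dict[str, list[str]]) -> str:
    """Render categorized file lists in the layout of the sample report."""
    sections = [
        ("New Files", "new"),
        ("Modified Files", "modified"),
        ("Deleted Files", "deleted"),
    ]
    out = [f"# {title}"]
    for heading, key in sections:
        paths = changes.get(key, [])
        out.append(f"## {heading} ({len(paths)})")
        out.extend(f"- {p}" for p in paths)
    return "\n".join(out)


if __name__ == "__main__":
    print(render_changelog(
        "Anthropic Cookbook Changes (v1 → v2)",
        {"new": ["examples/prompt-caching.md"], "deleted": ["examples/legacy-api.py"]},
    ))
    print(file_diff(
        "examples/basic-completion.py",
        "client = anthropic.Client()",
        "client = anthropic.Anthropic()",
    ))
```

Flagging a change as breaking is not attempted here; that would need rules about which removed names count as public API.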
Phase 4: Automated Monitoring (New)
- Schedule periodic re-scrapes (cron/GitHub Actions)
- Auto-generate change reports
- Send notifications for breaking changes
- Update existing skills automatically
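For Phase 4, whatever triggers the job (cron or GitHub Actions), something has to decide which repos are due for a re-scrape. The sketch below is one possible shape for that check. The `INTERVALS` table, the treatment of "monthly" as 30 days, and both function names are assumptions, not existing Skill Seekers behavior.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed schedule names; "monthly" is approximated as 30 days.
INTERVALS = {"weekly": timedelta(weeks=1), "monthly": timedelta(days=30)}


def is_due(last_scraped: Optional[datetime], schedule: str, now: datetime) -> bool:
    """A repo is due if it was never scraped or its interval has elapsed."""
    if last_scraped is None:
        return True
    return now - last_scraped >= INTERVALS[schedule]


def repos_due(last_runs: dict[str, Optional[datetime]], schedule: str, now: datetime) -> list[str]:
    """Return the repos whose last scrape is older than the schedule allows."""
    return sorted(repo for repo, last in last_runs.items() if is_due(last, schedule, now))


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    last_runs = {
        "anthropics/anthropic-cookbook": now - timedelta(days=10),
        "openai/openai-cookbook": now - timedelta(days=2),
        "google/generative-ai-docs": None,
    }
    print(repos_due(last_runs, "weekly", now))
```

Passing `now` in explicitly keeps the check deterministic, which makes it easy to test without touching the system clock.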
Phase 5: Pattern Analysis (New)
- Detect deprecated API patterns in old scrapes
- Identify migration patterns (old → new)
- Extract "upgrade guide" from commit messages
- Suggest code modernization
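Detecting deprecated patterns in Phase 5 could start as a plain regex scan over scraped source. In the sketch below the rule table is hypothetical: its single entry mirrors the `Client()` to `Anthropic()` change from the sample report above and is there for illustration only. A real rule set would have to be derived from the vendor's own changelogs.

```python
import re

# Hypothetical rules: regex for the old pattern -> suggested replacement.
DEPRECATED_PATTERNS = {
    r"\banthropic\.Client\(": "anthropic.Anthropic(",
}


def find_deprecated(source: str) -> list[tuple[int, str, str]]:
    """Return (line number, offending line, suggested replacement) per hit."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern, replacement in DEPRECATED_PATTERNS.items():
            if re.search(pattern, line):
                hits.append((lineno, line.strip(), replacement))
    return hits


if __name__ == "__main__":
    sample = "import anthropic\nclient = anthropic.Client()\n"
    for lineno, line, suggestion in find_deprecated(sample):
        print(f"line {lineno}: {line!r} -> consider {suggestion!r}")
```

A regex scan also matches inside comments and strings; parsing with the `ast` module would be more precise at the cost of handling only Python sources.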
Example Workflow
Manual (Available Now)
# Week 1 - Initial scrape
python3 cli/github_scraper.py --repo anthropics/anthropic-cookbook --name anthropic-2024-10
# Week 5 - Re-scrape
python3 cli/github_scraper.py --repo anthropics/anthropic-cookbook --name anthropic-2024-11
# Manual diff
diff -r output/anthropic-2024-10/ output/anthropic-2024-11/
Automated (Future)
# Setup monitoring
python3 cli/monitor_repo.py --repo anthropics/anthropic-cookbook --schedule weekly
# Generates reports automatically:
# - output/anthropic-cookbook/changelogs/2024-11-01.md
# - output/anthropic-cookbook/changelogs/2024-11-08.md
Use Cases
1. API Migration Guides
Track vendor SDK updates and auto-generate migration guides:
- "Anthropic SDK v0.5 → v1.0: What Changed"
- "OpenAI GPT-3.5 → GPT-4: Pattern Changes"
2. Combat Misinformation
When an influencer teaches outdated patterns:
- ✅ "Official repo shows this pattern was deprecated 3 months ago"
- ✅ "Here's the current recommended approach from the vendor"
3. Early Adopter Advantage
Detect new features in official repos before:
- Blog posts are written
- Tutorials are recorded
- Influencers catch on
4. Breaking Change Alerts
Monitor critical repos and get notified:
- "⚠️ Breaking: Anthropic client initialization changed"
- "🚨 OpenAI deprecated completion endpoint"
5. Documentation Freshness
Ensure your skills stay current:
- Auto-update skills when official examples change
- No more "this tutorial is from 2023" problems
Why This Matters
The Hype Cycle Problem:
- Vendor releases new API (day 0)
- Official docs/examples updated (day 1-7)
- Early blog posts appear (week 2-3)
- YouTube tutorials appear (month 2-3)
- Most developers learn from those YouTube tutorials (already 2-3 months outdated)
- API changes again (month 4)
- Tutorials are now teaching deprecated patterns
Skill Seekers Solution:
- Scrape official repo (day 1-7)
- Have current, official patterns immediately
- Re-scrape monthly to catch changes
- Always ahead of the hype cycle
Related Issues
- F2.1-F2.5 - Incremental Updates (roadmap tasks)
- [E1.6] Add scrape_github MCP tool #139 - GitHub scraper (completed - enables this)
- Unified scraping (v2.0.0) - combines docs + code
Priority
Medium-High - This is a killer feature that:
- Solves a real problem (outdated tutorials)
- Differentiates Skill Seekers from static scrapers
- Enables a unique use case (vendor change tracking)
- Provides value to community (fight misinformation)
Community Impact
This addresses the frustration of:
- Learning outdated patterns from influencers
- Chasing hype cycles
- Not knowing when APIs change
- Relying on unofficial/wrong information
By tracking official sources, Skill Seekers becomes a "source of truth" tracker for the AI/ML ecosystem.