
feat: Version tracking for official vendor tutorials - Auto-detect API changes #167

@yusufkaraaslan

Community Request

"Might be interesting to run this on some of the official 'how to' and tutorials repos of various vendors such as Anthropic, OpenAI, and Microsoft to capture differences due to updates. If, for nothing else, to stay ahead of hype cycles by influencers and hucksters."

The Problem

Official vendor tutorials change frequently, but most developers rely on:

  • ❌ Outdated blog posts from influencers
  • ❌ YouTube tutorials from 6 months ago
  • ❌ Stack Overflow answers using deprecated APIs
  • ❌ Medium articles teaching old patterns

Meanwhile, the official repos are updated with:

  • ✅ Breaking API changes
  • ✅ New recommended patterns
  • ✅ Deprecation warnings
  • ✅ Security fixes

Result: Developers learn outdated/wrong information from "hucksters" instead of official sources.

The Opportunity

Use Skill Seekers to:

  1. Scrape official tutorial repos (Anthropic, OpenAI, Microsoft, etc.)
  2. Track changes over time (re-scrape weekly/monthly)
  3. Auto-detect API changes (diff between versions)
  4. Generate "what changed" reports
  5. Stay ahead of influencer hype cycles with real, official information

What You Can Do TODAY

Skill Seekers already supports scraping these repos with the GitHub scraper:

# Scrape Anthropic's official cookbook
python3 cli/github_scraper.py --repo anthropics/anthropic-cookbook --name anthropic-cookbook

# Scrape OpenAI's official cookbook
python3 cli/github_scraper.py --repo openai/openai-cookbook --name openai-cookbook

# Scrape Microsoft's AI tutorials
python3 cli/github_scraper.py --repo microsoft/generative-ai-for-beginners --name microsoft-ai-tutorials

# Create unified skills (docs + code + issues)
python3 cli/unified_scraper.py --config configs/anthropic_unified.json

Result: Claude skill containing official, current examples and patterns.

Official Repos to Track

AI/ML Vendors

| Vendor | Repository | What to Track |
| --- | --- | --- |
| Anthropic | anthropics/anthropic-cookbook | Claude API examples, prompt engineering |
| Anthropic | anthropics/anthropic-sdk-python | SDK changes, new features |
| OpenAI | openai/openai-cookbook | GPT API examples, best practices |
| OpenAI | openai/openai-python | SDK updates, breaking changes |
| Google | google/generative-ai-docs | Gemini API tutorials |

Cloud Providers

| Vendor | Repository | What to Track |
| --- | --- | --- |
| Microsoft | microsoft/generative-ai-for-beginners | Azure OpenAI patterns |
| Microsoft | Azure/azure-sdk-for-python | Azure SDK changes |
| AWS | aws-samples/aws-genai-llm-chatbot | Bedrock examples |

Frameworks

| Vendor | Repository | What to Track |
| --- | --- | --- |
| Vercel | vercel/ai | AI SDK patterns |
| LangChain | langchain-ai/langchain | LangChain updates |
| LlamaIndex | run-llama/llama_index | RAG pattern changes |

Proposed Enhancement: Automated Version Tracking

Related Roadmap: Category F2 (Tasks F2.1-F2.5) - Incremental Updates

Feature: Change Detection & Diffing

# Initial scrape - baseline
python3 cli/github_scraper.py --repo anthropics/anthropic-cookbook --name anthropic-v1

# Re-scrape after 1 month and detect changes (--compare is the proposed flag, not yet implemented)
python3 cli/github_scraper.py --repo anthropics/anthropic-cookbook --name anthropic-v2 --compare anthropic-v1

# Output: Changelog report

Generated Report:

# Anthropic Cookbook Changes (v1 → v2)

## New Files (3)
- examples/streaming-responses.py
- examples/function-calling-2024.py
- examples/prompt-caching.md

## Modified Files (2)
- examples/basic-completion.py
  - ❌ Removed: `anthropic.Client()` (deprecated)
  - ✅ Added: `anthropic.Anthropic()` (new SDK)
  
- examples/tool-use.py
  - ⚠️ Breaking: `tools` parameter format changed
  - 📝 Added: Error handling examples

## Deleted Files (2)
- examples/legacy-api.py (deprecated)
- examples/old-prompt-format.md (outdated)

## API Pattern Changes
1. **Client initialization** - Old `Client()` → New `Anthropic()`
2. **Function calling** - Tool format changed (breaking)
3. **Streaming** - New async patterns recommended

Implementation Checklist

Phase 1: Basic Tracking (F2.1-F2.2)

  • Track file modification times (git commit dates)
  • Store content checksums (SHA-256 hashes)
  • Record scrape timestamp metadata
  • Save version snapshots in output/{name}_versions/

Phase 2: Change Detection (F2.3-F2.4)

  • Compare new scrape against previous version
  • Detect: new files, modified files, deleted files
  • Skip unchanged content (efficiency)
  • Update only changed sections in skill
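Change detection could be as simple as comparing two path-to-checksum mappings. This is a sketch under that assumption; `diff_manifests` is a hypothetical helper, and the input format (relative path mapped to a content hash) is assumed rather than taken from existing code.

```python
def diff_manifests(old: dict, new: dict) -> dict:
    """Classify paths by comparing two {relative_path: checksum} mappings."""
    old_paths, new_paths = set(old), set(new)
    common = old_paths & new_paths
    return {
        "new": sorted(new_paths - old_paths),
        "deleted": sorted(old_paths - new_paths),
        "modified": sorted(p for p in common if old[p] != new[p]),
        # Unchanged files can be skipped when rebuilding the skill.
        "unchanged": sorted(p for p in common if old[p] == new[p]),
    }
```

The `unchanged` list is what makes the "skip unchanged content" step possible: only paths under `new` and `modified` would need to be re-processed.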

Phase 3: Diff Generation (New)

  • Generate human-readable changelog
  • Highlight breaking changes (removed functions/patterns)
  • Show side-by-side code diffs
  • Categorize changes (new, modified, deprecated, removed)
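For the changelog and diff steps, Python's standard `difflib` module may be enough for a first version. The sketch below assumes the change lists are already computed and that old and new file contents are available as strings. `unified_diff_text` and `render_changelog` are hypothetical names, and the report layout simply mirrors the sample report above. It produces unified diffs rather than side-by-side ones (`difflib.HtmlDiff` would be one option for the latter).

```python
import difflib


def unified_diff_text(old_text: str, new_text: str, path: str) -> str:
    """Return a unified diff between two versions of one file."""
    lines = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=f"old/{path}",
        tofile=f"new/{path}",
        lineterm="",
    )
    return "\n".join(lines)


def render_changelog(title: str, changes: dict) -> str:
    """Render new/modified/deleted path lists as a Markdown changelog."""
    sections = [
        ("New Files", "new"),
        ("Modified Files", "modified"),
        ("Deleted Files", "deleted"),
    ]
    out = [f"# {title}", ""]
    for heading, key in sections:
        paths = changes.get(key, [])
        # The count in the heading is derived from the list itself.
        out.append(f"## {heading} ({len(paths)})")
        out.extend(f"- {p}" for p in paths)
        out.append("")
    return "\n".join(out).rstrip() + "\n"
```

Flagging a change as "breaking" is not attempted here; that would need the pattern analysis described in Phase 5.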

Phase 4: Automated Monitoring (New)

  • Schedule periodic re-scrapes (cron/GitHub Actions)
  • Auto-generate change reports
  • Send notifications for breaking changes
  • Update existing skills automatically
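Whatever triggers the job (cron or GitHub Actions), the monitor needs a way to decide whether a repo is due for a re-scrape. A small sketch of that check, assuming the last scrape time is stored as a timezone-aware timestamp. `INTERVALS` and `is_scrape_due` are hypothetical, and "monthly" is approximated as 30 days.

```python
from datetime import datetime, timedelta, timezone

# Assumed schedule names; "monthly" is approximated as 30 days.
INTERVALS = {
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),
}


def is_scrape_due(last_scraped_at: datetime, schedule: str, now: datetime = None) -> bool:
    """Return True when at least one full interval has passed since the last scrape."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_scraped_at >= INTERVALS[schedule]
```

A scheduled job could run daily, call this per tracked repo, and only re-scrape the ones that are due.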

Phase 5: Pattern Analysis (New)

  • Detect deprecated API patterns in old scrapes
  • Identify migration patterns (old → new)
  • Extract "upgrade guide" from commit messages
  • Suggest code modernization
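Deprecated-pattern detection could start as a hand-maintained table of old-to-new mappings applied with regular expressions. The sketch below uses the `anthropic.Client()` to `anthropic.Anthropic()` change from the sample report as its only rule. The `MIGRATIONS` table and `find_deprecated` are hypothetical, and plain regex matching will miss aliased imports and can match inside comments or strings.

```python
import re

# (pattern for the old usage, suggested replacement). Illustrative rule only,
# taken from the sample report in this issue.
MIGRATIONS = [
    (re.compile(r"\banthropic\.Client\("), "anthropic.Anthropic("),
]


def find_deprecated(source: str) -> list:
    """Return (line_number, matched_text, suggested_replacement) tuples."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern, replacement in MIGRATIONS:
            match = pattern.search(line)
            if match:
                findings.append((lineno, match.group(0), replacement))
    return findings
```

Running this over an older scrape would list the lines that a "suggest code modernization" step could rewrite.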

Example Workflow

Manual (Available Now)

# Week 1 - Initial scrape
python3 cli/github_scraper.py --repo anthropics/anthropic-cookbook --name anthropic-2024-10

# Week 5 - Re-scrape
python3 cli/github_scraper.py --repo anthropics/anthropic-cookbook --name anthropic-2024-11

# Manual diff
diff -r output/anthropic-2024-10/ output/anthropic-2024-11/
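If the raw `diff -r` output is too noisy, the same comparison can be summarized with Python's standard `filecmp` module. This is a sketch, assuming both scrapes are ordinary directories; `summarize_dir_diff` is a hypothetical helper. Note that `filecmp.dircmp` compares files shallowly (type, size, and modification time) before falling back to content, and that entries present on only one side may be whole directories.

```python
import filecmp


def summarize_dir_diff(old_dir, new_dir) -> dict:
    """Recursively list paths that are new, deleted, or modified between two directories."""
    summary = {"new": [], "deleted": [], "modified": []}

    def walk(cmp, prefix):
        # left_only / right_only can contain directories as well as files.
        summary["deleted"] += [f"{prefix}{name}" for name in cmp.left_only]
        summary["new"] += [f"{prefix}{name}" for name in cmp.right_only]
        summary["modified"] += [f"{prefix}{name}" for name in cmp.diff_files]
        for name, sub in cmp.subdirs.items():
            walk(sub, f"{prefix}{name}/")

    walk(filecmp.dircmp(old_dir, new_dir), "")
    return {key: sorted(paths) for key, paths in summary.items()}
```

For example, `summarize_dir_diff("output/anthropic-2024-10", "output/anthropic-2024-11")` would return three sorted lists of relative paths.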

Automated (Future)

# Setup monitoring
python3 cli/monitor_repo.py --repo anthropics/anthropic-cookbook --schedule weekly

# Generates reports automatically:
# - output/anthropic-cookbook/changelogs/2024-11-01.md
# - output/anthropic-cookbook/changelogs/2024-11-08.md

Use Cases

1. API Migration Guides

Track vendor SDK updates and auto-generate migration guides:

  • "Anthropic SDK v0.5 → v1.0: What Changed"
  • "OpenAI GPT-3.5 → GPT-4: Pattern Changes"

2. Combat Misinformation

When an influencer teaches outdated patterns:

  • ✅ "Official repo shows this pattern was deprecated 3 months ago"
  • ✅ "Here's the current recommended approach from vendor"

3. Early Adopter Advantage

Detect new features in official repos before:

  • Blog posts are written
  • Tutorials are recorded
  • Influencers catch on

4. Breaking Change Alerts

Monitor critical repos and get notified:

  • "⚠️ Breaking: Anthropic client initialization changed"
  • "🚨 OpenAI deprecated completion endpoint"

5. Documentation Freshness

Ensure your skills stay current:

  • Auto-update skills when official examples change
  • No more "this tutorial is from 2023" problems

Why This Matters

The Hype Cycle Problem:

  1. Vendor releases new API (day 0)
  2. Official docs/examples updated (day 1-7)
  3. Early blog posts appear (week 2-3)
  4. YouTube tutorials appear (month 2-3)
  5. Most developers learn from step 4 (already 2-3 months outdated)
  6. API changes again (month 4)
  7. Tutorials are now teaching deprecated patterns

Skill Seekers Solution:

  1. Scrape official repo (day 1-7)
  2. Have current, official patterns immediately
  3. Re-scrape monthly to catch changes
  4. Always ahead of the hype cycle

Related Issues

Priority

Medium-High - This is a killer feature that:

  • Solves a real problem (outdated tutorials)
  • Differentiates Skill Seekers from static scrapers
  • Enables unique use case (vendor change tracking)
  • Provides value to community (fight misinformation)

Community Impact

This addresses the frustration of:

  • Learning outdated patterns from influencers
  • Chasing hype cycles
  • Not knowing when APIs change
  • Relying on unofficial/wrong information

By tracking official sources, Skill Seekers becomes a "source of truth" tracker for the AI/ML ecosystem.

Labels: enhancement (New feature or request)
