A Swift implementation of Q-Learning with Dyna-Q planning, optimized for Apple Silicon using MLX. Perfect for building adaptive iOS apps that learn user preferences over time.
- Tabular Q-Learning - Classic reinforcement learning for discrete state/action spaces
- Dyna-Q Planning - Model-based learning for improved sample efficiency
- MLX-Powered - Leverages Apple's MLX framework for efficient computation
- iOS Ready - Designed for real-world iOS app integration
- macOS 14.0+ / iOS 17.0+
- Apple Silicon (M1/M2/M3/M4)
- Swift 6.2+
Add to your Package.swift:
```swift
dependencies: [
    .package(url: "https://github.com/timothy/QLearnerKit", from: "1.0.0-alpha.1")
]
```

Or add via Xcode: File → Add Package Dependencies.
```swift
import QLearnerKit

// Create learner for grid world (100 states, 4 actions)
let learner = QLearner(
    numStates: 100,
    numActions: 4,
    alpha: 0.2,
    gamma: 0.9,
    dyna: 200
)

// Initialize with starting state
var action = learner.querysetstate(0)

// Learning loop (`environment` and `done` stand in for your own task loop)
while !done {
    let nextState = environment.step(action)
    let reward = environment.getReward(nextState)
    action = learner.query(nextState, reward: reward)
}

// Get learned policy
let policy = learner.getPolicy()
```

Here's a complete example of using QLearnerKit to learn user content preferences in a news app:
```swift
import Foundation
import QLearnerKit

// ObservableObject conformance lets the recommender drive @StateObject in SwiftUI
class NewsRecommender: ObservableObject {
    private let learner: QLearner
    private var currentState: Int = 0

    enum ContentCategory: Int, CaseIterable {
        case sports = 0
        case technology = 1
        case business = 2
        case entertainment = 3

        var displayName: String {
            switch self {
            case .sports: return "Sports"
            case .technology: return "Technology"
            case .business: return "Business"
            case .entertainment: return "Entertainment"
            }
        }
    }

    init() {
        // 8 states: 4 time periods × 2 day types (weekday/weekend)
        // 4 actions: content categories
        learner = QLearner(
            numStates: 8,
            numActions: ContentCategory.allCases.count,
            alpha: 0.2,   // Learning rate
            gamma: 0.9,   // Future reward discount
            rar: 0.3,     // 30% exploration (lower for production)
            radr: 0.995,  // Slow decay
            dyna: 50      // Model-based planning steps
        )
    }

    /// Determine current user context state
    private func getCurrentState() -> Int {
        let calendar = Calendar.current
        let now = Date()
        let hour = calendar.component(.hour, from: now)
        let isWeekend = calendar.isDateInWeekend(now)

        // Time periods: 0=Morning(6-12), 1=Afternoon(12-18), 2=Evening(18-24), 3=Night(0-6)
        let timeSlot: Int
        switch hour {
        case 6..<12: timeSlot = 0
        case 12..<18: timeSlot = 1
        case 18..<24: timeSlot = 2
        default: timeSlot = 3
        }

        // Combine time and day type: timeSlot * 2 + (weekend ? 1 : 0)
        return timeSlot * 2 + (isWeekend ? 1 : 0)
    }

    /// Get recommended content category for current context
    func recommendContent() -> ContentCategory {
        currentState = getCurrentState()
        let action = learner.querysetstate(currentState)
        return ContentCategory(rawValue: action) ?? .technology
    }

    /// Record user engagement with content
    /// - Parameters:
    ///   - category: The content category shown
    ///   - didRead: Whether user read the article (>30 seconds)
    ///   - readDuration: How long user engaged (seconds)
    func recordEngagement(category: ContentCategory, didRead: Bool, readDuration: TimeInterval) {
        let state = getCurrentState()

        // Reward based on engagement level
        let reward: Float
        if didRead && readDuration > 60 {
            reward = 1.0    // Strong positive - user engaged deeply
        } else if didRead {
            reward = 0.5    // Moderate positive - user read article
        } else if readDuration > 5 {
            reward = 0.1    // Weak positive - user glanced
        } else {
            reward = -0.2   // Negative - user ignored
        }

        _ = learner.query(state, reward: reward)
    }

    /// Get best category for current context (no exploration)
    func getBestCategory() -> ContentCategory {
        let state = getCurrentState()
        let policy = learner.getPolicy()
        return ContentCategory(rawValue: policy[state]) ?? .technology
    }

    /// Export learned preferences for persistence
    func exportPreferences() -> [Float] {
        return learner.getQTable().asArray(Float.self)
    }
}
```

Hooking the recommender up to a SwiftUI view:

```swift
import SwiftUI
struct ContentView: View {
    @StateObject private var recommender = NewsRecommender()
    @State private var recommendedCategory: NewsRecommender.ContentCategory = .technology
    @State private var articleStartTime: Date?

    var body: some View {
        VStack(spacing: 20) {
            Text("Recommended for You")
                .font(.title)

            Text(recommendedCategory.displayName)
                .font(.headline)
                .padding()
                .background(Color.blue.opacity(0.2))
                .cornerRadius(10)

            Button("Show Article") {
                showArticle()
            }
            .buttonStyle(.borderedProminent)

            Button("Get New Recommendation") {
                recommendedCategory = recommender.recommendContent()
            }
        }
        .padding()
        .onAppear {
            recommendedCategory = recommender.recommendContent()
        }
    }

    private func showArticle() {
        articleStartTime = Date()
        // Simulate user reading article
        DispatchQueue.main.asyncAfter(deadline: .now() + 2.0) {
            recordArticleEngagement(didRead: true)
        }
    }

    private func recordArticleEngagement(didRead: Bool) {
        guard let startTime = articleStartTime else { return }
        let duration = Date().timeIntervalSince(startTime)

        recommender.recordEngagement(
            category: recommendedCategory,
            didRead: didRead,
            readDuration: duration
        )

        // Get next recommendation
        recommendedCategory = recommender.recommendContent()
        articleStartTime = nil
    }
}
```

Persisting learned preferences between launches:

```swift
extension NewsRecommender {
    private var preferencesKey: String { "qlearner_preferences" }

    func save() {
        let preferences = exportPreferences()
        UserDefaults.standard.set(preferences, forKey: preferencesKey)
    }

    func load() {
        // Note: You'd need to implement a method to load the Q-table back.
        // This requires adding a setter or an initializer that accepts Q-table data.
        if let saved = UserDefaults.standard.array(forKey: preferencesKey) as? [Float] {
            // Restore Q-table (implementation depends on your needs)
            print("Loaded \(saved.count) preference values")
        }
    }
}
```

The QLearner initializer and its defaults:

```swift
QLearner(
    numStates: Int = 100,     // Number of discrete states
    numActions: Int = 4,      // Number of available actions
    alpha: Float = 0.2,       // Learning rate [0.0-1.0]
    gamma: Float = 0.9,       // Discount factor [0.0-1.0]
    rar: Float = 0.5,         // Random action rate (epsilon)
    radr: Float = 0.99,       // Exploration decay rate
    dyna: Int = 0,            // Planning steps (0 = disabled)
    verbose: Bool = false     // Debug output
)
```

`querysetstate(_ s: Int)`
Initialize state without learning. Use at episode start.
- Parameters:
  - `s`: Initial state
- Returns: Selected action

`query(_ sPrime: Int, reward: Float)`
Main learning step. Updates the Q-table and selects the next action.
- Parameters:
  - `sPrime`: New state after the previous action
  - `reward`: Reward received
- Returns: Next action to take

`getPolicy()`
Extract the greedy policy (best action per state).
- Returns: Array of actions indexed by state

`getQTable()`
Get the Q-table for inspection or persistence.
- Returns: Q-table as `MLXArray`
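For readers curious what the `dyna` planning steps actually do, here is a minimal, framework-free sketch of tabular Q-learning with Dyna-Q planning. It is a conceptual illustration under simplifying assumptions (deterministic model, plain Swift arrays), not QLearnerKit's MLX implementation:

```swift
// Conceptual sketch only: plain-Swift tabular Q-learning with Dyna-Q planning.
// QLearnerKit's actual implementation runs on MLX; the names here are illustrative.
struct TabularDynaQ {
    var q: [[Double]]                                            // Q[state][action]
    var model: [Int: [Int: (next: Int, reward: Double)]] = [:]   // remembered transitions
    let alpha: Double                                            // learning rate
    let gamma: Double                                            // discount factor
    let planningSteps: Int                                       // the "dyna" parameter

    init(numStates: Int, numActions: Int,
         alpha: Double = 0.2, gamma: Double = 0.9, planningSteps: Int = 50) {
        q = Array(repeating: Array(repeating: 0.0, count: numActions), count: numStates)
        self.alpha = alpha
        self.gamma = gamma
        self.planningSteps = planningSteps
    }

    mutating func update(s: Int, a: Int, reward: Double, sPrime: Int) {
        // 1. Direct Q-learning update from the real experience
        q[s][a] += alpha * (reward + gamma * q[sPrime].max()! - q[s][a])

        // 2. Record the transition in a simple deterministic model
        model[s, default: [:]][a] = (next: sPrime, reward: reward)

        // 3. Dyna-Q planning: replay randomly sampled remembered transitions
        for _ in 0..<planningSteps {
            guard let (ps, actions) = model.randomElement(),
                  let (pa, outcome) = actions.randomElement() else { break }
            q[ps][pa] += alpha * (outcome.reward + gamma * q[outcome.next].max()! - q[ps][pa])
        }
    }
}
```

Setting `planningSteps` to 0 disables step 3 entirely, which mirrors the library's `dyna: 0` default.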
- Content Recommendation - Learn user preferences for news, videos, products
- UI Personalization - Adapt interface layouts based on user behavior
- Feature Discovery - Guide users to relevant app features
- Notification Timing - Learn optimal times to send notifications
- A/B Testing - Multi-armed bandit for dynamic feature testing
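For the A/B testing case, a learner with a single state behaves like an epsilon-greedy multi-armed bandit. A minimal sketch using the initializer documented above; the variant names and reward values are illustrative assumptions:

```swift
import QLearnerKit

// Hypothetical onboarding variants to test; one bandit "arm" per variant.
enum OnboardingVariant: Int, CaseIterable {
    case short = 0
    case detailed = 1
    case video = 2
}

// A single state reduces the Q-learner to an epsilon-greedy bandit.
let bandit = QLearner(
    numStates: 1,
    numActions: OnboardingVariant.allCases.count,
    alpha: 0.1,
    gamma: 0.0,    // no successor state to discount in a one-state problem
    rar: 0.2,      // 20% exploration
    radr: 0.999,
    dyna: 0
)

// Pick a variant to show this user.
let arm = bandit.querysetstate(0)
let variant = OnboardingVariant(rawValue: arm) ?? .short
print("Showing onboarding variant: \(variant)")

// Later, report the outcome (illustrative signal: did the user finish onboarding?).
let userCompletedOnboarding = true   // e.g. from your analytics layer
_ = bandit.query(0, reward: userCompletedOnboarding ? 1.0 : -0.2)
```

Since there is only one state, `getPolicy()[0]` reports the variant the learner currently considers best.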
- Keep state space small (< 1000 states for fast learning)
- Use meaningful features (time, context, user segments)
- Combine continuous features into discrete bins (see the helpers after this list)
- Use clear positive/negative signals
- Scale rewards consistently (-1 to +1)
- Reward desired behaviors immediately
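Two small helpers illustrate these points; the bin edges and reward thresholds below are illustrative, not part of QLearnerKit:

```swift
// Bucket a continuous feature (session length in minutes) into discrete bins
// so it can be folded into a compact state index.
func sessionLengthBin(minutes: Double) -> Int {
    switch minutes {
    case ..<1.0:      return 0   // bounce
    case 1.0..<5.0:   return 1   // short visit
    case 5.0..<20.0:  return 2   // engaged
    default:          return 3   // deep session
    }
}

// Combine the bin with another discrete feature into a single state index,
// mirroring the timeSlot * 2 + weekend pattern from the news example.
func stateIndex(minutes: Double, isWeekend: Bool) -> Int {
    sessionLengthBin(minutes: minutes) * 2 + (isWeekend ? 1 : 0)
}

// Map raw engagement onto a consistent -1...+1 reward scale,
// capped so dwell time never outweighs an explicit conversion.
func engagementReward(secondsViewed: Double, converted: Bool) -> Float {
    if converted { return 1.0 }
    if secondsViewed < 2 { return -0.2 }      // effectively ignored
    return Float(min(secondsViewed / 120.0, 0.8))
}
```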
- Production: `rar: 0.1-0.3` (low exploration)
- Learning: `rar: 0.5-0.7` (high exploration)
- Alpha: `0.1-0.3` for stable environments
- Dyna: `50-200` for sample efficiency
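As a rough starting point, these ranges might translate into configurations like the following (the exact values are placeholders within the ranges above; tune for your own app):

```swift
import QLearnerKit

// Exploration-heavy settings while preferences are still being learned
let trainingLearner = QLearner(
    numStates: 8,
    numActions: 4,
    alpha: 0.2,
    gamma: 0.9,
    rar: 0.6,      // high exploration
    radr: 0.99,
    dyna: 100      // planning steps for sample efficiency
)

// Conservative settings once behavior has stabilized in production
let productionLearner = QLearner(
    numStates: 8,
    numActions: 4,
    alpha: 0.1,    // stable environment
    gamma: 0.9,
    rar: 0.2,      // low exploration
    radr: 0.995,
    dyna: 50
)
```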
- All learning happens on-device
- No user data sent to servers
- Q-table stored locally
- Lightweight: ~100KB memory for 100 states × 10 actions
- Fast updates: < 1ms per query on Apple Silicon
- Efficient planning: Dyna-Q improves sample efficiency by roughly 5-10×
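To check these numbers on your own hardware, a rough micro-benchmark along these lines can help (the timing approach and sizes are illustrative, mirroring the 100 states × 10 actions figure above):

```swift
import QLearnerKit

// Rough micro-benchmark of query() latency on a 100-state, 10-action learner.
let learner = QLearner(numStates: 100, numActions: 10)
_ = learner.querysetstate(0)

let iterations = 10_000
let elapsed = ContinuousClock().measure {
    for i in 0..<iterations {
        _ = learner.query(i % 100, reward: Float.random(in: -1...1))
    }
}
print("Average per query: \(elapsed / iterations)")
```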
License: Apache-2.0

Author: Timothy Bradford

Contributions welcome! Please open an issue or PR.