banditrl

A lightweight contextual bandit & reinforcement learning library

A lightweight contextual bandit and reinforcement learning library, intended for real-time decision serving in production.

Project overview
Architecture
Currently supported models
Currently supported feature types
Docs
- Contextual Free(User model)
- Contextual Bandits

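Project overview

The goal of this project is to build a flexible, simple online learning library that is fast enough for production use. In many real-world applications (for example, recommender systems), the number of actions and the number of requests per second can be very large, so model storage, action storage, and historical request storage all have to be managed carefully. Because storage management differs widely from one system to another, banditrl lets users define how each of these is handled. Specifically, this repo contains:

Feature engineering and preprocessing
Model implementations
Model training workflow
Online model serving based on FastAPI
Model storage: how to store a model after it is updated and how to load it
Historical request storage: how to store a request and find it again when its (delayed) reward arrives
Action storage: how to add/remove actions and define special attributes for each action

banditrl provides the core contextual bandit algorithms, along with some common storage backends (such as in-memory storage and Rlite/Redis-based storage).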
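
To make the storage responsibilities above more concrete, here is a minimal in-memory sketch of a history store and an action store. It is an illustration only: `RequestRecord`, `InMemoryHistoryStorage`, and `InMemoryActionStorage` are hypothetical names chosen for this example and are not banditrl's actual classes or API.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class RequestRecord:
    """One served decision, kept until its (possibly delayed) reward arrives."""
    request_id: str
    context: Dict[str, Any]
    action_id: str
    created_at: float = field(default_factory=time.time)


class InMemoryHistoryStorage:
    """Hypothetical history store: save a request, find it again by id."""

    def __init__(self) -> None:
        self._records: Dict[str, RequestRecord] = {}

    def save(self, record: RequestRecord) -> None:
        self._records[record.request_id] = record

    def pop(self, request_id: str) -> Optional[RequestRecord]:
        # Remove and return the record so each reward is applied at most once.
        return self._records.pop(request_id, None)


class InMemoryActionStorage:
    """Hypothetical action store: add/remove actions with per-action attributes."""

    def __init__(self) -> None:
        self._actions: Dict[str, Dict[str, Any]] = {}

    def add(self, action_id: str, **attributes: Any) -> None:
        self._actions[action_id] = dict(attributes)

    def remove(self, action_id: str) -> None:
        # Removing an unknown action is a no-op.
        self._actions.pop(action_id, None)

    def ids(self) -> List[str]:
        return sorted(self._actions)
```

A Redis-backed variant could expose the same three operations (save, look up by request id, add/remove actions), which is what would allow the storage layer to be swapped without touching the model code.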

Architecture

Currently supported models

Models supported:

Contextual Bandits
- Linear bandit(LinUCB)
- Linear Thompson Sampling bandit(LinTS)
- ε-greedy (LinEpsilonGreedy)
- Logistic bandit(Logistic Upper Confidence Bound-LogisticUCB)
Contextual Free(User model)
- Epsilon Greedy policy bandit(RliteEE)
- Bernoulli Thompson Sampling Policy bandit(BTS), via Thompson Sampling
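
As a rough illustration of what a linear UCB bandit from the list above computes, the sketch below implements disjoint LinUCB in plain Python: one ridge-regression model per action, scored by predicted reward plus an exploration bonus. The names `LinUCBArm` and `select_action` are hypothetical, the ridge term is assumed to be 1, and this is not the library's LinUCB implementation.

```python
import math
from typing import Dict, List, Sequence


class LinUCBArm:
    """Per-action state for disjoint LinUCB (ridge term assumed to be 1)."""

    def __init__(self, dim: int) -> None:
        # Inverse of A = I + sum(x x^T); starts as the identity matrix.
        self.a_inv: List[List[float]] = [
            [1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)
        ]
        # b = sum(reward * x)
        self.b: List[float] = [0.0] * dim

    def _a_inv_times(self, v: Sequence[float]) -> List[float]:
        return [sum(a * vi for a, vi in zip(row, v)) for row in self.a_inv]

    def ucb(self, x: Sequence[float], alpha: float) -> float:
        a_inv_x = self._a_inv_times(x)
        theta = self._a_inv_times(self.b)  # ridge estimate A^-1 b
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = math.sqrt(max(0.0, sum(xi * v for xi, v in zip(x, a_inv_x))))
        return mean + alpha * width

    def update(self, x: Sequence[float], reward: float) -> None:
        # Sherman-Morrison rank-one update of A^-1 after A += x x^T.
        a_inv_x = self._a_inv_times(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        for i, ui in enumerate(a_inv_x):
            for j, uj in enumerate(a_inv_x):
                self.a_inv[i][j] -= ui * uj / denom
        for i, xi in enumerate(x):
            self.b[i] += reward * xi


def select_action(arms: Dict[str, LinUCBArm], x: Sequence[float],
                  alpha: float = 1.0) -> str:
    """Pick the action with the highest upper confidence bound."""
    return max(sorted(arms), key=lambda name: arms[name].ucb(x, alpha))
```

Larger `alpha` values widen the confidence bonus and so favor exploration; the Sherman-Morrison update avoids re-inverting the matrix on every reward.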

Currently supported feature types

Four feature types are supported:

Numeric: standard floating point features
- e.g. {totalCartValue: 39.99}
Categorical: low-cardinality discrete features
- e.g. {currentlyViewingCategory: "men's jeans"}
ID list: high-cardinality discrete features
- e.g. {productsInCart: ["productId022", "productId109"...]}
- Handled via learned embedding tables
"Dense" ID list: high-cardinality discrete features, manually mapped to dense feature vectors
- e.g. {productId022: [0.5, 1.3, ...], productId109: [1.9, 0.1, ...], ...}
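
The sketch below shows one way three of these feature types (numeric, categorical, and "dense" ID list) could be flattened into a single feature vector. It is an assumption-laden illustration, not banditrl's preprocessing code: the function names are hypothetical, categorical values are one-hot encoded against a fixed vocabulary, and the dense vectors of an ID list are combined by mean pooling, which the source does not specify. Learned embedding tables for plain ID lists are out of scope here.

```python
from typing import Any, Dict, List, Sequence


def one_hot(value: str, vocabulary: Sequence[str]) -> List[float]:
    """Low-cardinality categorical -> one-hot; unseen values map to all zeros."""
    return [1.0 if value == v else 0.0 for v in vocabulary]


def mean_pool(ids: Sequence[str], table: Dict[str, List[float]],
              dim: int) -> List[float]:
    """Average the dense vectors of the known ids (assumed pooling strategy)."""
    vectors = [table[i] for i in ids if i in table]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def encode(context: Dict[str, Any], categories: Sequence[str],
           dense_table: Dict[str, List[float]], dim: int) -> List[float]:
    """Concatenate numeric, categorical, and dense ID-list features."""
    features = [float(context["totalCartValue"])]
    features += one_hot(context["currentlyViewingCategory"], categories)
    features += mean_pool(context["productsInCart"], dense_table, dim)
    return features


context = {
    "totalCartValue": 39.99,
    "currentlyViewingCategory": "men's jeans",
    "productsInCart": ["productId022", "productId109"],
}
dense_table = {"productId022": [0.5, 1.3], "productId109": [1.9, 0.1]}
vector = encode(context, ["men's jeans", "shoes"], dense_table, dim=2)
```

With these inputs the resulting vector has five entries: one numeric value, two one-hot slots, and two pooled dimensions.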

Docs

pip install .

Get started

Contextual Free(User model)

Epsilon Greedy policy bandit(RliteEE)
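
As a library-independent sketch of the idea behind an epsilon-greedy policy (the `EpsilonGreedy` class and its methods are hypothetical and are not the RliteEE API): with probability epsilon a random action is explored, otherwise the action with the highest average reward so far is chosen.

```python
import random
from typing import Dict, Optional


class EpsilonGreedy:
    """Context-free epsilon-greedy policy over a set of named actions."""

    def __init__(self, epsilon: float = 0.1, seed: Optional[int] = None) -> None:
        self.epsilon = epsilon
        self.counts: Dict[str, int] = {}
        self.means: Dict[str, float] = {}
        self._rng = random.Random(seed)

    def add_action(self, action_id: str) -> None:
        self.counts.setdefault(action_id, 0)
        self.means.setdefault(action_id, 0.0)

    def select(self) -> str:
        # Assumes at least one action has been added.
        actions = sorted(self.counts)
        if self._rng.random() < self.epsilon:
            return self._rng.choice(actions)  # explore
        return max(actions, key=lambda a: self.means[a])  # exploit

    def update(self, action_id: str, reward: float) -> None:
        # Incremental running mean of the observed rewards.
        self.counts[action_id] += 1
        n = self.counts[action_id]
        self.means[action_id] += (reward - self.means[action_id]) / n
```

Setting `epsilon` to 0 makes the policy purely greedy, while 1 makes it uniformly random.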

Bernoulli Thompson Sampling Policy bandit(BTS)
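
As a library-independent sketch of Bernoulli Thompson Sampling (the `BernoulliThompsonSampling` class is hypothetical and is not the BTS API, and a uniform Beta(1, 1) prior is assumed): each action keeps success and failure counts, a click-through probability is sampled from each action's Beta posterior, and the action with the largest sample is served.

```python
import random
from typing import Dict, Optional


class BernoulliThompsonSampling:
    """Context-free Thompson Sampling for 0/1 rewards."""

    def __init__(self, seed: Optional[int] = None) -> None:
        self.successes: Dict[str, int] = {}
        self.failures: Dict[str, int] = {}
        self._rng = random.Random(seed)

    def add_action(self, action_id: str) -> None:
        self.successes.setdefault(action_id, 0)
        self.failures.setdefault(action_id, 0)

    def select(self) -> str:
        # Sample from Beta(1 + successes, 1 + failures) for every action.
        samples = {
            a: self._rng.betavariate(1 + self.successes[a], 1 + self.failures[a])
            for a in sorted(self.successes)
        }
        return max(samples, key=samples.get)

    def update(self, action_id: str, reward: int) -> None:
        # Any truthy reward counts as a success, anything else as a failure.
        if reward:
            self.successes[action_id] += 1
        else:
            self.failures[action_id] += 1
```

Actions with little data have wide posteriors and are therefore still sampled occasionally, which is how the policy explores without a separate epsilon parameter.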

Contextual Bandits

Linear bandit(LinUCB)

Linear bandit(LinUCB) using dict context

Logistic bandit(Logistic Upper Confidence Bound-LogisticUCB)
