Releases: Agent-CI/embedsim
v0.1.1
embedsim 0.1.0
Release Notes - embedsim v0.1.0
A Python library for measuring semantic similarity and detecting outliers in text collections using
embeddings.
Features
Core Functionality:
- pairsim() - Compare two texts using cosine similarity of their embeddings
- groupsim() - Analyze text collections and identify outliers using centroid-based coherence scoring
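
As a rough illustration of the two ideas named above, here is a minimal pure-Python sketch of cosine similarity and centroid-based outlier scoring. It does not call embedsim and is not the library's implementation: the helper names `cosine` and `centroid_scores` are hypothetical, and the small hand-written vectors stand in for real embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def centroid_scores(vectors):
    """Score each vector by its cosine similarity to the group's mean vector."""
    n = len(vectors)
    dim = len(vectors[0])
    centroid = [sum(v[i] for v in vectors) / n for i in range(dim)]
    return [cosine(v, centroid) for v in vectors]


# Toy "embeddings" (assumed values, for illustration only):
# three vectors point in a similar direction, the last one does not.
vectors = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [1.0, 0.0, 0.1],
    [0.0, 0.1, 1.0],
]

scores = centroid_scores(vectors)
# The lowest-scoring item is the least coherent with the rest of the group.
outlier_index = min(range(len(scores)), key=scores.__getitem__)
print(outlier_index, [round(s, 3) for s in scores])
```

In this sketch the item with the lowest similarity to the centroid is treated as the outlier candidate. How embedsim itself thresholds or reports outliers is not stated in these notes.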
Embedding Model Support:
- OpenAI models (openai-3-small, openai-3-large) via API
- Local sentence-transformer models (Jina v2, MiniLM, etc.) for privacy and offline use
- Configurable via function parameters or environment variables
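
One common way to support both function parameters and environment variables is to let an explicit argument take precedence and fall back to the environment. The sketch below shows that pattern only. The variable name `EMBEDSIM_MODEL`, the helper `resolve_model`, and the choice of default are assumptions made for illustration and are not taken from the embedsim documentation.

```python
import os

# Model name taken from the release notes; using it as the default is an assumption.
DEFAULT_MODEL = "openai-3-small"


def resolve_model(model=None, env=None):
    """Pick a model name: explicit argument first, then environment, then default.

    `env` defaults to os.environ and can be replaced with a dict in tests.
    EMBEDSIM_MODEL is a hypothetical variable name used only in this sketch.
    """
    env = os.environ if env is None else env
    return model or env.get("EMBEDSIM_MODEL") or DEFAULT_MODEL


# An explicit parameter wins over the environment setting.
print(resolve_model("openai-3-large", {"EMBEDSIM_MODEL": "openai-3-small"}))
# With no parameter, the environment value is used.
print(resolve_model(None, {"EMBEDSIM_MODEL": "openai-3-large"}))
```

Check the project's README for the actual parameter and variable names before relying on this pattern.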
Use Cases:
- Content moderation and off-topic detection
- Document clustering and outlier identification
- Quality assurance for generated content
- Search relevance scoring
- Duplicate detection