Expath

Lightning-fast XML parsing and XPath querying for Elixir, powered by Rust NIFs.

Expath processes XML through Rust's sxd-document and sxd-xpath libraries. In the project's benchmarks against SweetXml it runs 2-10x faster, with the largest gains on large documents.

✨ Key Features

🚀 Blazing Fast: 2-10x faster than SweetXml with Rust-powered NIFs
🔄 Parse-Once, Query-Many: Efficient document reuse for multiple XPath queries
🛡️ Battle-Tested: Built on proven Rust XML libraries (sxd-document, sxd-xpath)
🎯 Simple API: Clean, intuitive interface with comprehensive documentation
⚡ Thread-Safe: Safe concurrent access to parsed documents
🌐 Namespace Support: Full XML namespace support for SOAP, RSS, and complex XML
🔧 Zero Dependencies: No external XML parsers required

🚀 Quick Start

Installation

Add expath to your list of dependencies in mix.exs:

def deps do
  [
    {:expath, "~> 0.2.0"}
  ]
end

Then run:

mix deps.get
mix deps.compile

Basic Usage

Simple XPath query

xml = """
<library>
  <book id="1">
    <title>The Great Gatsby</title>
    <author>F. Scott Fitzgerald</author>
  </book>
  <book id="2">
    <title>1984</title>
    <author>George Orwell</author>
  </book>
</library>
"""

# Extract all book titles
{:ok, titles} = Expath.select(xml, "//title/text()")
# => ["The Great Gatsby", "1984"]

# Find specific book
{:ok, [title]} = Expath.select(xml, "//book[@id='1']/title/text()")
# => ["The Great Gatsby"]

# Count books
{:ok, [count]} = Expath.select(xml, "count(//book)")
# => ["2"]
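As the last query shows, numeric XPath results such as count(//book) come back as strings. A minimal sketch of converting such a result to an integer follows; `CountResult` is a hypothetical helper written for this illustration and is not part of Expath.

```elixir
defmodule CountResult do
  # Hypothetical helper (not part of Expath): turns {:ok, ["2"]} into {:ok, 2}.
  def to_integer({:ok, [value]}) when is_binary(value) do
    case Integer.parse(value) do
      {n, ""} -> {:ok, n}
      _ -> {:error, {:not_an_integer, value}}
    end
  end

  # Error tuples pass through unchanged.
  def to_integer({:error, _reason} = error), do: error
end

# Input shaped like the result of Expath.select(xml, "count(//book)").
CountResult.to_integer({:ok, ["2"]})
# => {:ok, 2}
```

Because the helper works on the result tuple rather than calling Expath itself, it can be piped directly after `Expath.select/2` or `Expath.query/2`.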

Parse-Once, Query-Many (Recommended for Multiple Queries)

# Parse document once
{:ok, doc} = Expath.new(xml)

# Run multiple queries efficiently
{:ok, titles} = Expath.query(doc, "//title/text()")
{:ok, authors} = Expath.query(doc, "//author/text()")
{:ok, book_count} = Expath.query(doc, "count(//book)")

# The document is released by the garbage collector once nothing references it
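When many queries target the same document, it can be convenient to run them as a named batch and stop at the first failure. The sketch below is one possible way to do that; `QueryBatch` and its `query_fun` parameter are hypothetical names introduced here. The parameter defaults to `Expath.query/2` and is injectable only so the example can run with a stub.

```elixir
defmodule QueryBatch do
  # Hypothetical helper: runs named XPath queries against one parsed document.
  # Returns {:ok, %{name => results}} or {:error, {name, reason}} for the first failure.
  def run(doc, queries, query_fun \\ &Expath.query/2) do
    Enum.reduce_while(queries, {:ok, %{}}, fn {name, xpath}, {:ok, acc} ->
      case query_fun.(doc, xpath) do
        {:ok, results} -> {:cont, {:ok, Map.put(acc, name, results)}}
        {:error, reason} -> {:halt, {:error, {name, reason}}}
      end
    end)
  end
end

# Stub standing in for Expath.query/2 so the sketch runs without the NIF.
stub = fn
  _doc, "//title/text()" -> {:ok, ["1984"]}
  _doc, _other -> {:error, :invalid_xpath}
end

QueryBatch.run(:fake_doc, [titles: "//title/text()"], stub)
# => {:ok, %{titles: ["1984"]}}
```

With a real document from `Expath.new/1`, the third argument can be omitted: `QueryBatch.run(doc, titles: "//title/text()", authors: "//author/text()")`.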

📊 Performance Benchmarks

Real-world performance comparison with SweetXml across different document sizes:

| Document Size | Speed Improvement | Use Case |
|---|---|---|
| Small (644 B) | 2-3x faster | API responses, config files |
| Medium (5.6 KB) | 2.3x faster | RSS feeds, small datasets |
| Large (904 KB) | 8-10x faster | Large documents, bulk processing |

Benchmark Results Summary

*** Large XML Performance ***
Expath (Rust NIFs)    78.27 iterations/sec (12.78 ms avg)
SweetXml               7.77 iterations/sec (128.64 ms avg)

Comparison: Expath is 10.07x faster
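The comparison figure is the ratio of the two throughput numbers. As a quick check of the arithmetic (the `Speedup` module is a hypothetical name used only for this illustration):

```elixir
defmodule Speedup do
  # Ratio of two throughput figures (iterations per second), rounded to two decimals.
  def ratio(fast_ips, slow_ips), do: Float.round(fast_ips / slow_ips, 2)
end

# Figures copied from the large-document results above.
Speedup.ratio(78.27, 7.77)
# => 10.07
```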

Run your own benchmarks:

mix run bench/benchmark.exs

📖 API Reference

Core Functions

`Expath.select/2` - Single Query

Perfect for one-off XPath queries.

Expath.select(xml_string, xpath_expression)
# Returns: {:ok, results} | {:error, reason}

`Expath.new/1` - Parse Document

Creates a reusable document for multiple queries.

{:ok, doc} = Expath.new(xml_string)
# Returns: {:ok, %Expath.Document{}} | {:error, reason}

`Expath.query/2` - Query Parsed Document

Query a previously parsed document.

{:ok, results} = Expath.query(document, xpath_expression)
# Returns: {:ok, results} | {:error, reason}

XPath Support

Expath supports the full XPath 1.0 specification:

# Node selection
Expath.select(xml, "//book")                    # All book elements
Expath.select(xml, "/library/book[1]")          # First book
Expath.select(xml, "//book[@id='1']")           # Book with id="1"

# Text extraction
Expath.select(xml, "//title/text()")            # All title text
Expath.select(xml, "//book/@id")                # All id attributes

# Functions
Expath.select(xml, "count(//book)")             # Count books
Expath.select(xml, "//book[position()=1]")     # First book
Expath.select(xml, "//book[contains(@class,'fiction')]") # Contains filter

# Complex expressions
Expath.select(xml, "//book[price > 10]/title/text()") # Conditional selection

XML Namespace Support

Expath provides full support for XML namespaces, essential for SOAP, RSS, and complex XML documents:

# XML with namespaces
xml = """
<library xmlns:book="http://example.com/book" xmlns:meta="http://example.com/metadata">
  <book:collection meta:id="sci-fi">
    <book:title>1984</book:title>
    <book:author>George Orwell</book:author>
  </book:collection>
</library>
"""

# Define namespace mappings
namespaces = %{
  "book" => "http://example.com/book",
  "meta" => "http://example.com/metadata"
}

# Query with namespace support
{:ok, titles} = Expath.select(xml, "//book:title/text()", namespaces)
# => ["1984"]

{:ok, ids} = Expath.select(xml, "//book:collection/@meta:id", namespaces)
# => ["sci-fi"]

# Multiple queries with namespace support
{:ok, doc} = Expath.new(xml)
{:ok, titles} = Expath.query(doc, "//book:title/text()", namespaces)
{:ok, authors} = Expath.query(doc, "//book:author/text()", namespaces)

For comprehensive namespace documentation, see NAMESPACE_GUIDE.md.

Error Handling

Expath provides detailed error information:

# Invalid XML (detected during query)
{:error, :invalid_xml} = Expath.select("<root><unclosed>", "/*")

# Invalid XPath expression
{:error, :invalid_xpath} = Expath.select(xml, "//[invalid")

# XPath evaluation errors
{:error, :xpath_error} = Expath.query(doc, "unknown-function()")
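Since every call returns either {:ok, results} or {:error, reason}, callers can branch on the reason atom. The following is a minimal sketch that assumes the three error atoms listed above are the ones of interest; `XmlErrors` is a hypothetical module and the messages are illustrative.

```elixir
defmodule XmlErrors do
  # Hypothetical helper: converts Expath-style result tuples into log-friendly strings.
  def describe({:ok, results}), do: "ok: #{length(results)} result(s)"
  def describe({:error, :invalid_xml}), do: "the input is not well-formed XML"
  def describe({:error, :invalid_xpath}), do: "the XPath expression could not be parsed"
  def describe({:error, :xpath_error}), do: "the XPath expression failed during evaluation"
  # Catch-all for any reason not listed in this README.
  def describe({:error, other}), do: "unexpected error: #{inspect(other)}"
end

XmlErrors.describe({:error, :invalid_xml})
# => "the input is not well-formed XML"
```

The catch-all clause keeps the helper from crashing if a reason outside the documented set appears.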

Performance

Expath is designed for high-performance XML processing:

Native Speed: Rust NIFs provide near-native performance
Zero-Copy: Efficient string handling between Elixir and Rust
Resource Caching: Parse once, query many times without re-parsing
Memory Efficient: Automatic memory management via Erlang garbage collection

Performance Example

# Large XML document
xml = File.read!("large_document.xml")

# Parse once (expensive operation)
{:ok, doc} = Expath.new(xml)

# Multiple queries (very fast - no re-parsing)
Enum.each(1..1000, fn _i ->
  {:ok, _results} = Expath.query(doc, "//some/xpath")
end)
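To see how much re-parsing costs on your own data, the two approaches can be timed side by side. Below is a rough sketch built on Erlang's `:timer.tc/1`; the `Timing` module is a hypothetical helper, and for serious measurements the bundled benchmark script is the better tool.

```elixir
defmodule Timing do
  # Runs `fun` `runs` times and returns the mean wall-clock time in microseconds.
  def average_us(fun, runs) when is_function(fun, 0) and is_integer(runs) and runs > 0 do
    {total_us, :ok} = :timer.tc(fn -> Enum.each(1..runs, fn _ -> fun.() end) end)
    total_us / runs
  end
end

# With Expath installed, the two styles could be compared like this:
#   Timing.average_us(fn -> Expath.select(xml, "//some/xpath") end, 100)
#   Timing.average_us(fn -> Expath.query(doc, "//some/xpath") end, 100)

# Runnable demonstration with a trivial function in place of a query.
Timing.average_us(fn -> Enum.sum(1..1_000) end, 100)
```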

Platform Support

Expath supports all platforms where Rust and Erlang are available:

Linux (x86_64, aarch64)
macOS (Intel, Apple Silicon)
Windows (x86_64)

Apple Silicon (M1/M2) Setup

Expath includes special configuration for Apple Silicon Macs. If you encounter linking issues, ensure you have:

Native Erlang installation (not x86_64 via Rosetta)
Native Rust toolchain for aarch64-apple-darwin

The included Cargo configuration handles the necessary linker flags automatically.

Examples

RSS Feed Processing

defmodule RSSProcessor do
  def process_feed(rss_xml) do
    {:ok, doc} = Expath.new(rss_xml)

    {:ok, titles} = Expath.query(doc, "//item/title/text()")
    {:ok, links} = Expath.query(doc, "//item/link/text()")
    {:ok, descriptions} = Expath.query(doc, "//item/description/text()")

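    # Assumes every item has a title, link, and description;
    # otherwise the three lists will not line up.
    [titles, links, descriptions]
    |> Enum.zip()
    |> Enum.map(fn {title, link, description} ->
      %{title: title, link: link, description: description}
    end)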
  end
end

Configuration File Parsing

defmodule ConfigParser do
  def parse_config(xml_config) do
    {:ok, doc} = Expath.new(xml_config)

    # Each query returns a list, so match on the single expected value
    {:ok, [database_host]} = Expath.query(doc, "//database/@host")
    {:ok, [database_port]} = Expath.query(doc, "//database/@port")
    {:ok, features} = Expath.query(doc, "//features/feature/@name")

    %{
      database: %{host: database_host, port: database_port},
      features: features
    }
  end
end
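Query results are always lists, so optional settings need a fallback when nothing matches. One way to handle this is sketched below; `ConfigValue` is a hypothetical helper and not part of Expath.

```elixir
defmodule ConfigValue do
  # Hypothetical helper: first result of a query, or `default` when
  # the result list is empty or the query failed.
  def first({:ok, [value | _rest]}, _default), do: value
  def first({:ok, []}, default), do: default
  def first({:error, _reason}, default), do: default
end

# Inputs shaped like the results of Expath.query/2.
ConfigValue.first({:ok, ["5432"]}, "5432")
# => "5432"
ConfigValue.first({:ok, []}, "localhost")
# => "localhost"
```

With a parsed document this could read `ConfigValue.first(Expath.query(doc, "//database/@host"), "localhost")`.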

Data Extraction Pipeline

defmodule DataExtractor do
  def extract_products(xml_data) do
    {:ok, doc} = Expath.new(xml_data)

    # Extract in parallel using cached document
    tasks = [
      Task.async(fn -> Expath.query(doc, "//product/@id") end),
      Task.async(fn -> Expath.query(doc, "//product/name/text()") end),
      Task.async(fn -> Expath.query(doc, "//product/price/text()") end),
      Task.async(fn -> Expath.query(doc, "//product/category/text()") end)
    ]

    [ids, names, prices, categories] =
      tasks
      |> Enum.map(&Task.await/1)
      |> Enum.map(fn {:ok, results} -> results end)

    [ids, names, prices, categories]
    |> Enum.zip()
    |> Enum.map(fn {id, name, price, category} ->
      %{id: id, name: name, price: price, category: category}
    end)
  end
end

Development

Prerequisites

Elixir 1.18 or later
Erlang/OTP 27 or later
Rust 1.70 or later
C compiler (gcc, clang, or MSVC)

Building from Source

git clone https://github.com/wearecococo/expath.git
cd expath
mix deps.get
mix compile

Running Tests

mix test

Building Documentation

mix docs

Docker Development

For cross-platform testing or if you prefer containerized development, Expath includes comprehensive Docker support:

Quick Start with Docker

# Run all tests in Linux container
./scripts/docker-test.sh

# Or use docker-compose for specific tasks
docker-compose run test
docker-compose run benchmark
docker-compose run quality

Available Docker Services

dev: Development environment with all dependencies
test: Run the full test suite
benchmark: Execute performance benchmarks
quality: Run code quality checks (Credo)

Docker Commands

# Build and test everything
docker-compose up test

# Run interactive development shell
docker-compose run dev iex -S mix

# Execute benchmarks
docker-compose run benchmark

# Check code quality
docker-compose run quality

# Clean up containers
docker-compose down --volumes

Multi-Architecture Testing

The Docker setup supports testing on different architectures:

# Test on current architecture
docker-compose run test

# Build for specific platform (requires BuildKit)
DOCKER_PLATFORM=linux/amd64 docker-compose run test

This is particularly useful for verifying that the NIFs build and run correctly on each target platform before deployment.

Contributing

Fork the repository
Create your feature branch (git checkout -b my-new-feature)
Write tests for your changes
Ensure all tests pass (mix test)
Commit your changes (git commit -am 'Add some feature')
Push to the branch (git push origin my-new-feature)
Create a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Built on top of the excellent sxd-document and sxd-xpath Rust crates
Uses Rustler for safe Elixir-Rust interoperability
Inspired by the need for high-performance XML processing in Elixir applications

Changelog

v0.1.0 (Initial Release)

High-performance XML parsing via Rust NIFs
Full XPath 1.0 support
Parse-once, query-many Document resource API
Comprehensive error handling
Apple Silicon support
Complete test suite and documentation
