castoredc/avrocurio

AvroCurio

Apache Avro serialization/deserialization with Confluent Schema Registry framing using Apicurio Schema Registry.

Installation

Using uv, poetry, or pip:

uv add avrocurio
poetry add avrocurio
pip install avrocurio

Quick Start

1. Define your schema using dataclasses-avroschema

from dataclasses import dataclass
from dataclasses_avroschema import AvroModel

@dataclass
class User(AvroModel):
    name: str
    age: int
    email: str

2. Serialize and deserialize data

import asyncio
from avrocurio import AvroSerializer, ApicurioClient, ApicurioConfig

async def main():
    # Configure connection to Apicurio Registry
    config = ApicurioConfig(base_url="http://localhost:8080")

    # Create client and serializer
    async with ApicurioClient(config) as client:
        serializer = AvroSerializer(client)

        # Create a user instance
        user = User(name="John Doe", age=30, email="john@example.com")

        # Serialize the user to an Avro binary with Confluent registry framing.
        # Under the hood this will perform a lookup against Apicurio to get the
        # artifact ID for the schema, which is then prepended to the Avro binary
        # (along with a magic byte).
        serialized = await serializer.serialize(user)

        # Deserialize the binary back to a User instance.
        deserialized_user = await serializer.deserialize(serialized, User)

asyncio.run(main())

Confluent Schema Registry Wire Format

AvroCurio implements the Confluent Schema Registry wire format:

+----------------+------------------+------------------+
| Magic Byte     | Schema ID        | Avro Payload     |
| (1 byte = 0x0) | (4 bytes, BE)    | (remaining)      |
+----------------+------------------+------------------+
  • Magic Byte: Always 0x0 to identify Confluent wire format
  • Schema ID: 4-byte big-endian integer referencing the schema in the registry
  • Avro Payload: Standard Avro binary-encoded data
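
The framing described above can be sketched with the standard library alone. This is a minimal illustration of the layout, not AvroCurio's own code: the helper names `frame` and `unframe` are hypothetical, and the payload is treated as opaque bytes that some Avro encoder has already produced.

```python
from __future__ import annotations

import struct

MAGIC_BYTE = 0

# ">BI" = big-endian, one unsigned byte (magic) followed by a
# 4-byte unsigned integer (schema ID): 5 header bytes in total.
HEADER = struct.Struct(">BI")


def frame(schema_id: int, payload: bytes) -> bytes:
    """Prepend the magic byte and schema ID to an Avro-encoded payload."""
    return HEADER.pack(MAGIC_BYTE, schema_id) + payload


def unframe(message: bytes) -> tuple[int, bytes]:
    """Split a framed message into (schema_id, avro_payload)."""
    if len(message) < HEADER.size:
        raise ValueError("message is shorter than the 5-byte header")
    magic, schema_id = HEADER.unpack_from(message)
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic:#x}")
    return schema_id, message[HEADER.size:]
```

A consumer would call `unframe` first, use the returned schema ID to look up the writer schema in the registry, and only then decode the remaining bytes as Avro.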

Schema Caching

Schema caching is handled automatically by the ApicurioClient for performance.
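
To illustrate why caching helps, here is a rough sketch of a per-ID schema cache. It is not how ApicurioClient is implemented: `SchemaCache`, `fake_fetch`, and `demo` are hypothetical names, and the fetch function stands in for a registry round trip. The point is only that repeated lookups for the same schema ID need a single network call.

```python
import asyncio
from typing import Awaitable, Callable, Dict


class SchemaCache:
    """Remember schemas by ID so each one is fetched at most once."""

    def __init__(self, fetch: Callable[[int], Awaitable[str]]) -> None:
        self._fetch = fetch
        self._schemas: Dict[int, str] = {}

    async def get(self, schema_id: int) -> str:
        # Only go to the registry on a cache miss.
        if schema_id not in self._schemas:
            self._schemas[schema_id] = await self._fetch(schema_id)
        return self._schemas[schema_id]


async def demo() -> int:
    """Look up the same schema twice and return how many fetches happened."""
    calls = 0

    async def fake_fetch(schema_id: int) -> str:
        # Stand-in for an HTTP request to the registry (assumption).
        nonlocal calls
        calls += 1
        return '{"type": "string"}'

    cache = SchemaCache(fake_fetch)
    await cache.get(1)
    await cache.get(1)
    return calls
```

Running `asyncio.run(demo())` shows that the second lookup is served from memory rather than from the fetch function.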

Development

Requirements

Integration tests require a running Apicurio Registry. Running it through Docker or Podman using Compose is easiest:

docker compose up

The tests assume the registry is listening on port 8080 by default; set APICURIO_URL to point them at a different instance.
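
As a sketch of how such an override is typically resolved, the snippet below assumes APICURIO_URL is an environment variable holding a full base URL and that the fallback is `http://localhost:8080`. The helper `registry_url` is hypothetical and not taken from the AvroCurio test suite.

```python
import os
from typing import Mapping, Optional

# Assumed default, matching the port 8080 mentioned above.
DEFAULT_URL = "http://localhost:8080"


def registry_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the registry base URL, preferring APICURIO_URL when it is set."""
    env = os.environ if env is None else env
    return env.get("APICURIO_URL", DEFAULT_URL)
```

Passing the environment in as a parameter keeps the lookup easy to exercise without touching the real process environment.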

Running Tests

# Run all tests
uv run pytest

# Run specific test file
uv run pytest tests/test_serializer.py

# Run with verbose output
uv run pytest -v

# Skip integration tests
uv run pytest -m "not integration"

License

AvroCurio is open-source software released under the BSD-2-Clause Plus Patent License. This license is designed to be a simple permissive license that is compatible with the GNU General Public License (GPL), version 2, and that includes an express patent grant.

Please review the LICENSE file for the full text of the license.
