Tansu

Tansu is a modern drop-in replacement for Apache Kafka, without the cost of broker-replicated storage for durability. Tansu is in early development and licensed under the GNU AGPL. Written in async 🚀 Rust 🦀

Tansu brokers are:

  • Kafka API compatible (exceptions: transactions and the idempotent
    producer)
  • Stateless, with instant scaling up or down; no more planning and
    reassigning partitions to a broker
  • Available with PostgreSQL or S3 storage engines

For data durability, Tansu relies on the storage engine: S3 or PostgreSQL.

Stateless brokers are cost effective, with no network replication or duplicate data storage charges.

Stateless brokers also avoid the ceremony of Raft or ZooKeeper.

S3

You can have 3 brokers running in separate Availability Zones for resilience.

Each broker is stateless: brokers can come and go without affecting the leadership of consumer groups. The leader and in-sync replica is the broker serving your request. No more ping pong between client and broker.

With Tansu there is no replication between brokers: the data transfer cost between Availability Zones is a $0 line item, and there are $0 in duplicated storage charges.

With stateless brokers, you can run Tansu in a serverless architecture: spin up a broker for the duration of a Kafka API request, then spin down. No more idle brokers.

Tansu requires that the underlying S3 service support conditional PUT requests. While AWS S3 now supports conditional writes, the support is limited to not overwriting an existing object. Stateless brokers need a compare-and-set operation, which is not currently available in AWS S3.

Much like the Kafka protocol, the S3 protocol allows vendors to differentiate, offering different levels of service while retaining compatibility with the underlying API. You can use MinIO or Tigris, among a number of other vendors supporting conditional PUT.
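
As an illustration, a conditional PUT is just an If-None-Match: * header on an ordinary S3 PUT. A sketch against the MinIO service used later in this README (default credentials; requires a curl build with --aws-sigv4 support):

# Create-if-absent: the PUT succeeds only if the object does not already exist.
curl -X PUT "http://localhost:9000/tansu/example-object" \
  --user "minioadmin:minioadmin" \
  --aws-sigv4 "aws:amz:us-east-1:s3" \
  -H "If-None-Match: *" \
  -d "hello"

# Repeating the same request fails with 412 Precondition Failed.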

Tansu uses the object_store crate, which provides a multi-cloud API for storage. An alternative is a DynamoDB-based commit protocol, providing conditional write support for AWS S3 instead.
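
For completeness, object_store configures the DynamoDB-based commit protocol through its conditional_put setting. The dynamo:<table-name> value format is object_store's; the table name below, and the assumption that Tansu forwards this environment variable to object_store, are hypothetical:

# Hypothetical .env addition: layer compare-and-set over AWS S3 via DynamoDB.
AWS_CONDITIONAL_PUT="dynamo:tansu-commits"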

Configuration

The storage-engine parameter is a named S3 URL that specifies the bucket to be used. The following configures an S3 storage engine called "minio" using the "tansu" bucket (full context is in compose.yaml):

--storage-engine minio=s3://tansu/

On first startup with the above compose.yaml, you'll need to create a bucket, an access key, and a secret in MinIO.

Just bring minio up, without tansu:

docker compose up -d minio

The MinIO console should now be running on http://localhost:9001. Log in using the default credentials: username "minioadmin", password "minioadmin". Follow the bucket creation instructions to create a bucket called "tansu", and then create an access key and secret. Use your newly created access key and secret to update the following environment in .env:

# Your AWS access key:
AWS_ACCESS_KEY_ID="access key"

# Your AWS secret:
AWS_SECRET_ACCESS_KEY="secret"

# The endpoint URL of the S3 service:
AWS_ENDPOINT="http://localhost:9000"

# Allow HTTP requests to the S3 service:
AWS_ALLOW_HTTP="true"
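
If you prefer the command line to the console, the bucket can also be created with the MinIO client (a sketch; the alias name "local" is arbitrary, and access keys can still be created in the console):

mc alias set local http://localhost:9000 minioadmin minioadmin
mc mb local/tansu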

Once this is done, you can start tansu with:

docker compose up -d tansu
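
One quick way to check that the broker is up and answering Kafka API requests:

kafka-broker-api-versions --bootstrap-server localhost:9092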

Using the regular Apache Kafka CLI, you can create topics, produce, and consume messages with Tansu:

kafka-topics \
  --bootstrap-server localhost:9092 \
  --partitions=3 \
  --replication-factor=1 \
  --create --topic test
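
You can confirm the partition layout of the new topic with --describe:

kafka-topics \
  --bootstrap-server localhost:9092 \
  --describe --topic test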

Producer:

echo "hello world" | kafka-console-producer \
    --bootstrap-server localhost:9092 \
    --topic test

Consumer:

kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic test \
  --from-beginning \
  --property print.timestamp=true \
  --property print.key=true \
  --property print.offset=true \
  --property print.partition=true \
  --property print.headers=true \
  --property print.value=true
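
Since consumer group leadership sits with whichever broker serves the request, group consumers also work as usual (the group name here is just an example):

kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic test \
  --group example-group \
  --from-beginning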

PostgreSQL

The major differences between Apache Kafka and the Tansu PostgreSQL storage engine are:

  • Messages are not stored in segments, so retention and
    compaction policies can be applied immediately.
  • Message ordering is total over all topics, not restricted to a
    single topic partition.
  • Brokers do not replicate messages, relying on continuous
    archiving instead.

To switch between the minio and PostgreSQL examples, first shut down Tansu:

docker compose down tansu

Switch to the PostgreSQL storage engine by updating .env:

# minio storage engine
# STORAGE_ENGINE="minio=s3://tansu/"

# PostgreSQL storage engine
STORAGE_ENGINE="pg=postgres://postgres:postgres@db"

Bring Tansu back up:

docker compose up -d tansu

Using the regular Apache Kafka CLI, you can create topics, produce, and consume messages with Tansu:

kafka-topics \
  --bootstrap-server localhost:9092 \
  --partitions=3 \
  --replication-factor=1 \
  --create --topic test

Producer:

echo "hello world" | kafka-console-producer \
    --bootstrap-server localhost:9092 \
    --topic test

Consumer:

kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic test \
  --from-beginning \
  --property print.timestamp=true \
  --property print.key=true \
  --property print.offset=true \
  --property print.partition=true \
  --property print.headers=true \
  --property print.value=true

Or using librdkafka to produce:

echo "Lorem ipsum dolor..." | \
  ./examples/rdkafka_example -P \
  -t test \
  -b localhost:9092 \
  -z gzip

Consumer:

./examples/rdkafka_example \
  -C \
  -t test \
  -b localhost:9092
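
Because the PostgreSQL engine stores messages in tables rather than segments, you can inspect the broker's schema directly. A sketch against the db service from compose.yaml (the actual table layout is Tansu's own and may change):

docker compose exec db psql -U postgres -c '\dt'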

Feedback

Please raise an issue if you encounter a problem.

License

Tansu is licensed under the GNU AGPL.