BARANI Open Sourced Kafka Streaming Manifests for IoT Data Infrastructure
At BARANI, accurate environmental measurement does not stop at the sensor. Our weather stations and meteorological sensors generate data that must move reliably from the field into systems where it can be stored, transformed, analyzed, and acted on. That requires more than hardware. It requires dependable data infrastructure.
Today, we are open sourcing part of that infrastructure: our Kafka Platform Manifests repository.
The repository contains Kubernetes manifests for a Kafka-centered streaming platform built around Strimzi and related services. It is designed as a single, structured open-source repository with component directories for Kafka, Schema Registry, ksqlDB, Kafka Bridge, Debezium Connect, and Camel Connect.
This is not a release of our business-specific stream-processing logic. Instead, it shares the deployment patterns, component wiring, and infrastructure manifests that can help teams build and operate similar streaming data platforms. The repository defines Kafka topics, ingestion paths, supporting services, and example CDC and webhook ingestion components, while keeping production application logic outside the public repository.
Why we are sharing this
IoT and environmental monitoring systems generate continuous streams of data. For us, this means handling telemetry, webhook payloads, device-related events, database changes, and downstream processing stages in a way that is repeatable and maintainable.
But this challenge is not unique to BARANI. Nonprofits, NGOs, environmental protection organizations, research teams, citizen-science projects, municipalities, smart city teams, and resilient city initiatives often face the same problem: they collect valuable environmental data from sensors, field devices, databases, and partner systems, but still need reliable infrastructure to move that data to where it can be analyzed and acted on.
Kafka is well suited for this kind of architecture because it gives teams a durable event backbone. Kubernetes gives us a practical way to deploy, version, and reason about that infrastructure as manifests. Strimzi then provides the Kubernetes-native operator layer for Kafka.
By publishing these manifests, we want to make our approach easier to inspect, adapt, and improve. More importantly, we want to give mission-driven teams a realistic starting point for building Kafka-based ingestion and streaming systems on Kubernetes.
Our hope is that this work can support organizations focused on climate resilience, environmental monitoring, conservation, disaster preparedness, sustainable agriculture, air-quality monitoring, water-resource management, smart city infrastructure, resilient city planning, and other projects that help protect our planet and strengthen communities.
Reliable environmental decisions depend on reliable environmental data. So do smarter, more resilient cities. By sharing part of our infrastructure openly, we want to help more teams spend less time rebuilding the same platform foundations and more time using data to understand, protect, and restore the natural world while building communities that are better prepared for the future.
What is included
The repository is organized into component directories, each with its own README and deployment guidance.
Kafka core
The kafka/ directory includes raw Kubernetes manifests for a Strimzi-managed Apache Kafka cluster running in KRaft mode, plus an optional Kafka UI deployment. It includes manifests for the Kafka node pool, Kafka custom resource, Kafka user, Kafka rebalance resource, and Kafka UI.
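To give a sense of what a Strimzi-managed KRaft cluster looks like, the sketch below pairs a KafkaNodePool with a Kafka custom resource. It is an illustration, not a copy of the repository's manifests: the pool name, replica count, storage size, listener settings, and replication values are assumptions, while the kafka namespace and kafka-kraft cluster name follow the shared assumptions described in the root README. The apiVersion and annotations depend on the Strimzi release you install.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: dual-role                      # hypothetical pool name
  namespace: kafka
  labels:
    strimzi.io/cluster: kafka-kraft    # binds this pool to the Kafka resource below
spec:
  replicas: 3                          # assumed; size for your environment
  roles:
    - controller                       # KRaft controllers and brokers share nodes here
    - broker
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        size: 20Gi                     # assumed; also set a storage class if needed
        deleteClaim: false
---
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: kafka-kraft
  namespace: kafka
  annotations:
    # Required on older Strimzi releases; newer releases use node pools
    # and KRaft by default.
    strimzi.io/node-pools: enabled
    strimzi.io/kraft: enabled
spec:
  kafka:
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false                     # SASL_PLAINTEXT: authenticated, not encrypted
        authentication:
          type: scram-sha-512
    config:
      default.replication.factor: 3    # assumed; must not exceed the broker count
      min.insync.replicas: 2
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
  entityOperator:
    topicOperator: {}                  # reconciles KafkaTopic resources
    userOperator: {}                   # reconciles KafkaUser resources
```

With a cluster named kafka-kraft, Strimzi exposes the internal bootstrap service as kafka-kraft-kafka-bootstrap, which is the address other components in the namespace connect to.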
Schema Registry
The schema-registry/ directory contains manifests for deploying Confluent Schema Registry alongside the Kafka cluster. It includes deployment configuration, an in-cluster service on port 8081, and optional ingress with TLS and basic authentication.
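A minimal Deployment and Service for Schema Registry might look like the following. This is a sketch under assumptions rather than the repository's manifest: the resource names, labels, and replica count are hypothetical, the image tag is left as a placeholder, and the SCRAM client settings that an authenticated cluster would need are omitted for brevity.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: schema-registry                # hypothetical name
  namespace: kafka
spec:
  replicas: 1
  selector:
    matchLabels:
      app: schema-registry
  template:
    metadata:
      labels:
        app: schema-registry
    spec:
      # Stops Kubernetes from injecting SCHEMA_REGISTRY_PORT from the Service
      # below, which the Confluent image can misread as configuration.
      enableServiceLinks: false
      containers:
        - name: schema-registry
          image: "confluentinc/cp-schema-registry:<version>"   # placeholder tag
          ports:
            - containerPort: 8081
          env:
            - name: SCHEMA_REGISTRY_HOST_NAME
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: SCHEMA_REGISTRY_LISTENERS
              value: http://0.0.0.0:8081
            - name: SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS
              value: kafka-kraft-kafka-bootstrap:9092
---
apiVersion: v1
kind: Service
metadata:
  name: schema-registry
  namespace: kafka
spec:
  selector:
    app: schema-registry
  ports:
    - name: http
      port: 8081                       # in-cluster port described above
      targetPort: 8081
```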
ksqlDB
The ksqldb/ directory contains manifests for a ksqlDB server, helper CLI pod, persistent volume claim, service, and optional ingress. This provides a foundation for querying and working with streams once data is flowing through Kafka.
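As an illustration of how a ksqlDB server is pointed at the cluster, the sketch below shows only the Deployment. The resource name, service ID, and Schema Registry URL are assumptions, the image tag is a placeholder, and the persistent volume claim, service, ingress, and any SCRAM client settings are left out.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ksqldb-server                  # hypothetical name
  namespace: kafka
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ksqldb-server
  template:
    metadata:
      labels:
        app: ksqldb-server
    spec:
      containers:
        - name: ksqldb-server
          image: "confluentinc/ksqldb-server:<version>"   # placeholder tag
          ports:
            - containerPort: 8088
          env:
            - name: KSQL_BOOTSTRAP_SERVERS
              value: kafka-kraft-kafka-bootstrap:9092
            - name: KSQL_LISTENERS
              value: http://0.0.0.0:8088
            - name: KSQL_KSQL_SERVICE_ID
              value: ksqldb_example_             # assumed; prefixes internal topics
            - name: KSQL_KSQL_SCHEMA_REGISTRY_URL
              value: http://schema-registry:8081 # assumes a Service with this name
```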
Kafka Bridge
The kafka-bridge/ directory includes manifests for deploying Strimzi Kafka Bridge, allowing HTTP-based access patterns where they are useful. The included manifests define the Kafka Bridge custom resource and optional ingress with TLS and basic authentication.
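A KafkaBridge resource is compact. The following sketch assumes a bridge named http-bridge and a SCRAM user named bridge-user, both hypothetical, and omits the optional ingress.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaBridge
metadata:
  name: http-bridge                    # hypothetical name
  namespace: kafka
spec:
  replicas: 1
  bootstrapServers: kafka-kraft-kafka-bootstrap:9092
  authentication:
    type: scram-sha-512
    username: bridge-user              # hypothetical KafkaUser
    passwordSecret:
      secretName: bridge-user          # Secret holding the SCRAM password
      password: password               # key inside that Secret
  http:
    port: 8080                         # port the bridge's HTTP API listens on
```

Clients can then produce records with an HTTP POST to /topics/<topic-name> on the bridge service, which suits devices or partner systems that cannot run a native Kafka client.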
Debezium Connect
The debezium-connect/ directory contains manifests for a Strimzi Kafka Connect cluster with the Debezium PostgreSQL connector, example topic definitions, an example PostgreSQL source connector, and least-privilege RBAC for reading database credentials from a Kubernetes Secret.
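One common way to combine these pieces is sketched below: a Role limited to a single named Secret, a RoleBinding for the Connect service account, and a KafkaConnector that resolves credentials at runtime. The Secret name, database host, database name, table list, and topic prefix are all hypothetical. The sketch also assumes a Connect cluster named debezium-connect that enables connector resources and configures Strimzi's KubernetesSecretConfigProvider under the alias secrets; whether the repository wires credentials exactly this way should be checked against its README.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: connect-read-db-credentials    # hypothetical name
  namespace: kafka
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["postgres-credentials"]   # only this Secret is readable
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: connect-read-db-credentials
  namespace: kafka
subjects:
  - kind: ServiceAccount
    name: debezium-connect-connect     # Strimzi names it <connect-cluster>-connect
    namespace: kafka
roleRef:
  kind: Role
  name: connect-read-db-credentials
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: postgres-source                # hypothetical name
  namespace: kafka
  labels:
    strimzi.io/cluster: debezium-connect   # assumed KafkaConnect cluster name
spec:
  class: io.debezium.connector.postgresql.PostgresConnector
  tasksMax: 1
  config:
    plugin.name: pgoutput
    database.hostname: postgres.example.com              # placeholder host
    database.port: 5432
    database.user: "${secrets:kafka/postgres-credentials:username}"
    database.password: "${secrets:kafka/postgres-credentials:password}"
    database.dbname: exampledb                            # placeholder database
    topic.prefix: example                                 # prefixes change topics
    table.include.list: public.measurements               # placeholder table
```

Scoping the Role with resourceNames means the Connect pods can read the one Secret they need and nothing else in the namespace.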
Camel Connect
The camel-connect/ directory contains manifests for a Strimzi Kafka Connect cluster using the Camel Netty HTTP connector to receive webhook requests and publish them to Kafka topics. It includes connector resources, topics, services, ingress, network policy, an optional webhook logger, and a PlantUML topology source.
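The webhook path can be pictured as a topic plus a source connector that listens for HTTP requests. The sketch below is an assumption-heavy illustration: the topic name, connector name, Connect cluster name, port, and path are hypothetical, and the connector class and camel.source.path.* property names should be verified against the Camel Kafka Connector version you deploy.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: webhook-raw                    # hypothetical topic
  namespace: kafka
  labels:
    strimzi.io/cluster: kafka-kraft
spec:
  partitions: 3                        # assumed
  replicas: 3                          # assumed; must not exceed the broker count
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: webhook-source                 # hypothetical name
  namespace: kafka
  labels:
    strimzi.io/cluster: camel-connect  # assumed KafkaConnect cluster name
spec:
  class: org.apache.camel.kafkaconnector.nettyhttp.CamelNettyhttpSourceConnector
  tasksMax: 1                          # a single task binds the listening port
  config:
    topics: webhook-raw                # destination topic for received requests
    camel.source.path.protocol: http
    camel.source.path.host: "0.0.0.0"
    camel.source.path.port: 8080       # assumed port, exposed via a Service
    camel.source.path.path: webhook    # assumed URL path
```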
How the pieces fit together
The baseline deployment order starts with Kafka, continues with optional platform services such as Schema Registry, ksqlDB, and Kafka Bridge, and ends with integration components such as Debezium Connect and Camel Connect.
That structure reflects a practical streaming architecture:
First, the Kafka cluster provides the event backbone.
Next, services such as Schema Registry and ksqlDB support schema management, stream inspection, and query workflows.
Finally, ingestion components bring data into Kafka from external systems. Debezium Connect is used for PostgreSQL change data capture examples, while Camel Connect is used for webhook ingestion examples.
What you should customize before using it
These manifests are intended as a reusable starting point, not a one-command production deployment. The root README documents the shared assumptions: the kafka namespace, the kafka-kraft cluster name, the Kafka bootstrap service, the SCRAM Secret name, placeholder ingress hosts under example.com, and the node labels needed to satisfy the affinity rules.
Before applying the manifests in your own environment, review and replace ingress hostnames, image registry placeholders, Secret names and contents, storage classes and storage sizes, node labels and affinity rules, replication factors, and sizing defaults.
Security-sensitive values are intentionally not committed. The repository is designed to avoid live secrets, so any Secret that a manifest references must be created separately in your Kubernetes cluster. Some manifests use SASL_PLAINTEXT, which authenticates clients but does not encrypt traffic, as an internal example, with guidance to switch to TLS or SASL_SSL when encryption in transit is required.
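For teams that need encryption in transit, the change on the Strimzi side is small. The sketch below shows a fragment of the Kafka resource with a TLS-enabled internal listener, followed by a SCRAM user. The listener name, port, and user name are assumptions and not taken from the repository.

```yaml
# Fragment of the Kafka resource: an internal listener with TLS enabled.
# Clients connecting to it use the SASL_SSL security protocol.
spec:
  kafka:
    listeners:
      - name: tls                      # assumed listener name
        port: 9093
        type: internal
        tls: true
        authentication:
          type: scram-sha-512
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: example-client                 # hypothetical user
  namespace: kafka
  labels:
    strimzi.io/cluster: kafka-kraft
spec:
  authentication:
    type: scram-sha-512                # User Operator stores the password in a Secret
```

With this in place, clients point at port 9093 and trust the cluster CA certificate, which Strimzi publishes in the kafka-kraft-cluster-ca-cert Secret for a cluster named kafka-kraft.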
Why this matters for IoT and environmental data
Reliable meteorological and IoT systems are built from multiple layers. Sensors must measure accurately. Connectivity must be dependable. Data pipelines must preserve events, handle scale, and support downstream analysis.
Open sourcing this repository gives developers, integrators, and infrastructure teams a clearer look at one way to build the data-streaming layer behind such systems. It also helps separate reusable infrastructure from proprietary product and domain logic.
That separation matters. The repository provides platform manifests, example wiring, validation commands, deployment order, and component-specific documentation. It does not expose the full production logic that transforms raw events into final domain outputs.
License and contributions
The repository is released under the MIT License, with copyright attributed to BARANI DESIGN Technologies.
We welcome careful review and useful contributions. The contributing guidance asks contributors to keep changes focused on manifests, examples, validation, and documentation; avoid committing secrets, internal hostnames, internal IPs, or private registry references; and validate changed YAML files before submitting.
Security-sensitive reports should not be opened as public issues. The security policy asks reporters to use private vulnerability reporting, a private security advisory, or another private maintainer channel before disclosing details publicly.
Explore the repository
You can find the open-source Kafka streaming manifests on GitHub. Review the README, inspect the component directories, adapt the placeholders for your environment, and use the manifests as a starting point for your own Kafka-based IoT or streaming data platform.
