Apache Kafka

by Apache Software Foundation

Distributed event streaming platform with high-throughput log-based storage, consumer group offset management, and retention-based event replay for APAC data engineering teams building real-time data pipelines and event-driven architectures.

AIMenta verdict
Recommended
5/5

"Apache Kafka is the distributed event streaming platform for APAC data engineering — high-throughput log-based storage, log compaction, consumer groups, and event replay for real-time data pipelines. Best for APAC organisations building event-driven architectures with millions of events per second."

What it does

Key features

  • Log-based storage — append-only partitioned log with configurable retention; any event still within the retention window can be replayed (or indefinitely, where unlimited retention is configured)
  • Consumer groups — horizontally scalable event consumption with independent offset tracking per group
  • Kafka Connect — 200+ source and sink connectors for APAC data pipeline integration
  • Kafka Streams — stateful stream processing (joins, windows, aggregations) on Kafka topics
  • Replication — configurable partition replication across APAC brokers for fault tolerance
  • Compaction — topic compaction that retains the latest value per key for APAC materialised view use cases
  • KRaft mode — ZooKeeper-free Kafka metadata management (production-ready since Kafka 3.3)
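The compaction feature above can be sketched as a keyed log that keeps only the newest value per key. This is a minimal Python model of the semantics (including `None` tombstones), not the broker's actual segment-cleaning implementation:

```python
# Toy model of Kafka topic compaction: after compaction, only the
# most recent value per key survives; a None value is a tombstone
# that deletes the key entirely.
def compact(log):
    """log: list of (key, value) records in append order."""
    latest = {}
    for key, value in log:
        if value is None:
            latest.pop(key, None)  # tombstone removes the key
        else:
            latest[key] = value    # newer value replaces older one
    return latest

log = [("user:1", "a"), ("user:2", "b"), ("user:1", "c"), ("user:2", None)]
# compact(log) -> {"user:1": "c"}
```

This "latest value per key" result is what makes compacted topics usable as durable materialised views or changelog backups.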
When to reach for it

Best for

  • APAC data engineering teams building real-time event pipelines with multiple independent consumers
  • Event sourcing and CQRS architectures where APAC systems need replay capability and audit history
  • High-throughput APAC applications (millions of events per second) where RabbitMQ throughput is insufficient
  • APAC organisations implementing Change Data Capture from transactional databases to downstream analytics systems
Don't get burned

Limitations to know

  • ! Kafka operational complexity — partition management, replication factor tuning, and consumer group monitoring require dedicated platform engineering expertise in APAC teams
  • ! Kafka is not designed for per-message routing complexity — RabbitMQ exchange routing is more flexible for APAC routing requirements
  • ! ZooKeeper dependency (pre-KRaft) added operational overhead — APAC teams should deploy KRaft-mode Kafka 3.3+ for new deployments
  • ! Managed Kafka options (Confluent Cloud, Amazon MSK) reduce operational overhead at the cost of significant licensing or usage fees for APAC production clusters
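For new deployments, KRaft mode is enabled through broker properties along these lines — an illustrative `server.properties` fragment with placeholder host names and IDs; consult the Kafka documentation for your exact version:

```properties
# KRaft mode: this node acts as both broker and controller (no ZooKeeper)
process.roles=broker,controller
node.id=1
# Controller quorum: node.id@host:controller-port for each controller
controller.quorum.voters=1@kafka1:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
controller.listener.names=CONTROLLER
```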
Context

About Apache Kafka

Apache Kafka is a distributed event streaming platform originally developed at LinkedIn. It provides APAC data engineering and platform teams with high-throughput, low-latency, fault-tolerant message streaming: events are stored in an append-only, replicated log with configurable retention, so consumers can read events from any position in the log and replay event history on demand.

Kafka's log-based storage model — where events are durably stored in topic partitions as an ordered, append-only log rather than consumed and deleted as in traditional message queues — is the architectural distinction that makes Kafka suited for event sourcing, event replay, and building multiple independent consumer systems from a single event stream. An APAC financial services event stream (payment authorisations, transaction events) written to Kafka can be consumed simultaneously by a real-time fraud detection system, an analytics aggregation pipeline, an audit log archiver, and a customer notification service — each reading independently from the same event log at its own pace.
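The fan-out pattern above — one durable log, many independent readers — can be modelled in a few lines of Python. This is a toy sketch of the semantics only, not a Kafka client; the consumer names mirror the hypothetical financial-services example:

```python
# Toy append-only log: events are never deleted on read, and each
# consumer tracks its own offset, reading at its own pace.
class Log:
    def __init__(self):
        self.events = []

    def append(self, event):
        self.events.append(event)
        return len(self.events) - 1  # offset of the new event

    def read(self, offset, max_records=10):
        return self.events[offset:offset + max_records]

log = Log()
for event in ["auth", "capture", "refund"]:
    log.append(event)

# Two independent consumers of the same log:
fraud_offset, audit_offset = 0, 0
batch = log.read(fraud_offset)       # fraud detector reads everything
fraud_offset += len(batch)
first = log.read(audit_offset, 1)    # audit archiver reads one at a time
audit_offset += len(first)
```

Because reads never mutate the log, adding a fourth or fifth consumer system later requires no change to producers or to existing consumers.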

Kafka's consumer groups — which enable multiple consumers to share the work of consuming a partitioned topic (each partition is consumed by exactly one consumer in the group at a time), with consumer group offset tracking that enables fault-tolerant at-least-once event processing — provide APAC data engineering teams with horizontally scalable event consumption. Adding more consumers to a Kafka consumer group scales throughput linearly up to the number of topic partitions.
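The scaling ceiling described above falls out of the assignment rule: each partition goes to exactly one consumer in the group, so once consumers outnumber partitions the extras sit idle. A simple round-robin sketch in Python (real Kafka offers range, round-robin, and cooperative-sticky assignors):

```python
# Sketch of partition assignment within one consumer group:
# partitions are dealt out round-robin, one owner per partition.
def assign(partitions, consumers):
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment

partitions = [0, 1, 2, 3]
# 2 consumers: each owns 2 partitions -> parallelism of 2
# 6 consumers: 4 own one partition each, 2 own none (idle)
```

This is why topic partition counts should be sized for the target consumer parallelism up front: repartitioning later is possible but disruptive.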

Kafka's Connect framework — which provides a standardised integration layer for streaming data between Kafka and external systems (MySQL databases via Debezium CDC, Elasticsearch, S3, Snowflake, MongoDB, and 200+ connectors) — enables APAC data engineering teams to build event-driven data pipelines by connecting source systems to Kafka and Kafka to destination systems without writing custom integration code.
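As an illustration, a Kafka Connect source such as the Debezium MySQL CDC connector is configured declaratively rather than in code. Property names vary across Debezium versions and hostnames/credentials here are placeholders, so treat this as an indicative fragment:

```json
{
  "name": "mysql-orders-source",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql.internal",
    "database.port": "3306",
    "database.user": "cdc_user",
    "database.password": "********",
    "topic.prefix": "orders",
    "table.include.list": "shop.orders",
    "tasks.max": "1"
  }
}
```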

Kafka Streams and ksqlDB — which enable stateful stream processing (joining, windowing, aggregating, filtering) on Kafka topic data directly within the Kafka ecosystem — provide APAC engineering teams with real-time stream processing without introducing a separate stream processing engine (Flink or Spark Streaming) for simple-to-moderate processing requirements.
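To give a feel for what a windowed aggregation computes, here is a toy Python model of a tumbling-window count per key — a sketch of the semantics only, not the Kafka Streams or ksqlDB API:

```python
from collections import defaultdict

# Toy tumbling-window aggregation: bucket events into fixed-size,
# non-overlapping time windows and count occurrences per key.
def windowed_counts(events, window_ms=60_000):
    """events: iterable of (timestamp_ms, key).
    Returns {(window_start_ms, key): count}."""
    counts = defaultdict(int)
    for ts, key in events:
        window_start = ts - (ts % window_ms)  # align to window boundary
        counts[(window_start, key)] += 1
    return dict(counts)

events = [(10, "pay"), (59_999, "pay"), (60_000, "pay"), (70_000, "auth")]
# -> {(0, "pay"): 2, (60000, "pay"): 1, (60000, "auth"): 1}
```

The real engines add what this sketch omits: fault-tolerant state stores, late-arriving event handling, and continuous emission of updated window results.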

Beyond this tool

Where this category meets practice depth.

A tool only matters in context. Browse the service pillars that operationalise it, the industries where it ships, and the Asian markets where AIMenta runs adoption programs.