Senior Full-stack Software Engineer, Amsterdam, Netherlands
METHODOLOGY FOR BUILDING STATEFUL MICROSERVICES USING EVENT LOGS INSTEAD OF CENTRALIZED DATABASE MANAGEMENT SYSTEMS
ABSTRACT
The study examines a log-centric approach to state management in microservice systems in which Apache Kafka and Kappa-style streaming provide a single backbone for the ingestion, storage, and processing of events. The research focuses on environments with hundreds of services deployed on cloud platforms such as AWS and implemented in Erlang, where centralised relational databases become a bottleneck for scalability, evolution, and observability. The article describes a methodology in which stateful microservices reconstruct their local state from immutable event logs and rely on projections instead of a shared database. The analysis synthesises recent work on Kappa architectures, event sourcing, CQRS, and stateful stream processing, and formulates a reference architecture together with guidelines for migrating away from database-centric designs. Particular attention is paid to the design of topics, schemas, and state stores, as well as to replay, recovery, and audit requirements. The conclusions offer practical recommendations for practitioners designing real-time microservice platforms: architects, developers, and engineering managers responsible for high-load distributed systems.
Keywords: stateful microservices, event sourcing, Apache Kafka, Kappa architecture, CQRS, stream processing, Erlang, AWS, event log, distributed systems.
Introduction
Real-time data processing now defines architectural requirements for large-scale digital services in e-commerce, finance, telecommunications, and IoT. Microservice systems built on event streaming platforms are used to satisfy latency and availability constraints; yet, many deployments still rely on centralised relational or NoSQL databases as the primary store for business state. This combination produces friction between independent service evolution and globally shared schemas. It increases fragility when hundreds of services compete for common database clusters and must maintain complex links between transactional and analytical workloads.
During the last decade, research on event streaming, event sourcing, CQRS, and log-centric analytics has shown that a distributed log such as Kafka can serve as ingestion mechanism, durable storage, and driver for continuous computation, while serving stores act mainly as projections. Work on event sourcing and CQRS indicates that business state can be reconstructed deterministically from event histories and that microservices maintain dedicated read models instead of a single global database. Experience with stateful stream processing highlights trade-offs between local state, changelog topics, and replay-based recovery under high throughput.
The article aims to formulate a methodology for building stateful microservices that use event logs as the primary source of truth and rely on centralised databases only for auxiliary tasks or not at all. The study generalises architectural patterns from recent work on Kappa-style pipelines, event sourcing, and CQRS for Kafka-based platforms; constructs a reference architecture that aligns event logs, local state stores, and projections; and proposes migration guidelines that help teams transition from database-centric microservices to log-centric designs while maintaining traceability and analytical capabilities. The methodology targets platforms where Kafka, Kappa architecture, and microservices form a unified ecosystem and where hundreds of Erlang-based services run on AWS under strict reliability, replay, and observability requirements.
Materials and Methods
The methodological basis rests on a focused review of ten recent sources that address event streaming architectures, event sourcing and CQRS, Kappa-style designs, and stateful stream processing, combining peer-reviewed studies with engineering reports. A. Barradas, A. Tejeda-Gil, and R.-M. Cantón-Croda present a real-time big data pipeline for cryptocurrency and social media streams with Kafka as streaming backbone [1]. S. R. Gundla analyses an event-sourcing solution for retail inventory that uses Kafka for events and BigQuery for analytics [2]. T. Hachad, A. Sadiq, and F. Ghanimi design a real-time architecture for student attention detection and compare Lambda and Kappa on a Flink-based stack [3]. S. Kesarpu and H. P. Dasari describe a Kafka-based event-sourcing architecture for real-time risk analytics where an immutable event log serves as the primary record of business activity and read models are derived through projections, which confirms the feasibility of treating Kafka topics as a system of record in high-throughput financial environments [4]. M. Kindson and P. Martinek examine distributed message handling in a CQRS architecture [5]. J. M. R. Mendes documents aggregate modelling, event store design, and replay mechanics in an event-sourcing application [6]. A. Ninan summarises the benefits of Kappa architecture in the modern data stack [7]. J. B. N. Penka, S. Mahmoudi, and O. Debauche describe an optimised Kappa architecture for IoT data in smart farming [8]. S. Purella examines the migration of financial transaction platforms from monolithic applications with centralised databases to microservices backed by an event-driven backbone, compares Kafka with RabbitMQ and managed cloud messaging services, and highlights event sourcing as a mechanism for audit trails and regulatory compliance in high-volume systems [9]. G. P. Rusum and K. K. Pappula analyse event-driven patterns for reactive systems, construct an experimental setup on AWS with Kafka as event broker and Apache Flink as a stateful stream processor, and evaluate latency, throughput, and failure recovery while discussing the use of event sourcing and event-carried state transfer in microservice-based architectures [10].
The research uses a qualitative analytical strategy that combines comparative analysis, synthesis, and architectural modelling. Each source was examined with respect to event logs, state management, and layering. Recurring patterns in Kappa-style pipelines, event-sourcing implementations, and CQRS-based microservices were consolidated into a reference architecture for Kafka-based stateful microservices in AWS environments. On this basis, the study outlines methodological steps for transitioning from centralised database usage to log-centric state management, covering domain modelling, topic and schema design, local state stores, and replay and recovery procedures.
Results
The synthesis of the reviewed sources clarifies differences between database-centric microservices and systems where an event log holds the canonical record of business activity. Kafka-based architectures with Kappa-style processing show that a single append-only log supports real-time computation and batch-like recomputation through controlled replay, avoiding duplicated logic in separate batch and speed layers [1; 3; 4; 7]. In such deployments, Kafka ingests events from heterogeneous producers, feeds stream processors, and retains data long enough to serve as a historical record, while serving stores materialise views for specific query patterns in cryptocurrency analytics, education, and smart farming [1; 3; 8].
On this basis, the study proposes a reference architecture for stateful microservices that treats the event log as the core of the system and confines databases to projections. Figure 1 presents this structure in a form adapted from the Kappa architecture for IoT data management in smart farming [8]. Producers, including microservices and external systems, publish domain events to Kafka topics. Stream processors implemented in Erlang or other languages consume these topics, apply business rules, and update local state stores with service-specific aggregates and indexes. Each stateful microservice maintains one or more read models in key–value, document, or columnar stores, populated only by domain events or internal change streams; cross-service joins and a shared relational schema are avoided, and collaboration proceeds through event exchange.
Figure 1. Log-centric Kappa-style architecture for stateful microservices with Kafka as a system of record (compiled by the author based on his own research)
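The interaction between the log and a service-local read model can be made concrete with a minimal, self-contained Python sketch. The event shapes and the OrderView projection are hypothetical stand-ins for records that would, in the architecture of Figure 1, arrive from Kafka topic partitions rather than an in-memory list.

```python
# Hypothetical domain events as plain dicts; in a real deployment these
# would be consumed from Kafka topic partitions, not an in-memory list.
events = [
    {"key": "order-1", "type": "OrderCreated", "amount": 40},
    {"key": "order-2", "type": "OrderCreated", "amount": 25},
    {"key": "order-1", "type": "OrderPaid"},
]

class OrderView:
    """A service-local read model (projection) rebuilt purely from events."""
    def __init__(self):
        self.orders = {}

    def apply(self, event):
        if event["type"] == "OrderCreated":
            self.orders[event["key"]] = {"amount": event["amount"], "paid": False}
        elif event["type"] == "OrderPaid":
            self.orders[event["key"]]["paid"] = True

view = OrderView()
for e in events:          # consuming the log in order yields the current state
    view.apply(e)

print(view.orders["order-1"])  # {'amount': 40, 'paid': True}
```

Because the projection is derived entirely from the event stream, it can be discarded and rebuilt at any time, which is the property the reference architecture relies on.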
This arrangement follows event-sourcing guidance where every state change of an aggregate is recorded as an immutable event, and the current state is obtained by replaying the history [2; 6]. Gundla’s retail inventory system synchronises distributed stock by emitting events for each sale, return, and adjustment into Kafka and maintaining analytical projections in BigQuery [2]. Mendes’ thesis confirms the feasibility of state reconstruction from events and stresses the importance of idempotent handlers and careful aggregate design [6]. In the proposed methodology, services define domain-aligned aggregates, store them as event streams in Kafka topics, and utilise snapshots or materialised views only for isolation and performance purposes.
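The replay discipline described above can be sketched as a pure fold over an event history. The Account aggregate and the event-id field are illustrative assumptions: the cited sources recommend idempotent handlers but prescribe no concrete event format.

```python
class Account:
    """Aggregate whose current state is a pure fold over its event history.
    Hypothetical event ids make reapplication idempotent under replay."""
    def __init__(self):
        self.balance = 0
        self.seen = set()   # ids of already-applied events

    def apply(self, event):
        if event["id"] in self.seen:   # replay-safe: duplicates are no-ops
            return
        self.seen.add(event["id"])
        if event["type"] == "Deposited":
            self.balance += event["amount"]
        elif event["type"] == "Withdrawn":
            self.balance -= event["amount"]

history = [
    {"id": "e1", "type": "Deposited", "amount": 100},
    {"id": "e2", "type": "Withdrawn", "amount": 30},
    {"id": "e2", "type": "Withdrawn", "amount": 30},  # duplicate delivery
]

acc = Account()
for e in history:
    acc.apply(e)
print(acc.balance)  # 70
```

The duplicate delivery in the history is absorbed without corrupting state, which is what makes at-least-once consumption from the log safe.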
A second group of results concerns stateful stream processing and its impact on microservice design. Hachad et al. and Penka et al. apply windowing and aggregation to video frames and sensor readings, which requires attention to state size, retention, and recovery [3; 8]. For stateful microservices, this leads to a design where each local store acts as a bounded cache reconstructed from changelog topics, partitions align with Kafka keys, and recovery after failure follows a simple pattern: an instance resumes from committed offsets, restores state from the changelog, and continues processing without coordination with a central database.
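The recovery pattern can be simulated end to end in a few lines. The changelog records, offsets, and delta events below are hypothetical; the point is the two-step sequence of restoring state from the changelog and then resuming input consumption after the committed offset.

```python
# Simulated changelog topic: (key, value) appended on every state update.
changelog = [("s1", 1), ("s2", 5), ("s1", 3)]
committed_offset = 2   # the instance crashed after committing input offset 2

input_topic = [
    {"key": "s1", "delta": 1},   # offset 0
    {"key": "s2", "delta": 5},   # offset 1
    {"key": "s1", "delta": 2},   # offset 2 -> already reflected in changelog
    {"key": "s2", "delta": 4},   # offset 3 -> not yet processed
]

# Step 1: rebuild the local store from the changelog (last write per key wins).
state = {}
for key, value in changelog:
    state[key] = value

# Step 2: resume input consumption strictly after the committed offset.
for record in input_topic[committed_offset + 1:]:
    state[record["key"]] = state.get(record["key"], 0) + record["delta"]

print(state)  # {'s1': 3, 's2': 9}
```

No central database is consulted at any point: the changelog and the committed offset together fully determine the restored state.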
The comparison of Lambda and Kappa architectures strengthens the case for a single streaming pipeline in microservice ecosystems. Kesarpu and Ninan argue that Lambda’s dual-path structure multiplies codebases and complicates change management, since business rules must be updated consistently across batch and speed layers [4; 7]. Kappa, in contrast, applies new logic by deploying updated stream processors and replaying part of the log. Barradas et al. and Hachad et al. show that replay supports incremental and full recomputation of features and models in cryptocurrency and education scenarios [1; 3]. For stateful microservices, this provides a mechanism for recalculating projections or correcting historical errors by reconsuming selected topic segments while preserving the authoritative event history.
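The Kappa upgrade path, deploying new logic and replaying the retained log, can be sketched with two hypothetical projector versions over the same event history; no separate batch stack is needed to recompute the richer view.

```python
log = [
    {"type": "PriceObserved", "symbol": "BTC", "price": 100},
    {"type": "PriceObserved", "symbol": "BTC", "price": 110},
    {"type": "PriceObserved", "symbol": "BTC", "price": 90},
]

def projector_v1(events):
    """Original logic: latest observed price only."""
    view = {}
    for e in events:
        view[e["symbol"]] = e["price"]
    return view

def projector_v2(events):
    """Updated logic: latest price plus running average. After deployment,
    the retained log is replayed to rebuild the projection from scratch."""
    acc = {}
    for e in events:
        s = acc.setdefault(e["symbol"], {"latest": None, "sum": 0, "n": 0})
        s["latest"] = e["price"]
        s["sum"] += e["price"]
        s["n"] += 1
    return {k: {"latest": v["latest"], "avg": v["sum"] / v["n"]}
            for k, v in acc.items()}

print(projector_v2(log)["BTC"])  # {'latest': 90, 'avg': 100.0}
```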
CQRS and distributed message handling supply a third cluster of results. Kindson and Martinek decompose microservice communication into commands, events, and queries and propose handling strategies that preserve consistency without large distributed transactions [5]. Within the present methodology, commands affect aggregates and emit events that are appended to Kafka; these events feed projections, which in turn populate read models only. Write-side invariants rely on event streams and local state, while read-side latency and scalability are addressed through specialised projections.
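The separation between the write side and the read side can be sketched as follows. The command type, stock invariant, and event names are hypothetical; the structural point is that the write side validates invariants and appends events, while the read side consumes events and never commands.

```python
event_log = []      # append-only write side (stands in for a Kafka topic)
read_model = {}     # query-optimised projection, rebuilt from events only

def handle_command(cmd, stock):
    """Write side: validate an invariant against aggregate state, emit an event."""
    if cmd["type"] == "ReserveItem":
        if stock.get(cmd["sku"], 0) < cmd["qty"]:
            raise ValueError("insufficient stock")
        event_log.append({"type": "ItemReserved",
                          "sku": cmd["sku"], "qty": cmd["qty"]})

def project(event):
    """Read side: consumes events only, never commands."""
    if event["type"] == "ItemReserved":
        read_model[event["sku"]] = read_model.get(event["sku"], 0) + event["qty"]

stock = {"sku-1": 10}
handle_command({"type": "ReserveItem", "sku": "sku-1", "qty": 3}, stock)
for e in event_log:
    project(e)
print(read_model)  # {'sku-1': 3}
```

Because the read model is fed only by the log, additional projections for new APIs can be added later by replaying the same events.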
On this basis, the article outlines methodological steps for migration to log-centric state management. Domain modelling and event storming identify aggregates, their lifecycles, and events that record state transitions [2; 6]. Topic design binds each aggregate type and integration boundary to Kafka topics with naming and partitioning strategies consistent with Kappa deployments in IoT and analytics [1; 3; 8]. Service boundaries follow aggregates and business processes rather than database tables. Local state stores appear where stream processing or session management is required. Projections for REST APIs, reporting, and monitoring are backed by serving stores that consume domain events or internal change messages.
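The partitioning step above hinges on one property: all events of a single aggregate must hash to the same partition so their relative order is preserved. A minimal sketch of key-based routing follows; note that Kafka's default partitioner uses murmur2, whereas this illustration uses MD5 purely as a stable stdlib hash.

```python
import hashlib

NUM_PARTITIONS = 6  # illustrative partition count for an aggregate topic

def partition_for(key: str) -> int:
    """Stable hash of the aggregate key: every event of one aggregate lands
    on one partition, which preserves per-aggregate event ordering."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS

# The same key always maps to the same partition, across restarts and hosts.
assert partition_for("order-42") == partition_for("order-42")
print(partition_for("order-42"), partition_for("order-43"))
```

Choosing the aggregate identifier as the record key is therefore not an optimisation but a correctness requirement for event-sourced topics.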
Cross-cutting concerns rely on findings from streaming trends. Rusum highlights centralised governance, schema management, and security in Kafka- and Flink-based platforms [10]. The methodology, therefore, prescribes schema registries, compatibility rules, and topic-level access control as baseline infrastructure. In AWS environments, these principles translate into automated definitions for Kafka clusters, schema registries, and stream processors, together with observability pipelines that track lag, state restore, and replay durations.
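The compatibility rules mentioned above can be illustrated with a deliberately simplified checker. Real registries apply much richer resolution rules (for example Avro's), so the two conditions below, no new required fields and no type changes, are an assumption made for the sketch, not the full rule set.

```python
def backward_compatible(old_schema, new_schema):
    """Backward compatibility here means: a consumer on new_schema can still
    read events written with old_schema. Simplified rules: a new field must
    be optional, and an existing field may not change its type."""
    for name, spec in new_schema.items():
        if name not in old_schema:
            if spec.get("required", False):
                return False   # a new required field breaks old events
        elif spec["type"] != old_schema[name]["type"]:
            return False       # a type change breaks old events
    return True

v1 = {"order_id": {"type": "string", "required": True}}
v2 = {"order_id": {"type": "string", "required": True},
      "channel":  {"type": "string", "required": False}}  # optional addition
v3 = {"order_id": {"type": "int", "required": True}}      # type change

print(backward_compatible(v1, v2), backward_compatible(v1, v3))  # True False
```

Enforcing such a check at publish time is what lets producers and consumers evolve independently while old events in long-retention topics remain readable.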
Discussion
The comparison carried out in the study indicates that replacing a centralised relational database with an append-only event log alters how state is managed in microservice systems that rely on Kafka and Kappa-style data flows. Architectures for education, agriculture, and financial analytics show that a log-centric backbone ingests heterogeneous real-time sources and enables replay-based refinement of derived views [1; 3; 8]. These findings align with the methodology in which per-service state stores act as disposable caches reconstructed from the log, and cross-service consistency follows from the deterministic consumption of a shared, ordered history, rather than coordinated transactions against a common database.
For microservice platforms built around Kafka, the Kappa model with a single streaming pipeline reduces operational and cognitive overhead compared to dual-path Lambda designs, as batch-style recomputation is handled by replaying over existing topics rather than through a separate stack of jobs and storage systems. Schema evolution, backfills, and bug fixes are applied by reprocessing a segment of the log. Studies of Kappa architectures for cryptocurrencies and smart farming report that such systems sustain competitive throughput while avoiding the maintenance of two distinct data flows [1; 8]. At the same time, these systems reveal configuration-sensitive bottlenecks around retention, partitioning, and memory allocation in Kafka, which need to be considered when extending the proposed methodology to large fleets of stateful microservices.
From a state management perspective, the main distinctions between the evaluated approaches concern the scalability of writes, reproducibility of service state, and coupling between services. Evidence from Kappa deployments in attention detection, IoT agriculture, and cryptocurrency analytics shows that treating the event log as a primary, immutable source simplifies horizontal scaling and allows for the independent evolution of services that materialise their own views [1; 3; 8]. Database-centric microservices, in contrast, tend to drift toward shared schemas, cross-service joins, and ad hoc data access patterns. Table 1 summarises these differences along the criteria observed in practice.
Table 1.
Comparison of database-centric and log-centric state management for microservices [1–9]
| Criterion | Database-centric microservices (shared DB) | Log-centric, event-sourced microservices (Kafka + Kappa-style pipeline) |
|---|---|---|
| Scalability of write workloads | Vertical scaling of a few database clusters; cross-service coordination for schema and transaction management | Horizontal scaling through topic partitioning and independent consumer groups; backpressure governed by the stream engine |
| Reproducibility of service state | Point-in-time recovery limited to backups and change data capture; replay of partial histories is complex | State reconstructed by replaying the log into local stores; deterministic rebuilds enable forensic analysis and audits |
| Coupling between microservices | Tendency toward shared schemas and cross-service joins; coupling via direct SQL access | Services communicate exclusively through events; coupling is limited to schema contracts and topic semantics |
| Temporal analytics and backfills | Historical recomputations require specialised warehouses or ETL processes | Batch-like recomputations implemented as replay over existing topics or compacted logs |
| Failure handling and recovery | Transaction logs internal to the DB; distributed failures require complex coordination and compensating transactions | Consumers resume from committed offsets; sagas and compensating actions are expressed as sequences of events |
| Governance and auditability | Audit trails scattered across tables and services; limited ability to reconstruct a complete decision history | Canonical history in the event log; fine-grained replay of decisions and state transitions |
Research on stateful stream processing highlights the constraints introduced by local state, particularly under long-lived sessions, windowed aggregations, and exactly-once semantics [3; 9; 10]. When the state outgrows memory or fast local storage, restoration from changelog topics lengthens recovery and stresses Kafka clusters. Case studies from Kappa deployments in education and smart farming associate misconfigured retention policies, offset commits, or memory limits with latency spikes and throughput degradation [3; 8]. The methodology addresses these risks through bounded state per microservice, explicit compaction and retention policies for topic categories, and separation of transactional, analytical, and archival streams.
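The compaction policy referred to above can be simulated directly: for changelog-style topics, only the most recent record per key matters for state restoration. The sketch below omits tombstone handling, which real Kafka compaction also performs.

```python
def compact(log):
    """Keep only the most recent record per key, as log compaction does for
    changelog-style topics (simplified: no tombstone/delete handling)."""
    latest = {}
    for offset, record in enumerate(log):
        latest[record["key"]] = (offset, record)   # later offsets win
    return [rec for _, rec in sorted(latest.values())]

log = [
    {"key": "user-1", "value": "a"},
    {"key": "user-2", "value": "b"},
    {"key": "user-1", "value": "c"},   # supersedes the first record
]
print(compact(log))
# [{'key': 'user-2', 'value': 'b'}, {'key': 'user-1', 'value': 'c'}]
```

Compaction is what keeps changelog restore times bounded even as the total event history grows, which directly addresses the recovery-latency risk discussed above.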
The reviewed sources converge on the view that log-centric design does not remove serving stores but places them at the periphery of the architecture as projections tailored to specific query patterns, as illustrated by Figure 1 [1; 3; 7; 8]. In cryptocurrency analytics and IoT scenarios, Kafka is combined with Druid, HBase, or document stores to serve dashboards and APIs, while Kafka itself retains responsibility for ingestion, ordering, and durable storage [1; 8]. The methodology extends this pattern to stateful microservices: each bounded context owns one or more projections populated from domain events, and the centralised database is replaced by a constellation of purpose-built views that can be rebuilt or discarded as needed.
CQRS-oriented and message-handling literature notes that such a design requires disciplined modelling of commands, events, and read models, but eases system evolution under changing requirements [5]. Independent read models support experimentation with new APIs and analytical features, while write models protect transactional correctness and business invariants. Event-sourcing case studies confirm that treating the log as authoritative history allows OLTP-style microservices and streaming analytics to coexist without competing for a shared database [2; 6].
Given the focus on real-time processing, Kappa-oriented sources evaluate trade-offs against Lambda. Both academic and industrial analyses point out that Lambda supports batch-heavy workloads and complex historical queries but it enforces duplicated logic between batch and streaming paths [1; 4; 7; 10]. Kappa processes historical and live data through a single streaming engine, typically by replaying the log into updated code. The methodology interprets traditional batch jobs as bounded topic replays, which fits environments where hundreds of microservices share a Kafka cluster and benefit from unified tooling.
A further theme concerns governance, observability, and operational boundaries of log-centric systems. When the log replaces the centralised database, schema governance, lineage, access control, and retention policies shift to the streaming layer rather than the SQL layer. Descriptions of production-grade Kappa pipelines emphasise schema registries, explicit versioning, and conventions for naming and partitioning topics [1; 3; 7; 10]. In the methodology, these findings are presented as recommendations to adopt strongly typed schemas with compatible evolution, to separate internal and external topics, and to utilise infrastructure-as-code to maintain alignment between topic topology, stream processors, and projections across AWS environments. Table 2 connects concerns in stateful microservices with log-based design patterns and tooling.
Table 2.
Concerns in stateful microservices and corresponding log-based design patterns [1–3; 5–10]
| Concern in stateful microservices | Log-based pattern/mechanism | Representative tooling and configurations |
|---|---|---|
| Exactly-once state updates and replay safety | Idempotent producers, transactional writes, changelog-backed local stores | Kafka idempotent producers and transactions; Kafka Streams / Flink state with changelog topics |
| Long-lived workflows and sagas | Process managers coordinating sequences of events; compensating events instead of rollbacks | Dedicated saga microservices consuming and emitting events; outbox pattern for inter-service coordination |
| Analytical queries over operational data | Projection services that build read models from domain events | Materialised views in Druid, HBase, or document stores populated from Kafka topics |
| Schema evolution and compatibility | Versioned event schemas with a schema registry and compatibility rules | Kafka Schema Registry; Avro/Protobuf with backward- or forward-compatible evolution rules |
| Multi-tenant and domain isolation | Topic namespaces per bounded context and tenant; isolated projections | Separate topic prefixes and consumer groups; per-tenant projections stored in dedicated logical databases or tables |
| Operational observability and debugging | Correlation identifiers, structured event metadata, and replay-based incident analysis | Trace and correlation IDs embedded in events; replaying subsets of topics into diagnostic stream processors |
| Regulatory compliance and audit trails | Immutable event history as the canonical audit log | Long-retention Kafka topics with compaction for non-sensitive fields; encrypted archives for PII |
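Of the patterns in Table 2, the saga mechanism benefits most from a worked sketch. The step names and the process-manager shape below are hypothetical; the essential point is that failure is handled by emitting compensating events for completed steps in reverse order, never by rolling back the log.

```python
def run_saga(steps, execute):
    """Process-manager sketch: run steps in order; on failure, emit
    compensating events for the completed steps in reverse order."""
    emitted = []     # events appended to the log, in order
    completed = []   # steps whose effects may need compensation
    for step in steps:
        ok = execute(step)
        emitted.append({"type": f"{step}Completed" if ok else f"{step}Failed"})
        if not ok:
            for done in reversed(completed):
                emitted.append({"type": f"{done}Compensated"})
            return emitted
        completed.append(step)
    return emitted

# Payment fails, so the earlier stock reservation is undone by a new event
# rather than by deleting anything from the history.
result = run_saga(["ReserveStock", "ChargePayment"],
                  execute=lambda step: step != "ChargePayment")
print([e["type"] for e in result])
# ['ReserveStockCompleted', 'ChargePaymentFailed', 'ReserveStockCompensated']
```

The full history, including the failure and its compensation, remains in the log, which is precisely what makes the audit-trail property of the last table row attainable.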
In summary, the discussion of the surveyed work and the synthesised methodology supports the central thesis of the article: for stateful microservices that already rely on Kafka and streaming technologies, adopting a log-as-database approach within a Kappa-style architecture yields a coherent and operationally manageable foundation for both transactional and analytical workloads, provided that design trade-offs around state size, retention, replay, and governance are addressed explicitly.
Conclusion
The analysis presented in the article shows that a methodology based on Kappa architecture, Kafka-centred event logs, and event-sourced microservice design offers a practical alternative to centralised database management for stateful systems under real-time constraints. Across education, IoT agriculture, cryptocurrency analytics, and enterprise scenarios, log-centric architectures supply a consistent mechanism for ingesting, storing, and processing streams while enabling deterministic reconstruction of state and flexible construction of projections for diverse query patterns.
Within the proposed methodology, each microservice treats Kafka topics as authoritative history and shifts persistent state into local stores backed by changelog streams. Consistency boundaries follow aggregates and sagas, and read models evolve independently using CQRS and projection services. At the same time, evidence from recent deployments stresses careful engineering of state boundaries, retention and compaction policies, and observability. Recommendations on topic design, schema governance, saga orchestration, and replay form a blueprint for organisations that intend to migrate from centralised databases to event logs with controlled operational risk.
References:
- Barradas, A., Tejeda-Gil, A., & Cantón-Croda, R.-M. (2022). Real-time big data architecture for processing cryptocurrency and social media data: A clustering approach based on k-means. Algorithms, 15(5), 140. https://doi.org/10.3390/a15050140
- Gundla, S. R. (2024). Event sourcing for retail inventory: Kafka + BigQuery real-time analytics. International Journal of Data Science and Machine Learning, 1(1), 11–36. https://www.academicpublishers.org/journals/index.php/ijdsml/article/download/6119/7031/13855
- Hachad, T., Sadiq, A., & Ghanimi, F. (2020). A new big data architecture for real-time student attention detection and analysis. International Journal of Advanced Computer Science and Applications, 11. https://doi.org/10.14569/IJACSA.2020.0110831
- Kesarpu, S., & Dasari, H. P. (2025). Kafka event sourcing for real-time risk analysis. International Journal of Computational and Experimental Science and Engineering, 11(3), 6012–6018. https://doi.org/10.22399/ijcesen.3715
- Kindson, M., & Martinek, P. (2023). A simplified approach to distributed message handling in a CQRS architecture. Acta Polytechnica Hungarica, 20, 141–160. https://doi.org/10.12700/APH.20.4.2023.4.8
- Mendes, J. M. R. (2022). Implementation of an event sourcing application (Master’s thesis, University of Coimbra). https://estudogeral.uc.pt/bitstream/10316/102129/1/RelatorioFinal.pdf
- Ninan, A. (2023). Advantages of kappa architecture in the modern data stack. SQLServerCentral. https://www.sqlservercentral.com/articles/advantages-of-kappa-architecture-in-the-modern-data-stack
- Nkamla Penka, J. B., Mahmoudi, S., & Debauche, O. (2022). An optimized kappa architecture for IoT data management in smart farming. Journal of Ubiquitous Systems and Pervasive Networks, 17(2), 59–65. https://doi.org/10.5383/JUSPN.17.02.002
- Purella, S. (2025). Microservices and event-driven architectures in high-volume financial systems. Sarcouncil Journal of Engineering and Computer Sciences, 4(7), 851–860. https://doi.org/10.5281/zenodo.16150784
- Rusum, G. P., & Pappula, K. K. (2022). Event-driven architecture patterns for real-time, reactive systems. International Journal of Emerging Research in Engineering and Technology, 3(3), 108–116. https://doi.org/10.63282/3050-922X.IJERET-V3I3P111