Event Sourcing

Nov 30, 2019 · 4 min read

Event sourcing is an application architecture pattern in which each state change of an entity is stored as an event, and the state of the entity is built by aggregating the sequence of events. Keeping the sequence of events answers not only what the application state is but also how it got there.

“Event sourcing is a paradigm where changes to application state are recorded as a series of events.” - Martin Fowler

For example, in a bank account application, you can deposit money to or withdraw money from an account. For each operation, you can either mutate the state of the account, for instance by changing its balance and then persisting it somewhere for later use, or record an event that represents the action performed on the account and re-apply (aggregate) all the events to get its latest state.
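The second option can be sketched as a fold over the event history. This is a minimal illustration, not a reference implementation: the event names `Deposited` and `Withdrawn`, the `apply` function, and the use of integer cents are all assumptions made for the example.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Deposited:
    amount: int  # amount in cents (assumed unit)


@dataclass(frozen=True)
class Withdrawn:
    amount: int


def apply(balance: int, event) -> int:
    """Return the new balance after applying a single event."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    raise ValueError(f"unknown event: {event!r}")


events = [Deposited(100), Withdrawn(30), Deposited(50)]

# The current state is not stored directly; it is a fold over the history.
balance = reduce(apply, events, 0)
print(balance)  # 120
```

The list of events also shows how the balance got to its current value, which a single mutated balance field would not.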

Event sourcing has various advantages, such as audit logging, temporal queries, and segregation of application logic.

Needless to say, event sourcing serves the purpose of advanced logging, since every performed operation is stored as an event.

The application state can be rebuilt anytime by applying the sequence of events.

If you want to experiment with different logic, you can simply replay all the events with the new algorithm and observe the result. You can also use temporal queries to answer questions such as what the state of the application was at a specific point in time.
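A temporal query can be as simple as replaying only the events recorded up to the moment of interest. The sketch below assumes each event carries a timestamp; the `BalanceChanged` type and the `balance_as_of` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BalanceChanged:
    occurred_at: datetime
    delta: int  # positive for a deposit, negative for a withdrawal


def balance_as_of(events, moment: datetime) -> int:
    """Replay only the events that had happened by `moment`."""
    return sum(e.delta for e in events if e.occurred_at <= moment)


history = [
    BalanceChanged(datetime(2019, 1, 5), 100),
    BalanceChanged(datetime(2019, 2, 10), -30),
    BalanceChanged(datetime(2019, 3, 1), 50),
]

print(balance_as_of(history, datetime(2019, 2, 15)))   # 70
print(balance_as_of(history, datetime(2019, 12, 31)))  # 120
```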

After fixing a bug or updating the behavior of the application, you can likewise replay the sequence of events to obtain an updated state. For example, while applying events, you can skip certain types of events so that the resulting state is different.

Event sourcing helps us answer business-critical questions like:

“How many people added an item to their cart, then removed it, then bought that item a month later?”
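With the full history available, a question like this becomes a scan over the event log. The following is a rough sketch under several assumptions: the log is a time-ordered list of hypothetical `(user, event_type, item, occurred_at)` tuples, and "a month later" is interpreted as at least 30 days after the removal.

```python
from datetime import datetime, timedelta

# Hypothetical event log, ordered by time.
log = [
    ("ann", "ItemAdded", "book", datetime(2019, 1, 1)),
    ("ann", "ItemRemoved", "book", datetime(2019, 1, 2)),
    ("bob", "ItemAdded", "lamp", datetime(2019, 1, 3)),
    ("bob", "ItemBought", "lamp", datetime(2019, 1, 4)),
    ("ann", "ItemBought", "book", datetime(2019, 2, 5)),
]


def added_removed_bought_later(log, gap=timedelta(days=30)):
    """Users who added an item, removed it, then bought it at least `gap` later."""
    added = set()    # (user, item) pairs that were added to a cart
    removed_at = {}  # (user, item) -> time of a removal that followed an add
    matches = set()
    for user, kind, item, at in log:
        key = (user, item)
        if kind == "ItemAdded":
            added.add(key)
        elif kind == "ItemRemoved" and key in added:
            removed_at[key] = at
        elif kind == "ItemBought" and key in removed_at:
            if at - removed_at[key] >= gap:
                matches.add(user)
    return matches


print(added_removed_bought_later(log))  # {'ann'}
```

A store that only kept the current cart contents could not answer this, because the add and remove steps would have been overwritten.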

So, what is an Event?

Simply, an event is a historical fact, a record that represents a change in an entity. For example,

OrderCreated,
OrderItemAdded,
OrderItemRemoved,
OrderItemAdded,
OrderPaid,
OrderShipped,
OrderCompleted,
OrderCancelled

All the events should be applied one by one to get the latest state of the order.
An event represents a change that already happened, so it is in the past tense.
Since it has already happened, it cannot be changed; it is immutable.
That is, events are either created or read, but never mutated.
The order of events matters, so an event generally has a field, usually named version, that holds its sequence number. Optimistic locking can be built on top of the version field.
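The properties above (immutability, append-only writes, and a version used for optimistic locking) can be illustrated with a small in-memory stream. This is a sketch under assumptions: `OrderStream`, `ConcurrencyError`, and the `expected_version` parameter are hypothetical names, not part of any particular library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: an event cannot be mutated once created
class Event:
    type: str
    version: int  # sequence number within the order's stream


class ConcurrencyError(Exception):
    pass


class OrderStream:
    """Hypothetical in-memory, append-only stream for a single order."""

    def __init__(self):
        self._events = []

    @property
    def version(self) -> int:
        return len(self._events)

    def append(self, event_type: str, expected_version: int) -> Event:
        # Optimistic lock: reject the write if someone else appended first.
        if expected_version != self.version:
            raise ConcurrencyError(
                f"expected version {expected_version}, stream is at {self.version}"
            )
        event = Event(event_type, self.version + 1)
        self._events.append(event)
        return event


stream = OrderStream()
stream.append("OrderCreated", expected_version=0)
stream.append("OrderItemAdded", expected_version=1)

try:
    # A writer that last read the stream at version 1 is now stale.
    stream.append("OrderPaid", expected_version=1)
except ConcurrencyError as err:
    print(err)  # expected version 1, stream is at 2
```

The stale writer is expected to reload the stream, re-check its decision against the new events, and retry.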

Scaling the reads: Snapshots of aggregation

A stream of events can be too large to process in a reasonable time, especially when an operation needs the latest state, that is, the aggregation of the events. To build that state, all the events have to be processed.

For example, to get the latest state of a credit card, all the events belonging to the card need to be fetched and processed. Performing this aggregating (folding) operation frequently becomes costlier as the stream grows. Instead, with rolling snapshots, the state itself can be stored in addition to its events.

A snapshot captures the state produced by a sequence of events at a specific point in time. Whenever the state is needed, the corresponding snapshot can be used. The snapshot can carry version information so that an optimistic locking mechanism can be implemented.

Snapshots can be generated at intervals or eagerly refreshed on every event, either synchronously or asynchronously. They can also be generated lazily: when a state is needed, it is computed and stored for later use.
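The lazy variant can be sketched as follows: start from the last snapshot, fold only the events recorded after it, and keep the result as the new snapshot. The `Snapshot` type and `load_balance` helper are hypothetical, and the event stream is simplified to a list of balance deltas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    balance: int
    version: int  # number of events already folded into `balance`


def load_balance(deltas, snapshot=None):
    """Fold only the events recorded after the snapshot, then refresh it."""
    start = snapshot or Snapshot(balance=0, version=0)
    tail = deltas[start.version:]
    return Snapshot(start.balance + sum(tail), start.version + len(tail))


deltas = [100, -30, 50]            # the credit card's event stream, simplified
snap = load_balance(deltas)        # first read folds all three events
deltas.append(-20)                 # a new event arrives
snap = load_balance(deltas, snap)  # second read folds only the new event
print(snap)  # Snapshot(balance=100, version=4)
```

The events remain the source of truth; a snapshot can be discarded and rebuilt from them at any time.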

With snapshots, read and write operations can be separated by using different storage, each optimized for either reads or writes. In particular, the read load on the medium in which events are stored can be reduced, and the same goes for the medium where snapshots are stored.

Pitfalls

First, backward compatibility. As an event-sourced application is modified to add features or fix bugs, new event types such as OrderRefunded can appear, and existing types can be deprecated. Since the event store still contains the deprecated events, the application still has to handle them.
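One way to keep handling deprecated events, often called upcasting, is to translate old event shapes into the current ones before the business logic sees them. The sketch below is illustrative only: `OrderCancelledV1`, the `reason` field, and the `upcast` and `apply` functions are hypothetical and not taken from any specific framework.

```python
def upcast(event: dict) -> dict:
    """Translate deprecated event shapes into the current one."""
    # Assumption: "OrderCancelledV1" is a hypothetical deprecated type that
    # lacked a `reason` field; the current "OrderCancelled" requires it.
    if event["type"] == "OrderCancelledV1":
        return {
            "type": "OrderCancelled",
            "order_id": event["order_id"],
            "reason": "unknown",
        }
    return event


def apply(state: dict, event: dict) -> dict:
    """Current business logic; it only knows the current event types."""
    if event["type"] == "OrderCancelled":
        return {**state, "status": "cancelled", "reason": event["reason"]}
    if event["type"] == "OrderRefunded":
        return {**state, "status": "refunded"}
    return state


stored = [
    {"type": "OrderCancelledV1", "order_id": 7},  # written by an old release
    {"type": "OrderRefunded", "order_id": 7},     # written by a new release
]

state = {}
for event in stored:
    state = apply(state, upcast(event))
print(state)  # {'status': 'refunded', 'reason': 'unknown'}
```

The stored events are never rewritten; only their in-memory representation is adapted at read time.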

The second is storage. The store that holds the events can cause problems if it is not used properly. For example, with an RDBMS it might be tempting to have a dedicated column for each piece of event data, such as orderId, timestamp, amount, and currency. That results in many null columns for event types where, for example, currency information is not available.
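One possible way around the sparse-column problem, shown here only as a sketch, is to keep a few columns common to every event and put the type-specific data into a single serialized payload column. The table layout and column names below are assumptions; the example uses Python's built-in sqlite3 module with JSON stored as text.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        stream_id TEXT    NOT NULL,
        version   INTEGER NOT NULL,
        type      TEXT    NOT NULL,
        payload   TEXT    NOT NULL,  -- JSON; its shape depends on `type`
        PRIMARY KEY (stream_id, version)
    )
    """
)

rows = [
    ("order-1", 1, "OrderCreated", {"orderId": 1}),
    ("order-1", 2, "OrderPaid", {"orderId": 1, "amount": 25, "currency": "EUR"}),
]
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [(s, v, t, json.dumps(p)) for s, v, t, p in rows],
)

# Read the stream back in order; each payload has only the fields it needs.
for type_, payload in conn.execute(
    "SELECT type, payload FROM events WHERE stream_id = ? ORDER BY version",
    ("order-1",),
):
    print(type_, json.loads(payload))
```

The trade-off is that fields inside the payload are harder to query and index than dedicated columns.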

Lastly, working with external systems can be tricky, because replaying events might repeat external calls, and the external services may not be aware of whether a call is a replay or a real one. One approach is to have gateway services that can handle replays. Alternatively, you can skip external calls during a replay, but that makes your application aware of replays and requires effort to handle them.
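The gateway approach can be sketched as a thin wrapper that knows whether a replay is in progress and suppresses the outgoing call in that case. `PaymentGateway`, its `replaying` flag, and the `send` callable are hypothetical names chosen for this illustration.

```python
class PaymentGateway:
    """Hypothetical wrapper around an external payment service."""

    def __init__(self, send):
        self._send = send       # the function that performs the real call
        self.replaying = False  # set to True while rebuilding state

    def charge(self, order_id: int, amount: int) -> None:
        if self.replaying:
            return  # the charge already happened when the event was first handled
        self._send(order_id, amount)


calls = []
gateway = PaymentGateway(send=lambda order_id, amount: calls.append((order_id, amount)))


def handle(event: dict) -> None:
    if event["type"] == "OrderPaid":
        gateway.charge(event["order_id"], event["amount"])


events = [{"type": "OrderPaid", "order_id": 7, "amount": 25}]

for e in events:          # live processing: the external call is made
    handle(e)

gateway.replaying = True  # replay: the same events are applied again
for e in events:
    handle(e)

print(calls)  # [(7, 25)]
```

Keeping the replay check inside the gateway means the event handlers themselves do not need to know whether they are running live or during a replay.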

Event-Driven vs Event Sourcing?

Event-driven is an application architecture pattern in which communication between applications or application modules is done by emitting and consuming events. However, event-driven does not require event sourcing. An application can still mutate an entity and store it when an event is consumed, without storing the event itself. In this case, the application will not be able to regenerate the state by traversing the events.


Written by Mustafa Atik