Package io.datareplication.producer.feed
The feed producer provides the following guarantees for the generated feed (as long as there are no bugs in either the producer or the repository implementations): (TODO rewrite/expand)
- consistent feed: monotonically increasing timestamps, matching headers
- fully atomic feed updates: no intermediate states are visible
- entities published on a single producer instance will maintain their relative order
- crash resilience: crashes will never corrupt or lose entities or already-consumable feed pages
Usage
TODO
Repository Implementations
The feed producer needs both read and write access to its repositories. When running a distributed producer across multiple processes or machines, the repositories are how the instances share their state.
The feed producer relies on certain guarantees from the repository implementations. It's important to uphold these guarantees when implementing the repositories. Bugs in the repository implementations can break the producer's consistency guarantees and lead to inconsistent feeds or other data corruption.
Consistency Requirements
Repositories must make sure that when the CompletionStage returned by a repository method succeeds, all data passed to the method has been successfully saved. Conversely, if saving fails, the CompletionStage must also fail.
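The success and failure contract above can be illustrated with a minimal in-memory sketch. The class `InMemoryExampleRepository` and its `save` and `get` methods are hypothetical stand-ins invented for this illustration. They are not the library's actual repository interfaces, whose method signatures are not shown on this page.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a repository; NOT one of the library's interfaces.
final class InMemoryExampleRepository {
    private final Map<String, String> records = new ConcurrentHashMap<>();

    // The returned stage succeeds only after the record has been stored and
    // fails if storing it throws.
    CompletionStage<Void> save(String id, String body) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        try {
            // ConcurrentHashMap rejects null keys and values with a
            // NullPointerException, which serves as the "save failed" case here.
            records.put(id, body);
            result.complete(null);
        } catch (RuntimeException e) {
            result.completeExceptionally(e);
        }
        return result;
    }

    // Reads see every write whose stage has already succeeded.
    CompletionStage<String> get(String id) {
        return CompletableFuture.completedFuture(records.get(id));
    }
}
```

A real implementation would typically complete the stage from the database driver's own callback or future instead of synchronously. The important point is that the stage is never completed successfully before the write is durable.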
When the CompletionStage for a write operation has succeeded, all data written by the write operation must be visible to all future read operations across all producer instances. Most non-distributed databases work this way by default, but an eventually consistent data store is not sufficient on its own and may need extra handling, e.g. waiting for the write to be acknowledged across all instances.
In general, write operations on repositories are required to be atomic at the level of individual records. This means that any time a method takes a list of multiple records to save or update, until the write operation has finished and its CompletionStage has succeeded:
- read operations may observe both old and already-updated records
- the order in which updates are performed is undefined, i.e. repository implementations may choose to update records in whichever order is convenient for them
- if saving a record fails, the CompletionStage associated with the operation must also fail, but already updated records don't have to be rolled back
- however, each individual record must either be updated or not: observing partially-updated records must not be possible
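The record-level atomicity rules listed above can be sketched as follows. `RecordAtomicRepository`, `PageRecord`, and `saveAll` are hypothetical names chosen for this example and do not correspond to the library's actual types. The sketch assumes each record is an immutable value that is swapped in with a single map operation.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical repository illustrating per-record atomicity; NOT a library type.
final class RecordAtomicRepository {
    // Immutable record value (hypothetical shape). Because it is never mutated
    // in place, readers see either the old or the new record, never a mix.
    static final class PageRecord {
        final String id;
        final int entityCount;

        PageRecord(String id, int entityCount) {
            this.id = id;
            this.entityCount = entityCount;
        }
    }

    private final Map<String, PageRecord> store = new ConcurrentHashMap<>();

    // Saves records one at a time. If one fails, the stage fails, but records
    // that were already written are left in place (no rollback).
    CompletionStage<Void> saveAll(List<PageRecord> records) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        try {
            for (PageRecord record : records) {
                // A null element throws here and stands in for a failed write.
                store.put(record.id, record);
            }
            result.complete(null);
        } catch (RuntimeException e) {
            result.completeExceptionally(e);
        }
        return result;
    }

    Optional<PageRecord> get(String id) {
        return Optional.ofNullable(store.get(id));
    }
}
```

In a database-backed implementation the same effect is usually achieved with single-row writes or one statement per record. A transaction spanning the whole list is allowed but, per the rules above, not required.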
Timestamp Precision
The timestamp type used by the library (Instant) has nanosecond precision. Most operating system clocks have less precision than that, so collected timestamps won't use the full value range. However, repositories should still store timestamps at their full nanosecond precision.
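As a sketch of why storage precision matters, the following compares a lossless mapping of `Instant` onto two numeric columns with a lossy round trip through epoch milliseconds. The class `TimestampStorageExample` and the two-column layout are assumptions made for illustration. The library does not prescribe a particular schema on this page.

```java
import java.time.Instant;

// Hypothetical helper showing lossless vs. lossy timestamp storage.
final class TimestampStorageExample {
    // Lossless: keep epoch seconds and nanosecond-of-second as two numbers.
    static long[] toColumns(Instant timestamp) {
        return new long[] { timestamp.getEpochSecond(), timestamp.getNano() };
    }

    static Instant fromColumns(long[] columns) {
        return Instant.ofEpochSecond(columns[0], columns[1]);
    }

    // Lossy: a millisecond column drops everything below one millisecond.
    static Instant viaEpochMillis(Instant timestamp) {
        return Instant.ofEpochMilli(timestamp.toEpochMilli());
    }
}
```

For an `Instant` such as `Instant.ofEpochSecond(1700000000L, 123456789L)`, the two-column round trip returns an equal value, while the millisecond round trip returns a different `Instant` whose nanosecond part is 123000000.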
Interface Summary
- FeedEntityRepository: Repository to store feed entities and what page they are assigned to.
- FeedPageMetadataRepository
- FeedPageProvider
- FeedPageUrlBuilder
- FeedProducer
- FeedProducerJournalRepository: Repository to store rollback information for the feed producer.
Class Summary
- FeedEntityRepository.PageAssignment: A subset of this repository's fields that are sufficient for operations that don't need the entity body.
- FeedPageMetadataRepository.PageMetadata
- FeedPageProvider.Builder
- FeedPageProvider.HeaderAndContentType
- FeedProducer.Builder
- FeedProducerJournalRepository.JournalState: The rollback information stored by the feed producer to allow clean rollbacks.