public interface RaftStateStore extends Closeable
Modifier and Type | Method and Description |
---|---|
void | deleteEntriesFrom(long startIndexInclusive): Rolls back the log by deleting all entries starting with the given index. |
void | flushLogs(): Forces all buffered (in any layer) Raft log changes to be written to the storage layer and returns after those changes are written. |
void | open(): Initializes the store before starting to persist Raft state. |
void | persistEntry(LogEntry entry): Persists the given log entry. |
void | persistInitialMembers(RaftEndpoint localMember, Collection<RaftEndpoint> initialMembers): Persists the given local Raft endpoint and initial Raft group members. |
void | persistSnapshot(SnapshotEntry entry): Persists the given snapshot entry. |
void | persistTerm(int term, RaftEndpoint votedFor): Persists the term and the Raft endpoint that the local node voted for in the given term. |
void open() throws IOException
Throws: IOException
void persistInitialMembers(@Nonnull RaftEndpoint localMember, @Nonnull Collection<RaftEndpoint> initialMembers) throws IOException
Throws: IOException
void persistTerm(int term, @Nullable RaftEndpoint votedFor) throws IOException
Throws: IOException
void persistEntry(@Nonnull LogEntry entry) throws IOException
Log entries are appended to the Raft log with sequential log indices. The first log index is 1.
A block of consecutive log entries has no gaps in the indices, but a gap can appear between the snapshot entry and its preceding regular entry. This happens in an edge case where a follower has fallen so far behind that the missing entries are no longer available from the leader. In that case the leader will send its snapshot entry instead.
In a rare failure scenario, Raft must delete a range of the newest entries, which rolls back the index of the next persisted entry. For example, if Raft persists three log entries (indices 1 to 3) and then deletes entries from index 2, only the entry at index 1 remains and the next persisted entry has index 2.
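The rollback case can be sketched with a small in-memory log. This is only an illustration under stated assumptions: `InMemoryLogSketch` and its `String` payloads are hypothetical stand-ins, not part of the library, and the sketch ignores snapshots and the index gaps they can introduce.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a Raft log; not the library's implementation.
final class InMemoryLogSketch {
    // entries.get(i) holds the entry with log index i + 1, because the first log index is 1.
    private final List<String> entries = new ArrayList<>();

    // Appends an entry; the index must be exactly one past the current last index.
    void persistEntry(long index, String payload) {
        long expected = entries.size() + 1L;
        if (index != expected) {
            throw new IllegalArgumentException("expected index " + expected + " but got " + index);
        }
        entries.add(payload);
    }

    // Removes the entry at startIndexInclusive and every newer entry.
    void deleteEntriesFrom(long startIndexInclusive) {
        if (startIndexInclusive < 1 || startIndexInclusive > entries.size()) {
            throw new IllegalArgumentException("no entry at index " + startIndexInclusive);
        }
        entries.subList((int) (startIndexInclusive - 1), entries.size()).clear();
    }

    // Index of the newest entry, or 0 if the log is empty.
    long lastIndex() {
        return entries.size();
    }
}
```

After persisting indices 1, 2 and 3 and calling `deleteEntriesFrom(2)`, `lastIndex()` is 1 and the next accepted call is `persistEntry(2, ...)`.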
Throws: IOException
See Also: flushLogs(), persistSnapshot(SnapshotEntry), deleteEntriesFrom(long), RaftAlgorithmConfig
void persistSnapshot(@Nonnull SnapshotEntry entry) throws IOException
After a snapshot is persisted at index=i and flushLogs() is called, the log entry at index=i and all the preceding entries are no longer needed and can be evicted from storage. Failing to evict stale entries will not cause a consistency problem, but it will increase the time to recover after a restart. Because eviction is not required for correctness, it can be done in a background task.
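One way to picture background eviction is the sketch below. It is not the library's implementation: `BackgroundEvictionSketch`, its `String` payloads and its method names are hypothetical, and the storage is an in-memory sorted map rather than a real persistence layer.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: evicts stale log entries on a background thread.
final class BackgroundEvictionSketch implements AutoCloseable {
    // Log entries keyed by log index, kept sorted so a prefix can be dropped cheaply.
    private final ConcurrentNavigableMap<Long, String> entriesByIndex = new ConcurrentSkipListMap<>();
    private final ExecutorService evictor = Executors.newSingleThreadExecutor();

    void put(long index, String payload) {
        entriesByIndex.put(index, payload);
    }

    // Schedules removal of every entry whose index is <= snapshotIndex.
    // The caller does not have to wait for the returned future.
    CompletableFuture<Void> evictUpTo(long snapshotIndex) {
        return CompletableFuture.runAsync(
                () -> entriesByIndex.headMap(snapshotIndex, true).clear(), evictor);
    }

    int size() {
        return entriesByIndex.size();
    }

    @Override
    public void close() {
        evictor.shutdown();
    }
}
```

The caller that persists the snapshot can continue without blocking on the returned future, since stale entries only affect recovery time.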
Raft takes snapshots at a predetermined interval, controlled by commitIndexAdvanceCountToSnapshot. For instance, if it is 100, snapshots will occur at indices 100, 200, 300, and so on.
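The interval arithmetic can be written out as a tiny helper. `SnapshotIntervalSketch` is a hypothetical name, and the sketch assumes snapshots fall exactly on multiples of the configured interval, as in the example above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the snapshot interval arithmetic.
final class SnapshotIntervalSketch {
    // Returns the first `count` snapshot indices, assuming snapshots are taken
    // exactly at multiples of commitIndexAdvanceCountToSnapshot.
    static List<Long> snapshotIndices(int commitIndexAdvanceCountToSnapshot, int count) {
        List<Long> indices = new ArrayList<>();
        for (int k = 1; k <= count; k++) {
            indices.add((long) k * commitIndexAdvanceCountToSnapshot);
        }
        return indices;
    }
}
```

With an interval of 100 and a count of 3, the helper yields 100, 200 and 300.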
The snapshot index can lag behind the index of the newest log entry that was already persisted, but there is an upper bound to this difference, controlled by uncommittedEntryCountToRejectNewAppends. For instance, if uncommittedEntryCountToRejectNewAppends is 10, and a persistSnapshot() call is made with snapshotIndex=100, the index of the preceding persistEntry() call can be at most 110.
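The bound in this example is a simple sum, sketched below. `SnapshotLagSketch` and its method names are hypothetical and exist only to make the arithmetic explicit; an implementation could use such a check to size buffers or to validate its assumptions.

```java
// Hypothetical helper illustrating the bound on how far the newest
// persisted entry can be ahead of a locally taken snapshot.
final class SnapshotLagSketch {
    // Highest index the preceding persistEntry() call can have had
    // when persistSnapshot() is called with the given snapshot index.
    static long maxPrecedingEntryIndex(long snapshotIndex, int uncommittedEntryCountToRejectNewAppends) {
        return snapshotIndex + uncommittedEntryCountToRejectNewAppends;
    }

    // True if the newest persisted index respects the bound for this snapshot.
    static boolean isWithinBound(long newestPersistedIndex, long snapshotIndex,
                                 int uncommittedEntryCountToRejectNewAppends) {
        return newestPersistedIndex
                <= maxPrecedingEntryIndex(snapshotIndex, uncommittedEntryCountToRejectNewAppends);
    }
}
```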
On the other hand, the snapshot index can also be ahead of the newest log entry. This can happen when a Raft follower has fallen so far behind the leader that the leader no longer holds the missing entries. In that case the follower receives a snapshot from the leader. There is no upper bound on the gap between the newest log entry and the index of the received snapshot.
Throws: IOException
See Also: flushLogs(), persistEntry(LogEntry), RaftAlgorithmConfig
void deleteEntriesFrom(long startIndexInclusive) throws IOException
There is a bound on the number of entries that can be deleted, specified by uncommittedEntryCountToRejectNewAppends. Say that it is 5 and the highest persisted log entry index is 20. At most the 5 newest entries (indices 16 to 20) can be deleted, hence deletion can start at index=16 or higher.
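The deletion bound can be checked with the arithmetic below. `DeletionBoundSketch` and its methods are hypothetical names used for illustration; clamping the lower limit to index 1 is an assumption that follows from the first log index being 1.

```java
// Hypothetical helper illustrating the bound on deleteEntriesFrom().
final class DeletionBoundSketch {
    // Lowest start index that deletes no more than the allowed number of newest entries.
    static long lowestDeletableStartIndex(long highestPersistedIndex,
                                          int uncommittedEntryCountToRejectNewAppends) {
        // Log indices start at 1, so never go below that.
        return Math.max(1, highestPersistedIndex - uncommittedEntryCountToRejectNewAppends + 1);
    }

    // True if deleting from startIndexInclusive stays within the bound.
    static boolean isDeletionAllowed(long startIndexInclusive, long highestPersistedIndex,
                                     int uncommittedEntryCountToRejectNewAppends) {
        long lowest = lowestDeletableStartIndex(highestPersistedIndex,
                uncommittedEntryCountToRejectNewAppends);
        return startIndexInclusive >= lowest && startIndexInclusive <= highestPersistedIndex;
    }
}
```

For a highest persisted index of 20 and a limit of 5, the lowest allowed start index is 20 - 5 + 1 = 16.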
Throws: IOException
See Also: flushLogs(), persistEntry(LogEntry), RaftAlgorithmConfig
void flushLogs() throws IOException
Throws: IOException
See Also: persistEntry(LogEntry), persistSnapshot(SnapshotEntry), deleteEntriesFrom(long)
Copyright © 2019 Hazelcast, Inc. All rights reserved.