I have been reading the ZooKeeper source code and found something that looks strange in how a cluster handles writes. As we all know, a write to a ZooKeeper cluster goes through these steps:
- The leader sends a proposal request to all followers and to itself.
- When a follower receives the proposal request, it acks it.
- When the leader has received acks from a majority of the nodes, it sends a commit request.
- The followers and the leader commit it.
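
The four steps above can be sketched as a toy model in Java. This is only an illustration under simplifying assumptions, not ZooKeeper's implementation: `ProposalFlowSketch`, `Node`, `onProposal`, `onCommit` and `write` are hypothetical names, the "tx log" is an in-memory list, and the leader is treated as just one more node in the ensemble.

```java
import java.util.ArrayList;
import java.util.List;

public class ProposalFlowSketch {
    /** Hypothetical stand-in for one server in the ensemble. */
    static class Node {
        boolean up = true;                               // false simulates an unreachable node
        final List<Long> txnLog = new ArrayList<>();     // stands in for the on-disk tx log
        final List<Long> committed = new ArrayList<>();  // stands in for the in-memory state

        // Steps 1-2: record the proposal in the log first, then acknowledge it.
        boolean onProposal(long zxid) {
            if (!up) {
                return false; // no ack from a node that is down
            }
            txnLog.add(zxid);
            return true;
        }

        // Step 4: apply the transaction to the in-memory state.
        void onCommit(long zxid) {
            if (up) {
                committed.add(zxid);
            }
        }
    }

    /** Runs one write through the ensemble; returns true if it was committed. */
    static boolean write(List<Node> ensemble, long zxid) {
        int acks = 0;
        for (Node n : ensemble) {
            if (n.onProposal(zxid)) {
                acks++;
            }
        }
        // Step 3: commit only once a strict majority has acked.
        if (acks <= ensemble.size() / 2) {
            return false;
        }
        for (Node n : ensemble) {
            n.onCommit(zxid);
        }
        return true;
    }

    public static void main(String[] args) {
        Node a = new Node();
        Node b = new Node();
        Node c = new Node();
        c.up = false; // one of three nodes is down; two acks are still a majority
        boolean ok = write(List.of(a, b, c), 1L);
        System.out.println("committed=" + ok + ", log on a=" + a.txnLog);
    }
}
```

Note that in this model a proposal lands in `txnLog` before anything is committed, which is the ordering the rest of the question is about.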
The problem is with step 2. When a follower receives the proposal request, the request is synced to the zk tx log on disk (see the code below), whereas the commit request only writes to memory. But if all the nodes are restarted after the request has been synced to disk and before the ack has been sent, is the uncommitted request then treated as the newest request?

    // FollowerZooKeeperServer#logRequest: called when the follower receives
    // a proposal request; it forwards the request to syncProcessor
    public void logRequest(TxnHeader hdr, Record txn) {
        Request request = new Request(null, hdr.getClientId(), hdr.getCxid(),
                hdr.getType(), null, null);
        request.hdr = hdr;
        request.txn = txn;
        request.zxid = hdr.getZxid();
        if ((request.zxid & 0xffffffffL) != 0) {
            pendingTxns.add(request);
        }
        syncProcessor.processRequest(request);
    }

    // SyncRequestProcessor#flush: only after the tx log has been committed
    // to disk is the request passed on so that the ack is sent. Is that OK?
    private void flush(LinkedList<Request> toFlush)
        throws IOException, RequestProcessorException
    {
        if (toFlush.isEmpty())
            return;

        // flush the buffered transactions to the tx log on disk
        zks.getZKDatabase().commit();
        while (!toFlush.isEmpty()) {
            Request i = toFlush.remove();
            if (nextProcessor != null) {
                nextProcessor.processRequest(i);
            }
        }
        if (nextProcessor != null && nextProcessor instanceof Flushable) {
            ((Flushable)nextProcessor).flush();
        }
    }
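
To make the window I am asking about concrete, here is a minimal sketch of the "sync the batch, then ack" ordering seen in `flush` above. It is a toy model under my own assumptions, not ZooKeeper code: `FlushThenAckSketch`, `append`, `durable`, `acked` and the `crashAfterSync` flag are all hypothetical, and it does not model what happens during recovery after the restart.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class FlushThenAckSketch {
    final Deque<Long> toFlush = new ArrayDeque<>(); // proposals buffered in memory
    final List<Long> durable = new ArrayList<>();   // stands in for the synced tx log
    final List<Long> acked = new ArrayList<>();     // zxids for which an ack was sent

    void append(long zxid) {
        toFlush.add(zxid);
    }

    /**
     * Makes the whole batch durable in one step, then acks each entry.
     * With crashAfterSync set, the node "dies" right after the sync:
     * the batch is on disk, but no ack has gone out for it.
     */
    void flush(boolean crashAfterSync) {
        if (toFlush.isEmpty()) {
            return;
        }
        durable.addAll(toFlush); // analogous to the single commit() call
        if (crashAfterSync) {
            toFlush.clear();     // in-memory state is lost on a crash
            return;
        }
        while (!toFlush.isEmpty()) {
            acked.add(toFlush.remove()); // analogous to nextProcessor.processRequest(i)
        }
    }

    public static void main(String[] args) {
        FlushThenAckSketch node = new FlushThenAckSketch();
        node.append(1L);
        node.append(2L);
        node.flush(false); // normal path: synced, then acked
        node.append(3L);
        node.flush(true);  // crash between the sync and the ack
        System.out.println("durable=" + node.durable + ", acked=" + node.acked);
    }
}
```

After the second `flush`, zxid 3 is in `durable` but not in `acked`: it exists in the log of the restarted node even though the leader never saw an ack for it, and that is the state my question is about.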