ZooKeeper reliability - three versus five nodes


From the ZooKeeper FAQ:

Reliability: A single ZooKeeper server (standalone) is essentially a coordinator with no reliability (a single serving node failure brings down the ZK service). A 3 server ensemble (you need to jump to 3 and not 2 because ZK works based on simple majority voting) allows for a single server to fail and the service will still be available. So if you want reliability go with at least 3. We typically recommend having 5 servers in "online" production serving environments. This allows you to take 1 server out of service (say planned maintenance) and still be able to sustain an unexpected outage of one of the remaining servers w/o interruption of the service.

With a 3-server ensemble, if 1 server is taken out of rotation and 1 server has an unexpected outage, there is still 1 remaining server, which should ensure no interruption of service. So why the need for 5 servers? Or is something more than interruption of service being considered?

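Update:

Thanks to @sbridges for pointing out that it has to do with maintaining a quorum. ZK defines a quorum as a majority of N, where N is the original number of servers in the ensemble (and not just the currently available set). For odd N that is ceil(N/2).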

Now, a Google search for "ZK quorum" finds this in the HBase book chapter on ZK:

In ZooKeeper, an even number of peers is supported, but it is normally not used because an even sized ensemble requires, proportionally, more peers to form a quorum than an odd sized ensemble requires. For example, an ensemble with 4 peers requires 3 to form a quorum, while an ensemble with 5 also requires 3 to form a quorum. Thus, an ensemble of 5 allows 2 peers to fail and still maintain quorum, and thus is more fault tolerant than the ensemble of 4, which allows only 1 down peer.
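The numbers in that passage can be reproduced with a few lines of Python. This is only a sketch of the majority arithmetic, not ZooKeeper code; the function names quorum_size and tolerated_failures are made up for illustration.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the configured ensemble (hypothetical helper)."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers may be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare odd and even ensemble sizes side by side.
    for n in range(1, 8):
        print(f"{n} servers: quorum {quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} down")
```

Going from 3 to 4 servers, or from 5 to 6, raises the quorum size without raising the number of failures tolerated, which is why odd sizes are the usual choice.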

And this paraphrasing of Wikipedia in Edward J. Yoon's blog:

Ordinarily, this is a majority of the people expected to be there, although many bodies may have a lower or higher quorum.

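ZooKeeper requires that a quorum of servers be up, where a quorum is a strict majority of the ensemble: floor(N/2) + 1, which equals ceil(N/2) when N is odd. For a 3-server ensemble, that means 2 servers must be up at any time; for a 5-server ensemble, 3 servers need to be up at any time. So with 3 servers, one planned removal plus one unexpected outage leaves only 1 of 3 up, which is below quorum, and the service stops; with 5 servers the same two losses leave 3 of 5, which is still a quorum.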
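To check the scenario from the question directly, here is a small Python sketch, assuming quorum is a strict majority of the configured ensemble size. The helper names has_quorum and survives are hypothetical and not part of any ZooKeeper API.

```python
def has_quorum(ensemble_size: int, servers_up: int) -> bool:
    """True if the servers still up form a strict majority of the
    configured ensemble (not of the currently available set)."""
    return servers_up >= ensemble_size // 2 + 1


def survives(ensemble_size: int, planned_down: int, unplanned_down: int) -> bool:
    """Does the service stay available after both kinds of outage?"""
    servers_up = ensemble_size - planned_down - unplanned_down
    return has_quorum(ensemble_size, servers_up)


if __name__ == "__main__":
    # One server in maintenance plus one unexpected failure.
    print("3 servers:", survives(3, planned_down=1, unplanned_down=1))
    print("5 servers:", survives(5, planned_down=1, unplanned_down=1))
```

The single remaining server in the 3-server case is still running, but it cannot form a majority on its own, so the ensemble as a whole is unavailable.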

