ZooKeeper reliability - three versus five nodes
From the ZooKeeper FAQ:
Reliability: A single ZooKeeper server (standalone) is essentially a coordinator with no reliability (a single serving node failure brings down the ZK service). A 3 server ensemble (you need to jump to 3 and not 2 because ZK works based on simple majority voting) allows for a single server to fail and the service will still be available. So if you want reliability, go with at least 3. We typically recommend having 5 servers in "online" production serving environments. This allows you to take 1 server out of service (say, for planned maintenance) and still be able to sustain an unexpected outage of one of the remaining servers without interruption of the service.
With a 3-server ensemble, if one server is taken out of rotation and one server has an unexpected outage, then there is still one remaining server that should ensure no interruption of service. So why the need for 5 servers? Or is it more than just interruption of service that is being considered?
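Update:
Thanks to @sbridges for pointing out that it has to do with maintaining a quorum. The way ZK defines a quorum is ceil(N/2), where N is the original number of servers in the ensemble (and not just the currently available set). Strictly, a quorum is a majority, floor(N/2) + 1, which equals ceil(N/2) only when N is odd.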
Now, a Google search for "ZK quorum" finds this in the HBase book chapter on ZK:
In ZooKeeper, an even number of peers is supported, but it is normally not used because an even sized ensemble requires, proportionally, more peers to form a quorum than an odd sized ensemble requires. For example, an ensemble with 4 peers requires 3 to form a quorum, while an ensemble with 5 also requires 3 to form a quorum. Thus, an ensemble of 5 allows 2 peers to fail and still maintain quorum, and thus is more fault tolerant than the ensemble of 4, which allows only 1 down peer.
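The arithmetic in that passage can be checked with a short sketch. It assumes a quorum is a strict majority of the configured ensemble, floor(N/2) + 1, which is consistent with the 4-peer and 5-peer figures quoted above. The function names `quorum_size` and `tolerated_failures` are made up for illustration and are not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the configured ensemble (assumed rule)."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Number of peers that can be down while a quorum can still form."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare even and odd ensemble sizes side by side.
    for n in range(1, 7):
        print(f"ensemble={n} quorum={quorum_size(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

Under this assumption, going from 3 to 4 peers raises the quorum from 2 to 3 without raising the number of failures tolerated, which is why even sizes are usually avoided.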
And this paraphrasing of Wikipedia in Edward J. Yoon's blog:
Ordinarily, this is a majority of the people expected to be there, although many bodies may have a lower or higher quorum.
ZooKeeper requires that you have a quorum of servers up, where quorum is ceil(N/2). For a 3 server ensemble, that means 2 servers must be up at any time; for a 5 server ensemble, 3 servers need to be up at any time.
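To tie this back to the question, here is a rough sketch of the maintenance-plus-outage scenario. It assumes the quorum is counted against the configured ensemble size (not the servers currently up) and uses floor(N/2) + 1, which equals ceil(N/2) for the odd sizes discussed here. `is_available` is a hypothetical helper, not a ZooKeeper call.

```python
def is_available(ensemble_size: int, servers_down: int) -> bool:
    """True if the servers still up form a strict majority of the ensemble."""
    quorum = ensemble_size // 2 + 1  # assumed majority rule
    servers_up = ensemble_size - servers_down
    return servers_up >= quorum


if __name__ == "__main__":
    # One server in planned maintenance plus one unexpected outage.
    for n in (3, 5):
        print(f"ensemble={n} down=2 available={is_available(n, servers_down=2)}")
```

With 3 servers and 2 down, only 1 remains, short of the 2 needed, so the service stops. With 5 servers and 2 down, 3 remain and the quorum holds.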