java - Mitigating Hadoop's Achilles tendons
I gave this Hadoop tutorial a read, and it states that Hadoop has an Achilles' tendon (a single point of failure) in the JobTracker:
The JobTracker is a single point of failure for the Hadoop MapReduce service, which means that if the JobTracker goes down, all running jobs are halted.
And this article states that the NameNode is also a SPOF:
The single point of failure in a Hadoop cluster is the NameNode.
Single points of failure are bad, mkay? What strategies/techniques/tooling/etc. can be used to circumvent these SPOFs and make Hadoop redundant, fault-tolerant, and resilient (buzzword alert!)?
HDFS and MapReduce are the core components of Hadoop. In earlier Apache Hadoop releases, the NameNode and the JobTracker were each a SPOF (only one instance of each could be configured). This problem is fixed in Hadoop 2.x.
JobTracker HA
JobTracker HA can be achieved by configuring two JobTracker (JT) instances in active-standby mode on two nodes. If one JT goes down, the second JobTracker is available to serve requests. Only one JobTracker (the active) serves requests at a time; the second JT (the standby) runs in read mode. JobTracker HA requires a ZooKeeper instance, and failover (switching) can be configured as either manual or automatic. Automatic failover requires a process called the Failover Controller (FC).
In the current release, if the active JT fails, all running jobs are halted, and new jobs are automatically submitted to the new JT. Carrying the halted jobs over to the new JT is not available in the current release.
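The active-standby behaviour described above can be sketched as a small toy model. This is not Hadoop code: the class names (FailoverSketch, Master, FailoverController) and the health flag are hypothetical, and the ZooKeeper-based election that a real failover controller relies on is not modelled. It only illustrates the ordering that matters: fence the failed active first, then promote a healthy standby, so that two actives never coexist.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of active-standby failover; hypothetical names, not a Hadoop API. */
class FailoverSketch {

    enum State { ACTIVE, STANDBY, FENCED }

    /** A master daemon (for example a JobTracker) with a role and a health flag. */
    static final class Master {
        final String name;
        State state;
        boolean healthy = true;

        Master(String name, State state) {
            this.name = name;
            this.state = state;
        }
    }

    /** Hypothetical failover controller that keeps at most one master ACTIVE. */
    static final class FailoverController {
        private final List<Master> masters = new ArrayList<>();

        void register(Master m) {
            masters.add(m);
        }

        /** Returns the current active master, or null if there is none. */
        Master active() {
            for (Master m : masters) {
                if (m.state == State.ACTIVE) {
                    return m;
                }
            }
            return null;
        }

        /**
         * One monitoring pass. If the active master is unhealthy, fence it and
         * promote the first healthy standby. Returns true if a failover happened.
         */
        boolean checkAndFailover() {
            Master current = active();
            if (current != null && current.healthy) {
                return false; // active is fine, nothing to do
            }
            if (current != null) {
                current.state = State.FENCED; // fence before promoting anyone
            }
            for (Master m : masters) {
                if (m.state == State.STANDBY && m.healthy) {
                    m.state = State.ACTIVE;
                    return true;
                }
            }
            return false; // no healthy standby available
        }
    }

    public static void main(String[] args) {
        Master jt1 = new Master("jt1", State.ACTIVE);
        Master jt2 = new Master("jt2", State.STANDBY);
        FailoverController fc = new FailoverController();
        fc.register(jt1);
        fc.register(jt2);

        jt1.healthy = false; // simulate a crash of the active JT
        fc.checkAndFailover();
        System.out.println("active after failover: " + fc.active().name);
    }
}
```

With manual failover, an operator would trigger the equivalent of checkAndFailover by hand; with automatic failover, a controller process runs it continuously.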
MR2
The second generation of MapReduce uses YARN. The ResourceManager (RM) is the master service in YARN, and the RM can be configured in active-standby mode. An RM failure does not impact running jobs/applications.
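NameNode HA
NameNode HA is important. NameNode HA can be configured in active-standby mode (a maximum of two NameNode instances in Hadoop 2.x). Quorum-based journaling is the accepted method for sharing the edit log between the two NameNodes; ZooKeeper is used for automatic failover rather than by the journal itself. Only one NameNode is active at a time.
The Secondary NameNode (SNN) is not a Standby NameNode (SN), and vice versa. The SNN has a different function, and it belongs to a non-HA configuration. A NameNode HA setup doesn't require an SNN, because the Standby NameNode performs the checkpointing (the function of the SNN).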
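As a rough illustration of what such a setup involves, the sketch below collects the main hdfs-site.xml properties for a quorum-journal HA pair into a plain Java map. The property names are the standard HDFS HA keys; the nameservice ID, the NameNode IDs nn1/nn2, the host names, and the helper class HdfsHaConfigSketch are assumptions made up for the example. In a real cluster these values go into hdfs-site.xml, and automatic failover additionally needs ha.zookeeper.quorum in core-site.xml and a fencing method in dfs.ha.fencing.methods.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that lists the core HDFS HA properties; not part of Hadoop. */
class HdfsHaConfigSketch {

    /**
     * Builds the HA properties for one nameservice with two NameNodes.
     * journalHosts is a semicolon-separated list of JournalNode host:port pairs.
     */
    static Map<String, String> haProperties(String ns, String nn1Host,
                                            String nn2Host, String journalHosts) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("dfs.nameservices", ns);
        // Logical IDs of the two NameNodes (assumed names).
        p.put("dfs.ha.namenodes." + ns, "nn1,nn2");
        p.put("dfs.namenode.rpc-address." + ns + ".nn1", nn1Host + ":8020");
        p.put("dfs.namenode.rpc-address." + ns + ".nn2", nn2Host + ":8020");
        // Shared edit log kept on the JournalNode quorum.
        p.put("dfs.namenode.shared.edits.dir", "qjournal://" + journalHosts + "/" + ns);
        // Lets clients find whichever NameNode is currently active.
        p.put("dfs.client.failover.proxy.provider." + ns,
              "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        p.put("dfs.ha.automatic-failover.enabled", "true");
        return p;
    }

    public static void main(String[] args) {
        Map<String, String> props = haProperties(
                "mycluster", "nn1.example.com", "nn2.example.com",
                "jn1.example.com:8485;jn2.example.com:8485;jn3.example.com:8485");
        for (Map.Entry<String, String> e : props.entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```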
Processes in NameNode HA
- Active NameNode
- Standby NameNode
- Failover Controller: performs fencing to avoid a split-brain scenario.
- JournalNodes (a minimum of 3 instances is required): namespace modifications are logged to the JournalNodes, and the Standby NameNode reads them from there. Only one NameNode is allowed to write at a time, in order to avoid the split-brain issue.
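To see why an odd number of JournalNodes and a single writer go together, here is a simplified sketch of a quorum journal with epoch-based fencing. It is a toy model under stated assumptions, not the actual Hadoop implementation: the classes QuorumJournalSketch, JournalNode, and QuorumWriter are hypothetical, and log recovery and segment handling are left out. A write counts only if a majority of JournalNodes accept it, and a JournalNode accepts writes only from the writer that most recently claimed it with a higher epoch number.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of quorum journaling with epoch fencing; hypothetical names, not Hadoop code. */
class QuorumJournalSketch {

    /** One JournalNode: remembers the latest writer epoch it promised to. */
    static final class JournalNode {
        private long promisedEpoch = 0;
        private final List<String> edits = new ArrayList<>();
        boolean up = true;

        /** A new writer claims this node; only a strictly higher epoch wins. */
        boolean newEpoch(long epoch) {
            if (!up || epoch <= promisedEpoch) {
                return false;
            }
            promisedEpoch = epoch;
            return true;
        }

        /** Accepts an edit only from the writer holding the promised epoch. */
        boolean journal(long epoch, String edit) {
            if (!up || epoch != promisedEpoch) {
                return false;
            }
            edits.add(edit);
            return true;
        }

        List<String> edits() {
            return edits;
        }
    }

    /** Writer side, playing the role of the active NameNode in this model. */
    static final class QuorumWriter {
        private final List<JournalNode> nodes;
        private final long epoch;

        QuorumWriter(List<JournalNode> nodes, long epoch) {
            this.nodes = nodes;
            this.epoch = epoch;
        }

        private boolean isMajority(int acks) {
            return acks > nodes.size() / 2;
        }

        /** Claims the journal; succeeds only if a majority of nodes agree. */
        boolean becomeWriter() {
            int acks = 0;
            for (JournalNode n : nodes) {
                if (n.newEpoch(epoch)) {
                    acks++;
                }
            }
            return isMajority(acks);
        }

        /** Logs one edit; succeeds only if a majority of nodes accept it. */
        boolean write(String edit) {
            int acks = 0;
            for (JournalNode n : nodes) {
                if (n.journal(epoch, edit)) {
                    acks++;
                }
            }
            return isMajority(acks);
        }
    }

    public static void main(String[] args) {
        List<JournalNode> jns = Arrays.asList(
                new JournalNode(), new JournalNode(), new JournalNode());

        QuorumWriter oldActive = new QuorumWriter(jns, 1);
        oldActive.becomeWriter();
        System.out.println("old active write: " + oldActive.write("mkdir /a"));

        // The standby takes over with a higher epoch.
        QuorumWriter newActive = new QuorumWriter(jns, 2);
        newActive.becomeWriter();

        // The old active is now fenced out by the JournalNodes themselves.
        System.out.println("old active write after takeover: " + oldActive.write("mkdir /b"));
        System.out.println("new active write: " + newActive.write("mkdir /c"));
    }
}
```

With three JournalNodes, the model tolerates one node being down, since two acknowledgements still form a majority. A stale writer cannot reach a majority once a newer writer has claimed the journal, which is what prevents two NameNodes from both writing.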