This Yarn tutorial will take you through all the key aspects of Apache Yarn: an introduction to Yarn, the Yarn architecture, and the Yarn daemons, Resource Manager and Node Manager. We will also discuss various Yarn features and characteristics, including Resource Manager restart and high availability.

2. Yarn Introduction
Apache Yarn (“Yet Another Resource Negotiator”) is the resource management layer of Hadoop, introduced in Hadoop 2.x. Yarn allows different data processing engines, such as graph processing, interactive processing, stream processing, and batch processing, to run and process data stored in HDFS (Hadoop Distributed File System). Apart from resource management, Yarn also performs job scheduling. Yarn extends the power of Hadoop to other evolving technologies, so they can take advantage of HDFS (one of the most reliable and popular storage systems) and of economical cluster management.
Yarn is also considered the data operating system of Hadoop 2.x. The Yarn-based architecture of Hadoop 2.x provides a general-purpose data processing platform that is not limited to MapReduce: it enables Hadoop to run purpose-built data processing systems other than MapReduce, and it allows several different frameworks to run on the same hardware where Hadoop is deployed.

3. Yarn Architecture
The Yarn framework consists of a master daemon known as the Resource Manager, a slave daemon called the Node Manager (one per slave node), and an Application Master (one per application).
i. Resource Manager (RM)
It is the master daemon of Yarn. It manages the global assignment of resources (CPU and memory) among all the applications and arbitrates cluster resources between competing applications. To learn more about the Resource Manager, follow this comprehensive guide.
The Resource Manager has two main components: the Scheduler and the Application Manager.
a. Scheduler
The Scheduler is responsible for allocating resources to the running applications. It is a pure scheduler, which means it performs no monitoring or tracking of applications and offers no guarantees about restarting failed tasks, whether they fail due to application failure or hardware failure.
b. Application Manager
It manages running Application Masters in the cluster, i.e., it is responsible for starting application masters and for monitoring and restarting them on different nodes in case of failures.
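To make this concrete, clients interact with the Resource Manager to submit new applications. Below is a minimal sketch using the public YarnClient API; the application name is an assumed example, and the AM container launch details are omitted, so this is an illustration rather than a complete client.

```java
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class SubmitApp {
  public static void main(String[] args) throws Exception {
    YarnConfiguration conf = new YarnConfiguration();

    // The YarnClient talks to the Resource Manager.
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(conf);
    yarnClient.start();

    // Ask the RM for a new application id.
    YarnClientApplication app = yarnClient.createApplication();
    ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
    ApplicationId appId = appContext.getApplicationId();
    appContext.setApplicationName("demo-app");  // illustrative name

    // A real client would also set the AM container launch context
    // (command, resources, environment) before calling:
    // yarnClient.submitApplication(appContext);

    System.out.println("Would submit application: " + appId);
    yarnClient.stop();
  }
}
```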
ii. Node Manager (NM)
It is the slave daemon of Yarn. The NM is responsible for the containers running on its node, monitoring their resource usage and reporting it to the ResourceManager. It also manages the user processes on that machine and tracks the health of the node on which it is running. The design also allows plugging long-running auxiliary services into the NM; these are application-specific services, specified as part of the configuration and loaded by the NM during startup. For MapReduce applications on YARN, shuffle is a typical auxiliary service loaded by the NMs. To learn more about the Node Manager, follow this comprehensive guide.
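For example, the MapReduce shuffle auxiliary service is normally enabled through yarn-site.xml on every Node Manager. The sketch below sets the equivalent standard properties through the Java Configuration API purely for illustration; the surrounding class is not part of Hadoop.

```java
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AuxServiceConfig {
  public static void main(String[] args) {
    YarnConfiguration conf = new YarnConfiguration();

    // Register the MapReduce shuffle handler as an NM auxiliary service
    // (in practice these properties live in yarn-site.xml on each NM).
    conf.set("yarn.nodemanager.aux-services", "mapreduce_shuffle");
    conf.set("yarn.nodemanager.aux-services.mapreduce_shuffle.class",
             "org.apache.hadoop.mapred.ShuffleHandler");

    System.out.println(conf.get("yarn.nodemanager.aux-services"));
  }
}
```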
iii. Application Master (AM)
One Application Master runs per application. It negotiates resources from the Resource Manager and works with the Node Manager. It manages the application life cycle.
The AM acquires containers from the RM’s Scheduler before contacting the corresponding NMs to start the application’s individual tasks.
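The sketch below shows the skeleton of this flow using the public AMRMClient and NMClient APIs: the AM registers with the RM, requests a container from the Scheduler, and then launches a task on the corresponding NM. The resource size, priority, and launch command are illustrative assumptions, not values from this tutorial, and error handling is omitted.

```java
import java.util.Collections;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.*;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.client.api.NMClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.util.Records;

public class SimpleAppMaster {
  public static void main(String[] args) throws Exception {
    YarnConfiguration conf = new YarnConfiguration();

    // Client used by the AM to talk to the RM's Scheduler.
    AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
    rmClient.init(conf);
    rmClient.start();

    // Client used by the AM to talk to Node Managers.
    NMClient nmClient = NMClient.createNMClient();
    nmClient.init(conf);
    nmClient.start();

    // Register this AM with the Resource Manager.
    rmClient.registerApplicationMaster("", 0, "");

    // Ask the Scheduler for one container (1 GB, 1 vcore -- illustrative values).
    Resource capability = Resource.newInstance(1024, 1);
    Priority priority = Priority.newInstance(0);
    rmClient.addContainerRequest(new ContainerRequest(capability, null, null, priority));

    // Poll the RM until a container is allocated, then launch a task on its NM.
    boolean launched = false;
    while (!launched) {
      AllocateResponse response = rmClient.allocate(0.1f);
      for (Container container : response.getAllocatedContainers()) {
        ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);
        ctx.setCommands(Collections.singletonList("echo hello-from-yarn"));
        nmClient.startContainer(container, ctx);
        launched = true;
      }
      Thread.sleep(1000);
    }

    // Tell the RM the application finished successfully.
    rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
  }
}
```

In a real deployment this code runs inside the AM container that the RM itself launches, so it inherits the application attempt credentials from its environment.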
4. Resource Manager Restart
The Resource Manager is the central authority that manages resources and schedules applications running on YARN. Hence, it is potentially a single point of failure (SPOF) in an Apache YARN cluster.
There are two types of restart for Resource Manager:
Non-work-preserving RM restart: In this mode the RM persists application/attempt state in a pluggable state-store. On restart, the Resource Manager reloads this information from the state-store and restarts the previously running applications. Users are not required to re-submit their applications.
While the RM is down, Node Managers and clients keep polling it until it comes back up. When the RM comes up, it sends a re-sync command via heartbeats to all the NMs and AMs it was talking to. The NMs then kill all their managed containers and re-register with the RM.
Work-preserving RM restart: This mode focuses on reconstructing the running state of the RM on restart by combining the container statuses from Node Managers with the container requests from Application Masters. The key difference from non-work-preserving RM restart is that already running applications are not stopped after the master restarts, so applications do not lose their processed work because of an RM/master outage.
The RM recovers its running state by taking advantage of the container statuses sent from all the Node Managers. An NM does not kill its containers when it re-syncs with the restarted RM; it continues managing them and sends their statuses to the RM when it re-registers.
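Both restart modes are driven by the recovery properties in yarn-site.xml. The sketch below sets the commonly used ones through the Java Configuration API for illustration; the choice of the ZooKeeper-based state-store and the ZooKeeper address are assumptions about a typical setup, not requirements.

```java
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RmRestartConfig {
  public static void main(String[] args) {
    YarnConfiguration conf = new YarnConfiguration();

    // Enable RM recovery and pick a pluggable state-store
    // (here the ZooKeeper-based store; a FileSystem-based store also exists).
    conf.setBoolean("yarn.resourcemanager.recovery.enabled", true);
    conf.set("yarn.resourcemanager.store.class",
        "org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore");
    conf.set("yarn.resourcemanager.zk-address", "zk1:2181,zk2:2181,zk3:2181");

    // Work-preserving restart: keep running containers alive across an RM restart.
    conf.setBoolean("yarn.resourcemanager.work-preserving-recovery.enabled", true);

    System.out.println("recovery enabled = "
        + conf.getBoolean("yarn.resourcemanager.recovery.enabled", false));
  }
}
```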
5. Resource Manager High Availability
The ResourceManager (master) is responsible for handling the resources in a cluster and for scheduling multiple applications (e.g., Spark or MapReduce jobs). Before Hadoop v2.4, the master (RM) was a SPOF (single point of failure). The High Availability feature adds redundancy in the form of an Active/Standby ResourceManager pair to remove this otherwise single point of failure.
ResourceManager HA is realized through an Active/Standby architecture: at any point in time, one of the masters is Active, and the other Resource Managers are in Standby mode, waiting to take over if anything happens to the Active one. The trigger to transition to Active comes either from the admin (through the CLI) or from the integrated failover controller when automatic failover is enabled.
i. Manual transitions and failover
When automatic failover is not configured, admins have to manually transition one of the Resource Managers to the Active state. To fail over from the Active master to another, they are expected to transition the Active master to Standby and then transition a Standby RM to Active. This can be done using the yarn rmadmin client.
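For example, assuming the two Resource Managers are configured with the ids rm1 and rm2 (the ids here are illustrative), an admin could run yarn rmadmin -transitionToStandby rm1 followed by yarn rmadmin -transitionToActive rm2, and then verify the result with yarn rmadmin -getServiceState rm2.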
ii. Automatic failover
In this case there is no need for any manual intervention. The masters have an option to embed the ZooKeeper-based ActiveStandbyElector (ZooKeeper is a coordination engine) to decide which Resource Manager should be the Active one. When the Active fails, another Resource Manager is automatically selected to become Active. Note that there is no need to run a separate ZKFC daemon, because the ActiveStandbyElector embedded in the Resource Managers acts as both a failure detector and a leader elector.
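The sketch below shows the typical yarn-site.xml properties behind such an Active/Standby pair, again expressed through the Java Configuration API for illustration; the RM ids, hostnames, cluster id, and ZooKeeper quorum are assumed example values.

```java
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RmHaConfig {
  public static void main(String[] args) {
    YarnConfiguration conf = new YarnConfiguration();

    // Enable ResourceManager High Availability with two RMs (ids are examples).
    conf.setBoolean("yarn.resourcemanager.ha.enabled", true);
    conf.set("yarn.resourcemanager.cluster-id", "demo-cluster");
    conf.set("yarn.resourcemanager.ha.rm-ids", "rm1,rm2");
    conf.set("yarn.resourcemanager.hostname.rm1", "master1.example.com");
    conf.set("yarn.resourcemanager.hostname.rm2", "master2.example.com");

    // ZooKeeper quorum used by the embedded ActiveStandbyElector for leader election.
    conf.set("yarn.resourcemanager.zk-address", "zk1:2181,zk2:2181,zk3:2181");

    // Automatic failover via the embedded elector (no separate ZKFC daemon).
    conf.setBoolean("yarn.resourcemanager.ha.automatic-failover.enabled", true);
    conf.setBoolean("yarn.resourcemanager.ha.automatic-failover.embedded", true);

    System.out.println("HA rm-ids = " + conf.get("yarn.resourcemanager.ha.rm-ids"));
  }
}
```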
6. Web Application Proxy
The Web Application Proxy is part of Yarn. By default it runs as part of the RM, but it can be configured to run in standalone mode. The purpose of the proxy is to reduce the possibility of web-based attacks through Yarn.
In Yarn, the AM has the responsibility to provide a web UI and to send that link to the RM.
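When the proxy is run as a standalone daemon, its bind address is set with the yarn.web-proxy.address property. A minimal sketch follows, with the host and port being assumed example values.

```java
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class WebProxyConfig {
  public static void main(String[] args) {
    YarnConfiguration conf = new YarnConfiguration();

    // Run the Web Application Proxy as a standalone daemon on this address;
    // if the property is unset (or matches the RM web address), the proxy runs inside the RM.
    conf.set("yarn.web-proxy.address", "proxyhost.example.com:9046");

    System.out.println("proxy address = " + conf.get("yarn.web-proxy.address"));
  }
}
```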