# Hadoop

______

## Ecosystem

- The Hadoop ecosystem includes HDFS, Hive, Pig, YARN, MapReduce, Spark, HBase, Oozie, Sqoop, ZooKeeper, and other projects.

## HDFS

- The Hadoop Distributed File System (HDFS) is one of the largest Apache projects and the primary storage system of Hadoop.
- It employs a NameNode and DataNode architecture: the NameNode keeps the file system metadata, and the DataNodes store the actual data blocks.
- It is a distributed file system that runs on a cluster of commodity hardware and can store very large files.

## YARN

- YARN stands for Yet Another Resource Negotiator.
- It is one of the core components of open source Apache Hadoop and handles resource management.
- It is responsible for managing workloads, monitoring, and implementing security controls.
- It also allocates system resources to the various applications running in a Hadoop cluster and decides which tasks each cluster node should execute.

**YARN has two main components:**

- ResourceManager
- NodeManager

## Pig

- Pig is a high-level scripting platform used to run queries over large datasets stored in Hadoop. Pig's simple SQL-like scripting language is known as Pig Latin, and its main objective is to perform the required operations and arrange the final output in the desired format.
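
To make the Pig description more concrete, here is a minimal Python sketch of the kind of dataflow a Pig Latin script typically expresses: load records, filter them, group them, and arrange the output. This is not Pig itself and does not use any Hadoop API. The sample records, field names, and function names (`load`, `filter_by`, `group_by`, `summarize`, `run_pipeline`) are all hypothetical, and the comments only name the Pig Latin operator each step loosely corresponds to.

```python
from collections import defaultdict

# Hypothetical sample records: (name, department, salary).
RECORDS = [
    ("alice", "eng", 120),
    ("bob", "eng", 95),
    ("carol", "ops", 70),
    ("dave", "ops", 110),
    ("erin", "sales", 60),
]


def load(rows):
    """Roughly LOAD ... AS (...): give each tuple named fields."""
    return [{"name": n, "dept": d, "salary": s} for n, d, s in rows]


def filter_by(rows, predicate):
    """Roughly FILTER ... BY: keep only rows matching the predicate."""
    return [r for r in rows if predicate(r)]


def group_by(rows, key):
    """Roughly GROUP ... BY: collect rows under each distinct key value."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[key]].append(r)
    return dict(groups)


def summarize(groups):
    """Roughly FOREACH ... GENERATE: emit one (key, count, total) row per group."""
    return sorted(
        (key, len(rows), sum(r["salary"] for r in rows))
        for key, rows in groups.items()
    )


def run_pipeline(rows, min_salary):
    """Chain the steps: load, filter, group, then format the result."""
    loaded = load(rows)
    kept = filter_by(loaded, lambda r: r["salary"] >= min_salary)
    return summarize(group_by(kept, "dept"))


if __name__ == "__main__":
    # Print tab-separated output, one line per department.
    for dept, count, total in run_pipeline(RECORDS, 90):
        print(f"{dept}\t{count}\t{total}")
```

In real Pig the same steps would be written as Pig Latin statements and executed as jobs on the cluster. The sketch only illustrates the step-by-step shape of such a script on in-memory data.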