Apache hadoop installation on ubuntu

2/13/2023

There are some prerequisites to be installed before installing Hadoop in Single-Node Cluster (Pseudo-Distributed) mode. Hadoop can be run in any of the following three modes: Standalone Mode, Single-Node Cluster (Pseudo-Distributed) and Multiple- Node Cluster (Fully Distributed) mode. Hadoop ecosystem consists of the above base modules along with some additional packages which are installed on top of the Hadoop like Apache Hive, Apache Pig, Apache HBase, Apache Spark, Apache Flume, Apache Oozie, Apache Sqoop, Apache ZooKeeper, Cloudera Impala, Apache Storm and Apache Phoenix. Hadoop MapReduce is an implementation of the MapReduce programming model for large scale data processing. Hadoop YARN is a resource-management platform responsible for managing computing resources in clusters and using them for scheduling of users’ applications. HDFS a distributed file-system that stores data on commodity machines, providing very high aggregate bandwidth across the cluster.

Hadoop Common contains the libraries and utilities required by other Hadoop modules.

The Apache Hadoop framework is composed of the following modules: Hadoop stores data in Hadoop Distributed File System (HDFS), the processing of these data is done using MapReduce. Apache Hadoop is an open source framework used for distributed storage and distributed processing of big data on clusters of computers/ commodity hardwares.

0 Comments

Author

Archives

Categories

Apache hadoop installation on ubuntu

Leave a Reply.