
Installing Hadoop on Ubuntu 26.04 – Step-by-Step Guide


Apache Hadoop is an open-source framework used for distributed storage and processing of large datasets. In this article, we will learn how to install and configure Hadoop on Ubuntu 26.04 in a clear, safe, and beginner-friendly manner.

This guide is suitable for:

  • Students and beginners

  • Academic and lab use

  • Learning Hadoop single-node setup


Prerequisites

Before starting, make sure:

  • Ubuntu 26.04 is installed

  • You have sudo privileges

  • Internet connection is available


Step 1: Install Java

Hadoop runs on Java, so Java installation is mandatory.

sudo apt-get update

This command updates the system package index so Ubuntu can fetch the latest software versions.

sudo apt-get install default-jdk

This installs the default Java Development Kit (JDK), which Hadoop requires to run.

Check Java Installation

java --version

If the Java version appears, Java is installed successfully.
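
The output should look similar to the following (the exact version and build details on your system will differ):

openjdk 21.x.x
OpenJDK Runtime Environment (build ...)
OpenJDK 64-Bit Server VM (build ..., mixed mode, sharing)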


Step 2: Create Hadoop Group and User

Creating a separate user and group helps manage Hadoop permissions securely.

sudo addgroup hadoop

Creates a Linux group named hadoop.

sudo adduser --ingroup hadoop lovegb

Creates a user named lovegb and adds it to the Hadoop group.

groups lovegb

Displays all groups associated with the user to confirm correct assignment.


Step 3: Install SSH

Hadoop uses SSH for communication between nodes, even in a single-node setup.

sudo apt-get install ssh

Installs SSH client and server.

Verify SSH Installation

which ssh
which sshd

Confirms that SSH binaries are installed.


Step 4: Configure Passwordless SSH

Switch to Hadoop user:

su - lovegb

The - flag starts a login shell, so the user's own environment is loaded.

Generate SSH keys:

ssh-keygen

This creates a public and private SSH key pair. Press Enter at each prompt to accept the defaults and leave the passphrase empty, so SSH logins will not prompt for a password.

Add the public key to authorized keys:

cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
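
SSH is strict about file permissions. If the connection test below still asks for a password, tightening permissions on the key files (assuming the default locations) usually resolves it:

chmod 700 $HOME/.ssh
chmod 600 $HOME/.ssh/authorized_keys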

Test SSH connection:

ssh localhost

If login works without asking for a password, SSH is correctly configured.


Step 5: Download and Install Hadoop

Download Hadoop:

wget https://dlcdn.apache.org/hadoop/common/hadoop-3.4.2/hadoop-3.4.2.tar.gz

Extract the Hadoop archive:

tar xvzf hadoop-3.4.2.tar.gz

Create Hadoop installation directory:

sudo mkdir -p /usr/local/hadoop

Move into extracted folder:

cd hadoop-3.4.2

Move Hadoop files to installation directory:

sudo mv * /usr/local/hadoop

Set correct ownership:

sudo chown -R lovegb:hadoop /usr/local/hadoop

Step 6: Configure Java Path

Check available Java versions:

update-alternatives --config java

Edit the .bashrc file of the Hadoop user (sudo is not needed; the file belongs to the current user):

nano ~/.bashrc

Add the following Hadoop environment variables at the end of the file, adjusting JAVA_HOME if your JDK lives elsewhere:

#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-21-openjdk-amd64
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END

Apply changes:

source ~/.bashrc
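
To confirm the variables took effect in the current shell, you can check (assuming the paths above match your system):

echo $JAVA_HOME
hadoop version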

Step 7: Configure Hadoop Environment File

sudo nano /usr/local/hadoop/etc/hadoop/hadoop-env.sh

Find the JAVA_HOME line in this file and set it to your actual JDK path (the same value used in .bashrc):

export JAVA_HOME=/usr/lib/jvm/java-21-openjdk-amd64
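
If you are unsure of the correct path, a common way to discover it (assuming java is already on your PATH) is:

readlink -f /usr/bin/java | sed "s:/bin/java::"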

Step 8: Create Hadoop Temporary Directory

sudo mkdir -p /app/hadoop/tmp
sudo chown lovegb:hadoop /app/hadoop/tmp

Step 9: Configure core-site.xml

sudo nano /usr/local/hadoop/etc/hadoop/core-site.xml

Defines the temporary directory and default file system for Hadoop.
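
A minimal single-node example (the directory matches Step 8; port 9000 for fs.defaultFS is a common convention, not a requirement):

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>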


Step 10: Configure mapred-site.xml

sudo nano /usr/local/hadoop/etc/hadoop/mapred-site.xml

Configures how MapReduce jobs run. In Hadoop 3 the old JobTracker no longer exists; MapReduce runs on YARN instead.
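
For Hadoop 3, a minimal configuration simply tells MapReduce to use YARN:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>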


Step 11: Configure HDFS Storage Directories

sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
sudo chown -R lovegb:hadoop /usr/local/hadoop_store

Step 12: Configure hdfs-site.xml

sudo nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml

This file defines the following (a sample configuration is shown after the list):

  • Replication factor

  • Block size

  • NameNode directory

  • DataNode directory
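
A minimal hdfs-site.xml for a single node might look like this (replication is 1 because there is only one DataNode; the block size shown is the Hadoop 3 default and may be omitted; the directories match Step 11):

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value> <!-- 128 MB, the default -->
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
  </property>
</configuration>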


Step 13: Format NameNode

hdfs namenode -format

This initializes the Hadoop Distributed File System (the older hadoop namenode -format form still works but is deprecated).
Run this command only once; formatting again erases all data stored in HDFS.


Step 14: Start Hadoop Services

cd /usr/local/hadoop/sbin
start-dfs.sh
start-yarn.sh

Verify running services:

jps
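
If everything started correctly, jps lists the Hadoop daemons, similar to the following (process IDs will differ):

12001 NameNode
12145 DataNode
12330 SecondaryNameNode
12502 ResourceManager
12678 NodeManager
12850 Jps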

Step 15: Access Hadoop Web Interface

Open in browser:

http://localhost:9870

This page shows Hadoop cluster status.
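
The YARN ResourceManager provides its own web interface as well, by default at:

http://localhost:8088

It lists running and completed applications on the cluster.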


Step 16: Stop Hadoop Services

stop-yarn.sh
stop-dfs.sh

Step 17: HDFS Trash Configuration

hadoop fs -put example.desktop /

Uploads a sample file into HDFS so that deletion and trash behavior can be tested. Trash settings control how long deleted files are kept before permanent removal.
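
Trash is enabled through the fs.trash.interval property in core-site.xml. The value is in minutes; 1440 (one day) is shown here purely as an example:

<property>
  <name>fs.trash.interval</name>
  <value>1440</value>
</property>

With trash enabled, deleting the uploaded file moves it to the user's .Trash directory in HDFS instead of removing it immediately:

hadoop fs -rm /example.desktop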


Step 18: Set HDFS Permissions

hadoop fs -chmod -R 755 /user

Ensures proper read, write, and execute permissions.


Step 19: Permission Configuration

Two properties control how HDFS handles permissions (a sample of each is shown after the list):

  • dfs.permissions.enabled → Enables HDFS permission checks

  • hadoop.http.staticuser.user → Sets default web UI user
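
These properties belong in hdfs-site.xml and core-site.xml respectively. An illustrative snippet (the values shown are assumptions; adjust them to your setup):

<!-- hdfs-site.xml -->
<property>
  <name>dfs.permissions.enabled</name>
  <value>true</value>
</property>

<!-- core-site.xml -->
<property>
  <name>hadoop.http.staticuser.user</name>
  <value>lovegb</value>
</property>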


Conclusion

You have successfully learned how to install and configure Hadoop on Ubuntu 26.04 using a clean, safe, and structured approach. This setup is ideal for learning, testing, and academic purposes.


Frequently Asked Questions (FAQ)

Q1. Is this setup suitable for beginners?
Yes, this is a single-node Hadoop setup designed for learning.

Q2. Can this be used for production?
No, production requires a multi-node cluster configuration.



