Apache Hadoop is an open-source framework used for distributed storage and processing of large datasets. In this article, we will learn how to install and configure Hadoop on Ubuntu 26.04 in a clear, safe, and beginner-friendly manner.
This guide is suitable for:
Students and beginners
Academic and lab use
Learning Hadoop single-node setup
Prerequisites
Before starting, make sure:
Ubuntu 26.04 is installed
You have sudo privileges
Internet connection is available
Step 1: Install Java
Hadoop runs on Java, so Java installation is mandatory.
sudo apt-get update
This command updates the system package index so Ubuntu can fetch the latest software versions.
sudo apt-get install default-jdk
This installs the default Java Development Kit (JDK), which Hadoop requires to run.
Check Java Installation
java --version
If the Java version appears, Java is installed successfully.
Step 2: Create Hadoop Group and User
Creating a separate user and group helps manage Hadoop permissions securely.
sudo addgroup hadoop
Creates a Linux group named hadoop.
sudo adduser --ingroup hadoop lovegb
Creates a user named lovegb and adds it to the Hadoop group.
groups lovegb
Displays all groups associated with the user to confirm correct assignment.
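The output should resemble the following, confirming that hadoop is the user's group:
lovegb : hadoop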
Step 3: Install SSH
Hadoop uses SSH for communication between nodes, even in a single-node setup.
sudo apt-get install ssh
Installs SSH client and server.
Verify SSH Installation
which ssh
which sshd
Confirms that SSH binaries are installed.
Step 4: Configure Passwordless SSH
Switch to the Hadoop user (the hyphen loads the user's full login environment):
su - lovegb
Generate SSH keys:
ssh-keygen -t rsa -P ""
This creates an RSA key pair with an empty passphrase. Newer OpenSSH releases generate Ed25519 keys by default, so specifying RSA keeps the file names consistent with the next step.
Add the public key to authorized keys:
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
chmod 600 $HOME/.ssh/authorized_keys
The chmod ensures the file has the strict permissions the SSH daemon expects.
Test SSH connection:
ssh localhost
If login works without asking for a password, SSH is correctly configured.
Step 5: Download and Install Hadoop
Download Hadoop:
wget https://dlcdn.apache.org/hadoop/common/hadoop-3.4.2/hadoop-3.4.2.tar.gz
Extract the Hadoop archive:
tar xvzf hadoop-3.4.2.tar.gz
Create Hadoop installation directory:
sudo mkdir -p /usr/local/hadoop
Move into extracted folder:
cd hadoop-3.4.2
Move Hadoop files to installation directory:
sudo mv * /usr/local/hadoop
Set correct ownership:
sudo chown -R lovegb:hadoop /usr/local/hadoop
Step 6: Configure Java Path
Check available Java versions:
update-alternatives --config java
Edit .bashrc file:
nano ~/.bashrc
Because .bashrc belongs to your own user, sudo is not needed here.
Add the following Hadoop environment variables at the end of the file:
#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-21-openjdk-amd64   # match the path shown by update-alternatives
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END
Apply changes:
source ~/.bashrc
Step 7: Configure Hadoop Environment File
sudo nano /usr/local/hadoop/etc/hadoop/hadoop-env.sh
Find the JAVA_HOME line and set it to your real Java path, for example:
export JAVA_HOME=/usr/lib/jvm/java-21-openjdk-amd64
Use the directory reported by update-alternatives in Step 6 if yours differs.
Step 8: Create Hadoop Temporary Directory
sudo mkdir -p /app/hadoop/tmp
sudo chown lovegb:hadoop /app/hadoop/tmp
Step 9: Configure core-site.xml
sudo nano /usr/local/hadoop/etc/hadoop/core-site.xml
Defines the temporary directory and default file system for Hadoop.
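The file ships with an empty <configuration> element; fill it in so it looks like this minimal single-node example. It points hadoop.tmp.dir at the directory created in Step 8 and uses the conventional NameNode address hdfs://localhost:9000 (adjust host and port if yours differ):
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
    <description>Base directory for Hadoop temporary files.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
    <description>Default file system URI used by HDFS clients.</description>
  </property>
</configuration>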
Step 10: Configure mapred-site.xml
sudo nano /usr/local/hadoop/etc/hadoop/mapred-site.xml
Configures the MapReduce framework. In Hadoop 3.x, MapReduce jobs run on YARN; the JobTracker from Hadoop 1.x no longer exists.
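A typical single-node entry, again placed inside the <configuration> element:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>Run MapReduce jobs on the YARN resource manager.</description>
  </property>
</configuration>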
Step 11: Configure HDFS Storage Directories
sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
sudo chown -R lovegb:hadoop /usr/local/hadoop_store
Step 12: Configure hdfs-site.xml
sudo nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml
Defines the following (a sample file follows this list):
Replication factor
Block size
NameNode directory
DataNode directory
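A typical single-node file using the storage directories from Step 11. The replication factor is 1 because there is only one DataNode; dfs.blocksize is optional and is shown here with its 128 MB default:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
  </property>
</configuration>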
Step 13: Format NameNode
hdfs namenode -format
This initializes the Hadoop Distributed File System. The older hadoop namenode -format form still works but is deprecated in Hadoop 3.x.
Run this command only once; reformatting later would erase all HDFS metadata.
Step 14: Start Hadoop Services
cd /usr/local/hadoop/sbin
start-dfs.sh
start-yarn.sh
Verify running services:
jps
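If everything started correctly, jps lists the five Hadoop daemons along with itself; the output should look similar to this (process IDs will differ on your machine):
11234 NameNode
11410 DataNode
11672 SecondaryNameNode
11890 ResourceManager
12075 NodeManager
12301 Jps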
Step 15: Access Hadoop Web Interface
Open the NameNode web UI in a browser:
http://localhost:9870
This page shows HDFS cluster status. The YARN ResourceManager UI is available separately at http://localhost:8088.
Step 16: Stop Hadoop Services
stop-yarn.sh
stop-dfs.sh
Step 17: Test HDFS and Configure Trash
Upload a test file to verify that HDFS is working:
hadoop fs -put example.desktop /
Trash settings control how long deleted files are kept before permanent removal.
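To enable the trash, add a property like the following to core-site.xml; the value is the retention time in minutes, so 1440 keeps deleted files for one day:
<property>
  <name>fs.trash.interval</name>
  <value>1440</value>
</property>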
Step 18: Set HDFS Permissions
hadoop fs -chmod -R 755 /user
Ensures proper read, write, and execute permissions.
Step 19: Permission Configuration
dfs.permissions.enabled → Enables HDFS permission checks
hadoop.http.staticuser.user → Sets the default web UI user
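dfs.permissions.enabled lives in hdfs-site.xml and hadoop.http.staticuser.user in core-site.xml. A typical pairing for this setup (the static web UI user shown is the account created in Step 2):
<!-- hdfs-site.xml -->
<property>
  <name>dfs.permissions.enabled</name>
  <value>true</value>
</property>
<!-- core-site.xml -->
<property>
  <name>hadoop.http.staticuser.user</name>
  <value>lovegb</value>
</property>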
Conclusion
You have successfully learned how to install and configure Hadoop on Ubuntu 26.04 using a clean, safe, and structured approach. This setup is ideal for learning, testing, and academic purposes.
Frequently Asked Questions (FAQ)
Q1. Is this setup suitable for beginners?
Yes, this is a single-node Hadoop setup designed for learning.
Q2. Can this be used for production?
No, production requires a multi-node cluster configuration.
