
Build Hadoop environment in Linux

  sonic0002      2013-07-31 23:22:27

Hadoop standalone installation:

1. Install JDK

Install the JDK with the command below:

sudo apt-get install sun-java6-jdk

Configure the Java environment: open /etc/profile and add the lines below:

export JAVA_HOME=(Java installation directory)
export PATH="$JAVA_HOME/bin:$PATH"
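As a minimal sketch of the same step, the helper below appends the two settings to a profile file only once, so running it again does not duplicate them. The name add_java_env and the example path /usr/lib/jvm/java-6-sun are assumptions made for illustration; use the directory where your JDK actually lives.

```shell
# Append JAVA_HOME settings to a profile file, but only once.
# Usage: add_java_env <profile-file> <java-install-dir>
# (add_java_env is a hypothetical helper, not part of any package.)
add_java_env() {
    profile=$1
    java_home=$2
    # Skip if an earlier run already added the setting.
    if grep -q '^export JAVA_HOME=' "$profile" 2>/dev/null; then
        return 0
    fi
    {
        # No spaces around "=": the shell would otherwise reject the line.
        printf 'export JAVA_HOME=%s\n' "$java_home"
        # The java binary lives in the bin subdirectory of the JDK.
        printf 'export PATH="$JAVA_HOME/bin:$PATH"\n'
    } >> "$profile"
}

# Example (assumed path; writing to /etc/profile needs root):
# add_java_env /etc/profile /usr/lib/jvm/java-6-sun
```

After editing /etc/profile, log in again or run `. /etc/profile` so the current shell picks up the new variables.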

Verify the Java installation

Type java -version. If it prints the Java version information, Java is installed successfully.

2. Install SSH

Install SSH with the command below:

sudo apt-get install ssh

Configure SSH so that you can log in to the local machine without a password:

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

Press "Enter". Two files will be created in ~/.ssh/: id_rsa and id_rsa.pub. These two files come as a pair, like a key and its lock.

Then add the public key to the authorized keys:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
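A slightly more defensive version of this step is sketched below. It assumes the public key is named id_rsa.pub, which is the name ssh-keygen gives it when called with -f ~/.ssh/id_rsa. It appends the key only if it is not already listed, and tightens the permissions, since sshd may ignore an authorized_keys file that other users can write to. The helper name authorize_key is hypothetical.

```shell
# Add id_rsa.pub to authorized_keys once and fix permissions.
# Usage: authorize_key <ssh-dir>      (normally ~/.ssh)
# (authorize_key is a hypothetical helper for illustration.)
authorize_key() {
    ssh_dir=$1
    pub="$ssh_dir/id_rsa.pub"
    auth="$ssh_dir/authorized_keys"
    [ -f "$pub" ] || { echo "missing $pub" >&2; return 1; }
    touch "$auth"
    # -x: whole-line match, -F: fixed string, -f: read the pattern from the key file.
    if ! grep -qxFf "$pub" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    # Restrict access to the owner only.
    chmod 700 "$ssh_dir"
    chmod 600 "$auth"
}

# Example:
# authorize_key "$HOME/.ssh"
```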

Verify the SSH installation:

Type ssh localhost. If you are logged in without being asked for a password, SSH is set up correctly.

3. Switch off firewall

sudo ufw disable

Note: This step is important. If the firewall is left on, you may run into a "cannot find datanode" issue.

4. Install Hadoop (taking version 0.20.2 as an example)

Download Hadoop from

Install and configure Hadoop

Single node configuration:

No configuration is needed for single-node Hadoop. In this mode, Hadoop runs as a single Java process.

Pseudo-Distributed Mode

Pseudo-distributed mode is a cluster with only one node. In this cluster the local machine is both the master and the slave: it is the namenode as well as the datanode, and the jobtracker as well as the tasktracker.


Modify the files below in the conf directory:


Add export JAVA_HOME=(Java installation directory)

In core-site.xml, modify the contents below:

	<!-- global properties -->

	<!-- file system properties -->

In hdfs-site.xml, modify the contents below:


In mapred-site.xml, modify the contents below:
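As a minimal sketch of what these three files might contain on a single machine, the helper below writes them into a conf directory. The values are assumptions taken from the usual Hadoop 0.20.x single-node setup (NameNode at hdfs://localhost:9000, replication factor 1, JobTracker at localhost:9001), not settings confirmed by this article, and write_pseudo_conf is a hypothetical name.

```shell
# Write an assumed pseudo-distributed configuration into a conf directory.
# Usage: write_pseudo_conf <conf-dir>
# (write_pseudo_conf is a hypothetical helper; the values are assumptions.)
write_pseudo_conf() {
    conf=$1
    mkdir -p "$conf"

    # Where clients find the default file system (the NameNode).
    cat > "$conf/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

    # One datanode only, so keep a single copy of each block.
    cat > "$conf/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

    # Host and port of the JobTracker.
    cat > "$conf/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
EOF
}

# Example (run from the Hadoop installation directory; overwrites the files):
# write_pseudo_conf conf
```

Back up the existing files first, because the helper overwrites them rather than merging.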


Format Hadoop file system:

bin/hadoop namenode -format

Start Hadoop:


Verify the Hadoop installation by opening the URLs below in a browser. If they open normally, Hadoop is installed successfully.
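Another way to check that everything started is the JDK's jps tool, which lists running Java processes. The sketch below reads jps output and prints the daemons that are missing. It assumes the five daemon names used by Hadoop 0.20.x in pseudo-distributed mode (NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker); missing_daemons is a hypothetical helper.

```shell
# Print the expected Hadoop daemons that do not appear in `jps` output (stdin).
# (missing_daemons is a hypothetical helper; the daemon list is an assumption.)
missing_daemons() {
    out=$(cat)
    missing=""
    for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # jps prints "<pid> <main class>", one process per line.
        # The leading space keeps "NameNode" from matching "SecondaryNameNode".
        if ! printf '%s\n' "$out" | grep -q " $d\$"; then
            missing="$missing $d"
        fi
    done
    # Drop the leading space; prints an empty line when nothing is missing.
    echo "${missing# }"
}

# Example:
# jps | missing_daemons
```

An empty result means all five daemons are running. If DataNode is listed as missing, the firewall note in step 3 is a good place to start looking.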



5. Run an example

Create two files locally:

echo "Hello World Bye World" > file01
echo "Hello Hadoop Goodbye Hadoop" > file02

Create an input directory in HDFS:

hadoop fs -mkdir input

Copy file01 and file02 to HDFS:

hadoop fs -copyFromLocal /home/zhongping/file0* input

Run wordcount:

hadoop jar hadoop-0.20.2-examples.jar wordcount input output

Check result:

hadoop fs -cat output/part-r-00000
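To cross-check the result without Hadoop, the same counts can be computed locally with standard tools. This is only a sketch: local_wordcount is a hypothetical helper, and it assumes words are separated by spaces or newlines, as in file01 and file02.

```shell
# Count words in the given files locally, printing "word<TAB>count" per line.
# (local_wordcount is a hypothetical helper for comparison only.)
local_wordcount() {
    # One word per line, drop empty lines, sort in byte order, count repeats.
    cat "$@" | tr -s ' ' '\n' | grep -v '^$' | LC_ALL=C sort | uniq -c |
        awk '{ printf "%s\t%s\n", $2, $1 }'
}

# Example:
# local_wordcount file01 file02
```

For the two files created above, this prints Bye 1, Goodbye 1, Hadoop 2, Hello 2 and World 2, one pair per line. The wordcount job should list the same five words with the same counts.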

Source :







