Getting Started with HADOOP - Single Server, Multiple Node Simulation

Hadoop can also be run on a single node in pseudo-distributed mode, where each Hadoop daemon runs in a separate Java process.

1. Install the Java SDK if it isn't already installed
java -version
In my case java version "1.6.0_22" was available.
whereis javac

If you don't have a recent Java SDK installed, download and install the JDK.

Make sure to note the location where you install it, for use later when setting JAVA_HOME. In my case JAVA_HOME was /usr.
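A JAVA_HOME candidate can be derived from the javac path that whereis reports, by stripping the trailing /bin/javac. A quick sketch (the path below is an example; substitute your own):

```shell
# Strip the trailing /bin/javac from the compiler path to get JAVA_HOME
# (example path shown; substitute the output of `whereis javac`)
javac_path=/usr/bin/javac
java_home=${javac_path%/bin/javac}
echo "$java_home"   # -> /usr
```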

2. Do some basic setup to make life easier

groupadd hadoop
useradd -g hadoop hadoop
passwd hadoop
mkdir /hadoop
chown hadoop:hadoop /hadoop

3. Get a stable release of Hadoop (here hadoop-1.0.3.tar.gz, downloaded from an Apache mirror) and install it into /hadoop/hadoop
cd /hadoop

tar -xvf hadoop-1.0.3.tar.gz

mv hadoop-1.0.3 hadoop
export HADOOP_INSTALL=/hadoop/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
Set these up in your login profile to ensure they are set each time you log in.
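One way to persist the exports, assuming a Bash login shell (the profile file name is an assumption; adjust for your shell), is to add lines like these to ~/.bash_profile:

```shell
# Lines to add to ~/.bash_profile so the Hadoop environment
# survives new login sessions (paths assume the layout above)
export JAVA_HOME=/usr
export HADOOP_INSTALL=/hadoop/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
```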

4. Set up Hadoop environment variables

cd /hadoop/hadoop/conf
Set JAVA_HOME in hadoop-env.sh (or export it in your shell):
export JAVA_HOME=/usr

Verify it's working:
hadoop version

5. Set up local configuration files. The settings below are the standard pseudo-distributed values from the Hadoop 1.x docs, with hadoop.tmp.dir pointing at the cache folder created here.

mkdir -p /var/hadoop/cache/

vi /hadoop/hadoop/conf/core-site.xml
<?xml version="1.0"?>
<!-- core-site.xml -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/hadoop/cache</value>
  </property>
</configuration>

vi /hadoop/hadoop/conf/hdfs-site.xml

<?xml version="1.0"?>
<!-- hdfs-site.xml -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>


vi /hadoop/hadoop/conf/mapred-site.xml
<?xml version="1.0"?>
<!-- mapred-site.xml -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

6. Set up passwordless SSH access to the local machine
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh localhost # test
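If ssh localhost still prompts for a password, a common culprit is file permissions: standard OpenSSH behavior is to ignore key files that are group- or world-writable. A quick fix:

```shell
# Tighten permissions so sshd will accept the key files
# (sshd ignores authorized_keys that is group/world writable)
mkdir -p ~/.ssh
touch ~/.ssh/authorized_keys
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
```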

7. Format a new distributed file system
hadoop namenode -format

8. Start the Hadoop daemons
start-all.sh
# use jps to confirm the NameNode, DataNode, SecondaryNameNode,
# JobTracker and TaskTracker processes are running
jps

9. Test it out: create a folder /testdir, copy the files with the .xml extension from the Hadoop conf folder into /testdir/conf, then upload the folder to HDFS as "input"
mkdir -p /testdir/conf
cp /hadoop/hadoop/conf/*.xml /testdir/conf
cd /
hadoop fs -put testdir input
# Check the new distributed folder input exists
hadoop fs -ls
# Check contents of distributed folder
hadoop fs -ls input

10. To view the namenode for the pseudo cluster, browse to:
http://localhost:50070/

To view the jobtracker for the pseudo cluster, browse to:
http://localhost:50030/

11. To view contents of files you've added to the DFS
hadoop fs -cat input/conf/*

12. To remove files from DFS
hadoop fs -rm input/conf/*

