Hadoop

Apache Hadoop documentation

Setup terminal rooms

An Ubuntu package for Hadoop (downloaded from ftp.nluug.nl) has been added to the science Ubuntu repository:

hadoop_1.1.1-1_x86_64.deb

Local users

uid: 201 for hdfs
uid: 202 for mapred
gid:  49 for hadoop
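
As a rough sketch, accounts with these IDs could be created as shown below. The source only lists the IDs. Making hadoop the primary group of both users, and the --system, --no-create-home and --shell options, are assumptions. The function name create_hadoop_accounts is hypothetical. By default the commands are only printed; run with RUN= (empty) as root to execute them.

```shell
#!/bin/sh
# Dry run by default: each command is echoed instead of executed.
# Set RUN to the empty string (RUN= sh script.sh) as root to really run them.
RUN=${RUN-echo}

create_hadoop_accounts() {
    # Shared group with the fixed gid from the list above.
    $RUN groupadd --gid 49 hadoop

    # Service accounts with the fixed uids from the list above.
    # Assumption: hadoop is the primary group, no home directory, no login shell.
    $RUN useradd --uid 201 --gid hadoop --system --no-create-home --shell /usr/sbin/nologin hdfs
    $RUN useradd --uid 202 --gid hadoop --system --no-create-home --shell /usr/sbin/nologin mapred
}

create_hadoop_accounts
```

Afterwards, id hdfs and id mapred should report the uids and gid listed above.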

In /etc/hadoop/hadoop-env.sh, the HADOOP_CLIENT_OPTS environment variable has been changed from -Xmx128m to -Xmx1024m.
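
For illustration, the changed line in /etc/hadoop/hadoop-env.sh would look roughly like this. The exact form is an assumption based on the stock Hadoop 1.x template, where the heap option is prepended to any value already set; the packaged file may differ.

```shell
# Applies to client commands (fs, dfs, fsck, distcp, ...).
# Assumed stock value: "-Xmx128m $HADOOP_CLIENT_OPTS"; raised here to 1024 MB.
export HADOOP_CLIENT_OPTS="-Xmx1024m $HADOOP_CLIENT_OPTS"
```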

Stand-alone test

With this setup, we successfully ran the example job:

$ cd /scratch/
$ mkdir input
$ cp /usr/share/hadoop/templates/conf/*.xml input
$ hadoop jar /usr/share/hadoop/hadoop-examples-1.1.1.jar grep input output 'dfs[a-z.]+'
$ cat output/*
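
The grep example job counts how often each string matching the regular expression occurs in the input files. To cross-check the result without Hadoop, the same counts can be approximated with standard tools. This is a minimal sketch: the function name local_grep_count is hypothetical, and it assumes GNU grep and that the pattern means the same in extended regex syntax as in Java regex syntax, which holds for the simple pattern used here.

```shell
# Local approximation of the example job: count every match of a regex
# in all files of a directory, most frequent match first.
local_grep_count() {
    dir=$1
    pattern=$2
    # -o prints only the matched text, -h omits file names, -E uses extended regexes
    grep -ohE "$pattern" "$dir"/* | sort | uniq -c | sort -rn
}

# Usage, from /scratch after the commands above:
#   local_grep_count input 'dfs[a-z.]+'
```

The counts and matched strings should agree with the contents of output/*, although column order and whitespace may differ.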