Difference between revisions of "Hadoop"

Revision as of 16:36, 28 March 2013

Apache Hadoop links

Setup terminal rooms

An Ubuntu package for Hadoop (downloaded from ftp.nluug.nl) has been added to the science Ubuntu repository.

hadoop_1.1.1-1_x86_64.deb

Local users

uid: 201 for hdfs
uid: 202 for mapred
gid:  49 for hadoop

In /etc/hadoop/hadoop-env.sh, the HADOOP_CLIENT_OPTS environment variable has been changed from -Xmx128m to -Xmx1024m.
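
For reference, the changed line probably looks something like the sketch below. This assumes the packaged hadoop-env.sh follows the stock Hadoop 1.x template, which prepends the heap setting to any client options that are already set; the exact line in the installed file may differ.

```shell
# Sketch of the edited line in /etc/hadoop/hadoop-env.sh.
# Assumption: the file uses the stock template form, where the value was "-Xmx128m ...".
# This raises the maximum Java heap for client commands (hadoop fs, hadoop jar, ...) to 1024 MB.
export HADOOP_CLIENT_OPTS="-Xmx1024m $HADOOP_CLIENT_OPTS"
```

Running `grep HADOOP_CLIENT_OPTS /etc/hadoop/hadoop-env.sh` shows what the installed file actually contains.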

Stand-alone test

With this setup, we could successfully run the example job:

$ cd /scratch/
$ mkdir input 
$ cp /usr/share/hadoop/templates/conf/*.xml input # has nothing to do with configuration; this just generates input data
$ hadoop jar /usr/share/hadoop/hadoop-examples-*.jar grep input output 'dfs[a-z.]+' 
$ cat output/*
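
The example job counts how often each string matching the regular expression occurs in the input files. For a small input like this one, the result can be cross-checked without Hadoop. The function below is a rough local equivalent, not part of the Hadoop package; `count_matches` is a hypothetical helper name, and GNU grep (for `-o`, `-h` and `-E`) is assumed.

```shell
# count_matches DIR REGEX
# Prints every distinct match of REGEX in the files under DIR,
# preceded by its number of occurrences, most frequent first.
# Assumption: GNU grep; count_matches is an illustrative name, not a Hadoop tool.
count_matches() {
    # -o: print only the matched text, -h: omit file names, -E: extended regex
    grep -ohE "$2" "$1"/* | sort | uniq -c | sort -rn
}
```

Calling `count_matches input 'dfs[a-z.]+'` in /scratch/ should list the same matches and counts as `cat output/*`, although the column spacing and the order of entries with equal counts may differ.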