Steps to Verify the One Box Setup of Hadoop

  • Start VirtualBox, choose the machine you prepared in the earlier step, and click on the green “Start” button.


  • If prompted, enter the password ‘abcd1234’.


  • Click on the Ubuntu icon in the top-left corner, search for “terminal”, and open it.


  • Once the terminal is up and running, it should look similar to the following –


  • Log in as the hduser user.
    • su hduser
    • enter the password ‘abcd1234’ when prompted
  • Go to the home directory and take a look at the directories present.
    • cd /home/hduser
    • the ‘pwd’ command should show the path as ‘/home/hduser’.
    • execute ‘ls -lart’ to take a look at the files and directories in general.
  • Start Hadoop.
    • cd /usr/local/hadoop/sbin/
    • ./start-all.sh
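    • Note: on Hadoop 2.x, start-all.sh is deprecated ( it still works, but prints a warning ); running ./start-dfs.sh followed by ./start-yarn.sh is equivalent.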
  • Confirm whether the services are running successfully.
    • run ‘jps’ – you should see something similar to the following –
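    • On a working single-node setup, jps typically lists the NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager daemons ( each with its own process ID ), along with Jps itself.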


  • Go to the example directory.
    • cd /home/hduser/example/WordCount1/
  • Run the command ‘ls’ – if there is a directory named ‘build’, delete it and recreate it. This step ensures that your program does not use precompiled jars and other stale files.
    • rm -rf build
    • mkdir build
  • Set JAVA_HOME and update PATH
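    • For example, assuming OpenJDK 7 is installed under /usr/lib/jvm/java-7-openjdk-amd64 ( adjust the path to match your installation ) –
    • export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
    • export PATH=$PATH:$JAVA_HOME/bin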
  • Build the example ( when you copy-paste, make sure no extra spaces or line breaks are introduced into the command ) –
    • javac -classpath /usr/local/hadoop/share/hadoop/common/hadoop-common-2.6.0.jar:/usr/local/hadoop/share/hadoop/common/lib/hadoop-annotations-2.6.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.6.0.jar -d build WordCount.java
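    • For reference, the WordCount.java being compiled is assumed to be the classic Hadoop MapReduce WordCount example ( the class name org.myorg.WordCount used in the run step below matches it ). A minimal sketch –

      package org.myorg;

      import java.io.IOException;
      import java.util.StringTokenizer;

      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.Path;
      import org.apache.hadoop.io.IntWritable;
      import org.apache.hadoop.io.Text;
      import org.apache.hadoop.mapreduce.Job;
      import org.apache.hadoop.mapreduce.Mapper;
      import org.apache.hadoop.mapreduce.Reducer;
      import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
      import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

      public class WordCount {

        // Mapper: emits (word, 1) for every whitespace-separated token in a line.
        public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
          private final static IntWritable one = new IntWritable(1);
          private Text word = new Text();

          public void map(Object key, Text value, Context context)
              throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
              word.set(itr.nextToken());
              context.write(word, one);
            }
          }
        }

        // Reducer (also used as combiner): sums the counts emitted for each word.
        public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
          private IntWritable result = new IntWritable();

          public void reduce(Text key, Iterable<IntWritable> values, Context context)
              throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
              sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
          }
        }

        public static void main(String[] args) throws Exception {
          Configuration conf = new Configuration();
          Job job = Job.getInstance(conf, "word count");
          job.setJarByClass(WordCount.class);
          job.setMapperClass(TokenizerMapper.class);
          job.setCombinerClass(IntSumReducer.class);
          job.setReducerClass(IntSumReducer.class);
          job.setOutputKeyClass(Text.class);
          job.setOutputValueClass(IntWritable.class);
          FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. /user/hduser/input/
          FileOutputFormat.setOutputPath(job, new Path(args[1])); // e.g. /user/hduser/output
          System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
      }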


  • Create Jar –
    • jar -cvf wcount.jar -C build/ .
  • Now prepare the input for the program ( give the ‘output’ directory a name of your own – it must not already exist ).
    • Make your own input directory –
    • hadoop dfs -mkdir /user/hduser/input
    • Copy the input files ( file1, file2, file3 ) to the HDFS location –
    • hadoop dfs -put file* /user/hduser/input
    • Check if the output directory already exists.
      • hadoop dfs -ls /user/hduser/output
      • In the screenshot below, it already exists.
    • If it already exists, delete it with the help of the following commands –
      • hadoop dfs -rm /user/hduser/output/*
      • hadoop dfs -rmdir /user/hduser/output
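      • Alternatively, on Hadoop 2.x a single recursive delete does the same job – hadoop dfs -rm -r /user/hduser/output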
  • Run the program
    • hadoop jar wcount.jar org.myorg.WordCount /user/hduser/input/ /user/hduser/output
    • At the end you should see something similar to the following –


  • Check if the output files have been generated


  • hadoop dfs -ls /user/hduser/output – you should see something similar to the screenshot below


  • Get the contents of the output files –
    • hadoop dfs -cat /user/hduser/output/part-r-00000
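    • Each line of part-r-00000 holds one word and its count, separated by a tab ( the default TextOutputFormat separator ).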


  • Verify the word count against the input files –
    • cat file1 file2 file3
    • The word counts should match.
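    • For a quick local comparison ( a rough check, assuming words are separated by whitespace, which is how the WordCount tokenizer splits them ) –
    • cat file1 file2 file3 | tr -s '[:space:]' '\n' | sort | uniq -c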