
The only prerequisite for this tutorial is a VPS with Ubuntu 13.10 x64 installed.

You will need to execute commands from the command line, which you can do in one of two ways:

1. Use SSH to access the droplet (an example follows below).
2. Use the 'Console Access' from the Digital Ocean Droplet Management Panel.
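If you go the SSH route, a typical connection from your local machine looks like the following (the IP address here is only a placeholder; substitute your droplet's actual address):

ssh root@203.0.113.10    # placeholder IP; replace with your droplet's IP address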

Hadoop is a framework (consisting of software libraries) which simplifies the processing of data sets distributed across clusters of servers. Two of the main components of Hadoop are HDFS and MapReduce.

HDFS is the filesystem that Hadoop uses to store all of its data. This file system spans all the nodes that are being used by Hadoop; these nodes can sit on a single VPS or be spread across a large number of virtual servers.

MapReduce is the framework that orchestrates all of Hadoop's activities. It handles the assignment of work to the different nodes in the cluster.

The architecture of Hadoop allows you to scale your hardware as and when you need to. New nodes can be added incrementally without having to worry about changes in data formats or in the handling of the applications that sit on the file system.

One of the most important features of Hadoop is that it allows you to save enormous amounts of money by substituting cheap commodity servers for expensive ones. This is possible because Hadoop transfers the responsibility for fault tolerance from the hardware layer to the application layer.
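To make the MapReduce idea concrete before touching Hadoop itself, here is a rough word-count sketch built from ordinary shell commands; it runs locally and involves no Hadoop at all, but the three stages loosely mirror the map, shuffle/sort, and reduce phases:

# Map: split the input into words and emit one "word 1" pair per word.
# Shuffle/sort: group identical keys next to each other.
# Reduce: sum the counts for each distinct word.
echo "to be or not to be" |
  tr ' ' '\n' |
  sed 's/$/ 1/' |
  sort |
  awk '{count[$1] += $2} END {for (w in count) print w, count[w]}'
# prints: be 2, not 1, or 1, to 2 (order may vary)

In a real Hadoop cluster, the map and reduce stages run in parallel on different nodes, and MapReduce handles splitting the input and collecting the results.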

Installing and getting Hadoop up and running is quite straightforward. However, since this process requires editing multiple configuration and setup files, make sure that each step is properly followed.

1. Install Java

Hadoop requires Java to be installed, so let's begin by installing Java (here via Ubuntu's default-jdk package):

apt-get update
apt-get install default-jdk

These commands will update the package information on your VPS and then install Java. After executing these commands, execute the following command to verify that Java has been installed:

java -version

If Java has been installed, this should display the version details as illustrated in the following image:

(Image: Java Verification)
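Later configuration steps will need to know where Java was installed, usually via the JAVA_HOME variable. As a small sketch, assuming the default-jdk package installed above (the /usr/lib/jvm/default-java path is the usual Ubuntu symlink, but verify that it exists on your system):

readlink -f /usr/bin/java                      # follow the symlink chain to the real Java binary
export JAVA_HOME=/usr/lib/jvm/default-java     # common Ubuntu symlink to the installed JDK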

2. Set Up SSH

Hadoop uses SSH (to access its nodes), which would normally require the user to enter a password.
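This requirement can be removed by switching to key-based authentication. As a minimal sketch of the idea, assuming a single-node setup where the current user connects to localhost, you could generate a passphrase-less key pair and authorize it:

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa           # create an RSA key pair with an empty passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys    # authorize the new public key for this user
ssh localhost                                      # should now log in without prompting for a password

On a multi-node cluster, the public key would instead be copied to each node, for example with ssh-copy-id.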
