Tutorial: Installing an Apache Hadoop Single Node Cluster with Hortonworks Data Platform
In this tutorial I will show you how to install your own small Hadoop single node cluster with the Hortonworks Data Platform inside VirtualBox. After the setup you can play around with the cluster and gain some experience with it without having to set up a separate machine. It can also serve as a local development environment where you debug your MapReduce jobs. The Hortonworks Data Platform is a 100% open source Apache Hadoop distribution and comes with the following components:
- Hadoop Distributed File System (HDFS)
- MapReduce
- Apache Pig
- Apache Hive
- Apache HCatalog
- Templeton
- Apache HBase
- Apache ZooKeeper
- Apache Oozie
- Apache Sqoop
- Ganglia
- Nagios
Install VirtualBox
- The first step is the installation of the VirtualBox software, which can be downloaded from the VirtualBox website. Please choose the installation binaries for your operating system.
- Install VirtualBox with the default options.
- Download the ISO for CentOS 6.3 from your favourite mirror.
- Create a new virtual machine in VirtualBox and attach the ISO file as its installation medium.
- Before you start the virtual machine, make sure that you configure the following settings:
  - Main memory: 4096 MB
  - Disk space: 16 GB
  - Enable the bridged network adapter
  - Enable IOAPIC
- Start the virtual machine.
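The memory, IOAPIC and network settings listed above can also be applied from the command line with VBoxManage instead of the GUI. The following is only a sketch: the function name, the VM name "hdp-single-node" and the interface "eth0" are placeholders that do not come from this tutorial, and the disk size is not covered because it is chosen when the virtual disk is created. The virtual machine has to be powered off while the settings are changed.

```shell
#!/bin/sh
# Sketch: apply the VM settings from the list above with VBoxManage.
# configure_vm, the VM name and the bridge interface are illustrative
# placeholders. With DRY_RUN=1 (the default) the command is only printed.

configure_vm() {
    vm_name="$1"      # name of the VM as shown in VirtualBox
    bridge_if="$2"    # host network interface used for bridging

    # Build the command as positional parameters so it can be printed or run.
    set -- VBoxManage modifyvm "$vm_name" \
        --memory 4096 \
        --ioapic on \
        --nic1 bridged \
        --bridgeadapter1 "$bridge_if"

    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Calling `configure_vm hdp-single-node eth0` prints the command so it can be reviewed first; with `DRY_RUN=0` in the environment it is executed.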
Install CentOS
- When everything is working correctly, CentOS will start the installation process.
- Please choose “Install or upgrade an existing system” from the list.
- For the hostname leave the default “localhost.localdomain”.
- Skip the media test.
- Choose the installation type “Minimal Desktop”.
- Create a user for the cluster (e.g. hadoop).
- After the successful setup, reboot your virtual system and log in as root.
Prepare the HMC Single Node Cluster Setup
- Change the keyboard layout to the correct language through “System->Administration->Keyboard”.
- Disable the firewall.
    chkconfig iptables off
    chkconfig ip6tables off
- Disable SELinux: open the configuration file and change SELINUX=enforcing to SELINUX=disabled.
    vi /etc/selinux/config
- Configure ntpd to start at bootup.
    chkconfig ntpd on
- Edit the file “/etc/hosts” as shown in the screenshot. It is important that the first entry is “localhost.localdomain”; otherwise the HMC setup will not work, because hostname resolution will fail.
- Type “hostname -f” in the terminal. It should return “localhost.localdomain”.
- Type “hostname -s” in the terminal. It should return “localhost”.
- Start the SSH service with
    /sbin/service sshd start
- Make sure that sshd is started automatically on startup.
    chkconfig sshd on
- Prepare password-less SSH login for the root user to localhost.
    ssh-keygen
    ssh-copy-id localhost
    chmod 700 ~/.ssh
    chmod 640 ~/.ssh/authorized_keys
- Check that password-less login works with
    ssh localhost
- Create a text file “hostdetail.txt” with the host names that will be part of your cluster. In our example with only one node it should contain only this entry:
    localhost.localdomain
- If you try to edit the file with a GUI editor you will get an error. Just install your favourite editor, e.g. gedit, by following the instructions.
- After this preparation it’s recommended to take a snapshot of your current system so that you can come back to this point when something goes wrong with the installation.
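Two of the preparation steps above, the SELinux change and the creation of “hostdetail.txt”, can also be done without opening an editor. The sketch below is one possible way to script them; the function names and the optional path arguments are illustrative assumptions and not part of HMC. Note that the change in the SELinux configuration file only takes effect after a reboot.

```shell
#!/bin/sh
# Sketch: non-interactive versions of two preparation steps.
# Function names and optional arguments are illustrative assumptions.

# Switch SELINUX=enforcing to SELINUX=disabled in the given config file
# (default: /etc/selinux/config). The original is kept as FILE.bak.
disable_selinux_config() {
    config_file="${1:-/etc/selinux/config}"
    cp "$config_file" "$config_file.bak" &&
    sed 's/^SELINUX=enforcing/SELINUX=disabled/' "$config_file.bak" > "$config_file"
}

# Write the host list for the single node cluster
# (default file: hostdetail.txt, default host: localhost.localdomain).
write_hostdetail() {
    target_file="${1:-hostdetail.txt}"
    host_name="${2:-localhost.localdomain}"
    printf '%s\n' "$host_name" > "$target_file"
}
```

Run as root, `disable_selinux_config` followed by `write_hostdetail` uses the default paths from this tutorial; passing other paths makes it easy to try the functions on a copy first.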
Install Hortonworks Data Platform with HMC
- Download the RPM (please check whether a newer version is available).
    rpm -Uvh http://public-repo-1.hortonworks.com/HDP-1.1.1.16/repos/centos6/hdp-release-1.1.1.16-1.el6.noarch.rpm
- Install “Extra Packages for Enterprise Linux (EPEL)”.
    yum install epel-release
- Install HMC.
    yum install hmc
- Check the installation status with
    rpm -qa | grep hmc
- Start the HMC service. You will be prompted to agree to the Oracle Java License and to download the binaries.
    service hmc start
- Stop the firewall.
    /etc/init.d/iptables stop
- Proceed to the final installation step.
Provisioning Your Cluster
- Go to the main page of the Hortonworks Management Center (HMC). If you access it from outside the virtual machine, replace “localhost” with the IP address of your virtual machine.
    http://localhost/hmc/html
- Follow the wizard instructions.
- When you are prompted to specify the disk mount point, choose a different one than the wizard proposes, for example “/data”.
- When the installation was successful you should see a confirmation screen.
- If there is an error, the following log files may be helpful for troubleshooting:
    /var/log/hmc/hmc.log
    /var/log/puppet_apply.log
- You can now go to the dashboard and check the status of your cluster.
- To safely shut down your cluster, stop all services in the HMC first; then you can stop your virtual machine.
- When you restart your system you can start HMC again by issuing the following commands:
    service hmc start
    service hmc-agent start
- To run the HMC service on startup, follow the steps described here (optional).
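If provisioning fails, a quick first look at the two log files mentioned above can save some scrolling. The helper below is a small sketch and not part of HMC: the function name is made up, and searching for the word "error" is only a guess at what is interesting in these logs.

```shell
#!/bin/sh
# Sketch: print the most recent lines that mention "error" from each log file.
# scan_logs_for_errors is an illustrative helper, not an HMC tool.

scan_logs_for_errors() {
    for log_file in "$@"; do
        if [ ! -r "$log_file" ]; then
            echo "skip: $log_file (not readable)"
            continue
        fi
        echo "== $log_file"
        # -i: ignore case, -n: prefix each match with its line number
        grep -i -n 'error' "$log_file" | tail -n 20
    done
}
```

A typical call would be `scan_logs_for_errors /var/log/hmc/hmc.log /var/log/puppet_apply.log`.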