Building a Spark cluster is fairly involved and needs several supporting components, so this post starts with the server environment: CentOS, Hadoop, Hive, ZooKeeper, and Kafka. Installing CentOS itself is not covered; the focus is on cluster configuration.
Environment and software packages
I built the cluster directly on three existing CentOS 5.6 systems, so the CentOS installation is not described here. If you need it, instructions are easy to find online, and the process is relatively simple. Of course, some of the tools below can be swapped for ones you already have on hand O(∩_∩)O~~
- CentOS 5.6 (Linux server)
- JDK 1.7 (Java development environment)
- Xshell 5 (connects to Linux from Windows)
- Xftp 4 (uploads files from Windows to Linux)
- Hadoop 2.4.1
- Hive 0.13
- ZooKeeper 3.4.5
- Kafka 2.9.2-0.2.1
- Spark 1.3.0
CentOS Server Configuration
After installing CentOS on the three servers, we set up password-free SSH login among them.
First, we give each of the three servers a hostname, temporarily at first. One server is used as the example; the other two are set up the same way.
- Set host name (three servers)
[root@localhost ~]# hostname // display the hostname
localhost
[root@localhost ~]# sudo hostname spark1 // set the hostname to spark1; the other two servers can be set to spark2 and spark3
// To make the change permanent:
[root@localhost ~]# vi /etc/sysconfig/network // edit the file as follows
HOSTNAME=spark1 # change localhost.localdomain to spark1
After that, continue by editing the /etc/hosts file.
[root@localhost ~]# vi /etc/hosts
[ip address] spark1
A permanent change takes effect only after the server is restarted. A temporary change takes effect immediately without a restart, but it is lost the next time the server restarts.
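The permanent edit can also be scripted instead of being made in vi. The following is a minimal sketch, assuming the CentOS 5/6 layout in which /etc/sysconfig/network holds a HOSTNAME= line. The function name set_persistent_hostname is hypothetical, and it takes the file path as an argument so that it can be tried on a copy first.

```shell
#!/bin/sh
# Hypothetical helper: rewrite (or append) the HOSTNAME= line in a
# sysconfig-style network file. Pass a copy of the file to try it safely.
set_persistent_hostname() {
    file=$1
    name=$2
    if grep -q '^HOSTNAME=' "$file"; then
        # Replace the existing entry. The result is written back with cat so
        # the original file keeps its owner and permissions.
        sed "s/^HOSTNAME=.*/HOSTNAME=$name/" "$file" > "$file.tmp" \
            && cat "$file.tmp" > "$file" \
            && rm -f "$file.tmp"
    else
        echo "HOSTNAME=$name" >> "$file"
    fi
}

# Example on a scratch copy (run as root on /etc/sysconfig/network for real use):
#   cp /etc/sysconfig/network /tmp/network.copy
#   set_persistent_hostname /tmp/network.copy spark1
```

Only the HOSTNAME= line is touched; other settings in the file are left as they are.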
Once this is set up, we can ping the hostname to check that it resolves.
[root@localhost ~]# ping spark1
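Once all three hostnames are mapped, the same check can be repeated for every server in one go. This is a small sketch; check_hosts is a hypothetical helper name, and it relies only on the standard ping option -c (number of packets to send).

```shell
#!/bin/sh
# Hypothetical helper: for each hostname given, send one ping and report
# whether a reply came back. Returns non-zero if any host failed.
check_hosts() {
    failed=0
    for host in "$@"; do
        if ping -c 1 "$host" > /dev/null 2>&1; then
            echo "$host reachable"
        else
            echo "$host NOT reachable"
            failed=1
        fi
    done
    return $failed
}

# Usage: check_hosts spark1 spark2 spark3
```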
- Shut down the firewall (three servers)
[root@localhost ~]# vi /etc/selinux/config // set SELINUX=disabled in the config file
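Note that the file edited in this step controls SELinux rather than the packet firewall itself. As a sketch, the same edit can be made with sed; the helper name disable_selinux_in is hypothetical and takes the file path so that it can be tried on a copy. The commented iptables commands are the usual CentOS 5/6 way to stop the firewall service; they are an addition here, not something the walkthrough itself shows.

```shell
#!/bin/sh
# Hypothetical helper: set SELINUX=disabled in an SELinux config file.
# The pattern does not match the separate SELINUXTYPE= line.
disable_selinux_in() {
    file=$1
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$file" > "$file.tmp" \
        && cat "$file.tmp" > "$file" \
        && rm -f "$file.tmp"
}

# Real use, as root, on each of the three servers:
#   disable_selinux_in /etc/selinux/config   # takes effect after a reboot
#   setenforce 0                             # permissive mode until then
#   service iptables stop                    # stop the firewall now (CentOS 5/6)
#   chkconfig iptables off                   # keep it off after a reboot
```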
- Set up password-free ssh login (three servers)
After setting the hostname on each of the three CentOS servers, log in to each server in turn and add the hostname mappings for the other two servers to its /etc/hosts file.
[root@localhost ~]# vi /etc/hosts
[IP address 11] spark1 // configured earlier
[IP address 12] spark2
[IP address 13] spark3
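Because the same three mappings have to be added on every server, it can help to script them. The sketch below is one way to do that, assuming a hypothetical helper named add_host_entry. The 192.0.2.x addresses in the usage comment are documentation placeholders, not the real addresses, which this walkthrough leaves unspecified.

```shell
#!/bin/sh
# Hypothetical helper: append "IP hostname" to a hosts file unless the
# hostname already appears as a name on a non-comment line, so running it
# twice does not create duplicates.
add_host_entry() {
    file=$1
    ip=$2
    name=$3
    if ! awk -v n="$name" '
            !/^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END   { exit !found }' "$file"; then
        echo "$ip $name" >> "$file"
    fi
}

# Usage as root on each server (placeholder addresses, substitute your own):
#   add_host_entry /etc/hosts 192.0.2.11 spark1
#   add_host_entry /etc/hosts 192.0.2.12 spark2
#   add_host_entry /etc/hosts 192.0.2.13 spark3
```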
Then we set up password-free SSH login.
[root@localhost ~]# ssh-keygen -t rsa // generate the SSH key pair
Press Enter at every prompt, without setting a passphrase. Then run the following commands.
[root@localhost ~]# cd /root/.ssh // the generated key files are placed in this folder automatically
[root@localhost .ssh]# ls
authorized_keys id_rsa id_rsa.pub known_hosts
// at this point we can log in to the local machine without a password
[root@localhost .ssh]# ssh spark1
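The key generation can also be done without the interactive prompts. The sketch below assumes that local password-free login works because the public key has been appended to authorized_keys, which is how that file would come to appear in the listing; that step is not shown explicitly in the walkthrough. The helper name setup_local_key is hypothetical. The ssh-keygen options used are -t (key type), -N (passphrase, empty here), -f (output file), and -q (quiet).

```shell
#!/bin/sh
# Hypothetical helper: create an RSA key pair with an empty passphrase in
# the given directory (unless one exists) and authorize it for local login.
setup_local_key() {
    dir=$1
    mkdir -p "$dir" && chmod 700 "$dir"
    if [ ! -f "$dir/id_rsa" ]; then
        ssh-keygen -q -t rsa -N "" -f "$dir/id_rsa"
    fi
    # Append the public key only if it is not already listed.
    touch "$dir/authorized_keys"
    if ! grep -qxF "$(cat "$dir/id_rsa.pub")" "$dir/authorized_keys"; then
        cat "$dir/id_rsa.pub" >> "$dir/authorized_keys"
    fi
    # sshd ignores authorized_keys files with overly open permissions.
    chmod 600 "$dir/authorized_keys"
}

# Usage as root: setup_local_key /root/.ssh
```

An existing key pair is kept rather than overwritten, so the function is safe to run again.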
Then we copy the SSH public key to the other two servers so that the servers can log in to each other over SSH without a password.
[root@localhost ~]# ssh-copy-id -i spark2 // copy the SSH public key to spark2
// the first time, you need to enter the spark2 login password; follow the prompts to complete the copy
// once it has completed
[root@localhost ~]# ssh spark2
We find that we can now log in to the spark2 server without a password. Repeat the same steps on the other servers until every pair of the three servers can log in to each other over SSH without a password.
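The pairwise setup means running ssh-copy-id from each server toward the other two. As a sketch, a loop can cover this; copy_key_to_peers and the DRY_RUN switch are hypothetical names, and the key path assumes the default id_rsa.pub location in the current user's home directory.

```shell
#!/bin/sh
# Hypothetical helper: copy this server's public key to every other host.
# Usage: copy_key_to_peers SELF HOST...
# Set DRY_RUN=1 to print the commands instead of running them.
copy_key_to_peers() {
    self=$1
    shift
    for host in "$@"; do
        if [ "$host" != "$self" ]; then
            # When run for real, this prompts once for the peer's root password.
            ${DRY_RUN:+echo} ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "root@$host"
        fi
    done
}

# On spark1: copy_key_to_peers spark1 spark1 spark2 spark3
# Repeat on spark2 and spark3, passing their own name as the first argument.
```

Passing the full host list everywhere and skipping the local name keeps the call identical on all three servers except for its first argument.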
Spark from Primer to Mastery (Part 7): Environment Setup (Server Setup)