Hadoop tutorial (1) ---- use VMware to install CentOS
1. Overview
My learning environment: four CentOS systems installed under VMware (used to build a Hadoop cluster). One is the Master and three are Slaves; the Master is the NameNode of the Hadoop cluster, and the three Slaves are DataNodes. At the same time, we …
Use Cloudera Manager to install Hadoop
Hadoop is composed of many different services (such as HDFS, Hive, HBase, Spark, and so on), and these services have dependencies on one another. If you download the original Apache packages directly, you have to download and configure each one separately, which is troublesome. As a result, some companies have customized …
Use Protocol Buffers with LZO in Hadoop (2)
1. LZO Introduction
LZO is a compression codec with a high compression ratio and very fast compression speed. Its features:
Decompression is very fast. LZO is lossless, so the compressed data can be restored exactly. LZO is block-based, allowing data to be split into chunks that can be decompressed in parallel.
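As an illustration, registering the LZO codec in core-site.xml typically looks like the following. This assumes the hadoop-lzo package is installed; the class names come from that package and may differ by distribution, so treat this as a sketch rather than a definitive configuration:

```xml
<!-- core-site.xml: register LZO codecs (assumes hadoop-lzo is installed) -->
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,
         com.hadoop.compression.lzo.LzoCodec,
         com.hadoop.compression.lzo.LzopCodec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
```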
For installation instructions …
/ambari-ddl-mysql-create.sql;
CREATE USER 'oozie'@'localhost' IDENTIFIED BY 'Oozie';
GRANT ALL PRIVILEGES ON *.* TO 'oozie'@'localhost';
CREATE USER 'oozie'@'%' IDENTIFIED BY 'Oozie';
GRANT ALL PRIVILEGES ON *.* TO 'oozie'@'%';
CREATE USER 'oozie'@'hdp01.test' IDENTIFIED BY 'Oozie';
GRANT ALL PRIVILEGES ON *.* TO 'oozie'@'hdp01.test';
FLUSH PRIVILEGES;
CREATE DATABASE oozie;
2)
sudo vi /etc/mysql/my.cnf
Add a '#' in front of the line "bind-address = 127.0…" to comment it out.
to "randomly throw points at figure 2", that is, how to make each point in figure 2 equally likely to be hit. In the Hadoop examples code, a Halton sequence is used to ensure this; for details about the Halton sequence, refer to Wikipedia. To summarize its role here: within the 1-by-1 square, it generates non-repeating, evenly distributed points whose x and y coordinates both lie between 0 and 1. This ensures that "r…
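To make this concrete, here is a minimal sketch of the standard radical-inverse construction of a 2-D Halton sequence (bases 2 and 3). This is not the Hadoop examples source, just the textbook algorithm:

```java
// Minimal sketch of the Halton low-discrepancy sequence (radical-inverse
// construction). Point i in base b is i written in base b and mirrored
// around the radix point, giving non-repeating values that fill (0, 1)
// evenly. Pairing base-2 and base-3 sequences yields 2-D points in the
// unit square.
public class HaltonSketch {
    // Radical inverse of index in the given base; returns a value in (0, 1).
    static double halton(int index, int base) {
        double fraction = 1.0, result = 0.0;
        for (int i = index; i > 0; i /= base) {
            fraction /= base;
            result += fraction * (i % base);
        }
        return result;
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 5; i++) {
            System.out.printf("point %d: (%.4f, %.4f)%n",
                    i, halton(i, 2), halton(i, 3));
        }
    }
}
```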
Prerequisites for using FPGA on Yarn
YARN currently only supports FPGA resources exposed through the IntelFpgaOpenclPlugin.
The vendor's FPGA driver must be installed on every machine where the YARN NodeManager runs, and the required environment variables must be configured.
Docker containers are not supported yet.
Configure FPGA Scheduling
In resource-types.xml, add the following configuration.
In yarn-site.xml, the DominantResourceCalculator must be configured to enable FPGA scheduling, and is …
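A minimal sketch of those two configurations, based on the upstream Hadoop FPGA-on-YARN documentation (property names and file placement should be checked against your Hadoop version; DominantResourceCalculator is set in capacity-scheduler.xml when using the Capacity Scheduler):

```xml
<!-- resource-types.xml: declare the FPGA resource type -->
<property>
  <name>yarn.resource-types</name>
  <value>yarn.io/fpga</value>
</property>

<!-- capacity-scheduler.xml: use the DominantResourceCalculator so that
     non-memory resources such as FPGAs participate in scheduling -->
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
```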
Use Hadoop ACLs to control access permissions
1. HDFS Access Control
In hdfs-site.xml, enable ACLs at startup.
In core-site.xml, set the default permissions for users and groups.
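A sketch of those two settings (the property names are the standard Hadoop ones; the umask value is illustrative and depends on your policy):

```xml
<!-- hdfs-site.xml: enable HDFS ACLs -->
<property>
  <name>dfs.namenode.acls.enabled</name>
  <value>true</value>
</property>

<!-- core-site.xml: default permission mask for newly created files/dirs -->
<property>
  <name>fs.permissions.umask-mode</name>
  <value>027</value>
</property>
```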
The requirements and solutions are as follows:
1. Apart from the data warehouse owner, normal users cannot create databases or tables in the default database. The …
Use Sqoop to import MySQL data into Hadoop
The installation and configuration of Hadoop will not be covered here. Sqoop installation is also very simple. After Sqoop is installed, you can test whether it can connect to MySQL (note: the MySQL JDBC jar must be placed under SQOOP_HOME/lib): sqoop list-databases --connect jdbc:mysql://192.168.1.109:3…
The previous article described the various streaming parameters.
Example of submitting a hadoop task:
$HADOOP_HOME/bin/hadoop streaming \
  -input /user/test/input -output /user/test/output \
  -mapper "mymapper.sh" -reducer "myreducer.sh" \
  -file /home/work/mymapper.sh \
  -file /home/work/myreducer.sh \
  -jobconf mapred.job.name="file-demo"
The preceding command submits a …
(1) Set JAVA_HOME:
export JAVA_HOME=/usr/lib/jvm/java-8u5-sun
If you forget where JAVA_HOME points, check it with: echo $JAVA_HOME
(2) Add the bin folder under the JDK directory to the PATH environment variable:
export PATH=$JAVA_HOME/bin:$PATH
(3) Add HADOOP_CLASSPATH to the environment variables:
export HADOOP_CLASSPATH=$JAVA_HOME/lib/tools.jar
Compile the WordCount.java file:
./bin/hadoop com.sun.tools.javac.Main WordCount.java
where com.sun.tools.javac.Main is …
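For reference, the core logic that the classic WordCount example implements can be sketched in plain Java (no Hadoop dependencies; this only shows the per-word counting that the map and reduce phases distribute across the cluster):

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of the counting logic behind the classic WordCount example.
// In MapReduce, the map phase emits (word, 1) pairs and the reduce phase
// sums them; here both steps are collapsed into one loop.
public class WordCountSketch {
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum); // "reduce": sum the 1s
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords("hello hadoop hello world"));
    }
}
```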
A successful login looks like: mysql -h172.16.77.15 -uroot -p123 (that is, mysql -h host_address -u username -p password).
View character sets: show variables like '%char%';
To modify the character set: vi /etc/my.cnf and add default-character-set=utf8 under [client].
Set up passwordless sudo. To give the aboutyun user passwordless sudo permissions:
chmod u+w /etc/sudoers
aboutyun ALL=(root) NOPASSWD:ALL
chmod u-w /etc/sudoers
Test: sudo ifconfig
Ubuntu, view the service list:
sudo service --status-all
sudo initctl list
To view the file s…
When dealing with complex business logic in Hadoop, a composite key is often needed. Unlike a simple key, which just implements the Writable interface, a composite key implements WritableComparable. From the source:
public interface WritableComparable<T> extends Writable, Comparable<T> {}
public interface Writable {
  void write(DataOutput out) throws IOException;
  void readFields(DataInput in) throws IOException;
}
public interface Comparable<T> { int compareTo(T o); }
The following is …
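As a self-contained illustration of the pattern, here is a composite key pairing a string and an int. The interface below is a local stand-in with the same shape as Hadoop's WritableComparable, so the sketch compiles without Hadoop on the classpath; in a real job you would implement org.apache.hadoop.io.WritableComparable instead:

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Local stand-in with the same shape as Hadoop's WritableComparable,
// used only so this sketch compiles without Hadoop on the classpath.
interface WritableComparableSketch<T> extends Comparable<T> {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

public class TextIntPair implements WritableComparableSketch<TextIntPair> {
    String first = "";
    int second;

    public TextIntPair() {}                       // needed for deserialization
    public TextIntPair(String first, int second) {
        this.first = first;
        this.second = second;
    }

    // Serialize both fields, in a fixed order.
    @Override public void write(DataOutput out) throws IOException {
        out.writeUTF(first);
        out.writeInt(second);
    }

    // Deserialize in the same order as write().
    @Override public void readFields(DataInput in) throws IOException {
        first = in.readUTF();
        second = in.readInt();
    }

    // Sort by first field, then by second: this is the ordering the
    // MapReduce shuffle would use to sort keys.
    @Override public int compareTo(TextIntPair o) {
        int cmp = first.compareTo(o.first);
        return cmp != 0 ? cmp : Integer.compare(second, o.second);
    }
}
```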
EasyReport is an easy-to-use web reporting tool (supporting Hadoop, HBase, and various relational databases). Its main function is to convert the rows and columns returned by SQL queries into an HTML table, with support for row spans (rowspan) and column spans (colspan). It also supports exporting reports to Excel, chart display, and fixed headers and left columns. The overall architecture …
How to Use Hadoop MapReduce to implement remote sensing product algorithms with different complexity
The MapReduce model can be divided into single-Reduce mode, multi-Reduce mode, and non-Reduce mode. For index-product production algorithms of different complexity, the appropriate MapReduce computing mode should be selected as needed.
1) low-complexity product production Algorithms
For the production
Recently I used Vagrant to build a Hadoop cluster of 3 hosts, managed with Cloudera Manager. Initially I virtualized 4 hosts on my laptop: one runs the Cloudera Manager Server, and the others run the Cloudera Manager Agent. After the machines were running normally, I found that memory consumption was too heavy, so I planned to migrate two of the Agent hosts to another work computer, then use Vagrant …
Background: our company's data processing has two computing frameworks, a standalone framework and an MR framework. I have abstracted a set of API interfaces for business-computing developers to use; the execution scheduling behind the API is implemented separately on the two frameworks. Application developers can adjust business calculation parameters by uploading an override configuration file. The standalone framework is easy to implement, bu…
Address: http://blog.cloudera.com/blog/2013/04/how-to-use-vagrant-to-set-up-a-virtual-hadoop-cluster/
Vagrant is a very useful tool for creating and managing multiple virtual machines (VMs) on a single physical machine. It supports VirtualBox natively and provides plug-ins for VMware Fusion and Amazon EC2 virtual machine clusters.
Vagrant provides an easy-to-
@Override
protected void setup(Context context) throws IOException, InterruptedException {
    // TODO auto-generated method stub
    super.setup(context);
    URI[] cacheFiles = context.getCacheFiles();
    Path tagSetPath = new Path(cacheFiles[0]);
    Path tagedUrlPath = new Path(cacheFiles[1]);
    // file operations (such as reading the contents into a Set or Map)
}

@Override
public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
    // use the data read out in setup()
}

Similarly, if you want to read the distributed cache fil…