System environment
Linux version: Red Hat 6
JDK: jdk1.7
1. Local Installation and Testing
1.1 Installation
1.1.1 Download the Drill M1 binary release
http://people.apache.org/~jacques/apache-drill-1.0.0-m1.rc3/apache-drill-1.0.0-m1-binary-release.tar.gz
1.1.2 Unpack apache-drill-1.0.0-m1-binary-release.tar.gz and create a symbolic link
tar -zxf apache-drill-1.0.0-m1-binary-release.tar.gz
Create the link:
ln -s apache-drill-1.0.0-m1 drill
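The unpack-and-link step can be wrapped in a small function so it is repeatable. This is only a sketch: install_drill is a hypothetical helper name, and it assumes the tarball unpacks into a directory called apache-drill-1.0.0-m1, as the link command above implies.

```shell
#!/usr/bin/env bash
# Sketch: unpack the Drill tarball and create a version-independent "drill" link.
# install_drill is a hypothetical helper, not something shipped with Drill.
install_drill() {
  local tarball="$1"   # path to apache-drill-1.0.0-m1-binary-release.tar.gz
  local dest="$2"      # directory to install into, e.g. /home/{username}

  # Extract into the destination directory; stop if extraction fails.
  tar -zxf "$tarball" -C "$dest" || return 1

  # -f replaces an existing link, -n avoids descending into an existing "drill" link.
  ln -sfn apache-drill-1.0.0-m1 "$dest/drill"
}
```

For example, install_drill apache-drill-1.0.0-m1-binary-release.tar.gz /home/{username} leaves /home/{username}/drill pointing at the unpacked release.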
1.1.3 Configure environment variables
export DRILL_HOME=/home/{username}/drill
export PATH=$PATH:$DRILL_HOME/bin
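To make these variables survive a new login, they can be appended to a shell startup file such as .bashrc. The sketch below is one way to do that without adding duplicate entries on repeated runs; add_drill_env is a hypothetical helper name, and the choice of startup file is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: persist DRILL_HOME and PATH in a startup file, at most once.
# add_drill_env is a hypothetical helper, not part of Drill.
add_drill_env() {
  local rc_file="$1"      # e.g. "$HOME/.bashrc"
  local drill_home="$2"   # e.g. /home/{username}/drill

  # Already configured: do nothing, so repeated runs stay idempotent.
  grep -q '^export DRILL_HOME=' "$rc_file" 2>/dev/null && return 0

  {
    echo "export DRILL_HOME=$drill_home"
    # Single quotes keep $PATH and $DRILL_HOME unexpanded until the file is sourced.
    echo 'export PATH=$PATH:$DRILL_HOME/bin'
  } >> "$rc_file"
}
```

After calling add_drill_env "$HOME/.bashrc" /home/{username}/drill, running source ~/.bashrc makes the settings active in the current shell.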
1.2 Test
1.2.1 Connection
[sudo] sqlline -u jdbc:drill:schema=parquet-local -n admin -p admin
Explanation: five values are predefined for the schema parameter:
parquet-local (local Parquet), parquet-cp (classpath Parquet), jsonl (local JSON), json-cp (classpath JSON), parquet
For the exact definitions, see conf/storage-engines.json.
1.2.2 Exit
jdbc:drill:schema=parquet-local> !q
1.2.3 Run a query
SELECT * FROM "sample-data/region.parquet";
Query syntax references:
https://developers.google.com/bigquery/query-reference
https://cwiki.apache.org/confluence/display/DRILL/Running+Queries
2. Distributed Installation and Testing
2.1 Installation
2.1.1 Install Hadoop
The Hadoop version that the current Drill release supports natively is Hadoop 1.2. For installation steps, see:
http://litongbupt.iteye.com/blog/1473179
http://litongbupt.iteye.com/blog/1473265
Start Hadoop
2.1.2 Install ZooKeeper
The official site recommends ZooKeeper 3.4.3; in the author's tests, 3.4.5 also works.
Deploy and start ZooKeeper:
http://litongbupt.iteye.com/admin/blogs/1987737
2.1.3 Deploy Drill in distributed mode
Modify the conf/drill-override.conf file:
zk.connect: "{ZooKeeper address}:2181"
Modify the conf/storage-engines.json file:
"parquet":
{
  "type": "parquet",
  "dfsName": "hdfs://{Hadoop NameNode address}:9000"
},
"json":
{
  "type": "json",
  "dfsName": "hdfs://{Hadoop NameNode address}:9000"
}
Copy the drill directory to the other nodes.
Copy .bashrc to the other nodes.
Start Drill: run sudo drillbit.sh start on each node.
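With more than a couple of nodes, the copy-and-start steps are easier to run as a loop. The following is a rough sketch rather than a tested deployment script: deploy_drill and run are hypothetical helper names, the node names in the usage line are placeholders, and it assumes passwordless SSH between nodes and the same directory layout on each. It defaults to a dry run that only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: copy Drill and .bashrc to other nodes, then start a drillbit on each.
# deploy_drill and run are hypothetical helpers; node names are placeholders.

# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

deploy_drill() {
  local drill_home="$1"   # e.g. /home/{username}/drill
  shift
  local node
  for node in "$@"; do
    # Copy the Drill directory (assumes the target path does not exist yet).
    run scp -r "$drill_home" "$node:$drill_home"
    # Copy the environment settings.
    run scp "$HOME/.bashrc" "$node:.bashrc"
    # -t allocates a terminal so sudo can prompt for a password if needed.
    run ssh -t "$node" "sudo $drill_home/bin/drillbit.sh start"
  done
}
```

Calling deploy_drill /home/{username}/drill node2 node3 prints the commands for review; setting DRY_RUN=0 before the call executes them.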
2.2 Test
2.2.1 Check whether the Drill cluster started successfully
zkCli.sh -server {ZooKeeper address}:2181
get /drill/drillbits1
cZxid = 0x100000003
ctime = Tue Dec 10:18:42 CST 2013
mZxid = 0x100000003
mtime = Tue Dec 10:18:42 CST 2013
pZxid = 0x10000001c
cversion = 12
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 4
In this test numChildren = 4, that is, four Drill nodes have registered under /drill/drillbits1.
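The same check can be scripted by pulling the numChildren value out of the zkCli.sh output. The sketch below assumes zkCli.sh is on the PATH and accepts the command as trailing arguments; count_drillbits is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: extract the numChildren value from "get" output read on stdin.
# count_drillbits is a hypothetical helper, not part of ZooKeeper or Drill.
count_drillbits() {
  # Lines look like "numChildren = 4"; match the field name case-insensitively.
  awk -F' = ' 'tolower($1) == "numchildren" { print $2 }'
}

# Usage (not executed here):
#   zkCli.sh -server {ZooKeeper address}:2181 get /drill/drillbits1 2>/dev/null | count_drillbits
```

Comparing the printed number with the number of nodes where drillbit.sh was started shows whether every drillbit registered.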
2.2.2 Test Query
Put the data on HDFS (the queries below read it from /sample-data):
hadoop fs -put sample-data/ /
Connect to the cluster:
sqlline -u jdbc:drill:schema=parquet
SELECT _MAP['R_REGIONKEY'] AS region_key, _MAP['R_NAME'] AS name, _MAP['R_COMMENT'] AS comment FROM "/sample-data/region.parquet";
SELECT COUNT(DISTINCT _MAP['N_REGIONKEY']) FROM "/sample-data/nation.parquet";
SELECT _MAP['N_REGIONKEY'] AS regionkey, _MAP['N_NAME'] AS name FROM "/sample-data/nation.parquet" WHERE CAST(_MAP['N_NAME'] AS varchar) < 'M';
2.3 Shut down the cluster
2.3.1 Shut down the Drill cluster
Run sudo drillbit.sh stop on each node.
2.3.2 Shut down ZooKeeper
Run sudo zkServer.sh stop on each node.
2.3.3 Shut down Hadoop (run on the NameNode)
sudo stop-all.sh