Presto, from Facebook, is a recently released interactive distributed query engine that supports simple SQL and is reportedly 10 times faster than MapReduce. Visit the Presto homepage to learn more about it.
It is best to give Presto a dedicated data directory. This directory stores logs and local metadata, and keeping it separate from the installation directory makes upgrades easier.
Configure Presto
Node properties: configuration specific to each node, meaning that every node has its own values.
JVM config: the command-line options for the Java Virtual Machine, one option per line. These options are not interpreted by the shell, so options that contain spaces or special characters should not be quoted (see the OnOutOfMemoryError option). Because an OutOfMemoryError usually leaves the JVM in an inconsistent state, the configuration writes a heap dump for debugging and then kills the process. Presto compiles each query to bytecode, so it generates a lot of classes; this is why the permanent generation size is raised and class unloading is enabled. floatingdecimal-0.1.jar from Presto's lib directory is added to the boot classpath because it improves performance when parsing floats, for example when Hive stores float values as text.
Config properties: configuration for the Presto server. Every Presto server can act as both coordinator and worker, but on a larger cluster, dedicating a single machine to coordinator work gives the best performance. The main properties are:
datasources: the catalog types that may be used, such as hive and jmx.
http-server.http.port: the HTTP port, used for both internal and external communication.
presto-metastore.db.filename: the location of the H2 database file that stores the metastore. This is used mainly by features that are still in development. It should only be needed on the coordinator, but the current version requires it on workers as well.
task.max-memory: the maximum memory used by a single task. This limits GROUP BY, the right-hand side of a JOIN, and ORDER BY. My guess is that these operations have synchronization points, so they buffer data in memory. Setting the value too low limits the queries that can run; setting it too high can make the JVM run out of memory.
discovery-server.enabled: Presto uses the Discovery service to find the nodes in the cluster, and each instance registers with the Discovery service at startup. When this is true, the coordinator runs an embedded Discovery service.
discovery.uri: the IP and port of the Discovery server. Because the coordinator runs the Discovery service here, this is simply the coordinator's IP and port.
Log levels: similar to Java logging levels. Setting a logger to DEBUG outputs all of its debug logs.
Catalog properties: Presto accesses data through connectors, which are mounted in catalogs. A connector provides all the schemas and tables in its catalog. For instance, the Hive connector maps each Hive database to a schema, so once the Hive connector is mounted you can access the tables in Hive. Catalogs are registered by adding a properties file to etc/catalog; to register the JMX connector, for example, add the file jmx.properties.
To mount Hive, first create hive.properties in etc/catalog.
Actual installation record:
1. First, as described above, create an etc directory under PRESTO_HOME, then create the file node.properties in it.
Note: node.environment must be the same on every node in a cluster, node.id must be unique for each node, and node.data-dir is the data directory where logs and data are stored.
On my machine I set:
node.environment=production
node.id=ffffffff-ffff-ffff-ffff-ffffffffffff
node.data-dir=/home/casa/maintenance/presto/data
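Since node.id has to be unique on every node, one way to avoid copying the same value around is to generate it. The following is only a sketch: the function name render_node_properties is made up for illustration, and it assumes that a random UUID is an acceptable node.id (the value above has the same shape).

```python
import uuid


def render_node_properties(environment, data_dir, node_id=None):
    """Return the text of a node.properties file.

    A random UUID is generated when no node_id is given, so that each
    node ends up with a distinct identifier.
    """
    if node_id is None:
        node_id = str(uuid.uuid4())
    lines = [
        "node.environment=" + environment,
        "node.id=" + node_id,
        "node.data-dir=" + data_dir,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The environment name and data directory are the ones used in this post.
    print(render_node_properties("production", "/home/casa/maintenance/presto/data"), end="")
```

Generate the file once per node and keep it; the identifier is meant to stay the same across restarts and upgrades.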
2. JVM config: create the file jvm.config under etc.
Note: if you need to process the float data type, add floatingdecimal-0.1.jar from lib to the boot classpath in jvm.config.
The configuration on my machine is:
-server
-Xmx16G
-XX:+UseConcMarkSweepGC
-XX:+ExplicitGCInvokesConcurrent
-XX:+CMSClassUnloadingEnabled
-XX:+AggressiveOpts
-XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p
-XX:PermSize=150M
-XX:MaxPermSize=150M
-XX:ReservedCodeCacheSize=150M
-Xbootclasspath/p:/home/casa/maintenance/presto/presto-server-0.52/lib/floatingdecimal-0.1.jar
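Because the lines of jvm.config are passed to the JVM without going through a shell, a stray quote or a line that is not an option is easy to miss. The sketch below is a hypothetical helper (check_jvm_config is not part of Presto) that only flags two things: lines that do not start with a dash, and lines that contain quote characters. It does not validate the options themselves.

```python
def check_jvm_config(text):
    """Return a list of (line_number, message) problems found in jvm.config text."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        option = line.strip()
        if not option:
            continue  # this sketch simply skips blank lines
        if not option.startswith("-"):
            problems.append((number, "does not look like a JVM option"))
        if '"' in option or "'" in option:
            problems.append((number, "contains a quote character"))
    return problems


if __name__ == "__main__":
    sample = "-server\n-Xmx16G\n-XX:OnOutOfMemoryError=kill -9 %p\n"
    for number, message in check_jvm_config(sample):
        print("line %d: %s" % (number, message))
```

Note that the OnOutOfMemoryError line contains spaces and is still accepted, since spaces are fine as long as the option is not quoted.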
3. Config properties: create the file config.properties under etc.
Note: because this is a standalone installation, coordinator must be true; datasources must list the connectors you actually use; and the last property, discovery.uri, is your machine's IP and port 8080.
The configuration on my machine is:
coordinator=true
datasources=jmx
http-server.http.port=8080
presto-metastore.db.type=h2
presto-metastore.db.filename=var/db/metastore
task.max-memory=1GB
discovery-server.enabled=true
discovery.uri=http://10.11.1.174:8080
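The three points in the note above can be checked mechanically. This is a rough sketch rather than anything Presto provides: parse_properties and check_standalone_config are hypothetical helpers, and the port check assumes that the embedded Discovery service listens on the same HTTP port as the Presto server.

```python
from urllib.parse import urlparse


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_standalone_config(props):
    """Return a list of problems for a single-node config.properties."""
    problems = []
    if props.get("coordinator") != "true":
        problems.append("coordinator should be true on a standalone node")
    if not props.get("datasources"):
        problems.append("datasources should name at least one connector")
    if props.get("discovery-server.enabled") == "true":
        # Assumption: the embedded Discovery service shares the server's HTTP port.
        uri_port = urlparse(props.get("discovery.uri", "")).port
        if str(uri_port) != props.get("http-server.http.port"):
            problems.append("discovery.uri port should match http-server.http.port")
    return problems
```

Calling check_standalone_config(parse_properties(text)) on the configuration above should return an empty list.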
4. log.properties: set the levels to suit your own needs.
The configuration on my machine is:
# There are four levels: DEBUG, INFO, WARN, ERROR.
com.facebook.presto=DEBUG
5. Catalog directory: create a catalog directory under etc, then add connectors to suit your own setup. For example, to look at the JVM you can add jmx, in which case the corresponding jmx.properties has to be created in catalog.
The configuration on my machine is:
connector.name=jmx
Once the steps above are complete, you can start Presto.
Start the server: PRESTO_HOME/bin/launcher start
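The server can take a moment to come up, so before connecting with the client it can help to wait until the HTTP port accepts connections. The following is a minimal sketch using only the Python standard library; wait_for_port is a hypothetical helper, and it only tests that the TCP port is open, not that Presto has finished initializing.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With the configuration in this post, wait_for_port("localhost", 8080) would be the call to make, since 8080 is the value of http-server.http.port.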
Start the client:
1. First download the executable file presto-cli-0.52-executable.jar.
2. Rename it to presto, make it executable with chmod +x, and put it somewhere convenient.
3. Then run the client: ./presto --server localhost:8080 --catalog jmx --schema default