Learning Hive: HiveServer2 Configuration and Startup

Last Update:2018-07-20 Source: Internet

Author: User

In earlier Hive practice we used the CLI or hive -e, which only allows HiveQL queries, updates, and so on to be run from the command line, a clumsy and limited approach. Fortunately, Hive also provides a lightweight client model: through HiveServer or HiveServer2, a client can work with data in Hive without starting the CLI, and remote clients written in languages such as Java or Python can submit requests to Hive and retrieve the results. Both HiveServer and HiveServer2 are based on Thrift, but only HiveServer is sometimes called the Thrift server; HiveServer2 is not.

Since HiveServer already exists, why is HiveServer2 needed? HiveServer cannot handle concurrent requests from more than one client. This limitation comes from the Thrift interface that HiveServer uses and cannot be removed by changing the HiveServer code. The server was therefore rewritten in Hive 0.11.0 as HiveServer2, which solves the problem. HiveServer2 supports multi-client concurrency and authentication, and provides better support for open API clients such as JDBC and ODBC.

Because HiveServer2 provides more powerful functionality, this article focuses on it, but first takes a quick look at how HiveServer is used. Run hive --service help on the command line, as shown below. The output shows that hive <parameters> --service serviceName <service parameters> starts a specific service such as cli, hiveserver, or hiveserver2.

[hadoop@hadoop ~]$ hive --service help
Usage ./hive <parameters> --service serviceName <service parameters>
Service List: beeline cli help hiveserver2 hiveserver hwi jar lineage metastore metatool orcfiledump rcfilecat schematool version
Parameters parsed:
  --auxpath : Auxillary jars
  --config : Hive configuration directory
  --service : Starts specific service/component. cli is default
Parameters used:
  HADOOP_HOME or HADOOP_PREFIX : Hadoop install directory
  HIVE_OPT : Hive options
For help on a particular service:
  ./hive --service serviceName --help
Debug help:  ./hive --debug --help

Run hive --service hiveserver --help on the command line to view the help information for HiveServer:

[hadoop@hadoop ~]$ hive --service hiveserver --help
Starting Hive Thrift Server
usage: hiveserver
 -h,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
    --maxWorkerThreads <arg>      maximum number of worker threads,
                                  default:2147483647
    --minWorkerThreads <arg>      minimum number of worker threads,
                                  default:100
 -p <port>                        Hive Server port number, default:10000
 -v,--verbose                     Verbose mode

Start the HiveServer service. The output shows that by default HiveServer listens on port 10000, with a minimum of 100 worker threads and a maximum of 2147483647 worker threads.

[hadoop@hadoop ~]$ hive --service hiveserver -v
Starting Hive Thrift Server
14/08/01 11:07:09 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead
Starting hive server on port 10000 with 100 min worker threads and 2147483647 max worker threads
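
The options listed in the help output can be combined on one command line. The following is a minimal sketch, not a recommended setup: the port 10005 and the thread limits 10 and 200 are illustrative assumptions, and START_LEGACY is a hypothetical switch used here so that the script only prints the command unless you opt in. Newer Hive releases may no longer include the hiveserver service at all.

```shell
#!/bin/sh
# Sketch: start the legacy HiveServer on a non-default port with a bounded
# worker thread pool. All three values below are illustrative assumptions.
port=10005
min_threads=10
max_threads=200

# Options taken from the "hive --service hiveserver --help" output above.
set -- --service hiveserver -p "$port" \
       --minWorkerThreads "$min_threads" --maxWorkerThreads "$max_threads" -v

legacy_cmd="hive $*"
echo "$legacy_cmd"

# START_LEGACY is a hypothetical opt-in flag; without it nothing is started.
if [ "${START_LEGACY:-0}" = "1" ] && command -v hive >/dev/null 2>&1; then
  hive "$@"
fi
```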

Next, we turn to the more powerful HiveServer2. HiveServer2 is configured through the configuration file hive-site.xml, with the following parameters:

   hive.server2.thrift.min.worker.threads – minimum number of worker threads, default is 5.
   hive.server2.thrift.max.worker.threads – maximum number of worker threads, default is 500.
   hive.server2.thrift.port – TCP port to listen on, default is 10000.
   hive.server2.thrift.bind.host – TCP host to bind to, default is localhost.
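
As a minimal sketch, the four settings above could be written into a hive-site.xml file like this. The values are the defaults named in the list. The temporary output directory is an assumption made so the script is safe to run anywhere; on a real installation the file normally lives in the Hive configuration directory.

```shell
#!/bin/sh
# Sketch: write the HiveServer2 Thrift settings to a hive-site.xml file.
# The target is a throwaway temp directory (assumption); copy the properties
# into your real hive-site.xml once the values suit your cluster.
conf_dir=$(mktemp -d)
site_file="$conf_dir/hive-site.xml"

cat > "$site_file" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hive.server2.thrift.min.worker.threads</name>
    <value>5</value>
  </property>
  <property>
    <name>hive.server2.thrift.max.worker.threads</name>
    <value>500</value>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <!-- localhost is the default named in the text -->
    <value>localhost</value>
  </property>
</configuration>
EOF

echo "wrote $site_file"
```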

You can also set the environment variables HIVE_SERVER2_THRIFT_BIND_HOST and HIVE_SERVER2_THRIFT_PORT to override the host and port settings in hive-site.xml. Starting with Hive 0.13.0, HiveServer2 also supports sending messages over HTTP, which is particularly useful when a proxy sits between the client and the server. The parameters related to HTTP transport are as follows:

   hive.server2.transport.mode – transport mode, default is binary (TCP); the other possible value is http.
   hive.server2.thrift.http.port – HTTP port to listen on, default is 10001.
   hive.server2.thrift.http.path – endpoint name of the service, default is cliservice.
   hive.server2.thrift.http.min.worker.threads – minimum number of worker threads in the server pool, default is 5.
   hive.server2.thrift.http.max.worker.threads – maximum number of worker threads in the server pool, default is 500.
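
From the client side, the transport mode mostly shows up in the JDBC connection URL. The sketch below builds one URL for each mode from the default ports and endpoint listed above. The host, the database name, and the CONNECT switch are assumptions for illustration. The transportMode and httpPath URL parameters are the form documented for newer Hive releases; older releases used a different form, so check the documentation for your version.

```shell
#!/bin/sh
# Sketch: build HiveServer2 JDBC URLs for both transport modes.
# hs2_host and hs2_db are placeholders (assumptions), not values from the article.
hs2_host="localhost"
hs2_db="default"

# Binary (TCP) mode: hive.server2.thrift.port, default 10000.
binary_url="jdbc:hive2://${hs2_host}:10000/${hs2_db}"

# HTTP mode: hive.server2.thrift.http.port (default 10001) and
# hive.server2.thrift.http.path (default cliservice).
http_url="jdbc:hive2://${hs2_host}:10001/${hs2_db};transportMode=http;httpPath=cliservice"

echo "$binary_url"
echo "$http_url"

# CONNECT is a hypothetical opt-in flag. Connecting needs Beeline on the PATH
# and a HiveServer2 instance already running in HTTP mode.
if [ "${CONNECT:-0}" = "1" ] && command -v beeline >/dev/null 2>&1; then
  beeline -u "$http_url" -e 'SHOW DATABASES;'
fi
```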

There are two ways to start HiveServer2: one is hive --service hiveserver2, following the pattern described above, and the other, more concise, is simply hiveserver2. Use hive --service hiveserver2 -H or hive --service hiveserver2 --help to view the help information:

Starting HiveServer2
Unrecognized option: -H
usage: hiveserver2
 -h,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
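
Putting the startup command together with the overrides described earlier, a start script might look like the sketch below. The port 10010, the log file path, and the START_HS2 switch are assumptions for illustration; only the two environment variable names and the --hiveconf option come from the text above.

```shell
#!/bin/sh
# Sketch: start HiveServer2 with host/port overrides.
# Port 10010 and the log path are illustrative assumptions.
export HIVE_SERVER2_THRIFT_BIND_HOST="localhost"
export HIVE_SERVER2_THRIFT_PORT="10010"

# Further properties can be passed with --hiveconf, the option shown in the
# help output.
set -- --hiveconf hive.server2.thrift.min.worker.threads=5 \
       --hiveconf hive.server2.thrift.max.worker.threads=500

hs2_cmd="hiveserver2 $*"
echo "would run: $hs2_cmd"

# START_HS2 is a hypothetical opt-in flag; without it nothing is started.
if [ "${START_HS2:-0}" = "1" ] && command -v hiveserver2 >/dev/null 2>&1; then
  nohup hiveserver2 "$@" > /tmp/hiveserver2.log 2>&1 &
  echo "HiveServer2 started in the background, pid $!"
fi
```

Running the server in the background with nohup keeps it alive after the shell exits; a production deployment would more likely use a service manager.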

By default, HiveServer2 executes a query as the user who submitted it (hive.server2.enable.doAs is true). If hive.server2.enable.doAs is set to false, the query runs as the user running the HiveServer2 process. To prevent memory leaks in non-secure mode, you can disable the file system caches by setting the following parameters to true:

   fs.hdfs.impl.disable.cache – disables the HDFS file system cache, default is false.
   fs.file.impl.disable.cache – disables the local file system cache, default is false.
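
The three settings from this section can be expressed as configuration properties. The sketch below prints them as an XML fragment; emit_property and emit_fragment are hypothetical helpers defined only for this example, and placing the fs.* entries in hive-site.xml rather than in a Hadoop configuration file is an assumption that may not match every deployment.

```shell
#!/bin/sh
# Sketch: print configuration entries for impersonation and file system caching.
# emit_property and emit_fragment are hypothetical helpers, not part of Hive.
emit_property() {
  printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

emit_fragment() {
  echo '<configuration>'
  # true is the default: queries run as the submitting user.
  # false would run them as the user that owns the HiveServer2 process.
  emit_property hive.server2.enable.doAs true
  # true disables the HDFS and local file system caches.
  emit_property fs.hdfs.impl.disable.cache true
  emit_property fs.file.impl.disable.cache true
  echo '</configuration>'
}

fragment=$(emit_fragment)
printf '%s\n' "$fragment"
```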

This article is an English version of an article originally published in Chinese on aliyun.com.