Hive Learning: HiveServer2 Configuration and Startup

Source: Internet
Author: User

In earlier exercises we worked with Hive through the CLI or hive -e, which only lets you run HiveQL queries and updates from a local shell. That is a clumsy, single-purpose way to work. Fortunately, Hive also provides a lightweight client path: through HiveServer or HiveServer2, a client can operate on data in Hive without starting the CLI, and remote clients written in languages such as Java or Python can submit requests to Hive and retrieve the results. Both HiveServer and HiveServer2 are based on Thrift, but only HiveServer is sometimes called the Thrift server; HiveServer2 is not.

If HiveServer already exists, why is HiveServer2 needed? HiveServer cannot handle concurrent requests from more than one client. This limitation comes from the Thrift interface that HiveServer uses and cannot be removed by changing the HiveServer code. The server was therefore rewritten in Hive 0.11.0 as HiveServer2, which solves the problem. HiveServer2 supports multi-client concurrency and authentication, and provides better support for open API clients such as JDBC and ODBC.

Because HiveServer2 is the more capable of the two, this article concentrates on it, but it first takes a quick look at how HiveServer is used. Run hive --service help on the command line, as shown below. The output shows that hive <parameters> --service serviceName <service parameters> starts a specific service such as cli, hiveserver, or hiveserver2.

[hadoop@hadoop ~]$ hive --service help
Usage ./hive <parameters> --service serviceName <service parameters>
Service List: beeline cli help hiveserver2 hiveserver hwi jar lineage metastore metatool orcfiledump rcfilecat schemaTool version
Parameters parsed:
  --auxpath : Auxillary jars
  --config : Hive configuration directory
  --service : Starts specific service/component. cli is default
Parameters used:
  HADOOP_HOME or HADOOP_PREFIX : Hadoop install directory
  HIVE_OPT : Hive options
For help on a particular service:
  ./hive --service serviceName --help
Debug help:  ./hive --debug --help

Run hive --service hiveserver --help on the command line to view the help information for HiveServer:

[hadoop@hadoop ~]$ hive --service hiveserver --help
Starting Hive Thrift Server
usage: hiveserver
 -h,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
    --maxWorkerThreads <arg>      maximum number of worker threads,
                                  default:2147483647
    --minWorkerThreads <arg>      minimum number of worker threads,
                                  default:100
 -p <port>                        Hive Server port number, default:10000
 -v,--verbose                     Verbose mode

Now start the HiveServer service. The output confirms that, by default, HiveServer listens on port 10000 with a minimum of 100 worker threads and a maximum of 2147483647 worker threads.

[hadoop@hadoop ~]$ hive --service hiveserver -v
Starting Hive Thrift Server
14/08/01 11:07:09 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead
Starting hive server on port 10000 with 100 min worker threads and 2147483647 max worker threads
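
To confirm from another shell that the server is really accepting connections on the port reported above, a plain TCP connection attempt is enough. The following is only a minimal sketch: port_is_open is a hypothetical helper written for this article, and the host and port in the usage line are assumptions taken from the defaults in the transcript.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper, not part of Hive. It only proves that something is
    listening on the port, not that the listener is a Hive server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__":
    # Assumes the server runs locally on the default port 10000.
    print(port_is_open("localhost", 10000))
```

A successful connection says nothing about authentication or the Thrift protocol itself; it is just a quick liveness check before trying a real client.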

Next, we turn to the more powerful HiveServer2. HiveServer2 is configured in the hive-site.xml configuration file, using the following parameters:

   hive.server2.thrift.min.worker.threads – minimum number of worker threads, default is 5.
   hive.server2.thrift.max.worker.threads – maximum number of worker threads, default is 500.
   hive.server2.thrift.port – TCP listening port, default is 10000.
   hive.server2.thrift.bind.host – host that the TCP interface binds to, default is localhost.
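
The four parameters above go into hive-site.xml as <property> elements, each holding a <name> and a <value>. As a rough sketch of what that file content looks like, the script below generates it from a dictionary. render_hive_site and HS2_THRIFT_SETTINGS are hypothetical names invented for this example, and the values are simply the defaults quoted in the list.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Defaults as quoted in the list above; adjust them for your cluster.
HS2_THRIFT_SETTINGS = {
    "hive.server2.thrift.min.worker.threads": "5",
    "hive.server2.thrift.max.worker.threads": "500",
    "hive.server2.thrift.port": "10000",
    "hive.server2.thrift.bind.host": "localhost",
}


def render_hive_site(settings):
    """Return hive-site.xml text with one <property> element per setting.

    Hypothetical helper for this article, not part of Hive.
    """
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    raw = ET.tostring(root, encoding="unicode")
    # minidom is used only to indent the output for readability.
    return minidom.parseString(raw).toprettyxml(indent="  ")


if __name__ == "__main__":
    print(render_hive_site(HS2_THRIFT_SETTINGS))
```

Generating the file is optional; the same <property> entries can be typed into hive-site.xml by hand.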

You can also set the environment variables HIVE_SERVER2_THRIFT_BIND_HOST and HIVE_SERVER2_THRIFT_PORT to override the host and port settings in hive-site.xml. Starting with Hive 0.13.0, HiveServer2 supports sending Thrift messages over HTTP, which is particularly useful when a proxy sits between the client and the server. The parameters related to HTTP transport are as follows:

  hive.server2.transport.mode – default value is binary (TCP); the alternative value is http.
  hive.server2.thrift.http.port – HTTP listening port, default is 10001.
  hive.server2.thrift.http.path – endpoint name of the service, default is cliservice.
  hive.server2.thrift.http.min.worker.threads – minimum number of worker threads in the server pool, default is 5.
  hive.server2.thrift.http.max.worker.threads – maximum number of worker threads in the server pool, default is 500.
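
The transport mode also changes how a JDBC client addresses the server. The sketch below builds a connection URL for either mode. hs2_jdbc_url is a hypothetical helper, the host name is a placeholder, and the transportMode and httpPath URL parameters are an assumption based on the commonly documented Hive JDBC URL layout, so check them against the driver documentation for your Hive version.

```python
def hs2_jdbc_url(host, port=None, database="default",
                 transport_mode="binary", http_path="cliservice"):
    """Build a HiveServer2 JDBC URL for binary (TCP) or HTTP transport.

    Hypothetical helper; the URL layout is an assumption to verify against
    the JDBC driver documentation for your Hive version.
    """
    if transport_mode not in ("binary", "http"):
        raise ValueError("transport_mode must be 'binary' or 'http'")
    if port is None:
        # Default ports quoted in the text: 10000 for TCP, 10001 for HTTP.
        port = 10001 if transport_mode == "http" else 10000
    url = f"jdbc:hive2://{host}:{port}/{database}"
    if transport_mode == "http":
        # http_path should match hive.server2.thrift.http.path on the server.
        url += f";transportMode=http;httpPath={http_path}"
    return url


if __name__ == "__main__":
    print(hs2_jdbc_url("hs2.example.com"))
    print(hs2_jdbc_url("hs2.example.com", transport_mode="http"))
```

The point of the example is that the client-side values have to mirror the server-side settings: the port and endpoint path in the URL must match what hive-site.xml configures.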

There are two ways to start HiveServer2: hive --service hiveserver2, as described above, or the more concise hiveserver2 command. Use hive --service hiveserver2 -H or hive --service hiveserver2 --help to view the help information. In the output below, -H is reported as an unrecognized option, but the usage text is still printed:

Starting HiveServer2
Unrecognized option: -H
usage: hiveserver2
 -h,--help                        Print help information
    --hiveconf <property=value>   Use value for given property

By default, HiveServer2 executes a query as the user who submitted it (hive.server2.enable.doAs is true). If hive.server2.enable.doAs is set to false, the query runs as the user running the HiveServer2 process. To prevent memory leaks in non-secure mode, you can disable the file system caches by setting the following parameters to true:

   fs.hdfs.impl.disable.cache – disables the HDFS file system cache, default is false.
   fs.file.impl.disable.cache – disables the local file system cache, default is false.
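
Settings like these can be placed in the site configuration, or, as the --hiveconf <property=value> option in the help output suggests, passed on the command line at startup. The sketch below assembles such a command line. hiveserver2_command and OVERRIDES are hypothetical names for this example, and it assumes that the fs.* properties are accepted through --hiveconf in the same way as Hive's own properties, which is worth verifying on your installation.

```python
import shlex

# Example overrides: run queries as the server user and disable both caches.
OVERRIDES = {
    "hive.server2.enable.doAs": "false",
    "fs.hdfs.impl.disable.cache": "true",
    "fs.file.impl.disable.cache": "true",
}


def hiveserver2_command(overrides):
    """Return the argument list for starting hiveserver2 with overrides.

    Hypothetical helper: it only builds the command and does not run it.
    """
    argv = ["hiveserver2"]
    for name, value in overrides.items():
        # Each override becomes one "--hiveconf property=value" pair.
        argv.extend(["--hiveconf", f"{name}={value}"])
    return argv


if __name__ == "__main__":
    # Print a shell-safe version of the command for copy and paste.
    print(" ".join(shlex.quote(arg) for arg in hiveserver2_command(OVERRIDES)))
```

Building the argument list separately from running it makes it easy to review the exact flags before starting the server, for example by passing the list to subprocess.run once it looks right.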
