Impala Configuration and Error Resolution

Installation Environment

Impala 2.1.0 corresponds to CDH 5.3.0.
Impala is a CDH component. With the rest of the Hadoop environment (HDFS, YARN, Hive) already in place, Impala can be installed directly through yum; the packages are available from the Impala download page.

Installation content:
The installation user is root.
hdname (the node where the Hive metastore resides):
impala, impala-server, impala-state-store, impala-catalog, impala-shell
Other nodes:
impala-server, impala-shell (a yum install sketch follows)
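A minimal installation sketch, assuming the yum repository from the download address above has already been configured on every node (package names follow the list above and may differ for other CDH releases):

    # on hdname (Hive metastore node): all Impala components
    yum install -y impala impala-server impala-state-store impala-catalog impala-shell

    # on the other nodes: only the server and the shell
    yum install -y impala-server impala-shell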

Permission configuration (apply on all machines)

A user and group named impala are created during the Impala installation; do not delete this user or group.
If you want Impala and YARN to work together, the impala user must be added to the hdfs group; this is related to the Llama project.
When Impala performs a DROP TABLE operation, the dropped files must be moved to the HDFS trash, so you need to create the HDFS directory /user/impala and make it writable by the impala user. Similarly, Impala needs to read the data under the Hive data warehouse, so the impala user must also be added to the hive group. The supplementary groups are added with the following command:

    usermod -G hive,hdfs,hadoop impala
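The group membership can be verified with, for example:

    # the output should list hive, hdfs and hadoop among the impala user's groups
    id impala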

The result is as shown in the figure.
Create a directory for Impala on HDFS and set its permissions

    sudo -u hdfs hadoop fs -mkdir /user/impala
    sudo -u hdfs hadoop fs -chown impala /user/impala
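As a quick check, the ownership of the new directory can be listed:

    # confirm that /user/impala is now owned by the impala user
    sudo -u hdfs hadoop fs -ls /user | grep impala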
Set the socket path

Create /var/run/hadoop-hdfs on each node:

    mkdir -p /var/run/hadoop-hdfs

Note: the folder may already exist; confirm that the impala user can read and write it.
If it already exists, add the impala user to the group that owns the directory and grant the group write permission: chmod 775 /var/run/hadoop-hdfs
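A minimal per-node sketch of that check (the owning group is an assumption and may differ on your cluster):

    # see which user and group own the socket directory
    ls -ld /var/run/hadoop-hdfs
    # if it is owned by, e.g., hdfs:hdfs, the impala user (already in the hdfs group)
    # only needs group read/write access
    chmod 775 /var/run/hadoop-hdfs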

MySQL driver

Download the driver: mysql-connector-java-5.1.30.tar.gz
Copy the downloaded jar to /usr/share/java/ and rename it mysql-connector-java.jar.
This path is used because it is Impala's default; see the MYSQL_CONNECTOR_JAR parameter in /etc/default/impala.
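A sketch of the copy step, assuming the tarball was downloaded to the current directory (the jar name inside the archive is assumed to follow the usual versioned naming):

    tar -zxf mysql-connector-java-5.1.30.tar.gz
    cp mysql-connector-java-5.1.30/mysql-connector-java-5.1.30-bin.jar \
       /usr/share/java/mysql-connector-java.jar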

# Configuration file setup
The configuration files exist in two places.
## /etc/default/impala
This file holds Impala's default configuration: host information and the related path settings, including the JDBC driver and the locations of all Impala configuration files. What needs to be modified is the host information for the two services, state-store and catalog. The final result is shown below; the highlighted parts (marked with ==) are the modifications. hdname is the host name of the machine where Impala's catalog and state-store components are installed, i.e. the machine where the metadata MySQL database resides.

    ==IMPALA_CATALOG_SERVICE_HOST=hdname==
    ==IMPALA_STATE_STORE_HOST=hdname==
    IMPALA_STATE_STORE_PORT=24000
    IMPALA_BACKEND_PORT=22000
    IMPALA_LOG_DIR=/var/log/impala

    IMPALA_CATALOG_ARGS=" -log_dir=${IMPALA_LOG_DIR} ==-state_store_host=${IMPALA_STATE_STORE_HOST}== "
    IMPALA_STATE_STORE_ARGS=" -log_dir=${IMPALA_LOG_DIR} -state_store_port=${IMPALA_STATE_STORE_PORT}"
    IMPALA_SERVER_ARGS=" \
        -log_dir=${IMPALA_LOG_DIR} \
        -catalog_service_host=${IMPALA_CATALOG_SERVICE_HOST} \
        -state_store_port=${IMPALA_STATE_STORE_PORT} \
        -use_statestore \
        -state_store_host=${IMPALA_STATE_STORE_HOST} \
        -be_port=${IMPALA_BACKEND_PORT}"

    ENABLE_CORE_DUMPS=true

    # LIBHDFS_OPTS=-Djava.library.path=/usr/lib/impala/lib
    # MYSQL_CONNECTOR_JAR=/usr/share/java/mysql-connector-java.jar
    # IMPALA_BIN=/usr/lib/impala/sbin
    # IMPALA_HOME=/usr/lib/impala
    # HIVE_HOME=/usr/lib/hive
    # HBASE_HOME=/usr/lib/hbase
    # IMPALA_CONF_DIR=/etc/impala/conf
    # HADOOP_CONF_DIR=/etc/impala/conf
    # HIVE_CONF_DIR=/etc/impala/conf
    # HBASE_CONF_DIR=/etc/impala/conf

## /etc/impala/conf
The location of this directory is actually determined by /etc/default/impala; in some versions it may be under /usr/lib/impala/conf.
The three core configuration files (hive-site.xml, hdfs-site.xml and core-site.xml) are copied from the existing HDFS and Hive configuration directories; if an hbase-site.xml exists, copy it as well, for example as sketched below.
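A minimal copy sketch, assuming the standard CDH configuration directories /etc/hadoop/conf, /etc/hive/conf and /etc/hbase/conf (adjust the paths to your installation):

    cp /etc/hadoop/conf/core-site.xml /etc/impala/conf/
    cp /etc/hadoop/conf/hdfs-site.xml /etc/impala/conf/
    cp /etc/hive/conf/hive-site.xml   /etc/impala/conf/
    # optional, only if HBase is used
    cp /etc/hbase/conf/hbase-site.xml /etc/impala/conf/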
The resulting contents of /etc/impala/conf are as shown in the figure.

Modify the file hdfs-site.xml and add the following to it:

<property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
</property>
<property>
    <name>dfs.domain.socket.path</name>
    <value>/var/run/hadoop-hdfs/dn._PORT</value>
</property>
<property>
    <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
    <value>true</value>
</property>
<property>
    <name>dfs.client.file-block-storage-locations.timeout</name>
    <value>10000</value>
</property>
Configuration file synchronization

Sync the modified /etc/default/impala file and the /etc/impala/conf folder to all impala-server nodes, for example as sketched below.
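A minimal sync sketch, assuming passwordless SSH as root; node2 and node3 are placeholders for the other impala-server hosts:

    for node in node2 node3; do
        scp /etc/default/impala   ${node}:/etc/default/impala
        scp -r /etc/impala/conf/  ${node}:/etc/impala/
    done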

Launch Impala

Confirm that all services have been started, including Hive and Impala. The running Impala services can be checked with the command:

    ps -ef | grep impala

If everything is normal, the following three services are running on hdname (impalad, statestored and catalogd), as shown in the figure.
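If any of them are missing, they can be started with the service scripts installed by the packages above; a sketch, assuming the standard CDH init script names:

    # on hdname
    service impala-state-store start
    service impala-catalog start
    service impala-server start

    # on the other nodes
    service impala-server start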

The relevant services can also be accessed by URL through a browser:

    hostname:25010    state-store web UI; by default shows the node information
    hostname:25000    impalad server information; accessible on any node where impala-server is installed
    hostname:25020    catalog information
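A quick way to confirm that the web UIs respond, using hdname as an example host:

    # each should print 200 if the corresponding daemon is up
    curl -s -o /dev/null -w "%{http_code}\n" http://hdname:25000/
    curl -s -o /dev/null -w "%{http_code}\n" http://hdname:25010/
    curl -s -o /dev/null -w "%{http_code}\n" http://hdname:25020/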

Log in to the Impala terminal on any node with the command impala-shell, and connect to hdname with connect hdname;.
Execute the command invalidate metadata to refresh the metadata.
The result is as shown in the figure.
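A sketch of what the session roughly looks like (the prompt text is approximate and depends on the Impala version):

    $ impala-shell
    [Not connected] > connect hdname;
    Connected to hdname:21000
    [hdname:21000] > invalidate metadata;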
Error resolution reference

Error 1

Path permission issue.
Error connecting: TTransportException, Could not connect to master:21000
Check the impalad.ERROR log file under /var/log/impala; the errors are as follows:

ERROR: short-circuit local reads is disabled because
  - dfs.domain.socket.path is not configured.
  - dfs.client.read.shortcircuit is not enabled.


Workaround
Check the value of each parameter mentioned in the error, verify that the configured path exists, and confirm that the impala user has read and write permission on it (see the sketch below).
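A minimal check sketch, assuming the socket path configured earlier:

    # confirm the path configured in hdfs-site.xml
    grep -A1 dfs.domain.socket.path /etc/impala/conf/hdfs-site.xml
    # confirm the directory exists and its permissions allow impala to use it
    ls -ld /var/run/hadoop-hdfs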

Error 2

Running invalidate metadata to refresh the metadata fails. Error output:

Query: invalidate metadata
ERROR: Couldn't open transport for hdname:26000 (connect() failed: Connection refused)

The log shows the following error: the block location information on the DataNode cannot be read.

I0507 10:03:36.218281 21562 BlockStorageLocationUtil.java:177] Failed to query block locations on datanode 192.168.73.16:50020: org.apache.hadoop.ipc.RemoteException(java.lang.UnsupportedOperationException): Datanode#getHdfsBlocksMetadata is not enabled in datanode config
        at org.apache.hadoop.hdfs.server.datanode.DataNode.getHdfsBlocksMetadata(DataNode.java:1547)

Workaround
Note: the error mentions port 26000, even though nothing in the configuration explicitly sets a port to 26000 (26000 is the default catalog service port). The error also only occurs after logging into the shell and connecting successfully, which indicates that it is related to the metadata (catalog) service.
Workaround: in /etc/default/impala, add the parameter -state_store_host=${IMPALA_STATE_STORE_HOST} to the IMPALA_CATALOG_ARGS configuration item.
The result is as shown in the figure; a reconstruction of the modified line follows.
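The modified line in /etc/default/impala, reconstructed from the highlighted configuration shown earlier:

    IMPALA_CATALOG_ARGS=" -log_dir=${IMPALA_LOG_DIR} -state_store_host=${IMPALA_STATE_STORE_HOST} "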

This parameter tells the catalog service on which host to find the state store and hence the metadata; note that the state-store related arguments elsewhere in the file also set the state_store_host parameter.
