Sqoop Import MySQL Database garbled

Source: Internet
Author: User
Tags sqoop

First, the code in MySQL

mysql> show variables like  ' collation_% '; +----------------------+-------------------+|  variable_name | value |+----------------------+-------------------+| collation_ connection | latin1_swedish_ci | |  collation_database | latin1_swedish_ci | |  collation_server | latin1_swedish_ci |+----------------------+-------------------+3  rows in set  (0.00 sec) mysql> show variables like  ' character_set_% '; +--------------------------+----------------------------+| variable_name | value |+------- -------------------+----------------------------+| character_set_client | latin1 | |  character_set_connection | latin1 | |  character_set_database | latin1 | |  character_set_filesystem | binary | |  character_set_results | latin1 | |  character_set_server | latin1 | |  character_set_system | utf8 | |  character_sets_dir | /usr/share/mysql/charsets/ |+--------------------------+------------ ----------------+8 rows in set  (0.00 SEC)

The original default is Latin1 encoding, will result in Chinese garbled.

You can set the encoding in MySQL, a single setting
mysql> ALTER DATABASE name character set UTF8;
Mysql> set Character_set_connection=utf8;

Query OK, 0 rows Affected (0.00 sec)
Mysql> set Character_set_connection=utf8;
Query OK, 0 rows Affected (0.00 sec)
Mysql> set Character_set_results=utf8;
Query OK, 0 rows Affected (0.00 sec)
Mysql> set Character_set_server=utf8;
Query OK, 0 rows Affected (0.00 sec)
However, it will fail after a reboot.

You can modify the configuration file:

[[email protected] ~]# Vi/etc/my.cnf[mysql]default-character-set=utf8[client]default-character-set=utf8[mysqld] default-character-set=utf8character_set_server=utf8init_connect= ' Set NAMES UTF8 '

Restart MySQL so that the default encoding is UTF8

[Email protected] ~]# service mysqld restart

See if it becomes UTF8:

Mysql> \s--------------mysql ver 14.12 distrib 5.0.95, for  redhat-linux-gnu  (x86_64)  using readline 5.1Connection id: 2Current  Database: toplistscurrent user: [email protected]ssl: not in usecurrent  pager: stdoutUsing outfile:  ' using delimiter: ; server version: 5.0.95 source distributionprotocol version: 10connection:  localhost via unix socketserver characterset: utf8db characterset:  Utf8client characterset: utf8conn. characterset: utf8unix socket: /var/lib/mysql /mysql.sockuptime: 39 secthreads: 1 questions: 12 slow queries: 0  Opens: 15 Flush tables: 1 Open tables: 9 Queries per  second avg: 0.308--------------mysql> show variables like  "char%", +--------------------------+----------------------------+| variable_name |  value |+--------------------------+----------------------------+| character_set_client |  utf8 | |  character_set_connection | utf8 | |  character_set_database | utf8 | |  character_set_filesystem | binary | |  character_set_results | utf8 | |  character_set_server | utf8 | |  character_set_system | utf8 | |  character_sets_dir | /usr/share/mysql/charsets/ |+--------------------------+------------ ----------------+8 rows in set  (0.00 sec) mysql> show variables like   "colla%", +----------------------+-----------------+| variable_name | value |+------- ---------------+-----------------+| collation_connection | utf8_general_ci | |  collation_database | utf8_general_ci | |  collation_server | utf8_general_ci |+----------------------+-----------------+3 rows  in set  (0.00 SEC)

change the MySQL encoding to UTF8 and specify the encoding as UTF8 on the command line, and perform an export error:

 [[email protected] ~]$ sqoop export --connect  "jdbc:mysql://Hadoop48/ Toplists?useunicode=true&characterencoding=utf-8 " -m 1 --table award --export-dir  /user/hive/warehouse/hive_myaward/000000_0 --input-null-string  "\\\\n"  -- input-null-non-string  "\\\\n"  --input-fields-terminated-by  "\\01"  -- input-lines-terminated-by  "\\n" 12/07/20 13:17:22 info mapred. jobclient: task id : attempt_201207191159_0233_m_000000_0, status :  failedjava.io.ioexception: java.sql.sqlexception: incorrect string value:  ' \xE6\x9D\ X80\xe7\xa0\xb4 '  for column  ' Nick '  at row 1 

Originally I created the table default encoding is not UTF8, also need to change a bit.

Mysql> ALTER TABLE Award CONVERT to CHARACTER SET UTF8 COLLATE utf8_general_ci; Query OK, 0 rows Affected (0.00 sec) records:0 duplicates:0 warnings:0

re-export

 [[email protected] ~]$ sqoop export --connect  "jdbc:mysql://Hadoop48/ Toplists?useunicode=true&characterencoding=utf-8 " -m 1 --table award --export-dir  /user/hive/warehouse/hive_myaward/000000_0 --input-null-string  "\\\\n"  -- input-null-non-string  "\\\\n"  --input-fields-terminated-by  "\\01"  -- input-lines-terminated-by  "\\n" ... 12/07/20 14:03:10 info mapred. Jobclient: map 0% reduce 0%12/07/20 14:03:24 info mapred. jobclient: map 100% reduce 0% ... 12/07/20 14:03:29 info mapreduce. Exportjobbase: exported 2 record 
Checklist mysql> select * from award;+-------------------------------+-----------+-----------+ ------+-----------+-----------+--------+------+-------------+-----------------+---------+| rowkey |  productid | matchid | rank | tourneyid | userid |  gameid | gold | loginid | nick | plat |+------------------------- ------+-----------+-----------+------+-----------+-----------+--------+------+-------------+-----------------+- --------+| 2012-04-27 06:55:00:402713629 | 5947 | 433203828 | 2 |  4027102 | 402713629 | 1001 | NULL | 715878221 |  Kill Day A  | ios | |  2012-04-27 06:55:00:406788559 | 778 | 433203930 | 19 |  4017780 | 406788559 | 1001 | 1 | 13835155880 |  Pro 牛牛旦旦  | android |+-------------------------------+-----------+-----------+------+-----------+----------- +--------+------+-------------+-----------------+---------+2 rows in set  (0.00 sec)


Sqoop Import MySQL Database garbled

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.