MySQL Cookbook Reading notes the 5th Chapter

Source: Internet
Author: User

1, String property

View the systems that have those character sets:

If you need to store from multiple languages into the same column, you will consider the Unicode character set (UTF8 or UCS2), only it can represent multiple language characters

Some character sets support multibyte, some contain only one byte, and the method to determine if multibyte is supported is to determine if there are multiple bytes in the string by comparing the return values of length () H and Char_length functions. For example, the byte length with UCS2 is 6 and the number of characters is 3.

In addition, although the Unicode character set UTF8 contains multibyte characters, a specific string may contain only a single-byte word

Another characteristic of a non-binary string is collation, which determines the order in which characters are sorted in the character set.

Collation with the name Ci,cs or bin end are case insensitive, case sensitive, and binary.

How collation is affected by sorting:

Case insensitive (AAA,AAA and bbb,bbb are different in relative order):

Case sensitive:

Binary collation:

2, select the data type of the string

When the column values of binary and char data types are shorter, they are filled to meet the required length, and the padding value is 0x00

The Varbinary,varchar and Blob,text types do not need to add or remove padding values:

If you need to store UTF8 (Unicode) and Sjis (Japanese) strings, you can define them in the following ways:

CREATE TABLE myTB1 (

Utf8data varchar (character) set UTF8 collation utf8_danish_ci,

Sjisdata varchar (character) set Sjis collation Sjis_japanese_ci

);

3. Set the character set of the client connection correctly

A, change the option configuration file:

[MySQL]

Default-character-set=utf8

b, the SET NAMES statement is executed after the connection is established

Set Names UTF8

Specify the collation of the connection:

Set names ' UTF8 ' collate ' utf8-general_cl '

C, programming interfaces provide their own way to set character sets

4. Check the character set or character sort of a string

Use the CharSet () and collation () functions.

When the configuration changes, getting the character set and collation properties from the current configuration also changes.

5, changing character set and character ordering of strings

Use the CONVERT () function to change the character set of a string

Use the COLLATE operator to change the collation of a string

To change the character set and collation for both strings, first use convert to change the character set and then use the COLLATE operator to change the sort

6, change the case of the string letter

Use upper () and lower ():

If you change the string only the first character of STR is uppercase, the other parts remain unchanged:

Concat (Upper (left (str,1), Mid (str,2)))

To prevent multiple input of lengthy expressions, you can define a function:

To change the case where there is an error in case:

When these two functions fail, usually because a binary string needs to be converted, after MySQL4.1, the binary string cannot be converted to uppercase and lowercase because there is no default character set, and at this point, to convert, the string must first be converted to use a character set and then used for case conversion.

7,mysql for pattern matching

MySQL provides two pattern matching methods: Based on SQL mode and regular expression based

A, using SQL mode

SQL mode uses the operator like and not to match strings and pattern strings, typically using two matches: _ is used to match any one character, and% is used to match any string (including empty strings).

The not-like is used to match a pattern-matching procedure like the opposite.

b, using regular expressions for pattern-matching

Use RegExp to manipulate wildcards regular expressions, which are common pattern characters:

For example, start with a specific substring:

End with a specific substring:

Contains a specific substring:

A particular substring appears in a position (the third position in the first number):

Using regular expressions also enables many features that are not available in MySQL mode, such as matching a character set and any character, and MySQL also supports the use of the POSIX character set to define regular expressions:

The regular expressions in MySQL and vi,linux are basically the same, need to study hard, the last time I went to China Telecom Games, there is the use of regular expressions to match the message format of the topic so this area to spend more time.

To define a selective match:

Match the name string that begins with Aeiou or ends with ER:

Match a string consisting entirely of numbers or entirely of letters:

You can see that the result is not correct, because the relationship of precedence matches a string that begins with a number or ends with a character. You need to use parentheses to group matching options:

Case sensitivity in pattern matching is also a matter of changing the character set (not binary) and collation to match whether it is sensitive. The general collation is case insensitive at the end of CI, and the CS end is case sensitive.

8, splitting or concatenating strings

Use the Substring-extraction function to get substrings, concatenate strings using the concat () function

Common substring-extraction functions include the Left () function, the right () function, and the mid () function.

The substring () function returns the entire substring from the specified start to the right, and if mid () omits the third argument, its execution results are the same as substring ()

Substring_index (str,c,n) returns the entire substring to the left or right of any specified character in Str, the left side of Str looks for the position where the character C appears, and returns the entire substring to the left of that position, and if n is negative, the right side is the same treatment.

Use the concat () function to concatenate strings:

9, Query substring

How do I determine whether a string contains a substring? The locate () function returns the position of the substring in the first occurrence of the string. The third parameter specifies where to start the lookup.

Locate () also requires collation support to determine case sensitivity

10, use fulltext query

How do I query large amounts of text? Using the fulltext Index

When the amount of data is large, queries that use pattern matching can get a structure, but are inefficient, and can use pattern matching to query the same string from multiple columns, but with less efficiency.

For large amounts of text or multiple columns, you can use Fulltext instead of pattern matching. To use a fulltext query, you first add a Fulltext index to the table, and then use the match operator to query the columns that are indexed. Fulltext can be used for non-binary type strings in the MyISAM table

A, create a table KJV

b, from the already processed TXT file, import the data into the table KJV

C, use Fulltext Index to prepare to use Fulltext to retrieve

D, if you want to know how many times a name appears in Kjv.txt, you only need to use the match () function to specify the index and use the against () function to define the text to query

or query details:

If you frequently include other non-fulltext columns in a standard query, you can improve the related query performance by adding a rule index to those columns.

Change: ALTER TABLE KJV Add index (bnum), add index (cnum), add index (VNUM);

If you want to use Fulltext to query multiple columns at the same time,

ALTER TABLE Tbl_name ADD fulltext (col1, col2, col3)

When querying, select ... from tbl_name where match (col1, col2, col3) against (' search string ');

11, fulltext query with a phrase

Text such as "God" in KJV text is important, but if you fulltext query for these words in the KJV table, you will find that the query is not

The reason is that the index engine will be slightly too common for words (that is, words that appear in more general lines), to query for this time the phrase needs to use SQL pattern matching to count the number of lines that contain these two words

or modify the value of Ft_min_word_len in the configuration file to change it.

Use Boolean to search for words

When you use the following query, the rows that contain the name David or Goliath are queried:

What we want to query is a row containing these two names, which can be resolved as follows: using and federated query criteria:

Another more convenient way is to use the Boolean pattern:

+ indicates compliance with certain words,-means exclusion of certain words, * acts as a wildcard

Use double quotes to enclose phrases in Boolean mode to support phrase search

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.