This is a creation in Article, where the information may have evolved or changed.
This article goes from Golove Blog: http://www.cnblogs.com/golove/p/3273585.html
Functions and methods in a Unicode package
Latters.go
Const (
Maxrune = ' \u0010ffff '//Unicode code point maximum Value
Replacementchar = ' \ufffd '//Represents an invalid Unicode code point
Maxascii = ' \u007f '//ASCII point maximum value
MaxLatin1 = ' \u00ff '//Latin-1 code point max
)
------------------------------------------------------------
Determine if the character R is within the Rangtab table range
Func is (Rangetab *rangetable, R rune) bool
Func Main () {
S: = "Hello world!" "
For _, R: = Range S {
Determine if a character is Chinese
If Unicode. Is (Unicode. scripts["Han"], R) {
Fmt. Printf ("%c", R)//World
}
}
}
------------------------------------------------------------
Determines whether the character R is in uppercase format
Func isupper (R rune) bool
Func Main () {
S: = "Hello abc! "
For _, R: = Range S {
Determines whether a character is uppercase
If Unicode. Isupper (R) {
Fmt. Printf ("%c", R)//HABC
}
}
}
------------------------------------------------------------
Determines whether the character r is in lowercase format
Func islower (R rune) bool
Func Main () {
S: = "Hello abc! "
For _, R: = Range S {
If Unicode. Islower (R) {
Fmt. Printf ("%c", R)//ELLOABC
}
}
}
------------------------------------------------------------
Determines whether the character R is a Unicode-defined Title character
The Title format of most characters is its uppercase format
Only a few characters of the Title format are special characters
The special characters are judged here.
Func Istitle (R rune) bool
Func Main () {
S: = "helloᾇᾗᾧ! "
For _, R: = Range S {
If Unicode. Istitle (R) {
Fmt. Printf ("%c", R)//ᾇᾗᾧ
}
}
}
Output Unicode-defined Title character
Func Main () {
For _, CR: = Range Unicode. lt.r16 {
For I: = cr. Lo; I <= Cr. Hi; i + = cr. Stride {
Fmt. Printf ("%c", i)
}
Fmt. Println ("")
}
}
------------------------------------------------------------
To converts the character R to the specified format
_case value: Uppercase, lowercase, titlecase
Func to (_case int, R rune) Rune
Func Main () {
S: = "Hello world!" "
For _, R: = Range S {
Fmt. Printf ("%c", Unicode. to (Unicode. Uppercase, R))
}//HELLO world!
For _, R: = Range S {
Fmt. Printf ("%c", Unicode. to (Unicode. lowercase, R))
}//Hello world!
For _, R: = Range S {
Fmt. Printf ("%c", Unicode. to (Unicode. Titlecase, R))
}//HELLO world!
}
------------------------------------------------------------
ToUpper Convert character R to uppercase format
Func ToUpper (R rune) Rune
Func Main () {
S: = "Hello world!" "
For _, R: = Range S {
Fmt. Printf ("%c", Unicode. ToUpper (R))
}//HELLO world!
}
------------------------------------------------------------
ToLower convert character r to lowercase format
Func ToLower (R rune) Rune
Func Main () {
S: = "Hello world!" "
For _, R: = Range S {
Fmt. Printf ("%c", Unicode. ToLower (R))
}//Hello world!
}
------------------------------------------------------------
Totitle Convert the character R to the Title format
The Title format of most characters is its uppercase format
Only a few characters of the Title format are special characters
Func Totitle (R rune) Rune
Func Main () {
S: = "Hello world!" "
For _, R: = Range S {
Fmt. Printf ("%c", Unicode. Totitle (R))
}//HELLO world!
}
------------------------------------------------------------
Specialcase is a mapping table in a specific locale, such as "Turkey"
Specialcase's method set customizes the standard mappings
Type Specialcase []caserange
------------------------------------------------------------
ToUpper convert R to uppercase format
Prioritize using the specified mapping table special
Func (Special Specialcase) ToUpper (R rune) Rune
Func Main () {
S: = "Hello world!" "
For _, R: = Range S {
Fmt. Printf ("%c", Unicode. Specialcase (Unicode. Caseranges). ToUpper (R))
}//HELLO world!
}
------------------------------------------------------------
ToLower convert R to lowercase format
Prioritize using the specified mapping table special
Func (Special Specialcase) ToLower (R rune) Rune
Func Main () {
S: = "Hello world!" "
For _, R: = Range S {
Fmt. Printf ("%c", Unicode. Specialcase (Unicode. Caseranges). ToLower (R))
}//Hello world!
}
------------------------------------------------------------
Totitle convert R to Title format
Prioritize using the specified mapping table special
Func (Special Specialcase) Totitle (R rune) Rune
Func Main () {
S: = "Hello world!" "
For _, R: = Range S {
Fmt. Printf ("%c", Unicode. Specialcase (Unicode. Caseranges). Totitle (R))
}//HELLO world!
}
------------------------------------------------------------
Simplefold iterates through Unicode, looking for the "next" character equivalent to R,
"Next" means: The value of the code point is greater than R and closest to the R character
If there is no "next" character, start looking for a character that is equivalent to r from scratch
"Quite" means: the same character has different wording in different situations.
These characters are quite different in wording.
This function is implemented by querying the Caseorbit table
//
For example:
Simplefold (' a ') = ' a '
Simplefold (' a ') = ' a '
//
Simplefold (' k ') = ' k '
Simplefold (' k ') = ' \u212a ' (Kelvin symbol: k)
Simplefold (' \u212a ') = ' K '
//
Simplefold (' 1 ') = ' 1 '
//
Func Simplefold (R rune) Rune
Func Main () {
Fmt. Printf ("%c\n", Unicode. Simplefold (' Φ '))//φ
Fmt. Printf ("%c\n", Unicode. Simplefold (' φ '))//ϕ
Fmt. Printf ("%c\n", Unicode. Simplefold (' Φ '))//φ
}
============================================================
Digit.go
------------------------------------------------------------
IsDigit to determine if R is a decimal numeric character
Func isdigit (R rune) bool
Func Main () {
S: = "Hello 123123!" "
For _, R: = Range S {
Fmt. Printf ("%c =%v\n", R, Unicode.) IsDigit (R))
}//123123 = True
}
============================================================
Graphic.go
------------------------------------------------------------
Isgraphic determines whether the character R is a "graphic character"
"Graphic characters" include letters, markers, numbers, punctuation, symbols, spaces
They correspond to the L, M, N, P, S, Zs categories respectively
These categories are rangetable types that store the corresponding categories of character ranges
Func isgraphic (R rune) bool
Func Main () {
S: = "Hello world!" \ t "
For _, R: = Range S {
Fmt. Printf ("%c =%v\n", R, Unicode.) Isgraphic (R))
}//\ t = False
}
------------------------------------------------------------
Isprint determines whether the character R is a "printable character" defined by Go
Printable characters include letters, markers, numbers, punctuation, symbols, and ASCII spaces
They correspond to L, M, N, P, S category and ASCII spaces, respectively
"Printable characters" and "graphic characters" are basically the same, except that
"Printable character" contains only ASCII spaces in the Zs category (u+0020)
Func isprint (R rune) bool
Func Main () {
S: = "Hello world!" \ t "
For _, R: = Range S {
Fmt. Printf ("%c =%v\n", R, Unicode.) Isprint (R))
}//full-width space = false \ t = False
}
------------------------------------------------------------
Isoneof to determine if R is within set table range
Func isoneof (Set []*rangetable, R Rune) bool
Func Main () {
S: = "Hello world!" "
Set table to "Kanji, punctuation"
Set: = []*unicode. Rangetable{unicode. Han, Unicode. P
For _, R: = Range S {
Fmt. Printf ("%c =%v\n", R, Unicode.) Isoneof (set, R))
}//World! = True
}
------------------------------------------------------------
Iscontrol to determine if R is a control character
Unicode class C contains more characters, such as proxy characters
Use is (C, R) to test them
Func Iscontrol (R rune) bool
Func Main () {
S: = "hello\n\t world!" "
For _, R: = Range S {
Fmt. Printf ("%c =%v\n", R, Unicode.) Iscontrol (R))
}//\n\t = True
}
------------------------------------------------------------
Isletter to determine if R is an alphabetic character (category L)
Kanji is also an alphabetic character
Func Isletter (R rune) bool
Func Main () {
S: = "hello\n\t world!" "
For _, R: = Range S {
Fmt. Printf ("%c =%v\n", R, Unicode.) Isletter (R))
}//Hello World = True
}
------------------------------------------------------------
Ismark to determine if R is a mark character (category M)
Func Ismark (R rune) bool
Func Main () {
S: = "hello៉៊់៌៍! "
For _, R: = Range S {
Fmt. Printf ("%c =%v\n", R, Unicode.) Ismark (R))
}//៉៊់៌៍= True
}
Output all Mark characters
Func Main () {
For _, CR: = Range Unicode. m.r16 {
For I: = UInt16 (cr. Lo); I >= Cr. Lo && I <= cr. Hi; i + = UInt16 (cr. Stride) {
Fmt. Printf ("%c =%v\n", I, Unicode.) Ismark (Rune (i)))
}
Fmt. Println ("")
}
}
------------------------------------------------------------
Isnumber to determine if R is a numeric character (category N)
Func isnumber (R rune) bool
Func Main () {
S: = "Hello 123123!" "
For _, R: = Range S {
Fmt. Printf ("%c =%v\n", R, Unicode.) Isnumber (R))
}//123123 = True
}
------------------------------------------------------------
Ispunct to determine if R is a punctuation character (category P)
Func ispunct (R rune) bool
Func Main () {
S: = "Hello world!" "
For _, R: = Range S {
Fmt. Printf ("%c =%v\n", R, Unicode.) Ispunct (R))
} // ! = True
}
------------------------------------------------------------
IsSpace to determine if R is a white space character
In the Latin-1 character set, the white space character is: \ t, \ n, \v, \f, \ r,
Space, u+0085 (NEL), u+00a0 (NBSP)
Other whitespace characters are defined as "category Z" and "Pattern_white_space attribute"
Func IsSpace (R rune) bool
Func Main () {
S: = "Hello \ t world!" \ n "
For _, R: = Range S {
Fmt. Printf ("%c =%v\n", R, Unicode.) IsSpace (R))
}//space \ t full-width space \ n = True
}
------------------------------------------------------------
Issymbol to determine if R is a symbolic character
Func Issymbol (R rune) bool
Func Main () {
S: = "Hello (< = bounded >)"
For _, R: = Range S {
Fmt. Printf ("%c =%v\n", R, Unicode.) Issymbol (R))
}//<=> = True
}