Converting the lowercase English letters in a string to uppercase is a simple operation. Each character's encoding can be treated as an integer; in Go, `byte` and `rune` are aliases for `uint8` and `int32`. Other languages are similar. In the encoding table, `A` to `Z` come first, followed later by `a` to `z`. `A` corresponds to the integer 65 and `a` to 97. The gap between them is 32: the 26 uppercase letters plus the 6 other characters that sit between `Z` and `a`. So the conversion is to subtract 32 from the value of a lowercase letter.
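The arithmetic above can be checked with a few lines of Go. This is only a minimal sketch; `upperASCII` is a hypothetical helper name introduced for illustration, not something from the standard library.

```go
package main

import "fmt"

// upperASCII is a hypothetical helper: it subtracts 32 from an ASCII
// lowercase letter and returns any other byte unchanged.
func upperASCII(c byte) byte {
	if 'a' <= c && c <= 'z' {
		return c - ('a' - 'A') // 'a' - 'A' == 97 - 65 == 32
	}
	return c
}

func main() {
	fmt.Println('A', 'a', 'a'-'A')      // 65 97 32
	fmt.Printf("%c\n", upperASCII('g')) // G
}
```

Writing the offset as `'a' - 'A'` instead of the literal 32 keeps the intent visible in the code.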
There is also the question of encoding. The earliest encoding was ASCII, with 128 characters; 8-bit extensions later increased this to 256. Later encodings such as Unicode and GBK are compatible with the original ASCII, that is, the first 128 code points are the same across these encodings. So for converting English letters to uppercase, encoding is not a problem.
Go's standard library already implements this conversion. The `strings` package wraps it as `ToUpper`, and the concrete implementation lives in the `letter.go` file of the `unicode` package.
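```go
package main

import "fmt"

// MaxASCII is the largest ASCII code point.
const MaxASCII = '\u007F'

// toUpper maps an ASCII lowercase letter to uppercase
// and returns every other rune unchanged.
func toUpper(r rune) rune {
	if r <= MaxASCII {
		if 'a' <= r && r <= 'z' {
			r -= 'a' - 'A'
		}
		return r
	}
	return r
}

// ToUpper returns a copy of s with ASCII lowercase letters upper-cased.
func ToUpper(s []rune) (res []rune) {
	for i := 0; i < len(s); i++ {
		res = append(res, toUpper(s[i]))
	}
	return res
}

func main() {
	a := "Hello, 世界"
	fmt.Println(string(ToUpper([]rune(a))))
}
```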
Go lets you view a string in two ways: as UTF-8 bytes or as Unicode code points. The corresponding data types are `byte` and `rune`. `[]byte(str)` converts a string into its UTF-8 bytes, and `[]rune(str)` converts it into its Unicode code points (a `rune` is a 32-bit code point, not a UTF-16 unit). The string being processed may contain Chinese, so it is handled as runes. Other characters, such as Chinese ones, are left unchanged, which is why the `if r <= MaxASCII` filter was added.
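The difference between the two views shows up in their lengths. In this small sketch, `counts` is a hypothetical helper introduced only for illustration.

```go
package main

import "fmt"

// counts is a hypothetical helper: it returns the length of s
// when viewed as UTF-8 bytes and when viewed as runes.
func counts(s string) (bytes, runes int) {
	return len([]byte(s)), len([]rune(s))
}

func main() {
	b, r := counts("Hello, 世界")
	fmt.Println(b, r) // 13 9
}
```

The seven ASCII characters take one byte each, while each of the two Chinese characters takes three bytes in UTF-8 but is a single rune.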
Note also that Go implements strings at the language level, as the `string` type. Unlike a character array in C, a Go string is immutable: its contents cannot be modified by index, only read by index.
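A common way around this is to copy the string into a slice, modify the slice, and convert it back. The following is a minimal sketch under that assumption; `upperFirst` is a hypothetical helper and only handles an ASCII first letter.

```go
package main

import "fmt"

// upperFirst is a hypothetical helper: it upper-cases the first byte of s
// if that byte is an ASCII lowercase letter.
func upperFirst(s string) string {
	b := []byte(s) // the conversion copies; the original string is untouched
	if len(b) > 0 && 'a' <= b[0] && b[0] <= 'z' {
		b[0] -= 'a' - 'A'
	}
	return string(b)
}

func main() {
	s := "hello"
	fmt.Println(s[0]) // 104: reading by index is allowed
	// s[0] = 'H'     // would not compile: strings are immutable
	fmt.Println(upperFirst(s), s) // Hello hello
}
```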
Please refer to the complete source code in this article.
###### References
+ [1] Source file src/pkg/strings/strings.go - The Go Programming Language
+ [2] ASCII - Wikipedia
+ [3] Source file src/pkg/unicode/letter.go - The Go Programming Language
Original link: Converting the characters in a string from lowercase to uppercase. Please credit the source when reposting.