. NET to Java phonetic components

Source: Internet
Author: User

PS: 4 years, self-feeling. NET to the bottleneck, and the company did not use it in depth. NET technology projects, self-learning feel also not too much motivation (please scold me lazy t_t). Plus the technical years go up, learned. NET career to improve the environment is more and more difficult (personal understanding, Jet will not spray, sprayed on me also no impact, haha ha). Then began to learn Java technology half a month ago. The company just set up a Java development team to apply for the transfer. The company has many components. NET environment, for pure Java Siege Lion, for. NET code is also a headache, so this kind of task throws one to me.  
    • Features of the component:
because of the business needs of the company, for the Chinese name of the uncommon words (for common GB2312, not GBK), need to be identified and converted to pinyin. For more than one Chinese name, the entire name pinyin string appears duplicate name, need prompt. first, say the algorithm:GB2312 has its own coding table and Pinyin Index table (I am looking for online), and then can be based on their own design of the search algorithm to create a corresponding array of resources. The first-level characters of the 16~55 area of GB2312 are sorted by pinyin. The first level of Chinese characters into a byte array, according to the algorithm to locate the previous resource array, you can find the corresponding pinyin. The level two characters of the 56~87 district of GB2312 are not used in pinyin sorting, they can be specially processed, and compatible algorithms can be designed (so the resource array needs to be designed by itself). for uncommon characters:more than GB2312 Chinese characters in the location of the rare word (of course, the byte array is still two-bit), the current use is directly imported GBK in the rare word information, made an indexer, and then index matching.
    • Different points:
Chinese character string to byte array :          . NET, Chinese character strings use GB2312 and GBK encoding, and for uncommon characters, the resulting bytes are two-bit byte array information in GBK. But in Java, what you get with GB2312 is a "? "Byte in ASCII code, only use GBK to get to the corresponding two-byte array. (After all, GBK is completely compatible with GB2312) unsigned integer :because an index resource is used, an unsigned integer is used when the associated index is evaluated.           . NET has its own definition of unsigned integers, and there is no unsigned integer in Java, so the need for the operation of the ascending bit, as well as the storage (UInt16 corresponds to int,uint32 corresponding to long). byte Array (two bytes) to non-symbolic shaping :          . NET has a direct approach. It is important to note that there are some methods for turning, low in front, and high in the back. You need to write the appropriate method in Java. It is important to note the conversion of high and low bits, and the representation of uint in Java. file read :because of the need to read uncommon characters, the file format has turned into UTF-8. In Windows systems, MS will add a BOM (byte order mark) to the head of a non-ASCII encoded file, and a byte sequence mark. Java when reading a file containing a BOM, reading the first character will appear one, may affect the subsequent processing of the program. The solution is also relatively simple, with similar notepad++ software into a non-BOM encoding can be.
    • Summarize:
The syntax is roughly the same, but Java does not have that much built-in syntax, and many of the bottom layers need to be written by themselves (the company does not allow a third-party class library to be referenced). feeling. NET although very convenient, but if not active, do not have to understand a lot of computer underlying things. (In fact, I am lazy t_t)Java needs to understand a lot of the underlying knowledge, to correct programming, although more painful, but the personal feel need to know a lot of things. feel the introductory words. NET is really a good language, but if there is no initiative, Java is very helpful for programmers to know about program-related knowledge.

. NET to Java phonetic components

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.