LeetCode: UTF-8 Validation

Source: Internet
Author: User

A character in UTF-8 can be from 1 to 4 bytes long, subject to the following rules:

For a 1-byte character, the first bit is 0, followed by its Unicode code.
For an n-byte character, the first n bits are all ones, the (n+1)th bit is 0, followed by n-1 bytes whose most significant 2 bits are 10.

This is how the UTF-8 encoding works:

   Char. number range  |        UTF-8 octet sequence
      (hexadecimal)    |              (binary)
   --------------------+---------------------------------------------
   0000 0000-0000 007F | 0xxxxxxx
   0000 0080-0000 07FF | 110xxxxx 10xxxxxx
   0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx
   0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx

Given an array of integers representing the data, return whether it is a valid UTF-8 encoding.

Note: the input is an array of integers. Only the least significant 8 bits of each integer are used to store the data. This means each integer represents only 1 byte of data.

Example 1:
data = [197, 130, 1], which represents the octet sequence: 11000101 10000010 00000001.
Return true. It is a valid UTF-8 encoding for a 2-byte character followed by a 1-byte character.

Example 2:
data = [235, 140, 4], which represents the octet sequence: 11101011 10001100 00000100.
Return false. The first 3 bits are all ones and the 4th bit is 0, which means it is a 3-byte character. The next byte is a continuation byte that starts with 10, and that is correct. But the second continuation byte does not start with 10, so the sequence is invalid.
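To check the octet sequences quoted in the two examples, it can help to print each integer as eight binary digits. The following is only an illustrative sketch: the class and method names (OctetDump, toOctet, toOctets) are hypothetical and not part of the problem, and it assumes the example inputs are 197, 130, 1 and 235, 140, 4, which is what the quoted octet sequences decode to.

```java
final class OctetDump {
    // Render the least significant 8 bits of value as an 8-character binary string.
    static String toOctet(int value) {
        String bits = Integer.toBinaryString(value & 0xFF);
        // Left-pad with zeros so every byte is shown with exactly 8 digits.
        return String.format("%8s", bits).replace(' ', '0');
    }

    // Render a whole array as space-separated octets.
    static String toOctets(int[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(toOctet(data[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toOctets(new int[] {197, 130, 1})); // 11000101 10000010 00000001
        System.out.println(toOctets(new int[] {235, 140, 4})); // 11101011 10001100 00000100
    }
}
```

Masking with 0xFF mirrors the note in the problem that only the least significant 8 bits of each integer carry data.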

This problem gives the rule for validating a single UTF-8 character, then gives a sequence of bytes and asks whether the whole sequence is a valid UTF-8 encoding. (It took me a long time to understand the question.)

The key to this problem is learning to use the bitwise AND operator (&) with a mask to extract the leading bits of a byte and compare them against the expected prefix.
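As a minimal sketch of that masking idea: AND the byte with a mask that keeps only the leading bits, then compare the result with the expected prefix. The class and method names here (LeadByte, sequenceLength, isContinuation) are hypothetical helpers introduced for illustration, and the sketch only looks at bit prefixes as described in the problem statement.

```java
final class LeadByte {
    // Returns the total length in bytes (1 to 4) of the character that starts
    // with this byte, or -1 if the byte cannot start a character.
    static int sequenceLength(int value) {
        int b = value & 0xFF;                             // keep the low 8 bits only
        if ((b & 0b10000000) == 0b00000000) return 1;     // 0xxxxxxx
        if ((b & 0b11100000) == 0b11000000) return 2;     // 110xxxxx
        if ((b & 0b11110000) == 0b11100000) return 3;     // 1110xxxx
        if ((b & 0b11111000) == 0b11110000) return 4;     // 11110xxx
        return -1;                                        // 10xxxxxx or 11111xxx
    }

    // A continuation byte has the two most significant bits equal to 10.
    static boolean isContinuation(int value) {
        return (value & 0b11000000) == 0b10000000;
    }

    public static void main(String[] args) {
        System.out.println(sequenceLength(197));  // 2, since 197 is 11000101
        System.out.println(isContinuation(130));  // true, since 130 is 10000010
        System.out.println(isContinuation(4));    // false, since 4 is 00000100
    }
}
```

Each mask has one more bit set than the prefix it tests, so the terminating 0 bit of the prefix is checked as well.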

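Number literal notation in Java: prefix a binary literal with 0b (Java 7 and later), an octal literal with a leading 0, and a hexadecimal literal with 0x. (Some other languages, such as Python, write octal as 0o.)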
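For Java specifically, and assuming Java 7 or later, the same mask value can be written in several bases. The short sketch below uses a hypothetical class name (LiteralDemo) and shows that the literals are interchangeable; note that Java marks octal with a leading 0 rather than 0o.

```java
final class LiteralDemo {
    static final int BINARY = 0b11000000;      // binary literal, prefix 0b
    static final int GROUPED = 0b1100_0000;    // underscores are allowed between digits
    static final int OCTAL = 0300;             // octal literal, leading 0
    static final int HEX = 0xC0;               // hexadecimal literal, prefix 0x
    static final int DECIMAL = 192;

    public static void main(String[] args) {
        // All five constants hold the same value.
        System.out.println(BINARY == DECIMAL && GROUPED == DECIMAL
                && OCTAL == DECIMAL && HEX == DECIMAL); // true
    }
}
```

Binary literals tend to be the most readable choice for this problem because each mask lines up visually with the bit patterns in the encoding table.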

public class Solution {
    public boolean validUtf8(int[] data) {
        if (data == null || data.length == 0) return false;
        for (int i = 0; i < data.length; i++) {
            // A value above 255 does not fit in one byte, so reject it.
            if (data[i] > 255) return false;
            // moreChecks is the number of continuation bytes still to check for this char.
            int moreChecks = 0;
            if ((data[i] & 0b10000000) == 0) moreChecks = 0;                   // 0xxxxxxx
            else if ((data[i] & 0b11100000) == 0b11000000) moreChecks = 1;     // 110xxxxx
            else if ((data[i] & 0b11110000) == 0b11100000) moreChecks = 2;     // 1110xxxx
            else if ((data[i] & 0b11111000) == 0b11110000) moreChecks = 3;     // 11110xxx
            else return false;
            for (int j = 1; j <= moreChecks; j++) {
                // The sequence ends before the character is complete.
                if (i + j >= data.length) return false;
                // Every continuation byte must start with 10.
                if ((data[i + j] & 0b11000000) != 0b10000000) return false;
            }
            // Skip the continuation bytes that were just checked.
            i = i + moreChecks;
        }
        return true;
    }
}
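A common alternative formulation, not the solution above, makes a single pass while counting how many continuation bytes are still expected. The sketch below is offered under stated assumptions: the class name Utf8CounterCheck is hypothetical, each input is masked to its low 8 bits instead of being rejected when it exceeds 255, and an empty array is treated as valid. Both of those choices differ from the solution above.

```java
final class Utf8CounterCheck {
    static boolean validUtf8(int[] data) {
        int remaining = 0; // continuation bytes still expected for the current char
        for (int value : data) {
            int b = value & 0xFF; // assumption: use only the low 8 bits
            if (remaining > 0) {
                // Inside a multi-byte character: the byte must start with 10.
                if ((b & 0b11000000) != 0b10000000) return false;
                remaining--;
            } else if ((b & 0b10000000) == 0) {
                remaining = 0;                    // 0xxxxxxx, 1-byte character
            } else if ((b & 0b11100000) == 0b11000000) {
                remaining = 1;                    // 110xxxxx
            } else if ((b & 0b11110000) == 0b11100000) {
                remaining = 2;                    // 1110xxxx
            } else if ((b & 0b11111000) == 0b11110000) {
                remaining = 3;                    // 11110xxx
            } else {
                return false;                     // stray continuation byte or 11111xxx
            }
        }
        // The data must not end in the middle of a character.
        return remaining == 0;
    }

    public static void main(String[] args) {
        System.out.println(validUtf8(new int[] {197, 130, 1})); // true
        System.out.println(validUtf8(new int[] {235, 140, 4})); // false
    }
}
```

The counter removes the inner loop and the manual index adjustment, at the cost of carrying state between iterations.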
