Reading GB2312 (Chinese) encoded files or binary streams on iPhone


Reposted from: http://www.cnblogs.com/likwo/archive/2011/06/26/2090914.html

Any developer who has written file read and write code has run into text encodings: the internationally common ones such as UTF-8, ASCII, and Unicode (UTF-16); the Chinese encodings GB2312 and GBK; the Japanese Shift-JIS; and so on.

Enough preamble. Technology exists to solve real problems, so here are the three questions this article answers:

1. How do I read UTF-8-encoded text files?

2. How do I read GB2312 (Chinese) text files?

3. How do I read files in other encodings?


Let's start with the first question.

1. How do I read UTF-8-encoded text files?

  NSString *filePath = [[[NSBundle mainBundle] bundlePath] stringByAppendingPathComponent:fileName];
  NSString *textFile = [NSString stringWithContentsOfFile:filePath encoding:NSUTF8StringEncoding error:nil];

  // Alternatively, load the raw bytes first and decode them:
  // NSData *data = [NSData dataWithContentsOfFile:filePath];
  // NSString *textFile = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
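The call above passes nil for the error parameter, so a missing file and a file that is not valid UTF-8 both just produce nil. As a minimal sketch (the helper name ReadUTF8File is hypothetical and not part of the original article), the same call can capture an NSError to report why reading failed:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: reads a UTF-8 text file and logs the reason on failure.
// Returns nil if the file cannot be read or is not valid UTF-8.
static NSString *ReadUTF8File(NSString *path) {
    NSError *error = nil;
    NSString *text = [NSString stringWithContentsOfFile:path
                                               encoding:NSUTF8StringEncoding
                                                  error:&error];
    if (text == nil) {
        NSLog(@"Could not read %@: %@", path, error.localizedDescription);
    }
    return text;
}
```

Checking the return value rather than the error object is the usual Cocoa convention: the error is only meaningful when the method returns nil.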

2. How do I read GB2312 (Chinese) text files?

Many will think this is trivial: surely Apple provides an encoding constant for GB2312. Let's look at the header where NSUTF8StringEncoding is defined; a GB2312 constant should be declared alongside it.

/* Note that in addition to the values explicitly listed below, NSStringEncoding supports encodings provided by CFString.
See CFStringEncodingExt.h for a list of these encodings.
See CFString.h for functions which convert between NSStringEncoding and CFStringEncoding.
*/
enum {
    NSASCIIStringEncoding = 1,             /* 0..127 only */
    NSNEXTSTEPStringEncoding = 2,
    NSJapaneseEUCStringEncoding = 3,
    NSUTF8StringEncoding = 4,
    NSISOLatin1StringEncoding = 5,
    NSSymbolStringEncoding = 6,
    NSNonLossyASCIIStringEncoding = 7,
    NSShiftJISStringEncoding = 8,          /* kCFStringEncodingDOSJapanese */
    NSISOLatin2StringEncoding = 9,
    NSUnicodeStringEncoding = 10,
    NSWindowsCP1251StringEncoding = 11,    /* Cyrillic; same as AdobeStandardCyrillic */
    NSWindowsCP1252StringEncoding = 12,    /* WinLatin1 */
    NSWindowsCP1253StringEncoding = 13,    /* Greek */
    NSWindowsCP1254StringEncoding = 14,    /* Turkish */
    NSWindowsCP1250StringEncoding = 15,    /* WinLatin2 */
    NSISO2022JPStringEncoding = 21,        /* ISO 2022 Japanese encoding for e-mail */
    NSMacOSRomanStringEncoding = 30,
    NSUTF16StringEncoding = NSUnicodeStringEncoding,      /* An alias for NSUnicodeStringEncoding */
#if MAC_OS_X_VERSION_10_4 <= MAC_OS_X_VERSION_MAX_ALLOWED || __IPHONE_2_0 <= __IPHONE_OS_VERSION_MAX_ALLOWED
    NSUTF16BigEndianStringEncoding = 0x90000100,          /* NSUTF16StringEncoding encoding with explicit endianness specified */
    NSUTF16LittleEndianStringEncoding = 0x94000100,       /* NSUTF16StringEncoding encoding with explicit endianness specified */
    NSUTF32StringEncoding = 0x8c000100,
    NSUTF32BigEndianStringEncoding = 0x98000100,          /* NSUTF32StringEncoding encoding with explicit endianness specified */
    NSUTF32LittleEndianStringEncoding = 0x9c000100        /* NSUTF32StringEncoding encoding with explicit endianness specified */
#endif
};

Unfortunately, it is not there. Apple did leave a clue, though, which I noticed when I read the comment at the top of the enum more closely:

Note that in addition to the values explicitly listed below, NSStringEncoding supports encodings provided by CFString.

See CFStringEncodingExt.h for a list of these encodings.

See CFString.h for functions which convert between NSStringEncoding and CFStringEncoding.

My English is not great, but I get the gist: encodings that are not listed in this enum are declared in CFStringEncodingExt.h.

Locate CFStringEncodingExt.h with Finder's file search.

Look through it and you will find kCFStringEncodingGB_18030_2000 (I had expected the right constant to be kCFStringEncodingGB_2312_80, but it is not). GB 18030 is a superset of GB2312, so it decodes GB2312 text. However, this constant is a CFStringEncoding value, and we need an NSStringEncoding.

CFString and NSString share the same memory layout (they are toll-free bridged), and CFString is an important complement to NSString. Searching the CFString documentation turns up the function CFStringConvertEncodingToNSStringEncoding:

  NSStringEncoding enc = CFStringConvertEncodingToNSStringEncoding(kCFStringEncodingGB_18030_2000);
  NSString *textFile = [NSString stringWithContentsOfFile:filePath encoding:enc error:nil];

That solves the second question.
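The title also mentions binary streams. As a rough sketch, assuming ARC and that the bytes have already been collected into an NSData (the helper name DecodeGBData is hypothetical), the same converted encoding can be passed to initWithData:encoding::

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decodes GB2312/GBK/GB 18030 bytes into an NSString.
// Returns nil if the bytes are not valid in that encoding.
static NSString *DecodeGBData(NSData *data) {
    // Convert the CoreFoundation constant into the NSStringEncoding Foundation expects.
    NSStringEncoding enc =
        CFStringConvertEncodingToNSStringEncoding(kCFStringEncodingGB_18030_2000);
    return [[NSString alloc] initWithData:data encoding:enc];
}
```

For example, the four bytes D6 D0 CE C4 are the GB2312 encoding of the two characters 中文, so they should decode to that string. Note that a multi-byte character split across two network chunks will fail to decode, so it is safer to decode only once the whole payload has arrived.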

3. How do I read files in other encodings?

I believe the third question can be solved the same way: find the matching constant in CFStringEncodingExt.h and convert it with CFStringConvertEncodingToNSStringEncoding.
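A minimal sketch of that generalization, assuming ARC (the helper name DecodeData is hypothetical; the CFString functions and encoding constants are the real CoreFoundation ones):

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decodes bytes using any CFStringEncoding constant
// from CFStringEncodingExt.h. Returns nil if the encoding is unavailable
// on this system or the bytes are not valid in it.
static NSString *DecodeData(NSData *data, CFStringEncoding cfEncoding) {
    if (!CFStringIsEncodingAvailable(cfEncoding)) {
        return nil;
    }
    NSStringEncoding enc = CFStringConvertEncodingToNSStringEncoding(cfEncoding);
    return [[NSString alloc] initWithData:data encoding:enc];
}
```

A caller would then pass, for instance, kCFStringEncodingBig5 for Traditional Chinese, kCFStringEncodingEUC_KR for Korean, or kCFStringEncodingShiftJIS for Japanese.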

Original article: http://www.cnblogs.com/likwo/archive/2011/06/26/2090914.html

Please credit the source when reposting.
