New Features of PHP6: Unicode and TextIterator - Ma Yongzhan


Copyright statement: this is an original work and may be reproduced. When reprinting, you must credit the original publication, the author, and this statement with a hyperlink; otherwise legal liability will be pursued. http://blog.csdn.net/mayongzhan - Ma Yongzhan, myz, mayongzhan

New Features of PHP6: Unicode and TextIterator
Original address: http://blog.makemepulse.com/2008/03/13/php6-unicode-and-textiterator-

I just installed the latest PHP6 development version and decided to test its best-known new feature: Unicode support. I am not going to explain what is new in PHP6, Unicode, or TextIterator; these are just the results of my own tests.

The first thing to do is to enable Unicode support in PHP6 by editing the php.ini file.

;;;;;;;;;;;;;;;;;;;;
; Unicode settings ;
;;;;;;;;;;;;;;;;;;;;
unicode.semantics = on
unicode.runtime_encoding = UTF-8
unicode.script_encoding = UTF-8
unicode.output_encoding = UTF-8
unicode.from_error_mode = U_INVALID_SUBSTITUTE
unicode.from_error_subst_char = 3f

Because I write in French as well as English, I have to deal with accented characters.
So my first test checks whether the strlen function is Unicode-aware...

$word = "être";
echo "Length: " . strlen($word);

Result: Length: 4. That is correct: the word has four characters. But this is just the beginning! :)
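PHP6 was never released as a stable version, so the Unicode-aware strlen shown above cannot be reproduced on a released PHP. The sketch below shows the same check on PHP 7 or later, under the assumptions that the script file is saved as UTF-8 and that the mbstring extension is loaded. There, strlen counts bytes and mb_strlen counts characters.

```php
<?php
// Assumptions: released PHP 7+, mbstring extension loaded, UTF-8 source file.
$word = "être";

// strlen() counts bytes; "ê" takes two bytes in UTF-8.
$byteLength = strlen($word);

// mb_strlen() counts characters (code points) in the given encoding.
$charLength = mb_strlen($word, 'UTF-8');

echo "Bytes: $byteLength\n";      // Bytes: 5
echo "Characters: $charLength\n"; // Characters: 4
```

The value 4 reported by PHP6 corresponds to the character count here, not the byte count.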

My second test subject is TextIterator, part of the new SPL in PHP6.
$word = "être";
foreach (new TextIterator($word, TextIterator::CHARACTER) as $character) {
    var_inspect($character);
}

Output: unicode(1) "ê" {00ea} unicode(1) "t" {0074} unicode(1) "r" {0072} unicode(1) "e" {0065}
The word is broken down into its individual characters, each shown with its code point...
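Since TextIterator and var_inspect never shipped in a released PHP, here is a rough equivalent of the character-by-character walk. It is a sketch assuming PHP 7.4 or later with the mbstring extension and a UTF-8 source file; the output format only imitates the one shown above.

```php
<?php
// Assumptions: released PHP 7.4+, mbstring extension loaded, UTF-8 source file.
$word = "être";

$lines = [];
foreach (mb_str_split($word, 1, 'UTF-8') as $character) {
    // mb_ord() returns the Unicode code point of a single character.
    $codePoint = mb_ord($character, 'UTF-8');
    // Print the code point as four lowercase hex digits, as var_inspect did.
    $lines[] = sprintf('"%s" {%04x}', $character, $codePoint);
}

echo implode("\n", $lines), "\n";
```

Each line pairs a character with its code point, for example "ê" with 00ea.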

TextIterator::CHARACTER looks very powerful, but TextIterator::WORD is even more so.

$sentences = "Bonjour, nous sommes Français ! A ï e :)";
foreach (new TextIterator($sentences, TextIterator::WORD) as $word) {
    var_inspect($word);
}

Result:
unicode(7) "Bonjour" {0042 006f 006e 006a 006f 0075 0072}
unicode(1) "," {002c}
unicode(1) " " {0020}
unicode(4) "nous" {006e 006f 0075 0073}
unicode(1) " " {0020}
unicode(6) "sommes" {0073 006f 006d 006d 0065 0073}
unicode(1) " " {0020}
unicode(8) "Français" {0046 0072 0061 006e 00e7 0061 0069 0073}
unicode(1) " " {0020}
unicode(1) "!" {0021}
unicode(1) " " {0020}
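The word-level split can be approximated on a released PHP with a Unicode-aware regular expression. This is only a sketch: it is not the ICU word-breaking that TextIterator used, and the pattern is my own choice. It treats every run of letters or digits as a word and every other character as its own token, which happens to give the same tokens as the result above for this sentence.

```php
<?php
// Assumptions: released PHP 7+, UTF-8 source file. Approximation only:
// a regex stands in for ICU word-boundary rules.
$sentence = "Bonjour, nous sommes Français !";

// Either a run of letters/digits (a word), or one single other character.
// The /u modifier makes the pattern operate on UTF-8 code points.
preg_match_all('/[\p{L}\p{N}]+|[^\p{L}\p{N}]/u', $sentence, $matches);
$tokens = $matches[0];

foreach ($tokens as $token) {
    echo '"', $token, '"', "\n";
}
```

For real word segmentation (for example in languages without spaces), the intl extension's IntlBreakIterator is probably the closer counterpart to TextIterator.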

What we get back is each word in turn. But what is all that code in {} after each word? Let's try an experiment:

echo "\u0046\u0072\u0061\u006e\u00e7\u0061\u0069\u0073";

We get the result: "Français". The values in {} are the Unicode code points of the characters.
PHP6 can process text character by character or word by word!
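The escape experiment can be repeated on a released PHP, with one difference: PHP 7.0 and later use the braced form \u{...} inside double-quoted strings, and the result is a UTF-8 byte string rather than a PHP6 Unicode string. A minimal sketch:

```php
<?php
// PHP 7.0+ syntax: \u{...} with braces, inside a double-quoted string.
$word = "\u{0046}\u{0072}\u{0061}\u{006e}\u{00e7}\u{0061}\u{0069}\u{0073}";

echo $word, "\n";

// The escapes are encoded as UTF-8, so U+00E7 ("ç") becomes the bytes c3 a7.
$hex = bin2hex($word);
echo $hex, "\n"; // 4672616ec3a7616973
```

Eight code points produce nine bytes, because "ç" is the only character here that needs two bytes in UTF-8.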

$sentences = "Bonjour, nous sommes Français";
$word_break = new TextIterator($sentences, TextIterator::WORD);

Get the last word:

$word_break->preceding($word_break->last());
echo $word_break->current();

Get the first word:

$word_break->first();
echo $word_break->current();

Get the third word:

$word_break->first();
$word_break->next(3);
echo $word_break->current();

This is only a small part of what PHP6 offers for Unicode. Next I tested something I learned about at the PHP conference in Paris:
str_transliterate. This function transliterates text from one script to another.

$name = "Antoine Ughetto";
$jap = str_transliterate($name, 'Latin', 'Katakana');
echo str_transliterate($jap, 'Any', 'Latin');

Oh yeah, my name in Japanese sounds like "Antoine uguhetto".
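str_transliterate did not make it into a released PHP either. The closest thing I know of is the Transliterator class from the intl extension, which takes ICU transform IDs such as Latin-Katakana and Any-Latin. The sketch below assumes intl is installed and falls back to null when it is not; the helper name roundTrip is hypothetical, and the exact romanized output depends on the ICU version, so none is claimed here.

```php
<?php
// Assumption: released PHP with the intl extension (ICU).
// roundTrip() is a hypothetical helper introduced only for this sketch.
function roundTrip(string $name): ?string
{
    if (!class_exists('Transliterator')) {
        return null; // intl extension is not loaded
    }

    $toKatakana = Transliterator::create('Latin-Katakana');
    $toLatin = Transliterator::create('Any-Latin');
    if ($toKatakana === null || $toLatin === null) {
        return null; // this ICU build does not know one of the transform IDs
    }

    // Latin script -> Katakana, then whatever script it is -> Latin again.
    $katakana = $toKatakana->transliterate($name);
    if ($katakana === false) {
        return null;
    }
    $latin = $toLatin->transliterate($katakana);

    return $latin === false ? null : $latin;
}

$result = roundTrip("Antoine Ughetto");
echo $result ?? "intl extension not available", "\n";
```

The round trip is lossy on purpose: Katakana cannot represent every Latin sound, which is why the name comes back slightly changed.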

All of this is interesting, but it is hard to test without a manual.
Thanks to Andrei Zmievski, whose blog article helped me run these tests...

Original article: PHP6, Unicode and TextIterator features, by Antoine Ughetto
