I have just installed the PHP6 dev version and decided to test its Unicode support, one of PHP6's new features. I'm not going to discuss PHP6's new features or Unicode in general, just the tests I ran on the Unicode support.
The first thing to do is enable Unicode support in PHP6, which is configured in the php.ini file.
;;;;;;;;;;;;;;;;;;;;
; Unicode settings ;
;;;;;;;;;;;;;;;;;;;;
unicode.semantics = On
unicode.runtime_encoding = utf-8
unicode.script_encoding = utf-8
unicode.output_encoding = utf-8
unicode.from_error_mode = U_INVALID_SUBSTITUTE
unicode.from_error_subst_char = 3f
Since I work in both French and English, there are some accented characters that need to be handled correctly.
So my first experiment was to test how the strlen function behaves with Unicode:
$word = "être";
echo "Length:" . strlen($word);
The result: Length:4, which is correct: the word has four characters. But this is just the beginning! :)
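To see why 4 is the right answer, it helps to compare it with a byte count: in UTF-8, "ê" takes two bytes, so a byte-oriented strlen reports 5 for the same word. The sketch below is only an illustration under a few assumptions: it targets a released PHP version (7.0 or later) with the mbstring extension rather than the PHP6 dev build, the script is saved as UTF-8, and the helper name lengths() is made up for this example.

```php
<?php
// Hypothetical helper: returns both the byte length and the character length.
// Assumes the mbstring extension is loaded and $text is valid UTF-8.
function lengths(string $text): array
{
    return [
        'bytes'      => strlen($text),             // counts bytes
        'characters' => mb_strlen($text, 'UTF-8'), // counts code points
    ];
}

$word = "être";
$result = lengths($word);
echo "Bytes: " . $result['bytes'] . "\n";
echo "Characters: " . $result['characters'] . "\n";
```

With Unicode semantics switched on, PHP6's strlen behaves like the second count, which is what the test above shows.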
My second test subject is TextIterator, part of PHP6's new SPL.
$word = "être";
foreach (new TextIterator($word, TextIterator::CHARACTER) as $character) {
    var_inspect($character);
}
Output:
unicode(1) "ê" {00ea}
unicode(1) "t" {0074}
unicode(1) "r" {0072}
unicode(1) "e" {0065}
The word is broken down into its individual characters, and each one is shown with its length and its code point in hexadecimal.
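For comparison, a similar character-by-character breakdown can be sketched without TextIterator. This is not the PHP6 API: it assumes a released PHP version (7.4 or later) with the mbstring extension, a script saved as UTF-8, and a made-up helper name, inspect_characters().

```php
<?php
// Hypothetical helper: splits a UTF-8 string into characters and pairs
// each one with its code point written as four lowercase hex digits.
// Assumes PHP 7.4+ with mbstring (mb_str_split and mb_ord).
function inspect_characters(string $text): array
{
    $result = [];
    foreach (mb_str_split($text, 1, 'UTF-8') as $character) {
        $result[] = [$character, sprintf('%04x', mb_ord($character, 'UTF-8'))];
    }
    return $result;
}

// Print each character followed by its code point in braces.
foreach (inspect_characters("être") as [$character, $hex]) {
    printf("\"%s\" {%s}\n", $character, $hex);
}
```

The code points it reports for "être" are the same ones var_inspect printed above: 00ea, 0074, 0072, 0065.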
TextIterator::CHARACTER already looks very powerful, but TextIterator::WORD is even more so.
$sentences = "Bonjour, nous sommes Français ! Aïe :)";
foreach (new TextIterator($sentences, TextIterator::WORD) as $word) {
    var_inspect($word);
}
Results obtained:
unicode(7) "Bonjour" {0042 006f 006e 006a 006f 0075 0072}
unicode(1) "," {002c}
unicode(1) " " {0020}
unicode(4) "nous" {006e 006f 0075 0073}
unicode(1) " " {0020}
unicode(6) "sommes" {0073 006f 006d 006d 0065 0073}
unicode(1) " " {0020}
unicode(8) "Français" {0046 0072 0061 006e 00e7 0061 0069 0073}
unicode(1) " " {0020}
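The same kind of word-boundary splitting can be approximated with ICU's break iterator. Again, this is a sketch under assumptions, not the PHP6 TextIterator: it requires a released PHP version with the intl and mbstring extensions, the 'fr' locale is my own choice, and split_words() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: splits UTF-8 text at word boundaries using ICU.
// Assumes the intl extension is loaded; the 'fr' default locale is an assumption.
function split_words(string $text, string $locale = 'fr'): array
{
    $iterator = IntlBreakIterator::createWordInstance($locale);
    $iterator->setText($text);
    // getPartsIterator() yields the text between consecutive boundaries,
    // so punctuation and spaces come back as parts of their own.
    return iterator_to_array($iterator->getPartsIterator(), false);
}

// Print the character count and the text of each part.
foreach (split_words("Bonjour, nous sommes Français") as $part) {
    printf("(%d) \"%s\"\n", mb_strlen($part, 'UTF-8'), $part);
}
```

As in the TextIterator results, the comma and each space are returned as separate one-character parts rather than being dropped.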