Recommendations for internationalized programming in JavaScript: the basics


What is internationalization?

Internationalization (abbreviated i18n: i, eighteen characters, n) is the process of writing software so that it can easily be adapted for users from different places who use different languages. It is easy to get this wrong by inadvertently assuming that your users all come from one place and speak one language, especially if you do not even realize you have made that assumption.

function formatDate(d)
{
  // Everyone uses month/date/year...right?
  var month = d.getMonth() + 1;
  var date = d.getDate();
  var year = d.getFullYear();
  return month + "/" + date + "/" + year;
}

function formatMoney(amount)
{
  // All money is dollars with two fractional digits...right?
  return "$" + amount.toFixed(2);
}

function sortNames(names)
{
  function sortAlphabetically(a, b)
  {
    var left = a.toLowerCase(), right = b.toLowerCase();
    if (left > right)
      return 1;
    if (left === right)
      return 0;
    return -1;
  }

  // Names always sort alphabetically...right?
  names.sort(sortAlphabetically);
}


JavaScript's historical i18n support is poor

Traditional i18n-aware formatting in JS uses the various toLocaleString() methods. The resulting strings contain whatever details the implementation chooses to provide, with no way to pick and choose (did you need a weekday in that formatted date? is the year irrelevant?). Even if the right details are included, the format may be wrong, for example a plain decimal number when a percentage was wanted. And you cannot choose a locale.

For sorting, JS provides almost no useful locale-sensitive text comparison (collation). localeCompare() exists, but its interface is awkward and poorly suited for use with sort. It also does not let you choose a locale or a specific sort order.

These limitations are bad enough (this surprised me greatly when I learned it!) that serious web applications needing i18n support, most commonly financial sites displaying currencies, will box up the data, send it to a server, have the server perform the operation, and send the result back to the client. Server round trips just to format amounts of money. Yeesh.

The new JS Internationalization API

The new ECMAScript Internationalization API greatly improves the i18n capabilities of JS. It provides all the flourishes you could want for formatting dates and numbers and for sorting text. The locale is selectable, with fallback if the requested locale is unsupported. A formatting request can specify the particular components to include. Custom formats for percentages, significant digits, and currencies are supported. Numerous collation options are exposed for sorting text. And if you care about performance, the up-front work of selecting a locale and processing options can now be done once, instead of every time a locale-dependent operation is performed.
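As a rough sketch of what this looks like in practice, the three naive helpers from the start of the article could be rebuilt on top of the API as follows. The function name makeFormatters and the shape of the object it returns are made up for this illustration; only the Intl constructors come from the API itself, and the exact output strings remain implementation-dependent.

```javascript
// Hypothetical helper for illustration: the locale and currency are
// parameters instead of baked-in assumptions.
function makeFormatters(locale, currency) {
  // Locale selection and option processing happen once, here,
  // not on every formatting or comparison call.
  const dateFormat = new Intl.DateTimeFormat(locale, { timeZone: "UTC" });
  const moneyFormat = new Intl.NumberFormat(locale, {
    style: "currency",
    currency: currency
  });
  const collator = new Intl.Collator(locale);

  return {
    formatDate: dateFormat.format,
    formatMoney: moneyFormat.format,
    // Sort a copy so the caller's array is left untouched.
    sortNames: function (names) {
      return names.slice().sort(collator.compare);
    }
  };
}

const us = makeFormatters("en-US", "USD");
console.log(us.formatDate(new Date(Date.UTC(2014, 6, 17))));
console.log(us.formatMoney(1234.5));
console.log(us.sortNames(["Zoë", "adam", "Émile"]));
```

The same three formatter objects are reused for every call, which is where the performance benefit described above comes from.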

That said, the API is not a panacea: it is "best effort" only. The exact output is almost always deliberately unspecified. An implementation could legally support only the oj locale, or it could ignore (almost all) of the formatting options provided. Most implementations will have high-quality support for many locales, but this is not guaranteed (particularly on resource-constrained systems such as mobile phones).

Under the hood, Firefox's implementation depends on the International Components for Unicode library (ICU), which in turn depends on the locale data set of the Unicode Common Locale Data Repository (CLDR). Our implementation is self-hosted: most of the implementation on top of ICU is written in JS itself. We hit a few bumps along the way (we had never self-hosted anything this large before), but nothing major.

The Intl interface (that is a letter L, not the digit 1)

The i18n API lives on the global Intl object. Intl contains three constructors: Intl.Collator, Intl.DateTimeFormat, and Intl.NumberFormat. Each constructor creates an object that exposes the relevant operation and efficiently caches the locale and options for that operation. Creating such an object follows this pattern:

var ctor = "Collator"; // or one of the others
var instance = new Intl[ctor](locales, options);

locales is a string specifying a single language tag, or an array-like object containing multiple language tags. Language tags are strings such as en (English in general), de-AT (German as used in Austria), or zh-Hant-TW (Chinese as used in Taiwan, written in the traditional script). Language tags can also include a "Unicode extension" of the form -u-key1-value1-key2-value2..., where each key is an "extension key". The various constructors interpret these specially.
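To get a feel for language tags, the sketch below normalizes a few of them. It assumes an engine recent enough to provide Intl.getCanonicalLocales, which is newer than the three constructors this article describes; the input tags are arbitrary examples.

```javascript
// Canonicalization fixes the casing of each subtag; it does not check
// whether the implementation actually supports the locale.
const tags = Intl.getCanonicalLocales(["EN", "de-at", "ZH-hant-tw"]);
console.log(tags);

// A Unicode extension (-u-key-value...) survives canonicalization.
const withExtension = Intl.getCanonicalLocales("th-th-u-ca-chinese-nu-thai");
console.log(withExtension);

// A structurally invalid tag is rejected with a RangeError.
let rejected = false;
try {
  Intl.getCanonicalLocales("not a tag");
} catch (e) {
  rejected = e instanceof RangeError;
}
console.log(rejected);
```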

options is an object whose properties (or their absence, which evaluates to undefined) determine how the formatter or collator behaves. Its exact interpretation is determined by the individual constructor.

Given locale information and options, the implementation tries to produce the behavior closest to the "ideal" behavior. Firefox supports 400+ locales for collation and 600+ locales for date/time and number formatting, so it is very likely (but not guaranteed) that the locales you care about are supported.

Intl generally provides no guarantee of particular behavior. If the requested locale is unsupported, Intl allows best-effort behavior. Even if the locale is supported, behavior is not rigidly specified. Never assume that a particular set of options corresponds to a particular format. The phrasing of the overall format (encompassing all requested components) may vary across browsers, or even across versions of one browser. The formats of individual components are unspecified: a short-format weekday might be "S", "Sa", or "Sat". The Intl API is not intended to expose exactly specified behavior.

Date/time formatting

Options

The main option properties for date/time formatting are as follows:

    • weekday, era

"narrow", "short", or "long". (era refers to divisions of a calendar system that are typically longer than a year, such as the reign of the current Japanese emperor.)

    • month

"2-digit", "numeric", "narrow", "short", or "long"

    • year
    • day
    • hour, minute, second

"2-digit" or "numeric"

    • timeZoneName

"short" or "long"

    • timeZone

Case-insensitive "UTC" formats with respect to UTC. At the time of writing, values such as "CEST" and "America/New_York" did not have to be supported, and they did not work in Firefox.

These values do not map onto particular formats: remember, the Intl API almost never specifies exact behavior. The intent, though, is that "narrow", "short", and "long" produce output of corresponding size, for example "S" or "Sa", "Sat", and "Saturday". (Output may be ambiguous: Saturday and Sunday could both produce "S".) "2-digit" and "numeric" map to two-digit number strings or full-length numeric strings, for example "70" and "1970".

The options finally used are largely the options requested. However, if you do not specifically request any of weekday/year/month/day/hour/minute/second, then year/month/day will be added to the options you provided.
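The defaulting rule can be observed through resolvedOptions(). This is a minimal sketch; the variable names are arbitrary, and the en-US locale and UTC time zone are chosen only to keep the example deterministic.

```javascript
// No date or time component requested: year/month/day are filled in.
const implicit =
  new Intl.DateTimeFormat("en-US", { timeZone: "UTC" }).resolvedOptions();
console.log(implicit.year, implicit.month, implicit.day);

// An explicit component suppresses the defaults: only the hour remains.
const explicit =
  new Intl.DateTimeFormat("en-US", { timeZone: "UTC", hour: "numeric" })
    .resolvedOptions();
console.log(explicit.year, explicit.hour);
```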


Beyond these basic options, there are a few special options:

    • hour12

Specifies whether hours are shown in 12-hour or 24-hour format. The default is typically locale-dependent. (Details such as whether midnight is zero-based or twelve-based, and whether leading zeroes are present, are also locale-dependent.)
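A small sketch of hour12 in action, formatting 15:30 UTC for en-US both ways. The variable names are illustrative, and the exact strings (spacing, day-period marker) are up to the implementation, so none are shown here.

```javascript
const halfPastThree = new Date(Date.UTC(2014, 6, 17, 15, 30));

function clock(hour12) {
  return new Intl.DateTimeFormat("en-US", {
    hour: "numeric",
    minute: "2-digit",
    timeZone: "UTC",
    hour12: hour12
  });
}

const twelveHour = clock(true);
const twentyFourHour = clock(false);

console.log(twelveHour.format(halfPastThree));     // 12-hour clock
console.log(twentyFourHour.format(halfPastThree)); // 24-hour clock
```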

There are also two other special properties, localeMatcher (either "lookup" or "best fit") and formatMatcher (either "basic" or "best fit"), both of which default to "best fit". These affect how the right locale and format are selected. Their use cases are somewhat esoteric, so you can probably ignore them.

Locale-centric options

DateTimeFormat also allows formatting with custom calendar and numbering systems. These details are effectively part of the locale, so they are specified in the Unicode extension of the language tag.

For example, Thai as used in Thailand has the language tag th-TH. Recall that a Unicode extension has the form -u-key1-value1-key2-value2.... The key for the calendar system is ca, and the key for the numbering system is nu. The Thai numbering system has the value thai, and the Chinese calendar system has the value chinese. So to format dates in this overall manner, we append a Unicode extension containing both key/value pairs to the language tag: th-TH-u-ca-chinese-nu-thai.
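One way to check how an implementation interpreted such a tag is to read back resolvedOptions(). The sketch below assumes an implementation that ships Thai locale data with the Chinese calendar and Thai digits; where that data is missing, the resolved values fall back to something else.

```javascript
const custom =
  new Intl.DateTimeFormat("th-TH-u-ca-chinese-nu-thai").resolvedOptions();
console.log(custom.locale, custom.calendar, custom.numberingSystem);

// Without the extension, the locale's own defaults are used.
const plain = new Intl.DateTimeFormat("th-TH").resolvedOptions();
console.log(plain.locale, plain.calendar, plain.numberingSystem);
```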

For more information on the various calendar and numbering systems, see the full DateTimeFormat documentation.

Examples

After creating a DateTimeFormat object, the next step is to use it to format dates via the handy format() function. Conveniently, this function is a bound function: you do not have to call it on the DateTimeFormat directly. Just pass it a timestamp or a Date object.
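Because format is bound, it can be detached and handed straight to other functions such as Array.prototype.map. A minimal sketch, with made-up variable names and en-US/UTC chosen only for determinism:

```javascript
const formatDay =
  new Intl.DateTimeFormat("en-US", {
    year: "numeric", month: "short", day: "numeric",
    timeZone: "UTC"
  }).format; // detached, but still tied to its formatter

// Timestamps and Date objects are both accepted.
const inputs = [0, new Date(Date.UTC(2014, 6, 17))];
const labels = inputs.map(formatDay);
console.log(labels);
```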


Putting it all together, here are some examples of creating DateTimeFormat options for particular uses, with the behavior of Firefox at the time of writing.

var msPerDay = 24 * 60 * 60 * 1000;

// July 17, 2014 00:00:00 UTC.
var july172014 = new Date(msPerDay * (44 * 365 + 11 + 197));

Let's format a date for English as used in the United States. We include a two-digit month/day/year, plus two-digit hours/minutes, and a short time zone name to clarify that time. (The result will obviously differ in other time zones.)

var options =
  { year: "2-digit", month: "2-digit", day: "2-digit",
    hour: "2-digit", minute: "2-digit",
    timeZoneName: "short" };
var americanDateTime =
  new Intl.DateTimeFormat("en-US", options).format;

print(americanDateTime(july172014)); // 07/16/14, 5:00 PM PDT

Or let's do something similar for Portuguese, ideally as used in Brazil, although Portuguese as used in Portugal will do in a pinch. We go for a slightly longer format, with the full year and a spelled-out month, but use UTC for portability.

var options =
  { year: "numeric", month: "long", day: "numeric",
    hour: "2-digit", minute: "2-digit",
    timeZoneName: "short", timeZone: "UTC" };
var portugueseTime =
  new Intl.DateTimeFormat(["pt-BR", "pt-PT"], options);

// 17 de julho de 2014 00:00 GMT
print(portugueseTime.format(july172014));

How about a compact, UTC-formatted weekly Swiss train schedule? We try the official languages from most to least widely spoken, to choose the one most likely to be readable.

var swissLocales = ["de-CH", "fr-CH", "it-CH", "rm-CH"];
var options =
  { weekday: "short",
    hour: "numeric", minute: "numeric",
    timeZone: "UTC", timeZoneName: "short" };
var swissTime =
  new Intl.DateTimeFormat(swissLocales, options).format;

print(swissTime(july172014)); // Do. 00:00 GMT

Or let's try a date for the descriptive text beside a painting in a Japanese museum, using the Japanese calendar with year and era:

var jpYearEra =
  new Intl.DateTimeFormat("ja-JP-u-ca-japanese",
                          { year: "numeric", era: "long" });

print(jpYearEra.format(july172014)); // 平成26年

And for something completely different, a longer date in Thai as used in Thailand, but using the Thai numbering system and the Chinese calendar. (High-quality implementations such as Firefox's typically treat plain th-TH as th-TH-u-ca-buddhist-nu-latn, because Thailand normally uses the Buddhist calendar system and Latin 0-9 digits.)

var options =
  { year: "numeric", month: "long", day: "numeric" };
var thaiDate =
  new Intl.DateTimeFormat("th-TH-u-nu-thai-ca-chinese", options);

print(thaiDate.format(july172014)); // ๒๐ 6 ๓๑

Calendar and numbering systems aside, it is all fairly simple: just pick your components and their lengths.

Number formatting

Options

The main option properties for number formatting are:

    • style

"currency", "percent", or "decimal" (the default).

    • currency

A three-letter currency code, such as USD or CHF. Required if style is "currency"; otherwise meaningless.

    • currencyDisplay

"code", "symbol", or "name", defaulting to "symbol". "code" uses the three-letter currency code in the formatted string. "symbol" uses a currency symbol such as $ or £. "name" typically uses some spelled-out version of the currency. (At the time of writing, Firefox supported only "symbol".)

    • minimumIntegerDigits

An integer from 1 to 21 (inclusive), defaulting to 1. The resulting string is front-padded with zeroes until its integer part contains at least this many digits. (For example, if this value is 2, then 3 might be formatted as "03".)

    • minimumFractionDigits, maximumFractionDigits

Integers from 0 to 20 (inclusive). The resulting string has at least minimumFractionDigits and at most maximumFractionDigits fractional digits. The default minimum is currency-dependent if style is "currency" (usually 2, rarely 0 or 3), otherwise 0. The default maximum is 0 for percentages, 3 for decimals, and currency-dependent for currencies.

    • minimumSignificantDigits, maximumSignificantDigits

Integers from 1 to 21 (inclusive). If present, these override the integer/fraction digit controls above and determine the minimum/maximum number of significant digits in the formatted number string, in concert with the number of decimal places needed to represent the number accurately. (Note that in a multiple of 10 the significant digits may be ambiguous, as in "100", which has one, two, or three significant digits.)

    • useGrouping

A Boolean (defaulting to true) that determines whether the formatted string contains grouping separators (for example "," as the English thousands separator).

NumberFormat also recognizes the esoteric, mostly ignorable localeMatcher property.
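The following sketch exercises several of these options for en-US. The helper name show is invented for this example, and the outputs in the comments are what current mainstream engines produce for en-US; the specification itself does not guarantee the exact strings.

```javascript
// Hypothetical helper: format one value with one set of options.
function show(options, value) {
  return new Intl.NumberFormat("en-US", options).format(value);
}

console.log(show({ minimumIntegerDigits: 2 }, 3));          // 03
console.log(show({ maximumFractionDigits: 1 }, 3.14159));   // 3.1
console.log(show({ maximumSignificantDigits: 3 }, 123456)); // 123,000
console.log(show({ useGrouping: false }, 1234567.891));     // 1234567.891
console.log(show({ style: "percent" }, 0.438));             // 44%

// currencyDisplay: "code" spells out the currency code instead of a symbol.
const francs = show(
  { style: "currency", currency: "CHF", currencyDisplay: "code" }, 12.5);
console.log(francs);
```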


Locale-centric options

Just as DateTimeFormat supports custom numbering systems through the nu key of the Unicode extension, so does NumberFormat. For example, the language tag for Chinese as used in China is zh-CN. The value for the Han decimal numbering system is hanidec. To format numbers for these systems, we append a Unicode extension to the language tag: zh-CN-u-nu-hanidec.

For complete information on specifying the various numbering systems, see the full NumberFormat documentation.

Examples

NumberFormat objects have a format function property, just as DateTimeFormat objects do. As there, the format function is a bound function that can be used independently of the NumberFormat.


Here are some examples of creating NumberFormat options for particular uses, with the behavior of Firefox at the time of writing. First, let's format some money for Chinese as used in China, specifically using Han decimal digits (rather than the far more common Latin digits). We select the "currency" style and use the currency code for Chinese renminbi (yuan), with grouping by default and the usual number of fractional digits.

var hanDecimalRMBInChina =
  new Intl.NumberFormat("zh-CN-u-nu-hanidec",
                        { style: "currency", currency: "CNY" });

print(hanDecimalRMBInChina.format(1314.25)); // ¥ 一,三一四.二五

Or let's format a United States-style gas price, with its peculiar 9 in the thousandths place, for English as used in the United States.

var gasPrice =
  new Intl.NumberFormat("en-US",
                        { style: "currency", currency: "USD",
                          minimumFractionDigits: 3 });

print(gasPrice.format(5.259)); // $5.259

Or let's try a percentage in Arabic as used in Egypt. We make sure the percentage has at least two fractional digits. (Note that this and the other right-to-left (RTL) examples may display in a different order in an RTL context, for instance with the percent sign on the opposite side of the digits.)

var arabicPercent =
  new Intl.NumberFormat("ar-EG",
                        { style: "percent",
                          minimumFractionDigits: 2 }).format;

print(arabicPercent(0.438)); // ٤٣٫٨٠٪

Or suppose we are formatting for Persian as used in Afghanistan, and we want at least two integer digits and no more than two fractional digits.

var persianDecimal =
  new Intl.NumberFormat("fa-AF",
                        { minimumIntegerDigits: 2,
                          maximumFractionDigits: 2 });

print(persianDecimal.format(3.1416)); // ۰۳٫۱۴

Finally, let's format an amount of Bahraini dinars for Arabic as used in Bahrain. Unlike most currencies, the Bahraini dinar is divided into 1000 fils, so the number will have three decimal places. (Again, take the apparent visual order with a grain of salt.)

var bahrainiDinars =
  new Intl.NumberFormat("ar-BH",
                        { style: "currency", currency: "BHD" });

print(bahrainiDinars.format(3.17)); // د.ب. ٣٫١٧٠
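The currency-dependent default for fractional digits can be read back from resolvedOptions(), which makes the difference between currencies visible without relying on any particular output string. The helper name defaultFractionDigits is made up for this sketch.

```javascript
// Hypothetical helper: report the default fraction digits for a currency.
function defaultFractionDigits(currency) {
  const resolved =
    new Intl.NumberFormat("en", { style: "currency", currency: currency })
      .resolvedOptions();
  return [resolved.minimumFractionDigits, resolved.maximumFractionDigits];
}

console.log(defaultFractionDigits("USD")); // [2, 2]
console.log(defaultFractionDigits("JPY")); // [0, 0]
console.log(defaultFractionDigits("BHD")); // [3, 3]
```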

Collation

Options

The main option properties for collation are as follows:

    • usage

"sort" or "search" (defaulting to "sort"), specifying the intended use of the collator. (A search collator may want to treat more strings as equivalent than a sort collator would.)

    • sensitivity

"base", "accent", "case", or "variant". This option affects how sensitive the collator is to characters that share the same base letter but differ in accents/diacritics or case. (Base letters are locale-dependent: "a" and "ä" have the same base letter in German but are different letters in Swedish.) "base" sensitivity considers only the base letter and ignores modifications (so in German "a", "A", and "ä" are considered the same). "accent" considers the base letter and accents but ignores case (so in German "a" and "A" are the same, but "ä" differs from both). "case" considers the base letter and case but ignores accents (so in German "a" and "ä" are the same, but "A" differs from both). "variant" considers the base letter, accents, and case (so in German "a", "ä", and "A" all differ). If usage is "sort", the default is "variant"; otherwise it is locale-dependent.

    • numeric

A Boolean (defaulting to false) that determines whether numbers embedded in strings are compared as numbers. For example, a numeric sort might produce "F-4 Phantom II", "F-14 Tomcat", "F-35 Lightning II", while a non-numeric sort might produce "F-14 Tomcat", "F-35 Lightning II", "F-4 Phantom II".

    • caseFirst

"upper", "lower", or "false" (the default). Determines how case is considered when sorting: "upper" puts uppercase first ("B", "a", "c"), "lower" puts lowercase first ("a", "c", "B"), and "false" ignores case entirely ("a", "B", "c"). (Note that Firefox ignored this property at the time of writing.)

    • ignorePunctuation

A Boolean (defaulting to false) that determines whether embedded punctuation is ignored when comparing (so that, for example, "biweekly" and "bi-weekly" compare as equal).

And there is the localeMatcher property again, which you can probably ignore.
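The sensitivity levels and the numeric option can be tried out directly. This is a small sketch using the German examples from the list above; the helper name sameInGerman is invented for the illustration.

```javascript
// Hypothetical helper: do x and y compare as equal in German
// at the given sensitivity?
function sameInGerman(sensitivity, x, y) {
  const collator = new Intl.Collator("de", { sensitivity: sensitivity });
  return collator.compare(x, y) === 0;
}

console.log(sameInGerman("base", "a", "Ä"));    // accents and case ignored
console.log(sameInGerman("accent", "a", "A"));  // case ignored
console.log(sameInGerman("accent", "a", "ä"));  // accents still matter
console.log(sameInGerman("case", "a", "ä"));    // accents ignored
console.log(sameInGerman("case", "a", "A"));    // case still matters
console.log(sameInGerman("variant", "a", "A")); // everything matters

// numeric: embedded digit runs are compared by value.
const aircraft = ["F-14 Tomcat", "F-35 Lightning II", "F-4 Phantom II"];
const byNumber =
  aircraft.slice().sort(new Intl.Collator("en", { numeric: true }).compare);
console.log(byNumber);
```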

Locale-centric options

The main Collator option specified in the locale's Unicode extension is co, which selects the kind of sorting to perform: phone book (phonebk), dictionary (dict), and many others.

In addition, the keys kn and kf may optionally duplicate the numeric and caseFirst properties of the options object. However, they are not guaranteed to be supported in the language tag, and options are much clearer than language tag components. So it is best to adjust these settings only through options.

These key/value pairs are included in the Unicode extension in the same way as for DateTimeFormat and NumberFormat; refer to those sections for how to specify them in a language tag.
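A quick sketch comparing the two ways of asking for numeric collation, plus reading back the co key. Whether the kn key and the phonebk collation are honored depends on the implementation, so the values read back from the language tags may differ from what was requested; only the options route is reliable.

```javascript
// Requested through options: the recommended, explicit route.
const viaOptions = new Intl.Collator("en", { numeric: true }).resolvedOptions();
console.log(viaOptions.numeric);

// Requested through the language tag: may or may not be honored.
const viaTag = new Intl.Collator("en-u-kn-true").resolvedOptions();
console.log(viaTag.numeric);

// The collation type chosen via the co key, if supported.
const phonebook = new Intl.Collator("de-DE-u-co-phonebk").resolvedOptions();
console.log(phonebook.collation);
```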

Examples

Collator objects have a compare function property. This function accepts two arguments x and y and returns a negative number if x compares less than y, 0 if x compares equal to y, and a positive number if x compares greater than y. As with the format functions, compare is a bound function that can be extracted for standalone use.

Let's try sorting a few German surnames, for German as used in Germany. There are actually two different sort orders in German: phone book and dictionary. Phone book sort emphasizes sound, and it is as if "ä", "ö", and so on were expanded to "ae", "oe", and so on before sorting.

var names =
  ["Hochberg", "Hönigswald", "Holzman"];

var germanPhonebook = new Intl.Collator("de-DE-u-co-phonebk");

// as if sorting ["Hochberg", "Hoenigswald", "Holzman"]:
//   Hochberg, Hönigswald, Holzman
print(names.sort(germanPhonebook.compare).join(", "));

Some German words take on extra umlauts when inflected, so in dictionaries it is sensible to order words ignoring umlauts (except when ordering words that differ only by umlauts: schon before schön).

var germanDictionary = new Intl.Collator("de-DE-u-co-dict");

// as if sorting ["Hochberg", "Honigswald", "Holzman"]:
//   Hochberg, Holzman, Hönigswald
print(names.sort(germanDictionary.compare).join(", "));

Or let's sort some Firefox versions, written with various typos (different capitalization, random accents and diacritical marks, extra hyphens), in English as used in the United States. We want to sort by version number, so we do a numeric sort, in which numbers in the strings are compared as numbers rather than character by character.

var firefoxen =
  ["Fireføx 3.6",
   "Fire-fox 1.0",
   "Firefox 29",
   "Fírefox 3.5",
   "Fírefox 18"];
var usVersion =
  new Intl.Collator("en-US",
                    { sensitivity: "base",
                      numeric: true,
                      ignorePunctuation: true });

// Fire-fox 1.0, Fírefox 3.5, Fireføx 3.6, Fírefox 18, Firefox 29
print(firefoxen.sort(usVersion.compare).join(", "));

Finally, let's do some locale-aware string searching that ignores case and accents, again for English as used in the United States.


// Comparisons work with both composed and decomposed forms.
var decoratedBrowsers =
  [
    "A\u0362maya",  // A͢maya
    "Ch\u035Brôme", // Ch͛rôme
    "Firefóx",
    "Safàri",
    "O\u0323pera",  // Ọpera
    "I\u0352E",     // I͒E
  ];
var fuzzySearch =
  new Intl.Collator("en-US",
                    { usage: "search", sensitivity: "base" });

function findBrowser(browser)
{
  function cmp(other)
  {
    // Keep the expression on the same line as return, otherwise
    // automatic semicolon insertion makes cmp return undefined.
    return fuzzySearch.compare(browser, other) === 0;
  }
  return cmp;
}

print(decoratedBrowsers.findIndex(findBrowser("Firêfox"))); // 2
print(decoratedBrowsers.findIndex(findBrowser("Safåri")));  // 3
print(decoratedBrowsers.findIndex(findBrowser("Ãmaya")));   // 0
print(decoratedBrowsers.findIndex(findBrowser("Øpera")));   // 4
print(decoratedBrowsers.findIndex(findBrowser("Chromè")));  // 1
print(decoratedBrowsers.findIndex(findBrowser("IË")));      // 5

Odds and ends

It can be useful to determine whether some operation is supported for particular locales, or whether a locale is supported at all. Intl provides a supportedLocalesOf() function on each constructor, and a resolvedOptions() function on each prototype, to expose this information.

var navajoLocales =
  Intl.Collator.supportedLocalesOf(["nv"], { usage: "sort" });
print(navajoLocales.length > 0
      ? "Navajo collation supported"
      : "Navajo collation not supported");

var germanFakeRegion =
  new Intl.DateTimeFormat("de-XX", { timeZone: "UTC" });
var usedOptions = germanFakeRegion.resolvedOptions();
print(usedOptions.locale);   // de
print(usedOptions.timeZone); // UTC

Legacy behavior

The ES5 toLocaleString-style functions and localeCompare previously had no particular semantics, accepted no particular options, and were largely useless. So the i18n API reformulates them in terms of Intl operations. Each method now accepts additional trailing locales and options arguments, interpreted just as the Intl constructors would interpret them. (The exception is that toLocaleTimeString and toLocaleDateString use different default components if options are not provided.)

For brief use where precise behavior does not matter, the old methods are fine. But if you need more control, or you are formatting or comparing many times, it is best to use the Intl primitives directly.
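To illustrate the correspondence, the sketch below calls the legacy methods with trailing locales and options arguments and compares the results against the equivalent Intl objects. The locales, options, and variable names are arbitrary choices for this example.

```javascript
const when = new Date(Date.UTC(2014, 6, 17));
const dateOptions =
  { year: "numeric", month: "long", day: "numeric", timeZone: "UTC" };
const moneyOptions = { style: "currency", currency: "EUR" };

// Legacy methods with the new trailing arguments...
const dateViaMethod = when.toLocaleDateString("pt-BR", dateOptions);
const moneyViaMethod = (1234.5).toLocaleString("de-DE", moneyOptions);
const compareViaMethod = "a".localeCompare("ä", "de", { sensitivity: "base" });

// ...and the equivalent Intl objects, worth keeping around when reused.
const dateViaIntl =
  new Intl.DateTimeFormat("pt-BR", dateOptions).format(when);
const moneyViaIntl =
  new Intl.NumberFormat("de-DE", moneyOptions).format(1234.5);
const compareViaIntl =
  new Intl.Collator("de", { sensitivity: "base" }).compare("a", "ä");

console.log(dateViaMethod === dateViaIntl);
console.log(moneyViaMethod === moneyViaIntl);
console.log(compareViaMethod === compareViaIntl);
```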
