Internationalization Functions

The Collator class

Introduction

Provides string comparison capability with support for appropriate locale-sensitive sort orderings.

Class synopsis

Collator
class Collator {
/* Methods */
public __construct ( string $locale )
public bool asort ( array &$arr [, int $sort_flag ] )
public int compare ( string $str1 , string $str2 )
public static Collator create ( string $locale )
public int getAttribute ( int $attr )
public int getErrorCode ( void )
public string getErrorMessage ( void )
public string getLocale ( int $type )
public string getSortKey ( string $str )
public int getStrength ( void )
public bool setAttribute ( int $attr , int $val )
public bool setStrength ( int $strength )
public bool sortWithSortKeys ( array &$arr )
public bool sort ( array &$arr [, int $sort_flag ] )
}

Predefined Constants

Collator::FRENCH_COLLATION (integer)

Sort strings with different accents from the back of the string. This attribute is automatically set to On for the French locales and a few others. Users normally would not need to explicitly set this attribute. There is a string comparison performance cost when it is set On, but sort key length is unaffected. Possible values are:

  • Collator::ON
  • Collator::OFF (default)
  • Collator::DEFAULT_VALUE

Example #1 FRENCH_COLLATION rules

  • F=OFF cote < coté < côte < côté
  • F=ON cote < côte < coté < côté
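
The two orderings above can be reproduced with a short script. This is a minimal sketch, assuming the intl extension is loaded and PHP 7 or later (for the `\u{...}` escapes); the `en_US` locale and the variable names are illustrative choices, not part of the original text.

```php
<?php
// The four words from the example: côté, coté, côte, cote (deliberately unsorted).
$words = ["c\u{00F4}t\u{00E9}", "cot\u{00E9}", "c\u{00F4}te", "cote"];

$coll = new Collator('en_US');

// F=OFF: accent differences are compared from the front of the string.
$coll->setAttribute(Collator::FRENCH_COLLATION, Collator::OFF);
$forward = $words;
$coll->sort($forward);

// F=ON: accent differences are compared from the back of the string.
$coll->setAttribute(Collator::FRENCH_COLLATION, Collator::ON);
$backward = $words;
$coll->sort($backward);

echo implode(' < ', $forward), "\n";
echo implode(' < ', $backward), "\n";
```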

Collator::ALTERNATE_HANDLING (integer)

The Alternate attribute is used to control the handling of the so-called variable characters in the UCA: whitespace, punctuation and symbols. If Alternate is set to NonIgnorable (N), then differences among these characters are of the same importance as differences among letters. If Alternate is set to Shifted (S), then these characters are of only minor importance. The Shifted value is often used in combination with Strength set to Quaternary. In such a case, whitespace, punctuation, and symbols are considered when comparing strings, but only if all other aspects of the strings (base letters, accents, and case) are identical. If Alternate is not set to Shifted, then there is no difference between a Strength of 3 and a Strength of 4. For more information and examples, see Variable_Weighting in the UCA. The reason the Alternate values are not simply On and Off is that additional Alternate values may be added in the future. The UCA option Blanked is expressed with Strength set to 3, and Alternate set to Shifted. The default for most locales is NonIgnorable. If Shifted is selected, comparison may be slower if there are many strings that are the same except for punctuation; sort key length will not be affected unless the strength level is also increased.

Possible values are:

  • Collator::NON_IGNORABLE (default)
  • Collator::SHIFTED
  • Collator::DEFAULT_VALUE

Example #2 ALTERNATE_HANDLING rules

  • S=3, A=N di Silva < Di Silva < diSilva < U.S.A. < USA
  • S=3, A=S di Silva = diSilva < Di Silva < U.S.A. = USA
  • S=4, A=S di Silva < diSilva < Di Silva < U.S.A. < USA
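
The effect of Shifted at different strengths can be checked directly with compare(). This is a minimal sketch, assuming the intl extension is loaded; the `en_US` locale and the variable names are illustrative.

```php
<?php
$coll = new Collator('en_US');

// A=S: whitespace and punctuation become variable (minor) characters.
$coll->setAttribute(Collator::ALTERNATE_HANDLING, Collator::SHIFTED);

// S=3: punctuation is ignored, so the two strings compare as equal.
$coll->setStrength(Collator::TERTIARY);
$tertiary = $coll->compare('U.S.A.', 'USA');

// S=4: punctuation is used as a final tie-breaker, so they differ.
$coll->setStrength(Collator::QUATERNARY);
$quaternary = $coll->compare('U.S.A.', 'USA');

var_dump($tertiary === 0);    // equal at tertiary strength
var_dump($quaternary !== 0);  // distinguished at quaternary strength
```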

Collator::CASE_FIRST (integer)

The Case_First attribute is used to control whether uppercase letters come before lowercase letters or vice versa, in the absence of other differences in the strings. The possible values are Uppercase_First (U) and Lowercase_First (L), plus the standard Default and Off (X). There is almost no difference between the Off and Lowercase_First options in terms of results, so typically users will not use Lowercase_First: only Off or Uppercase_First. (People interested in the detailed differences between X and L should consult the Collation Customization documentation.) Specifying either L or U won't affect string comparison performance, but will affect the sort key length.

Possible values are:

  • Collator::OFF (default)
  • Collator::LOWER_FIRST
  • Collator::UPPER_FIRST
  • Collator::DEFAULT_VALUE

Example #3 CASE_FIRST rules

  • C=X or C=L "china" < "China" < "denmark" < "Denmark"
  • C=U "China" < "china" < "Denmark" < "denmark"
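
A minimal sketch of the same rule in code, assuming the intl extension is loaded; the `en_US` locale and the variable names are illustrative.

```php
<?php
$words = ['Denmark', 'china', 'denmark', 'China'];

$coll = new Collator('en_US');

// C=X (Off, the default): lowercase sorts before uppercase.
$lowerFirst = $words;
$coll->sort($lowerFirst);

// C=U: uppercase sorts before lowercase when nothing else differs.
$coll->setAttribute(Collator::CASE_FIRST, Collator::UPPER_FIRST);
$upperFirst = $words;
$coll->sort($upperFirst);

echo implode(' < ', $lowerFirst), "\n";
echo implode(' < ', $upperFirst), "\n";
```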

Collator::CASE_LEVEL (integer)

The Case_Level attribute is used when ignoring accents but not case. In such a situation, set Strength to be Primary, and Case_Level to be On. In most locales, this setting is Off by default. There is a small string comparison performance and sort key impact if this attribute is set to be On.

Possible values are:

  • Collator::OFF (default)
  • Collator::ON
  • Collator::DEFAULT_VALUE

Example #4 CASE_LEVEL rules

  • S=1, E=X role = Role = rôle
  • S=1, E=O role = rôle < Role
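
The "ignore accents but not case" combination can be sketched as follows, assuming the intl extension is loaded and PHP 7 or later; the locale and variable names are illustrative.

```php
<?php
$accented = "r\u{00F4}le"; // rôle

$coll = new Collator('en_US');
$coll->setStrength(Collator::PRIMARY);

// S=1, E=X: accents and case are both ignored.
$plainVsCase   = $coll->compare('role', 'Role');
$plainVsAccent = $coll->compare('role', $accented);

// S=1, E=O: accents are still ignored, but case now matters.
$coll->setAttribute(Collator::CASE_LEVEL, Collator::ON);
$caseLevelAccent = $coll->compare('role', $accented);
$caseLevelCase   = $coll->compare('role', 'Role');

var_dump($plainVsCase, $plainVsAccent, $caseLevelAccent, $caseLevelCase);
```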

Collator::NORMALIZATION_MODE (integer)

The Normalization setting determines whether text is thoroughly normalized or not in comparison. Even if the setting is off (which is the default for many locales), text as represented in common usage will compare correctly (for details, see UTN #5). Only if the accent marks are in noncanonical order will there be a problem. If the setting is On, then the best results are guaranteed for all possible text input. There is a medium string comparison performance cost if this attribute is On, depending on the frequency of sequences that require normalization. There is no significant effect on sort key length. If the input text is known to be in NFD or NFKD normalization forms, there is no need to enable this Normalization option.

Possible values are:

  • Collator::OFF (default)
  • Collator::ON
  • Collator::DEFAULT_VALUE

Collator::STRENGTH (integer)

The ICU Collation Service supports many levels of comparison (named "Levels", but also known as "Strengths"). Having these categories enables ICU to sort strings precisely according to local conventions. However, by allowing the levels to be selectively employed, searching for a string in text can be performed with various matching conditions. For more detailed information, see the collator_set_strength chapter.

Possible values are:

  • Collator::PRIMARY
  • Collator::SECONDARY
  • Collator::TERTIARY (default)
  • Collator::QUATERNARY
  • Collator::IDENTICAL
  • Collator::DEFAULT_VALUE
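
As a rough illustration of how the strength level changes matching, the sketch below compares the same pairs at different levels. It assumes the intl extension is loaded and PHP 7 or later; `compareAt` is a hypothetical helper introduced only for this example.

```php
<?php
// Hypothetical helper: compare two strings at a given strength.
function compareAt(int $strength, string $a, string $b): int
{
    $coll = new Collator('en_US');
    $coll->setStrength($strength);
    return $coll->compare($a, $b);
}

$aAcute = "\u{00E1}"; // á

// Primary: only base letters matter, so accents and case are ignored.
$primaryAccent = compareAt(Collator::PRIMARY, 'a', $aAcute);
$primaryCase   = compareAt(Collator::PRIMARY, 'a', 'A');

// Secondary: accents matter, case still does not.
$secondaryAccent = compareAt(Collator::SECONDARY, 'a', $aAcute);
$secondaryCase   = compareAt(Collator::SECONDARY, 'a', 'A');

// Tertiary (the default): case matters as well.
$tertiaryCase = compareAt(Collator::TERTIARY, 'a', 'A');

var_dump($primaryAccent, $primaryCase, $secondaryAccent, $secondaryCase, $tertiaryCase);
```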

Collator::HIRAGANA_QUATERNARY_MODE (integer)

Compatibility with JIS X 4061 requires the introduction of an additional level to distinguish Hiragana and Katakana characters. If compatibility with that standard is required, then this attribute should be set On, and the strength set to Quaternary. This will affect sort key length and string comparison performance.

Possible values are:

  • Collator::OFF (default)
  • Collator::ON
  • Collator::DEFAULT_VALUE

Collator::NUMERIC_COLLATION (integer)

When turned on, this attribute generates a collation key for the numeric value of substrings of digits. This is a way to get '100' to sort AFTER '2'.

Possible values are:

  • Collator::OFF (default)
  • Collator::ON
  • Collator::DEFAULT_VALUE
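
A minimal sketch of numeric collation, assuming the intl extension is loaded; the locale and variable names are illustrative.

```php
<?php
$values = ['100', '2', '10'];

$coll = new Collator('en_US');

// Off (default): digits are compared character by character.
$textual = $values;
$coll->sort($textual);

// On: runs of digits are compared by their numeric value.
$coll->setAttribute(Collator::NUMERIC_COLLATION, Collator::ON);
$numeric = $values;
$coll->sort($numeric);

echo implode(', ', $textual), "\n"; // 10, 100, 2
echo implode(', ', $numeric), "\n"; // 2, 10, 100
```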

Collator::DEFAULT_VALUE (integer)
Collator::PRIMARY (integer)
Collator::SECONDARY (integer)
Collator::TERTIARY (integer)
Collator::DEFAULT_STRENGTH (integer)
Collator::QUATERNARY (integer)
Collator::IDENTICAL (integer)
Collator::OFF (integer)
Collator::ON (integer)
Collator::SHIFTED (integer)
Collator::NON_IGNORABLE (integer)
Collator::LOWER_FIRST (integer)
Collator::UPPER_FIRST (integer)

The NumberFormatter class

Introduction

Programs store and operate on numbers using a locale-independent binary representation. When a number is displayed or printed, it is converted to a locale-specific string. For example, the number 12345.67 is "12,345.67" in the US, "12 345,67" in France and "12.345,67" in Germany.

By invoking the methods provided by the NumberFormatter class, you can format numbers, currencies, and percentages according to the specified or default locale. NumberFormatter is locale-sensitive, so you need to create a new NumberFormatter for each locale. NumberFormatter methods format primitive-type numbers, such as double, and output the number as a locale-specific string.

For currencies you can use the currency format type to create a formatter that returns a string with the formatted number and the appropriate currency sign. Of course, the NumberFormatter class is unaware of exchange rates, so the number output is the same regardless of the specified currency. This means that the same number has different monetary values depending on the currency locale. If the number is 9988776.65 the results will be:

  • 9 988 776,65 € in France
  • 9.988.776,65 € in Germany
  • $9,988,776.65 in the United States

In order to format percentages, create a locale-specific formatter with percentage format type. With this formatter, a decimal fraction such as 0.75 is displayed as 75%.
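
The decimal, currency, and percent styles discussed so far can be sketched as follows. This assumes the intl extension is loaded; the variable names are illustrative. Output for other locales (for example the kind of space used as the French grouping separator) can vary with the ICU version, so only the `en_US` and `de_DE` decimal results are shown in comments.

```php
<?php
// One formatter per locale and style.
$decimalUs = (new NumberFormatter('en_US', NumberFormatter::DECIMAL))->format(12345.67);
$decimalDe = (new NumberFormatter('de_DE', NumberFormatter::DECIMAL))->format(12345.67);

// The currency is passed as an ISO 4217 code; no conversion takes place.
$currencyUs = (new NumberFormatter('en_US', NumberFormatter::CURRENCY))
    ->formatCurrency(9988776.65, 'USD');

// A fraction is shown as a percentage.
$percentUs = (new NumberFormatter('en_US', NumberFormatter::PERCENT))->format(0.75);

echo $decimalUs, "\n";  // 12,345.67
echo $decimalDe, "\n";  // 12.345,67
echo $currencyUs, "\n"; // $9,988,776.65
echo $percentUs, "\n";  // 75%
```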

For more complex formatting, like spelled-out numbers, the rule-based number formatters are used.
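
A minimal sketch of a rule-based formatter, assuming the intl extension is loaded; the `en_US` locale is an illustrative choice.

```php
<?php
// SPELLOUT selects the locale's rule-based "spelled-out" format.
$speller = new NumberFormatter('en_US', NumberFormatter::SPELLOUT);
$words = $speller->format(42);

echo $words, "\n"; // forty-two
```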

Class synopsis

NumberFormatter
class NumberFormatter {
/* Methods */
public __construct ( string $locale , int $style [, string $pattern ] )
public static NumberFormatter create ( string $locale , int $style [, string $pattern ] )
public string formatCurrency ( float $value , string $currency )
public string format ( number $value [, int $type ] )
public int getAttribute ( int $attr )
public int getErrorCode ( void )
public string getErrorMessage ( void )
public string getLocale ([ int $type ] )
public string getPattern ( void )
public string getSymbol ( int $attr )
public string getTextAttribute ( int $attr )
public float parseCurrency ( string $value , string &$currency [, int &$position ] )
public mixed parse ( string $value [, int $type [, int &$position ]] )
public bool setAttribute ( int $attr , int $value )
public bool setPattern ( string $pattern )
public bool setSymbol ( int $attr , string $value )
public bool setTextAttribute ( int $attr , string $value )
}

Predefined Constants

These styles are used by numfmt_create to define the type of the formatter.

NumberFormatter::PATTERN_DECIMAL (integer)
Decimal format defined by pattern
NumberFormatter::DECIMAL (integer)
Decimal format
NumberFormatter::CURRENCY (integer)
Currency format
NumberFormatter::PERCENT (integer)
Percent format
NumberFormatter::SCIENTIFIC (integer)
Scientific format
NumberFormatter::SPELLOUT (integer)
Spellout rule-based format
NumberFormatter::ORDINAL (integer)
Ordinal rule-based format
NumberFormatter::DURATION (integer)
Duration rule-based format
NumberFormatter::PATTERN_RULEBASED (integer)
Rule-based format defined by pattern
NumberFormatter::DEFAULT_STYLE (integer)
Default format for the locale
NumberFormatter::IGNORE (integer)
Alias for PATTERN_DECIMAL

These constants define how the numbers are parsed or formatted. They should be used as arguments to numfmt_format and numfmt_parse.

NumberFormatter::TYPE_DEFAULT (integer)
Derive the type from variable type
NumberFormatter::TYPE_INT32 (integer)
Format/parse as 32-bit integer
NumberFormatter::TYPE_INT64 (integer)
Format/parse as 64-bit integer
NumberFormatter::TYPE_DOUBLE (integer)
Format/parse as floating point value
NumberFormatter::TYPE_CURRENCY (integer)
Format/parse as currency value
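
As an illustration of the type constants, the sketch below parses the same kind of input into different PHP types. It assumes the intl extension is loaded; the variable names are illustrative.

```php
<?php
$fmt = new NumberFormatter('en_US', NumberFormatter::DECIMAL);

// Request an integer result.
$asInt = $fmt->parse('1,234', NumberFormatter::TYPE_INT32);

// Request a floating point result.
$asDouble = $fmt->parse('1,234.5', NumberFormatter::TYPE_DOUBLE);

var_dump($asInt);    // int(1234)
var_dump($asDouble); // float(1234.5)
```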

Number format attributes used by numfmt_get_attribute and numfmt_set_attribute.

NumberFormatter::PARSE_INT_ONLY (integer)
Parse integers only.
NumberFormatter::GROUPING_USED (integer)
Use grouping separator.
NumberFormatter::DECIMAL_ALWAYS_SHOWN (integer)
Always show decimal point.
NumberFormatter::MAX_INTEGER_DIGITS (integer)
Maximum integer digits.
NumberFormatter::MIN_INTEGER_DIGITS (integer)
Minimum integer digits.
NumberFormatter::INTEGER_DIGITS (integer)
Integer digits.
NumberFormatter::MAX_FRACTION_DIGITS (integer)
Maximum fraction digits.
NumberFormatter::MIN_FRACTION_DIGITS (integer)
Minimum fraction digits.
NumberFormatter::FRACTION_DIGITS (integer)
Fraction digits.
NumberFormatter::MULTIPLIER (integer)
Multiplier.
NumberFormatter::GROUPING_SIZE (integer)
Grouping size.
NumberFormatter::ROUNDING_MODE (integer)
Rounding Mode.
NumberFormatter::ROUNDING_INCREMENT (integer)
Rounding increment.
NumberFormatter::FORMAT_WIDTH (integer)
The width to which the output of format() is padded.
NumberFormatter::PADDING_POSITION (integer)
The position at which padding will take place. See pad position constants for possible argument values.
NumberFormatter::SECONDARY_GROUPING_SIZE (integer)
Secondary grouping size.
NumberFormatter::SIGNIFICANT_DIGITS_USED (integer)
Use significant digits.
NumberFormatter::MIN_SIGNIFICANT_DIGITS (integer)
Minimum significant digits.
NumberFormatter::MAX_SIGNIFICANT_DIGITS (integer)
Maximum significant digits.
NumberFormatter::LENIENT_PARSE (integer)
Lenient parse mode used by rule-based formats.
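
A short sketch of how a few of these attributes change the output, assuming the intl extension is loaded; the values chosen are illustrative.

```php
<?php
$fmt = new NumberFormatter('en_US', NumberFormatter::DECIMAL);

// Always show exactly two fraction digits.
$fmt->setAttribute(NumberFormatter::MIN_FRACTION_DIGITS, 2);
$fmt->setAttribute(NumberFormatter::MAX_FRACTION_DIGITS, 2);
$grouped = $fmt->format(1234.5);

// Turn the grouping separator off.
$fmt->setAttribute(NumberFormatter::GROUPING_USED, 0);
$ungrouped = $fmt->format(1234.5);

echo $grouped, "\n";   // 1,234.50
echo $ungrouped, "\n"; // 1234.50
```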

Number format text attributes used by numfmt_get_text_attribute and numfmt_set_text_attribute.

NumberFormatter::POSITIVE_PREFIX (integer)
Positive prefix.
NumberFormatter::POSITIVE_SUFFIX (integer)
Positive suffix.
NumberFormatter::NEGATIVE_PREFIX (integer)
Negative prefix.
NumberFormatter::NEGATIVE_SUFFIX (integer)
Negative suffix.
NumberFormatter::PADDING_CHARACTER (integer)
The character used to pad to the format width.
NumberFormatter::CURRENCY_CODE (integer)
The ISO currency code.
NumberFormatter::DEFAULT_RULESET (integer)
The default rule set. This is only available with rule-based formatters.
NumberFormatter::PUBLIC_RULESETS (integer)
The public rule sets. This is only available with rule-based formatters. This is a read-only attribute. The public rulesets are returned as a single string, with each ruleset name delimited by ';' (semicolon).
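
Text attributes are read and written as strings. The sketch below reads the currency code of a currency formatter and sets an accounting-style negative prefix and suffix on a decimal formatter. It assumes the intl extension is loaded; the parentheses are an illustrative choice.

```php
<?php
// Reading a text attribute: the default currency for the locale.
$currency = new NumberFormatter('en_US', NumberFormatter::CURRENCY);
$code = $currency->getTextAttribute(NumberFormatter::CURRENCY_CODE);

// Writing text attributes: wrap negative numbers in parentheses.
$decimal = new NumberFormatter('en_US', NumberFormatter::DECIMAL);
$decimal->setTextAttribute(NumberFormatter::NEGATIVE_PREFIX, '(');
$decimal->setTextAttribute(NumberFormatter::NEGATIVE_SUFFIX, ')');
$negative = $decimal->format(-5);

echo $code, "\n";
echo $negative, "\n";
```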

Number format symbols used by numfmt_get_symbol and numfmt_set_symbol.

NumberFormatter::DECIMAL_SEPARATOR_SYMBOL (integer)
The decimal separator.
NumberFormatter::GROUPING_SEPARATOR_SYMBOL (integer)
The grouping separator.
NumberFormatter::PATTERN_SEPARATOR_SYMBOL (integer)
The pattern separator.
NumberFormatter::PERCENT_SYMBOL (integer)
The percent sign.
NumberFormatter::ZERO_DIGIT_SYMBOL (integer)
Zero.
NumberFormatter::DIGIT_SYMBOL (integer)
Character representing a digit in the pattern.
NumberFormatter::MINUS_SIGN_SYMBOL (integer)
The minus sign.
NumberFormatter::PLUS_SIGN_SYMBOL (integer)
The plus sign.
NumberFormatter::CURRENCY_SYMBOL (integer)
The currency symbol.
NumberFormatter::INTL_CURRENCY_SYMBOL (integer)
The international currency symbol.
NumberFormatter::MONETARY_SEPARATOR_SYMBOL (integer)
The monetary separator.
NumberFormatter::EXPONENTIAL_SYMBOL (integer)
The exponential symbol.
NumberFormatter::PERMILL_SYMBOL (integer)
Per mill symbol.
NumberFormatter::PAD_ESCAPE_SYMBOL (integer)
Escape padding character.
NumberFormatter::INFINITY_SYMBOL (integer)
Infinity symbol.
NumberFormatter::NAN_SYMBOL (integer)
Not-a-number symbol.
NumberFormatter::SIGNIFICANT_DIGIT_SYMBOL (integer)
Significant digit symbol.
NumberFormatter::MONETARY_GROUPING_SEPARATOR_SYMBOL (integer)
The monetary grouping separator.
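
A minimal sketch of reading and overriding symbols, assuming the intl extension is loaded; the apostrophe separator is an illustrative choice.

```php
<?php
// Read the symbols a locale uses.
$de = new NumberFormatter('de_DE', NumberFormatter::DECIMAL);
$deDecimal  = $de->getSymbol(NumberFormatter::DECIMAL_SEPARATOR_SYMBOL);
$deGrouping = $de->getSymbol(NumberFormatter::GROUPING_SEPARATOR_SYMBOL);

// Override a symbol on a single formatter instance.
$custom = new NumberFormatter('en_US', NumberFormatter::DECIMAL);
$custom->setSymbol(NumberFormatter::GROUPING_SEPARATOR_SYMBOL, "'");
$formatted = $custom->format(1234567);

echo $deDecimal, ' ', $deGrouping, "\n"; // , .
echo $formatted, "\n";                   // 1'234'567
```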

Rounding mode values used by numfmt_get_attribute and numfmt_set_attribute with NumberFormatter::ROUNDING_MODE attribute.

NumberFormatter::ROUND_CEILING (integer)
Rounding mode to round towards positive infinity.
NumberFormatter::ROUND_DOWN (integer)
Rounding mode to round towards zero.
NumberFormatter::ROUND_FLOOR (integer)
Rounding mode to round towards negative infinity.
NumberFormatter::ROUND_HALFDOWN (integer)
Rounding mode to round towards "nearest neighbor" unless both neighbors are equidistant, in which case round down.
NumberFormatter::ROUND_HALFEVEN (integer)
Rounding mode to round towards the "nearest neighbor" unless both neighbors are equidistant, in which case, round towards the even neighbor.
NumberFormatter::ROUND_HALFUP (integer)
Rounding mode to round towards "nearest neighbor" unless both neighbors are equidistant, in which case round up.
NumberFormatter::ROUND_UP (integer)
Rounding mode to round away from zero.
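
The rounding modes can be compared side by side by rounding to zero fraction digits. This is a minimal sketch, assuming the intl extension is loaded; `roundWith` is a hypothetical helper introduced only for this example.

```php
<?php
// Hypothetical helper: format $value with no fraction digits using $mode.
function roundWith(int $mode, float $value): string
{
    $fmt = new NumberFormatter('en_US', NumberFormatter::DECIMAL);
    $fmt->setAttribute(NumberFormatter::MAX_FRACTION_DIGITS, 0);
    $fmt->setAttribute(NumberFormatter::ROUNDING_MODE, $mode);
    return $fmt->format($value);
}

$results = [
    'halfeven 2.5' => roundWith(NumberFormatter::ROUND_HALFEVEN, 2.5), // 2
    'halfeven 3.5' => roundWith(NumberFormatter::ROUND_HALFEVEN, 3.5), // 4
    'halfup 2.5'   => roundWith(NumberFormatter::ROUND_HALFUP, 2.5),   // 3
    'halfdown 2.5' => roundWith(NumberFormatter::ROUND_HALFDOWN, 2.5), // 2
    'ceiling 2.1'  => roundWith(NumberFormatter::ROUND_CEILING, 2.1),  // 3
    'floor 2.9'    => roundWith(NumberFormatter::ROUND_FLOOR, 2.9),    // 2
    'down 2.9'     => roundWith(NumberFormatter::ROUND_DOWN, 2.9),     // 2
    'up 2.1'       => roundWith(NumberFormatter::ROUND_UP, 2.1),       // 3
];

print_r($results);
```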

Pad position values used by numfmt_get_attribute and numfmt_set_attribute with NumberFormatter::PADDING_POSITION attribute.

NumberFormatter::PAD_AFTER_PREFIX (integer)
Pad characters inserted after the prefix.
NumberFormatter::PAD_AFTER_SUFFIX (integer)
Pad characters inserted after the suffix.
NumberFormatter::PAD_BEFORE_PREFIX (integer)
Pad characters inserted before the prefix.
NumberFormatter::PAD_BEFORE_SUFFIX (integer)
Pad characters inserted before the suffix.
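
Padding combines three settings: FORMAT_WIDTH, PADDING_CHARACTER, and PADDING_POSITION. The sketch below assumes the intl extension is loaded; the width of 8 and the `*` pad character are illustrative.

```php
<?php
$fmt = new NumberFormatter('en_US', NumberFormatter::DECIMAL);
$fmt->setAttribute(NumberFormatter::FORMAT_WIDTH, 8);
$fmt->setTextAttribute(NumberFormatter::PADDING_CHARACTER, '*');

// Pad before the prefix (here the minus sign of a negative number).
$fmt->setAttribute(NumberFormatter::PADDING_POSITION, NumberFormatter::PAD_BEFORE_PREFIX);
$beforePrefix = $fmt->format(-42);

// Pad between the prefix and the digits.
$fmt->setAttribute(NumberFormatter::PADDING_POSITION, NumberFormatter::PAD_AFTER_PREFIX);
$afterPrefix = $fmt->format(-42);

echo $beforePrefix, "\n";
echo $afterPrefix, "\n";
```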

The Locale class

Introduction

A "Locale" is an identifier used to get language, culture, or regionally-specific behavior from an API. PHP locales are organized and identified the same way as the CLDR locales used by ICU (and by many vendors of Unix-like operating systems, the Mac, Java, and so forth). Locales are identified using RFC 4646 language tags (which use hyphens, not underscores) in addition to the more traditional underscore-using identifiers. Unless otherwise noted, the functions in this class are tolerant of both formats.

Examples of identifiers include:

  • en-US (English, United States)
  • zh-Hant-TW (Chinese, Traditional Script, Taiwan)
  • fr-CA, fr-FR (French for Canada and France respectively)

The Locale class (and related procedural functions) are used to interact with locale identifiers--to verify that an ID is well-formed, valid, etc. The extensions used by CLDR in UAX #35 (and inherited by ICU) are valid and used wherever they would be in ICU normally.

Locales cannot be instantiated as objects. All of the functions/methods provided are static.

The null or empty string obtains the "root" locale. The "root" locale is equivalent to "en_US_POSIX" in CLDR. Language tags (and thus locale identifiers) are case insensitive. There exists a canonicalization function to make case match the specification.
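
A minimal sketch of the static methods applied to the identifiers listed earlier, assuming the intl extension is loaded; the variable names are illustrative.

```php
<?php
// Canonicalization fixes the case and the separator.
$canonical = Locale::canonicalize('en-us');

// Hyphenated language tags are accepted by the accessor methods.
$language = Locale::getPrimaryLanguage('fr-CA');
$region   = Locale::getRegion('fr-CA');
$script   = Locale::getScript('zh-Hant-TW');

echo $canonical, "\n"; // en_US
echo $language, ' ', $region, ' ', $script, "\n"; // fr CA Hant
```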

Class synopsis

Locale
class Locale {
/* Methods */
public static string acceptFromHttp ( string $header )
public static string canonicalize ( string $locale )
public static string composeLocale ( array $subtags )
public static bool filterMatches ( string $langtag , string $locale [, bool $canonicalize = false ] )
public static array getAllVariants ( string $locale )
public static string getDefault ( void )
public static string getDisplayLanguage ( string $locale [, string $in_locale ] )
public static string getDisplayName ( string $locale [, string $in_locale ] )
public static string getDisplayRegion ( string $locale [, string $in_locale ] )
public static string getDisplayScript ( string $locale [, string $in_locale ] )
public static string getDisplayVariant ( string $locale [, string $in_locale ] )
public static array getKeywords ( string $locale )
public static string getPrimaryLanguage ( string $locale )
public static string getRegion ( string $locale )
public static string getScript ( string $locale )
public static string lookup ( array $langtag , string $locale [, bool $canonicalize = false [, string $default ]] )
public static array parseLocale ( string $locale )
public static bool setDefault ( string $locale )
}

Predefined Constants

Locale::DEFAULT_LOCALE (null)
Used as the locale parameter with the methods of the various locale-affected classes, such as NumberFormatter. This constant makes the methods use the default locale.

These constants describe the choice of the locale for the getLocale method of different classes.

Locale::ACTUAL_LOCALE (string)
This is the locale the data actually comes from.
Locale::VALID_LOCALE (string)
This is the most specific locale supported by ICU.

These constants define how the Locales are parsed or composed. They should be used as keys in the argument array to locale_compose and are returned from locale_parse as keys of the returned associative array.

Locale::LANG_TAG (string)
Language subtag
Locale::EXTLANG_TAG (string)
Extended language subtag
Locale::SCRIPT_TAG (string)
Script subtag
Locale::REGION_TAG (string)
Region subtag
Locale::VARIANT_TAG (string)
Variant subtag
Locale::GRANDFATHERED_LANG_TAG (string)
Grandfathered Language subtag
Locale::PRIVATE_TAG (string)
Private subtag
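
The subtag constants serve as array keys in both directions. The sketch below assumes the intl extension is loaded; the variable names are illustrative.

```php
<?php
// Parsing returns an associative array keyed by the subtag constants.
$parts = Locale::parseLocale('zh-Hant-TW');
$language = $parts[Locale::LANG_TAG];
$script   = $parts[Locale::SCRIPT_TAG];
$region   = $parts[Locale::REGION_TAG];

// Composing takes the same keys and builds an identifier.
$composed = Locale::composeLocale([
    Locale::LANG_TAG   => 'zh',
    Locale::SCRIPT_TAG => 'Hant',
    Locale::REGION_TAG => 'TW',
]);

echo $language, ' ', $script, ' ', $region, "\n"; // zh Hant TW
echo $composed, "\n";                             // zh_Hant_TW
```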

The Normalizer class

Introduction

Normalization is a process that involves transforming characters and sequences of characters into a formally-defined underlying representation. This process is most important when text needs to be compared for sorting and searching, but it is also used when storing text to ensure that the text is stored in a consistent representation.

The Unicode Consortium has defined a number of normalization forms reflecting the various needs of applications:

  • Normalization Form D (NFD) - Canonical Decomposition
  • Normalization Form C (NFC) - Canonical Decomposition followed by Canonical Composition
  • Normalization Form KD (NFKD) - Compatibility Decomposition
  • Normalization Form KC (NFKC) - Compatibility Decomposition followed by Canonical Composition

The different forms are defined in terms of a set of transformations on the text, transformations that are expressed by both an algorithm and a set of data files.
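
A minimal sketch of the four forms in practice, assuming the intl extension is loaded and PHP 7 or later (for the `\u{...}` escapes); the sample characters are illustrative.

```php
<?php
$decomposed = "e\u{0301}"; // "e" followed by a combining acute accent
$composed   = "\u{00E9}";  // precomposed "é"

// NFC composes, NFD decomposes.
$toNfc = Normalizer::normalize($decomposed, Normalizer::FORM_C);
$toNfd = Normalizer::normalize($composed, Normalizer::FORM_D);

// The decomposed string is not in NFC.
$isNfc = Normalizer::isNormalized($decomposed, Normalizer::FORM_C);

// Compatibility forms also fold characters such as the "fi" ligature (U+FB01).
$toNfkc = Normalizer::normalize("\u{FB01}", Normalizer::FORM_KC);

var_dump($toNfc === $composed, $toNfd === $decomposed, $isNfc, $toNfkc);
```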

Class synopsis

Normalizer
class Normalizer {
/* Methods */
public static bool isNormalized ( string $input [, string $form = Normalizer::FORM_C ] )
public static string normalize ( string $input [, string $form = Normalizer::FORM_C ] )
}

Predefined Constants

The following constants define the normalization form used by the normalizer:

Normalizer::FORM_C (string)
Normalization Form C (NFC) - Canonical Decomposition followed by Canonical Composition
Normalizer::FORM_D (string)
Normalization Form D (NFD) - Canonical Decomposition
Normalizer::FORM_KC (string)
Normalization Form KC (NFKC) - Compatibility Decomposition, followed by Canonical Composition
Normalizer::FORM_KD (string)
Normalization Form KD (NFKD) - Compatibility Decomposition
Normalizer::NONE (string)
No decomposition/composition
Normalizer::OPTION_DEFAULT (string)
Default normalization options

The MessageFormatter class

Introduction

MessageFormatter is a concrete class that enables users to produce concatenated, language-neutral messages. The methods supplied in this class are used to build all the messages that are seen by end users.

The MessageFormatter class assembles messages from various fragments (such as text fragments, numbers, and dates) supplied by the program. Because of the MessageFormatter class, the program does not need to know the order of the fragments. The class uses the formatting specifications for the fragments to assemble them into a message that is contained in a single string within a resource bundle. For example, MessageFormatter enables you to print the phrase "Finished printing x out of y files..." in a manner that still allows for flexibility in translation.

Previously, an end user message was created as a sentence and handled as a string. This procedure created problems for localizers because the sentence structure, word order, number format and so on are very different from language to language. The language-neutral way to create messages keeps each part of the message separate and provides keys to the data. Using these keys, the MessageFormatter class can concatenate the parts of the message, localize them, and display a well-formed string to the end user.

MessageFormatter takes a set of objects, formats them, and then inserts the formatted strings into the pattern at the appropriate places. Choice formats can be used in conjunction with MessageFormatter to handle plurals, match numbers, and select from an array of items. Typically, the message format will come from resources and the arguments will be dynamically set at runtime.
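
As a rough sketch of the "x out of y files" case, the same arguments can be fed to two patterns that place them in a different order. This assumes the intl extension is loaded; both patterns are illustrative and would normally come from translated resources.

```php
<?php
$args = [3, 10];

// The arguments appear in source order.
$inOrder = MessageFormatter::formatMessage(
    'en_US',
    'Finished printing {0,number,integer} out of {1,number,integer} files...',
    $args
);

// A different wording can reorder the arguments without code changes.
$reordered = MessageFormatter::formatMessage(
    'en_US',
    'Of {1,number,integer} files, {0,number,integer} finished printing.',
    $args
);

echo $inOrder, "\n";
echo $reordered, "\n";
```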

Class synopsis

MessageFormatter
class MessageFormatter {
/* Methods */
public __construct ( string $locale , string $pattern )
public static MessageFormatter create ( string $locale , string $pattern )
public static string formatMessage ( string $locale , string $pattern , array $args )
public string format ( array $args )
public int getErrorCode ( void )
public string getErrorMessage ( void )
public string getLocale ( void )
public string getPattern ( void )
public static array parseMessage ( string $locale , string $pattern , string $source )
public array parse ( string $value )
public bool setPattern ( string $pattern )
}

The IntlCalendar class

Introduction

Class synopsis

IntlCalendar
class IntlCalendar {
/* Constants */
const integer IntlCalendar::FIELD_ERA = 0 ;
const integer IntlCalendar::FIELD_YEAR = 1 ;
const integer IntlCalendar::FIELD_MONTH = 2 ;
const integer IntlCalendar::FIELD_WEEK_OF_YEAR = 3 ;
const integer IntlCalendar::FIELD_WEEK_OF_MONTH = 4 ;
const integer IntlCalendar::FIELD_DATE = 5 ;
const integer IntlCalendar::FIELD_DAY_OF_YEAR = 6 ;
const integer IntlCalendar::FIELD_DAY_OF_WEEK = 7 ;
const integer IntlCalendar::FIELD_DAY_OF_WEEK_IN_MONTH = 8 ;
const integer IntlCalendar::FIELD_AM_PM = 9 ;
const integer IntlCalendar::FIELD_HOUR = 10 ;
const integer IntlCalendar::FIELD_HOUR_OF_DAY = 11 ;
const integer IntlCalendar::FIELD_MINUTE = 12 ;
const integer IntlCalendar::FIELD_SECOND = 13 ;
const integer IntlCalendar::FIELD_MILLISECOND = 14 ;
const integer IntlCalendar::FIELD_ZONE_OFFSET = 15 ;
const integer IntlCalendar::FIELD_DST_OFFSET = 16 ;
const integer IntlCalendar::FIELD_YEAR_WOY = 17 ;
const integer IntlCalendar::FIELD_DOW_LOCAL = 18 ;
const integer IntlCalendar::FIELD_EXTENDED_YEAR = 19 ;
const integer IntlCalendar::FIELD_JULIAN_DAY = 20 ;
const integer IntlCalendar::FIELD_MILLISECONDS_IN_DAY = 21 ;
const integer IntlCalendar::FIELD_IS_LEAP_MONTH = 22 ;
const integer IntlCalendar::FIELD_FIELD_COUNT = 23 ;
const integer IntlCalendar::FIELD_DAY_OF_MONTH = 5 ;
const integer IntlCalendar::DOW_SUNDAY = 1 ;
const integer IntlCalendar::DOW_MONDAY = 2 ;
const integer IntlCalendar::DOW_TUESDAY = 3 ;
const integer IntlCalendar::DOW_WEDNESDAY = 4 ;
const integer IntlCalendar::DOW_THURSDAY = 5 ;
const integer IntlCalendar::DOW_FRIDAY = 6 ;
const integer IntlCalendar::DOW_SATURDAY = 7 ;
const integer IntlCalendar::DOW_TYPE_WEEKDAY = 0 ;
const integer IntlCalendar::DOW_TYPE_WEEKEND = 1 ;
const integer IntlCalendar::DOW_TYPE_WEEKEND_OFFSET = 2 ;
const integer IntlCalendar::DOW_TYPE_WEEKEND_CEASE = 3 ;
const integer IntlCalendar::WALLTIME_FIRST = 1 ;
const integer IntlCalendar::WALLTIME_LAST = 0 ;
const integer IntlCalendar::WALLTIME_NEXT_VALID = 2 ;
/* Methods */
public bool add ( int $field , int $amount )
bool intlcal_add ( IntlCalendar $cal , int $field , int $amount )
public bool after ( IntlCalendar $other )
bool intlcal_after ( IntlCalendar $cal , IntlCalendar $other )
public bool before ( IntlCalendar $other )
bool intlcal_before ( IntlCalendar $cal , IntlCalendar $other )
public bool clear ([ int $field = NULL ] )
bool intlcal_clear ( IntlCalendar $cal [, int $field = NULL ] )
private __construct ( void )
public static IntlCalendar createInstance ([ mixed $timeZone = NULL [, string $locale = "" ]] )
IntlCalendar intlcal_create_instance ([ mixed $timeZone = NULL [, string $locale = "" ]] )
public bool equals ( IntlCalendar $other )
bool intlcal_equals ( IntlCalendar $cal , IntlCalendar $other )
public int fieldDifference ( float $when , int $field )
int intlcal_field_difference ( IntlCalendar $cal , float $when , int $field )
public static IntlCalendar fromDateTime ( mixed $dateTime )
IntlCalendar intlcal_from_date_time ( mixed $dateTime )
public int get ( int $field )
int intlcal_get ( IntlCalendar $cal , int $field )
public int getActualMaximum ( int $field )
int intlcal_get_actual_maximum ( IntlCalendar $cal , int $field )
public int getActualMinimum ( int $field )
int intlcal_get_actual_minimum ( IntlCalendar $cal , int $field )
public static array getAvailableLocales ( void )
array intlcal_get_available_locales ( void )
public int getDayOfWeekType ( int $dayOfWeek )
int intlcal_get_day_of_week_type ( IntlCalendar $cal , int $dayOfWeek )
public int getErrorCode ( void )
public string getErrorMessage ( void )
public int getFirstDayOfWeek ( void )
int intlcal_get_first_day_of_week ( IntlCalendar $cal )
public int getGreatestMinimum ( int $field )
int intlcal_get_greatest_minimum ( IntlCalendar $cal , int $field )
public static Iterator getKeywordValuesForLocale ( string $key , string $locale , bool $commonlyUsed )
Iterator intlcal_get_keyword_values_for_locale ( string $key , string $locale , bool $commonlyUsed )
public int getLeastMaximum ( int $field )
int intlcal_get_least_maximum ( IntlCalendar $cal , int $field )
public string getLocale ( int $localeType )
string intlcal_get_locale ( IntlCalendar $cal , int $localeType )
public int getMaximum ( int $field )
int intlcal_get_maximum ( IntlCalendar $cal , int $field )
public int getMinimalDaysInFirstWeek ( void )
int intlcal_get_minimal_days_in_first_week ( IntlCalendar $cal )
public int getMinimum ( int $field )
int intlcal_get_minimum ( IntlCalendar $cal , int $field )
public static float getNow ( void )
float intlcal_get_now ( void )
public int getRepeatedWallTimeOption ( void )
int intlcal_get_repeated_wall_time_option ( IntlCalendar $cal )
public int getSkippedWallTimeOption ( void )
int intlcal_get_skipped_wall_time_option ( IntlCalendar $cal )
public float getTime ( void )
float intlcal_get_time ( IntlCalendar $cal )
public IntlTimeZone getTimeZone ( void )
IntlTimeZone intlcal_get_time_zone ( IntlCalendar $cal )
public string getType ( void )
string intlcal_get_type ( IntlCalendar $cal )
public int getWeekendTransition ( int $dayOfWeek )
int intlcal_get_weekend_transition ( IntlCalendar $cal , int $dayOfWeek )
public bool inDaylightTime ( void )
bool intlcal_in_daylight_time ( IntlCalendar $cal )
public bool isEquivalentTo ( IntlCalendar $other )
bool intlcal_is_equivalent_to ( IntlCalendar $cal , IntlCalendar $other )
public bool isLenient ( void )
bool intlcal_is_lenient ( IntlCalendar $cal )
public bool isSet ( int $field )
bool intlcal_is_set ( IntlCalendar $cal , int $field )
public bool isWeekend ([ float $date = NULL ] )
bool intlcal_is_weekend ( IntlCalendar $cal [, float $date = NULL ] )
public bool roll ( int $field , mixed $amountOrUpOrDown )
bool intlcal_roll ( IntlCalendar $cal , int $field , mixed $amountOrUpOrDown )
public bool set ( int $field , int $value )
public bool set ( int $year , int $month [, int $dayOfMonth = NULL [, int $hour = NULL [, int $minute = NULL [, int $second = NULL ]]]] )
bool intlcal_set ( IntlCalendar $cal , int $field , int $value )
bool intlcal_set ( IntlCalendar $cal , int $year , int $month [, int $dayOfMonth = NULL [, int $hour = NULL [, int $minute = NULL [, int $second = NULL ]]]] )
public bool setFirstDayOfWeek ( int $dayOfWeek )
bool intlcal_set_first_day_of_week ( IntlCalendar $cal , int $dayOfWeek )
public bool setLenient ( bool $isLenient )
bool intlcal_set_lenient ( IntlCalendar $cal , bool $isLenient )
public bool setMinimalDaysInFirstWeek ( int $minimalDays )
bool intlcal_set_minimal_days_in_first_week ( IntlCalendar $cal , int $minimalDays )
public bool setRepeatedWallTimeOption ( int $wallTimeOption )
bool intlcal_set_repeated_wall_time_option ( IntlCalendar $cal , int $wallTimeOption )
public bool setSkippedWallTimeOption ( int $wallTimeOption )
bool intlcal_set_skipped_wall_time_option ( IntlCalendar $cal , int $wallTimeOption )
public bool setTime ( float $date )
bool intlcal_set_time ( IntlCalendar $cal , float $date )
public bool setTimeZone ( mixed $timeZone )
bool intlcal_set_time_zone ( IntlCalendar $cal , mixed $timeZone )
public DateTime toDateTime ( void )
DateTime intlcal_to_date_time ( IntlCalendar $cal )
}

Predefined Constants

IntlCalendar::FIELD_ERA

Calendar field numerically representing an era, for instance 1 for AD and 0 for BC in the Gregorian/Julian calendars and 235 for the Heisei (平成) era in the Japanese calendar. Not all calendars have more than one era.

IntlCalendar::FIELD_YEAR

Calendar field for the year. This is not unique across eras. If the calendar type has more than one era, generally the minimum value for this field will be 1.

IntlCalendar::FIELD_MONTH

Calendar field for the month. The month sequence is zero-based, so January (here used to signify the first month of the calendar; this may be called another name, such as Muharram in the Islamic calendar) is represented by 0, February by 1, …, December by 11 and, for calendars that have it, the 13th or leap month by 12.
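
The zero-based month is easy to get wrong, so here is a minimal sketch. It assumes the intl extension is loaded; the UTC time zone, the `en_US` locale (which yields a Gregorian calendar), and the date are illustrative.

```php
<?php
$cal = IntlCalendar::createInstance('UTC', 'en_US');
$cal->clear();

// 15 January 2024: January is month 0, not 1.
$cal->set(IntlCalendar::FIELD_YEAR, 2024);
$cal->set(IntlCalendar::FIELD_MONTH, 0);
$cal->set(IntlCalendar::FIELD_DAY_OF_MONTH, 15);

$month = $cal->get(IntlCalendar::FIELD_MONTH);

// DateTime, by contrast, uses one-based months.
$iso = $cal->toDateTime()->format('Y-m-d');

var_dump($month); // int(0)
echo $iso, "\n";  // 2024-01-15
```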

IntlCalendar::FIELD_WEEK_OF_YEAR

Calendar field for the number of the week of the year. This depends on which day of the week is deemed to start the week and the minimal number of days in a week.

IntlCalendar::FIELD_WEEK_OF_MONTH

Calendar field for the number of the week of the month. This depends on which day of the week is deemed to start the week and the minimal number of days in a week.

IntlCalendar::FIELD_DATE

Calendar field for the day of the month. The same as IntlCalendar::FIELD_DAY_OF_MONTH, which has a clearer name.

IntlCalendar::FIELD_DAY_OF_YEAR

Calendar field for the day of the year. For the Gregorian calendar, starts with 1 and ends with 365 or 366.

IntlCalendar::FIELD_DAY_OF_WEEK

Calendar field for the day of the week. Its values start with 1 (Sunday, see IntlCalendar::DOW_SUNDAY and subsequent constants) and the last valid value is 7 (Saturday).

IntlCalendar::FIELD_DAY_OF_WEEK_IN_MONTH

Given a day of the week (Sunday, Monday, …), this calendar field assigns an ordinal to such a day of the week in a specific month. Thus, if the value of this field is 1 and the value of the day of the week is 2 (Monday), then the set day of the month is the 1st Monday of the month; the maximum value is 5.

Additionally, the value 0 and negative values are also allowed. The value 0 encompasses the seven days that occur immediately before the first seven days of a month (which therefore have a ‘day of week in month’ with value 1). Negative values start counting from the end of the month: -1 points to the last occurrence of a day of the week in a month, -2 to the second last, and so on.

Unlike IntlCalendar::FIELD_WEEK_OF_MONTH and IntlCalendar::FIELD_WEEK_OF_YEAR, this value does not depend on IntlCalendar::getFirstDayOfWeek or on IntlCalendar::getMinimalDaysInFirstWeek. The first Monday is the first Monday, even if it occurs in a week that belongs to the previous month.
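The following sketch, assuming the intl extension is loaded, reads this field for 4 February 2013 and compares it with the simple arithmetic implied above for positive values (days 1 to 7 have ordinal 1, days 8 to 14 have ordinal 2, and so on). The date and variable names are chosen for illustration.

```php
<?php
// Sketch only: requires the intl extension.
$dayOfMonth = 4;

$cal = IntlCalendar::createInstance('UTC', 'en_US');
$cal->clear();
$cal->set(IntlCalendar::FIELD_YEAR, 2013);
$cal->set(IntlCalendar::FIELD_MONTH, 1); // February (zero-based month index)
$cal->set(IntlCalendar::FIELD_DAY_OF_MONTH, $dayOfMonth);

$dayOfWeek = $cal->get(IntlCalendar::FIELD_DAY_OF_WEEK);
$ordinal   = $cal->get(IntlCalendar::FIELD_DAY_OF_WEEK_IN_MONTH);

// For positive ordinals the value depends only on the day of the month.
$byArithmetic = intdiv($dayOfMonth - 1, 7) + 1;

printf("day of week %d, ordinal %d, arithmetic %d\n", $dayOfWeek, $ordinal, $byArithmetic);
```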

IntlCalendar::FIELD_AM_PM

Calendar field indicating whether a time is before noon (value 0, AM) or after (1). Midnight is AM, noon is PM.

IntlCalendar::FIELD_HOUR

Calendar field for the hour, without specifying whether itʼs in the morning or in the afternoon. Valid values are 0 to 11.

IntlCalendar::FIELD_HOUR_OF_DAY

Calendar field for the full (24h) hour of the day. Valid values are 0 to 23.

IntlCalendar::FIELD_MINUTE

Calendar field for the minutes component of the time.

IntlCalendar::FIELD_SECOND

Calendar field for the seconds component of the time.

IntlCalendar::FIELD_MILLISECOND

Calendar field for the milliseconds component of the time.

IntlCalendar::FIELD_ZONE_OFFSET

Calendar field indicating the raw offset of the timezone, in milliseconds. The raw offset is the timezone offset, excluding any offset due to daylight saving time.

IntlCalendar::FIELD_DST_OFFSET

Calendar field for the daylight saving time offset of the calendarʼs timezone, in milliseconds, if active for the calendarʼs time.

IntlCalendar::FIELD_YEAR_WOY

Calendar field representing the year for week of year purposes.

IntlCalendar::FIELD_DOW_LOCAL

Calendar field for the localized day of the week. This is a value between 1 and 7, 1 being used for the day of the week that matches the value returned by IntlCalendar::getFirstDayOfWeek.

IntlCalendar::FIELD_EXTENDED_YEAR

Calendar field for a year number representation that is continuous across eras. For the Gregorian calendar, the value of this field matches that of IntlCalendar::FIELD_YEAR for AD years; a BC year y is represented by -y + 1.
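The Gregorian rule stated above can be written as a small pure-PHP function. This is only a sketch: extendedYear() is a hypothetical helper, not part of the intl extension, and it assumes the era values given for IntlCalendar::FIELD_ERA (1 for AD, 0 for BC).

```php
<?php
/**
 * Hypothetical helper: converts a Gregorian (era, year) pair to the
 * continuous year number described for FIELD_EXTENDED_YEAR.
 */
function extendedYear(int $era, int $year): int
{
    // Era 1 is AD: the extended year equals the year itself.
    // Era 0 is BC: a BC year y maps to -y + 1, so 1 BC becomes 0.
    return $era === 1 ? $year : -$year + 1;
}

printf("%d %d %d\n", extendedYear(1, 2013), extendedYear(0, 1), extendedYear(0, 44));
```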

IntlCalendar::FIELD_JULIAN_DAY

Calendar field for a modified Julian day number. It is different from a conventional Julian day number in that its transitions occur at local zone midnight rather than at noon UTC. It uniquely identifies a date.

IntlCalendar::FIELD_MILLISECONDS_IN_DAY

Calendar field encompassing the information in IntlCalendar::FIELD_HOUR_OF_DAY, IntlCalendar::FIELD_MINUTE, IntlCalendar::FIELD_SECOND and IntlCalendar::FIELD_MILLISECOND. The range is from 0 to 24 * 3600 * 1000 - 1. It is not the number of milliseconds elapsed in the day, since on DST transitions it will have discontinuities analogous to those of the wall time.
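As a sketch of how the four wall-time fields combine into this value, the hypothetical helper below (not part of the intl extension) performs the arithmetic directly. It works on wall-time components only, so it says nothing about elapsed time across a DST transition.

```php
<?php
/**
 * Hypothetical helper: combines wall-time components into the value
 * described for FIELD_MILLISECONDS_IN_DAY.
 */
function millisecondsInDay(int $hour, int $minute, int $second, int $millisecond): int
{
    return (($hour * 60 + $minute) * 60 + $second) * 1000 + $millisecond;
}

// Smallest and largest values of the documented range.
printf("%d %d\n", millisecondsInDay(0, 0, 0, 0), millisecondsInDay(23, 59, 59, 999));
```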

IntlCalendar::FIELD_IS_LEAP_MONTH

Calendar field whose value is 1 for indicating a leap month and 0 otherwise.

IntlCalendar::FIELD_FIELD_COUNT

The total number of fields.

IntlCalendar::FIELD_DAY_OF_MONTH

Alias for IntlCalendar::FIELD_DATE.

IntlCalendar::DOW_SUNDAY

Sunday.

IntlCalendar::DOW_MONDAY

Monday.

IntlCalendar::DOW_TUESDAY

Tuesday.

IntlCalendar::DOW_WEDNESDAY

Wednesday.

IntlCalendar::DOW_THURSDAY

Thursday.

IntlCalendar::DOW_FRIDAY

Friday.

IntlCalendar::DOW_SATURDAY

Saturday.

IntlCalendar::DOW_TYPE_WEEKDAY

Output of IntlCalendar::getDayOfWeekType indicating a day of week is a weekday.

IntlCalendar::DOW_TYPE_WEEKEND

Output of IntlCalendar::getDayOfWeekType indicating a day of week belongs to the weekend.

IntlCalendar::DOW_TYPE_WEEKEND_OFFSET

Output of IntlCalendar::getDayOfWeekType indicating the weekend begins during the given day of week.

IntlCalendar::DOW_TYPE_WEEKEND_CEASE

Output of IntlCalendar::getDayOfWeekType indicating the weekend ends during the given day of week.

IntlCalendar::WALLTIME_FIRST

Output of IntlCalendar::getSkippedWallTimeOption indicating that wall times in the skipped range should refer to the same instant as wall times one hour earlier, and of IntlCalendar::getRepeatedWallTimeOption indicating that wall times in the repeated range should refer to the instant of the first occurrence of such wall time.

IntlCalendar::WALLTIME_LAST

Output of IntlCalendar::getSkippedWallTimeOption indicating that wall times in the skipped range should refer to the same instant as wall times one hour later, and of IntlCalendar::getRepeatedWallTimeOption indicating that wall times in the repeated range should refer to the instant of the second occurrence of such wall time.

IntlCalendar::WALLTIME_NEXT_VALID

Output of IntlCalendar::getSkippedWallTimeOption indicating that wall times in the skipped range should refer to the instant when the daylight saving time transition occurs (begins).
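The sketch below shows one way to compare the three options for a skipped wall time. It assumes the intl extension is built against an ICU version that supports wall time options, and it assumes that 01:30 on 31 March 2013 falls in the hour skipped when daylight saving time began in the Europe/Lisbon zone. The printed instants are left for the reader to inspect rather than stated here.

```php
<?php
// Sketch only: requires the intl extension with wall time option support.
$options = [
    'WALLTIME_FIRST'      => IntlCalendar::WALLTIME_FIRST,
    'WALLTIME_LAST'       => IntlCalendar::WALLTIME_LAST,
    'WALLTIME_NEXT_VALID' => IntlCalendar::WALLTIME_NEXT_VALID,
];

$instants = [];
foreach ($options as $name => $option) {
    $cal = IntlCalendar::createInstance('Europe/Lisbon', 'en_US');
    $cal->clear();
    $cal->setSkippedWallTimeOption($option);

    // Wall time assumed to lie inside the skipped range (see lead-in).
    $cal->set(IntlCalendar::FIELD_YEAR, 2013);
    $cal->set(IntlCalendar::FIELD_MONTH, 2); // March (zero-based month index)
    $cal->set(IntlCalendar::FIELD_DAY_OF_MONTH, 31);
    $cal->set(IntlCalendar::FIELD_HOUR_OF_DAY, 1);
    $cal->set(IntlCalendar::FIELD_MINUTE, 30);

    // getTime() returns milliseconds since the Unix epoch as a float.
    $instants[$name] = $cal->getTime();
    printf("%-20s %s UTC\n", $name, gmdate('Y-m-d H:i:s', (int) ($instants[$name] / 1000)));
}
```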

The IntlTimeZone class

Introduction

Class synopsis

IntlTimeZone
class IntlTimeZone {
/* Constants */
const integer IntlTimeZone::DISPLAY_SHORT = 1 ;
const integer IntlTimeZone::DISPLAY_LONG = 2 ;
/* Methods */
public static integer countEquivalentIDs ( string $zoneId )
public static IntlTimeZone createDefault ( void )
public static IntlIterator createEnumeration ([ mixed $countryOrRawOffset ] )
public static IntlTimeZone createTimeZone ( string $zoneId )
public static IntlTimeZone fromDateTimeZone ( DateTimeZone $zoneId )
public static string getCanonicalID ( string $zoneId [, bool &$isSystemID ] )
public string getDisplayName ([ bool $isDaylight [, integer $style [, string $locale ]]] )
public integer getDSTSavings ( void )
public static string getEquivalentID ( string $zoneId , integer $index )
public integer getErrorCode ( void )
public string getErrorMessage ( void )
public static IntlTimeZone getGMT ( void )
public string getID ( void )
public integer getOffset ( float $date , bool $local , integer &$rawOffset , integer &$dstOffset )
public integer getRawOffset ( void )
public static string getTZDataVersion ( void )
public bool hasSameRules ( IntlTimeZone $otherTimeZone )
public DateTimeZone toDateTimeZone ( void )
public bool useDaylightTime ( void )
}

Predefined Constants

IntlTimeZone::DISPLAY_SHORT

IntlTimeZone::DISPLAY_LONG

The IntlDateFormatter class

Introduction

Date Formatter is a concrete class that enables locale-dependent formatting/parsing of dates using pattern strings and/or canned patterns.

This class represents the ICU date formatting functionality. It allows users to display dates in a localized format or to parse strings into PHP date values using pattern strings and/or canned patterns.

Class synopsis

IntlDateFormatter
class IntlDateFormatter {
/* Methods */
public __construct ( string $locale , int $datetype , int $timetype [, mixed $timezone = NULL [, mixed $calendar = NULL [, string $pattern = '' ]]] )
public static IntlDateFormatter create ( string $locale , int $datetype , int $timetype [, mixed $timezone = NULL [, mixed $calendar = NULL [, string $pattern = '' ]]] )
public string format ( mixed $value )
public static string formatObject ( object $object [, mixed $format = NULL [, string $locale = NULL ]] )
public int getCalendar ( void )
public int getDateType ( void )
public int getErrorCode ( void )
public string getErrorMessage ( void )
public string getLocale ([ int $which ] )
public string getPattern ( void )
public int getTimeType ( void )
public string getTimeZoneId ( void )
public IntlCalendar getCalendarObject ( void )
public IntlTimeZone getTimeZone ( void )
public bool isLenient ( void )
public array localtime ( string $value [, int &$position ] )
public int parse ( string $value [, int &$position ] )
public bool setCalendar ( mixed $which )
public bool setLenient ( bool $lenient )
public bool setPattern ( string $pattern )
public bool setTimeZoneId ( string $zone )
public bool setTimeZone ( mixed $zone )
}

Predefined Constants

These constants are used to specify different formats in the constructor for DateType and TimeType.

IntlDateFormatter::NONE (integer)
Do not include this element
IntlDateFormatter::FULL (integer)
Completely specified style (Tuesday, April 12, 1952 AD or 3:30:42pm PST)
IntlDateFormatter::LONG (integer)
Long style (January 12, 1952 or 3:30:32pm)
IntlDateFormatter::MEDIUM (integer)
Medium style (Jan 12, 1952)
IntlDateFormatter::SHORT (integer)
Most abbreviated style, only essential data (12/13/52 or 3:30pm)

The following int constants are used to specify the calendar. These calendars are all based directly on the Gregorian calendar. Non-Gregorian calendars need to be specified in the locale, for example locale="hi@calendar=BUDDHIST".

IntlDateFormatter::TRADITIONAL (integer)
Non-Gregorian Calendar
IntlDateFormatter::GREGORIAN (integer)
Gregorian Calendar

The ResourceBundle class

Introduction

Localized software products often require sets of data that are customized for the current locale, e.g. messages, labels and formatting patterns. The ICU resource mechanism allows defining sets of resources that the application can load on a per-locale basis, while accessing them in a unified, locale-independent fashion.

This class implements access to ICU resource data files. These files are binary data arrays which ICU uses to store the localized data.

An ICU resource bundle can hold simple resources and complex resources. Complex resources are containers which can be either integer-indexed or string-indexed (just like PHP arrays). Simple resources can be of the following types: string, integer, binary data field or integer array.

ResourceBundle supports direct access to the data through the array access pattern and iteration via foreach, as well as access via class methods. The result will be a PHP value for simple resources and a ResourceBundle object for complex ones. All resources are read-only.

Class synopsis

ResourceBundle
class ResourceBundle {
/* Methods */
public __construct ( string $locale , string $bundlename [, bool $fallback ] )
public int count ( void )
public static ResourceBundle create ( string $locale , string $bundlename [, bool $fallback ] )
public int getErrorCode ( void )
public string getErrorMessage ( void )
public mixed get ( string|int $index )
public array getLocales ( string $bundlename )
}

The Spoofchecker class

Introduction

Class synopsis

Spoofchecker
class Spoofchecker {
/* Constants */
const integer Spoofchecker::SINGLE_SCRIPT_CONFUSABLE = 1 ;
const integer Spoofchecker::MIXED_SCRIPT_CONFUSABLE = 2 ;
const integer Spoofchecker::WHOLE_SCRIPT_CONFUSABLE = 4 ;
const integer Spoofchecker::ANY_CASE = 8 ;
const integer Spoofchecker::SINGLE_SCRIPT = 16 ;
const integer Spoofchecker::INVISIBLE = 32 ;
const integer Spoofchecker::CHAR_LIMIT = 64 ;
/* Methods */
public bool areConfusable ( string $s1 , string $s2 [, string &$error ] )
public __construct ( void )
public bool isSuspicious ( string $text [, string &$error ] )
public void setAllowedLocales ( string $locale_list )
public void setChecks ( int $checks )
}

Predefined Constants

Spoofchecker::SINGLE_SCRIPT_CONFUSABLE

Spoofchecker::MIXED_SCRIPT_CONFUSABLE

Spoofchecker::WHOLE_SCRIPT_CONFUSABLE

Spoofchecker::ANY_CASE

Spoofchecker::SINGLE_SCRIPT

Spoofchecker::INVISIBLE

Spoofchecker::CHAR_LIMIT

The Transliterator class

Introduction

Transliterator provides transliteration of strings.

Class synopsis

Transliterator
class Transliterator {
/* Constants */
const integer Transliterator::FORWARD = 0 ;
const integer Transliterator::REVERSE = 1 ;
/* Properties */
public $id ;
/* Methods */
__construct ( void )
public static Transliterator create ( string $id [, int $direction ] )
public static Transliterator createFromRules ( string $rules [, int $direction ] )
public Transliterator createInverse ( void )
public int getErrorCode ( void )
public string getErrorMessage ( void )
public static array listIDs ( void )
public string transliterate ( string $subject [, int $start [, int $end ]] )
}

Properties

id

Predefined Constants

Transliterator::FORWARD

Transliterator::REVERSE

The IntlBreakIterator class

Introduction

A “break iterator” is an ICU object that exposes methods for locating boundaries in text (e.g. word or sentence boundaries). The PHP IntlBreakIterator serves as the base class for all types of ICU break iterators. Where extra functionality is available, the intl extension may expose the ICU break iterator with suitable subclasses, such as IntlRuleBasedBreakIterator or IntlCodePointBreakIterator.

This class implements Traversable. Traversing an IntlBreakIterator yields non-negative integer values representing the successive locations of the text boundaries, expressed as UTF-8 code unit (byte) counts, taken from the beginning of the text (which has the location 0). The keys yielded by the iterator simply form the sequence of natural numbers {0, 1, 2, …}.
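A minimal sketch of such a traversal follows, assuming the intl extension is loaded. The sample text contains two accented letters that each occupy two bytes in UTF-8, which makes it visible that the reported locations are byte counts rather than character counts.

```php
<?php
// Sketch only: requires the intl extension.
// "déjà vu": the two accented letters take two bytes each in UTF-8.
$text = "d\u{e9}j\u{e0} vu";

$iterator = IntlBreakIterator::createWordInstance('fr_FR');
$iterator->setText($text);

$boundaries = [];
foreach ($iterator as $offset) {
    // Each value is a byte offset from the beginning of the text.
    $boundaries[] = $offset;
}

printf("%d bytes, boundaries at: %s\n", strlen($text), implode(', ', $boundaries));
```

Because the offsets are byte counts, they can be passed directly to substr(), but not to mb_substr().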

Class synopsis

IntlBreakIterator
class IntlBreakIterator implements Traversable {
/* Constants */
const integer IntlBreakIterator::DONE = -1 ;
const integer IntlBreakIterator::WORD_NONE = 0 ;
const integer IntlBreakIterator::WORD_NONE_LIMIT = 100 ;
const integer IntlBreakIterator::WORD_NUMBER = 100 ;
const integer IntlBreakIterator::WORD_NUMBER_LIMIT = 200 ;
const integer IntlBreakIterator::WORD_LETTER = 200 ;
const integer IntlBreakIterator::WORD_LETTER_LIMIT = 300 ;
const integer IntlBreakIterator::WORD_KANA = 300 ;
const integer IntlBreakIterator::WORD_KANA_LIMIT = 400 ;
const integer IntlBreakIterator::WORD_IDEO = 400 ;
const integer IntlBreakIterator::WORD_IDEO_LIMIT = 500 ;
const integer IntlBreakIterator::LINE_SOFT = 0 ;
const integer IntlBreakIterator::LINE_SOFT_LIMIT = 100 ;
const integer IntlBreakIterator::LINE_HARD = 100 ;
const integer IntlBreakIterator::LINE_HARD_LIMIT = 200 ;
const integer IntlBreakIterator::SENTENCE_TERM = 0 ;
const integer IntlBreakIterator::SENTENCE_TERM_LIMIT = 100 ;
const integer IntlBreakIterator::SENTENCE_SEP = 100 ;
const integer IntlBreakIterator::SENTENCE_SEP_LIMIT = 200 ;
/* Methods */
private __construct ( void )
public static ReturnType createCharacterInstance ([ string $locale ] )
public static ReturnType createCodePointInstance ( void )
public static ReturnType createLineInstance ([ string $locale ] )
public static ReturnType createSentenceInstance ([ string $locale ] )
public static ReturnType createTitleInstance ([ string $locale ] )
public static ReturnType createWordInstance ([ string $locale ] )
public ReturnType current ( void )
public ReturnType first ( void )
public ReturnType following ( string $offset )
public ReturnType getErrorCode ( void )
ReturnType intl_get_error_code ( void )
public ReturnType getErrorMessage ( void )
ReturnType intl_get_error_message ( void )
public ReturnType getLocale ( string $locale_type )
public ReturnType getPartsIterator ([ string $key_type ] )
public ReturnType getText ( void )
public ReturnType isBoundary ( string $offset )
public ReturnType last ( void )
public ReturnType next ([ string $offset ] )
public ReturnType preceding ( string $offset )
public ReturnType previous ( void )
public ReturnType setText ( string $text )
}

Predefined Constants

IntlBreakIterator::DONE

IntlBreakIterator::WORD_NONE

IntlBreakIterator::WORD_NONE_LIMIT

IntlBreakIterator::WORD_NUMBER

IntlBreakIterator::WORD_NUMBER_LIMIT

IntlBreakIterator::WORD_LETTER

IntlBreakIterator::WORD_LETTER_LIMIT

IntlBreakIterator::WORD_KANA

IntlBreakIterator::WORD_KANA_LIMIT

IntlBreakIterator::WORD_IDEO

IntlBreakIterator::WORD_IDEO_LIMIT

IntlBreakIterator::LINE_SOFT

IntlBreakIterator::LINE_SOFT_LIMIT

IntlBreakIterator::LINE_HARD

IntlBreakIterator::LINE_HARD_LIMIT

IntlBreakIterator::SENTENCE_TERM

IntlBreakIterator::SENTENCE_TERM_LIMIT

IntlBreakIterator::SENTENCE_SEP

IntlBreakIterator::SENTENCE_SEP_LIMIT

The IntlRuleBasedBreakIterator class

Introduction

A subclass of IntlBreakIterator that encapsulates ICU break iterators whose behavior is specified using a set of rules. This is the most common kind of break iterator.

These rules are described in the » ICU Boundary Analysis User Guide.

Class synopsis

IntlRuleBasedBreakIterator
class IntlRuleBasedBreakIterator extends IntlBreakIterator implements Traversable {
/* Constants */
const integer IntlRuleBasedBreakIterator::DONE = -1 ;
const integer IntlRuleBasedBreakIterator::WORD_NONE = 0 ;
const integer IntlRuleBasedBreakIterator::WORD_NONE_LIMIT = 100 ;
const integer IntlRuleBasedBreakIterator::WORD_NUMBER = 100 ;
const integer IntlRuleBasedBreakIterator::WORD_NUMBER_LIMIT = 200 ;
const integer IntlRuleBasedBreakIterator::WORD_LETTER = 200 ;
const integer IntlRuleBasedBreakIterator::WORD_LETTER_LIMIT = 300 ;
const integer IntlRuleBasedBreakIterator::WORD_KANA = 300 ;
const integer IntlRuleBasedBreakIterator::WORD_KANA_LIMIT = 400 ;
const integer IntlRuleBasedBreakIterator::WORD_IDEO = 400 ;
const integer IntlRuleBasedBreakIterator::WORD_IDEO_LIMIT = 500 ;
const integer IntlRuleBasedBreakIterator::LINE_SOFT = 0 ;
const integer IntlRuleBasedBreakIterator::LINE_SOFT_LIMIT = 100 ;
const integer IntlRuleBasedBreakIterator::LINE_HARD = 100 ;
const integer IntlRuleBasedBreakIterator::LINE_HARD_LIMIT = 200 ;
const integer IntlRuleBasedBreakIterator::SENTENCE_TERM = 0 ;
const integer IntlRuleBasedBreakIterator::SENTENCE_TERM_LIMIT = 100 ;
const integer IntlRuleBasedBreakIterator::SENTENCE_SEP = 100 ;
const integer IntlRuleBasedBreakIterator::SENTENCE_SEP_LIMIT = 200 ;
/* Methods */
public __construct ( string $rules [, string $areCompiled ] )
public ReturnType getBinaryRules ( void )
public ReturnType getRules ( void )
public ReturnType getRuleStatus ( void )
public ReturnType getRuleStatusVec ( void )
/* Inherited methods */
private IntlBreakIterator::__construct ( void )
public static ReturnType IntlBreakIterator::createCharacterInstance ([ string $locale ] )
public static ReturnType IntlBreakIterator::createCodePointInstance ( void )
public static ReturnType IntlBreakIterator::createLineInstance ([ string $locale ] )
public static ReturnType IntlBreakIterator::createSentenceInstance ([ string $locale ] )
public static ReturnType IntlBreakIterator::createTitleInstance ([ string $locale ] )
public static ReturnType IntlBreakIterator::createWordInstance ([ string $locale ] )
public ReturnType IntlBreakIterator::current ( void )
public ReturnType IntlBreakIterator::first ( void )
public ReturnType IntlBreakIterator::following ( string $offset )
public ReturnType IntlBreakIterator::getErrorCode ( void )
ReturnType intl_get_error_code ( void )
public ReturnType IntlBreakIterator::getErrorMessage ( void )
ReturnType intl_get_error_message ( void )
public ReturnType IntlBreakIterator::getLocale ( string $locale_type )
public ReturnType IntlBreakIterator::getPartsIterator ([ string $key_type ] )
public ReturnType IntlBreakIterator::getText ( void )
public ReturnType IntlBreakIterator::isBoundary ( string $offset )
public ReturnType IntlBreakIterator::last ( void )
public ReturnType IntlBreakIterator::next ([ string $offset ] )
public ReturnType IntlBreakIterator::preceding ( string $offset )
public ReturnType IntlBreakIterator::previous ( void )
public ReturnType IntlBreakIterator::setText ( string $text )
}

Predefined Constants

IntlRuleBasedBreakIterator::DONE

IntlRuleBasedBreakIterator::WORD_NONE

IntlRuleBasedBreakIterator::WORD_NONE_LIMIT

IntlRuleBasedBreakIterator::WORD_NUMBER

IntlRuleBasedBreakIterator::WORD_NUMBER_LIMIT

IntlRuleBasedBreakIterator::WORD_LETTER

IntlRuleBasedBreakIterator::WORD_LETTER_LIMIT

IntlRuleBasedBreakIterator::WORD_KANA

IntlRuleBasedBreakIterator::WORD_KANA_LIMIT

IntlRuleBasedBreakIterator::WORD_IDEO

IntlRuleBasedBreakIterator::WORD_IDEO_LIMIT

IntlRuleBasedBreakIterator::LINE_SOFT

IntlRuleBasedBreakIterator::LINE_SOFT_LIMIT

IntlRuleBasedBreakIterator::LINE_HARD

IntlRuleBasedBreakIterator::LINE_HARD_LIMIT

IntlRuleBasedBreakIterator::SENTENCE_TERM

IntlRuleBasedBreakIterator::SENTENCE_TERM_LIMIT

IntlRuleBasedBreakIterator::SENTENCE_SEP

IntlRuleBasedBreakIterator::SENTENCE_SEP_LIMIT

The IntlCodePointBreakIterator class

Introduction

This break iterator identifies the boundaries between UTF-8 code points.

Class synopsis

IntlCodePointBreakIterator
class IntlCodePointBreakIterator extends IntlBreakIterator implements Traversable {
/* Constants */
const integer IntlCodePointBreakIterator::DONE = -1 ;
const integer IntlCodePointBreakIterator::WORD_NONE = 0 ;
const integer IntlCodePointBreakIterator::WORD_NONE_LIMIT = 100 ;
const integer IntlCodePointBreakIterator::WORD_NUMBER = 100 ;
const integer IntlCodePointBreakIterator::WORD_NUMBER_LIMIT = 200 ;
const integer IntlCodePointBreakIterator::WORD_LETTER = 200 ;
const integer IntlCodePointBreakIterator::WORD_LETTER_LIMIT = 300 ;
const integer IntlCodePointBreakIterator::WORD_KANA = 300 ;
const integer IntlCodePointBreakIterator::WORD_KANA_LIMIT = 400 ;
const integer IntlCodePointBreakIterator::WORD_IDEO = 400 ;
const integer IntlCodePointBreakIterator::WORD_IDEO_LIMIT = 500 ;
const integer IntlCodePointBreakIterator::LINE_SOFT = 0 ;
const integer IntlCodePointBreakIterator::LINE_SOFT_LIMIT = 100 ;
const integer IntlCodePointBreakIterator::LINE_HARD = 100 ;
const integer IntlCodePointBreakIterator::LINE_HARD_LIMIT = 200 ;
const integer IntlCodePointBreakIterator::SENTENCE_TERM = 0 ;
const integer IntlCodePointBreakIterator::SENTENCE_TERM_LIMIT = 100 ;
const integer IntlCodePointBreakIterator::SENTENCE_SEP = 100 ;
const integer IntlCodePointBreakIterator::SENTENCE_SEP_LIMIT = 200 ;
/* Methods */
public ReturnType getLastCodePoint ( void )
/* Inherited methods */
private IntlBreakIterator::__construct ( void )
public static ReturnType IntlBreakIterator::createCharacterInstance ([ string $locale ] )
public static ReturnType IntlBreakIterator::createCodePointInstance ( void )
public static ReturnType IntlBreakIterator::createLineInstance ([ string $locale ] )
public static ReturnType IntlBreakIterator::createSentenceInstance ([ string $locale ] )
public static ReturnType IntlBreakIterator::createTitleInstance ([ string $locale ] )
public static ReturnType IntlBreakIterator::createWordInstance ([ string $locale ] )
public ReturnType IntlBreakIterator::current ( void )
public ReturnType IntlBreakIterator::first ( void )
public ReturnType IntlBreakIterator::following ( string $offset )
public ReturnType IntlBreakIterator::getErrorCode ( void )
ReturnType intl_get_error_code ( void )
public ReturnType IntlBreakIterator::getErrorMessage ( void )
ReturnType intl_get_error_message ( void )
public ReturnType IntlBreakIterator::getLocale ( string $locale_type )
public ReturnType IntlBreakIterator::getPartsIterator ([ string $key_type ] )
public ReturnType IntlBreakIterator::getText ( void )
public ReturnType IntlBreakIterator::isBoundary ( string $offset )
public ReturnType IntlBreakIterator::last ( void )
public ReturnType IntlBreakIterator::next ([ string $offset ] )
public ReturnType IntlBreakIterator::preceding ( string $offset )
public ReturnType IntlBreakIterator::previous ( void )
public ReturnType IntlBreakIterator::setText ( string $text )
}

Predefined Constants

IntlCodePointBreakIterator::DONE

IntlCodePointBreakIterator::WORD_NONE

IntlCodePointBreakIterator::WORD_NONE_LIMIT

IntlCodePointBreakIterator::WORD_NUMBER

IntlCodePointBreakIterator::WORD_NUMBER_LIMIT

IntlCodePointBreakIterator::WORD_LETTER

IntlCodePointBreakIterator::WORD_LETTER_LIMIT

IntlCodePointBreakIterator::WORD_KANA

IntlCodePointBreakIterator::WORD_KANA_LIMIT

IntlCodePointBreakIterator::WORD_IDEO

IntlCodePointBreakIterator::WORD_IDEO_LIMIT

IntlCodePointBreakIterator::LINE_SOFT

IntlCodePointBreakIterator::LINE_SOFT_LIMIT

IntlCodePointBreakIterator::LINE_HARD

IntlCodePointBreakIterator::LINE_HARD_LIMIT

IntlCodePointBreakIterator::SENTENCE_TERM

IntlCodePointBreakIterator::SENTENCE_TERM_LIMIT

IntlCodePointBreakIterator::SENTENCE_SEP

IntlCodePointBreakIterator::SENTENCE_SEP_LIMIT

The IntlPartsIterator class

Introduction

Objects of this class can be obtained from IntlBreakIterator objects. While the break iterators provide a sequence of boundary positions when iterated, IntlPartsIterator objects provide, as a convenience, the text fragments contained between two successive boundaries.

The keys may represent the offset of the left boundary, the offset of the right boundary, or they may just be the sequence of non-negative integers. See IntlBreakIterator::getPartsIterator.
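The sketch below, assuming the intl extension is loaded, iterates over the fragments of a short ASCII string with the keys set to the left boundary of each fragment. The sample text and variable names are illustrative.

```php
<?php
// Sketch only: requires the intl extension.
$text = 'Hello world';

$breakIterator = IntlBreakIterator::createWordInstance('en_US');
$breakIterator->setText($text);

$parts = [];
// KEY_LEFT: each key is the byte offset at which the fragment starts.
foreach ($breakIterator->getPartsIterator(IntlPartsIterator::KEY_LEFT) as $leftOffset => $fragment) {
    $parts[$leftOffset] = $fragment;
    printf("%2d => '%s'\n", $leftOffset, $fragment);
}
```

Concatenating the fragments in order gives back the original text, since every byte lies between two successive boundaries.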

Class synopsis

IntlPartsIterator
class IntlPartsIterator extends IntlIterator implements Iterator {
/* Constants */
const integer IntlPartsIterator::KEY_SEQUENTIAL = 0 ;
const integer IntlPartsIterator::KEY_LEFT = 1 ;
const integer IntlPartsIterator::KEY_RIGHT = 2 ;
/* Methods */
public ReturnType getBreakIterator ( void )
/* Inherited methods */
public ReturnType IntlIterator::current ( void )
public ReturnType IntlIterator::key ( void )
public ReturnType IntlIterator::next ( void )
public ReturnType IntlIterator::rewind ( void )
public ReturnType IntlIterator::valid ( void )
}

Predefined Constants

IntlPartsIterator::KEY_SEQUENTIAL

IntlPartsIterator::KEY_LEFT

IntlPartsIterator::KEY_RIGHT

The UConverter class

Introduction

Class synopsis

UConverter
class UConverter {
/* Constants */
const integer UConverter::REASON_UNASSIGNED = 0 ;
const integer UConverter::REASON_ILLEGAL = 1 ;
const integer UConverter::REASON_IRREGULAR = 2 ;
const integer UConverter::REASON_RESET = 3 ;
const integer UConverter::REASON_CLOSE = 4 ;
const integer UConverter::REASON_CLONE = 5 ;
const integer UConverter::UNSUPPORTED_CONVERTER = -1 ;
const integer UConverter::SBCS = 0 ;
const integer UConverter::DBCS = 1 ;
const integer UConverter::MBCS = 2 ;
const integer UConverter::LATIN_1 = 3 ;
const integer UConverter::UTF8 = 4 ;
const integer UConverter::UTF16_BigEndian = 5 ;
const integer UConverter::UTF16_LittleEndian = 6 ;
const integer UConverter::UTF32_BigEndian = 7 ;
const integer UConverter::UTF32_LittleEndian = 8 ;
const integer UConverter::EBCDIC_STATEFUL = 9 ;
const integer UConverter::ISO_2022 = 10 ;
const integer UConverter::LMBCS_1 = 11 ;
const integer UConverter::LMBCS_2 = 12 ;
const integer UConverter::LMBCS_3 = 13 ;
const integer UConverter::LMBCS_4 = 14 ;
const integer UConverter::LMBCS_5 = 15 ;
const integer UConverter::LMBCS_6 = 16 ;
const integer UConverter::LMBCS_8 = 17 ;
const integer UConverter::LMBCS_11 = 18 ;
const integer UConverter::LMBCS_16 = 19 ;
const integer UConverter::LMBCS_17 = 20 ;
const integer UConverter::LMBCS_18 = 21 ;
const integer UConverter::LMBCS_19 = 22 ;
const integer UConverter::LMBCS_LAST = 22 ;
const integer UConverter::HZ = 23 ;
const integer UConverter::SCSU = 24 ;
const integer UConverter::ISCII = 25 ;
const integer UConverter::US_ASCII = 26 ;
const integer UConverter::UTF7 = 27 ;
const integer UConverter::BOCU1 = 28 ;
const integer UConverter::UTF16 = 29 ;
const integer UConverter::UTF32 = 30 ;
const integer UConverter::CESU8 = 31 ;
const integer UConverter::IMAP_MAILBOX = 32 ;
/* Methods */
public __construct ([ string $destination_encoding [, string $source_encoding ]] )
public string convert ( string $str [, bool $reverse ] )
public mixed fromUCallback ( integer $reason , string $source , string $codePoint , integer &$error )
public static array getAliases ([ string $name ] )
public static array getAvailable ( void )
public string getDestinationEncoding ( void )
public integer getDestinationType ( void )
public integer getErrorCode ( void )
public string getErrorMessage ( void )
public string getSourceEncoding ( void )
public integer getSourceType ( void )
public static array getStandards ( void )
public string getSubstChars ( void )
public static string reasonText ([ integer $reason ] )
public void setDestinationEncoding ( string $encoding )
public void setSourceEncoding ( string $encoding )
public void setSubstChars ( string $chars )
public mixed toUCallback ( integer $reason , string $source , string $codeUnits , integer &$error )
public static string transcode ( string $str , string $toEncoding , string $fromEncoding [, array $options ] )
}

Predefined Constants

UConverter::REASON_UNASSIGNED

UConverter::REASON_ILLEGAL

UConverter::REASON_IRREGULAR

UConverter::REASON_RESET

UConverter::REASON_CLOSE

UConverter::REASON_CLONE

UConverter::UNSUPPORTED_CONVERTER

UConverter::SBCS

UConverter::DBCS

UConverter::MBCS

UConverter::LATIN_1

UConverter::UTF8

UConverter::UTF16_BigEndian

UConverter::UTF16_LittleEndian

UConverter::UTF32_BigEndian

UConverter::UTF32_LittleEndian

UConverter::EBCDIC_STATEFUL

UConverter::ISO_2022

UConverter::LMBCS_1

UConverter::LMBCS_2

UConverter::LMBCS_3

UConverter::LMBCS_4

UConverter::LMBCS_5

UConverter::LMBCS_6

UConverter::LMBCS_8

UConverter::LMBCS_11

UConverter::LMBCS_16

UConverter::LMBCS_17

UConverter::LMBCS_18

UConverter::LMBCS_19

UConverter::LMBCS_LAST

UConverter::HZ

UConverter::SCSU

UConverter::ISCII

UConverter::US_ASCII

UConverter::UTF7

UConverter::BOCU1

UConverter::UTF16

UConverter::UTF32

UConverter::CESU8

UConverter::IMAP_MAILBOX

Exception class for intl errors

Introduction

This class is used for generating exceptions when errors occur inside intl functions. Such exceptions are only generated when intl.use_exceptions is enabled.
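A minimal sketch of enabling and catching such exceptions follows, assuming the intl extension is loaded and that the setting can be changed at runtime with ini_set(). The transliterator ID is a made-up, deliberately invalid name, and the code also handles the case where the call reports failure by returning NULL instead of throwing.

```php
<?php
// Sketch only: requires the intl extension.
ini_set('intl.use_exceptions', '1');

$outcome = 'no exception';
try {
    // Deliberately invalid ID (a made-up name) to provoke an intl error.
    $transliterator = Transliterator::create('No-Such-Transliterator-Id');
    if ($transliterator === null) {
        // Failure reported through the return value and the global intl error.
        $outcome = 'null returned: ' . intl_get_error_message();
    }
} catch (IntlException $e) {
    $outcome = 'IntlException: ' . $e->getMessage();
}

echo $outcome, "\n";
```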

Class synopsis

IntlException
class IntlException extends Exception {
/* Inherited properties */
protected string $message ;
protected int $code ;
protected string $file ;
protected int $line ;
/* Inherited methods */
final public string Exception::getMessage ( void )
final public Exception Exception::getPrevious ( void )
final public mixed Exception::getCode ( void )
final public string Exception::getFile ( void )
final public int Exception::getLine ( void )
final public array Exception::getTrace ( void )
final public string Exception::getTraceAsString ( void )
public string Exception::__toString ( void )
final private void Exception::__clone ( void )
}

The IntlIterator class

Introduction

This class represents iterator objects throughout the intl extension whenever the iterator cannot be identified with any other object provided by the extension. The distinct iterator object used internally by the foreach construct can only be obtained (in the relevant part here) from objects, so objects of this class serve the purpose of providing the hook through which this internal object can be obtained. As a convenience, this class also implements the Iterator interface, allowing the collection of values to be navigated using the methods defined in that interface. Both these methods and the internal iterator objects provided to foreach are backed by the same state (e.g. the position of the iterator and its current value).

Subclasses may provide richer functionality.
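As a sketch of the two access styles described above, the code below obtains an IntlIterator from IntlTimeZone::createEnumeration and walks it first with the Iterator methods and then with foreach. It assumes the intl extension is loaded and that the ICU time zone data lists at least one zone for the country code 'PT'.

```php
<?php
// Sketch only: requires the intl extension.
$iterator = IntlTimeZone::createEnumeration('PT');

// Explicit navigation through the Iterator interface.
$viaMethods = [];
for ($iterator->rewind(); $iterator->valid(); $iterator->next()) {
    $viaMethods[] = $iterator->current();
}

// The same object traversed with foreach, which rewinds it first.
$viaForeach = [];
foreach ($iterator as $zoneId) {
    $viaForeach[] = $zoneId;
}

printf("%d zone IDs via methods, %d via foreach\n", count($viaMethods), count($viaForeach));
```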

Class synopsis

IntlIterator
class IntlIterator implements Iterator {
/* Methods */
public ReturnType current ( void )
public ReturnType key ( void )
public ReturnType next ( void )
public ReturnType rewind ( void )
public ReturnType valid ( void )
}