ICU 72

ICU is the premier library for software internationalization, used by a wide array of companies and organizations.

Release Overview

ICU 72 updates to Unicode 15, including new characters, scripts, emoji, and corresponding API constants. It also updates to CLDR 42 locale data with various additions and corrections.

ICU 72 adds technology preview implementations for person name formatting, as well as for a new version of message formatting based on a proposed draft Unicode specification.

ICU 72 and CLDR 42 are major releases, including a new version of Unicode and major locale data improvements.

ICU 72 updates to the time zone data version 2022e (2022-oct). Note that pre-1970 data for a number of time zones has been removed, as has been the case in the upstream tzdata release since 2021b.

For more details, including migration issues, see below.

Please use the icu-support mailing list and/or find/submit error reports.

🔴🔴🔴 Do you need ICU to work on EBCDIC platforms? 🔴🔴🔴

    • We need help: Someone needs to build ICU4C on a native-EBCDIC machine (z or i), fix C++ compiler issues (if any), fix issues related to an EBCDIC codepage as the system encoding, and test frequently (or add their machine into our CI). Please contact us via the icu-support mailing list.

    • Otherwise we will remove the support code for non-ASCII-family platforms. Details: ICU-21672

Version Number

The initial release has library version number 72.1.

If there are maintenance releases, they will be 72.2, 72.3, etc. (During ICU 72 development, the library version number was 72.0.x.)

Note: There may be additional commits on the maint/maint-72 branch that are not included in the prepackaged download files.

Common Changes

    • Unicode 15

      • 2 new scripts, 20 new emoji (plus additional new sequences), 4000+ new CJK characters

      • Additional letters, numerals, and symbols that are in modern use

    • CLDR 42 (blog):

      • CLDR has added or improved data for the following languages which are newly included in ICU:

        • Haryanvi (bgc), Bhojpuri (bho), Rajasthani (raj) — India

        • Chuvash (cv) — Russia

      • Igbo (ig) and Yoruba (yo) are now at modern coverage, suitable for full user interface i18n.

      • Word segmentation: The CLDR committee decided that by default there should be a word break after a colon (:), and that a word break after a colon should be suppressed only for Finnish and Swedish. (CLDR-15910, cldr/pull/2254) Also, the committee decided that an at sign (@) should not cause word breaks, as in email addresses. (CLDR-15767, cldr/pull/2256) — (ICU-22112, icu/pull/2159)

      • More language names and other items are consistently translated in the languages with modern coverage.

      • In many formatting patterns, ASCII spaces are replaced with Unicode spaces (e.g., a "thin space").

      • Arabic number formatting patterns have been improved for more reliable bidirectional-text behavior.

      • Plurals: Hebrew has a category removed ('many'), while Asturian, Catalan, and Maltese each have an additional category.

      • A new -u extension key (mu) is added to specify a preferred unit of measurement for temperature: Celsius, Fahrenheit, or Kelvin.

    • Improved finding of locale data given a locale ID that is not directly supported. (ICU-21125)

      • For example, finding the "de-LI" data when "de-Latn-LI" is requested (with a redundant script subtag).

    • New class DisplayOptions (C++ / Java) supersedes the DisplayContext mechanism with a more modern API and a larger set of options, including noun class and grammatical case. (ICU-21935)

      • Currently used only in the NumberFormatter, but intended to be used more broadly.

    • Number + measurement unit formatting: (ICU-22122)

      • The NumberFormatter chooses appropriate units based on several locale keywords, when a usage is specified:

        • A specific unit; currently only temperature units are supported (example: -u-mu-fahrenhe)

        • A measurement system (example: -u-ms-metric)

        • A region code (example: -u-rg-uszzzz)

    • Time zone data (tzdata) version 2022e (2022-oct). Note that pre-1970 data for a number of time zones has been removed, as has been the case in the upstream tzdata release since 2021b.
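The locale keywords listed above under number + measurement unit formatting (-u-mu-, -u-ms-, -u-rg-) are ordinary Unicode locale extension keywords. As a minimal sketch of how such a locale ID decomposes, the following uses only java.util.Locale from the JDK as a stand-in, not ICU4J. The class and method names are hypothetical, and the sketch does not show how the NumberFormatter itself chooses a unit.

```java
import java.util.Locale;

// Hypothetical helper class, for illustration only.
class UnitPreferenceKeywords {

    // Returns the value of a -u extension keyword (for example "mu"),
    // or null when the locale ID does not carry that keyword.
    static String keyword(String languageTag, String key) {
        Locale locale = Locale.forLanguageTag(languageTag);
        return locale.getUnicodeLocaleType(key);
    }

    public static void main(String[] args) {
        // The three example keywords from the release notes.
        String[] tags = {
            "en-US-u-mu-fahrenhe", // a specific temperature unit
            "en-US-u-ms-metric",   // a measurement system
            "en-GB-u-rg-uszzzz"    // a region override
        };
        for (String tag : tags) {
            System.out.println(tag
                + " mu=" + keyword(tag, "mu")
                + " ms=" + keyword(tag, "ms")
                + " rg=" + keyword(tag, "rg"));
        }
    }
}
```

Note that the keyword value is "fahrenhe", not "fahrenheit": values in a -u extension are limited to 8 characters per subtag.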

ICU4C Specific Changes

ICU4J Specific Changes

Migration Issues

    1. See CLDR 42 migration issues.

    2. Note: Upcoming currency changes:

      1. Sierra Leone currency: The new currency (SLE) is now legal tender, and the old currency (SLL) will cease to be legal tender after 2023-03-31.

      2. Croatia currency: In Croatia, the Euro (EUR) will be legal tender starting on 2023-jan-01, and the old currency (HRK) will cease to be legal tender on 2023-jan-15.

      3. The CLDR and ICU data includes the date ranges; the code will adjust automatically.
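To illustrate what "includes the date ranges" means in practice, here is a minimal sketch of a date-range lookup for the Croatia case above. It is not ICU's implementation: the class, the table, and the method names are hypothetical, and it assumes that "cease to be legal tender on 2023-jan-15" means the old currency is no longer valid on that day (an exclusive end date). Start dates not stated in these notes are left open.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of date-ranged currency data; not ICU code.
class CurrencyValidity {

    // One currency with a half-open validity range [from, to).
    // A null bound means "open-ended" in this sketch.
    static final class Entry {
        final String code;
        final LocalDate from;
        final LocalDate to;

        Entry(String code, LocalDate from, LocalDate to) {
            this.code = code;
            this.from = from;
            this.to = to;
        }

        boolean validOn(LocalDate date) {
            boolean started = (from == null) || !date.isBefore(from);
            boolean notEnded = (to == null) || date.isBefore(to);
            return started && notEnded;
        }
    }

    // Croatia, using only the dates given in the migration notes.
    static final List<Entry> CROATIA = Arrays.asList(
        new Entry("HRK", null, LocalDate.of(2023, 1, 15)),
        new Entry("EUR", LocalDate.of(2023, 1, 1), null));

    // Returns the codes of all currencies in the table that are valid on the date.
    static List<String> currenciesOn(List<Entry> table, LocalDate date) {
        List<String> result = new ArrayList<>();
        for (Entry entry : table) {
            if (entry.validOn(date)) {
                result.add(entry.code);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(currenciesOn(CROATIA, LocalDate.of(2022, 12, 31)));
        System.out.println(currenciesOn(CROATIA, LocalDate.of(2023, 1, 5)));
        System.out.println(currenciesOn(CROATIA, LocalDate.of(2023, 1, 15)));
    }
}
```

During the overlap period both currencies are returned. Code that needs a single currency for a region should query by date rather than hard-code a switch-over.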

ICU4C Platform Support

ICU4C requires C++11 and has been tested with up to C++20.

We routinely test on recent versions of Linux, macOS, and Windows.

We accept patches for other platforms.

Windows: The minimum supported version is Windows 7. (See How To Build And Install On Windows for more details.)

ICU4J Platform Support

ICU4J works on Java 8..16.

ICU4J should work on Android API level 21 and later but may require “library desugaring”.

Download

Source and binary downloads are available on the git/GitHub tag page: https://github.com/unicode-org/icu/releases/tag/release-72-1

See the Source Code Access page for how to download the ICU file tree directly from GitHub.

ICU locale data was generated from CLDR tag https://github.com/unicode-org/cldr-staging/releases/tag/release-42.

Maven dependency:

<dependency>
    <groupId>com.ibm.icu</groupId>
    <artifactId>icu4j</artifactId>
    <version>72.1</version>
</dependency>