ICU 70

ICU is the premier library for software internationalization, used by a wide array of companies and organizations.

ICU 70 is not available yet. We are in early stages of development after ICU 69. This page exists so that we can start collecting updates.

Release Overview

ICU 70 updates to Unicode 14 and to CLDR 40 locale data with many additions and corrections. ICU 70 adds support for emoji properties of strings. TBD: It also includes significant improvements for measurement unit formatting and number formatting in general, as well as many other bug fixes and enhancements.

For more details, including migration issues, see below.

Please use the icu-support mailing list and/or find/submit error reports.

🔴🔴🔴 Do you need ICU to work on EBCDIC platforms? 🔴🔴🔴

    • ICU4C used to work on IBM mainframes (OS/390, z/OS) and other native-EBCDIC platforms (OS/400, i5/OS, IBM i).

    • This has not been tested since ICU 59 moved to C++11 (2017). Apparently there are now C++11 compilers available for one or both of these platforms.

    • We need help: Someone needs to build ICU4C on such a machine, fix C++ compiler issues (if any), fix issues related to an EBCDIC codepage as the system encoding, and test frequently (or add their machine into our CI). We will assist you, especially with EBCDIC-specific character and string handling.

    • Otherwise we will remove the support code for non-ASCII-family platforms. (ICU-21672)

    • Please contact us via the icu-support mailing list. See the thread “ICU users: Do you need ICU to work on EBCDIC platforms?” there.

Version Number

The initial release has library version number 70.1.

If there are maintenance releases, they will be 70.2, 70.3, etc. (During ICU 70 development, the library version number was 70.0.x.)

TBD Note: There may be additional commits on the maint-69 branch that are not included in the prepackaged download files.

Common Changes

    • Unicode 14: 5 new scripts, 37 new emoji, the som currency sign used in the Kyrgyz Republic, 838 total new characters

    • CLDR 40 -- TBD, just rough early draft

      • Grammatical features (gender and case) for units of measurement in additional locales

        • Previously there was a trial phase for grammatical features with only 12 locales (da, de, es, fr, hi, it, nl, no, pl, pt, ru, sv).

        • This has expanded by an additional 29 locales (am, ar, bn, ca, cs, el, fi, gu, he, hr, hu, hy, is, kn, lt, lv, ml, mr, nb, pa, ro, si, sk, sl, sr, ta, te, uk, ur).

        • Of the remaining locales, many (such as English or Japanese) don't require grammatical information for units of measurement, while for others the information remains to be gathered.

      • Example from CLDR 39:

        • For Norwegian, "no" is back to being the canonical code, with "nb" treated as equivalent. This aligns handling of Norwegian with other macro language codes.

        • No new locales, but many improvements and bug fixes; see the CLDR release page

    • New support for emoji properties of strings; for example, Basic_Emoji and RGI_Emoji (ICU-21652)

      • This includes a new version of hasBinaryProperty() that takes a string argument.

    • TBD Number and measurement unit formatting:

      • Measurement unit case and gender (technology preview) (ICU-21123)

        • unitDisplayCase() setter on NumberFormatter to set grammatical case

        • FormattedNumber::getGender() to return the grammatical gender of the output unit

        • These APIs currently deal in strings: this will change in a future release

      • Binary prefixes in measurement units (KiB, MiB, etc.) (ICU-21357)

      • Formatting of custom compound units (ICU-20941)

      • New option to remove fraction digits on whole numbers ($1.99, $2, $2.01) (ICU-20886)

      • New sign display option "negative" (not on negative zero) (ICU-21484)

      • New resolution for fraction-significant rounding (ICU-20019)

      • NumberRangeFormatter supports span fields (ICU-20421)

    • TBD Time zone offsets from local time: New APIs BasicTimeZone::getOffsetFromLocal() (C++ & Java) and ucal_getTimeZoneOffsetFromLocal() (ICU-21372 & ICU-21490)

    • TBD Time zone data (tzdata) version 2021a (2021-jan)
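
As a rough illustration of the "remove fraction digits on whole numbers" option listed above (ICU-20886), the following C++ sketch reproduces the $1.99, $2, $2.01 behavior by hand. It does not call ICU: formatDollarsHideIfWhole() and its cents-based input are hypothetical names chosen for this example, and the sketch only handles non-negative US dollar amounts.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper, not an ICU API.
// Formats a non-negative amount given in cents as dollars, and omits the
// fraction digits only when the amount is a whole number of dollars.
std::string formatDollarsHideIfWhole(int64_t cents) {
    assert(cents >= 0);  // negative amounts are out of scope for this sketch
    const int64_t dollars = cents / 100;
    const int64_t fraction = cents % 100;
    std::string result = "$" + std::to_string(dollars);
    if (fraction != 0) {
        result += '.';
        if (fraction < 10) {
            result += '0';  // keep two fraction digits, e.g. "$2.01"
        }
        result += std::to_string(fraction);
    }
    return result;
}
```

With inputs of 199, 200, and 201 cents, this yields $1.99, $2, and $2.01, matching the example in the list above.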

ICU4C Specific Changes

ICU4J Specific Changes

Migration Issues

New in ICU 70

    1. TBD

Other recent migration issues

    1. TBD 69:

      1. CLDR 39 changes the relationship of language codes for the Norwegian language. Formerly, "nb" was the main locale, and "no" was an alias to it. With this change, "no" is now the main locale, and both "nb" and "nn" inherit from "no". This will be visible in locale canonicalization and in APIs that return lists of available locales. Code that assumes that a locale with only a language subtag has no parent other than root may need to change.

      2. For the LocaleMatcher: Simplified Chinese (zh=zh-Hans=zh-CN) vs. traditional Chinese (zh-Hant=zh-TW) are no longer a match. This will be visible in product behavior.

      3. See also other CLDR 39 migration issues.

    2. TBD earlier:

      1. ICU4C public header files no longer define and use the macros FALSE and TRUE. (ICU-21267)

        1. This avoids collisions between these macro definitions and application code that defines enum constants with these names.

        2. The ICU macros are no longer necessary: C++11 and C99 define false & true values.

        3. ICU API continues to use the ICU-specific type UBool for now; the standard values can be assigned to UBool variables and arguments without warnings.

        4. Please change call sites to use the standard false & true values where necessary. For C (as opposed to C++), these are also macros, defined in <stdbool.h>.

        5. You may transitionally define U_DEFINE_FALSE_AND_TRUE=1 if you need time to migrate code, for example in application code before including any ICU header file, or by patching unicode/umachine.h and changing # define U_DEFINE_FALSE_AND_TRUE 0 to assign value 1 instead.

      2. Constructing a StringPiece from NULL may be ambiguous, depending on the platform. Where this is a problem, please adjust call sites from using NULL to using nullptr. (ICU-20984 PR #1044)

        1. This is similar to issues with the char16_t adoption in ICU 59.

      3. If you rebuild the ICU locale data from (possibly patched) CLDR data, note that there is a new CLDR-to-ICU converter tool in the ICU repo now, replacing the old one in the CLDR repo. (ICU-20693) See icu4c/source/data/cldr-icu-readme.txt
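
The FALSE/TRUE and StringPiece items above can be sketched in application code roughly as follows. This is a self-contained illustration that does not include any ICU header: the UBool typedef is a stand-in (assumed to be an 8-bit integer type), and Answer, isAsciiLetter(), toAnswer(), Piece, and emptyPiece() are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in for ICU's UBool, assumed here to be an 8-bit integer type.
typedef int8_t UBool;

// Application enum constants like these would collide with FALSE/TRUE macros
// if an included header still defined them.
enum class Answer { FALSE, TRUE };

// Hypothetical function in the style of an API that returns UBool.
// The standard bool result converts to UBool implicitly.
UBool isAsciiLetter(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

Answer toAnswer(UBool value) {
    return value ? Answer::TRUE : Answer::FALSE;
}

// Hypothetical stand-in for a string-view class with several pointer-like
// constructors, loosely modeled on the situation described for StringPiece.
class Piece {
public:
    Piece(std::nullptr_t) : ptr_(nullptr), length_(0) {}
    Piece(const char *s) : ptr_(s), length_(s == nullptr ? 0 : std::strlen(s)) {}
    const char *data() const { return ptr_; }
    std::size_t size() const { return length_; }
private:
    const char *ptr_;
    std::size_t length_;
};

// Piece fromNull(NULL);  // may be ambiguous, depending on how the platform defines NULL
inline Piece emptyPiece() {
    return Piece(nullptr);  // selects the std::nullptr_t constructor
}

// Small self-check showing the intended results.
inline void selfCheck() {
    assert(isAsciiLetter('x'));
    assert(!isAsciiLetter('7'));
    assert(toAnswer(isAsciiLetter('Q')) == Answer::TRUE);
    assert(emptyPiece().size() == 0);
    assert(Piece("abc").size() == 3);
}
```

The design point is that call sites rely only on the standard false, true, and nullptr, so they do not depend on how a given platform or header defines FALSE, TRUE, or NULL.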

ICU4C Platform Support

  • All: Compiler support for C++11 is required for building the ICU libraries.

    • Some platforms (such as IBM AIX, IBM z and Solaris) may no longer be able to build ICU until an improved compiler is available.

  • ICU 69 cannot be built with a C++20 compiler. This version of the standard makes incompatible changes that will require ICU API changes. (ICU-20973)

      • Note that ICU 67 already fixed uses of u8"literals" broken by the C++20 introduction of the incompatible char8_t type (ICU-20972),

      • and added a few API overloads to reduce the need for reinterpret_cast (ICU-20984).

  • macOS: Xcode 8.3 (LLVM clang 8.1.0) has been tested.

  • Solaris

  • Windows:

    • The minimum supported version is Windows 7. Windows XP and Windows Vista are no longer supported.

    • Building the Visual Studio UWP projects requires Visual Studio 2017 (or VS2019) with a version of the Windows 10 SDK installed.

      • When using "@compat=host", on versions below Windows 10 version 1703, 6 locales have date and number formatting issues (#13119).

    • The LCID conversion APIs don't round-trip Kurdish (ku) and Central Kurdish (ckb) due to Windows not having a ckb locale (#20181).

    • The pre-built binaries now use Visual Studio 2019 [MSVC2019], instead of Visual Studio 2017 [MSVC2017] (ICU-21108).

    • Windows using the ICC compiler:

      • Source File Encoding. The ICC compiler does not recognize the /utf-8 option. A work-around is known and reported to succeed. (#13251)

  • IBM AIX:

    • TBD

  • IBM z

    • TBD
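
For application code affected by the u8"literals" and char8_t note above, one common pattern is sketched below. It is an assumption about how calling code might be written, not a description of how ICU itself was changed; utf8Sample() and selfCheck() are hypothetical names.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns UTF-8 bytes as const char *.
// u8"..." has element type char before C++20 and char8_t in C++20,
// so the cast keeps this line compiling under either standard.
inline const char *utf8Sample() {
    return reinterpret_cast<const char *>(u8"gr\u00FC\u00DF");
}

inline void selfCheck() {
    // U+00FC and U+00DF each take two bytes in UTF-8: 1 + 1 + 2 + 2 = 6.
    assert(std::strlen(utf8Sample()) == 6);
}
```

API overloads that accept the C++20 literal type directly, as mentioned in the note above, reduce how often a cast like this is needed.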

ICU4J Platform Support

ICU4J works on Java 7 and on Android API level 21.

Download

Source and binary downloads are available on the git/GitHub tag page: https://github.com/unicode-org/icu/releases/tag/release-69-1

See the Source Code Access page for how to download the ICU file tree directly from GitHub.

ICU locale data was generated from CLDR tag https://github.com/unicode-org/cldr/releases/tag/release-39-beta2 (same data as in the CLDR 39 release).

Maven dependency:

<dependency>
  <groupId>com.ibm.icu</groupId>
  <artifactId>icu4j</artifactId>
  <version>70.1</version>
</dependency>