ICU is the premier library for software internationalization, used by a wide array of companies and organizations.
ICU 73 updates to CLDR 43 [TODO: link to blog] locale data with various additions and corrections.
ICU 73 improves Japanese and Korean short-text line breaking, reduces C++ memory use in date formatting, and promotes the Java person name formatter from tech preview to draft.
ICU 73 updates to the time zone data version 2023a (2023-mar) [2023b in the final release]. Note that pre-1970 data for a number of time zones has been removed, as has been the case in the upstream tzdata release since 2021b.
For more details, including migration issues, see below.
Please use the icu-support mailing list, and find or submit error reports as needed.
The initial release has library version number 73.1.
Release date: planned for ca. 2023-04-13
If there are maintenance releases, they will be 73.2, 73.3, etc. (During ICU 73 development, the library version number was 73.0.x.)
Note: There may be additional commits on the maint/maint-73 branch that are not included in the prepackaged download files.
Next Release (FYI)
For the next release, ICU 74 in 2023-oct, we plan to make the following changes:
C: Require C11 (up from C99)
C++: Require C++17 (up from C++11)
Java: Switch from ant to Maven, and rearrange the source file tree to the Maven default
CLDR 43 (blog for CLDR 43 alpha / TODO: blog for release):
CLDR 43 is a limited-submission release. Data for many languages has been improved.
In English, the name “Türkiye” is now used for the country instead of “Turkey” (the alternate spelling is also available in the data). Where appropriate, a corresponding term is used in other languages.
Person name formatting data is now complete and out of “tech preview”.
Collation: Improved sorting & matching of “fancy quotes”, Geresh, and Gershayim in the default (CLDR root) sort order. (CLDR-15946, L2/23-016)
Several punctuation marks now compare primary-equal to their single and double quote ASCII fallbacks. This makes them easier to find, and groups names together that only differ in whether ASCII quotes or typographic quotes are used.
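To make the grouping effect concrete, here is a minimal JDK-only sketch. It does not use the ICU or CLDR collation data; it only imitates the outcome by mapping a small, assumed subset of typographic quotes to their ASCII fallbacks before comparing. The class name QuoteFallbackKey and the choice of four quote characters are hypothetical and for illustration only; the authoritative set of affected characters is defined in the CLDR root collation data.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Illustrative only: imitates the grouping effect with a hand-built comparison key. */
final class QuoteFallbackKey {
    /**
     * Replaces an assumed subset of typographic quotes with ASCII fallbacks.
     * This is not the CLDR algorithm, just an approximation of its visible effect.
     */
    static String key(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\u2018': // left single quotation mark
                case '\u2019': // right single quotation mark
                    sb.append('\'');
                    break;
                case '\u201C': // left double quotation mark
                case '\u201D': // right double quotation mark
                    sb.append('"');
                    break;
                default:
                    sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two spellings of the same name, one with a typographic apostrophe.
        List<String> names = Arrays.asList("O\u2019Brien", "Obrien", "O'Brien", "O'Brian");
        // Sorting by the fallback key places both spellings next to each other.
        names.sort(Comparator.comparing(QuoteFallbackKey::key));
        System.out.println(names);
    }
}
```

With real ICU collation the same effect comes from the collator itself at primary strength, with no preprocessing of the strings.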
A new unit was added for the Beaufort scale (wind speed).
Improved and expanded data for likely subtags.
Japanese phrase-based line breaking now uses the BudouX machine learning implementation for better quality. (ICU-22100, see ICU 71 ICU-21699 for context)
Phrase-based line breaking for Korean now breaks at spaces (approximates word boundaries). (ICU-22119)
The UnicodeSet::closeOver() function has a new option for simple case folding. (ICU-6065)
C: USET_SIMPLE_CASE_INSENSITIVE / Java: UnicodeSet.SIMPLE_CASE_INSENSITIVE
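In ICU4J the new option is passed to UnicodeSet.closeOver() as UnicodeSet.SIMPLE_CASE_INSENSITIVE. The sketch below does not call ICU. It is a rough JDK-only approximation of what closing a set over simple case folding means, where each code point folds to exactly one code point. The class SimpleCaseClosure and its fold() helper are hypothetical, and the JDK case mappings used here are only assumed to be close to Unicode Simple_Case_Folding, not identical to it.

```java
import java.util.Set;
import java.util.TreeSet;

/** Illustrative only: approximates closure under simple case folding with JDK mappings. */
final class SimpleCaseClosure {
    /** Approximate single-code-point case fold (assumption: upper-then-lower mapping). */
    static int fold(int cp) {
        return Character.toLowerCase(Character.toUpperCase(cp));
    }

    /** Returns the input plus every code point whose approximate fold matches a member's fold. */
    static Set<Integer> closeOver(Set<Integer> input) {
        Set<Integer> folded = new TreeSet<>();
        for (int cp : input) {
            folded.add(fold(cp));
        }
        Set<Integer> result = new TreeSet<>(input);
        // Scan all code points so that characters folding *into* the set are found too.
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            if (folded.contains(fold(cp))) {
                result.add(cp);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> input = new TreeSet<>();
        input.add((int) 'k');
        // Prints the code points that match 'k' under the approximate simple fold.
        for (int cp : closeOver(input)) {
            System.out.printf("U+%04X%n", cp);
        }
    }
}
```

Because every mapping is one code point to one code point, the result contains only code points and never multi-character strings.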
Several small Calendar API additions to facilitate implementations of the proposed ECMAScript Temporal API. (ICU-22027)
Time zone data (tzdata) version 2023a (2023-mar) [2023b in the final release]. Note that pre-1970 data for a number of time zones has been removed, as has been the case in the upstream tzdata release since 2021b.
ICU4C Specific Changes
New classes SimpleNumber and SimpleNumberFormatter, with a subset of NumberFormatter functionality for less memory, more object reuse, and fewer code dependencies. (ICU-22093)
The SimpleDateFormat class now uses SimpleNumberFormatter, significantly reducing heap memory use. (ICU-20115)
Some internal changes:
Continuous Integration with undefined-behavior sanitizer (UBSan) and alignment sanitizer, and code changes. (ICU-22224)
Continuous Integration with a subset of Control Flow Integrity checks and code changes. (ICU-21374)
Implementation code relies more on C++11 (char16_t, nullptr, override, ...) with fewer typedefs and conditional definitions. (ICU-21833)
ICU4J Specific Changes
New class PersonNameFormatter implementing the draft specification of CLDR person name formatting. (ICU-22081)
Added in ICU 72 as a technology preview.
Promoted to draft in ICU 73, with some API changes. (ICU-22287)
CLDR background on why this feature is being added and what it does.
Technology Preview since ICU 72: New class MessageFormatter implementing the draft specification of the CLDR MessageFormat Working Group. (ICU-22124, draft message syntax)
Since ICU 72: ICU now requires Java 8 but has also been tested with Java 11 & 16 (ICU-22116)
On Android, you may need to enable “library desugaring” depending on your target API level and which parts of ICU you include.
Most of the ICU 72 library code should still work with Java 7 / Android API level 21, but we no longer test with Java 7.
For ICU users who generate ICU data directly from CLDR: In the CLDR repo, the "seed" data has been merged into the "common" file tree (CLDR-6396). As a result, there are many more locale data files in CLDR "common", but many that were moved do not have usable data item coverage and are therefore not automatically added to ICU. See the CLDR Migration section for details.
Interval Formats: A small number of interval formats (like “Dec 2 – 3”) have their spacing changed for consistency. This is unlikely to cause problems, as these changes resemble the large number of similar changes made in CLDR 42/ICU 72.
The “gb2312” and “big5han” Chinese collation tailorings are no longer included in the ICU binary data. (ICU-22285)
These are based on the code point order of their respective legacy charsets. By contrast, the “pinyin” and “stroke” sort orders, which are the defaults for the regional variants of Chinese, are based on current Unicode Han character data.
The ICU source data files still include the data for these tailorings. See the User Guide for how to include them in the binary data.
Future versions of CLDR and ICU may remove the source data for these tailorings. (CLDR-16062)
ICU4C Platform Support
ICU4C requires C++11 and has been tested with up to C++20.
We routinely test on recent versions of Linux, macOS, and Windows.
We accept patches for other platforms.
Windows: The minimum supported version is Windows 7. (See How To Build And Install On Windows for more details.)
ICU4J Platform Support
ICU4J works on Java 8..17.
ICU4J should work on Android API level 21 and later but may require “library desugaring”.
Source and binary downloads are available on the git/GitHub tag page: https://github.com/unicode-org/icu/releases/tag/release-73-rc
See the Source Code Access page for how to download the ICU file tree directly from GitHub.
ICU locale data was generated from CLDR tag https://github.com/unicode-org/cldr-staging/releases/tag/release-43-beta2.
Maven dependency: [TODO: only available after the final release]