ICU 70 updates to Unicode 14, including new characters, scripts, emoji, and corresponding API constants. ICU 70 adds support for emoji properties of strings. It also updates to CLDR 40 locale data with many additions and corrections, and includes many other bug fixes and enhancements, especially for measurement unit formatting.
For more details, including migration issues, see below.
🔴🔴🔴 Do you need ICU to work on EBCDIC platforms? 🔴🔴🔴
We need help: Someone needs to build ICU4C on a native-EBCDIC machine (IBM z or i), fix C++ compiler issues (if any), fix issues related to an EBCDIC codepage as the system encoding, and test frequently (or add their machine to our CI). Please contact us via the icu-support mailing list.
Otherwise we will remove the support code for non-ASCII-family platforms. Details: ICU-21672
The initial release has library version number 70.1.
Release date: 2021-10-27
If there are maintenance releases, they will be 70.2, 70.3, etc. (During ICU 70 development, the library version number was 70.0.x.)
Note: There may be additional commits on the maint-70 branch that are not included in the prepackaged download files.
Unicode 14 adds 5 new scripts, 37 new emoji (plus additional new sequences), the som currency sign used in the Kyrgyz Republic, and 838 new characters in total.
Line breaking has been updated for forward compatibility with emoji: Unicode 14 and ICU 70 already handle line breaking for emoji that will be added in Unicode 15/16.
CLDR 40 adds grammatical features (gender and case) for units of measurement in 29 additional locales.
Many improvements and bug fixes; see the CLDR release page
Support for emoji properties of strings includes a new version of hasBinaryProperty() that takes a string argument.
Complementing a set via pattern syntax (^ or \P) or applyPattern() and similar performs a “code point complement” (which removes multi-character strings) (ICU-21524)
Time zone data (tzdata) version 2021e (2021-oct), except that for now we retain pre-1970 time zone data that has recently been removed from the TZ DB. (ICU reports this as version "2021a3".)
ICU4C Specific Changes
ICU operator==() and operator!=() functions now return bool instead of UBool, as an adjustment for incompatible changes in C++20 (ICU-20973)
ICU 70 can now be built and used with C++20 compilers.
Memory management continues to be reviewed and made simpler and safer by using more smart pointers.
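The operator==() return type change noted above can matter for client code that subclasses a library class and overrides a virtual comparison operator, because C++ requires an override to use the same return type as the base declaration. The following is a minimal sketch of that situation. `BaseFormat`, `PaddedFormat`, `sameFormat`, and `differentFormat` are hypothetical stand-ins invented for illustration; they are not ICU classes or functions, and whether a given project is affected depends on which ICU classes it subclasses.

```cpp
// Hypothetical stand-in for a library base class with a virtual comparison.
// This is NOT an ICU header; it only mirrors the general shape.
class BaseFormat {
public:
    virtual ~BaseFormat() = default;
    // The comparison returns bool (rather than an int8_t-style typedef).
    virtual bool operator==(const BaseFormat& other) const = 0;
    bool operator!=(const BaseFormat& other) const { return !operator==(other); }
};

// Hypothetical client subclass.
class PaddedFormat : public BaseFormat {
public:
    explicit PaddedFormat(int width) : width_(width) {}
    // The override must also return bool. Keeping an older integer-typed
    // return here would be rejected as a conflicting return type.
    bool operator==(const BaseFormat& other) const override {
        const PaddedFormat* p = dynamic_cast<const PaddedFormat*>(&other);
        return p != nullptr && p->width_ == width_;
    }
private:
    int width_;
};

// Helpers that compare through base-class references.
inline bool sameFormat(const BaseFormat& a, const BaseFormat& b) { return a == b; }
inline bool differentFormat(const BaseFormat& a, const BaseFormat& b) { return a != b; }
```

With two `PaddedFormat` objects of the same width, `sameFormat` returns true; with different widths, `differentFormat` returns true. Callers that merely store the comparison result in an integer-typed variable should be unaffected, since bool converts implicitly.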
ICU4J Specific Changes
For the LocaleMatcher: English in Canada (en-CA) and English in the Philippines (en-PH) now match American English (en=en-US) more closely than British English (en-GB). This will be visible in product behavior.
ICU4C: String search now lazily creates the internal break iterator. As a result, warnings that would previously have been reported by usearch_open may now be reported by other APIs such as usearch_next, depending on when the break iterator gets created. Callers that check for exact statuses like U_ZERO_ERROR should consider using the U_SUCCESS() macro instead (ICU-21533, ICU-21710).
See also CLDR 40 migration issues.
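To illustrate the string search note above: a check for an exact "no error" status treats an informational warning as if it were a failure, while a success-style check accepts it. The sketch below is self-contained and does not include ICU headers. `SketchStatus`, `sketchSuccess`, `fragileCheck`, and `robustCheck` are hypothetical names that only mirror the convention of ICU status codes (warnings negative, zero for no error, failures positive); real code would use UErrorCode with U_SUCCESS() or U_FAILURE().

```cpp
// Hypothetical stand-ins that mirror the layout of ICU status codes:
// warnings are negative, "no error" is zero, failures are positive.
enum SketchStatus {
    kSketchWarning = -127,   // stands in for an informational warning
    kSketchZeroError = 0,    // stands in for U_ZERO_ERROR
    kSketchFailure = 1       // stands in for a real failure
};

// Same idea as U_SUCCESS(): any status at or below zero is not a failure.
inline bool sketchSuccess(SketchStatus s) { return s <= kSketchZeroError; }

// Fragile: rejects warnings, so it changes behavior when a warning
// moves from one API call to another.
inline bool fragileCheck(SketchStatus s) { return s == kSketchZeroError; }

// Robust: accepts both "no error" and warnings, rejects only failures.
inline bool robustCheck(SketchStatus s) { return sketchSuccess(s); }
```

Under this model, `fragileCheck(kSketchWarning)` is false while `robustCheck(kSketchWarning)` is true, and both reject `kSketchFailure`. That is the reason a success-style check is less sensitive to which call ends up reporting a warning.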
ICU4C Platform Support
ICU4C requires C++11 and has been tested with up to C++20.
We routinely test on recent versions of Linux, macOS, and Windows.
We accept patches for other platforms.
Windows: The minimum supported version is Windows 7. (See How To Build And Install On Windows for more details.)
ICU4J Platform Support
ICU4J works on Java 7..16 and on Android API level 21 and later.
Source and binary downloads are available on the git/GitHub tag page: https://github.com/unicode-org/icu/releases/tag/release-70-1
See the Source Code Access page for how to download the ICU file tree directly from GitHub.
ICU locale data was generated from CLDR tag https://github.com/unicode-org/cldr/releases/tag/release-40.