ICU - International Components for Unicode

ICU-TC Home Page

News

2026-03-17: ICU 78.3 is now available — releases/tag/release-78.3 — Maven: com.ibm.icu / icu4j / version 78.3

This is a maintenance release. It supersedes ICU 78.1 and 78.2.

ICU 78 updates to Unicode 17 (blog), including new characters and scripts, emoji, collation & IDNA changes, and corresponding APIs and implementations.

It also updates to CLDR 48 (blog) locale data with new locales, and various additions and corrections.

In Java, there is a new Segmenter API which is easier and safer to use than BreakIterator.
In C++, there is a new set of APIs for Unicode string (UTF-8/16/32) code point iteration that works seamlessly with modern C++ iterators and ranges.

The Java implementation of the CLDR MessageFormat 2.0 specification has been updated to CLDR 48. The core API has been upgraded to “draft”, while the Data Model API remains in technology preview.

The C++ implementation of MessageFormat 2.0 is at CLDR 47 level and remains in technology preview.

ICU 78 and CLDR 48 are major releases, including a new version of Unicode and major locale data improvements.

2025-03-13: ICU 77 is now available — releases/tag/release-77-1 — Maven: com.ibm.icu / icu4j / version 77.1

ICU 77 updates to CLDR 47 locale data with new locales, and various additions and corrections.

ICU 77 is mostly focused on bug fixes, segmentation conformance, and other refinements. The technology preview implementations of the CLDR MessageFormat 2.0 specification have been updated to incorporate some, but not yet all, of the CLDR 47 changes (Java more so than C++).

2024-10-24: ICU 76 is now available. It updates to Unicode 16 (blog), including new characters and scripts, emoji, collation & IDNA changes, and corresponding APIs and implementations. It also updates to CLDR 46 (beta blog) locale data with new locales, significant updates to existing locales, and various additions and corrections. For example, the CLDR and Unicode default sort orders are now very nearly the same.

Most of the java.time (Temporal) types can now be formatted directly. There are some new APIs to make ICU easier to use with modern C++ and Java patterns. The Java and C++ technology preview implementations of the CLDR MessageFormat 2.0 specification have been updated to match recent changes. See ICU 76.

What is ICU?

ICU is a mature, widely used set of C/C++ and Java libraries providing Unicode and Globalization support for software applications. ICU is widely portable and gives applications the same results on all platforms and between C/C++ and Java software. ICU is released under a nonrestrictive open source license that is suitable for use with both commercial software and other open source or free software.

Here are a few highlights of the services provided by ICU:

- Code Page Conversion: Convert text data to or from Unicode and nearly any other character set or encoding. ICU's conversion tables are based on charset data collected by IBM over the course of many decades, and are the most complete available anywhere.
- Collation: Compare strings according to the conventions and standards of a particular language, region or country. ICU's collation is based on the Unicode Collation Algorithm plus locale-specific comparison rules from the Common Locale Data Repository, a comprehensive source for this type of data.
- Formatting: Format numbers, dates, times and currency amounts according to the conventions of a chosen locale. This includes translating month and day names into the selected language, choosing appropriate abbreviations, ordering fields correctly, etc. This data also comes from the Common Locale Data Repository.
- Time Calculations: Multiple types of calendars are provided beyond the traditional Gregorian calendar. A thorough set of time zone calculation APIs is provided.
- Unicode Support: ICU closely tracks the Unicode standard, providing easy access to all of the many Unicode character properties, Unicode Normalization, Case Folding and other fundamental operations as specified by the Unicode Standard.
- Regular Expression: ICU's regular expressions fully support Unicode while providing very competitive performance.
- Bidi: Support for handling text containing a mixture of left-to-right (English) and right-to-left (Arabic or Hebrew) data.
- Text Boundaries: Locate the positions of words, sentences, and paragraphs within a range of text, or identify locations that would be suitable for line wrapping when displaying the text.

And much more. Refer to the ICU User Guide for details.
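As a rough illustration of a few of the services listed above (collation, number formatting, text boundaries, and normalization), here is a minimal Java sketch. It deliberately uses the JDK's own java.text classes so that it runs without extra dependencies. ICU4J provides classes with the same names (Collator, NumberFormat, BreakIterator) in the com.ibm.icu.text package, so a similar program could be written against ICU4J, assuming the icu4j artifact is on the classpath; that swap is not shown or verified here. The class name ServicesSketch and its helper methods are hypothetical names chosen for this example.

```java
import java.text.BreakIterator;
import java.text.Collator;
import java.text.Normalizer;
import java.text.NumberFormat;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; uses JDK java.text classes, not ICU4J itself.
class ServicesSketch {

    // Collation: sort by the conventions of a locale instead of by code point.
    static List<String> sortForLocale(List<String> items, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> sorted = new ArrayList<>(items);
        sorted.sort(collator);
        return sorted;
    }

    // Formatting: render a number with the locale's grouping and decimal marks.
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Text boundaries: return the words in a text, skipping spaces and punctuation.
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> result = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            // Keep only segments that begin with a letter or digit.
            if (Character.isLetterOrDigit(piece.codePointAt(0))) {
                result.add(piece);
            }
        }
        return result;
    }

    // Unicode support: compose a string into Normalization Form C.
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        System.out.println(sortForLocale(Arrays.asList("banana", "Zebra", "apple"), Locale.ENGLISH));
        System.out.println(formatNumber(1234567.891, Locale.US));
        System.out.println(formatNumber(1234567.891, Locale.GERMANY));
        System.out.println(words("Unicode text, everywhere.", Locale.ENGLISH));
        System.out.println(nfc("e\u0301").length());
    }
}
```

A plain String sort would place "Zebra" before "apple" because uppercase letters have lower code points; the collator orders the words the way an English reader would expect. The same number is rendered with different grouping and decimal separators for the US and German locales.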

Why Unicode?

Unicode (and the parallel ISO 10646 standard) defines the character set necessary for efficiently processing text in any language and for maintaining text data integrity. In addition to global character coverage, the Unicode standard is unique among character set standards because it also defines data and algorithms for efficient and consistent text processing. This simplifies high-level processing and ensures that all conformant software produces the same results. The widespread adoption of Unicode over the last decade made text data truly portable and formed a cornerstone of the Internet.

What is Unicode?

Globalized software, based on Unicode, maximizes market reach and minimizes cost. Globalized software is built and installed once and yet handles text for and from users worldwide and accommodates their cultural conventions. It minimizes cost by eliminating per-language builds, installations, and maintenance updates.

Why ICU4C?

The C and C++ languages and many operating system environments do not provide full support for Unicode and standards-compliant text handling services. Even though some platforms do provide good Unicode text handling services, portable application code cannot make use of them. The ICU4C libraries fill this gap. ICU4C provides an open, flexible, portable foundation for applications to use for their software globalization requirements. ICU4C closely tracks industry standards, including Unicode and CLDR (Common Locale Data Repository).

Why ICU4J?

Java provides a very strong foundation for global programs, and IBM and the ICU team played a key role in contributing globalization technology to Sun's Java. But because of its long release schedule, Java cannot always keep up to date with evolving standards. The ICU team continues to extend Java's Unicode and internationalization support, focusing on improving performance, keeping current with the Unicode standard, and providing richer APIs, while remaining as compatible as possible with the original Java text and internationalization API design.

See Why Use ICU4J?

ICU4JNI

New versions of ICU4JNI are no longer being created. If you need the functionality of ICU4JNI, you should consider migrating to ICU4J.

Who Uses ICU?

The following is a list of products, companies and organizations reported to be using ICU. If you have any feedback on this list (corrections, additions, or details), please contact us (on icu-support).

Companies and Organizations using ICU

ABAS Software, Adobe, Amazon (Kindle), Amdocs, Apache, Appian, Apple, Argonne National Laboratory, Avaya, BAE Systems Geospatial eXploitation Products, BEA, BluePhoenix Solutions, BMC Software, Boost, BroadJump, Business Objects, caris, CERN, CouchDB, Debian Linux, Dell, Eclipse, eBay, EMC Corporation, ESRI, Facebook (HHVM), Firebird RDBMS, FreeBSD, Gentoo Linux, Google, GroundWork Open Source, GTK+, Harman/Becker Automotive Systems GmbH, HP, Hyperion, IBM, Inktomi, Innodata Isogen, Informatica, Intel, Interlogics, IONA, IXOS, Jikes, Library of Congress, LibreOffice, Mathworks, Microsoft, Mozilla, Netezza, Node.js, Oracle (Solaris, Java), Lawson Software, Leica Geosystems GIS & Mapping LLC, Mandrake Linux, OCLC, Progress Software, Python, QNX, Rogue Wave, SAP, SIL, SPSS, Software AG, SuSE, Sybase, Symantec, Teradata (NCR), ToolAware, Trend Micro, Virage, webMethods, Wikimedia Foundation [Wikipedia] MediaWiki application servers, Wine, WMS Gaming, XyEnterprise, Yahoo!, Vuo, and many others.

Apache Projects

Harmony, Lucene search library, OpenOffice, PDFBox library, Solr search engine server, Tika metadata toolkits, Xalan XSLT, Xerces XML

Products from IBM

DB2, Lotus, Websphere, Tivoli, Rational, AIX, i/OS, z/OS

Ascential Software, Cloudant, Cognos, PSD Print Architecture, COBOL, Host Access Client, InfoPrint Manager, Informix GLS, Language Analysis Systems, Lotus Notes, Lotus Extended Search, Lotus Workplace, WebSphere Message Broker, NUMA-Q, OTI, OmniFind, Pervasive Computing WECMS, Rational Business Developer and Rational Application Developer, SS&S Websphere Banking Solutions, Tivoli Presentation Services, Tivoli Identity Manager, WBI Adapter/ Connect/Modeler and Monitor/ Solution Technology Development/WBI-Financial TePI, Websphere Application Server/ Studio Workload Simulator/Transcoding Publisher, XML Parser.

Products from Google

Android developer guide: Unicode and internationalization support

Web Search, Google+, Chrome/Chrome OS, Android, Adwords, Google Finance, Google Maps, Blogger, Google Analytics, Google Groups, and others.

Products from Apple

macOS (OS & applications), iOS (iPhone, iPad, iPod touch), watchOS & tvOS, Safari for Windows & other Windows applications and related support, Apple Mobile Device Support in iTunes for Windows.

Products from Microsoft

Article: .NET globalization and ICU

Article: International Components for Unicode (ICU) – Highlights of the Globalization API services provided by ICU etc.

Windows Bridge for iOS, Windows 10 - Creators Update, Visual Studio 2017 [Electron], Visual Studio Code [Electron], ChakraCore

Products from Harman/Becker

The following car brands are using ICU via the Harman/Becker automotive software: Alfa Romeo, Audi, Bentley, BMW, Buick, more...

Products from Adobe

Creative Cloud apps and Document Cloud

Related Projects

There are also some related projects that wrap the existing functionality of ICU.
