Proposal 20070719
Proposal email sent to the icu-design list on 2007-jul-19.
time zone API: getDisplayName()
Markus Scherer <markus.icu@gmail.com>
Thu, Jul 19, 2007 at 1:50 PM
To: icu-design@lists.sourceforge.net
Dear ICU team,
Below please see an exchange from last November between John Emmons
and myself. It contains an API proposal of sorts, showing wrapper code
and suggesting something like it for ICU.
I would like to see if we could add this for ICU 3.8, even if it were
to use DateFormat under the covers right now, like my wrapper here
does.
On the question of an API that takes a "bool daylight" but not a
date/time value, I understand from John's reply that it is problematic
-- a time zone might not have used daylight saving time consistently
in the past. However, it might still be useful for getting a display
name when you have a Unix struct tm or similar so that you need not
puzzle together (or guess!) an appropriate date/time value. What do
you think? (If this is too controversial, although it follows the
current API more closely, I think I can do without it, at least for
now. If we had to use a DateFormat right now, then this variant would
not be easy to implement anyway.)
John & Mark, could you please bring me up to speed on your work on
meta time zones?
markus
Forwarded Conversation
Subject: time zone API: getDisplayName()
------------------------
From: Markus Scherer <markus.icu@gmail.com>
To: John Emmons <emmo@us.ibm.com>, Mark Davis <mark.davis@icu-project.org>
Date: Sat, Nov 4, 2006 at 8:02 AM
Hi John,
Some ICU meetings ago you said you were working on improved time zone
display name look-ups, and I said I would work with you on the API
where we need to be able to request particular forms. Sorry it took me
so long to start the discussion!
So here we go. I have created a thin wrapper around the ICU4C TimeZone
class to provide a smaller API with the requested features. I am
copying the relevant parts below. I don't know if you are working only
on getDisplayName() or also on getOffset(). This just includes the
parts for getDisplayName(). Please let me know if you are also working
on getOffset().
For getDisplayName(), I essentially added a DisplayStyle enum
parameter with the CLDR-defined choices directly selectable. These are
the preferred formats; of course there will be fallbacks as necessary.
I also have a DisplayLength enum mirroring ICU's EDisplayType (short
vs. long format).
My implementation currently uses a DateFormat, which is slow and does
not quite provide the granularity of format selection, at least in the
current implementation. (The missing granularity should probably be
fixed in DateFormat as well.) Also because of the DateFormat, I ended
up only implementing a function for now that takes a point-in-time
parameter (so that I have a time to stick into the DateFormat), rather
than the more direct function that takes the boolean daylight
selector.
The goal is to have a TimeZone::getDisplayName() function, much like
the one in my wrapper, with a selector like the DisplayStyle here so
that I can implement my wrapper much more directly, without the detour
through the DateFormat.
What do you think?
The following parts of my wrapper API include the getDisplayName().
// Constants for use with GetDisplayName(), for whether a short or
// a long display name is desired.
// Keep the constants and their numeric values in sync with
// ICU's TimeZone::EDisplayType.
enum DisplayLength {
  SHORT = 1,
  LONG = 2
};
// Constants for use with GetDisplayName(), selecting the
// style of time zone display name.
enum DisplayStyle {
  GMT_OFFSET,  // GMT+9:30
  RFC822,      // +0930
  GENERIC,     // Pacific Time
  SPECIFIC,    // Pacific Standard Time or Pacific Daylight Time
  LOCATION,    // Los Angeles (US)
  STYLE_COUNT
};
// Get a display name for the time zone and the specified display locale.
// The locale should be a string like "en", "de_CH" or "zh_Hans".
// If there is no good display name available for the time zone ID, then
// the time zone ID itself is returned.
// The returned string will usually contain non-ASCII characters.
//
// TODO(mscherer): Currently ICU is missing functionality:
// If the LOCATION style is requested, the function may return
// the GENERIC or SPECIFIC style instead.
UnicodeText GetDisplayName(const DateTime &time,
                           DisplayStyle style,
                           DisplayLength length,
                           const string &display_locale) const;
#if 0
// TODO(mscherer): Add this API function here once ICU has a corresponding API
// function. The current icu::TimeZone::getDisplayName() takes a bool daylight
// but does not support this style parameter.
// Instead, the current GetDisplayName(time, ...) implementation
// uses an ICU DateFormat object which requires a datetime parameter.
// We would have to guess a datetime for implementing the version below.
// Overload that takes a bool daylight instead of the time value.
UnicodeText GetDisplayName(bool daylight,
                           DisplayStyle style,
                           DisplayLength length,
                           const string &display_locale) const;
#endif
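For illustration, the DateFormat detour described in the comment above could pick its pattern from the two enum parameters roughly as follows. This is a minimal sketch, not the wrapper's actual code: ZonePatternFor is a hypothetical helper, the enums are repeated only to keep the block self-contained, and the pattern letters are the ones documented in current LDML, which may differ from what ICU supported in 2006.

```cpp
#include <string>

// Repeated from the wrapper API above so that this sketch compiles on its own.
enum DisplayLength { SHORT = 1, LONG = 2 };
enum DisplayStyle { GMT_OFFSET, RFC822, GENERIC, SPECIFIC, LOCATION, STYLE_COUNT };

// Hypothetical helper: returns a date format pattern that contains only a
// time zone field. The letters follow current LDML (an assumption; the
// patterns used by the 2006 wrapper are not shown in this thread).
std::string ZonePatternFor(DisplayStyle style, DisplayLength length) {
  const bool is_long = (length == LONG);
  switch (style) {
    case GMT_OFFSET:  return "ZZZZ";                  // long localized GMT format
    case RFC822:      return "Z";                     // numeric offset such as +0930
    case GENERIC:     return is_long ? "vvvv" : "v";  // generic non-location name
    case SPECIFIC:    return is_long ? "zzzz" : "z";  // standard/daylight-specific name
    case LOCATION:    return "VVVV";                  // generic location format
    case STYLE_COUNT: break;                          // not a real style
  }
  return "zzzz";  // arbitrary fallback chosen for this sketch
}
```

The returned pattern would then be handed to an icu::SimpleDateFormat created for the display locale, with the time zone set on the formatter and the point-in-time value formatted through it. That per-call formatter is the overhead a direct TimeZone::getDisplayName() with a style selector would avoid.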
Best regards,
markus
--------
From: John Emmons <emmo@us.ibm.com>
To: Markus Scherer <markus.icu@gmail.com>
Date: Mon, Nov 6, 2006 at 8:32 AM
Hi Markus,
Looks like a good start. However, my biggest concern, which is the
same one that Mark and I are grappling with right now, is how to deal
with Olson zones that may have a different display name depending on
the time in question. In these scenarios, it is difficult or nearly
impossible to implement a getDisplayName() function without going
through DateFormat.
For example,
America/Indiana/Knox - Includes many counties in Indiana that
currently observe CST in winter and CDT in summer. But, prior to
2006, these counties observed EST year round. So in these cases, you
can't do a reliable lookup of the time zone's display name without
knowing which time we are talking about, unless you are willing to
live with an API that returns the display name only as it applies to
the current modern time, and I question how useful such an API would
be in practice.
We are also dealing with the complexities of how to deal with the fact that
often many Olson zones share a commonly used display name, and we
don't want to have to duplicate these display names everywhere in
CLDR. Things like "Atlantic Standard Time" can apply to
"America/Halifax", but also to "Atlantic/Bermuda", "America/Barbados",
etc. Since they often cross country boundaries, we have the
potential for political conflicts. For example, if I decide I'm going
to put the translations for "Central European Time" in "Europe/Paris",
and alias "Europe/Berlin" to it, do the Germans get upset? And then
what happens when "Europe/Paris" changes its rules? I think you can
appreciate the complexities involved here...
At this point, I am toying with the possibilities of having a
"meta-time zone" that we could define in CLDR for naming purposes, and
then we could define the fact that a certain Olson zone "observes" one
of the meta zones during a specific time period. Right now I'm trying
to formulate a syntax for this that would make sense and cover the
scenarios we need it to.
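One way to picture the "observes" relationship is a table from Olson zone ID to a list of periods, each naming a meta zone. The sketch below is only an assumption about what such data might look like, not the CLDR syntax under discussion: MetaZonePeriod, MetaZoneTable, LookUpMetaZone and the meta zone names are hypothetical, boundaries are whole years for brevity, and the sample entries merely restate the examples given in this message.

```cpp
#include <map>
#include <string>
#include <vector>

// One period during which an Olson zone "observes" a meta zone.
// Years keep the sketch short; real data would need exact date/time boundaries.
struct MetaZonePeriod {
  int from_year;          // inclusive
  int to_year;            // exclusive
  std::string meta_zone;  // hypothetical meta zone identifier
};

using MetaZoneTable = std::map<std::string, std::vector<MetaZonePeriod>>;

// Illustrative data only, restating the examples from the message above.
MetaZoneTable ExampleTable() {
  const int kForever = 9999;
  return {
      {"America/Indiana/Knox",
       {{0, 2006, "Eastern"}, {2006, kForever, "Central"}}},
      {"America/Halifax", {{0, kForever, "Atlantic"}}},
      {"Atlantic/Bermuda", {{0, kForever, "Atlantic"}}},
  };
}

// Returns the meta zone observed by zone_id in the given year,
// or an empty string if the table has no matching entry.
std::string LookUpMetaZone(const MetaZoneTable& table,
                           const std::string& zone_id, int year) {
  const auto it = table.find(zone_id);
  if (it == table.end()) return "";
  for (const MetaZonePeriod& period : it->second) {
    if (year >= period.from_year && year < period.to_year) {
      return period.meta_zone;
    }
  }
  return "";
}
```

With data shaped like this, translations would be stored once per meta zone, and zones that share a name (Halifax and Bermuda here) would resolve to the same entry without one zone being aliased to the other.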
You're certainly welcome to participate in the discussion and design
of this. Right now Mark and I are working on it together since no one
else seems to care...
Regards,
John C. Emmons
Globalization Architect
IBM Software Group, Austin TX
Ph. 512-838-8184/512-259-9051
Internet: emmo@us.ibm.com
"Markus Scherer" <markus.icu@gmail.com>
11/04/2006 09:02 AM
To John Emmons/Austin/IBM@IBMUS, "Mark Davis" <mark.davis@icu-project.org>
cc
Subject time zone API: getDisplayName()
--------
From: Markus Scherer <markus.icu@gmail.com>
To: John Emmons <emmo@us.ibm.com>
Date: Mon, Nov 6, 2006 at 11:08 AM
Hi John, thanks for the reply and the reminder that I am still
underestimating how messy time zones are!
On 11/6/06, John Emmons <emmo@us.ibm.com> wrote:
> ... my biggest concern, which is the same one that Mark and I are grappling with right now, is how to deal with Olson zones that may have a different display name depending on the time in question. In these scenarios, it is difficult or nearly impossible to implement a getDisplayName() function without going through DateFormat.
> ... America/Indiana/Knox - Includes many counties in Indiana that currently observe CST in winter and CDT in summer. But, prior to 2006, these counties observed EST year round. ...
Very good point. This does smell like deprecating versions of
getDisplayName() that do not take a date/time value, and adding ones
that do. However, I would hate for such methods to go through
DateFormat, particularly because that means creating one inside the
method, using it once, and throwing it away -- or else mutexing the
use of an owned DateFormat object. Either way is a slow bottleneck. It
seems like it should be the other way around: A new version of
TimeZone::getDisplayName() should be able to figure out the display
name based on the provided date/time, and DateFormat should call it
with the date/time and with the style and length selectors.
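The layering argued for here can be sketched with stand-in types: the zone object resolves its own name from the date/time, and the formatter calls into it, so no formatter is created or locked inside the name lookup. Everything below is hypothetical (ZoneSketch, NameStyle, FormatZoneField, and the in_daylight rule) and only illustrates the direction of the dependency, not an actual ICU interface.

```cpp
#include <functional>
#include <string>

// Hypothetical, reduced style selector for this sketch.
enum NameStyle { GENERIC_NAME, SPECIFIC_NAME };

// Hypothetical stand-in for a time zone that can name itself for a given time.
struct ZoneSketch {
  std::string generic_name;
  std::string standard_name;
  std::string daylight_name;
  // Hypothetical rule: is daylight saving time in effect at this time value?
  std::function<bool(long long)> in_daylight;

  // Name lookup depends only on the zone's own data and the time value.
  std::string GetDisplayName(long long time, NameStyle style) const {
    if (style == GENERIC_NAME) return generic_name;
    return in_daylight(time) ? daylight_name : standard_name;
  }
};

// Hypothetical formatter-side code: the formatter asks the zone for the
// name, rather than the zone constructing a formatter to find its own name.
std::string FormatZoneField(const ZoneSketch& zone, long long time,
                            NameStyle style) {
  return zone.GetDisplayName(time, style);
}
```

A caller that only needs the name would call GetDisplayName() directly and never touch a formatter.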
> So in these cases, you can't do a reliable lookup of the time zone's display name without knowing which time we are talking about, unless you are willing to live with an API that returns the display name only as it applies to the current modern time, and I question how useful such an API would be in practice.
Makes sense. I think we will have to implement this "current behavior"
lookup for the current API though because we don't have the date/time
available and we can't remove the current API.
> We are also dealing with the complexities of how to deal with the fact that often many Olson zones share a commonly used display name...
> For example, if I decide I'm going to put the translations for "Central European Time" in "Europe/Paris", and alias "Europe/Berlin" to it, do the Germans get upset? And then what happens when "Europe/Paris" changes its rules? I think you can appreciate the complexities involved here...
Somewhat. I am not sure that anyone would be upset by attaching shared
data to one or the other arbitrarily, for example by using alphabetic
order or something else neutral for choosing the anchor point for the
data.
> At this point, I am toying with the possibilities of having a "meta-time zone" that we could define in CLDR for naming purposes, and then we could define the fact that a certain Olson zone "observes" one of the meta zones during a specific time period.
This seems like a nice solution even from a technical standpoint,
politics aside.
> Right now I'm trying to formulate a syntax for this that would make sense and cover the scenarios we need it to.
> You're certainly welcome to participate in the discussion and design of this. Right now Mark and I are working on it together since no one else seems to care...
Well, my main interest is getting to a more usable API, but I would be
happy to participate in bouncing around the data organization as well.
Best regards,
markus