Proposal email sent to the icu-design list on 2007-jul-19. time zone API: getDisplayName()
From: Markus Scherer <markus.icu@gmail.com> To: icu-design@lists.sourceforge.net Date: Thu, Jul 19, 2007 at 1:50 PM
Dear ICU team,
Below please see an exchange from last November between John Emmons and myself. It contains an API proposal of sorts, showing wrapper code and suggesting something like it for ICU.
I would like to see if we could add this for ICU 3.8, even if it were to use DateFormat under the covers right now, like my wrapper here does.
On the question of an API that takes a "bool daylight" but not a date/time value, I understand from John's reply that it is problematic -- a time zone might not have used daylight saving time consistently in the past. However, it might still be useful for getting a display name when you have a Unix struct tm or similar, so that you need not puzzle together (or guess!) an appropriate date/time value. What do you think? (This variant follows the current API more closely, but if it is too controversial, I think I can do without it, at least for now. If we had to use a DateFormat right now, then this variant would not be easy to implement anyway.)
John & Mark, could you please bring me up to speed on your work on meta time zones?
markus
Forwarded Conversation
Subject: time zone API: getDisplayName()
------------------------
From: Markus Scherer <markus.icu@gmail.com> To: John Emmons <emmo@us.ibm.com>, Mark Davis <mark.davis@icu-project.org> Date: Sat, Nov 4, 2006 at 8:02 AM
Hi John,
Some ICU meetings ago you said you were working on improved time zone display name look-ups, and I said I would work with you on the API where we need to be able to request particular forms. Sorry it took me so long to start the discussion!
So here we go. I have created a thin wrapper around the ICU4C TimeZone class to provide a smaller API with the requested features. I am copying the relevant parts below. I don't know if you are working only on getDisplayName() or also on getOffset(). This just includes the parts for getDisplayName(). Please let me know if you are also working on getOffset().
For getDisplayName(), I essentially added a DisplayStyle enum parameter with the CLDR-defined choices directly selectable. These are the preferred formats; of course there will be fallbacks as necessary. I also have a DisplayLength enum mirroring ICU's EDisplayType (short vs. long format).
My implementation currently uses a DateFormat, which is slow and does not quite provide the granularity of format selection, at least in the current implementation. (The missing granularity should probably be fixed in DateFormat as well.) Also because of the DateFormat, I ended up only implementing a function for now that takes a point-in-time parameter (so that I have a time to stick into the DateFormat), rather than the more direct function that takes the boolean daylight selector.
The goal is to have a TimeZone::getDisplayName() function, much like the one in my wrapper, with a selector like the DisplayStyle here so that I can implement my wrapper much more directly, without the detour through the DateFormat.
What do you think?
The following parts of my wrapper API include the getDisplayName().
  // Constants for use with GetDisplayName(), for whether a short or
  // a long display name is desired.
  // Keep the constants and their numeric values in sync with
  // ICU's TimeZone::EDisplayType.
  enum DisplayLength {
    SHORT = 1,
    LONG = 2
  };

  // Constants for use with GetDisplayName(), selecting the
  // style of time zone display name.
  enum DisplayStyle {
    GMT_OFFSET,  // GMT+9:30
    RFC822,      // +0930
    GENERIC,     // Pacific Time
    SPECIFIC,    // Pacific Standard Time or Pacific Daylight Time
    LOCATION,    // Los Angeles (US)
    STYLE_COUNT
  };

  // Get a display name for the time zone and the specified display locale.
  // The locale should be a string like "en", "de_CH" or "zh_Hans".
  // If there is no good display name available for the time zone ID, then
  // the time zone ID itself is returned.
  // The returned string will usually contain non-ASCII characters.
  //
  // TODO(mscherer): Currently ICU is missing functionality:
  // If the LOCATION style is requested, the function may return
  // the GENERIC or SPECIFIC style instead.
  UnicodeText GetDisplayName(const DateTime &time,
                             DisplayStyle style,
                             DisplayLength length,
                             const string &display_locale) const;

#if 0
  // TODO(mscherer): Add this API function here once ICU has a corresponding API
  // function. The current icu::TimeZone::getDisplayName() takes a bool daylight
  // but does not support this style parameter.
  // Instead, the current GetDisplayName(time, ...) implementation
  // uses an ICU DateFormat object which requires a datetime parameter.
  // We would have to guess a datetime for implementing the version below.

  // Overload that takes a bool daylight instead of the time value.
  UnicodeText GetDisplayName(bool daylight,
                             DisplayStyle style,
                             DisplayLength length,
                             const string &display_locale) const;
#endif
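As a rough illustration of how a DateFormat-based wrapper could select the requested form, the sketch below maps each DisplayStyle and DisplayLength pair to a CLDR time zone pattern field. PatternForStyle is a hypothetical helper, not part of ICU or of the wrapper as posted, and the letters follow the CLDR date field symbols as currently defined (z, Z, v, V). Which of these the DateFormat of that ICU release supported is not stated in the email, so treat the mapping as an assumption.

```cpp
#include <string>

// Mirrors the enums from the wrapper API quoted in the email.
enum DisplayLength { SHORT = 1, LONG = 2 };
enum DisplayStyle { GMT_OFFSET, RFC822, GENERIC, SPECIFIC, LOCATION, STYLE_COUNT };

// Hypothetical helper (an assumption for illustration): pick the CLDR pattern
// field that best matches the requested style and length.
// Returns an empty string for an out-of-range style.
std::string PatternForStyle(DisplayStyle style, DisplayLength length) {
  switch (style) {
    case GMT_OFFSET:
      return "ZZZZ";  // localized GMT offset; no separate short field is assumed
    case RFC822:
      return "Z";     // numeric offset such as +0930
    case GENERIC:
      return length == LONG ? "vvvv" : "v";  // e.g. Pacific Time
    case SPECIFIC:
      return length == LONG ? "zzzz" : "z";  // e.g. Pacific Standard Time
    case LOCATION:
      return "VVVV";  // location-based name; length is ignored here
    default:
      return "";
  }
}
```

The returned field would then be used as the whole pattern of a formatter that formats the given point in time, which is why this approach needs a date/time value and cannot directly serve the "bool daylight" overload.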
Best regards, markus
-------- From: John Emmons <emmo@us.ibm.com> To: Markus Scherer <markus.icu@gmail.com> Date: Mon, Nov 6, 2006 at 8:32 AM
Hi Markus,
Looks like a good start. However, my biggest concern, which is the same one that Mark and I are grappling with right now, is how to deal with Olson zones that may have a different display name depending on the time in question. In these scenarios, it is difficult or nearly impossible to implement a getDisplayName() function without going through DateFormat.
For example,
America/Indiana/Knox - Includes many counties in Indiana that currently observe CST in winter and CDT in summer. But, prior to 2006, these counties observed EST year round. So in these cases, you can't do a reliable lookup of the time zone's display name without knowing which time we are talking about, unless you are willing to live with an API that returns the display name only as it applies to the current modern time, and I question how useful such an API would be in practice.
We are also dealing with the complexities of the fact that many Olson zones often share a commonly used display name, and we don't want to have to duplicate these display names everywhere in CLDR. Things like "Atlantic Standard Time" can apply to "America/Halifax", but also to "Atlantic/Bermuda", "America/Barbados", etc. Since they often cross country boundaries, we have the potential for political conflicts. For example, if I decide I'm going to put the translations for "Central European Time" in "Europe/Paris", and alias "Europe/Berlin" to it, do the Germans get upset? And then what happens when "Europe/Paris" changes its rules? I think you can appreciate the complexities involved here...
At this point, I am toying with the possibilities of having a "meta-time zone" that we could define in CLDR for naming purposes, and then we could define the fact that a certain Olson zone "observes" one of the meta zones during a specific time period. Right now I'm trying to formulate a syntax for this that would make sense and cover the scenarios we need it to.
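The meta zone idea described above can be sketched as a small lookup in C++. Everything in this sketch is an assumption for illustration: the type and function names, the meta zone IDs ("Eastern", "Central", "Atlantic"), and in particular the boundary timestamp, which is a placeholder at 2006-01-01T00:00:00Z rather than the real transition instant. It is not the CLDR syntax, which was still being worked out at the time of this message.

```cpp
#include <cstdint>
#include <limits>
#include <map>
#include <string>
#include <vector>

// One period during which an Olson zone "observes" a meta zone.
// Times are seconds since the Unix epoch; [from, until) is half-open.
struct MetaZonePeriod {
  int64_t from;
  int64_t until;
  std::string meta_zone;
};

// Display names are stored once per meta zone, not once per Olson zone.
struct MetaZoneNames {
  std::string standard;
  std::string daylight;
};

const int64_t kMin = std::numeric_limits<int64_t>::min();
const int64_t kMax = std::numeric_limits<int64_t>::max();
const int64_t kKnoxSwitch = 1136073600;  // placeholder: 2006-01-01T00:00:00Z

// Hypothetical sample data based on the examples in the email.
const std::map<std::string, std::vector<MetaZonePeriod>>& ZoneToMetaZones() {
  static const std::map<std::string, std::vector<MetaZonePeriod>> table = {
      {"America/Indiana/Knox",
       {MetaZonePeriod{kMin, kKnoxSwitch, "Eastern"},
        MetaZonePeriod{kKnoxSwitch, kMax, "Central"}}},
      {"America/Halifax", {MetaZonePeriod{kMin, kMax, "Atlantic"}}},
      {"Atlantic/Bermuda", {MetaZonePeriod{kMin, kMax, "Atlantic"}}},
  };
  return table;
}

const std::map<std::string, MetaZoneNames>& NamesByMetaZone() {
  static const std::map<std::string, MetaZoneNames> table = {
      {"Eastern", {"Eastern Standard Time", "Eastern Daylight Time"}},
      {"Central", {"Central Standard Time", "Central Daylight Time"}},
      {"Atlantic", {"Atlantic Standard Time", "Atlantic Daylight Time"}},
  };
  return table;
}

// Returns the long display name for zone_id at the given time,
// or the zone ID itself when no meta zone data applies.
std::string LongName(const std::string& zone_id, int64_t time, bool daylight) {
  auto zone = ZoneToMetaZones().find(zone_id);
  if (zone == ZoneToMetaZones().end()) return zone_id;
  for (const MetaZonePeriod& period : zone->second) {
    if (period.from <= time && time < period.until) {
      auto names = NamesByMetaZone().find(period.meta_zone);
      if (names == NamesByMetaZone().end()) return zone_id;
      return daylight ? names->second.daylight : names->second.standard;
    }
  }
  return zone_id;
}
```

With this layout, "Atlantic Standard Time" is translated once and shared by every zone that observes the Atlantic meta zone, and a zone that changes its rules only gains a new period entry.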
You're certainly welcome to participate in the discussion and design of this. Right now Mark and I are working on it together since no one else seems to care...
Regards,
John C. Emmons Globalization Architect IBM Software Group, Austin TX Ph. 512-838-8184/512-259-9051 Internet: emmo@us.ibm.com
-------- From: Markus Scherer <markus.icu@gmail.com> To: John Emmons <emmo@us.ibm.com> Date: Mon, Nov 6, 2006 at 11:08 AM
Hi John, thanks for the reply and the reminder that I am still underestimating how messy time zones are!
On 11/6/06, John Emmons <emmo@us.ibm.com> wrote:
> ... my biggest concern, which is the same one that Mark and I are grappling with right now, is how to deal with Olson zones that may have a different display name depending on the time in question. In these scenarios, it is difficult or nearly impossible to implement a getDisplayName() function without going through DateFormat.
> ... America/Indiana/Knox - Includes many counties in Indiana that currently observe CST in winter and CDT in summer. But, prior to 2006, these counties observed EST year round. ...
Very good point. This does smell like deprecating versions of getDisplayName() that do not take a date/time value, and adding ones that do. However, I would hate for such methods to go through DateFormat, particularly because that means creating one inside the method, using it once, and throwing it away -- or else mutexing the use of an owned DateFormat object. Either way is a slow bottleneck. It seems like it should be the other way around: A new version of TimeZone::getDisplayName() should be able to figure out the display name based on the provided date/time, and DateFormat should call it with the date/time and with the style and length selectors.
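The layering suggested here, with the name lookup owned by the time zone and the formatter merely calling it, might look like the following sketch. ToyTimeZone and FormatZoneField are hypothetical stand-ins, not icu::TimeZone or icu::DateFormat, and the single daylight interval is a deliberately simplified assumption in place of real zone rules.

```cpp
#include <cstdint>
#include <string>

enum DisplayLength { SHORT = 1, LONG = 2 };
enum DisplayStyle { GENERIC, SPECIFIC };

// Toy stand-in for a time zone (an assumption for illustration).
class ToyTimeZone {
 public:
  // Daylight time is in effect in [dst_start, dst_end), in epoch seconds.
  ToyTimeZone(int64_t dst_start, int64_t dst_end)
      : dst_start_(dst_start), dst_end_(dst_end) {}

  bool InDaylightTime(int64_t time) const {
    return dst_start_ <= time && time < dst_end_;
  }

  // The display name is resolved here from the date/time, style, and length.
  // No formatter object is created or locked along the way.
  std::string GetDisplayName(int64_t time, DisplayStyle style,
                             DisplayLength length) const {
    if (style == GENERIC) {
      return length == LONG ? "Pacific Time" : "PT";
    }
    if (InDaylightTime(time)) {
      return length == LONG ? "Pacific Daylight Time" : "PDT";
    }
    return length == LONG ? "Pacific Standard Time" : "PST";
  }

 private:
  int64_t dst_start_;
  int64_t dst_end_;
};

// Toy formatter step: interprets a pattern field such as "z" or "vvvv"
// and delegates to the zone. It holds no mutable state, so it needs no mutex.
std::string FormatZoneField(const ToyTimeZone& zone, int64_t time,
                            const std::string& field) {
  DisplayStyle style =
      (!field.empty() && field[0] == 'v') ? GENERIC : SPECIFIC;
  DisplayLength length = field.size() >= 4 ? LONG : SHORT;
  return zone.GetDisplayName(time, style, length);
}
```

The point of the design is the direction of the dependency: callers that only need a name call the zone directly, and the formatter reuses the same function instead of being the only route to it.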
> So in these cases, you can't do a reliable lookup of the time zone's display name without knowing which time we are talking about, unless you are willing to live with an API that returns the display name only as it applies to the current modern time, and I question how useful such an API would be in practice.
Makes sense. I think we will have to implement this "current behavior" lookup for the current API though because we don't have the date/time available and we can't remove the current API.
> We are also dealing with the complexities of how to deal with the fact that often many Olson zones share a commonly used display name... For example, if I decide I'm going to put the translations for "Central European Time" in "Europe/Paris", and alias "Europe/Berlin" to it, do the Germans get upset? And then what happens when "Europe/Paris" changes its rules? I think you can appreciate the complexities involved here...
Somewhat. I am not sure that anyone would be upset by attaching shared data to one or the other arbitrarily, for example by using alphabetic order or something else neutral for choosing the anchor point for the data.
> At this point, I am toying with the possibilities of having a "meta-time zone" that we could define in CLDR for naming purposes, and then we could define the fact that a certain Olson zone "observes" one of the meta zones during a specific time period.
This seems like a nice solution even from a technical standpoint, politics aside.
> Right now I'm trying to formulate a syntax for this that would make sense and cover the scenarios we need it to.
> You're certainly welcome to participate in the discussion and design of this. Right now Mark and I are working on it together since no one else seems to care...
Well, my main interest is getting to a more usable API, but I would be happy to participate in bouncing around the data organization as well.
Best regards, markus