ICU 4.4
All
Unicode 5.2
CLDR 1.8
More compact data formats
ICU4J modularization
Apple
Tier 1
Regex using abstract text access APIs (UText), rolling in work by Jordan Rose: #4521 (ensure performance is acceptable and UTF-8 is supported)
Ensure that there are APIs providing access to all CLDR data: e.g. #4836, #5478, etc. (Google also interested) (Peter has CLDR task to enumerate the data that is missing; based on that we can file additional bugs and divide up the work)
Improved search capabilities (Peter to generate design doc) - mainly asymmetric search, i.e. type e, match e,é,è; type é, match é but probably not e and certainly not è (#7093) (Google also interested). Other possibilities (lower priority) include:
Position-dependent matching? (e.g. Arabic HEH and TEH MARBUTA should match for a search when both are at the end of a word)
Use of search object distinct from collator? (Possible optimization, may not be necessary, not of interest to others)
Reduce ICU4C dynamically-allocated memory, especially for time zone data (more compact data formats may help with this): #6873, #6879 (Google also interested). Peter will look at porting Yoshito's ICU4J work to C; this requires interpreting const in a "logical" way: lazy loading is allowed as long as it is thread-safe. This interpretation should be documented. Peter to coordinate with Andy on this.
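The asymmetric search behavior listed above (type e, match e, é, è; type é, match only é) can be illustrated with a minimal sketch. This is not the ICU design and not an ICU API: it uses only the JDK's java.text.Normalizer, the class and method names (AsymmetricMatchSketch, clusters, indexOf) are hypothetical, and it assumes that matching is case-sensitive and that comparing a base character plus its combining marks after NFD decomposition is enough for the example.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; ICU's string search is collation-based.
final class AsymmetricMatchSketch {

    // True for combining marks (general categories Mn, Mc, Me).
    static boolean isMark(int cp) {
        int type = Character.getType(cp);
        return type == Character.NON_SPACING_MARK
                || type == Character.COMBINING_SPACING_MARK
                || type == Character.ENCLOSING_MARK;
    }

    // Decomposes to NFD and splits into clusters: one base code point
    // followed by any combining marks attached to it.
    static List<String> clusters(String s) {
        String nfd = Normalizer.normalize(s, Normalizer.Form.NFD);
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < nfd.length()) {
            int start = i;
            i += Character.charCount(nfd.codePointAt(i));
            while (i < nfd.length() && isMark(nfd.codePointAt(i))) {
                i += Character.charCount(nfd.codePointAt(i));
            }
            out.add(nfd.substring(start, i));
        }
        return out;
    }

    // Asymmetric rule: an unmarked pattern cluster matches any text cluster
    // with the same base; a marked pattern cluster must match exactly.
    static boolean clusterMatches(String pattern, String text) {
        int base = pattern.codePointAt(0);
        boolean patternHasMarks = pattern.length() > Character.charCount(base);
        if (patternHasMarks) {
            return pattern.equals(text);
        }
        return text.codePointAt(0) == base;
    }

    // Returns the index (counted in clusters) of the first match, or -1.
    static int indexOf(String text, String pattern) {
        List<String> t = clusters(text);
        List<String> p = clusters(pattern);
        for (int i = 0; i + p.size() <= t.size(); i++) {
            boolean ok = true;
            for (int j = 0; j < p.size() && ok; j++) {
                ok = clusterMatches(p.get(j), t.get(i + j));
            }
            if (ok) {
                return i;
            }
        }
        return -1;
    }
}
```

With this sketch, indexOf("r\u00e9sum\u00e9", "resume") finds a match at cluster 0, while indexOf("resume", "r\u00e9sum\u00e9") and indexOf("\u00e8", "\u00e9") both return -1.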
Tier 2
Number spellout format & parse support for CJK numbers, including in dates. Note: CLDR 1.7 added relevant capability per cldrbug:1927; is there anything else that needs to be done in ICU? (It may work if appropriate patterns are used; Peter will do some experiments.)
Support >2GB text length for search, regex, text break, encoding conversion, perhaps transliteration. Use of UText will provide appropriate interfaces for regex and RBBI with additional internal changes. #5451 is for the RBBI changes.
Encoding detection for a wider range of encodings, with some finer distinctions. For example, pure ShiftJIS text should return both ShiftJIS and cp932 with 100% confidence; text including cp932 extensions should also return both, but with lower confidence for ShiftJIS.
Additional conversion tables (not necessarily in default build). Don't need a ticket for this yet.
Already implemented on branch
Google
Tier 1
General
Normalization/IDNA
Formatting
Footprint
Split locale display names, time zone names, currency names out of locale data (into separate data) (#7163)
Introduce Locale base class — only id, no display names (#7164)
Introduce MessageFormat base class — only string substitution (#7165)
Smaller "core" set (without formatting data, only manipulation/algorithms) (#7161)
Generalized cache management for ICU4C (#2863, #3035, #3118, #6029, #6030, #6031, #6099, #6708)
More compact collation (rules/binary) data
Compact collation tailoring syntax for lists of characters with same level difference (#7015)
Add import rule to collation tailoring syntax (#7023, CldrBug:2268)
Locale data filtering: display names for fewer codes
(Apple) Reduce C heap memory usage, especially for time zone data: #6873, #6879
Tier 2
General
Collation script reordering (#3984, CldrBug:2267)
Best match for locale IDs: #4712
Better C++ implementation (scoped_ptr, byte string class; see design/C++ page) (#7162)
Cast reduction — move methods into base
Formatting
Footprint
Filter out locales with insufficient data
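As an illustration of what "best match for locale IDs" means, the sketch below uses the JDK's own RFC 4647 lookup (java.util.Locale.lookup, available since Java 8) rather than any ICU API. How ICU would implement #4712 is not stated here; the class name LocaleBestMatchSketch, the bestMatch helper, and the fallback parameter are hypothetical.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; uses JDK lookup, not an ICU matcher.
final class LocaleBestMatchSketch {

    // Returns the supported locale that best matches the requested
    // language priority list, or the fallback if nothing matches.
    static Locale bestMatch(String requested, List<Locale> supported, Locale fallback) {
        // Accepts an Accept-Language style list, e.g. "fr-CA,de;q=0.5".
        List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(requested);
        // RFC 4647 lookup: progressively truncates each range
        // ("fr-CA" -> "fr") until a supported tag is found.
        Locale hit = Locale.lookup(ranges, supported);
        return hit != null ? hit : fallback;
    }
}
```

For example, with supported locales en, fr and de-DE, a request for "fr-CA" resolves to fr, and a request for "ja" falls back to whatever default the caller supplies.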
IBM
General focus: Usability, Maintainability and Performance
Code and Data Maintainability Improvements, e.g. Separating timezone data from code.
Overriding/updating locale information in an ICU installation: #4597, #6633
Collation and string search service code clean up: #4562
Misc layout bug fixes: #5589, #6625, #5431, #6113, #6182
Improved ICU performance and regression testing for selected service areas only, e.g. collation
Extended IETF BCP47 support: language/locale specification for HTTP/XML/OpenJDK
Lenient parsing, e.g. DateFormat. (Already implemented by Apple on branch)
Locale service SPI
JSR-310 Date and Time APIs
@provider multiple-version support: calling old ICU service code through the new ICU API
Java 5 migration (ICU4J)
Supporting generics to match JDK APIs
ICU 4.4 will no longer support Java 1.4 or older versions
Java Logging support (ICU4J)
ICU Resource Bundle footprint optimization
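To make the "Extended IETF BCP47 support" item above more concrete, here is a minimal sketch of what a BCP 47 language tag with a Unicode locale extension carries. It uses the JDK's java.util.Locale (Java 7 and later) purely for illustration; the ICU4J API planned for this release may look different, and the class name Bcp47Sketch and its helper methods are hypothetical.

```java
import java.util.Locale;

// Hypothetical illustration only; JDK Locale stands in for an ICU locale class.
final class Bcp47Sketch {

    // Extracts the collation type from the "-u-co-" extension, or null if absent.
    static String collationType(String languageTag) {
        Locale locale = Locale.forLanguageTag(languageTag);
        return locale.getUnicodeLocaleType("co");
    }

    // Parses a tag and serializes it back to canonical BCP 47 form.
    static String roundTrip(String languageTag) {
        return Locale.forLanguageTag(languageTag).toLanguageTag();
    }
}
```

For the tag "de-DE-u-co-phonebk", the language is de, the region is DE, and collationType returns "phonebk"; the same tag string can be sent in HTTP headers or stored in XML attributes and parsed back unchanged.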