path: root/util/locale_database/dateconverter.py
Commit message | Author | Age | Files | Lines
* Improve fidelity of approximation to CLDR zone representations | Edward Welbourne | 2024-04-22 | 1 | -4/+18

  I neglected to update the CLDR dateconverter code when I expanded the range of forms we support for display of a timezone. Even that expanded range doesn't cover all the cases CLDR does, but we can at least approximate each of CLDR's options by the closest we do support. Make matching changes to how the Darwin backend for the system locale maps its ICU-derived formats to ours.

  This in practice changes all locales previously using t (abbreviation) as zone format to use tttt (IANA ID) instead. Test data updated to match.

  [ChangeLog][QtCore][QLocale] Date-time formats now more faithfully follow the CLDR data in handling timezones. In most cases this means the IANA ID is used in place of the abbreviation.

  Change-Id: I0276843085839ba9a7855a78922cffe285174643
  Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Correct handling of 'u' in CLDR date format strings | Edward Welbourne | 2024-04-19 | 1 | -0/+7

  It explicitly excludes having a two-digit special case like 'yy'. Correct that in qlocale_mac.mm, add support in dateconverter.py.

  No current locale actually uses the 'u' format, so this makes no change to data.

  Change-Id: I16dfed2d3a7d2054b4b86f9a246bff297df9fc0a
  Reviewed-by: Dennis Oberst <dennis.oberst@qt.io>
  Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
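The difference between 'y' and 'u' described above can be sketched as follows. This is a minimal illustration, not the code in dateconverter.py: the function name year_field is hypothetical, and it assumes that every year field other than the two-digit 'yy' is approximated by Qt's four-digit 'yyyy'.

```python
def year_field(letter, count):
    """Map a CLDR year field to a Qt format token (illustrative only).

    letter is the CLDR pattern character, count is how many times it
    is repeated. Assumption: anything other than CLDR 'yy' is
    approximated by Qt's 'yyyy'.
    """
    if letter == 'y' and count == 2:
        # Only 'y' has a two-digit special case.
        return 'yy'
    if letter in ('y', 'u'):
        # 'u' has no two-digit form, so 'uu' is still a full year.
        return 'yyyy'
    raise ValueError(f'Not a year field: {letter!r}')
```

With this sketch, year_field('y', 2) gives 'yy' while year_field('u', 2) gives 'yyyy'.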
* Fix handling of am/pm indicators in mapping from CLDR to Qt formats | Edward Welbourne | 2024-04-19 | 1 | -2/+2

  Both qlocale_mac.mm and dateconverter.py were mapping the CLDR am/pm indicator, 'a', to the Qt format token 'AP', forcing the indicator to uppercase. The LDML spec [0] says:

    May be upper or lowercase depending on the locale and other options.

  [0] https://www.unicode.org/reports/tr35/tr35-68/tr35-dates.html#Date_Field_Symbol_Table

  We don't support the "other options" mentioned, but we can at least (since 6.3) preserve the locale-appropriate case, instead of forcing upper-case. As such, this change is a follow-up to commit 4641ff0f6a1b0da6f55db5e33c58a77be2032808

  Changes locale data, as expected, to use "Ap" in place of "AP" in various formats in the time_format_data[] array.

  [ChangeLog][QtCore][QLocale] Where CLDR specifies an am/pm indicator, the case of the CLDR-supplied indicator is used, where previously QLocale forced it to upper-case.

  Change-Id: Iee7d55e6f3c78372659668b9798c8e24a1fa8982
  Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
  Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
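To make the effect of switching from 'AP' to 'Ap' concrete, here is a small Python model of how the three families of Qt am/pm tokens treat the locale's indicator text. It is a sketch of the documented behaviour as I understand it, not Qt's implementation: render_ampm and its parameters are hypothetical names, and the am/pm strings are passed in by the caller rather than looked up from a locale.

```python
def render_ampm(token, hour, am_text, pm_text):
    """Model how a Qt am/pm token displays the locale's indicator.

    hour is 0..23; am_text and pm_text stand in for the locale's
    am/pm strings. Illustrative only.
    """
    text = am_text if hour < 12 else pm_text
    if token in ('AP', 'A'):
        return text.upper()   # forced to upper-case
    if token in ('ap', 'a'):
        return text.lower()   # forced to lower-case
    if token in ('Ap', 'aP'):
        return text           # locale-appropriate case, unchanged
    raise ValueError(f'Not an am/pm token: {token!r}')
```

For a locale whose indicators are "a.m." and "p.m.", the old 'AP' mapping would display "A.M." at 09:00, while 'Ap' displays "a.m." as the locale data supplies it.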
* Cope with CLDR's "day period" format specifiers | Edward Welbourne | 2024-04-19 | 1 | -0/+4

  The LDML spec includes a 'b' pattern character which is like the 'a' pattern, for AM and PM, but would rather use noon and midnight indicators for those specific times. We don't support those and using am/pm will be right enough of the time to be better than simply discarding this option, if it ever gets used (which it currently isn't), so treat as an alias for 'a'. No locale in CLDR currently uses this.

  CLDR also has a 'B' specifier for "flexible day periods", including things like "at night" and "in the day". At present only zh_Hant uses 'B'. As a result, this change only affects zh_Hant's formats for time and datetime, which only zh_Hant_TW uses - zh_Hant_HK overrides them to use am/pm markers and zh_Hant_MO inherits that from zh_Hant_HK. Based on this and user feed-back, I've opted to treat 'B' as another synonym of 'a'.

  This removes an entry from the time_format_data[] table (it happened to occupy one whole twelve-character row), causing many other locales' offsets into that table to be shifted by 12. Only zh_Hant_TW has an actual change to which entry in the table it uses. Added a test-case.

  [ChangeLog][QtCore][QLocale] CLDR's 'B' (flexible day period, e.g. "at night" &c.) field, not currently supported, is now handled as a synonym for the AM/PM field 'a', instead of leaving the B as literal text. Only affects zh_TW at present.

  Fixes: QTBUG-123872
  Change-Id: I6ba008c0a048190bf7af8c7df7629a885b05804f
  Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
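Treating 'b' and 'B' as synonyms of 'a' amounts to a small alias table applied before the usual field conversion. The following sketch assumes a pattern has already been split into (letter, count) pairs; the names DAY_PERIOD_ALIASES and fold_day_periods are hypothetical and do not come from dateconverter.py.

```python
# Unsupported day-period letters and the supported letter that
# stands in for each (illustrative table).
DAY_PERIOD_ALIASES = {'b': 'a', 'B': 'a'}

def fold_day_periods(fields):
    """Replace 'b' and 'B' fields with 'a', leaving others untouched.

    fields is a list of (letter, count) pairs for the unquoted
    pattern letters of a CLDR format.
    """
    return [(DAY_PERIOD_ALIASES.get(letter, letter), count)
            for letter, count in fields]
```

A pattern starting with "Bh", as in zh_Hant, would arrive as [('B', 1), ('h', 1)] and leave as [('a', 1), ('h', 1)], after which it is converted exactly like an ordinary am/pm format.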
* Rewrite CLDR-ingestion's date-time format conversion | Edward Welbourne | 2024-04-19 | 1 | -78/+167

  Rework the somewhat ad-hoc handling of format blocks. Instead of converting one character at a time, then coming back to map contiguous chunks of various lengths to Qt's best match, use the first non-separator character to select a function that looks ahead to see what to consume with it. Quoted text can be handled the same way, with a look-ahead. This potentially allows for more flexible parsing in future.

  In the process, matching qlocale_mac.mm, treat all unquoted letters as reserved. The LDML spec says:

    Currently, A..Z and a..z are reserved for use as pattern characters (unless they are quoted, see next item).

  and its description of literal text explicitly says these reserved characters are not to be understood as literals. Document the letters we do know about as unsupported pattern characters, but don't do anything specific to handle them.

  This transiently changes zh_TW's "Bh" hour fields to plain "h" but an imminent commit will change that again and there is no other change to data, so the locale data is not regenerated in this commit, to save churn.

  This makes the parsing front-end function more straightforward and makes it easier to document the quirks of the different format letters and the impedance mismatches between CLDR's and Qt's.

  In the process, recognize C, like j and J, as special magic to ignore and harmonize with what qlocale_mac.cpp's macToQtFormat() does, where it's right and dateconverter.py differed. Document the need to stay in sync with this last.

  Task-number: QTBUG-123872
  Change-Id: I490d395b37751c9b8d6f3ee5ed4edbc0d405db5b
  Reviewed-by: Mate Barany <mate.barany@qt.io>
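The look-ahead approach this commit describes can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in dateconverter.py: tokenize, scan_quoted and scan_run are hypothetical names, and the sketch only splits a pattern into fields and literals without mapping fields to Qt tokens. It assumes the LDML quoting rule that a doubled apostrophe stands for one literal apostrophe.

```python
def scan_quoted(text, start):
    """Consume quoted text beginning at text[start], an apostrophe.

    Returns (literal, index just past the consumed text).
    """
    i = start + 1
    if text.startswith("'", i):      # '' outside quotes: one apostrophe
        return "'", i + 1
    out = []
    while i < len(text):
        if text[i] == "'":
            if text.startswith("'", i + 1):   # '' inside quotes
                out.append("'")
                i += 2
                continue
            return ''.join(out), i + 1        # closing quote
        out.append(text[i])
        i += 1
    raise ValueError('Unterminated quoted text in pattern')

def scan_run(text, start):
    """Consume a run of one repeated letter; return (letter, count, end)."""
    end = start
    while end < len(text) and text[end] == text[start]:
        end += 1
    return text[start], end - start, end

def tokenize(pattern):
    """Split a CLDR-style pattern into field and literal tokens."""
    tokens, i = [], 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "'":
            literal, i = scan_quoted(pattern, i)
            tokens.append(('literal', literal))
        elif ch.isascii() and ch.isalpha():
            # Every unquoted ASCII letter is reserved as a pattern
            # character, whether or not it is supported.
            letter, count, i = scan_run(pattern, i)
            tokens.append(('field', letter, count))
        else:
            tokens.append(('literal', ch))    # separator, kept as-is
            i += 1
    return tokens
```

Here the first character at each position selects which scanner consumes what follows, so quoted text and field runs are both handled in a single pass. A real converter would then dispatch on each field's letter to a function that knows that letter's quirks.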
* Use SPDX license identifiers | Lucie Gérard | 2022-05-16 | 1 | -27/+2

  Replace the current license disclaimer in files with an SPDX-License-Identifier. Files that have to be modified by hand are modified. License files are organized under the LICENSES directory.

  Task-number: QTBUG-67283
  Change-Id: Id880c92784c40f3bbde861c0d93f58151c18b9f1
  Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
  Reviewed-by: Lars Knoll <lars.knoll@qt.io>
  Reviewed-by: Jörg Bornemann <joerg.bornemann@qt.io>
* Convert CLDR scripts to Python 3 | Ievgenii Meshcheriakov | 2021-07-15 | 1 | -1/+1

  The conversion is mostly done using the 2to3 script, with manual cleanup afterwards.

  Task-number: QTBUG-83488
  Pick-to: 6.2
  Change-Id: I4d33b04e7269c55a83ff2deb876a23a78a89f39d
  Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
  Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* dateconverter.py: Remove shebang and executable attribute | Ievgenii Meshcheriakov | 2021-07-05 | 1 | -1/+0

  This is not a script that can be run independently.

  Task-number: QTBUG-83488
  Pick-to: 6.2
  Change-Id: I82a93b9ab37ae759b789058d48e94298ecd29b6f
  Reviewed-by: Friedemann Kleint <Friedemann.Kleint@qt.io>
  Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
  Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* Rename util/locale_database/ to include the e that was missing | Edward Welbourne | 2019-05-20 | 1 | -0/+107

  It was misnamed local_database, quite missing the point of its name.

  Change-Id: I73a4fdf24f53daac12304de1f443636d89afacb2
  Reviewed-by: Lars Knoll <lars.knoll@qt.io>
  Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>