summaryrefslogtreecommitdiffstats
path: root/src/corelib/text/qlocale_data_p.h
Commit message (Collapse)AuthorAgeFilesLines
* Improve fidelity of approximation to CLDR zone representationsEdward Welbourne2024-04-221-711/+719
| | | | | | | | | | | | | | | | | | | | I neglected to update the CLDR dateconverter code when I expanded the range of forms we support for display of a timezone. Even that expanded range doesn't cover all the cases CLDR does, but we can at least approximate each of CLDR's options by the closest we do support. Make matching changes to how the Darwin backend for the system locale maps its ICU-derived formats to ours. This in practice changes all locales previously using t (abbreviation) as zone format to use tttt (IANA ID) instead. Test data updated to match. [ChangeLog][QtCore][QLocale] Date-time formats now more faithfully follow the CLDR data in handling timezones. In most cases this means the IANA ID is used in place of the abbreviation. Change-Id: I0276843085839ba9a7855a78922cffe285174643 Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Fix handling of am/pm indicators in mapping from CLDR to Qt formatsEdward Welbourne2024-04-191-16/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | Both qlocale_mac.mm and dateconverter.py were mapping the CLDR am/pm indicator, 'a', to the Qt format token 'AP', forcing the indicator to uppercase. The LDML spec [0] says: May be upper or lowercase depending on the locale and other options. [0] https://www.unicode.org/reports/tr35/tr35-68/tr35-dates.html#Date_Field_Symbol_Table We don't support the "other options" mentioned, but we can at least (since 6.3) preserve the the locale-appropriate case, instead of forcing upper-case. As such, this change is a follow-up to commit 4641ff0f6a1b0da6f55db5e33c58a77be2032808 Changes locale data, as expected, to use "Ap" in place of "AP" in various formats in the time_format_data[] array. [ChangeLog][QtCore][QLocale] Where CLDR specifies an am/pm indicator, the case of the CLDR-supplied indicator is used, where previously QLocale forced it to upper-case. Change-Id: Iee7d55e6f3c78372659668b9798c8e24a1fa8982 Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com> Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Cope with CLDR's "day period" format specifiersEdward Welbourne2024-04-191-38/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The LDML spec includes a 'b' pattern character which is like the 'a' pattern, for AM and PM, but would rather use noon and midnight indicators for those specific times. We don't support those and using am/pm will be right enough of the time to be better than simply discarding this option, if it ever gets used (which it currently isn't), so treat as an alias for 'a'. No locale in CLDR currently uses this. CLDR also has a 'B' specifiers for "flexible day periods", including things like "at night" and "in the day". At present only zh_Hant uses 'B'. As a result, this change only affects zh_Hant's formats for time and datetime, which only zh_Hant_TW uses - zh_Hant_HK overrides them to use am/pm markers and zh_Hant_MO inherits that from zh_Hant_HK. Based on this and user feed-back, I've opted to treat 'B' as another synonym of 'a'. This removes an entry from the time_format_data[] table (it happened to occupy one whole twelve-character row), causing many other locales' offsets into that table to be shifted by 12. Only zh_Hant_TW has an actual change to which entry in the table it uses. Added a test-case. [ChangeLog][QtCore][QLocale] CLDR's 'B' (flexible day period, e.g. "at night" &c.) field, not currently supported, is now handled as a synonym for the AM/PM field 'a', instead of leaving the B as literal text. Only affects zh_TW at present. Fixes: QTBUG-123872 Change-Id: I6ba008c0a048190bf7af8c7df7629a885b05804f Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Revise CLDR-generated data header files to use Unicode-3.0 for SPDXEdward Welbourne2024-04-171-1/+1
| | | | | | | | | | This is the correct license for the data they contain. The types that are used to package them can be made available under the same. Pick-to: 6.7 6.5 Task-number: QTBUG-121653 Change-Id: I7fb5f332f9e7f4f6db0f1f0c3964a36ce03bccb2 Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
* Update QLocale and calendar data to CLDR v44.1Edward Welbourne2024-02-021-4154/+4469
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (This turns out to be identical to v44, for our purposes.) The CLDR license has been revised at v44 to "UNICODE LICENSE V3", which is now included (as LICENSES/UNICODE-3.0.txt) in addition to the old license (still in use, presumably, by UCD - at least until its next update). Some new QLocale::Language entries are needed. There is no change to the time-zone data. Some tests needed changes: * Various Arabic locales now use U+0623 (Arabic letter aleph with hamza above) in exponent separator, replacing plain U+0627 (Arabic letter aleph); it is still followed by U+0633 (Arabic letter seen). * Where likely sub-tags used to fill in world, 001, as territory for a language, they now (e.g. for Prussian and Yiddish) give specific countries. * Tamil locales now have something of a mix of inherited and localized forms for AM/PM, which looks a lot like a mistake in CLDR. * New likely sub-tag rules fix ctor(und_US) and ctor(und_GB), which previously failed. [ChangeLog][Third-Party Code] Updated QLocale's data extracted from the Unicode Common Locale Data Repository (CLDR) to v44.1. The license changed to Unicode License V3. Pick-to: 6.7 6.6 6.5 Fixes: QTBUG-121485 Task-number: QTBUG-121325 Change-Id: Ide1a68016129526d7a5aa3fc67f1a674858696bc Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org> Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
* Rewrite a comment on measurement systemsEdward Welbourne2024-02-011-3/+5
| | | | | | | | | | | | | | | | | | | CLDR does supply information on measurement system, which may be richer than what we're currently working with, but * I don't see any hint to which measurement system to use for each locale by default; * there is some support for selecting combinations of locale and measurement system, suggesting it doesn't believe in such a default in any case; and * even if it were there, adding it to locale_data[] would take up more memory than special-casing the few locales that use anything but SI. Revise the comment to reflect this. Change-Id: Idfa603fc9a5a55d0bd0da122ac66c76b0edf9f57 Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com> Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* qlocale_data_p.h: add missing #includeSAhmad Samir2023-09-011-2/+4
| | | | | | | | | | | Otherwise static analyzers, e.g. clazy, complain about unknown types. Remove qglobal_p.h include, not needed. Regroup includes. Change-Id: Ia8d2418911d6d3587eb5a77dde30052175e45da4 Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* Change enumdata.py names so comments read more naturallyEdward Welbourne2023-08-091-41/+41
| | | | | | | | Now that the "and" is only seen in enumdata.py and comments, we can s/And/and/ in all the various territory names that used it. Change-Id: Ic376d5904b6f5ab54931f96230c1dd5b7f357b8d Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
* Use CLDR's names in QLocale::*ToName() for language, script, territoryEdward Welbourne2023-08-091-710/+710
| | | | | | | | | | | | | | | Various comments need to continue using the enumdata.py names, as they associate data with particular enum members, but we can now correctly use the en.xml versions of their names when we report them, rather than the enum-friendly names we use in the code. Since this now means the data may stray outside plain ASCII - it'll be UTF-8-encoded - this implies replacing the QLatin1StringView()s of the code that formerly read this data with QString::fromUtf8(). Fixes: QTBUG-94460 Change-Id: Id3b08875a46af58c0555c3e303b0e15a19441509 Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org> Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Revise enumdata.py's names to more closely match CLDR'sEdward Welbourne2023-08-091-288/+288
| | | | | | | | | | | | We could already use dashes in some, rather than spaces, and now no longer need to capitalize each word. This changes the *_name_list[] entries for affected languages to more closely match what CLDR gives as their names. It also amends various comments. Added tests for the QLocale::*ToString() functions to cover the entries changed. Task-number: QTBUG-94460 Change-Id: I0163795cb282881f15a97be00a5311c1936c3a09 Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Wrap QLocale's CLDR-derived string data tables to a 100-column marginEdward Welbourne2023-08-031-2398/+3990
| | | | | | | | | | The string data tables have a mix of digit-length tokens, up to four-digit hex; we can fit 12 of those per line within our margins. Leave the one-row-per-locale tables as they are, though, despite long lines. Change-Id: I655fddecf24133c26d16187b7a5a8fbc25553e07 Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
* Update QLocale to CLDR v43Edward Welbourne2023-08-021-2800/+3164
| | | | | | | | | | | | Ran the scripts, added the new enum members to docs. Updated tests: * Two of the new languages are right-to-left, * Canada has replaced a silly date format with a sensible one. Fixes: QTBUG-111550 Change-Id: Ie6f1e6e94477167c9e2b5c67e6518ca0f6a7e7fb Reviewed-by: Mate Barany <mate.barany@qt.io> Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
* Pack languageCodeList tighterMate Barany2023-03-151-343/+356
| | | | | | | | | | | | | | Pack some of the arrays that contain locale data more tightly. The AlphaCode struct is a char[4] but always holds only [a-z]{,3} which could be fit into 16 bits, halving the size of an AlphaCode struct. With the new constructor the initialization of the AlphaCode struct also changes - modify qlocalexml2cpp.py to reflect this change and regenerate the languageCodeList. Fixes: QTBUG-105050 Change-Id: I2b1e93ab7cc3f2d667bf67b45769b74a15211931 Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* Update CLDR to v42Mate Barany2023-02-071-2550/+2638
| | | | | | | | | | | | | | | | | | | | New languages (and one local for each) added with v42 - Haryanvi - Moksha - Northern Frisian - Obolo - Pijin - Rajasthani - Toki Pona It also appears that Canada has changed its date format. Modify the relevant test case to reflect this change. Task-number: QTBUG-110333 Pick-to: 6.5 Change-Id: Ia8975c2866cd54c9e565543d05bacd52f4987909 Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io> Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* QLocaleData: fix AlphaCode::op== for C++20Marc Mutz2023-02-021-1/+2
| | | | | | | | | | | | | | | | | | The old function, bool AlphaCode::operator==(AlphaCode code) const noexcept is not symmetric: the LHS argument is passed by cref and the RHS one by (non-const) value. I didn't test, but this asymmetry might actually make the operator ambiguous with its reversed version in C++20. Fix by making a hidden friend. Even if it doesn't fix anything, hidden friend relational operators are still where we want our code base to migrate to, eventually (QTBUG-87973). Pick-to: 6.5 Change-Id: Icb74c24802a3fe6c2987c1db86880c0d72a7abdf Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* QLocale: port to qsizetype [1/N]: indexed to ranged loopsMarc Mutz2022-07-261-2/+0
| | | | | | | | | | | | Ranged for loops are independent of the container's size_type, so port what we can to them. Pick-to: 6.4 6.3 Task-number: QTBUG-103531 Change-Id: I0fd5c9c721e892ea617f0b56b8ea423e7a9f0d04 Reviewed-by: Sona Kurazyan <sona.kurazyan@qt.io> Reviewed-by: Thiago Macieira <thiago.macieira@intel.com> Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* Core: make CLDR data constexprYuhang Zhao2022-05-271-25/+25
| | | | | | | Task-number: QTBUG-100485 Pick-to: 6.3 6.2 Change-Id: Ib8c5160ca0994662a5fdc2293dc734c1bdcac4f2 Reviewed-by: Marc Mutz <marc.mutz@qt.io>
* Use SPDX license identifiersLucie Gérard2022-05-161-38/+2
| | | | | | | | | | | | | Replace the current license disclaimer in files by a SPDX-License-Identifier. Files that have to be modified by hand are modified. License files are organized under LICENSES directory. Task-number: QTBUG-67283 Change-Id: Id880c92784c40f3bbde861c0d93f58151c18b9f1 Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org> Reviewed-by: Lars Knoll <lars.knoll@qt.io> Reviewed-by: Jörg Bornemann <joerg.bornemann@qt.io>
* Update CLDR-derived data to newly-released v41Ievgenii Meshcheriakov2022-04-081-2270/+2290
| | | | | | | Fixes: QTBUG-101214 Change-Id: I42f3e76d03b4759d3cda9ab81856d0b6d7506d8e Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org> Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* QLocale: Extend support for language codesIevgenii Meshcheriakov2021-12-091-333/+354
| | | | | | | | | | | | | | | | | | | | | | | | | | | This commit extends functionality for QLocale::codeToLanguage() and QLocale::languageToCode() by adding an additional argument that allows selection of the ISO 639 code-set to consider for those operations. The following ISO 639 codes are supported: * Part 1 * Part 2 bibliographic * Part 2 terminological * Part 3 As a result of this change the codeToLanguage() overload without the additional argument now returns a Language value if it matches any know code. Previously a valid language was returned only if the function argument matched the first code defined for that language from the above list. [ChangeLog][QtCore][QLocale] Added overloads for codeToLanguage() and languageToCode() that support specifying which ISO 639 codes to consider. Fixes: QTBUG-98129 Change-Id: I4da8a89e2e68a673cf63a621359cded609873fa2 Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* QLocale: Add support for Kaingang and Nheengatu languagesIevgenii Meshcheriakov2021-11-101-6/+42
| | | | | | | | | | Update the locale generation script to support Kaingang and Nheengatu languages. These are new in CLDR v40. Regenerate the locale data. Task-number: QTBUG-94358 Change-Id: I5195d5161d8c4d9f17129bbcfde39dfd3fcf1cd5 Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* Update CLDR-derived data to newly-released v40Ievgenii Meshcheriakov2021-11-101-2402/+2404
| | | | | | | | | | | | | Update tst_qlocale to take into account "narrow" day representation change for Russian locales. This version of CLDR changes narrow forms to one letter. Previously those forms were identical to short forms (two letter). The new representation is consistent with other languages and so does not appear to be a bug. Fixes: QTBUG-94358 Pick-to: 6.2 Change-Id: I9724c281a250685da8232e5c05c9c375a8c79253 Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* Make locale ordering transitiveEdward Welbourne2021-07-141-20/+20
| | | | | | | | | | | | | | | | | | | | | | | The ordering function used to sort the locale data generated for QLocale attempted to sort the default territory for a given language and script before other territories, but was too tangled for it to be obvious this is what it was doing. The result turned out to be non-transitive. Replace with code that implements the same preference but only applies it where the result is compatible with transitivity. This leads to a shuffling of the order of the Serbian-language locales, which sorts the Cyrillic ones before the Latin ones. This is consistent with my reading of the CLDR data, which fills in Cyrillic and Serbia for Serbian; Serbian/Cyrillic/Serbia did previously sort before all other Serbian variants. Thanks to Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io> for discovering the non-transitivity. Pick-to: 6.2 Change-Id: I0ce9f78e620e714f980f32b85b7100ed0f92ad74 Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io> Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
* Update CLDR-derived data to newly-released v39Edward Welbourne2021-04-261-446/+442
| | | | | | | | | | | | | Routine update with minor changes to locale data, no new languages, territories or scripts. Various Spanish locales change m_grouping_top from 1 to 2, reversing a change to a test of Costa Rica's currency formatting made in commit bb6a73260ec8272647265f42180963604ad0f755. Includes updates to time-zone IDs. Fixes: QTBUG-91478 Pick-to: 6.1 6.0 5.15 Change-Id: I78ee161275b3c456c5800a7317a96947c932cf8e Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Add the "Territory" enumerated type for QLocaleJiDe Zhang2021-04-151-10/+10
| | | | | | | | | | | | | | | | | | | The use of "Country" is misleading as some entries in the enumeration are not countries (eg, HongKong), for all that most are. The Unicode Consortium's Common Locale Data Repository (CLDR, from which QLocale's data is taken) calls these territories, so introduce territory-based names and prepare to deprecate the country-based ones in due course. [ChangeLog][QtCore][QLocale] QLocale now has Territory as an alias for its Country enumeration, and associated territory-based names to match its country-named methods, to better match the usage in relevant standards. The country-based names shall in due course be deprecated in favor of the territory-based names. Fixes: QTBUG-91686 Change-Id: Ia1ae1ad7323867016186fb775c9600cd5113aa42 Reviewed-by: Edward Welbourne <edward.welbourne@qt.io> Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* QLocale: simplify currency display name lookupEdward Welbourne2020-11-171-1424/+781
| | | | | | | | | | | | | | | We were extracting several candidate display names from CLDR for each currency, joining them with semicolons, storing in a table, then using only the first entry from the list - where we should probably have used the first non-empty entry in any case. So instead extract the first non-empty candidate name from CLDR and store that simply, saving the need for semicolon-joining or parsing out the first entry from the thus-joined list. This significantly reduces the size of the currency name data table. Change-Id: I201d0528348d5fcb9eceb5df86211b9c77de3485 Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
* Implement binary search in QLocale's likely sub-tag lookupEdward Welbourne2020-11-081-395/+395
| | | | | | | | | | | | | | Follow through on a comment from 2012: sort the likely subtag array (in the CLDR update script) and use bsearch to find entries in it. This simplifies QLocaleXmlReader.likelyMap() slightly, moving the detection of last entry to LocaleDataWriter.likelySubtags(), but requires collecting all likely sub-tag mapping pairs (rather than just passing them through from read to write via generators) in order to sort them. Change-Id: Ieb6875ccde1ddbd475ae68c0766a666ec32b7005 Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
* Update CLDR to v38Edward Welbourne2020-11-081-2881/+2871
| | | | | | | | | | | | | Fresh on the heels of our update to v37, they've released a new version. No new languages to complicate life, fortunately. Updated license (year range) and attribution. One test also needed an update: Catalan's long time format now parenthesizes the zone. Task-number: QTBUG-87925 Pick-to: 5.15 Change-Id: I54fb9b7f084b5cd019c983c1e3862dc03865a272 Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
* Reorder locale enums alphabeticallyEdward Welbourne2020-11-081-5564/+5566
| | | | | | | | | | | | | Binary-incompatible change: change the numeric values of QLocale's Language, Script and Country enums, as encouraged by a comment in the generator script enumdata.py and clarify documentation around that. In the process (since I was changing almost every line anyway), convert the dictionary values from (mutable) lists of length two to tuples, since they are (and should be) immutable data. Change-Id: I26222bce45b9f5074b1d81ed70015a75ac34adcd Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
* Use newer names for various languages, territories and scriptsEdward Welbourne2020-11-081-997/+742
| | | | | | | | | | Our enumdata.py namings of countries had fallen somewhat out of sync with CLDR's names. In the process, support including hyphenation in the unsquashed name, along with spacing. Distinguish, in comments, between older renamings and those first seen in Qt6. Change-Id: I91ec444bf35222ab6a9332e389ace19cca0e4fdf Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Update CLDR to v37, adding Nigerian Pidgin as a new languageEdward Welbourne2020-10-261-1708/+1857
| | | | | | | | | | | | | | | | Routine update by running scripts, ignoring clang-format's extensive grumbles. Added notes to util/locale_database/'s README, on the need for that, and enumdata.py, on when to add entries. As usual, several new locales are also added, for existing languages, territories and scripts. [ChangeLog][QtCore][QLocale] Updated to new version of CLDR (the Unicode Consortium's Common Locale Data Repository) v37. Fixes: QTBUG-84669 Pick-to: 5.15 Change-Id: Ib76848bf4bd1219180faf46820077e8d8049a4e3 Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
* Support digit-grouping correctlyEdward Welbourne2020-07-141-597/+597
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Read three more values from CLDR and add a byte to the bit-fields at the end of QLocaleData, indicating the three group sizes. This adds three new parameters to various low-level formatting functions. At the same time, rename ThousandsGroup to GroupDigits, more faithfully expressing what this (internal) option means. This replaces commit 27d139128013c969a939779536485c1a80be977e with a fuller implementation that handles digit-grouping in any of the ways that CLDR supports. The formerly "Indian" formatting now also applies to at least some locales for Bangladesh, Bhutan and Sri Lanka. Fixed Costa Rica currency formatting test that wrongly put a separator after the first digit; the locale (in common with several Spanish locales) requires at least two digits before the first separator. [ChangeLog][QtCore][Important Behavior Changes] Some locales require more than one digit before the first grouping separator; others use group sizes other than three. The latter was partially supported (only for India) at 5.15 but is now systematically supported; the former is now also supported. Task-number: QTBUG-24301 Fixes: QTBUG-81050 Change-Id: I4ea4e331f3254d1f34801cddf51f3c65d3815573 Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Merge remote-tracking branch 'origin/5.15' into devQt Forward Merge Bot2020-04-081-501/+499
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: examples/opengl/doc/src/cube.qdoc src/corelib/global/qlibraryinfo.cpp src/corelib/text/qbytearray_p.h src/corelib/text/qlocale_data_p.h src/corelib/time/qhijricalendar_data_p.h src/corelib/time/qjalalicalendar_data_p.h src/corelib/time/qromancalendar_data_p.h src/network/ssl/qsslcertificate.h src/widgets/doc/src/graphicsview.qdoc src/widgets/widgets/qcombobox.cpp src/widgets/widgets/qcombobox.h tests/auto/corelib/tools/qscopeguard/tst_qscopeguard.cpp tests/auto/widgets/widgets/qcombobox/tst_qcombobox.cpp tests/benchmarks/corelib/io/qdiriterator/qdiriterator.pro tests/manual/diaglib/debugproxystyle.cpp tests/manual/diaglib/qwidgetdump.cpp tests/manual/diaglib/qwindowdump.cpp tests/manual/diaglib/textdump.cpp util/locale_database/cldr2qlocalexml.py util/locale_database/qlocalexml.py util/locale_database/qlocalexml2cpp.py Resolution of util/locale_database/ are based on: https://codereview.qt-project.org/c/qt/qtbase/+/294250 and src/corelib/{text,time}/*_data_p.h were then regenerated by running those scripts. Updated CMakeLists.txt in each of tests/auto/corelib/serialization/qcborstreamreader/ tests/auto/corelib/serialization/qcborvalue/ tests/auto/gui/kernel/ and generated new ones in each of tests/auto/gui/kernel/qaddpostroutine/ tests/auto/gui/kernel/qhighdpiscaling/ tests/libfuzzer/corelib/text/qregularexpression/optimize/ tests/libfuzzer/gui/painting/qcolorspace/fromiccprofile/ tests/libfuzzer/gui/text/qtextdocument/sethtml/ tests/libfuzzer/gui/text/qtextdocument/setmarkdown/ tests/libfuzzer/gui/text/qtextlayout/beginlayout/ by running util/cmake/pro2cmake.py on their changed .pro files. Changed target name in tests/auto/gui/kernel/qaction/qaction.pro tests/auto/gui/kernel/qaction/qactiongroup.pro tests/auto/gui/kernel/qshortcut/qshortcut.pro to ensure unique target names for CMake Changed tst_QComboBox::currentIndex to not test the currentIndexChanged(QString), as that one does not exist in Qt 6 anymore. Change-Id: I9a85705484855ae1dc874a81f49d27a50b0dcff7
| * Change QLocale to use CLDR's accounting formats for currenciesEdward Welbourne2020-04-021-433/+433
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In particular, this changed the US currency formats for negative amounts to be parenthesised versions of the positive amount forms, rather than having a minus sign after the $ sign. Test updated. [ChangeLog][QtCore][QLocale] Currency formats are now based on CLDR's accounting formats, where they were previously mostly based (more or less by accident) on standard formats. In particular, this now means negative currency formats are specified, where available, where they (mostly) were not previously. Task-number: QTBUG-79902 Change-Id: Ie0c07515ece8bd518a74a6956bf97ca85e9894eb Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
| * Take CLDR's distinguished attributes into accountEdward Welbourne2020-04-021-146/+143
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When doing XPATH searches, child nodes that have distinguished attributes that were not asked for should be skipped. This is part of the LDML spec and matters when resolving locale inheritance. Scan the LDML DTD (previously only scanned for the CLDR version) to find which attributes of which tags are ignorable - all others are distinguished - and take the result into account when performing XPATH searches. The XPath we were using for currency formats wasn't excluding currencyFormatLength elements with type="short" and patterns specific to thousands (and larger multiples); this is fixed by taking distinguished attributes into account. However, the XPATH also wasn't specifying the always distinguished attribute type="standard" that was, in practice, used for nearly all locales that weren't (wrongly) using short-forms for thousands; so type="standard" is now made explicit, so as to minimize the diff. This leaves only twenty-one locales with a negative currency formats. A later commit shall switch to using accounting by default (it falls back via an alias to standard, in any case), thereby restoring the two mentioned below that were using it by accident, but the present change gives the minimal diff here. Thousands-specific formats replaced with sensible ones: * zh_Hant_{HK,MO} (Traditional Mandarin, Hong Kong and Macau) * eo_001 (Esperanto) * fr_CA (Canadian French) * ha_* (Hausa, when not written in Arabic) * es_{GT,MX,US} (Spanish - Guatemala, Mexico, USA) * sw_KE (Swahili, Kenya) * yi_001 (Yiddish) * mfe_MU (Morisyen, Mauritius) * lag_TZ (Langi, Tanzania) * mgh_MZ (Makhuwa Meetto, Mozambique) * wae_CH (Walser, Switzerland) * kkj_CM (Kako, Cameroon) * lkt_US (Lakota, USA) * pa_Arab_PK (Punjabi, in Arabic script, as used in Pakistan; uses arabext number system, whose currency falls back to latn's, for which pa_Arab over-rides the thousands-format). Format changed from an over-ridden type="accounting" to standard (so these lost a negative-specific form) in: * en_SI (English, Slovenia) * es_DO (Spanish, Dominican Republic; same) For some locales we were picking up over-rides of narrow or short list formats, or formats for or-lists or unit-lists rather than and-lists, in place of the standard list format, that these locales don't over-ride, provided by a parent locale. This changed list formats for: * en_CA, en_IN (dropped "Oxford" comma before "and") * qu_* (Quechua; dropped "utaq", presumably meaning "and") * ur_IN (Urdu, India; was using unit-list formats) [ChangeLog][QtCore][QLocale] Data used for currency formats in several locales and list patterns in some locales have changed due to now parsing the CLDR data more faithfully. Fixes: QTBUG-81344 Change-Id: I6b95c6c37db92df167153767c1b103becfb0ac98 Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
| * Take number system into account in currency format look-upEdward Welbourne2020-04-021-18/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CLDR's currency formats do have number system variation, so take it into account. (The old xpathlite code clearly intended to do this, but failed at it due to looking for the wrong component of an XPATH to fix.) This changes the currency formats in use for * all Dutch locales (because nl.xml lists a currency format for arab before the one for latn, and they differ), * Punjabi, Urdu - specifically pa_Guru_IN, ur_Arab_PK (both like Dutch, arabext before latn; which is correct for pa_Arab_PK and ur_Arab_IN), * Sindi (whose over-ride of latn currency format we were using, where we should be using arab's format, supplied by root's default), * Tatar (which specifies a generic currency format, which we were using, before one specific to latn, which we now use), * Tongan (same as Dutch), * Konkani (like Dutch, deva before latn) and * several North African Arabic locales (whose default number system is latn, rather than arab, but previously used arab's formats). Task-number: QTBUG-79902 Change-Id: I18d8ec16bfd3a516d1bcd2f63bc7f7f15179a3f4 Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
| * Rework qlocalexml2cpp.py to use writers based on TranscriberEdward Welbourne2020-04-021-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This saves repetition of temporary-file manipulation code. In the process, ensure that we tidy away temporary files on failure. Moved a comment in qlocale.h to *outside* the re-written portion, to save having to rewrite it every time. Added blank lines to separate script data from country data in the generated output. Changed 0s in one comment to zeros, to match another comment. Isolated use of sys to the __main__ block. Isolated use of enumdata to the new LocaleHeaderWriter class. Modernised all the string-formatting I touched. Task-number: QTBUG-81344 Change-Id: I5768e45d9a8ea23facc303b3dd8af8b3ccbf7ff2 Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
* | Use char16_t in favor of ushort for locale data tablesEdward Welbourne2020-02-171-13/+13
| | | | | | | | | | | | Change-Id: I890dd2b52c1b786db1081744c8ca343baba93de4 Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org> Reviewed-by: Lars Knoll <lars.knoll@qt.io>
* | Allow surrogate pairs for various "single character" locale dataEdward Welbourne2020-02-171-787/+818
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extract the character in its proper unicode form and encode it in a new single_character_data table of locale data. Record each entry as the range within that table that encodes it. Also added an assertion in the generator script to check that the digits CLDR gives us are a contiguous sequence in increasing order, as has been assumed by the C++ code for some time. Lots of number-formatting code now has to take account of how wide the digits are. This leaves nowhere for updateSystemPrivate() to record values read from sys_locale->query(), so we must always consult that function when accessing these members of the systemData() object. Various internal users of these single-character fields need the system-or-CLDR value rather than the raw CLDR value, so move QLocalePrivate's methods to supply them down to QLocaleData and ensure they check for system values, where appropriate first. This allows us to finally support the Chakma language and script, for whose number system UTF-16 needs surrogate pairs. Costs 10.8 kB in added data, much of it due to adding two new locales that need surrogates to represent digits. [ChangeLog][QtCore][QLocale] Various QLocale methods that returned single QChar values now return QString values to accommodate those locales which need a surrogate pair to represent the (single character) return value. Fixes: QTBUG-69324 Fixes: QTBUG-81053 Change-Id: I481722d6f5ee266164f09031679a851dfa6e7839 Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* | Separate offsets from sizes in QLocale's dataEdward Welbourne2020-01-301-2611/+2533
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This enables us to make the sizes quint8 and benefit from the resulting packing, making the locale data smaller. The sizes for long month-name lists (which concatenate twelve names with semicolon as separator) can overflow an 8-bit member, so use quint16 where needed. Re-ordered the data in QLocaleData and QCalendarLocale. Now all long-short(-narrow) families arise in that order; and any standalone is grouped with the one of the same length. (This cost 20 bytes in the date-format table, which optimises out more duplication if short is before long, but the saving in the (smaller) time-format table more than make up for it; and 20 bytes isn't worth the confusion that being inconsistent in ordering might cause.) At the same time, drop trailing semicolons from list entries (which join various names with semicolon) as they're not needed: we know where the end of the list is, because we know the size of the string that results from concatenation. The code that parses such lists can even correctly handle empty entries at the end. Saves 26 kB of data in the compiled binaries. Task-number: QTBUG-81053 Change-Id: If6ccc96a6910828817aa605d10fd814f567ae1e8 Reviewed-by: Thiago Macieira <thiago.macieira@intel.com> Reviewed-by: Lars Knoll <lars.knoll@qt.io>
* | Deduplicate locale data tablesEdward Welbourne2020-01-301-1093/+1078
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some entries in tables were sub-strings (e.g. prefixes) of others. Since we store start-index and length (with no need for terminators), any entry that appears as a sub-string of an earlier entry can be recorded without making a separate copy of its content, just by recording where it appeared as a sub-string of an earlier entry. (Sadly this doesn't apply to month- or day-names and their short-forms: for those, we store ';'-joined lists. Thus, although each short-form is a prefix of its long-form, the short-form is stored in a list with other short-forms; and this is not a prefix of the list of matching long-forms.) The savings are modest (780 bytes at present), but cost us nothing except when running the python script that generates the data files (it takes a little longer now), which usually only happens at a CLDR update. Change-Id: I05bdaa9283365707bac0190ae983b31f074dd6ed Reviewed-by: Lars Knoll <lars.knoll@qt.io>
* | Minor tidy-up in qlocalexml2cpp.pyEdward Welbourne2020-01-301-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split a long line. Use pythonic chained comparison to save some repetition. Comment on a field not currently in actual use. Say "zeros" rather than "0s" in one comment to match another. Added a .h suffix to the main locale data tempfile to match the naming of the tempfiles used for calendar data. Simplify generation of the blank line between Language and Script; and include a matching blank between Script and Country. This adds one blank line to qlocale.h Removed a stray space that misaligned locale data lines. This produces a space-only change in the generated *_data_p.h files. Change-Id: I974a9e8923c3dfd2178855d2cf1d6a5074e130b3 Reviewed-by: Lars Knoll <lars.knoll@qt.io>
* | Preserve the case of the exponent separator CLDR suppliesEdward Welbourne2020-01-301-537/+537
|/ | | | | | | | | | | | | | | | | | | | | | We have long (since 4.5.1) coerced it to lower-case, for no readily apparent, much less documented, reason. CLDR says most locales use an upper-case E for this - let's actually use what CLDR says we should use. The code that matches the exponent separator was doing so case-insensitively in any case; that needed adaptation now that the separator's case isn't pre-determined; and, in any case, should have been done using case-folding rather than upper-casing. In the process, removed some spurious checks for "'e' or 'E'" in the result, since the exponent separator is always represented by 'e' (and an 'e' might also be present for the separate reason of its use as a beyond-decimal digit representing fourteen). [ChangeLog][QtCore][QLocale] QLocale::exponential() now preserves the case of the CLDR source, where previously it was lower-cased. Change-Id: Ic9ac02136cff79cb9f136d72141b5dbf54d9e0a6 Reviewed-by: Lars Knoll <lars.knoll@qt.io>
* Update CLDR to v36Edward Welbourne2019-10-251-2836/+2876
| | | | | | | | | | | | | | | | | | | | | | Released on October 4th. Adds Windows names for two time zones, Qyzylorda and Volgograd. Added languages Chickasaw (cic), Muscogee (mus) and Silesian (szl). Norwegian number formatting has flipped back to using colon rather than dot as time separator; it's flipped back and forth over the last several CLDR releases. The dot form is present as a variant, the colon form was long given as the normal pattern, then went away; but now it's back as a contributed draft and that's what we pick up. The MS-Win time-zone ID script was iterating a dict, causing random reshuffling when new entries are added. Fixed that by doing the critical iteration in sorted order. Omitted locales ccp_BD and ccp_IN due to QTBUG-69324. Task-number: QTBUG-79418 Change-Id: I43869ee1810ecc1fe876523947ddcbcddf4e550a Reviewed-by: Lars Knoll <lars.knoll@qt.io>
* Add support for calendars beside GregorianSoroush Rabiei2019-08-201-2569/+589
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add QCalendarBackend as a base class for calendar implementations and QCalendar as a facade via which to access it. QDate's implicit implementation of the Gregorian calendar becomes QGregorianCalendar and QDate methods now support choice of calendar. Convert QLocale's CLDR data for month names to a locale-data component of each supported calendar and relevant QLocale methods now support choice of calendar. Adapt Python scripts for locale data generation to extract month name data from CLDR (keeping on version v35.1) into the new calendar-locale files. The locale data for the Gregorian calendar is held in a Roman calendar base, for sharing with other calendars. Add tests for basic uses of the new API. [ChangeLog][QtCore][QCalendar] Added QCalendar to support diverse calendars, supported by implementing QCalendarBackend. [ChangeLog][QtCore][QDate] Allow choice of calendar in various operations, with Gregorian remaining the default. Done-with: Lars Knoll <lars.knoll@qt.io> Done-with: Edward Welbourne <edward.welbourne@qt.io> Fixes: QTBUG-17110 Fixes: QTBUG-950 Change-Id: I9d6278f394269a183aee8156e990cec4d5198ab8 Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
* Move text-related code out of corelib/tools/ to corelib/text/Edward Welbourne2019-07-101-0/+8814
This includes byte array, string, char, unicode, locale, collation and regular expressions. Change-Id: I8b125fa52c8c513eb57a0f1298b91910e5a0d786 Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>