path: root/src/corelib/text/qlocale_tools.cpp
Commit message | Author | Age | Files | Lines
* Use SPDX license identifiers | Lucie Gérard | 2022-05-16 | 1 | -39/+3
    Replace the current license disclaimer in files with an SPDX-License-Identifier. Files that have to be modified by hand are modified. License files are organized under the LICENSES directory.
    Task-number: QTBUG-67283
    Change-Id: Id880c92784c40f3bbde861c0d93f58151c18b9f1
    Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
    Reviewed-by: Lars Knoll <lars.knoll@qt.io>
    Reviewed-by: Jörg Bornemann <joerg.bornemann@qt.io>
* Implement support for '0b' prefix in toInt() etc | Marc Mutz | 2022-04-28 | 1 | -14/+26
    [ChangeLog][QtCore][QByteArray/QByteArrayView/QLatin1String/QString/QStringView] The string-to-integer conversion functions (toInt() etc) now support the 0b prefix for binary literals. That means that base = 0 will recognize 0b to mean base = 2 and an explicit base = 2 argument will make toInt() (etc) skip an optional 0b.
    [ChangeLog][QtCore][Important Behavior Changes] Due to the newly-introduced support for 0b (binary) prefixes in integer parsing, some strings that were previously rejected as invalid now parse as valid. E.g., Qt 6.3 with autodetected bases would have tried to parse "0b1" as an octal value and failed, whereas 6.4 will parse it as the binary literal and return 1.
    Fixes: QTBUG-85002
    Change-Id: Id4eff72d63619080e5afece4d059b6ffd52f28c8
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
    Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
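The prefix rules in this commit can be illustrated with a minimal sketch in standard C++. This is not Qt code: the helper name `parseUnsigned`, the use of `std::from_chars`, and the acceptance of upper-case `0B`/`0X` are assumptions made for the example, and `base` is assumed to be 0 or in the range 2..36.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper (not Qt's implementation): parse an unsigned integer.
// base == 0 means "detect the base from the prefix".
std::optional<unsigned long long> parseUnsigned(std::string_view s, int base = 0)
{
    // True if s starts with '0' followed by the given letter in either case.
    const auto hasPrefix = [&s](char lower) {
        return s.size() >= 2 && s[0] == '0' && (s[1] | 0x20) == lower;
    };
    if (base == 0) {
        if (hasPrefix('x')) {
            base = 16;
            s.remove_prefix(2);
        } else if (hasPrefix('b')) {
            base = 2;               // the newly recognized binary prefix
            s.remove_prefix(2);
        } else if (s.size() > 1 && s[0] == '0') {
            base = 8;
        } else {
            base = 10;
        }
    } else if ((base == 16 && hasPrefix('x')) || (base == 2 && hasPrefix('b'))) {
        s.remove_prefix(2);         // an explicit base skips its optional prefix
    }
    if (s.empty())
        return std::nullopt;
    unsigned long long value = 0;
    const char *end = s.data() + s.size();
    const auto [ptr, ec] = std::from_chars(s.data(), end, value, base);
    if (ec != std::errc{} || ptr != end)
        return std::nullopt;        // overflow, no digits, or trailing text
    return value;
}
```

With this sketch, `parseUnsigned("0b1")` yields 1, while `parseUnsigned("0b1", 8)` is rejected because `b` is not an octal digit, which mirrors the behavior change the ChangeLog entry describes.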
* QtCore: Replace remaining uses of QLatin1String with QLatin1StringView | Sona Kurazyan | 2022-03-26 | 1 | -2/+2
    Task-number: QTBUG-98434
    Change-Id: Ib7c5fc0aaca6ef33b93c7486e99502c555bf20bc
    Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
    Reviewed-by: Marc Mutz <marc.mutz@qt.io>
* Don't allocate in qt_asciiToDouble() | Marc Mutz | 2022-02-25 | 1 | -17/+8
    The sscanf implementation ensured NUL-termination of the input data, by copying it, and appending NUL. Since this function is ignoring trailing garbage and reports the progress back, we could be parsing the first double in a multi-MiB buffer. And we'd been copying and copying the buffer for every double scanned. This is clearly not acceptable.
    An alternative is to use the max-field-width feature of scanf. By giving the size of the input data as the maximum field width in the format string, we stop sscanf from reading more than the available data.
    This code should let everyone's alarm bells go off: a format string constructed at run-time is really the last thing one should consider, but I haven't found a way to pass the field width as an argument, so bite the bullet and go through with it. Copying potentially MiBs of data is the worse of the two evils.
    Pick-to: 6.3
    Fixes: QTBUG-101178
    Change-Id: Ibaf8142f6b3dab4d5e3631c3cc8cc6699bceb320
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
    Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
    Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
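The max-field-width idea can be sketched as follows. This is an illustration only, not the code from the commit: the helper name `scanDouble` and its signature are hypothetical, and the sketch is only exercised on NUL-terminated buffers, so it shows how the width caps what a conversion consumes rather than proving anything about unterminated input.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: parse one double from at most `len` characters of `buf`.
// Returns true on success; `consumed` receives the number of characters used.
bool scanDouble(const char *buf, std::size_t len, double &value, int &consumed)
{
    if (len == 0)
        return false;
    char format[32];
    // Builds a format such as "%12lf%n" at run time; the width limits how
    // many characters the %lf conversion may consume.
    std::snprintf(format, sizeof format, "%%%zulf%%n", len);
    consumed = 0;
    // %n does not count towards the return value, so success is exactly 1.
    return std::sscanf(buf, format, &value, &consumed) == 1;
}
```

Called as `scanDouble("3.14159", 4, v, n)`, the conversion stops after the four characters `3.14` even though more digits follow, which is the property the commit message relies on.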
* Fix overflow issue on parsing min-qint64 with its minus sign repeated | Edward Welbourne | 2021-10-26 | 1 | -1/+16
    The call to std::from_chars() accepts a sign, but we've already dealt with a sign, so that would be a second sign. Check the first character after any prefix is in fact a digit (for the base in use).
    This is a follow-up to commit 5644af6f8a800a1516360a42ba4c1a8dc61fc516.
    Fixes: QTBUG-97521
    Change-Id: I65fb144bf6a8430da90ec5f65088ca20e79bf02f
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
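A minimal sketch of the "digit must follow the sign" check, in standard C++ and restricted to base 10. The helper name `parseInt64` is hypothetical and the structure is an assumption for illustration; it does not reproduce Qt's code.

```cpp
#include <charconv>
#include <cstdint>
#include <limits>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: base-10 signed parse that handles the sign itself.
std::optional<std::int64_t> parseInt64(std::string_view s)
{
    bool negative = false;
    if (!s.empty() && (s[0] == '-' || s[0] == '+')) {
        negative = s[0] == '-';
        s.remove_prefix(1);
    }
    // The sign has been consumed, so the next character must be a digit.
    // This rejects a second sign whatever type from_chars parses into.
    if (s.empty() || s[0] < '0' || s[0] > '9')
        return std::nullopt;
    std::uint64_t magnitude = 0;
    const char *end = s.data() + s.size();
    const auto [ptr, ec] = std::from_chars(s.data(), end, magnitude, 10);
    if (ec != std::errc{} || ptr != end)
        return std::nullopt;
    const std::uint64_t maxPositive = std::numeric_limits<std::int64_t>::max();
    if (negative) {
        if (magnitude == maxPositive + 1)   // exactly 2^63: the minimum value
            return std::numeric_limits<std::int64_t>::min();
        if (magnitude > maxPositive)
            return std::nullopt;
        return -static_cast<std::int64_t>(magnitude);
    }
    if (magnitude > maxPositive)
        return std::nullopt;
    return static_cast<std::int64_t>(magnitude);
}
```

Here `"-9223372036854775808"` parses to the minimum 64-bit value, while `"--9223372036854775808"` is rejected before any conversion is attempted.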
* Convert various callers of strtou?ll() to call strntou?ll() | Edward Welbourne | 2021-08-30 | 1 | -1/+1
    Where size is known or can readily be determined.
    Change-Id: I442e7ebb3757fdbf7d021a15e19aeba533b590a5
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
    Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
* Replace FreeBSD's strtou?ll() with std::from_chars()-based strntou?ll() | Edward Welbourne | 2021-08-30 | 1 | -37/+80
    Remove third-party code in favor of STL. Implement (for now) strtou?ll() as inlines on strntou?ll() calling strlen() for the size parameter. (This is not entirely safe, as a string lacking '\0'-termination but with at least some non-matching text after the numeric portion would formerly be parsed just fine, but would now produce a crash. However, strtou?ll() are internal and callers should be ensuring '\0'-termination.)
    Task-number: QTBUG-74286
    Change-Id: I0c8ca7d4f6110367e93b4c0164854a82c5a545e1
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
    Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
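The relationship between the bounded and unbounded variants might look roughly like the sketch below. The names `boundedStrToULL` and `unboundedStrToULL` are hypothetical stand-ins, prefix and sign handling are omitted, and `base` is assumed to be in 2..36; this shows the shape of the idea, not Qt's functions.

```cpp
#include <charconv>
#include <cstddef>
#include <cstring>
#include <system_error>

// Hypothetical stand-in for a bounded strntoull(): never reads past begin + size.
unsigned long long boundedStrToULL(const char *begin, std::size_t size,
                                   const char **endptr, int base, bool *ok)
{
    unsigned long long result = 0;
    const auto res = std::from_chars(begin, begin + size, result, base);
    *ok = res.ec == std::errc{};
    *endptr = *ok ? res.ptr : begin;    // on failure, report nothing consumed
    return *ok ? result : 0;
}

// The unbounded variant as an inline wrapper. The strlen() call is what makes
// a missing '\0' terminator dangerous, as the commit message warns.
inline unsigned long long unboundedStrToULL(const char *begin, const char **endptr,
                                            int base, bool *ok)
{
    return boundedStrToULL(begin, std::strlen(begin), endptr, base, ok);
}
```

The bounded form can parse a slice of a raw, unterminated buffer, whereas the wrapper must walk to the terminator before parsing starts.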
* QByteArray: Disentangle number(double) from QLocale | Mårten Nordheim | 2021-08-17 | 1 | -0/+6
    Previously number(double) would go through QLocale which takes a lot of factors into consideration (which we don't need in this case) and outputs a QString in the end, which we then have to convert back to QByteArray. Avoid all that extra work and format it directly into a QByteArray.
    The other number() functions do not use QLocale, so are left alone for now.
    Task-number: QTBUG-88484
    Change-Id: I4c2eaf101a55ba16e858f95017fb171589a0184e
    Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* Make double-formatting code ready for QByteArray | Mårten Nordheim | 2021-08-17 | 1 | -30/+45
    Split off the actual logic in qdtoBasicLatin into a templated function, qdtoString, which supports both QByteArray and QString. Since it uses qullToBasicLatin_helper as part of its fallback path make the same change to it.
    Task-number: QTBUG-88484
    Change-Id: Icac75ee74ba6a9ddc3aa8d4782a981ef50a88db4
    Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* QString::number(double): Disentangle from the QLocale path | Mårten Nordheim | 2021-08-17 | 1 | -0/+188
    By writing code to do formatting without considering locale.
    The code tries to not do any unnecessary (re)allocations, and as such it reserves at the beginning and only appends.
    Cuts execution times of benchmarks by between 30% and 80%:
    PASS   : tst_QString::initTestCase()
    PASS   : tst_QString::number_double(0, format 'f', precision 0)
    RESULT : tst_QString::number_double():"0, format 'f', precision 0":
    -     0.0001774 msecs per iteration (total: 2,661, iterations: 15000000)
    +     0.0001238 msecs per iteration (total: 1,858, iterations: 15000000)
    PASS   : tst_QString::number_double(0, format 'f', precision 0)
    RESULT : tst_QString::number_double():"0, format 'f', precision 0":
    -     0.0002472 msecs per iteration (total: 3,709, iterations: 15000000)
    +     0.0001407 msecs per iteration (total: 2,111, iterations: 15000000)
    PASS   : tst_QString::number_double(0.12340, format 'f', precision 5)
    RESULT : tst_QString::number_double():"0.12340, format 'f', precision 5":
    -     0.0004769 msecs per iteration (total: 7,155, iterations: 15000000)
    +     0.0001638 msecs per iteration (total: 2,458, iterations: 15000000)
    PASS   : tst_QString::number_double(-0.12340, format 'f', precision 5)
    RESULT : tst_QString::number_double():"-0.12340, format 'f', precision 5":
    -     0.0005759 msecs per iteration (total: 8,639, iterations: 15000000)
    +     0.0001664 msecs per iteration (total: 2,497, iterations: 15000000)
    PASS   : tst_QString::number_double(1.618033988749895, format 'f', precision 15)
    RESULT : tst_QString::number_double():"1.618033988749895, format 'f', precision 15":
    -     0.0003644 msecs per iteration (total: 5,467, iterations: 15000000)
    +     0.0001869 msecs per iteration (total: 2,804, iterations: 15000000)
    PASS   : tst_QString::number_double(2.220446049e-16, format 'g', precision 10)
    RESULT : tst_QString::number_double():"2.220446049e-16, format 'g', precision 10":
    -     0.00070580 msecs per iteration (total: 10,587, iterations: 15000000)
    +     0.0002277 msecs per iteration (total: 3,416, iterations: 15000000)
    PASS   : tst_QString::number_double(1.0E-04, format 'E', precision 1)
    RESULT : tst_QString::number_double():"1.0E-04, format 'E', precision 1":
    -     0.00082213 msecs per iteration (total: 12,332, iterations: 15000000)
    +     0.0002018 msecs per iteration (total: 3,028, iterations: 15000000)
    PASS   : tst_QString::number_double(1.0E+08, format 'E', precision 1)
    RESULT : tst_QString::number_double():"1.0E+08, format 'E', precision 1":
    -     0.00082459 msecs per iteration (total: 12,369, iterations: 15000000)
    +     0.0002016 msecs per iteration (total: 3,025, iterations: 15000000)
    PASS   : tst_QString::number_double(-1.0E+08, format 'E', precision 1)
    RESULT : tst_QString::number_double():"-1.0E+08, format 'E', precision 1":
    -     0.00093840 msecs per iteration (total: 14,076, iterations: 15000000)
    +     0.0002074 msecs per iteration (total: 3,111, iterations: 15000000)
    PASS   : tst_QString::cleanupTestCase()
    -Totals: 11 passed, 0 failed, 0 skipped, 0 blacklisted, 153777ms
    +Totals: 11 passed, 0 failed, 0 skipped, 0 blacklisted, 48753ms
    Task-number: QTBUG-88484
    Change-Id: I23234467801243b163dff5cccf8a9fe9d90c3e2a
    Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* QString::number(int): Avoid going through qlocale machinery | Mårten Nordheim | 2021-07-26 | 1 | -44/+48
    For increased performance, as measured with the number_qu?longlong QString benchmarks. The results are all good, reducing runtime by anywhere between 40 and 220 nanoseconds. The slowest test previously completing at ~330ns, now completing at ~105ns.
    Task-number: QTBUG-88484
    Change-Id: Ie96e86e57b635bac01389aed531a6d9f087df983
    Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* QLocale: add a Q_CHECK_PTR | Giuseppe D'Angelo | 2021-01-20 | 1 | -0/+1
    Avoid complaints from static analyzers that the pointer returned by malloc might be null.
    Change-Id: I3ec3ba03d0b5283dd569200a3040a5fe5990f763
    Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Mark two impossible code-paths with Q_UNREACHABLE | Edward Welbourne | 2020-11-27 | 1 | -1/+2
    Change-Id: I8c04f512b078d4c13d759854b65f4d39b7b80e75
    Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
    Reviewed-by: Andrei Golubev <andrei.golubev@qt.io>
* Performance improvement for integer->QString conversion | Andreas Buhr | 2020-10-31 | 1 | -4/+43
    The compiler can generate better code when the base is known in integer to string conversion. This patch creates separate branches for known bases and leaves generic code as a fallback. Saved about 12ns per conversion of 12345678 in one measurement.
    Task-number: QTBUG-87330
    Change-Id: I44c9bb467cf211f7e617ed55104476062296bba6
    Reviewed-by: Andrei Golubev <andrei.golubev@qt.io>
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
    Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
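The "separate branches for known bases" technique can be sketched as below. All names (`toStringFixedBase`, `toStringAnyBase`, `toString`) are hypothetical, the choice of specialized bases is an assumption, and `base` is assumed to lie in 2..36; the point is only that a compile-time base lets the compiler turn the division and modulo into cheaper operations.

```cpp
#include <string>

// Base is a compile-time constant here, so n % Base and n / Base can be
// compiled to shifts or multiplications instead of a generic division.
template <unsigned Base>
std::string toStringFixedBase(unsigned long long n)
{
    static_assert(Base >= 2 && Base <= 36, "unsupported base");
    char buffer[64];                        // enough for 64 binary digits
    char *p = buffer + sizeof buffer;
    do {
        const unsigned digit = static_cast<unsigned>(n % Base);
        *--p = static_cast<char>(digit < 10 ? '0' + digit : 'a' + (digit - 10));
        n /= Base;
    } while (n != 0);
    return std::string(p, buffer + sizeof buffer);
}

// Generic fallback: same loop, but the base is only known at run time.
std::string toStringAnyBase(unsigned long long n, unsigned base)
{
    char buffer[64];
    char *p = buffer + sizeof buffer;
    do {
        const unsigned digit = static_cast<unsigned>(n % base);
        *--p = static_cast<char>(digit < 10 ? '0' + digit : 'a' + (digit - 10));
        n /= base;
    } while (n != 0);
    return std::string(p, buffer + sizeof buffer);
}

// Dispatch: dedicated branches for common bases, generic code otherwise.
std::string toString(unsigned long long n, unsigned base)
{
    switch (base) {
    case 2:  return toStringFixedBase<2>(n);
    case 8:  return toStringFixedBase<8>(n);
    case 10: return toStringFixedBase<10>(n);
    case 16: return toStringFixedBase<16>(n);
    default: return toStringAnyBase(n, base);
    }
}
```

Whether the specialized branches actually win depends on the compiler and target, so any speedup should be measured, as the commit did for its own change.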
* Revert changes in strto(u)ll.c to avoid integer overflows | Robert Loehning | 2020-10-26 | 1 | -0/+4
    Found in oss-fuzz issue 26045.
    Pick-to: 5.12 5.15
    Change-Id: Id9eac1b4f67ad9bbe2d92dd69cd03338a6ced74e
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Use char32_t for QLocaleData::zeroUcs() and friends | Edward Welbourne | 2020-08-28 | 1 | -6/+6
    Also catch some stray ushort that should be char16_t by now, use unicode character values for some constants and rename a UCS2 variable to not claim it's UCS4.
    Change-Id: I374b791947f5c965eaa22ad5b16060b475081c9d
    Reviewed-by: Lars Knoll <lars.knoll@qt.io>
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* QLocale: update qt_asciiToDouble to use qsizetype | Thiago Macieira | 2020-07-31 | 1 | -10/+10
    No need to change the output variable from int to qsizetype. That would complicate the use of libdouble-conversion, which uses ints.
    Change-Id: Iea47e0f8fc8b40378df7fffd1624bfdba1189d81
    Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* QLocale: fix conversion of "\0" to double | Thiago Macieira | 2020-07-28 | 1 | -1/+1
    That is not a valid conversion. An empty string is a valid conversion; a string containing a null should fail.
    Change-Id: Iea47e0f8fc8b40378df7fffd1624c088f3bd1b14
    Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* QByteArray::toDouble: fix buffer overflow reads on fromRawData() | Thiago Macieira | 2020-07-28 | 1 | -6/+25
    If Qt was not compiled with libdouble-conversion, sscanf() requires null-termination, which fromRawData() does not require. This could be fixed by making QByteArray pass a reallocated copy if it is operating on raw data, but fixing qt_asciiToDouble() means we catch all cases and we optimize for the common case of not-horribly-long strings.
    Fixes: QTBUG-85580
    Pick-to: 5.15 5.12
    Change-Id: Iea47e0f8fc8b40378df7fffd16246f6163b01442
    Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* Fix handling of Suzhou numbering system | Edward Welbourne | 2020-07-17 | 1 | -2/+2
    This only arises when the system locale tells us to use its zero as our zero digit, since no CLDR locale uses it by default. Adapt an MS-specific QLocale::system() test to use Suzhou numbering, so as to test this.
    While updating the locale-restoration code to also restore the digits being set in that test, add restore code for the long time format, where previously only the short time format was restored. Add a comment to make it less likely one of those shall be missed in future.
    Fixes: QTBUG-85409
    Change-Id: I343324bb563ee0e455dfe77d4825bf8c3082ca30
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Support digit-grouping correctly | Edward Welbourne | 2020-07-14 | 1 | -3/+4
    Read three more values from CLDR and add a byte to the bit-fields at the end of QLocaleData, indicating the three group sizes. This adds three new parameters to various low-level formatting functions. At the same time, rename ThousandsGroup to GroupDigits, more faithfully expressing what this (internal) option means.
    This replaces commit 27d139128013c969a939779536485c1a80be977e with a fuller implementation that handles digit-grouping in any of the ways that CLDR supports. The formerly "Indian" formatting now also applies to at least some locales for Bangladesh, Bhutan and Sri Lanka.
    Fixed Costa Rica currency formatting test that wrongly put a separator after the first digit; the locale (in common with several Spanish locales) requires at least two digits before the first separator.
    [ChangeLog][QtCore][Important Behavior Changes] Some locales require more than one digit before the first grouping separator; others use group sizes other than three. The latter was partially supported (only for India) at 5.15 but is now systematically supported; the former is now also supported.
    Task-number: QTBUG-24301
    Fixes: QTBUG-81050
    Change-Id: I4ea4e331f3254d1f34801cddf51f3c65d3815573
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
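One way to picture grouping with three size parameters is the sketch below. The function name `groupDigits` and the parameter names `first`, `higher` and `least` are hypothetical, and the meaning assigned to each is an assumption modeled on the behavior the commit describes (a minimum digit count before the first separator, plus possibly different sizes for the lowest and the higher groups).

```cpp
#include <cstddef>
#include <string>

// Hypothetical parameters:
//   first  - minimum number of digits that must precede the first separator
//   higher - size of every group above the least significant one
//   least  - size of the least significant group
std::string groupDigits(std::string digits, int first, int higher, int least,
                        char separator)
{
    if (higher <= 0 || least <= 0)
        return digits;                      // grouping disabled or invalid
    // Position of the separator that closes off the least significant group.
    int i = static_cast<int>(digits.size()) - least;
    if (i >= first) {
        digits.insert(static_cast<std::size_t>(i), 1, separator);
        // Walk towards the front, inserting one separator per higher group.
        while ((i -= higher) > 0)
            digits.insert(static_cast<std::size_t>(i), 1, separator);
    }
    return digits;
}
```

With sizes (1, 3, 3) the digits `1234567` group as `1,234,567`; with (1, 2, 3) they group in the style formerly called "Indian", `12,34,567`; and with (2, 3, 3) a four-digit number such as `1234` stays ungrouped while `12345` becomes `12.345` when `.` is the separator.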
* QLocalePrivate: rearrange number format statics and tools | Edward Welbourne | 2020-07-14 | 1 | -85/+0
    Instead of passing lots of instance data around among public static methods and functions in qlocale_tools, do the work in instance methods that can access the relevant attributes of the locale when they need them. Incidentally reduces clutter in the global namespace. Add a signPrefix() to handle a repeated computation. Keep new internal methods private.
    Change-Id: I9556a960acac9fb645872337c61f509fb902984e
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Fix floating-point 'g'-format's choice between 'e' and 'f' forms | Edward Welbourne | 2020-07-14 | 1 | -0/+2
    During review of a refactor (coming shortly), Thiago wondered what the magic numbers were. On closer examination, I concluded that they were wrong and wrote some tests to prove it. This commit adds those tests; replaces the misguided old code with something that passes them; and documents the reasons for the various parts of its decisions. In the process, tidy up QLocaleData::doubleToString() somewhat and rename some of its variables to conform to Qt coding style.
    Change-Id: Ibee43659b1bdb0707639cdb444cfe941c31d409f
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Use numeric_limits instead of hand-coded equivalents | Edward Welbourne | 2020-07-13 | 1 | -6/+7
    As a comment noted, the reason for QLocaleData rolling its own values describing the ranges of digits and exponents in a double were all about std::numeric_limits's constants not being constexpr - which they have now been since C++11, so we can do away with our own.
    One of the constants was used in two places in the same way; so abstract that use out into an inline function in qlocale_tools, to save duplication and give somewhere to document it.
    Change-Id: I7e3740ece9b499c0ec434de18d70abe69e1fe079
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Port Q_STATIC_ASSERT(_X) to static_assert | Giuseppe D'Angelo | 2020-06-19 | 1 | -1/+1
    There is no reason to keep using our macro now that we have C++17. The macro itself is left in for the time being, as well as its detection logic, because it's needed for C code (not everything supports C11 yet). A few more cleanups will arrive in the next few patches.
    Note that this is a mere search/replace; some places were using double braces to work around the presence of commas in a macro, no attempt has been made to fix those.
    tst_qglobal had just some minor changes to keep testing the macro.
    Change-Id: I1c1c397d9f3e63db3338842bf350c9069ea57639
    Reviewed-by: Lars Knoll <lars.knoll@qt.io>
* Allow surrogate pairs for various "single character" locale data | Edward Welbourne | 2020-02-17 | 1 | -51/+69
    Extract the character in its proper unicode form and encode it in a new single_character_data table of locale data. Record each entry as the range within that table that encodes it. Also added an assertion in the generator script to check that the digits CLDR gives us are a contiguous sequence in increasing order, as has been assumed by the C++ code for some time. Lots of number-formatting code now has to take account of how wide the digits are.
    This leaves nowhere for updateSystemPrivate() to record values read from sys_locale->query(), so we must always consult that function when accessing these members of the systemData() object. Various internal users of these single-character fields need the system-or-CLDR value rather than the raw CLDR value, so move QLocalePrivate's methods to supply them down to QLocaleData and ensure they check for system values, where appropriate first.
    This allows us to finally support the Chakma language and script, for whose number system UTF-16 needs surrogate pairs.
    Costs 10.8 kB in added data, much of it due to adding two new locales that need surrogates to represent digits.
    [ChangeLog][QtCore][QLocale] Various QLocale methods that returned single QChar values now return QString values to accommodate those locales which need a surrogate pair to represent the (single character) return value.
    Fixes: QTBUG-69324
    Fixes: QTBUG-81053
    Change-Id: I481722d6f5ee266164f09031679a851dfa6e7839
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
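Why digit width matters can be shown with a short standard C++ sketch. The helper names `toUtf16` and `digitFor` are hypothetical, and the sketch assumes the input is a valid code point outside the surrogate range. It relies on the same assumption the commit mentions, namely that a locale's digits are contiguous and increasing from its zero digit.

```cpp
#include <string>

// Encode one Unicode code point as UTF-16: one or two code units.
std::u16string toUtf16(char32_t ucs4)
{
    if (ucs4 < 0x10000)
        return std::u16string(1, static_cast<char16_t>(ucs4));
    ucs4 -= 0x10000;
    std::u16string pair;
    pair += static_cast<char16_t>(0xD800 + (ucs4 >> 10));    // high surrogate
    pair += static_cast<char16_t>(0xDC00 + (ucs4 & 0x3FF));  // low surrogate
    return pair;
}

// Hypothetical helper: the digit for `value` (0..9), given the locale's zero
// digit and assuming the ten digits are contiguous code points.
std::u16string digitFor(char32_t zero, int value)
{
    return toUtf16(static_cast<char32_t>(zero + static_cast<char32_t>(value)));
}
```

For ASCII, `digitFor(U'0', 7)` is a single code unit, whereas a Chakma digit (zero at U+11136) comes back as two code units, so code that indexes into a formatted number has to multiply positions by the digit width.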
* Tidy nullptr usage | Allan Sandfeld Jensen | 2019-12-06 | 1 | -1/+1
    Move away from using 0 as pointer literal. Done using clang-tidy. This is not complete as run-clang-tidy can't handle all of qtbase in one go.
    Change-Id: I1076a21f32aac0dab078af6f175f7508145eece0
    Reviewed-by: Friedemann Kleint <Friedemann.Kleint@qt.io>
    Reviewed-by: Lars Knoll <lars.knoll@qt.io>
* Use quiet NaNs instead of signaling ones | Edward Welbourne | 2019-09-04 | 1 | -2/+2
    I see no good reason why the NaN returned when reading "nan" as a double should be a signaling one; a quiet one should be just fine.
    [ChangeLog][QtCore][QLocale] The NaN obtained when reading "nan" as a floating-point value is now quiet rather than signaling.
    [ChangeLog][QtCore][QTextStream] The NaN obtained when reading "nan" as a floating-point value is now quiet rather than signaling.
    Change-Id: Ife477a30bfb813c611b13a33c38ea82f9e8a93eb
    Reviewed-by: Lars Knoll <lars.knoll@qt.io>
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
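In standard C++ terms, the distinction is between the two NaN factories of `std::numeric_limits`. The sketch below uses a hypothetical helper, `parseNan`, that only recognizes the exact lower-case text `nan`; Qt's actual matching rules are not shown here.

```cpp
#include <cmath>
#include <limits>
#include <string_view>

static_assert(std::numeric_limits<double>::has_quiet_NaN,
              "this sketch assumes the platform provides a quiet NaN");

// Hypothetical helper: map the text "nan" to a quiet NaN.
bool parseNan(std::string_view text, double &out)
{
    if (text != "nan")
        return false;
    // A quiet NaN propagates through arithmetic without raising a
    // floating-point exception, unlike signaling_NaN().
    out = std::numeric_limits<double>::quiet_NaN();
    return true;
}
```

The result still satisfies `std::isnan()`, so callers that only test for NaN see no difference from the earlier behavior.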
* Move text-related code out of corelib/tools/ to corelib/text/ | Edward Welbourne | 2019-07-10 | 1 | -0/+578
    This includes byte array, string, char, unicode, locale, collation and regular expressions.
    Change-Id: I8b125fa52c8c513eb57a0f1298b91910e5a0d786
    Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>