path: root/src/corelib/text/qchar.cpp
Commit message | Author | Age | Files | Lines
* Use checked string iteration in case conversions | Edward Welbourne | 2020-08-29 | 1 | -0/+1
    The Unicode table code can only be safely called on valid code-points. So code that calls it must only pass it valid Unicode data. The string iterator's Unchecked methods only provide this guarantee when the string being iterated is guaranteed to be valid UTF-16; while client code should only use QString, QStringView and friends on valid UTF-16 data, we have no way to be sure they have respected that. So take the few extra cycles to actually check validity in the course of iterating strings, when the resulting code-points are to be passed to the Unicode table look-ups.

    Add tests that case mapping doesn't access Unicode tables out of range (it'll trigger the new assertion). Added some comments to qchar.h that helped me understand surrogates.

    Change-Id: Iec2c3106bf1a875bdaa1d622f6cf94d7007e281e
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
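As a rough illustration of what checking during iteration involves, the sketch below decodes UTF-16 one code point at a time and substitutes U+FFFD for any unpaired surrogate. It uses only the standard library. `nextChecked` and the helper names are hypothetical and are not Qt's string iterator API, and returning U+FFFD for invalid input is an assumption made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical replacement value for invalid input (an assumption of this sketch).
constexpr char32_t kReplacement = 0xFFFD;

constexpr bool isHighSurrogate(char16_t c) { return (c & 0xFC00) == 0xD800; }
constexpr bool isLowSurrogate(char16_t c) { return (c & 0xFC00) == 0xDC00; }

// Decodes the code point that starts at s[pos] and advances pos past it.
// An unpaired surrogate consumes one code unit and yields U+FFFD, so the
// returned value is never a lone surrogate.
inline char32_t nextChecked(std::u16string_view s, std::size_t &pos)
{
    assert(pos < s.size()); // precondition: there is something left to read
    const char16_t first = s[pos++];
    if (isHighSurrogate(first)) {
        if (pos < s.size() && isLowSurrogate(s[pos])) {
            const char16_t second = s[pos++];
            return 0x10000 + ((char32_t(first) - 0xD800) << 10)
                           + (char32_t(second) - 0xDC00);
        }
        return kReplacement; // high surrogate without a following low surrogate
    }
    if (isLowSurrogate(first))
        return kReplacement; // stray low surrogate
    return first;            // ordinary BMP code unit
}
```

An unchecked variant would skip the two surrogate tests and combine whatever two code units it finds, which is cheaper but only correct when the input is already known to be valid UTF-16.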
* Fix a number of qdoc warnings related to deprecation | Friedemann Kleint | 2020-07-24 | 1 | -32/+0
    Remove obsolete documentation.

    Change-Id: Iaf4b6f9852a883dea0f256c5c89e74f6ebbe85f3
    Reviewed-by: Sona Kurazyan <sona.kurazyan@qt.io>
* QChar: purge deprecated API | Edward Welbourne | 2020-07-20 | 1 | -51/+1
    Since 5.3: joining() and the old Joining type, replaced by JoiningType joiningType().

    Change-Id: Iefee50aaf94cec6d67b5fc004b3e68357b2015c5
    Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
* Port QString to qsizetype | Lars Knoll | 2020-07-06 | 1 | -20/+20
    Change-Id: Id9477ccfabadd578546bb265a9483f128efb6736
    Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Port Q_STATIC_ASSERT(_X) to static_assert | Giuseppe D'Angelo | 2020-06-19 | 1 | -4/+4
    There is no reason to keep using our macro now that we have C++17. The macro itself is left in for the time being, as well as its detection logic, because it's needed for C code (not everything supports C11 yet). A few more cleanups will arrive in the next few patches.

    Note that this is a mere search/replace; some places were using double braces to work around the presence of commas in a macro, and no attempt has been made to fix those.

    tst_qglobal had just some minor changes to keep testing the macro.

    Change-Id: I1c1c397d9f3e63db3338842bf350c9069ea57639
    Reviewed-by: Lars Knoll <lars.knoll@qt.io>
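The comma workaround mentioned in this message can be sketched as follows. `SKETCH_STATIC_ASSERT` is a hypothetical stand-in written for this example, not Qt's actual macro definition, and the message-less `static_assert` form assumes a C++17 compiler.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical single-argument assertion macro (not Qt's definition).
#define SKETCH_STATIC_ASSERT(cond) static_assert(bool(cond), #cond)

// The preprocessor would split the condition at the comma in the template
// argument list, so the macro call needs an extra pair of parentheses.
SKETCH_STATIC_ASSERT((std::is_same<std::pair<int, int>::first_type, int>::value));

// The keyword is handled by the compiler rather than the preprocessor, so the
// same condition needs no wrapping, and C++17 makes the message optional.
static_assert(std::is_same<std::pair<int, int>::first_type, int>::value);
```

After a plain search/replace, the extra parentheses from the macro form remain in place; they are harmless but no longer needed.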
* QString: thoroughly port internals to char16_t | Marc Mutz | 2020-05-19 | 1 | -25/+0
    This includes allocating QString data as char16_t instead of ushort.

    This isn't the end of the port, but an important milestone: the traditional foldChar() functions are now all unused.

    Change-Id: I766bebc2d70b6972e2045d3474c7f5770f4676d9
    Reviewed-by: Lars Knoll <lars.knoll@qt.io>
* QChar: make fullConvertCase()'s result type more usable | Marc Mutz | 2020-05-17 | 1 | -4/+15
    Move it into the function, give it an explicit size and make it iterable and indicate to QStringView that it's a compatible container.

    Change-Id: I483d9225ac73ad93f2037489f2d32473c377e8b7
    Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
* Modernize foldCase() internal functions | Marc Mutz | 2020-05-11 | 1 | -0/+25
    Overload uint/ushort versions with new char16_t/char32_t ones, and [[deprecate]] the old ones.

    There's too much churn for doing the replacement in one patch.

    Change-Id: Ib1f90a1c46b80aa0fb1ea00ce614af49f49bd712
    Reviewed-by: Lars Knoll <lars.knoll@qt.io>
* QChar: add fromUcs{2,4}() | Marc Mutz | 2020-05-09 | 1 | -25/+51
    The fromUcs2() named ctor is designed to replace all the non-char integral-type constructors of QChar which make it very hard to control the implicit QChar conversions, which have caused a few bugs in Qt itself. As a classical named constructor, it simply returns QChar.

    The fromUcs4() named "ctor", however, needs to expand surrogate pairs, and thus can't just return QChar. Instead, it returns a small struct that contains one or two char16_t's, can be iterated over and be implicitly converted to QStringView. To avoid bikeshedding the name (FromUcs4Result, of course :), it's defined inline and thus can't be named outside the function.

    This function replaces most uses of QChar::requiresSurrogates() in QtBase.

    [ChangeLog][QtCore][QChar] Added fromUcs2(), fromUcs4().

    Change-Id: I803708c14001040f75cb599e33c24a3fb8d2579c
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
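The shape of such a result type might look like the standard-library-only sketch below. `Ucs4AsUtf16` and `fromUcs4Sketch` are hypothetical names; `std::u16string_view` stands in for QStringView, and the struct is given a name here purely so the example can be read, whereas the commit deliberately leaves Qt's type unnameable.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical result type holding one or two UTF-16 code units.
struct Ucs4AsUtf16 {
    char16_t chars[2];
    std::size_t count; // 1 for a BMP code point, 2 for a surrogate pair

    // Iteration support, so the result works in a range-based for loop.
    constexpr const char16_t *begin() const { return chars; }
    constexpr const char16_t *end() const { return chars + count; }

    // Implicit conversion to a non-owning view (stand-in for QStringView).
    constexpr operator std::u16string_view() const { return {chars, count}; }
};

inline Ucs4AsUtf16 fromUcs4Sketch(char32_t c)
{
    assert(c <= 0x10FFFF); // precondition: within the Unicode code space
    if (c < 0x10000)
        return {{char16_t(c), u'\0'}, 1};
    // 0xD7C0 is 0xD800 - (0x10000 >> 10), folding the offset into one add.
    return {{char16_t((c >> 10) + 0xD7C0), char16_t((c & 0x3FF) + 0xDC00)}, 2};
}
```

Because the view does not own its data, a caller of this sketch has to keep the returned struct alive for as long as the view is used, for example by storing it in a local variable first.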
* QChar/QString: centralize case folding in qchar.cpp | Marc Mutz | 2020-05-09 | 1 | -0/+23
    There are (at least) two implementations of the low-level case-folding algorithm, one of which (for QChar::toLower()) seems to be wrong (it doesn't deal with special cases which expand to more than one code point).

    The algorithm hidden in QString and entangled with the QString detaching code makes reusing the code much harder.

    At the same time, the dependency of the algorithm on the unicode tables makes exposing a non-allocating result type in the public API hard. std::u16string would be an alternative if we could ensure that all implementations use SSO with at least four characters.

    So, for the time being, leave this as internal API for use in an upcoming QStringView::toLower() as well as case-insensitive hashing.

    Change-Id: Iabb2611846f6176776aa20e634f44d8464f3305c
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* QChar: finish port to char16_t | Marc Mutz | 2020-05-08 | 1 | -4/+4
    Change-Id: If38405da34543f836e674474c05f2d98ed610a23
    Reviewed-by: Lars Knoll <lars.knoll@qt.io>
    Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* QUnicodeTables: port to charNN_t | Marc Mutz | 2020-04-27 | 1 | -13/+13
    This makes existing calls passing uint or ushort ambiguous, so fix all the callers. There do not appear to be callers outside QtBase. In fact, the ...BreakClass() functions appear to be utterly unused.

    Change-Id: I1c2251920beba48d4909650bc1d501375c6a3ecf
    Reviewed-by: Lars Knoll <lars.knoll@qt.io>
    Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
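The ambiguity arises from ordinary overload resolution: converting an `unsigned int` or `unsigned short` to either `char16_t` or `char32_t` is an integral conversion of the same rank, so neither overload wins. A minimal sketch, with hypothetical function names that are not part of QUnicodeTables:

```cpp
namespace sketch {
// Hypothetical lookups overloaded on the two character types.
constexpr int unitWidth(char16_t) { return 16; } // UTF-16 code unit
constexpr int unitWidth(char32_t) { return 32; } // UCS-4 code point
} // namespace sketch

// A legacy caller holding a plain integer no longer compiles:
//     unsigned int legacy = 0x1F600;
//     sketch::unitWidth(legacy); // error: call of overloaded function is ambiguous

// The fix at each call site is to state which character type is meant.
constexpr int fixedCaller(unsigned int legacy)
{
    return sketch::unitWidth(char32_t(legacy));
}
```

Character literals such as `u'a'` and `U'a'` already carry the right type and select an overload without a cast.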
* QChar: port low-level functions from uint/ushort to char32/16_t | Marc Mutz | 2020-04-24 | 1 | -41/+115
    Now that the standard gives us proper types for UTF-16 and UTF-32 characters, use them. Will eventually make the code much easier to read than today, where uint could be an index as well as a char32_t.

    It also ensures that the result of e.g. QChar::highSurrogate() can still be implicitly converted to a QChar now that the QChar(non-character-integral-types) ctors are being made explicit.

    [ChangeLog][QtCore][QChar] All low-level functions (e.g. highSurrogate()) now take and return char16_t instead of ushort and char32_t instead of uint.

    Change-Id: I9cd8ebf6fb998fe1075dae96c7c4484a057f0b91
    Reviewed-by: Lars Knoll <lars.knoll@qt.io>
    Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
* Update UCD to Revision 26 | Edward Welbourne | 2020-03-14 | 1 | -0/+5
    Include WordBreakTest.html, since a test uses sample strings from it, albeit without actually reading the file. Had to comment out more of the new tests, as at Revision 24, pending an update to harfbuzz and the text boundary detection code.

    Task-number: QTBUG-79631
    Task-number: QTBUG-79418
    Task-number: QTBUG-82747
    Change-Id: I0082294b09d67ffdc6a9b5c15acf77ad3b86f65f
    Reviewed-by: Lars Knoll <lars.knoll@qt.io>
* Fix some qdoc warnings | Friedemann Kleint | 2020-01-03 | 1 | -1/+1
    src/corelib/serialization/qjsonvalue.cpp:174: (qdoc) warning: No such parameter 'n' in QJsonValue::QJsonValue()
    ...
    examples/widgets/doc/src/icons.qdoc:584: (qdoc) warning: Command '\snippet (//! [24])' failed at end of file 'widgets/icons/mainwindow.cpp'
    src/corelib/text/qbytearray.cpp:5177: (qdoc) warning: clang found diagnostics parsing \fn QByteArray::FromBase64Result::operator QByteArray() const
        error: out-of-line definition of 'operator QByteArray' does not match any declaration in 'QByteArray::FromBase64Result'
    src/corelib/serialization/qjsonarray.cpp:178: (qdoc) warning: Overrides a previous doc
    src/corelib/serialization/qjsonarray.cpp:140: (qdoc) warning: (The previous doc is here)
    src/corelib/serialization/qjsonobject.cpp:1016: (qdoc) warning: clang found diagnostics parsing \fn QJsonValueRef QJsonObject::iterator::operator[](int j) const
        error: out-of-line definition of 'operator[]' does not match any declaration in 'QJsonObject::iterator'
    src/corelib/serialization/qjsonobject.cpp:1267: (qdoc) warning: clang found diagnostics parsing \fn QJsonValue QJsonObject::const_iterator::operator[](int j) const
        error: out-of-line definition of 'operator[]' does not match any declaration in 'QJsonObject::const_iterator'
    src/corelib/tools/qhash.cpp:2641: (qdoc) warning: Overrides a previous doc
    src/corelib/tools/qhash.cpp:1492: (qdoc) warning: (The previous doc is here)
    src/corelib/tools/qhash.cpp:2659: (qdoc) warning: Can't link to 'unit()'
    src/corelib/text/qchar.cpp:274: (qdoc) warning: Undocumented enum item 'Script_Sundanese' in QChar::Script
    src/corelib/text/qchar.cpp:274: (qdoc) warning: No such enum item 'Script_Sundaneseo' in QChar::Script
    src/network/ssl/qsslsocket.cpp:1514: (qdoc) warning: Can't link to 'QSslConfiguration::addDefaultCaCertificate()'
    src/widgets/widgets/qtabwidget.cpp:581: (qdoc) warning: Undocumented parameter 'visible' in QTabWidget::setTabVisible()

    Change-Id: I05c2a4884873850b684fa94036cd90db1a6e7726
    Reviewed-by: Lars Knoll <lars.knoll@qt.io>
* Add missing docs for UCD additions at 5.15 | Edward Welbourne | 2019-11-28 | 1 | -0/+14
    Also remove two stray commas pointed out in code-review and some others noticed on checking for similar.

    This amends commit c3eb521a0f10112df6b61d2592351c4eef2e1f9b.

    Change-Id: If20c5146b740defe8d25ff61d399031b5c66ded1
    Reviewed-by: Lars Knoll <lars.knoll@qt.io>
* Add Since markers to QChar::Script docs and sort in alphabetic order | Edward Welbourne | 2019-11-27 | 1 | -122/+122
    Change-Id: I4aedaf87b8b424fe946eb1618ce77a79cfddc111
    Reviewed-by: Lars Knoll <lars.knoll@qt.io>
* docs: Mark QPair and QLatin1Char as reentrant | Kavindra Palaraja | 2019-10-04 | 1 | -0/+1
    Change-Id: I7d37eb13809a6fa4d1c2c74fd8aea35bdf235996
    Fixes: QTBUG-78552
    Reviewed-by: Andy Shaw <andy.shaw@qt.io>
* QUnicodeTables: use array for case folding tables | Marc Mutz | 2019-09-04 | 1 | -13/+13
    Instead of four pairs of :1 :15 bit fields, use an array of four :1, :15 structs. This allows replacing the case folding traits classes with a simple enum that indexes into said array.

    I don't know what the WASM #ifdef'ed code is supposed to effect (a :0 bit-field is only useful to separate adjacent bit-fields into separate memory locations for multi-threading), but I thought it safer to leave it in, and that means the array must be a 64-bit block of its own, so I had to move two fields around.

    Saves ~4.5KiB in text size on optimized GCC 10 LTO Linux AMD64 builds.

    Change-Id: Ib52cd7706342d5227b50b57545d073829c45da9a
    Reviewed-by: Lars Knoll <lars.knoll@qt.io>
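The layout change described here can be pictured with the sketch below. The enum, struct and field names are illustrative assumptions rather than a copy of Qt's private header, and special-case mappings (which this sketch leaves unchanged) would need a separate lookup in real code.

```cpp
// Hypothetical enum whose values index the per-case array.
enum Case { LowerCase, UpperCase, TitleCase, CaseFold, NumCases };

struct Properties {
    struct CaseData {
        unsigned short special : 1; // 1 if the mapping needs a special-case lookup
        signed short diff : 15;     // signed offset from the original code point
    } cases[NumCases];              // one :1/:15 pair per kind of case conversion
};

// One function serves all four conversions; the enum value picks the entry,
// so no per-conversion traits class is needed.
constexpr char32_t convertCase(char32_t c, const Properties &p, Case which)
{
    const Properties::CaseData &entry = p.cases[which];
    return entry.special ? c // special cases are out of scope for this sketch
                         : static_cast<char32_t>(static_cast<int>(c) + entry.diff);
}

// Example entry for U+0041 'A': lowercase and case-fold are +32, others unchanged.
constexpr Properties kLatinCapitalA = {{{0, 32}, {0, 0}, {0, 0}, {0, 32}}};
static_assert(convertCase(U'A', kLatinCapitalA, LowerCase) == U'a');
```

With four separately named bit-field pairs, each conversion needs its own accessor, which is roughly the job the traits classes did; with an array, a single indexed access covers them all.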
* QChar: add FormFeed (FF) special character | Thiago Macieira | 2019-08-12 | 1 | -0/+1
    [ChangeLog][QtCore][QChar] Added FormFeed (FF) special character.

    Fixes: QTBUG-77089
    Change-Id: I1024ee42da0c4323953afffd15b245a508f545f0
    Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* Move text-related code out of corelib/tools/ to corelib/text/ | Edward Welbourne | 2019-07-10 | 1 | -0/+2059
    This includes byte array, string, char, unicode, locale, collation and regular expressions.

    Change-Id: I8b125fa52c8c513eb57a0f1298b91910e5a0d786
    Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>