path: root/src/corelib/codecs/qutfcodec.cpp
Commit message [Author, Age, Files, Lines]
* Move QTextCodec support out of QtCore [Karsten Heimrich, 2020-06-20, 1 file, -228/+0]
|   * Assume UTF-8 on all Unix-like systems
|   * Export some functions to be able to compile QTextCodec once moved to Qt5Compat.
|
|   Task-number: QTBUG-75665
|   Change-Id: I52ec47a848bc0ba72e9c7689668b1bcc5d736c29
|   Reviewed-by: Lars Knoll <lars.knoll@qt.io>
* Ensure the conversion methods in qstringconverter always get a valid state [Lars Knoll, 2020-05-14, 1 file, -0/+21]
|   Make sure that the conversion methods always get a valid state. This is already the case when using the new QStringConverter API; ensure the old QTextCodec API also passes in a valid state. This helps simplify the logic inside those methods.
|
|   Change-Id: I1945e98cdefd46bf1427e11984337f1d62abcaa2
|   Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
|   Reviewed-by: Alex Blasche <alexander.blasche@qt.io>
* Move the UTF conversion methods to qstringconverter [Lars Knoll, 2020-05-14, 1 file, -940/+0]
|   Separate them from the qutfcodec, so that the codec can later on be moved out of Qt Core.
|
|   Fix the QUtf methods to take qsizetype instead of int for length arguments.
|
|   This also makes it possible to not build QTextCodec into the bootstrap lib anymore.
|
|   Change-Id: I0b4f83139d61b19c651520a2f3a5012aa7e85cb8
|   Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* QtCore: use new QChar::fromUcs{2,4}() [Marc Mutz, 2020-05-12, 1 file, -7/+3]
|   Also replace one case of QChar(0) with QChar::Null.
|
|   These were errors in my local tree, which means they're included in bootstrap builds (incl. qmake).
|
|   Change-Id: I3dffa9383fd1a30aa43fe2491ad95bb2b1869b40
|   Reviewed-by: Lars Knoll <lars.knoll@qt.io>
* Remove QTextCodec dependency from QClipboard [Lars Knoll, 2020-04-21, 1 file, -0/+19]
|   QClipboard used QTextCodec to convert the raw clipboard data to a QString. HTML is nowadays always encoded as UTF-8, and we were only supporting UTF-based encodings for other text.
|
|   Add a qFromUtfEncoded() to our UTF helpers that auto-detects UTF-16 and UTF-32 byte order marks, and assumes UTF-8 otherwise, to keep this compatible with what we have been doing in Qt 5.
|
|   Change-Id: I5a9fccb67a88dff27cbbdecff9bd548d31aa1c6c
|   Reviewed-by: Simon Hausmann <simon.hausmann@qt.io>
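The detection order described above can be sketched in plain C++. This is a minimal illustration, not the Qt implementation: `DetectedUtf` and `detectUtfEncoding` are hypothetical names, and the sketch only classifies the input rather than converting it. It assumes the UTF-32 marks are tested before the UTF-16 ones, because the UTF-32LE mark begins with the same two bytes as the UTF-16LE mark.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical result type for this sketch; not part of Qt.
enum class DetectedUtf { Utf8, Utf16LE, Utf16BE, Utf32LE, Utf32BE };

// Classify a byte buffer by its leading byte order mark, assuming UTF-8 when none is found.
inline DetectedUtf detectUtfEncoding(std::string_view data)
{
    auto byteAt = [&](std::size_t i) {
        assert(i < data.size());
        return static_cast<unsigned char>(data[i]);
    };
    // UTF-32 first: its little-endian BOM (FF FE 00 00) starts with the UTF-16LE BOM (FF FE).
    if (data.size() >= 4) {
        if (byteAt(0) == 0xFF && byteAt(1) == 0xFE && byteAt(2) == 0x00 && byteAt(3) == 0x00)
            return DetectedUtf::Utf32LE;
        if (byteAt(0) == 0x00 && byteAt(1) == 0x00 && byteAt(2) == 0xFE && byteAt(3) == 0xFF)
            return DetectedUtf::Utf32BE;
    }
    if (data.size() >= 2) {
        if (byteAt(0) == 0xFF && byteAt(1) == 0xFE)
            return DetectedUtf::Utf16LE;
        if (byteAt(0) == 0xFE && byteAt(1) == 0xFF)
            return DetectedUtf::Utf16BE;
    }
    return DetectedUtf::Utf8; // no UTF-16/UTF-32 BOM: fall back to UTF-8
}
```

One consequence of this ordering is that UTF-16LE text whose first character after the BOM is U+0000 would be classified as UTF-32LE; the sketch accepts that ambiguity.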
* Only read the first BOM as a BOM; the rest are ZWNBSP! [Edward Welbourne, 2020-02-14, 1 file, -0/+1]
|   QUtf32::convertToUnicode() was forgetting to set headerdone when it dealt with the header (for contrast, QUtf16::convertToUnicode() does).
|
|   Fixes: QTBUG-62011
|   Change-Id: Ia254782ce0967a6cf9ce0e81eb06d41521150eed
|   Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
|   Reviewed-by: Lars Knoll <lars.knoll@qt.io>
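The role of the headerdone flag can be illustrated with a simplified sketch. It assumes the input has already been split into native-order code points, and the names `DecoderState` and `dropLeadingBom` are hypothetical; the real decoder also works out the byte order from the mark.

```cpp
#include <cassert>
#include <string>

// Hypothetical conversion state carried between calls; only the flag relevant here.
struct DecoderState {
    bool headerDone = false;
};

// Drop U+FEFF only if it is the very first code point of the stream.
// Any later U+FEFF is a zero-width no-break space and must be kept.
inline std::u32string dropLeadingBom(const std::u32string &chunk, DecoderState &state)
{
    std::u32string out;
    for (char32_t c : chunk) {
        if (!state.headerDone) {
            state.headerDone = true; // set whether or not a BOM was found
            if (c == U'\uFEFF')
                continue;            // the one and only BOM
        }
        out.push_back(c);
    }
    assert(out.size() <= chunk.size());
    return out;
}
```

If the flag were left unset after the header had been handled, a U+FEFF at the start of a later chunk would be stripped as if it were another BOM.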
* Be less laissez-faire with implicit conversions to QChar [Marc Mutz, 2019-07-09, 1 file, -3/+3]
|   QChar currently is convertible from nearly every integral type. This is bad code hygiene and should be fixed come Qt 6.
|
|   The present patch is the result of compile fixes from marking these constructors explicit.
|
|   As is clear from the distribution of fixes, only low-level string handling code used these implicit conversions, an indication that they're not in widespread use elsewhere.
|
|   Change-Id: Ief5336f21e6d181e03ab92893b3d13a14adc7cb0
|   Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
* Replace Q_DECL_NOEXCEPT with noexcept in corelib [Allan Sandfeld Jensen, 2019-04-03, 1 file, -2/+2]
|   In preparation of Qt6 move away from pre-C++11 macros.
|
|   Change-Id: I44126693c20c18eca5620caab4f7e746218e0ce3
|   Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Merge remote-tracking branch 'origin/5.11' into 5.12 [Liang Qi, 2018-11-09, 1 file, -2/+2]
|\
| |   Conflicts:
| |       .qmake.conf
| |       qmake/Makefile.unix
| |       src/gui/text/qtextdocument.cpp
| |       src/gui/text/qtextdocument.h
| |
| |   Change-Id: Iba26da0ecbf2aa4ff4b956391cfb373f977f88c9
| * Modernize the "textcodec" feature [Liang Qi, 2018-11-07, 1 file, -2/+2]
| |   Also clean up QTextCodec usage in qmake build and some includes of qtextcodec.h.
| |
| |   Change-Id: I0475b82690024054add4e85a8724c8ea3adcf62a
| |   Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
| |   Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@qt.io>
* | QUtf8Codec: Use one 32-byte load instead of two 16-byte ones on AVX2 [Thiago Macieira, 2018-11-08, 1 file, -1/+6]
| |   The number of instructions is the same. But if the CPU can issue 32-byte-wide loads, this will be faster. For CPUs that would do two 16-byte loads, this is no worse than current code.
| |
| |   Change-Id: I8f261579aad648fdb4f0fffd1553d060b4fc852f
| |   Reviewed-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
* | Improve the UTF-16 and UTF-32 codecs with <qendian.h> [Thiago Macieira, 2018-07-04, 1 file, -28/+12]
| |   This is just the low-hanging fruit. Those algorithms could be much further improved, but they are so seldom-used that it's not worth it.
| |
| |   Change-Id: I6a540578e810472bb455fffd15332b2a7a1ac901
| |   Reviewed-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
* | QString: insert a number of 8-character SIMD loops [Thiago Macieira, 2018-05-15, 1 file, -0/+39]
| |   We don't have _mm_cvtsi64_si128() (the REX.W expansion of MOVD [0F 6E]), but we do have _mm_loadl_epi64(), the SSE2 expansion of the MMX MOVQ at opcode 0F 7E. Ditto for _mm_cvtsi128_si64() and _mm_storel_epi64(). And those work even in 32-bit mode.
| |
| |   By doing this, we can reduce the tail unrolled loops by half, reducing code size.
| |
| |   I'm not adding these new SIMD sections to -Os builds.
| |
| |   Change-Id: Ib48364abee9f464c96c6fffd152e405310ef67be
| |   Reviewed-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
* | QUtf8: add AVX2 code for isValidUtf8 [Thiago Macieira, 2018-05-15, 1 file, -0/+22]
|/
|   Change-Id: I5d0ee9389a794d80983efffd152d2beca86c5779
|   Reviewed-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
* QUtf8: Add some UTF-8 text operation functions [Thiago Macieira, 2018-02-03, 1 file, -1/+162]
|   The first, isValidUtf8(), as the name says, returns true if the string is valid UTF-8. As a bonus, it also returns whether it's valid US-ASCII.
|
|   The other two are meant to compare a UTF-8 string to either a Latin1 one or a UTF-8 one, without memory allocation.
|
|   Change-Id: Ic38ec929fc3f4bb795dafffd150ad0d63e28cd32
|   Reviewed-by: Lars Knoll <lars.knoll@qt.io>
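A scalar sketch of a validator that reports both properties in one pass is shown below. It is not the Qt function: `Utf8Validity` and `checkUtf8` are hypothetical names, and the sketch assumes the usual well-formedness rules (no overlong forms, no surrogates, nothing above U+10FFFF).

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical result pair for this sketch.
struct Utf8Validity {
    bool isValidUtf8;
    bool isValidAscii;
};

// Validate a buffer as UTF-8 and note along the way whether every byte was below 0x80.
inline Utf8Validity checkUtf8(std::string_view s)
{
    bool ascii = true;
    std::size_t i = 0;
    while (i < s.size()) {
        const auto b = static_cast<unsigned char>(s[i]);
        if (b < 0x80) { ++i; continue; }
        ascii = false;
        std::size_t extra;  // number of continuation bytes expected
        char32_t cp;        // code point being assembled
        char32_t min;       // smallest code point allowed for this length (rejects overlong forms)
        if ((b & 0xE0) == 0xC0)      { extra = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min = 0x10000; }
        else return {false, false};  // stray continuation byte or invalid lead byte
        if (s.size() - i <= extra)
            return {false, false};   // sequence truncated at end of input
        for (std::size_t k = 1; k <= extra; ++k) {
            const auto c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80)
                return {false, false};
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return {false, false};
        i += extra + 1;
    }
    assert(i == s.size());
    return {true, ascii};
}
```

Tracking the ASCII flag costs one assignment on the non-ASCII path, which is why the second answer comes essentially for free.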
* Aarch64: vectorize ascii de-/encoding. [Erik Verbruggen, 2016-06-09, 1 file, -1/+72]
|   This works only on Aarch64, because the vaddv instruction is only available on 64-bit ARM. Doing something equivalent on 32-bit ARM has a high chance of running into micro-architecture differences: on a Cortex-A8, transferring a single vector element from NEON to the regular CPU registers takes 20 cycles(!).
|
|   Change-Id: Iccbfe84da82abb9b10f3f3dc35c8b950df69e251
|   Reviewed-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
* Remove _bit_scan_{forward,reverse} [Erik Verbruggen, 2016-05-31, 1 file, -3/+13]
|   Use qCountTrailingZeroBits and qCountLeadingZeroBits from qalgorithms.h instead. Also extended these versions for MSVC.
|
|   The _bit_scan_* versions stem from a time before the glorious days of qalgorithms.h. A big advantage is that these functions can be used on all platforms.
|
|   Change-Id: I5a1b886371520310a7fe16e617635ea335046beb
|   Reviewed-by: Simon Hausmann <simon.hausmann@qt.io>
* Merge "Merge remote-tracking branch 'origin/5.6' into dev" into refs/staging/dev [Liang Qi, 2016-01-26, 1 file, -3/+6]
|\
| * Merge remote-tracking branch 'origin/5.6' into dev [Liang Qi, 2016-01-21, 1 file, -3/+6]
| |\
| | |   Conflicts:
| | |       src/corelib/io/qiodevice_p.h
| | |       src/corelib/kernel/qvariant_p.h
| | |       src/corelib/tools/qsimd.cpp
| | |       src/gui/kernel/qguiapplication.cpp
| | |       tests/auto/network/socket/qtcpsocket/tst_qtcpsocket.cpp
| | |
| | |   Change-Id: I742a093cbb231b282b43e463ec67173e0d29f57a
| | * Merge remote-tracking branch 'origin/5.5' into 5.6 [Liang Qi, 2016-01-19, 1 file, -3/+6]
| | |\
| | | |   Conflicts:
| | | |       config.tests/common/atomic64/atomic64.cpp
| | | |       configure
| | | |       src/3rdparty/forkfd/forkfd.c
| | | |       src/corelib/io/forkfd_qt.cpp
| | | |       src/widgets/kernel/qwidgetwindow.cpp
| | | |       tests/auto/corelib/statemachine/qstatemachine/tst_qstatemachine.cpp
| | | |       tests/auto/network/socket/qtcpsocket/tst_qtcpsocket.cpp
| | | |       tests/auto/widgets/widgets/qcombobox/tst_qcombobox.cpp
| | | |       tools/configure/configureapp.cpp
| | | |
| | | |   Change-Id: Ic6168d82e51a0ef1862c3a63bee6722e8f138414
| | | * Fix utf8->utf16 BOM/ZWNBSP decoding. [Erik Verbruggen, 2015-12-21, 1 file, -3/+6]
| | | |   When the byte sequence for a BOM occurs in the middle of a utf8 stream, it is a ZWNBSP. When a ZWNBSP occurs in the middle of a utf8 character sequence, and the SIMD conversion does some work (meaning: the input is at least 16 characters long), it would not recognize the fact that some characters were already decoded. So the conversion would then strip the ZWNBSP out, thinking it's a BOM.
| | | |
| | | |   The non-SIMD conversion did not have this problem: the very first character conversion would already set the headerdone flag.
| | | |
| | | |   Change-Id: I39aacf607e2e068107106254021a8042d164f628
| | | |   Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* | | | Update the Intel copyright year [Thiago Macieira, 2016-01-21, 1 file, -1/+1]
|/ / /
| | |   Not that we require it, but since The Qt Company did it for all files they have copyright, even if they haven't touched the file in years (especially not in 2016), I'm doing the same.
| | |
| | |   Change-Id: I7a9e11d7b64a4cc78e24ffff142b4c9d53039846
| | |   Reviewed-by: Lars Knoll <lars.knoll@theqtcompany.com>
* | | Updated license headers [Jani Heikkinen, 2016-01-15, 1 file, -14/+20]
| | |   From Qt 5.7 -> LGPL v2.1 isn't an option anymore, see http://blog.qt.io/blog/2016/01/13/new-agreement-with-the-kde-free-qt-foundation/
| | |
| | |   Updated license headers to use new LGPL header instead of LGPL21 one (in those files which will be under LGPL v3)
| | |
| | |   Change-Id: I046ec3e47b1876cd7b4b0353a576b352e3a946d9
| | |   Reviewed-by: Lars Knoll <lars.knoll@theqtcompany.com>
* | | Add a QUtf8::convertToUnicode() overload that operates on an existing buffer [Marc Mutz, 2015-11-19, 1 file, -3/+26]
|/ /
| |   ... and therefore doesn't need to allocate and thus can be marked as nothrow.
| |
| |   Technically, the function doesn't have a wide contract, because it has the precondition that the buffer pointed to by the first argument needs to be large enough to hold the result. But that precondition can't be checked from within the function, so no failure can be generated for it and thus the nothrow guarantee is acceptable (and desirable).
| |
| |   Change-Id: Iaf6ea6788ef6b4bbb6d8de59a3d0b14d66582307
| |   Reviewed-by: Olivier Goffart (Woboq GmbH) <ogoffart@woboq.com>
| |   Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* / QUtf8: remove an unused variable [Marc Mutz, 2015-11-05, 1 file, -2/+1]
|/
|   'need' was never anything but zero, so drop it.
|
|   Change-Id: I4b52107afc7ed47c19ae1942cef0c92cbd0e1a36
|   Reviewed-by: Olivier Goffart (Woboq GmbH) <ogoffart@woboq.com>
|   Reviewed-by: Jędrzej Nowacki <jedrzej.nowacki@theqtcompany.com>
* QtCore: Fix const correctness in old style casts [Thiago Macieira, 2015-07-20, 1 file, -6/+6]
|   Found with GCC's -Wcast-qual.
|
|   Change-Id: Ia0aac2f09e9245339951ffff13c8d4b2920a11fb
|   Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Update copyright headers [Jani Heikkinen, 2015-02-11, 1 file, -7/+7]
|   Qt copyrights are now in The Qt Company, so we could update the source code headers accordingly. In the same go we should also fix the links to point to qt.io.
|
|   Outdated header.LGPL removed (use header.LGPL21 instead)
|
|   Old header.LGPL3 renamed to header.LGPL3-COMM to match actual licensing combination. New header.LGPL-COMM taken into use in the file that was using the old header.LGPL3 (src/plugins/platforms/android/extract.cpp)
|
|   Added new header.LGPL3 containing Commercial + LGPLv3 + GPLv2 license combination
|
|   Change-Id: I6f49b819a8a20cc4f88b794a8f6726d975e8ffbe
|   Reviewed-by: Matti Paaso <matti.paaso@theqtcompany.com>
* Update license headers and add new license files [Matti Paaso, 2014-09-24, 1 file, -19/+11]
|   - Renamed LICENSE.LGPL to LICENSE.LGPLv21
|   - Added LICENSE.LGPLv3
|   - Removed LICENSE.GPL
|
|   Change-Id: Iec3406e3eb3f133be549092015cefe33d259a3f2
|   Reviewed-by: Iikka Eklund <iikka.eklund@digia.com>
* Merge remote-tracking branch 'origin/stable' into dev [J-P Nurmi, 2014-06-05, 1 file, -11/+8]
|\
| |   Conflicts:
| |       mkspecs/features/qt.prf
| |       src/plugins/platforms/xcb/qxcbwindow.h
| |       src/tools/qdoc/qdocindexfiles.cpp
| |       src/widgets/kernel/qwidget_qpa.cpp
| |
| |   Change-Id: I214f57b03bc2ff86cf3b7dfe2966168af93a5a67
| * UTF-8: always store the SIMD result, even if invalid [Thiago Macieira, 2014-05-27, 1 file, -11/+8]
| |   For ASCII content, this improves the throughput because the conditional is no longer on the codepath to storing, so the processor can perform the store at the same time as it's doing the movemask operation. However, the gain is mostly theoretical: benchmarking with mostly ASCII content shows the algorithm running within 0.5% of the previous result (which is noise).
| |
| |   For non-ASCII content, we're comparing the cost of doing a 16-byte store (which may be completely overwritten) with the loop copying and shifting left. Benchmarking shows a slight gain of a few percent.
| |
| |   Change-Id: I28ef0021dffc725a922c539cc5976db367f36e78
| |   Reviewed-by: Allan Sandfeld Jensen <allan.jensen@digia.com>
* | Merge remote-tracking branch 'origin/stable' into dev [Simon Hausmann, 2014-05-22, 1 file, -8/+36]
|\|
| |   Change-Id: Ia36e93771066d8abcf8123dbe2362c5c9d9260fc
| * Fix stateful handling of invalid UTF-8 straddling buffer borders [Thiago Macieira, 2014-05-13, 1 file, -8/+36]
| |   When a UTF-8 sequence is too short, QUtf8Functions::fromUtf8 returns EndOfString. If the decoder is stateful, we must save the state and then restart it when more data is supplied. The new stateful decoder (8dd47e34b9b96ac27a99cdcf10b8aec506882fc2) mishandled the Error case by advancing the src pointer by a negative number, thus causing a buffer overflow (the issue of the task).
| |
| |   And it also did not handle the len == 0 case properly, though neither did the older decoder.
| |
| |   Task-number: QTBUG-38939
| |   Change-Id: Ie03d7c55a04e51ee838ccdb3a01e5b989d8e67aa
| |   Reviewed-by: Kai Koehne <kai.koehne@digia.com>
| |   Reviewed-by: Lars Knoll <lars.knoll@digia.com>
* | Improve a few string operations with AVX2 [Thiago Macieira, 2014-05-21, 1 file, -15/+32]
|/
|   AVX2 brings the new PMOVZXBW instruction that extends from one 128-bit SSE register to a 256-bit AVX register. With that, the main decoding code is just two instructions (the loop requires a couple more to maintain the offset counter and do the end-of-loop check).
|
|   This buys us another 4% performance improvement in the fromLatin1 code, calculated on top of the VEX-encoded SSE2 code (which is already a little better than plain SSE2).
|
|   Change-Id: I675fa24de4fa97683b662f19d146047251f77359
|   Reviewed-by: Allan Sandfeld Jensen <allan.jensen@digia.com>
* Restore handling of BOMs in QString::fromUtf8 [Thiago Macieira, 2014-04-24, 1 file, -15/+29]
|   8dd47e34b9b96ac27a99cdcf10b8aec506882fc2 removed the handling of the BOMs but did not document it. This brings the behavior back and adds a unit test so we don't break it again.
|
|   Discussed-on: http://lists.qt-project.org/pipermail/development/2014-April/016532.html
|   Change-Id: Ifb7a9a6e5a494622f46b8ab435e1d168b862d952
|   Reviewed-by: Olivier Goffart <ogoffart@woboq.com>
|   Reviewed-by: Lars Knoll <lars.knoll@digia.com>
* Fix off-by-one error: the next ASCII character is the next one [Thiago Macieira, 2014-02-22, 1 file, -2/+2]
|   The bit scan function returns the index of the last non-ASCII character. The next ASCII is the one after this.
|
|   This means all the benchmarks were made while reentering the SIMD loop uselessly...
|
|   Change-Id: If7de485a63428bfa36d413049d9239ddda1986aa
|   Reviewed-by: Lars Knoll <lars.knoll@digia.com>
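The index arithmetic behind this fix can be sketched without SIMD intrinsics. The sketch assumes a 16-bit mask in which bit i is set when character i of a 16-character group is non-ASCII; `lastNonAsciiIndex` and `nextAsciiRunStart` are hypothetical names, and a portable loop stands in for the hardware bit scan.

```cpp
#include <cassert>
#include <cstdint>

// Index of the highest set bit, i.e. the position of the last non-ASCII
// character in the group. The mask must be non-zero.
inline unsigned lastNonAsciiIndex(std::uint16_t mask)
{
    assert(mask != 0);
    unsigned index = 0;
    while (mask >>= 1)
        ++index;
    return index;
}

// Position at which the fast ASCII path can resume: one past the last
// non-ASCII character, not the position of that character itself.
inline unsigned nextAsciiRunStart(std::uint16_t mask)
{
    return lastNonAsciiIndex(mask) + 1;
}
```

Resuming at the scanned index itself would hand the fast path a character it is known to reject, so the vector loop would be entered and left again immediately.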
* QUtfCodec: don't encode invalid UCS-4 codepoints [Giuseppe D'Angelo, 2014-02-07, 1 file, -8/+9]
|   The code didn't check for malformed surrogate pairs. That means that
|   - high surrogates followed by *anything* were decoded as if they formed a valid surrogate pair;
|   - stray low surrogates were returned as-is.
|
|   We can't return surrogate values in UCS-4, so properly detect these cases and return U+FFFD instead.
|
|   [ChangeLog][QtCore][QTextCodec] Encoding a QString in UTF-32 will now replace malformed UTF-16 subsequences in the string with the Unicode replacement character (U+FFFD).
|
|   Change-Id: I5cd771d6aa21ffeff4dd9d9e5a7961cf692dc457
|   Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
|   Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
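The two malformed cases listed above can be illustrated with a standalone conversion loop. This is a sketch under the assumption that each unpaired surrogate becomes exactly one U+FFFD; `utf16ToUcs4` is a hypothetical name, and byte order handling is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Convert UTF-16 code units to UCS-4, replacing unpaired surrogates with U+FFFD.
inline std::u32string utf16ToUcs4(std::u16string_view in)
{
    auto isHigh = [](char16_t u) { return u >= 0xD800 && u <= 0xDBFF; };
    auto isLow  = [](char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; };

    std::u32string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const char16_t u = in[i];
        if (isHigh(u)) {
            if (i + 1 < in.size() && isLow(in[i + 1])) {
                // Well-formed pair: combine both halves into one code point.
                out.push_back(0x10000 + ((char32_t(u) - 0xD800) << 10)
                              + (char32_t(in[i + 1]) - 0xDC00));
                ++i;
            } else {
                out.push_back(0xFFFD); // high surrogate not followed by a low one
            }
        } else if (isLow(u)) {
            out.push_back(0xFFFD);     // stray low surrogate
        } else {
            out.push_back(u);          // ordinary BMP code unit
        }
    }
    assert(out.size() <= in.size());
    return out;
}
```

The character after an unpaired high surrogate is not consumed, so it is still converted on its own in the next iteration.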
* Add support for UTF-8 encoding/decoding with SIMD [Thiago Macieira, 2014-01-31, 1 file, -15/+129]
|   Decoding from UTF-8 is easy: if the high bit is set, we fall back to the byte-by-byte decoding. Encoding to UTF-8 requires a little bit more work: to detect anything between 0x0080 and 0xffff, we have several options but none as easy as above. Multiple alternatives are in the benchmark code.
|
|   In both loops, we do two things once we run into a non-ASCII character: first, we continue the loop for the remainder of ASCII characters in the buffer (which we can tell by checking the bits set in the mask), then we find the last non-ASCII character in that 16-character group, so we don't reenter the SSE code too soon.
|
|   For the UTF-8 encoding, I have chosen the alternative that results in the best performance. It's closely tied to the alternative running the PMIN instruction, but that requires SSE 4.1. It's not worth the complexity. And quite counter-intuitively, the dedicated string instruction from SSE 4.2 performs most poorly of all solutions. This begs re-visiting the performance of the toLatin1 encoder.
|
|   The best of 10 benchmark runs of this code were measured on my SandyBridge CPU @ 2.66 GHz (turbo @ 3.3 GHz), both as CPU cycles and as CPU ticks:
|
|   Compared to:        ICU               Qt 4.7          non-SSE Qt 5.3
|   Data set        fromUtf8  toUtf8  fromUtf8  toUtf8  fromUtf8  toUtf8
|   ASCII only        7.50x    6.22x    6.94x    7.60x    4.45x    4.90x
|   2-char UTF-8      1.17x    1.33x    1.64x    1.56x    1.01x    1.02x
|   3-char UTF-8      1.08x    1.18x    1.48x    1.33x    0.97x    0.92x
|   4-char UTF-8      1.05x    1.19x    1.20x    1.21x    0.97x    0.97x
|   Creator data      3.62x    2.16x    2.60x    1.25x    1.78x    1.23x
|
|   As shown by the numbers, the SSE-based code is slightly worse than the non-SSE code for dense non-ASCII strings. However, as evident in the Qt Creator data, most strings manipulated by applications are either pure ASCII or mostly so, so there's a net gain.
|
|   Done-with: H. Peter Anvin <hpa@linux.intel.com>
|   Change-Id: Ia74fbdfdcd7b088f6cba5048c03a153c01f5dbc1
|   Reviewed-by: Lars Knoll <lars.knoll@digia.com>
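The "high bit set means leave the fast path" test for decoding can be shown with a portable stand-in. Note that this sketch swaps the technique: instead of 16-byte SSE2 registers and a movemask, it tests 8 bytes at a time with an ordinary 64-bit integer, and it only measures the ASCII prefix rather than decoding. `asciiPrefixLength` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>

// Length of the leading run of bytes below 0x80.
inline std::size_t asciiPrefixLength(std::string_view s)
{
    constexpr std::uint64_t highBits = 0x8080808080808080ULL;
    std::size_t i = 0;
    // Wide step: a whole 8-byte group is ASCII if no byte has its top bit set.
    while (s.size() - i >= 8) {
        std::uint64_t word;
        std::memcpy(&word, s.data() + i, sizeof word); // alignment-safe load
        if (word & highBits)
            break;              // some byte is non-ASCII: leave the fast path
        i += 8;
    }
    // Byte-by-byte step for the tail or for the group that contained a non-ASCII byte.
    while (i < s.size() && static_cast<unsigned char>(s[i]) < 0x80)
        ++i;
    assert(i <= s.size());
    return i;
}
```

A decoder built on this would copy the prefix as-is, handle one multi-byte sequence with the scalar decoder, and then try the wide step again.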
* Add a new UTF-8 decoder, similar to the encoder we've just added [Thiago Macieira, 2014-01-09, 1 file, -86/+89]
|   Like before, this is taken from the existing QUrl code and is optimized for ASCII handling (for the same reasons). And like previously, make QString::fromUtf8 use a stateless version of the codec, which is faster.
|
|   There's a small change in behavior in the decoding: we insert a U+FFFD for each byte that cannot be decoded properly. Previously, it would "eat" all bad high-bit bytes and replace them all with one single U+FFFD. Either behavior is allowed by the UTF-8 specifications, even though this new behavior will cause misalignment in the Bradley Kuhn sample UTF-8 text.
|
|   Change-Id: Ib1b1f0b4291293bab345acaf376e00204ed87565
|   Reviewed-by: Olivier Goffart <ogoffart@woboq.com>
|   Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
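The "one U+FFFD per undecodable byte" policy can be sketched with a small scalar decoder. This is an illustration under assumptions, not the Qt code: `decodeOne` and `decodeLossy` are hypothetical names, and on any failure the sketch emits a single replacement character and advances by exactly one byte.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Try to decode one UTF-8 sequence starting at s[i].
// Returns the sequence length on success (code point in cp), or 0 on failure.
inline std::size_t decodeOne(std::string_view s, std::size_t i, char32_t &cp)
{
    const auto b = static_cast<unsigned char>(s[i]);
    std::size_t len;
    char32_t min; // smallest code point allowed for this length
    if (b < 0x80) { cp = b; return 1; }
    if ((b & 0xE0) == 0xC0)      { len = 2; cp = b & 0x1F; min = 0x80; }
    else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; min = 0x800; }
    else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; min = 0x10000; }
    else return 0;                         // stray continuation or invalid lead byte
    if (s.size() - i < len)
        return 0;                          // truncated sequence
    for (std::size_t k = 1; k < len; ++k) {
        const auto c = static_cast<unsigned char>(s[i + k]);
        if ((c & 0xC0) != 0x80)
            return 0;
        cp = (cp << 6) | (c & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                          // overlong, out of range, or surrogate
    return len;
}

// Decode a whole buffer, emitting one U+FFFD for every byte that cannot be decoded.
inline std::u32string decodeLossy(std::string_view s)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < s.size()) {
        char32_t cp = 0;
        const std::size_t len = decodeOne(s, i, cp);
        if (len == 0) {
            out.push_back(0xFFFD);
            i += 1;                        // skip only the offending byte
        } else {
            out.push_back(cp);
            i += len;
        }
    }
    assert(i == s.size());
    return out;
}
```

Under the older policy described above, a run of three stray continuation bytes would collapse into a single U+FFFD; with this sketch it yields three.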
* Add a new UTF-8 encoder and use it from QString [Thiago Macieira, 2014-01-09, 1 file, -45/+41]
|   This is a new and faster UTF-8 encoder, based on the code from QUrl. This code specializes for ASCII, which is the most common case anyway, especially since QString's "ascii" mode is actually UTF-8 now.
|
|   In addition, make QString::toUtf8 use a stateless encoder. Stateless means that the function doesn't handle state between calls in the form of QTextCodec::ConverterState. This allows it to be faster than otherwise.
|
|   The new code is in the form of a template so that it can be used from QJsonDocument and QUrl, which have small modifications to how the encoding is handled.
|
|   Change-Id: I305ee0fd8523cc4ec74c2678cb9ea88b75bac7ac
|   Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Allow non-character codes in utf8 strings [Kurt Pattyn, 2013-10-17, 1 file, -11/+2]
|   Changed the processing of non-character code handling in the UTF8 codec. Non-character codes are now accepted in QStrings, QUrls and QJson strings. Unit tests were adapted accordingly. For more info about non-character codes, see: http://www.unicode.org/versions/corrigendum9.html
|
|   [ChangeLog][QtCore][QUtf8] UTF-8 now accepts non-character unicode points; these are not replaced by the replacement character anymore
|
|   [ChangeLog][QtCore][QUrl] QUrl now fully accepts non-character unicode points; they are encoded as percent characters; they can also be pretty decoded
|
|   [ChangeLog][QtCore][QJson] The Writer and the Parser now fully accept non-character unicode points.
|
|   Change-Id: I77cf4f0e6210741eac8082912a0b6118eced4f77
|   Task-number: QTBUG-33229
|   Reviewed-by: Lars Knoll <lars.knoll@digia.com>
|   Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Update copyright year in Digia's license headers [Sergio Ahumada, 2013-01-18, 1 file, -1/+1]
|   Change-Id: Ic804938fc352291d011800d21e549c10acac66fb
|   Reviewed-by: Lars Knoll <lars.knoll@digia.com>
* Change copyrights from Nokia to Digia [Iikka Eklund, 2012-09-22, 1 file, -24/+24]
|   Change copyrights and license headers from Nokia to Digia
|
|   Change-Id: If1cc974286d29fd01ec6c19dd4719a67f4c3f00e
|   Reviewed-by: Lars Knoll <lars.knoll@digia.com>
|   Reviewed-by: Sergio Ahumada <sergio.ahumada@digia.com>
* QChar: add isSurrogate() and isNonCharacter() to the public API [Konstantin Ritt, 2012-05-16, 1 file, -4/+3]
|   + QChar::LastValidCodePoint enum value that supersedes the UNICODE_LAST_CODEPOINT macro
|
|   replace uses of hardcoded values with the new API; remove leftovers
|
|   Change-Id: I1395c9840b85fcb6b08e241b131794a98773c952
|   Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* add some useful methods to QUnicodeTables:: [Konstantin Ritt, 2012-05-10, 1 file, -15/+3]
|   in order to reduce code duplication and prepare the ground for upcoming changes
|
|   Change-Id: I980244149f65384c9484bbec4682de8b7b848b08
|   Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
* replace hardcoded values with surrogate handling methods [Konstantin Ritt, 2012-04-13, 1 file, -3/+3]
|   Change-Id: Ib41e08d835f2e8ca2e32b4025c6f5a99840f2e27
|   Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@nokia.com>
|   Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
* fix QUtf8 codec to disallow codes in range [U+fdd0..U+fdef] [Konstantin Ritt, 2012-04-11, 1 file, -1/+1]
|   0xfdef-0xfdd0 is definitely 31 and not 15 :)
|
|   also fix all copy-pastes of this code (grepping for '0xfdd0' helps ;)
|
|   Change-Id: I8f3bd4fd9d85f9de066f0f5df378b9188c12bd48
|   Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
|   Reviewed-by: Denis Dzyubenko <denis.dzyubenko@nokia.com>
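The arithmetic in this message can be checked with a small range test. The sketch below is a hypothetical `isNonCharacter` helper, not the code the commit touched; it assumes the check is written as a single unsigned subtraction and comparison, which is the form in which a wrong bound of 15 would silently miss the upper half of the block.

```cpp
#include <cassert>
#include <cstdint>

// The contiguous noncharacter block U+FDD0..U+FDEF holds 32 code points,
// so the largest valid offset from U+FDD0 is 31.
static_assert(0xFDEF - 0xFDD0 == 31, "offset of the last code point in the block");

// True for the U+FDD0..U+FDEF block and for the last two code points of every plane.
constexpr bool isNonCharacter(char32_t c)
{
    // Unsigned subtraction wraps for c < 0xFDD0, so one comparison covers both bounds.
    return (static_cast<std::uint32_t>(c) - 0xFDD0u <= 0xFDEFu - 0xFDD0u)
        || (c <= 0x10FFFF && (c & 0xFFFE) == 0xFFFE);
}

// Usage example: the boundaries of the block, plus U+FDE0, which an upper
// bound of 15 would have let through.
inline void checkBlockBoundaries()
{
    assert(!isNonCharacter(0xFDCF));
    assert(isNonCharacter(0xFDD0));
    assert(isNonCharacter(0xFDE0));
    assert(isNonCharacter(0xFDEF));
    assert(!isNonCharacter(0xFDF0));
}
```

A later entry in this log (2013-10-17) changes the codec to accept noncharacters, so a predicate like this describes the behaviour at the time of this commit only.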
* Update contact information in license headers. [Jason McDonald, 2012-01-23, 1 file, -1/+1]
|   Replace Nokia contact email address with Qt Project website.
|
|   Change-Id: I431bbbf76d7c27d8b502f87947675c116994c415
|   Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
* Update copyright year in license headers. [Jason McDonald, 2012-01-05, 1 file, -1/+1]
|   Change-Id: I02f2c620296fcd91d4967d58767ea33fc4e1e7dc
|   Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
* Remove duplicate check in utf endian detection [Kent Hansen, 2011-10-06, 1 file, -14/+12]
|   This was excessive paranoia.
|
|   Task-number: QTBUG-20482
|   Change-Id: Ia0c76651773e12f25ec5d62675d6f317b8d2df13
|   Reviewed-on: http://codereview.qt-project.org/6045
|   Reviewed-by: Qt Sanity Bot <qt_sanity_bot@ovi.com>
|   Reviewed-by: Jędrzej Nowacki <jedrzej.nowacki@nokia.com>
|   Reviewed-by: Lars Knoll <lars.knoll@nokia.com>