path: root/util/unicode
Commit message | Author | Age | Files | Lines
* Clean-up the Unicode tables generator code and the generated header | Konstantin Ritt | 2012-06-22 | 1 | -432/+435
  This fixes the blocks and memory consumption reports, the whitespace issues and makes the code a bit cleaner.
  Since I'm the only one who does change this code, such a no-op commit could not hurt anyone or even git blame ;)
  Change-Id: Ib069f925a3791c82e16c368c8392bcffbfd68c53
  Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
  Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
* Make QUnicodeTables::script() support SMP code points | Konstantin Ritt | 2012-06-14 | 3 | -277/+145
  Instead of expanding the scripts table with script values for the code points >= 0x10000, it has been merged with the properties table in order to increase performance of the script itemization code (not affected yet).
  (Stats: the properties table grew by 97428 - 89800 = 7628 bytes; the old scripts table was 7680 bytes.)
  The outdated ScriptsInitial.txt and ScriptsCorrections.txt files have been removed (they were just empty; the "corrigendum" script corrections should be applied to Scripts.txt directly, *no customization allowed*!).
  More script test cases have been added, at least one per supported script.
  Task-number: QTBUG-6530
  Change-Id: I40a9e76f681e2dd552fd4c61af0808d043962e79
  Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
* Line Breaking Algorithm: handle the Object Replacement Character | Konstantin Ritt | 2012-06-10 | 1 | -7/+6
  See http://www.unicode.org/reports/tr14/#CB and http://www.unicode.org/reports/tr14/#LB20 for details
  Change-Id: Ice0aa2b2ce81f6e39839a353240420436eddd754
  Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
* Update the Unicode data files up to v6.1.0 | Konstantin Ritt | 2012-06-10 | 14 | -1317/+24169
  Change-Id: I20b94634b1f4ebff10757c2348cfdbbd906e8797
  Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
* Update the qunicodetables generator to deal with UCD 6.1 files | Konstantin Ritt | 2012-06-10 | 1 | -34/+92
  Change-Id: If22018ff83cfc6b9c984f689648da038fce11d84
  Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
* Move ScriptSentinel enum from header to .cpp | Konstantin Ritt | 2012-05-25 | 1 | -4/+4
  Change-Id: Ic74e8e2471e92aa2014735f6ab0bb4f3b88de206
  Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
* QChar: add isSurrogate() and isNonCharacter() to the public API | Konstantin Ritt | 2012-05-16 | 1 | -25/+6
  + QChar::LastValidCodePoint enum value that supersedes the UNICODE_LAST_CODEPOINT macro;
  replace uses of hardcoded values with the new API; remove leftovers
  Change-Id: I1395c9840b85fcb6b08e241b131794a98773c952
  Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
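The two predicates named in this commit can be illustrated with a standalone sketch. This is not the QChar implementation; the function names and the `kLastValidCodePoint` constant below are hypothetical stand-ins, and the ranges are the ones the Unicode Standard defines for surrogates and noncharacters.

```cpp
// Standalone sketch; names are hypothetical and this is not Qt code.
constexpr char32_t kLastValidCodePoint = 0x10FFFF;

// Surrogate code points occupy U+D800..U+DFFF.
constexpr bool isSurrogateCodePoint(char32_t c)
{
    return c >= 0xD800 && c <= 0xDFFF;
}

// Noncharacters are U+FDD0..U+FDEF plus the last two code points
// of every plane (U+xxFFFE and U+xxFFFF).
constexpr bool isNonCharacterCodePoint(char32_t c)
{
    return (c >= 0xFDD0 && c <= 0xFDEF)
        || (c <= kLastValidCodePoint && (c & 0xFFFE) == 0xFFFE);
}
```

Both checks are plain range or mask tests, which is what makes them natural replacements for hardcoded values scattered through calling code.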
* significant unicodetables generator performance optimization | Konstantin Ritt | 2012-05-11 | 1 | -41/+47
  since the entire range of valid Unicode code points is in use, QHash is suboptimal and could be replaced with QList;
  taking the value by ref and not inserting it back to the map + not calculating the default value over and over gains us up to 60% performance boost!
  Change-Id: I48c54a8e88472cf76c79c0aac44e65eeefa44861
  Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
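The idea behind this optimization can be sketched without Qt. The example below uses `std::vector` in place of QList, and the `Props` struct, `makeDenseTable()` and `propsFor()` are hypothetical names, not the generator's. It assumes only what the message states: every key in the code point range is used, so a dense array indexed by code point can replace a hash, and entries can be modified through a reference instead of being copied out and reinserted.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-code-point record; the real generator stores far more.
struct Props {
    int category = 0;
    int script = 0;
};

constexpr std::uint32_t kLastCodePoint = 0x10FFFF;

// One default-constructed entry per code point: the default value is
// built once here rather than recomputed on every missing-key lookup.
inline std::vector<Props> makeDenseTable()
{
    return std::vector<Props>(kLastCodePoint + 1);
}

// Returns a reference so callers update the entry in place,
// with no copy-out and reinsert step.
inline Props &propsFor(std::vector<Props> &table, std::uint32_t codePoint)
{
    assert(codePoint <= kLastCodePoint);
    return table[codePoint];
}
```

A caller would write `propsFor(table, 0x41).category = 3;` where a hash-based version would fetch a copy, change it, and insert it back.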
* add some useful methods to QUnicodeTables:: | Konstantin Ritt | 2012-05-10 | 1 | -1/+42
  in order to reduce code duplication and prepare the ground for upcoming changes
  Change-Id: I980244149f65384c9484bbec4682de8b7b848b08
  Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
* add support for non-BMP ligatures | Konstantin Ritt | 2012-05-04 | 1 | -14/+84
  > http://www.unicode.org/versions/Unicode5.2.0/
  D. Character Additions:
  There are three new characters in the newly-encoded Kaithi script that will require changes in implementations which make hard-coded assumptions about composition during normalization. Most new characters added to the standard with decompositions cannot be generated by the operations toNFC() or toNFKC(), but these three can. Implementers should check their code carefully to ensure that it handles these three characters correctly.
    U+1109A KAITHI LETTER DDDHA
    U+1109C KAITHI LETTER RHA
    U+110AB KAITHI LETTER VA
  UCD 6.1 adds two more of them:
    U+1112E CHAKMA VOWEL SIGN O
    U+1112F CHAKMA VOWEL SIGN AU
  Change-Id: I781a26848078d8b83a182b0fd4e681be2a6d9a27
  Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
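The characters listed above all lie outside the BMP, so in UTF-16 each one takes a surrogate pair rather than a single 16-bit unit. A minimal sketch of that encoding follows, with hypothetical helper names. It shows only the standard UTF-16 arithmetic, not how the generator or the normalization code stores ligatures.

```cpp
// Standalone sketch; helper names are hypothetical.
constexpr bool requiresSurrogates(char32_t c)
{
    return c >= 0x10000;
}

// High (leading) surrogate: top 10 bits of (c - 0x10000), offset by 0xD800.
// (c >> 10) + 0xD7C0 is the same computation folded into one step.
constexpr char16_t highSurrogate(char32_t c)
{
    return static_cast<char16_t>((c >> 10) + 0xD7C0);
}

// Low (trailing) surrogate: bottom 10 bits, offset by 0xDC00.
constexpr char16_t lowSurrogate(char32_t c)
{
    return static_cast<char16_t>((c & 0x3FF) + 0xDC00);
}
```

For U+1109A KAITHI LETTER DDDHA this gives the pair 0xD804 0xDC9A, which is why a composition table keyed on single 16-bit values cannot represent it.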
* qunicodetables generator: improve the output and the generated code | Konstantin Ritt | 2012-04-24 | 1 | -90/+109
  better memory usage report; additional asserts for conditions the implementation depends on; a namespace for the internal static data; styling fixes
  Change-Id: Id4048ff6104c56b5f590f9ac6fbf7c0bce79ec47
  Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
  Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
* workaround issue where casing diff overflows signed short | Konstantin Ritt | 2012-04-24 | 1 | -17/+41
  two such code points were added in Unicode 5.1:
    U+1D79 LATIN SMALL LETTER INSULAR G
    U+A77D LATIN CAPITAL LETTER INSULAR G
  two more were added in Unicode 6.0:
    U+0265 LATIN SMALL LETTER TURNED H
    U+A78D LATIN CAPITAL LETTER TURNED H
  and two more were added in Unicode 6.1:
    U+0266 LATIN SMALL LETTER H WITH HOOK
    U+A7AA LATIN CAPITAL LETTER H WITH HOOK
  we map them like special cases with length == 1
  (note: all are in BMP, which is checked explicitly in the generator)
  Change-Id: I8a34164eb3ee2e575b7799cc12d4b96ad5bcd9c6
  Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
  Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
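The overflow is easy to check by hand: 0xA77D - 0x1D79 = 0x8A04 = 35332, which exceeds 32767, the largest value a signed 16-bit field can hold. The sketch below performs that check for any pair; `caseDiff()` and `caseDiffFitsInShort()` are hypothetical helpers, not functions from the generator.

```cpp
#include <cstdint>

// Signed distance from a code point to its case mapping.
constexpr std::int32_t caseDiff(char32_t from, char32_t to)
{
    return static_cast<std::int32_t>(to) - static_cast<std::int32_t>(from);
}

// True when that distance can be stored in a signed 16-bit field.
constexpr bool caseDiffFitsInShort(char32_t from, char32_t to)
{
    return caseDiff(from, to) >= -32768 && caseDiff(from, to) <= 32767;
}
```

All three pairs listed in the message fail this check in both directions, while an ordinary pair such as U+0041/U+0061 (difference 32) passes.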
* UCD-5.0: apply Corrigendum #6 | Konstantin Ritt | 2012-04-15 | 2 | -25/+14
  http://unicode.org/versions/corrigendum6.html:
  > in Unicode 5.0, the list of characters with the Bidi_Mirrored property
  > was made consistent for brackets and quotation marks, in preparation for
  > new constraints on bidi mirroring. However, after publication of
  > Unicode 5.0.0 it was discovered that this change adversely affected
  > several quotation mark characters in deployed data.
  Task-number: QTBUG-25169
  Change-Id: Id49caf401af2d5a1e6dbcc32b2f350aa20b7f901
  Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
* replace hardcoded values with surrogate handling methods | Konstantin Ritt | 2012-04-11 | 1 | -10/+10
  Change-Id: Iba079953c46a29404232d2dacbe0c90170097d51
  Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@nokia.com>
  Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
* minor improvement for NormalizationCorrections | Konstantin Ritt | 2012-04-11 | 1 | -2/+5
  don't hardcode the latest affected version value; simply use the one parsed from NormalizationCorrections.txt
  Change-Id: I37021e8238d77deada4c5ba7a2d160c87186b9dd
  Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
* clean up qmake-generated projects | Oswald Buddenhagen | 2012-02-24 | 1 | -2/+0
  remove "header" and assignments which are defaults or bogus, reorder some assignments.
  Change-Id: I67403872168c890ca3b696753ceb01c605d19be7
  Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
* optimize QString::toLower()/toUpper() for special cases, step 2 | Konstantin Ritt | 2012-02-21 | 1 | -3/+7
  from now, QUnicodeTables::specialCaseMap[] starts with a placeholder;
  so, if somethingCaseSpecial is true, then somethingCaseDiff is always greater than 0
  Change-Id: Ibb1870512836eee71b1521564c0745096c05b2f9
  Merge-request: 70
  Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@nokia.com>
  Reviewed-by: Olivier Goffart <ogoffart@woboq.com>
* optimize QString::toLower()/toUpper() for special cases, step 1 | Konstantin Ritt | 2012-02-21 | 1 | -18/+31
  reorganize QUnicodeTables::specialCaseMap as follows:
  specialCaseMap contains sequence entries in form { length, a, b, .. }
  Change-Id: Iea1f80bc2f4dc1f505428dad981cde26daaa52c7
  Merge-request: 70
  Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@nokia.com>
  Reviewed-by: Olivier Goffart <ogoffart@woboq.com>
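The layout described in steps 1 and 2 can be sketched as a flat array of length-prefixed sequences with a dummy element at index 0, so that every real entry has a nonzero offset. The table contents, offsets and the `appendSpecialCase()` helper below are illustrative assumptions; the generated table in Qt is different and much larger.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Illustrative data only; the real table is generated and its offsets differ.
static const char16_t kSpecialCaseMapSketch[] = {
    0x0000,                 // offset 0: placeholder, never referenced
    0x0002, 0x0053, 0x0053, // offset 1: length 2, "SS" (uppercase of U+00DF)
    0x0001, 0xA77D          // offset 4: length 1, U+A77D (uppercase of U+1D79)
};

// Appends the sequence stored at `offset`: one length unit followed by
// that many UTF-16 code units.
inline void appendSpecialCase(std::u16string &out, std::size_t offset)
{
    const std::size_t size =
        sizeof(kSpecialCaseMapSketch) / sizeof(kSpecialCaseMapSketch[0]);
    assert(offset > 0 && offset < size); // 0 is the placeholder
    const std::size_t length = kSpecialCaseMapSketch[offset];
    assert(offset + length < size);
    out.append(kSpecialCaseMapSketch + offset + 1, length);
}
```

Because offset 0 is reserved, a stored offset of 0 can never refer to a real entry, which matches the "always greater than 0" guarantee stated in step 2.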
* Remove "All rights reserved" line from license headers. | Jason McDonald | 2012-01-30 | 3 | -4/+4
  As in the past, to avoid rewriting various autotests that contain line-number information, an extra blank line has been inserted at the end of the license text to ensure that this commit does not change the total number of lines in the license header.
  Change-Id: I311e001373776812699d6efc045b5f742890c689
  Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
* Update contact information in license headers. | Jason McDonald | 2012-01-23 | 3 | -4/+4
  Replace Nokia contact email address with Qt Project website.
  Change-Id: I431bbbf76d7c27d8b502f87947675c116994c415
  Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
* Update copyright year in license headers. | Jason McDonald | 2012-01-05 | 3 | -4/+4
  Change-Id: I02f2c620296fcd91d4967d58767ea33fc4e1e7dc
  Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
* replace 'const QChar &' with 'QChar ' for QChar and QString | Ritt Konstantin | 2011-10-26 | 1 | -2/+2
  Merge-request: 69
  Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@nokia.com>
  Change-Id: I61f5a54b783252029fcad95677958fa6a2130d01
  Reviewed-by: Olivier Goffart <ogoffart@kde.org>
* drop an obsolete QChar::NoCategory enum value | Ritt Konstantin | 2011-07-13 | 1 | -5/+2
  there is no such category in the Unicode specs. the QChar::NoCategory was a subject of bugs since it was introduced.
  in 4.6 its meaning was limited to mention ucs4 > UNICODE_LAST_CODEPOINT only (which is useless anyways) in order to preserve the old (wrong) behavior.
  fix it now for qtbase
  Change-Id: I630534824e071090b39772881e747c1fdb758719
  Reviewed-on: http://codereview.qt.nokia.com/1584
  Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
* Update licenseheader text in source files for qtbase Qt module | Jyri Tahtela | 2011-05-24 | 3 | -69/+69
  Updated version of LGPL and FDL licenseheaders.
  Apply release phase licenseheaders for all source files.
  Reviewed-by: Trust Me
* Initial import from the monolithic Qt. | Qt by Nokia | 2011-04-27 | 27 | -0/+64513
  This is the beginning of revision history for this module. If you want to look at revision history older than this, please refer to the Qt Git wiki for how to use Git history grafting. At the time of writing, this wiki is located here:
  http://qt.gitorious.org/qt/pages/GitIntroductionWithQt
  If you have already performed the grafting and you don't see any history beyond this commit, try running "git log" with the "--follow" argument.
  Branched from the monolithic repo, Qt master branch, at commit 896db169ea224deb96c59ce8af800d019de63f12