| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add enumerator for the new Unicode version to QChar::UnicodeVersion.
Remap new line breaking classes to their Unicode 15.0 values:
* AK, AP and AS to AL,
* VI and VF to CM.
These are classes for new line breaking support for Indic scripts
that require more work.
Blacklist failing tests for now:
* tst_QUrlUts46::idnaTestV2
* tst_QTextBoundaryFinder::lineBoundariesDefault
* tst_QTextBoundaryFinder::graphemeBoundariesDefault
Regenerate the source files.
Task-number: QTBUG-121529
Change-Id: I869cc9fbaa53765d8ae6265c22cdbef9f19d05bf
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This corresponds to Unicode version 15.0.0.
Added the following scripts:
* Kawi
* Nag Mundari
Full support of these scripts requires harfbuzz version 5.2.0,
this version adds support for Unicode 15.0:
https://github.com/harfbuzz/harfbuzz/releases/tag/5.2.0
Fixes: QTBUG-106810
Change-Id: Ib06c526e49b0f01ef9f21123bcf875c6b19f2601
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
|
|
|
|
|
|
|
|
|
|
| |
There are no commented out test cases remaining, so the normal
test vectors are identical to full test vectors.
Fixes: QTBUG-97537
Pick-to: 6.3
Change-Id: I987f178f192e1c8e2d998d36499fdce84f237e77
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
UAX #14, revision 45 (Unicode 13) has changed rule LB30 to only
trigger if the open parentheses is non-wide:
(AL | HL | NU) × [OP-[\p{ea=F}\p{ea=W}\p{ea=H}]]
This fixes the remaining 24 line break tests.
Task-number: QTBUG-97537
Pick-to: 6.3
Change-Id: I9870588c04bf0f6ae0a98289739bef8490f67f69
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement part of LB30b introduced by UAX #14, revision 47
(Unicode 14.0.0):
[\p{Extended_Pictographic}&\p{Cn}] × EM
This fixes one line breaking test.
Task-number: QTBUG-97537
Pick-to: 6.3
Change-Id: I3fd2372a057b7391d8846e9c146f69a54686ea61
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
|
|
|
|
|
|
|
|
|
|
|
| |
Word breaking rule WB3d should not be affected by WB4.
This fixes the remaining word break test.
Task-number: QTBUG-97537
Pick-to: 6.2 6.3
Change-Id: I99aee831d7c54fafcd2a9d526a3e078b12c5bfad
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adjust handling of WB3c rule to UAX #29, revision 33 (Unicode 11.0.0).
The rule reads:
ZWJ × \p{Extended_Pictographic}
This fixes 9 word break tests.
Task-number: QTBUG-97537
Pick-to: 6.2 6.3
Change-Id: I818d4048828e6663d5c090aa372d83f5099fdffe
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Disable break between sequences of WSegSpace characters (rule WB3d,
introduced in UAX #29, version 33, Unicode 11.0.0). Also disable breaks
between WSegSpace and (Extend | Format | ZWJ) due to rule WB4.
Adjust "words4" test to take the above changes into account (space
character belongs to WSegSpace).
Mention the full class name in a comment inside the word break table.
This fixes 34 word break tests.
Task-number: QTBUG-97537
Pick-to: 6.2 6.3
Change-Id: I7dfe8367e45c86913bb7d7fe2adb053711978487
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This rule was simplified in version UTS #14 version 45 (Unicode 13.0.0)
to read:
× IN
Re-enabled 28 fixed line break tests.
Task-number: QTBUG-97537
Pick-to: 6.2 6.3
Change-Id: I1c5565a8c1633428c22379917215d4e424ff0055
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adjust implementation of rule LB8a of UAX #14. The rule was changed
in version 41 (corresponding to Unicode 11.0.0):
ZWJ × (ID | EB | EM) ⇒ ZWJ ×
Fixing this rule fixes 9 line break tests. Those are re-enabled.
Task-number: QTBUG-97537
Pick-to: 6.2 6.3
Change-Id: I1570719590a46ae28c98ed7d5053e72b12915db7
Reviewed-by: Øystein Heskestad <oystein.heskestad@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This corresponds to Unicode version 14.0.0.
Added the following scripts:
* CyproMinoan
* OldUyghur
* Tangsa
* Toto
* Vithkuqi
Full support of these scripts requires harfbuzz version 3.0.0,
this version adds support for Unicode 14.0:
https://github.com/harfbuzz/harfbuzz/releases/tag/3.0.0
With this release 10 test cases in tst_qurluts46 were fixed, one
additional test case is failing in tst_qtextboundaryfinder and
is commented out. In total 62 line break test cases and 44 word
break test cases are failing.
A comment in src/corelib/text/qt_attribution.json was updated to
include the URL of the page containing UCD version number.
Fixes: QTBUG-94359
Change-Id: Iefc9ff13f3df279f91cbdb1246d56f75b20ecb35
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
UAX #29 in Unicode 11 changed the EGC algorithm to its current form.
Although Qt has upgraded the Unicode tables all the way up to
Unicode 13, the algorithm has never been adapted; in other words,
it has been working by chance for years. Luckily, MOST
of the cases were dealt with correctly, but emoji handling
actually manages to break it.
This commit:
* Adds parsing of emoji-data.txt into the unicode table generator.
That is necessary to extract the Extended_Pictographic property,
which is used by the EGC algorithm.
* Regenerates the tables.
* Removes some obsoleted grapheme cluster break properties, and
adds the ones added in the meanwhile.
* Rewrites the EGC algorithm according to Unicode 13. This is
done by simplifying a lot the lookup table. Some rules (GB11,
GB12, GB13) can't be done by the table alone so some hand-rolled
code is necessary in that case.
* Thanks to these fixes, the complete upstream GraphemeBreakTest
now passes. Remove the "edited" version that ignored some rows
(because they were failing).
Change-Id: Iaa07cb2e6d0ab9deac28397f46d9af189d2edf8b
Pick-to: 6.1 6.0 5.15
Fixes: QTBUG-92822
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Include WordBreakTest.html, since a test uses sample strings from it,
albeit without actually reading the file.
Had to comment out more of the new tests, as at Revision 24, pending
an update to harfbuzz and the text boundary detection code.
Task-number: QTBUG-79631
Task-number: QTBUG-79418
Task-number: QTBUG-82747
Change-Id: I0082294b09d67ffdc6a9b5c15acf77ad3b86f65f
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Had to teach the update program to accept category Lm as for
Joining_Transparent, for the sake of a new ArabicShaping.txt entry.
Added three new Unicode versions, several new scripts and a new
word-break class.
Updated UCD's test data for tst_QTextBoundaryFinder. This left 57
tests failing; I have commented out the data rows for those tests,
pending someone with more knowledge addressing this.
Task-number: QTBUG-79631
Task-number: QTBUG-79418
Change-Id: Ic33d3b3551195d47a84d98e84020f57a68f0b201
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>
|
|
This includes byte array, string, char, unicode, locale, collation and
regular expressions.
Change-Id: I8b125fa52c8c513eb57a0f1298b91910e5a0d786
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
|