| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
+ reorder conditions in getWordBreaks() to make further updates more clear
Change-Id: I1ca9adde066c3a48830f310202f7181585fac194
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
|
|
|
|
|
|
| |
See http://www.unicode.org/reports/tr14/#CB
and http://www.unicode.org/reports/tr14/#LB20 for details
Change-Id: Ice0aa2b2ce81f6e39839a353240420436eddd754
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
|
|
|
| |
Change-Id: I8362663454e4c6604ecb6289ae8009d47c78aeb1
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to make it conformant to the Unicode 6.1 specifications #14 and #29.
The most important changes are:
* The implementation has been reworked from scratch to fix all known bugs;
* Separate-out the grapheme and the line breaking implementation to eliminate
an overhead due to calculating unnecessary breaks;
* Stop using deprecated SG class in favor of resolving pairs of surrogates;
* A proper support for SMP code points;
* Support for extended grapheme clusters (a drop-in replacement for the legacy
grapheme clusters as of Unicode 5.1);
* The hardcoded tailoring of UBA has been eliminated which breaks the 7 years-old
lineBreaking test. Some later, we'll investigate if such a tailoring is still needed.
Change-Id: I9f5867b3cec753b4fc120bc5a7e20f9a73d89370
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to keep them consistent with positions for all other flags.
This changes the internal behavior so that attributes[0].lineBreakType now means
"break opportunity at start of the text (before the first character in the string)"
and is always assigned with HB_NoBreak to conform rule LB2
(see http://www.unicode.org/reports/tr14/#LB2).
The current implementation is based on the sample implementation from tr14
that aimed to be as simple as possible rather than to be optimal.
From now, we can use pieces of the attributes array "as is"
without having to adjust some positions. Or we can analize some long text
by chunks (e.g. paragraph by paragraph) and consume less memory.
This introduces a minor overhead that will be eliminated shortly.
Change-Id: Ic873a05a9d5203b1c3d5aff2e4445a3f034c4bd2
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The white spaces determination doesn't belong to the text breaking algorithm.
A proper breaking implementation shouldn't assume spaces are
break opportunities (actually, space is allowed to be a grapheme base);
However, the whiteSpace flag should never be checked alone while iterating
over the text to find the space sequence; the grapheme boundaries should always
be taken into account. This covers the SMP code points in UTF-16 text and
graphemes that consist of a space followed with one or more grapheme extenders.
This introduces a minor overhead that would be eliminated some later.
Change-Id: Ic2cc7f485631fd0b436fc256ce112ded5f94fc07
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
|
|
|
|
|
|
|
| |
e.g. for the text "Aaa bbb ccc.\r\nDdd eee fff." the \r\n wasn't treated
as a hard line break or as a line break opportunity, ever.
Quite ancient bug...
Change-Id: I8d8497c55a3a4d51c27de99ccfe1e31f3bf4de77
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add QUnicodeTools namespace and rename qGetCharAttributes to initCharAttributes;
Make it possible to disable tailoring globally by overriding
qt_initcharattributes_default_algorithm_only value
(useful for i.e. running the specification conformance tests);
This is mostly a preparation step for the upcoming patches.
Change-Id: I783879fd17b63b52d7983e25dad5b820f0515e7f
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
there are several reasons to do this:
* text breaking is not a shaper's job;
* since the text breaking rules are bound to a specific Unicode version,
updating Qt's internal unicode data would require updating the data in HB as well;
* makes porting to HurfBuzz-NG some easier
Change-Id: I0bbf8e8a343bc074696f4ddf2ae4e7fa32a61629
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|