| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Since 5.0, script is QChar::Script which isn't of 1:1 mapping to HB_Script
Change-Id: I2d88f929d7d3c3c994076a4e92ea22370962c41c
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Conflicts:
src/corelib/io/qsavefile_p.h
src/corelib/tools/qregularexpression.cpp
src/gui/util/qvalidator.cpp
src/gui/util/qvalidator.h
Change-Id: I58fdf0358bd86e2fad5d9ad0556f3d3f1f535825
|
| |
| |
| |
| |
| | |
Change-Id: Ic804938fc352291d011800d21e549c10acac66fb
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
|
|\|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Conflicts:
examples/widgets/painting/shared/shared.pri
src/corelib/tools/qharfbuzz_p.h
src/corelib/tools/qunicodetools.cpp
src/plugins/platforms/windows/accessible/qwindowsaccessibility.cpp
src/plugins/platforms/windows/qwindowsfontdatabase.cpp
Change-Id: Ibc9860abf570e5ce8b052fb88feb73ec35e64bd3
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
we were already installing them into QtCore/private, so turn them into
proper private headers to start with. this cleans up our project files.
Change-Id: I0795f79e03b60b5854de9e4dc339e9b5a5e6fd87
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
Reviewed-by: Joerg Bornemann <joerg.bornemann@digia.com>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
| |
...and remove the outdated QUnicodeTables::Script enum.
QFontEngineData now has one extra slot that never used
(engines[QChar::Script_Inherited]). engines[QChar::Script_Unknown],
if accessed, would be set with a Box engine instance, and could be used
as a minor optimization some time later.
In order to preserve the existing behavior, we map all scripts up to Latin to Common.
Change-Id: Ide4182a0f8447b4bf25713ecc3fe8097b8fed040
Reviewed-by: Pierre Rossi <pierre.rossi@gmail.com>
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
As of Unicode 5.1, some punctuation marks were mapped to MidLetter and MidNumLet
for better URL and abbreviations handling which caused "hi.there" to be treated
like if it were just a single word;
until we have the Unicode Text Segmentation tailoring mechanism, retain
the old behavior by remapping (some of) those characters back to their old values.
Change-Id: I49dea6064f2ea40a82fc0b1bc3c4f0b4e803919f
Reviewed-by: David Faure <david.faure@kdab.com>
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
|
|
|
|
|
|
|
|
|
|
| |
that will be returned by boundaryReasons() when the boundary finder
is at the line end position (CR, LF, NewLine Function, End of Text, etc.).
The MandatoryBreak flag, if set, means the text should be wrapped at a given position.
Change-Id: I32d4f570935d2e015bfc5f18915396a15f009fde
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Version 6.2 of the Unicode Standard is a special release
dedicated to the early publication of the newly encoded Turkish lira sign.
In addition, there are some significant changes to the Unicode algorithms
for text segmentation and line breaking to improve breaking for emoji symbols.
For more details, see http://www.unicode.org/versions/Unicode6.2.0/
Change-Id: I21cfd4f307e41b41a19d36cce87f7a44c2661bc2
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A simple heuristic is used to detect the word beginning and ending by
looking at the word break property value of surrounding characters.
This behaves better than the white-spaces based implementation used before
and makes it possible to tailor the default algorithm for complex scripts.
BIG FAT WARNING: The QCharAttributes buffer now has to have a length
of string length + 1 for the flags at end of text.
Task-Id: QTBUG-6498
Change-Id: I5589b191ffde6a50d2af0c14a00430d3852c67b4
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
|
|
|
|
|
|
|
|
| |
Change copyrights and license headers from Nokia to Digia
Change-Id: If1cc974286d29fd01ec6c19dd4719a67f4c3f00e
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
Reviewed-by: Sergio Ahumada <sergio.ahumada@digia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce QCharAttributes and use it instead of HB_CharAttributes everywhere in Qt
(in Harfbuzz, the HB_CharAttributes is only used in the text segmentation algorithm
which has been moved from HB to Qt (well, most of it)).
Rename some members to better reflect their meaning,
remember to keep HB_CharAttributes in sync with QCharAttributes.
Also replace HB_ScriptItem with a (temporary) QUnicodeTools::ScriptItem struct
that will be replaced with a more efficient/friendly solution a bit later.
The soft hyphen and the mandatory break detection has been factored out
of the default text breaking algorithm to a higher level in order to refactor
the QCharAttributes bitfields and to optimize the implementation for the common case.
Change-Id: Ieb365623ae954430f1c8b2dfcd65c82973143eec
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
|
|
|
|
|
|
|
|
|
| |
Just like for the QChar::ByteOrderMark, `ch == QChar::SoftHyphen`
is much more readable than `ch == 0x00ad // (soft-hyphen)`, etc.
Change-Id: I9c85f14cfd979037d35103c3259a435fd729b869
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@nokia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
enums GraphemeBreak, WordBreak, and SentenceBreak has been renamed to
GraphemeBreakClass, WordBreakClass, and SentenceBreakClass respectively,
their values has been renamed to contain a '_' as logical enum-value separator
(just like many other nums in Qt, e.g. LineBreakClass);
*BreakFormat has been replaced with *Break_Extend (some format characters are
kind of subtype of the extender characters, not vice versa).
Change-Id: I9ddbcf8848da87409736c2d6d1798a62fa28cab8
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
|
|
|
|
|
| |
+ reorder conditions in getWordBreaks() to make further updates more clear
Change-Id: I1ca9adde066c3a48830f310202f7181585fac194
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
|
|
|
|
|
|
| |
See http://www.unicode.org/reports/tr14/#CB
and http://www.unicode.org/reports/tr14/#LB20 for details
Change-Id: Ice0aa2b2ce81f6e39839a353240420436eddd754
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
|
|
|
| |
Change-Id: I8362663454e4c6604ecb6289ae8009d47c78aeb1
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to make it conformant to the Unicode 6.1 specifications #14 and #29.
The most important changes are:
* The implementation has been reworked from scratch to fix all known bugs;
* Separate-out the grapheme and the line breaking implementation to eliminate
an overhead due to calculating unnecessary breaks;
* Stop using deprecated SG class in favor of resolving pairs of surrogates;
* A proper support for SMP code points;
* Support for extended grapheme clusters (a drop-in replacement for the legacy
grapheme clusters as of Unicode 5.1);
* The hardcoded tailoring of UBA has been eliminated which breaks the 7 years-old
lineBreaking test. Some later, we'll investigate if such a tailoring is still needed.
Change-Id: I9f5867b3cec753b4fc120bc5a7e20f9a73d89370
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to keep them consistent with positions for all other flags.
This changes the internal behavior so that attributes[0].lineBreakType now means
"break opportunity at start of the text (before the first character in the string)"
and is always assigned with HB_NoBreak to conform rule LB2
(see http://www.unicode.org/reports/tr14/#LB2).
The current implementation is based on the sample implementation from tr14
that aimed to be as simple as possible rather than to be optimal.
From now, we can use pieces of the attributes array "as is"
without having to adjust some positions. Or we can analize some long text
by chunks (e.g. paragraph by paragraph) and consume less memory.
This introduces a minor overhead that will be eliminated shortly.
Change-Id: Ic873a05a9d5203b1c3d5aff2e4445a3f034c4bd2
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The white spaces determination doesn't belong to the text breaking algorithm.
A proper breaking implementation shouldn't assume spaces are
break opportunities (actually, space is allowed to be a grapheme base);
However, the whiteSpace flag should never be checked alone while iterating
over the text to find the space sequence; the grapheme boundaries should always
be taken into account. This covers the SMP code points in UTF-16 text and
graphemes that consist of a space followed with one or more grapheme extenders.
This introduces a minor overhead that would be eliminated some later.
Change-Id: Ic2cc7f485631fd0b436fc256ce112ded5f94fc07
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
|
|
|
|
|
|
|
| |
e.g. for the text "Aaa bbb ccc.\r\nDdd eee fff." the \r\n wasn't treated
as a hard line break or as a line break opportunity, ever.
Quite ancient bug...
Change-Id: I8d8497c55a3a4d51c27de99ccfe1e31f3bf4de77
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add QUnicodeTools namespace and rename qGetCharAttributes to initCharAttributes;
Make it possible to disable tailoring globally by overriding
qt_initcharattributes_default_algorithm_only value
(useful for i.e. running the specification conformance tests);
This is mostly a preparation step for the upcoming patches.
Change-Id: I783879fd17b63b52d7983e25dad5b820f0515e7f
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|
|
there are several reasons to do this:
* text breaking is not a shaper's job;
* since the text breaking rules are bound to a specific Unicode version,
updating Qt's internal unicode data would require updating the data in HB as well;
* makes porting to HurfBuzz-NG some easier
Change-Id: I0bbf8e8a343bc074696f4ddf2ae4e7fa32a61629
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
|