Improve the Unicode script itemization implementation

Make it closer to the Unicode specs (UAX#24):
* Common now inherits the preceding character's script, if any;
* In a combining character sequence, if the base character is of Common
  script, the entire sequence is treated as if it were of the first
  non-Inherited, non-Common script in the sequence.

See http://www.unicode.org/reports/tr24/tr24-21.html for more details.

[ChangeLog][QtGui] Fixed regression in Arabic text rendering.

Task-number: QTBUG-28813
Task-number: QTBUG-29930 (related)
Task-number: QTBUG-35836
Change-Id: Id85761965b08ca94c674d5f3613fe58b82b2ce9c
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@digia.com>
Reviewed-by: Ahmed Saidi <justroftest@gmail.com>
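
The two rules in the commit message can be illustrated with a small standalone sketch. It is not the Qt implementation: the `Script` enum, the `Char` struct with its `isMark` field, and the `resolveScripts` function are hypothetical names introduced here, and the sketch makes the simplifying assumption that a whole combining sequence (a base followed by marks) always ends up with a single script.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for QChar::Script; only the values needed here.
enum class Script { Inherited, Common, Latin, Arabic, Thai };

// Hypothetical per-character input: its raw script property and whether it
// is a combining mark that attaches to the preceding base character.
struct Char {
    Script script;
    bool isMark;
};

static bool isRealScript(Script s)
{
    return s != Script::Common && s != Script::Inherited;
}

// Returns one resolved script per input character.
std::vector<Script> resolveScripts(const std::vector<Char> &text)
{
    std::vector<Script> out(text.size(), Script::Common);
    Script prev = Script::Common; // nothing to inherit from at the start
    std::size_t i = 0;
    while (i < text.size()) {
        // A combining sequence is one base plus all marks that follow it.
        std::size_t end = i + 1;
        while (end < text.size() && text[end].isMark)
            ++end;

        // Take the first non-Inherited, non-Common script in the sequence.
        Script seq = Script::Common;
        for (std::size_t j = i; j < end; ++j) {
            if (isRealScript(text[j].script)) {
                seq = text[j].script;
                break;
            }
        }

        // No such script in the sequence: inherit the preceding one, if any.
        if (seq == Script::Common)
            seq = prev;

        // Simplification: the whole sequence gets the same script.
        for (std::size_t j = i; j < end; ++j)
            out[j] = seq;

        prev = seq;
        i = end;
    }
    return out;
}
```

Under these assumptions, a full stop following Thai letters resolves to Thai, which is the case the old `QLatin1Char('.')` special case in the itemizer handled by hand, while Common characters at the very start of the text stay Common because there is nothing to inherit from.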
author: Konstantin Ritt <ritt.ks@gmail.com> 2014-04-10 13:50:53 +0300
committer: The Qt Project <gerrit-noreply@qt-project.org> 2014-04-14 06:43:57 +0200
commit: 0ec07b68ad34e135451dd5291732bf73d297ba0c (patch)
tree: 33a7cafbe977729a2d97beb70c18853a31203154 /src/gui
parent: ef77c16cb00ef3788e58da8b35cb91eaf2eaa591 (diff)
1 files changed, 1 insertions, 16 deletions
diff --git a/src/gui/text/qtextengine.cpp b/src/gui/text/qtextengine.cpp
index 967ba24fcf..34788dc4dc 100644
--- a/src/gui/text/qtextengine.cpp
+++ b/src/gui/text/qtextengine.cpp
@@ -122,20 +122,9 @@ private:
             return;
         const int end = start + length;
         for (int i = start + 1; i < end; ++i) {
-            // According to the unicode spec we should be treating characters in the Common script
-            // (punctuation, spaces, etc) as being the same script as the surrounding text for the
-            // purpose of splitting up text. This is important because, for example, a fullstop
-            // (0x2E) can be used to indicate an abbreviation and so must be treated as part of a
-            // word.  Thus it must be passed along with the word in languages that have to calculate
-            // word breaks.  For example the thai word "ครม." has no word breaks but the word "ครม"
-            // does.
-            // Unfortuntely because we split up the strings for both wordwrapping and for setting
-            // the font and because Japanese and Chinese are also aliases of the script "Common",
-            // doing this would break too many things.  So instead we only pass the full stop
-            // along, and nothing else.
             if (m_analysis[i].bidiLevel == m_analysis[start].bidiLevel
                 && m_analysis[i].flags == m_analysis[start].flags
-                && (m_analysis[i].script == m_analysis[start].script || m_string[i] == QLatin1Char('.'))
+                && m_analysis[i].script == m_analysis[start].script
                 && m_analysis[i].flags < QScriptAnalysis::SpaceTabOrObject
                 && i - start < MaxItemLength)
                 continue;
@@ -1515,26 +1504,22 @@ void QTextEngine::itemize() const
     while (uc < e) {
         switch (*uc) {
         case QChar::ObjectReplacementCharacter:
-            analysis->script = QChar::Script_Common;
             analysis->flags = QScriptAnalysis::Object;
             break;
         case QChar::LineSeparator:
             if (analysis->bidiLevel % 2)
                 --analysis->bidiLevel;
-            analysis->script = QChar::Script_Common;
             analysis->flags = QScriptAnalysis::LineOrParagraphSeparator;
             if (option.flags() & QTextOption::ShowLineAndParagraphSeparators)
                 *const_cast<ushort*>(uc) = 0x21B5; // visual line separator
             break;
         case QChar::Tabulation:
-            analysis->script = QChar::Script_Common;
             analysis->flags = QScriptAnalysis::Tab;
             analysis->bidiLevel = control.baseLevel();
             break;
         case QChar::Space:
         case QChar::Nbsp:
             if (option.flags() & QTextOption::ShowTabsAndSpaces) {
-                analysis->script = QChar::Script_Common;
                 analysis->flags = QScriptAnalysis::Space;
                 analysis->bidiLevel = control.baseLevel();
                 break;
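
For readers following the first hunk, the run-splitting condition after this change can be sketched in isolation. This is a standalone illustration, not the code from qtextengine.cpp: the `Analysis` struct, the `itemStarts` function, and the numeric value of `kSpaceTabOrObject` are assumptions made for the example, and the maximum item length is passed in as a parameter instead of using the file's `MaxItemLength` constant.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, reduced stand-in for QScriptAnalysis.
struct Analysis {
    int bidiLevel;
    int flags;
    int script;
};

// Assumed threshold: flags at or above this value (spaces, tabs, objects)
// never merge with their neighbours. The real value lives in QScriptAnalysis.
constexpr int kSpaceTabOrObject = 5;

// Returns the start index of every item. A character continues the current
// item only if bidi level, flags and script all match the item's first
// character, the flags are below the threshold, and the item is not too long.
std::vector<std::size_t> itemStarts(const std::vector<Analysis> &a,
                                    std::size_t maxItemLength)
{
    std::vector<std::size_t> starts;
    if (a.empty())
        return starts;

    std::size_t start = 0;
    starts.push_back(start);
    for (std::size_t i = 1; i < a.size(); ++i) {
        if (a[i].bidiLevel == a[start].bidiLevel
            && a[i].flags == a[start].flags
            && a[i].script == a[start].script
            && a[i].flags < kSpaceTabOrObject
            && i - start < maxItemLength)
            continue; // same item, keep scanning
        start = i;    // any mismatch opens a new item here
        starts.push_back(start);
    }
    return starts;
}
```

With the special case for `'.'` removed, the comparison relies entirely on the script stored in the analysis, so a full stop stays in the same item as the preceding word only if script resolution has already assigned it that word's script.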