author Edward Welbourne <edward.welbourne@qt.io> 2020-07-31 14:55:57 +0200
committer Edward Welbourne <edward.welbourne@qt.io> 2020-08-29 18:15:27 +0200
commit 78cf89c07d7fa834a455afa5862823e50171415c
tree 4cd4fc6347c3df2771f49763231fb2d91b5582d9 /util
parent ab5e444c8fd754321ef0a6b084248e431e84a7a8
Use checked string iteration in case conversions
The Unicode table code can only be safely called on valid code-points, so code that calls it must only pass it valid Unicode data. The string iterator's Unchecked methods only provide this guarantee when the string being iterated is guaranteed to be valid UTF-16; while client code should only use QString, QStringView and friends on valid UTF-16 data, we have no way to be sure they have respected that. So take the few extra cycles to actually check validity in the course of iterating strings, when the resulting code-points are to be passed to the Unicode table look-ups.

Add tests that case mapping doesn't access Unicode tables out of range (it'll trigger the new assertion). Added some comments to qchar.h that helped me understand surrogates.

Change-Id: Iec2c3106bf1a875bdaa1d622f6cf94d7007e281e
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
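To illustrate why unchecked iteration can produce a value beyond the last valid code point, here is a minimal standalone sketch. It does not use Qt's string iterator. All names in it (nextChecked, decodeChecked, combineUnchecked, kLastValidCodePoint, kReplacement) are hypothetical and chosen for illustration only. Replacing an ill-formed unit with U+FFFD is an assumption of this sketch, not necessarily what the patched code does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical constants for this sketch; 0x10FFFF is the last Unicode code point.
constexpr char32_t kLastValidCodePoint = 0x10FFFF;
constexpr char32_t kReplacement = 0xFFFD;

constexpr bool isHighSurrogate(char16_t u) { return (u & 0xFC00) == 0xD800; }
constexpr bool isLowSurrogate(char16_t u) { return (u & 0xFC00) == 0xDC00; }

// Combine two UTF-16 units WITHOUT checking that they form a surrogate pair.
// If 'low' is not a low surrogate, the result may exceed kLastValidCodePoint.
constexpr char32_t combineUnchecked(char16_t high, char16_t low)
{
    return 0x10000 + ((char32_t(high) - 0xD800) << 10) + (char32_t(low) - 0xDC00);
}

// Decode one code point starting at pos and advance pos past it.
// A lone or mismatched surrogate yields invalidAs and consumes one unit.
char32_t nextChecked(const std::u16string &s, std::size_t &pos,
                     char32_t invalidAs = kReplacement)
{
    assert(pos < s.size());
    const char16_t cur = s[pos++];
    if (!isHighSurrogate(cur) && !isLowSurrogate(cur))
        return cur; // ordinary BMP unit
    if (isHighSurrogate(cur) && pos < s.size() && isLowSurrogate(s[pos]))
        return combineUnchecked(cur, s[pos++]); // safe: both halves verified
    return invalidAs; // ill-formed input never reaches the table look-up
}

// Decode a whole string; every value returned is safe to use as a table index.
std::vector<char32_t> decodeChecked(const std::u16string &s)
{
    std::vector<char32_t> out;
    for (std::size_t pos = 0; pos < s.size(); ) {
        const char32_t ucs4 = nextChecked(s, pos);
        assert(ucs4 <= kLastValidCodePoint); // same shape as the new assertion
        out.push_back(ucs4);
    }
    return out;
}
```

In this sketch, a high surrogate followed by a non-surrogate unit is the problem case: combineUnchecked would compute a value above 0x10FFFF, whereas nextChecked reports the lone surrogate as invalid and decodes the following unit on its own.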
Diffstat (limited to 'util')
-rw-r--r-- util/unicode/main.cpp 1
1 files changed, 1 insertions, 0 deletions
diff --git a/util/unicode/main.cpp b/util/unicode/main.cpp
index 167f632e37..df806eff0b 100644
--- a/util/unicode/main.cpp
+++ b/util/unicode/main.cpp
@@ -2501,6 +2501,7 @@ static QByteArray createPropertyInfo()
     out += "Q_DECL_CONST_FUNCTION static inline const Properties *qGetProp(char32_t ucs4) noexcept\n"
            "{\n"
+           " Q_ASSERT(ucs4 <= QChar::LastValidCodePoint);\n"
            " if (ucs4 < 0x" + QByteArray::number(BMP_END, 16) + ")\n"
            " return uc_properties + uc_property_trie[uc_property_trie[ucs4 >> "
            + QByteArray::number(BMP_SHIFT) + "] + (ucs4 & 0x"