author Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io> 2021-10-04 14:57:28 +0200
committer Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io> 2021-10-05 20:38:02 +0200
commit 394f8a9c4b8c2d7c3c813c4855a17f2529af3d95
tree b4a0d10f9cfb2b18ab7e065eb330195647cc6f4e /util/unicode
parent 2a546690bf457f4bfee0910ba979441511843f8b
Unicode: Add script to facilitate UCD update
Add a script that downloads UCD data for a given Unicode version,
unpacks it, and copies the used files to appropriate locations inside
the Qt source code.

Also update the README and use an HTTPS link for the UCD data file.
FTP links are no longer supported by some browsers.

Task-number: QTBUG-94359
Change-Id: I2aa70a588f675e411fa6b3ce5b4444a7c07ed707
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Diffstat (limited to 'util/unicode')
-rw-r--r-- util/unicode/README 23
-rwxr-xr-x util/unicode/update_ucd.sh 78
2 files changed, 90 insertions, 11 deletions
diff --git a/util/unicode/README b/util/unicode/README
index 580bf9d8c0..faf9201b1c 100644
--- a/util/unicode/README
+++ b/util/unicode/README
@@ -1,21 +1,22 @@
Unicode is used to generate the unicode data in src/corelib/text/.
To update:
-* Find the data (UAX #44, UCD; not the XML version) at
- ftp://www.unicode.org/Public/zipped/$Version/
-* Unpack the zip file; for each file in data/, replace with the new
- version; find the *BreakProperty.txt in auxiliary/ and emoji-data.txt
- in emoji/.
-* In tst_QTextBoundaryFinder's data/ sub-directory, update its files
- from the auxiliary/ sub-directory of the UCD data.
+* Run `./update_ucd.sh $Version`. This automates the following steps:
+ * Find the data (UAX #44, UCD; not the XML version) at
+ https://www.unicode.org/Public/zipped/$Version/
+ * Unpack the zip file; for each file in data/, replace with the new
+ version; find the *BreakProperty.txt in auxiliary/ and emoji-data.txt
+ in emoji/.
+ * In tst_QTextBoundaryFinder's data/ sub-directory, update its files
+ from the auxiliary/ sub-directory of the UCD data.
+ * Download https://www.unicode.org/Public/idna/$Version/IdnaMappingTable.txt
+ and put it into data/.
+ * Download https://www.unicode.org/Public/idna/$Version/IdnaTestV2.txt
+ and put it into tests/auto/corelib/io/qurluts46/testdata.
* If needed, add an entry to enum QChar::UnicodeVersion for the new
Unicode version
* In that case, also update main.cpp's initAgeMap and DATA_VERSION_S*
to match
-* Download https://www.unicode.org/Public/idna/$Version/IdnaMappingTable.txt
- and put it into data/.
-* Download https://www.unicode.org/Public/idna/$Version/IdnaTestV2.txt
- and put it into tests/auto/corelib/io/qurluts46/testdata.
* Build this project. Its binary, unicode, ignores command-line
options and assumes it is being run from this directory. When run,
it produces lots of output. If it gets as far as updating
diff --git a/util/unicode/update_ucd.sh b/util/unicode/update_ucd.sh
new file mode 100755
index 0000000000..38bc79570a
--- /dev/null
+++ b/util/unicode/update_ucd.sh
@@ -0,0 +1,78 @@
+#! /bin/sh
+#############################################################################
+##
+## Copyright (C) 2021 The Qt Company Ltd.
+## Contact: https://www.qt.io/licensing/
+##
+## This file is the build configuration utility of the Qt Toolkit.
+##
+## $QT_BEGIN_LICENSE:GPL-EXCEPT$
+## Commercial License Usage
+## Licensees holding valid commercial Qt licenses may use this file in
+## accordance with the commercial license agreement provided with the
+## Software or, alternatively, in accordance with the terms contained in
+## a written agreement between you and The Qt Company. For licensing terms
+## and conditions see https://www.qt.io/terms-conditions. For further
+## information use the contact form at https://www.qt.io/contact-us.
+##
+## GNU General Public License Usage
+## Alternatively, this file may be used under the terms of the GNU
+## General Public License version 3 as published by the Free Software
+## Foundation with exceptions as appearing in the file LICENSE.GPL3-EXCEPT
+## included in the packaging of this file. Please review the following
+## information to ensure the GNU General Public License requirements will
+## be met: https://www.gnu.org/licenses/gpl-3.0.html.
+##
+## $QT_END_LICENSE$
+##
+#############################################################################
+
+# This script downloads UCD files for a Unicode release and updates the
+# copies used by Qt. It expects the new Unicode version as argument:
+#
+# $ ./update_ucd.sh 14.0.0
+
+set -e
+
+if [ "$#" -ne 1 ]
+then
+ echo "Usage: $0 <UNICODE-VERSION>" >&2
+ exit 1
+fi
+
+VERSION="$1"
+
+qtbase=$(realpath "$(dirname "$0")"/../..)
+tmp=$(mktemp -d)
+
+download()
+{
+ wget -nv -P "$tmp" "$1"
+}
+
+download "https://www.unicode.org/Public/zipped/$VERSION/UCD.zip"
+unzip -q "$tmp/UCD.zip" -d "$tmp"
+download "https://www.unicode.org/Public/idna/$VERSION/IdnaMappingTable.txt"
+download "https://www.unicode.org/Public/idna/$VERSION/IdnaTestV2.txt"
+
+data_dirs="util/unicode/data \
+tests/auto/corelib/io/qurluts46/testdata \
+tests/auto/corelib/text/qtextboundaryfinder/data"
+
+for dir in $data_dirs
+do
+ find "$qtbase/$dir" -name '*.txt' -o -name '*.html'
+done | grep -vw 'ReadMe.*\.txt' | while read -r file
+do
+ base_name=$(basename "$file")
+ echo "Updating ${base_name}"
+ full_name=$(find "$tmp" -name "$base_name" -print -quit)
+ if [ "$full_name" = "" ]
+ then
+ echo "No source file for: ${base_name}" >&2
+ exit 1
+ fi
+ cp "$full_name" "$file"
+done
+
+rm -rf "$tmp"
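
The copy step at the end of the script matches files by base name: each file already tracked in a Qt data directory is replaced by the first same-named file found under the unpacked download. The following is a minimal sketch of that idea on throwaway directories, not the script itself. The function name `refresh_from_tree` and the demo directory layout are hypothetical, `head -n 1` stands in for the GNU-specific `-quit` option of `find`, and a `trap` is added so that the temporary directory is removed even when the lookup fails.

```sh
#!/bin/sh
# Sketch only: refresh every *.txt file in a destination directory from a
# same-named file found anywhere under a source tree. The function name
# and the demo directories below are hypothetical.
set -e

refresh_from_tree()
{
    src_root="$1"
    dest_dir="$2"
    for file in "$dest_dir"/*.txt
    do
        [ -e "$file" ] || continue      # the glob matched nothing
        base_name=$(basename "$file")
        # Take the first match; head -n 1 replaces GNU find's -quit.
        full_name=$(find "$src_root" -name "$base_name" | head -n 1)
        if [ -z "$full_name" ]
        then
            echo "No source file for: $base_name" >&2
            return 1
        fi
        cp "$full_name" "$file"
    done
}

# Demo on temporary directories; no network access is needed.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/ucd/auxiliary" "$work/data"
echo "new" > "$work/ucd/auxiliary/WordBreakProperty.txt"
echo "old" > "$work/data/WordBreakProperty.txt"

refresh_from_tree "$work/ucd" "$work/data"
cat "$work/data/WordBreakProperty.txt"   # prints "new"
```

Because the destination directory drives the loop, only files that Qt already tracks are refreshed; a newly required UCD file still has to be added by hand once before this kind of update picks it up.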