path: root/src/corelib/doc/src/qt6-changes.qdoc
author: Sona Kurazyan <sona.kurazyan@qt.io>, 2022-07-05 16:21:41 +0200
committer: Sona Kurazyan <sona.kurazyan@qt.io>, 2022-07-20 13:15:59 +0200
commit: 93f7291387c03367e828b16299ddcbaf1f804e25
tree: 9b0e3109e4ebbdc7fc7188de29a3f89e3671d05f /src/corelib/doc/src/qt6-changes.qdoc
parent: 581a342a3c6b62ccb7b9df8a9985460fa366e265
Include the QRegularExpression porting docs in Qt 6 porting guide

The instructions for porting away from QRegExp to QRegularExpression in
the Qt 6 porting guide were mostly copied from the similar docs for
QRegExp, which have been moved to
doc/global/includes/corelib/port-from-qregexp.qdocinc. The latter now
covers everything that the docs from the porting guide did and doesn't
have the issues listed in QTBUG-89702. Remove the old docs and include
the docs from doc/global/includes instead.

Task-number: QTBUG-89702
Pick-to: 6.4 6.3 6.2
Change-Id: Ifdb79d5775bc0cadd02c21299d58adb27ae13337
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Diffstat (limited to 'src/corelib/doc/src/qt6-changes.qdoc')
-rw-r--r-- src/corelib/doc/src/qt6-changes.qdoc | 251
1 files changed, 6 insertions, 245 deletions
diff --git a/src/corelib/doc/src/qt6-changes.qdoc b/src/corelib/doc/src/qt6-changes.qdoc
index cf1cc01b1f..5073a72d6a 100644
--- a/src/corelib/doc/src/qt6-changes.qdoc
+++ b/src/corelib/doc/src/qt6-changes.qdoc
@@ -528,252 +528,13 @@
\section2 The QRegularExpression class
- In Qt6, all methods taking the \c QRegExp got removed from our code-base.
- Therefore it is very likely that you will have to port your application or
- library to \l QRegularExpression.
+ In Qt 6, the \c QRegExp type has been retired to the Qt5Compat module
+ and all Qt APIs using it have been removed from other modules.
+ Client code which used it can be ported to use \l QRegularExpression
+ in its place. As \l QRegularExpression is present already in Qt 5,
+ this can be done and tested before migration to Qt 6.
- \l QRegularExpression implements Perl-compatible regular expressions. It
- fully supports Unicode. For an overview of the regular expression syntax
- supported by \l QRegularExpression, please refer to the aforementioned
- pcrepattern(3) man page. A regular expression is made up of two things: a
- pattern string and a set of pattern options that change the meaning of the
- pattern string.
-
- There are some subtle differences between \l QRegularExpression and \c
- QRegExp that will be explained by this document to ease the porting effort.
-
- \l QRegularExpression is more strict when it comes to the syntax of the
- regular expression. Therefore it is always good to check the expression
- for \l {QRegularExpression::isValid}{validity}.
-
- \l QRegularExpression can almost always be declared const (except when the
- pattern changes), while \c QRegExp almost never could be.
-
- There is no replacement for the \l {QRegExp::CaretMode}{CaretMode}
- enumeration. The \l {QRegularExpression::AnchoredMatchOption} match option
- can be used to emulate the QRegExp::CaretAtOffset behavior. There is no
- equivalent for the other QRegExp::CaretMode modes.
-
- \l QRegularExpression supports only Perl-compatible regular expressions.
- Still, it does not support all the features available in Perl-compatible
- regular expressions. The most notable one is the fact that duplicated names
- for capturing groups are not supported, and using them can lead to
- undefined behavior. This may change in a future version of Qt.
-
- \section3 Wildcard matching
-
- There is no direct way to do wildcard matching in \l QRegularExpression.
- However, the \l {QRegularExpression::wildcardToRegularExpression} method
- is provided to translate glob patterns into a Perl-compatible regular
- expression that can be used for that purpose.
-
- For example, if you have code like
-
- \code
- QRegExp wildcard("*.txt");
- wildcard.setPatternSyntax(QRegExp::Wildcard);
- \endcode
-
- you can rewrite it as
-
- \code
- auto wildcard = QRegularExpression(QRegularExpression::wildcardToRegularExpression("*.txt"));
- \endcode
-
- Please note though that not all shell like wildcard pattern might be
- translated in a way you would expect it. The following example code will
- silently break if simply converted using the above mentioned function:
-
- \code *
- const QString fp1("C:/Users/dummy/files/content.txt");
- const QString fp2("/home/dummy/files/content.txt");
-
- QRegExp re1("\1/files/*");
- re1.setPatternSyntax(QRegExp::Wildcard);
- ... = re1.exactMatch(fp1); // returns true
- ... = re1.exactMatch(fp2); // returns true
-
- // but converted with QRegularExpression::wildcardToRegularExpression()
-
- QRegularExpression re2(QRegularExpression::wildcardToRegularExpression("\1/files/*"));
- ... = re2.match(fp1).hasMatch(); // returns false
- ... = re2.match(fp2).hasMatch(); // returns false
- \endcode
-
- \section3 Searching forward
-
- Forward searching inside a string was usually implemented with a loop using
- \c {QRegExp::indexIn} and a growing offset, but can now be easily implemented
- with \l QRegularExpressionMatchIterator or \l {QString::indexOf}.
-
- For example, if you have code like
-
- \code
- QString subject("the quick fox");
-
- int offset = 0;
- QRegExp re("(\\w+)");
- while ((offset = re.indexIn(subject, offset)) != -1) {
- offset += re.matchedLength();
- // ...
- }
- \endcode
-
- you can rewrite it as
-
- \code
- QRegularExpression re("(\\w+)");
- QString subject("the quick fox");
-
- QRegularExpressionMatchIterator i = re.globalMatch(subject);
- while (i.hasNext()) {
- QRegularExpressionMatch match = i.next();
- // ...
- }
-
- // or alternatively using QString::indexOf
-
- qsizetype from = 0;
- QRegularExpressionMatch match;
- while ((from = subject.indexOf(re, from, &match)) != -1) {
- from += match.capturedLength();
- // ...
- }
- \endcode
-
- \section3 Searching backwards
-
- Backwards searching inside a string was usually often implemented as a loop
- over \c {QRegExp::lastIndexIn}, but can now be easily implemented using
- \l {QString::lastIndexOf} and \l {QRegularExpressionMatch}.
-
- \note \l QRegularExpressionMatchIterator is not capable of performing a
- backwards search.
-
- For example, if you have code like
-
- \code
- int offset = -1;
- QString subject("Lorem ipsum dolor sit amet, consetetur sadipscing.");
-
- QRegExp re("\\s+([ids]\\w+)");
- while ((offset = re.lastIndexIn(subject, offset)) != -1) {
- --offset;
- // ...
- }
- \endcode
-
- you can rewrite it as
-
- \code
- qsizetype from = -1;
- QString subject("Lorem ipsum dolor sit amet, consetetur sadipscing.");
-
- QRegularExpressionMatch match;
- QRegularExpression re("\\s+([ids]\\w+)");
- while ((from = subject.lastIndexOf(re, from, &match)) != -1) {
- --from;
- // ...
- }
- \endcode
-
- \section3 exactMatch vs. match.hasMatch
-
- \c {QRegExp::exactMatch} served two purposes: it exactly matched a regular
- expression against a subject string, and it implemented partial matching.
- Exact matching indicates whether the regular expression matches the entire
- subject string. For example:
-
- \code
- QString source("abc123");
-
- QRegExp("\\d+").exactMatch(source); // returns false
- QRegExp("[a-z]+\\d+").exactMatch(source); // returns true
-
- QRegularExpression("\\d+").match(source).hasMatch(); // returns true
- QRegularExpression("[a-z]+\\d+").match(source).hasMatch(); // returns true
- \endcode
-
- Exact matching is not reflected in \l QRegularExpression. If you want to be
- sure that the subject string matches the regular expression exactly, you
- can wrap the pattern using the \l {QRegularExpression::anchoredPattern}
- function:
-
- \code
- QString source("abc123");
-
- QString pattern("\\d+");
- QRegularExpression(pattern).match(source).hasMatch(); // returns true
-
- pattern = QRegularExpression::anchoredPattern(pattern);
- QRegularExpression(pattern).match(source).hasMatch(); // returns false
- \endcode
-
- \section3 Minimal matching
-
- \c QRegExp::setMinimal() implemented minimal matching by simply reversing
- the greediness of the quantifiers (\c QRegExp did not support lazy
- quantifiers, like *?, +?, etc.). QRegularExpression instead does support
- greedy, lazy and possessive quantifiers. The \l
- {QRegularExpression::InvertedGreedinessOption} pattern option can be useful
- to emulate the effects of \c QRegExp::setMinimal(): if enabled, it inverts
- the greediness of quantifiers (greedy ones become lazy and vice versa).
-
- \section3 Different pattern syntax
-
- Porting a regular expression from \c QRegExp to \l QRegularExpression may
- require changes to the pattern itself. Therefore it is recommended to check
- the pattern used with the \l {QRegularExpression::isValid} method. This is
- especially important for user provided pattern or pattern not controlled by
- the developer.
-
- In other cases, a pattern ported from \c QRegExp to \l QRegularExpression may
- silently change semantics. Therefore, it is necessary to review the patterns
- used. The most notable cases of silent incompatibility are:
-
- \list
- \li Curly braces are needed in order to use a hexadecimal escape like \c
- {\xHHHH} with more than 2 digits. A pattern like \c {\x2022} needs
- to be ported to \c {\x{2022}}, or it will match a space \c {(0x20)}
- followed by the string \c {"22"}. In general, it is highly recommended
- to always use curly braces with the \c {\x} escape, no matter the
- amount of digits specified.
-
- \li A \c{0-to-n} quantification like \c {{,n}} needs to be ported to
- \c {{0,n}} to preserve semantics. Otherwise, a pattern such as
- \c {\d{,3}} would actually match a digit followed by the exact
- string \c {"{,3}"}.
- \endlist
-
- \section3 Partial Matching
-
- When using \c QRegExp::exactMatch(), if an exact match was not found, one
- could still find out how much of the subject string was matched by the
- regular expression by calling \c QRegExp::matchedLength(). If the returned
- length was equal to the subject string's length, then one could conclude
- that a partial match was found.
- \l QRegularExpression supports partial matching explicitly by means of the
- appropriate \l {QRegularExpression::MatchType}.
-
- \section3 Global matching
-
- Due to limitations of the \c QRegExp API it was impossible to implement
- global matching correctly (that is, like Perl does). In particular, patterns
- that can match zero characters (like "a*") are problematic. \l
- {QRegularExpression::wildcardToRegularExpression} implements Perl global
- match correctly, and the returned iterator can be used to examine each
- result.
-
- \section3 Unicode properties support
-
- When using \c QRegExp, character classes such as \c{\w}, \c{\d}, etc. match
- characters with the corresponding Unicode property: for instance, \c{\d}
- matches any character with the Unicode Nd (decimal digit) property. Those
- character classes only match ASCII characters by default. When using \l
- QRegularExpression: for instance, \c{\d} matches exactly a character in the
- 0-9 ASCII range. It is possible to change this behavior by using the \l
- {QRegularExpression::UseUnicodePropertiesOption}
- pattern option.
+ \include corelib/port-from-qregexp.qdocinc porting-to-qregularexpression
\section2 The QRegExp class