/****************************************************************************
**
** Copyright (C) 2014 Klarälvdalens Datakonsult AB, a KDAB Group company, info@kdab.com, author Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
** Contact: https://www.qt.io/licensing/
**
** This file is part of the QtCore module of the Qt Toolkit.
**
** $QT_BEGIN_LICENSE:LGPL$
** Commercial License Usage
** Licensees holding valid commercial Qt licenses may use this file in
** accordance with the commercial license agreement provided with the
** Software or, alternatively, in accordance with the terms contained in
** a written agreement between you and The Qt Company. For licensing terms
** and conditions see https://www.qt.io/terms-conditions. For further
** information use the contact form at https://www.qt.io/contact-us.
**
** GNU Lesser General Public License Usage
** Alternatively, this file may be used under the terms of the GNU Lesser
** General Public License version 3 as published by the Free Software
** Foundation and appearing in the file LICENSE.LGPL3 included in the
** packaging of this file. Please review the following information to
** ensure the GNU Lesser General Public License version 3 requirements
** will be met: https://www.gnu.org/licenses/lgpl-3.0.html.
**
** GNU General Public License Usage
** Alternatively, this file may be used under the terms of the GNU
** General Public License version 2.0 or (at your option) the GNU General
** Public license version 3 or any later version approved by the KDE Free
** Qt Foundation. The licenses are as published by the Free Software
** Foundation and appearing in the file LICENSE.GPL2 and LICENSE.GPL3
** included in the packaging of this file. Please review the following
** information to ensure the GNU General Public License requirements will
** be met: https://www.gnu.org/licenses/gpl-2.0.html and
** https://www.gnu.org/licenses/gpl-3.0.html.
**
** $QT_END_LICENSE$
**
****************************************************************************/

/*!
    \class QStringIterator
    \since 5.3
    \inmodule QtCore
    \ingroup tools

    \internal

    \brief The QStringIterator class provides a Unicode-aware iterator over QString.

    \reentrant

    QStringIterator is a Java-like, bidirectional, const iterator over the contents of a
    QString. Unlike QString's own iterators, which operate on individual UTF-16 code units,
    QStringIterator is Unicode-aware: it transparently handles the \e{surrogate pairs}
    that may be present in a QString, and returns the individual Unicode code points.

    You can create a QStringIterator that iterates over a given
    QStringView by passing the string to the QStringIterator's constructor:

    \snippet code/src_corelib_tools_qstringiterator.cpp 0

    A newly created QStringIterator will point before the first position in the
    string. It is possible to check whether the iterator can be advanced by
    calling hasNext(), and actually advance it (and obtain the next code point)
    by calling next():

    \snippet code/src_corelib_tools_qstringiterator.cpp 1

    Similarly, the hasPrevious() and previous() functions can be used to iterate backwards.

    The peekNext() and peekPrevious() functions return the code point
    immediately after and immediately before the iterator's current position,
    respectively; unlike next() and previous(), they do not move the iterator.
    Similarly, the advance() and recede() functions move the iterator
    forward and backward by one code point, respectively, but they
    do not return the code point the iterator has moved over.
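
    The position semantics described above can be sketched with the standard
    library alone. The \c CodePointCursor class below is a hypothetical
    stand-in written for illustration; it is not Qt's implementation, and it
    assumes that a lone surrogate decodes to U+FFFD:

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for illustration only; this is not Qt's implementation.
// The cursor sits between code points, like a Java-style iterator.
class CodePointCursor
{
public:
    explicit CodePointCursor(const std::u16string &s) : m_s(s), m_pos(0) {}

    bool hasNext() const { return m_pos < m_s.size(); }
    bool hasPrevious() const { return m_pos > 0; }

    // Returns the code point after the cursor without moving the cursor.
    char32_t peekNext() const { std::size_t p = m_pos; return decodeForward(p); }

    // Returns the code point after the cursor and steps over it.
    char32_t next() { return decodeForward(m_pos); }

    // Returns the code point before the cursor and steps back over it.
    char32_t previous() { return decodeBackward(m_pos); }

private:
    static bool isHigh(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
    static bool isLow(char16_t c) { return c >= 0xDC00 && c <= 0xDFFF; }
    static char32_t combine(char16_t hi, char16_t lo)
    {
        return 0x10000 + ((char32_t(hi) - 0xD800) << 10) + (char32_t(lo) - 0xDC00);
    }

    // Precondition: p < m_s.size().
    char32_t decodeForward(std::size_t &p) const
    {
        const char16_t cu = m_s[p++];
        if (isHigh(cu) && p < m_s.size() && isLow(m_s[p]))
            return combine(cu, m_s[p++]);
        return (isHigh(cu) || isLow(cu)) ? char32_t(0xFFFD) : char32_t(cu);
    }

    // Precondition: p > 0.
    char32_t decodeBackward(std::size_t &p) const
    {
        const char16_t cu = m_s[--p];
        if (isLow(cu) && p > 0 && isHigh(m_s[p - 1]))
            return combine(m_s[--p], cu);
        return (isHigh(cu) || isLow(cu)) ? char32_t(0xFFFD) : char32_t(cu);
    }

    std::u16string m_s;   // owned copy, so the cursor cannot dangle
    std::size_t m_pos;    // index in UTF-16 code units
};
```

    As with the real class, calling \c next() or \c previous() in this sketch
    without first checking \c hasNext() or \c hasPrevious() is an error.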

    \section1 Unicode Handling

    QString and all of its functions work in terms of UTF-16 code units. Unicode code points
    that fall outside the Basic Multilingual Plane (U+10000 to U+10FFFF) will therefore
    be represented by \e{surrogate pairs} in a QString, that is, a sequence of two
    UTF-16 code units that encode a single code point.

    QStringIterator automatically handles surrogate pairs inside a QString
    and returns the correctly decoded code point, while also moving the iterator by
    the right number of code units to match the decoded code points.

    For instance:

    \snippet code/src_corelib_tools_qstringiterator.cpp 2

    If the iterator is not able to decode the next code point (or the previous
    one, when iterating backwards), then it will return \c{0xFFFD}, that is,
    Unicode's replacement character (see QChar::ReplacementCharacter).
    It is possible to make QStringIterator return another value when it encounters
    a decoding problem; refer to the documentation of each function for
    more details.
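
    The decoding rule and the fallback value can be illustrated with a small
    standalone function. This is only a sketch under stated assumptions:
    \c decodeUtf16 is a hypothetical helper, not part of Qt, and its
    \c invalidAs parameter merely mirrors the role of the parameter of the
    same name in next() and previous():

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Qt: decodes UTF-16 code units into
// Unicode code points, substituting invalidAs for every lone surrogate.
std::vector<char32_t> decodeUtf16(const std::u16string &s, char32_t invalidAs = 0xFFFD)
{
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < s.size()) {
        const char16_t cu = s[i++];
        if (cu >= 0xD800 && cu <= 0xDBFF) {
            // High surrogate: valid only if a low surrogate follows.
            if (i < s.size() && s[i] >= 0xDC00 && s[i] <= 0xDFFF) {
                const char16_t low = s[i++];
                out.push_back(0x10000 + ((char32_t(cu) - 0xD800) << 10)
                              + (char32_t(low) - 0xDC00));
            } else {
                out.push_back(invalidAs);
            }
        } else if (cu >= 0xDC00 && cu <= 0xDFFF) {
            // Low surrogate with no preceding high surrogate.
            out.push_back(invalidAs);
        } else {
            // Code unit in the Basic Multilingual Plane: maps to itself.
            out.push_back(cu);
        }
    }
    return out;
}
```

    In this sketch a well-formed pair consumes two code units and yields one
    code point, whereas a lone surrogate consumes one code unit and yields
    \c invalidAs.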

    \section1 Unchecked Iteration

    It is possible to optimize iterating over a QString's contents by skipping
    some checks. This is in general not safe to do, because a QString is allowed
    to contain malformed UTF-16 data; however, if we can trust a given QString,
    then we can use the optimized \e{unchecked} functions.

    QStringIterator provides the \e{unchecked} counterparts for next(),
    peekNext(), advance(), previous(), peekPrevious(), and recede():
    they're called, respectively,
    nextUnchecked(), peekNextUnchecked(), advanceUnchecked(),
    previousUnchecked(), peekPreviousUnchecked(), recedeUnchecked().
    The counterparts work exactly like the original ones,
    but they're faster as they're allowed to make certain assumptions about
    the string contents.

    \note Be extremely careful when using QStringIterator's unchecked functions,
    as using them on a string containing malformed data leads to undefined behavior.

    \sa QString, QChar
*/

/*!
    \fn QStringIterator::QStringIterator(QStringView string, qsizetype idx)

    Constructs an iterator over the contents of \a string. The iterator will point
    before position \a idx in the string.

    The string view \a string must remain valid while the iterator is being used.
*/

/*!
    \fn QStringIterator::QStringIterator(const QChar *begin, const QChar *end)

    Constructs an iterator which iterates over the range from \a begin to \a end.
    The iterator will point before \a begin.

    The range from \a begin to \a end must remain valid while the iterator is being used.
*/

/*!
    \fn QString::const_iterator QStringIterator::position() const

    Returns the current position of the iterator.
*/

/*!
    \fn void QStringIterator::setPosition(QString::const_iterator position)

    Sets the iterator's current position to \a position, which must be inside
    of the iterable range.
*/

/*!
    \fn bool QStringIterator::hasNext() const

    Returns \c true if the iterator has not reached the end of the valid iterable range
    and therefore can move forward; otherwise returns \c false.

    \sa next()
*/

/*!
    \fn void QStringIterator::advance()

    Advances the iterator by one Unicode code point.

    \note calling this function when the iterator is past the end of the iterable range
    leads to undefined behavior.

    \sa next(), hasNext()
*/

/*!
    \fn void QStringIterator::advanceUnchecked()

    Advances the iterator by one Unicode code point.

    \note calling this function when the iterator is past the end of the iterable range
    or on a QString containing malformed UTF-16 data leads to undefined behavior.

    \sa advance(), next(), hasNext()
*/

/*!
    \fn uint QStringIterator::peekNextUnchecked() const

    Returns the Unicode code point that is immediately after the iterator's current
    position. The current position is not changed.

    \note calling this function when the iterator is past the end of the iterable range
    or on a QString containing malformed UTF-16 data leads to undefined behavior.

    \sa peekNext(), next(), hasNext()
*/

/*!
    \fn uint QStringIterator::peekNext(uint invalidAs = QChar::ReplacementCharacter) const

    Returns the Unicode code point that is immediately after the iterator's current
    position. The current position is not changed.

    If the iterator is not able to decode the UTF-16 data after the iterator's current
    position, this function returns \a invalidAs (by default, QChar::ReplacementCharacter,
    which corresponds to \c{U+FFFD}).

    \note calling this function when the iterator is past the end of the iterable range
    leads to undefined behavior.

    \sa next(), hasNext()
*/

/*!
    \fn uint QStringIterator::nextUnchecked()

    Advances the iterator's current position by one Unicode code point,
    and returns the code point that the iterator has moved over.

    \note calling this function when the iterator is past the end of the iterable range
    or on a QString containing malformed UTF-16 data leads to undefined behavior.

    \sa next(), hasNext()
*/

/*!
    \fn uint QStringIterator::next(uint invalidAs = QChar::ReplacementCharacter)

    Advances the iterator's current position by one Unicode code point,
    and returns the code point that the iterator has moved over.

    If the iterator is not able to decode the UTF-16 data at the iterator's current
    position, this function returns \a invalidAs (by default, QChar::ReplacementCharacter,
    which corresponds to \c{U+FFFD}).

    \note calling this function when the iterator is past the end of the iterable range
    leads to undefined behavior.

    \sa peekNext(), hasNext()
*/


/*!
    \fn bool QStringIterator::hasPrevious() const

    Returns \c true if the iterator is after the beginning of the valid iterable range
    and therefore can move backwards; otherwise returns \c false.

    \sa previous()
*/

/*!
    \fn void QStringIterator::recede()

    Moves the iterator back by one Unicode code point.

    \note calling this function when the iterator is before the beginning of the iterable range
    leads to undefined behavior.

    \sa previous(), hasPrevious()
*/

/*!
    \fn void QStringIterator::recedeUnchecked()

    Moves the iterator back by one Unicode code point.

    \note calling this function when the iterator is before the beginning of the iterable range
    or on a QString containing malformed UTF-16 data leads to undefined behavior.

    \sa recede(), previous(), hasPrevious()
*/

/*!
    \fn uint QStringIterator::peekPreviousUnchecked() const

    Returns the Unicode code point that is immediately before the iterator's current
    position. The current position is not changed.

    \note calling this function when the iterator is before the beginning of the iterable range
    or on a QString containing malformed UTF-16 data leads to undefined behavior.

    \sa previous(), hasPrevious()
*/

/*!
    \fn uint QStringIterator::peekPrevious(uint invalidAs = QChar::ReplacementCharacter) const

    Returns the Unicode code point that is immediately before the iterator's current
    position. The current position is not changed.

    If the iterator is not able to decode the UTF-16 data before the iterator's current
    position, this function returns \a invalidAs (by default, QChar::ReplacementCharacter,
    which corresponds to \c{U+FFFD}).

    \note calling this function when the iterator is before the beginning of the iterable range
    leads to undefined behavior.

    \sa previous(), hasPrevious()
*/

/*!
    \fn uint QStringIterator::previousUnchecked()

    Moves the iterator's current position back by one Unicode code point,
    and returns the code point that the iterator has moved over.

    \note calling this function when the iterator is before the beginning of the iterable range
    or on a QString containing malformed UTF-16 data leads to undefined behavior.

    \sa previous(), hasPrevious()
*/

/*!
    \fn uint QStringIterator::previous(uint invalidAs = QChar::ReplacementCharacter)

    Moves the iterator's current position back by one Unicode code point,
    and returns the code point that the iterator has moved over.

    If the iterator is not able to decode the UTF-16 data at the iterator's current
    position, this function returns \a invalidAs (by default, QChar::ReplacementCharacter,
    which corresponds to \c{U+FFFD}).

    \note calling this function when the iterator is before the beginning of the iterable range
    leads to undefined behavior.

    \sa peekPrevious(), hasPrevious()
*/