/****************************************************************************
**
** Copyright (C) 2020 The Qt Company Ltd.
** Contact: https://www.qt.io/licensing/
**
** This file is part of the documentation of the Qt Toolkit.
**
** $QT_BEGIN_LICENSE:FDL$
** Commercial License Usage
** Licensees holding valid commercial Qt licenses may use this file in
** accordance with the commercial license agreement provided with the
** Software or, alternatively, in accordance with the terms contained in
** a written agreement between you and The Qt Company. For licensing terms
** and conditions see https://www.qt.io/terms-conditions. For further
** information use the contact form at https://www.qt.io/contact-us.
**
** GNU Free Documentation License Usage
** Alternatively, this file may be used under the terms of the GNU Free
** Documentation License version 1.3 as published by the Free Software
** Foundation and appearing in the file included in the packaging of
** this file. Please review the following information to ensure
** the GNU Free Documentation License version 1.3 requirements
** will be met: https://www.gnu.org/licenses/fdl-1.3.html.
** $QT_END_LICENSE$
**
****************************************************************************/

/*!
    \group xml-tools
    \title XML Classes

    \brief Classes that support XML.

    These classes are relevant to XML users.

    \generatelist{related}
*/

/*!
    \page xml-processing.html
    \title XML Processing

    \brief An overview of the XML processing facilities in Qt.

    Qt provides two general-purpose sets of APIs to read and write well-formed
    XML: \l{XML Streaming}{stream based} and
    \l{Working with the DOM Tree}{DOM based}.

    Qt also provides specific support for some XML dialects. For instance, the
    Qt SVG module provides the QSvgRenderer and QSvgGenerator classes to read
    and write a subset of SVG, an XML-based file
    format. Qt also provides helper functions that may be useful to
    those working with XML and XHTML: see QString::toHtmlEscaped() and
    Qt::convertFromPlainText().

    \section1 Topics:

    \list
    \li \l {Classes for XML Processing}
    \li \l {An Introduction to Namespaces}
    \li \l {XML Streaming}
    \li \l {Working with the DOM Tree}
    \endlist

    \section1 Classes for XML Processing

    These classes are relevant to XML users.

    \annotatedlist xml-tools
*/

/*!
    \page xml-namespaces.html
    \title An Introduction to Namespaces
    \target namespaces

    \nextpage XML Streaming

    Parts of the Qt XML module documentation assume that you are familiar
    with XML namespaces. Here we present a brief introduction; skip to
    \l{#namespacesConventions}{Qt XML documentation conventions}
    if you already know this material.

    Namespaces were introduced into XML to allow a more modular
    design. With their help, data processing software can resolve
    naming conflicts in XML documents.

    Consider the following example:

    \snippet code/doc_src_qtxml.qdoc 6

    Here we find three different uses of the name \e title. If you wish to
    process this document you will encounter problems because each of the
    \e titles should be displayed in a different manner -- even though
    they have the same name.

    The solution is to have some means of identifying the first
    occurrence of \e title as the title of a book, that is, to use the \e
    title element of a book namespace to distinguish it from, for example,
    the chapter title:
    \snippet code/doc_src_qtxml.qdoc 7

    \e book in this case is a \e prefix denoting the namespace.

    Before we can apply a namespace to element or attribute names we must
    declare it.

    Namespaces are URIs like \e http://www.example.com/fnord/book/. This
    does not mean that data must be available at this address; the URI is
    simply used to provide a unique name.

    We declare namespaces in the same way as attributes; strictly speaking
    they \e are attributes. To make, for example, \e
    http://www.example.com/fnord/ the document's default XML namespace \e
    xmlns, we write:

    \snippet code/doc_src_qtxml.qdoc 8

    To distinguish the \e http://www.example.com/fnord/book/ namespace from
    the default, we must supply it with a prefix:

    \snippet code/doc_src_qtxml.qdoc 9

    A namespace that is declared like this can be applied to element and
    attribute names by prepending the appropriate prefix and a ":"
    delimiter. We have already seen this with the \e book:title element.

    Element names without a prefix belong to the default namespace. This
    rule does not apply to attributes: an attribute without a prefix does
    not belong to any of the declared XML namespaces at all. Attributes
    always belong to the "traditional" namespace of the element in which
    they appear. A "traditional" namespace is not an XML namespace, it
    simply means that all attribute names belonging to one element must be
    different. Later we will see how to assign an XML namespace to an
    attribute.

    Because attributes without prefixes are not in any XML
    namespace, there is no collision between the attribute \e title (that
    belongs to the \e author element) and, for example, the \e title element
    within a \e chapter.

    Let's clarify this with an example:
    \snippet code/doc_src_qtxml.qdoc 10

    Within the \e document element we have two namespaces declared. The
    default namespace \e http://www.example.com/fnord/ applies to the \e
    book element, the \e chapter element, the appropriate \e title element
    and of course to \e document itself.

    The \e book:author and \e book:title elements belong to the namespace
    with the URI \e http://www.example.com/fnord/book/.

    The two \e book:author attributes \e title and \e name have no XML
    namespace assigned. They are only members of the "traditional"
    namespace of the element \e book:author, meaning that for example two
    \e title attributes in \e book:author are forbidden.

    In the above example we circumvent the last rule by adding a \e title
    attribute from the \e http://www.example.com/fnord/ namespace to \e
    book:author: the \e fnord:title comes from the namespace with the
    prefix \e fnord that is declared in the \e book:author element.

    Clearly the \e fnord namespace has the same namespace URI as the
    default namespace. So why didn't we simply use the default namespace
    we'd already declared? The answer is quite complex:
    \list
    \li attributes without a prefix don't belong to any XML namespace at
    all, not even to the default namespace;
    \li additionally, omitting the prefix would lead to a \e title-title clash;
    \li writing it as \e xmlns:title would declare a new namespace with the
    prefix \e title instead of applying the default \e xmlns namespace.
    \endlist
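
    The attribute rules above can be checked with a stream reader. The
    following is a minimal sketch, not part of the Qt examples: the helper
    name \c attributeNamespace() is hypothetical, and it only inspects the
    first start element of the input.

    \code
    #include <QString>
    #include <QXmlStreamReader>

    // Returns the namespace URI of the attribute with the given qualified
    // name on the first start element in xml. The result is empty if the
    // attribute has no XML namespace or is not present.
    // (Hypothetical helper, for illustration only.)
    QString attributeNamespace(const QString &xml, const QString &qualifiedAttrName)
    {
        QXmlStreamReader reader(xml);
        while (!reader.atEnd()) {
            if (reader.readNext() == QXmlStreamReader::StartElement) {
                const QXmlStreamAttributes attrs = reader.attributes();
                for (const QXmlStreamAttribute &attr : attrs) {
                    if (attr.qualifiedName() == qualifiedAttrName)
                        return attr.namespaceUri().toString();
                }
                break; // only the first element is inspected
            }
        }
        return QString();
    }
    \endcode

    For a \e book:author element that carries both \e title and
    \e fnord:title, the unprefixed attribute should report an empty
    namespace URI even when a default namespace is declared, while
    \e fnord:title reports the URI bound to the \e fnord prefix.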

    With the Qt XML classes, elements and attributes can be accessed in two
    ways: either by their qualified name, consisting of the
    namespace prefix and the "real" name (or \e local name), or by the
    combination of local name and namespace URI.
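
    As an illustration of the two ways of naming an element, the sketch
    below lists both forms for every start element in a document. It
    uses QXmlStreamReader; the function name \c describeElements() and
    the output format are assumptions made for this example.

    \code
    #include <QString>
    #include <QStringList>
    #include <QXmlStreamReader>

    // For each start element, records
    //   "qualifiedName -> {namespaceUri}localName".
    // (Hypothetical helper, for illustration only.)
    QStringList describeElements(const QString &xml)
    {
        QStringList result;
        QXmlStreamReader reader(xml);
        while (!reader.atEnd()) {
            if (reader.readNext() == QXmlStreamReader::StartElement) {
                result << QStringLiteral("%1 -> {%2}%3")
                              .arg(reader.qualifiedName().toString(),
                                   reader.namespaceUri().toString(),
                                   reader.name().toString());
            }
        }
        return result;
    }
    \endcode

    Two elements with different prefixes, or with no prefix at all, compare
    equal under the second form whenever their namespace URI and local name
    match.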

    More information on XML namespaces can be found at
    \l http://www.w3.org/TR/REC-xml-names/.

    \target namespacesConventions
    \section1 Conventions Used in the Qt XML Documentation

    The following terms are used to distinguish the parts of names within
    the context of namespaces:
    \list
    \li  The \e {qualified name}
        is the name as it appears in the document. (In the above example \e
        book:title is a qualified name.)
    \li  A \e {namespace prefix} in a qualified name
        is the part to the left of the ":". (\e book is the namespace prefix in
        \e book:title.)
    \li  The \e {local part} of a name (also referred to as the \e {local
        name}) appears to the right of the ":". (Thus \e title is the
        local part of \e book:title.)
    \li  The \e {namespace URI} ("Uniform Resource Identifier") is a unique
        identifier for a namespace. It looks like a URL
        (e.g. \e http://www.example.com/fnord/ ) but does not require
        data to be accessible by the given protocol at the named address.
    \endlist

    Elements without a ":" (like \e chapter in the example) do not have a
    namespace prefix. In this case the local part and the qualified name
    are identical (i.e. \e chapter).

    \sa {DOM Bookmarks Example}
*/

/*!
    \page xml-streaming.html
    \title XML Streaming

    \previouspage An Introduction to Namespaces
    \nextpage Working with the DOM Tree

    Qt provides two classes for reading and writing XML through a simple streaming
    API:  QXmlStreamReader and QXmlStreamWriter.

    A stream reader reports an XML document as a stream
    of tokens. This differs from SAX: SAX applications provide handlers that
    receive XML events from the parser, whereas with QXmlStreamReader the
    application code drives the loop, pulling tokens from the reader when
    they are needed.
    This pulling approach makes it possible to build recursive descent parsers,
    allowing XML parsing code to be split into different methods or classes.
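
    A recursive descent parser along these lines might look like the
    following sketch. The document format (nested \c folder elements
    containing \c bookmark elements) and both function names are assumptions
    made for illustration; each function handles one element type and calls
    itself for nested folders.

    \code
    #include <QString>
    #include <QXmlStreamReader>

    // Precondition: the reader is positioned on a <folder> start tag.
    // Returns the number of <bookmark> elements below it, at any depth.
    static int countBookmarksInFolder(QXmlStreamReader &reader)
    {
        int count = 0;
        // readNextStartElement() returns false once the end tag of the
        // current element is reached (or an error occurs).
        while (reader.readNextStartElement()) {
            if (reader.name() == QLatin1String("folder")) {
                count += countBookmarksInFolder(reader); // descend
            } else {
                if (reader.name() == QLatin1String("bookmark"))
                    ++count;
                reader.skipCurrentElement(); // ignore the element's content
            }
        }
        return count;
    }

    // Entry point: expects <folder> as the root element.
    int countBookmarks(const QString &xml)
    {
        QXmlStreamReader reader(xml);
        if (reader.readNextStartElement()
            && reader.name() == QLatin1String("folder")) {
            return countBookmarksInFolder(reader);
        }
        return 0;
    }
    \endcode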

    QXmlStreamReader is a well-formed XML 1.0 parser that excludes external
    parsed entities. Hence, data provided by the stream reader adheres to the
    W3C's criteria for well-formed XML, as long as no error occurs. Otherwise,
    functions such as \l{QXmlStreamReader::atEnd()}{atEnd()},
    \l{QXmlStreamReader::error()}{error()} and \l{QXmlStreamReader::hasError()}
    {hasError()} can be used to check and view the errors.
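
    The typical shape of a read loop with an error check is sketched
    below. The function name \c readTitles() and the choice to collect
    \c title elements are assumptions made for this example.

    \code
    #include <QString>
    #include <QStringList>
    #include <QXmlStreamReader>

    // Collects the text of every <title> element. If the input is not
    // well-formed, the list is cleared and, when error is non-null, the
    // reader's message is stored in *error.
    // (Hypothetical helper, for illustration only.)
    QStringList readTitles(const QString &xml, QString *error = nullptr)
    {
        QStringList titles;
        QXmlStreamReader reader(xml);
        while (!reader.atEnd()) {
            reader.readNext();
            if (reader.isStartElement()
                && reader.name() == QLatin1String("title")) {
                titles << reader.readElementText();
            }
        }
        if (reader.hasError()) {
            titles.clear();
            if (error)
                *error = reader.errorString();
        }
        return titles;
    }
    \endcode

    Because atEnd() also returns \c true once an error has aborted reading,
    the loop terminates in both cases and hasError() distinguishes them
    afterwards.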

    An example of QXmlStreamReader usage is the \c XbelReader in
    \l{QXmlStream Bookmarks Example}, which wraps a QXmlStreamReader.
    The constructor takes \a treeWidget as a parameter and the class has
    XBEL-specific functions:

    \snippet streambookmarks/xbelreader.h 1

    \dots
    \snippet streambookmarks/xbelreader.h 2
    \dots

    The \c read() function accepts a QIODevice and sets it with
    \l{QXmlStreamReader::setDevice()}{setDevice()}. The
    \l{QXmlStreamReader::raiseError()}{raiseError()} function is used to
    raise a custom error with a message indicating that the file's version
    is incorrect.

    \snippet streambookmarks/xbelreader.cpp 1
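
    Independently of the example's own code, a version check of this kind
    can be sketched as follows. The function name \c checkXbelVersion() and
    the error text are assumptions made for illustration.

    \code
    #include <QString>
    #include <QXmlStreamReader>

    // Returns an empty string if the root element is <xbel version="1.0">;
    // otherwise returns the error message held by the reader.
    // (Hypothetical helper, for illustration only.)
    QString checkXbelVersion(const QString &xml)
    {
        QXmlStreamReader reader(xml);
        if (reader.readNextStartElement()) {
            const bool isXbel = reader.name() == QLatin1String("xbel");
            const bool isV1 = reader.attributes().value(QLatin1String("version"))
                              == QLatin1String("1.0");
            if (!isXbel || !isV1) {
                // Puts the reader into an error state with a custom message.
                reader.raiseError(
                    QStringLiteral("The file is not an XBEL version 1.0 file."));
            }
        }
        return reader.hasError() ? reader.errorString() : QString();
    }
    \endcode

    After raiseError() the reader behaves as it does for a parse error:
    hasError() returns \c true and errorString() returns the custom message.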

    The counterpart to QXmlStreamReader is QXmlStreamWriter, which provides an XML
    writer with a simple streaming API. QXmlStreamWriter operates on a
    QIODevice and has specialized functions for all XML tokens or events you
    want to write, such as \l{QXmlStreamWriter::writeDTD()}{writeDTD()},
    \l{QXmlStreamWriter::writeCharacters()}{writeCharacters()},
    \l{QXmlStreamWriter::writeComment()}{writeComment()} and so on.

    To write an XML document with QXmlStreamWriter, you start a document with the
    \l{QXmlStreamWriter::writeStartDocument()}{writeStartDocument()} function
    and end it with \l{QXmlStreamWriter::writeEndDocument()}
    {writeEndDocument()}, which implicitly closes all remaining open tags.
    Element tags are opened with \l{QXmlStreamWriter::writeStartElement()}
    {writeStartElement()} and followed by
    \l{QXmlStreamWriter::writeAttribute()}{writeAttribute()} or
    \l{QXmlStreamWriter::writeAttributes()}{writeAttributes()},
    element content, and then \l{QXmlStreamWriter::writeEndElement()}
    {writeEndElement()}. Also, \l{QXmlStreamWriter::writeEmptyElement()}
    {writeEmptyElement()} can be used to write empty elements.

    Element content comprises characters, entity references or nested elements.
    Content can be written with \l{QXmlStreamWriter::writeCharacters()}
    {writeCharacters()}, a function that also takes care of escaping all
    forbidden characters and character sequences,
    \l{QXmlStreamWriter::writeEntityReference()}{writeEntityReference()},
    or subsequent calls to \l{QXmlStreamWriter::writeStartElement()}
    {writeStartElement()}.
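
    The calls described above combine as in the following sketch, which
    writes to a QString rather than a QIODevice. The function name
    \c writeBookmark() and the element names are assumptions made for this
    example.

    \code
    #include <QString>
    #include <QXmlStreamWriter>

    // Serializes one bookmark as a small XML document.
    // (Hypothetical helper, for illustration only.)
    QString writeBookmark(const QString &title, const QString &href)
    {
        QString output;
        QXmlStreamWriter writer(&output);

        writer.writeStartDocument();
        writer.writeStartElement(QStringLiteral("bookmark"));
        writer.writeAttribute(QStringLiteral("href"), href);

        writer.writeStartElement(QStringLiteral("title"));
        writer.writeCharacters(title); // escapes characters such as < and &
        writer.writeEndElement();      // closes <title>

        writer.writeEmptyElement(QStringLiteral("separator"));

        writer.writeEndDocument();     // closes the still-open <bookmark>
        return output;
    }
    \endcode

    Note that \c bookmark is never closed explicitly; the sketch relies on
    writeEndDocument() closing the remaining open tags.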

    The \c XbelWriter class from \l{QXmlStream Bookmarks Example} wraps a
    QXmlStreamWriter. Its \c writeFile() function illustrates the core
    functions of QXmlStreamWriter mentioned above:

    \snippet streambookmarks/xbelwriter.cpp 1
*/

/*!
    \page xml-dom.html
    \title Working with the DOM Tree
    \target dom

    \previouspage XML Streaming

    DOM Level 2 is a W3C Recommendation for XML interfaces that maps the
    constituents of an XML document to a tree structure. The specification
    of DOM Level 2 can be found at \l{http://www.w3.org/DOM/}.

    \target domIntro
    \section1 Introduction to DOM

    DOM provides an interface to access and change the content and
    structure of an XML file. It presents a hierarchical (tree) view of
    the document. Thus, in contrast to the streaming API provided
    by QXmlStreamReader, an object
    model of the document is resident in memory after parsing, which makes
    manipulation easy.

    All DOM nodes in the document tree are subclasses of \l QDomNode. The
    document itself is represented as a \l QDomDocument object.
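
    A minimal sketch of working with the in-memory tree follows: it parses
    a string, appends a new element to the root, and serializes the result.
    The function name \c appendChapter() and the \c chapter element are
    assumptions made for this example.

    \code
    #include <QDomDocument>
    #include <QString>

    // Parses xml, appends <chapter>title</chapter> to the root element and
    // returns the serialized document. Returns an empty string if the
    // input cannot be parsed.
    // (Hypothetical helper, for illustration only.)
    QString appendChapter(const QString &xml, const QString &title)
    {
        QDomDocument doc;
        if (!doc.setContent(xml))
            return QString();

        QDomElement root = doc.documentElement();
        QDomElement chapter = doc.createElement(QStringLiteral("chapter"));
        chapter.appendChild(doc.createTextNode(title));
        root.appendChild(chapter);

        return doc.toString(-1); // -1: do not add indentation
    }
    \endcode

    New nodes are created through the QDomDocument and only become part of
    the tree once they are appended to a node that is already in it.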

    Here are the available node classes and their potential child classes:

    \list
    \li \l QDomDocument: Possible children are
            \list
            \li \l QDomElement (at most one)
            \li \l QDomProcessingInstruction
            \li \l QDomComment
            \li \l QDomDocumentType
            \endlist
    \li \l QDomDocumentFragment: Possible children are
            \list
            \li \l QDomElement
            \li \l QDomProcessingInstruction
            \li \l QDomComment
            \li \l QDomText
            \li \l QDomCDATASection
            \li \l QDomEntityReference
            \endlist
    \li \l QDomDocumentType: No children
    \li \l QDomEntityReference: Possible children are
            \list
            \li \l QDomElement
            \li \l QDomProcessingInstruction
            \li \l QDomComment
            \li \l QDomText
            \li \l QDomCDATASection
            \li \l QDomEntityReference
            \endlist
    \li \l QDomElement: Possible children are
            \list
            \li \l QDomElement
            \li \l QDomText
            \li \l QDomComment
            \li \l QDomProcessingInstruction
            \li \l QDomCDATASection
            \li \l QDomEntityReference
            \endlist
    \li \l QDomAttr: Possible children are
            \list
            \li \l QDomText
            \li \l QDomEntityReference
            \endlist
    \li \l QDomProcessingInstruction: No children
    \li \l QDomComment: No children
    \li \l QDomText: No children
    \li \l QDomCDATASection: No children
    \li \l QDomEntity: Possible children are
            \list
            \li \l QDomElement
            \li \l QDomProcessingInstruction
            \li \l QDomComment
            \li \l QDomText
            \li \l QDomCDATASection
            \li \l QDomEntityReference
            \endlist
    \li \l QDomNotation: No children
    \endlist

    Two collection classes are provided: \l QDomNodeList is a list of nodes,
    and \l QDomNamedNodeMap is used to handle unordered sets of nodes
    (often used for attributes).
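
    Both collection classes appear in the following sketch, which gathers
    the attributes of all elements with a given tag name. The function name
    \c listAttributes() is an assumption made for this example.

    \code
    #include <QDomDocument>
    #include <QString>
    #include <QStringList>

    // Returns "name=value" for every attribute of every element named
    // tagName, sorted alphabetically.
    // (Hypothetical helper, for illustration only.)
    QStringList listAttributes(const QString &xml, const QString &tagName)
    {
        QStringList result;
        QDomDocument doc;
        if (!doc.setContent(xml))
            return result;

        // QDomNodeList: the matching elements.
        const QDomNodeList elements = doc.elementsByTagName(tagName);
        for (int i = 0; i < elements.count(); ++i) {
            // QDomNamedNodeMap: the attributes of one element.
            const QDomNamedNodeMap attrs = elements.at(i).attributes();
            for (int j = 0; j < attrs.count(); ++j) {
                const QDomAttr attr = attrs.item(j).toAttr();
                result << attr.name() + QLatin1Char('=') + attr.value();
            }
        }
        result.sort(); // the map is unordered, so sort for a stable result
        return result;
    }
    \endcode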

    The \l QDomImplementation class allows the user to query features of the
    DOM implementation.

    To get started please refer to the \l QDomDocument documentation.
    You might also want to take a look at the \l{DOM Bookmarks Example},
    which illustrates how to read and write an XML bookmark file (XBEL)
    using DOM.
*/