/****************************************************************************
**
** Copyright (C) 2011 Nokia Corporation and/or its subsidiary(-ies).
** All rights reserved.
** Contact: Nokia Corporation (qt-info@nokia.com)
**
** This file is part of the documentation of JsonStream
**
** $QT_BEGIN_LICENSE:FDL$
** GNU Free Documentation License
** Alternatively, this file may be used under the terms of the GNU Free
** Documentation License version 1.3 as published by the Free Software
** Foundation and appearing in the file included in the packaging of
** this file.
**
** Other Usage
** Alternatively, this file may be used in accordance with the terms
** and conditions contained in a signed written agreement between you
** and Nokia.
**
**
**
**
** $QT_END_LICENSE$
**
****************************************************************************/

/*!
\title JSON Stream Serialization
\target serialization
\page serialization.html

\section1 Stream serialization

One key challenge with JSON is that it can be serialized in several
different ways for transmission over a socket connection:
\list
 \li UTF-8 encoded (the default)
 \li UTF-16 LE or BE
 \li UTF-32 LE or BE
 \li \l {http://bsonspec.org} {BSON} (Binary JSON)
 \li \l {ssh://codereview.qt-project.org:29418/playground/qtbinaryjson.git} {Qt Json}
\endlist
For a discussion of the UTF encoding formats, see \l
{http://www.ietf.org/rfc/rfc4627} {RFC4627}.

The JSON stream reference supports all standard encoding formats by
\b{auto-detection}.  The server class assumes that communication
will be initiated by the client. The initial bytes received are
matched to the signature of one of the serialization techniques and
the connection is set to that format.

To be specific, the server matches (table data from \l
{http://www.ietf.org/rfc/rfc4627} {RFC4627}):
\table
\header
  \li Encoding
  \li Bytes
  \li Discussion
\row
  \li UTF-8
  \li 7B xx yy zz
  \li First byte should be the '{' character, followed by whitespace
     and a '"' quotation mark.
\row
  \li BSON
  \li 62 73 6F 6E
  \li First four bytes are 'bson'.  Strictly speaking, this is
     not the true BSON format (which starts with an int32 length)
     but in the interests of autodetection we've enforced this
     requirement.  The BSON packet follows.
\row
  \li QBJS
  \li 71 62 6A 73
  \li First four bytes are 'qbjs'.  This matches the standard
     QtJson::JsonDocument header.
\row
  \li UTF-32BE
  \li 00 00 00 7B
  \li First four bytes should be the '{' character.
\row
  \li UTF-32LE
  \li 7B 00 00 00
  \li First four bytes should be the '{' character.
\row
  \li Raw UTF-16BE
  \li 00 7B 00 xx
  \li First two bytes should be the '{' character.
\row
  \li Raw UTF-16LE
  \li 7B 00 xx 00
  \li First two bytes should be the '{' character.
\row
  \li UTF-16BE with BOM
  \li FE FF 00 7B
  \li U+FEFF + '{'
\row
  \li UTF-16LE with BOM
  \li FF FE 7B 00
  \li U+FEFF + '{'
\endtable
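
The signature table above can be sketched as a small matching
function.  This is a minimal illustration, not the JsonStream
implementation: the names \c WireFormat and \c detectFormat are
hypothetical, and the order of the checks is an assumption, chosen so
that exact four-byte signatures (for example UTF-32LE) are tested
before the wildcard patterns they overlap with.

```cpp
#include <cstdint>

// Hypothetical names for illustration; not part of the JsonStream API.
enum class WireFormat {
    Unknown, Utf8, Bson, Qbjs, Utf32BE, Utf32LE, Utf16BE, Utf16LE
};

// Matches the first four bytes received on a connection against the
// signature table. The check order is an assumption of this sketch.
inline WireFormat detectFormat(std::uint8_t b0, std::uint8_t b1,
                               std::uint8_t b2, std::uint8_t b3)
{
    const auto is = [=](std::uint8_t s0, std::uint8_t s1,
                        std::uint8_t s2, std::uint8_t s3) {
        return b0 == s0 && b1 == s1 && b2 == s2 && b3 == s3;
    };

    // Exact four-byte signatures.
    if (is(0x62, 0x73, 0x6F, 0x6E)) return WireFormat::Bson;    // 'bson'
    if (is(0x71, 0x62, 0x6A, 0x73)) return WireFormat::Qbjs;    // 'qbjs'
    if (is(0x00, 0x00, 0x00, 0x7B)) return WireFormat::Utf32BE; // '{'
    if (is(0x7B, 0x00, 0x00, 0x00)) return WireFormat::Utf32LE; // '{'
    if (is(0xFE, 0xFF, 0x00, 0x7B)) return WireFormat::Utf16BE; // BOM + '{'
    if (is(0xFF, 0xFE, 0x7B, 0x00)) return WireFormat::Utf16LE; // BOM + '{'

    // Raw UTF-16: '{' followed by one character whose low byte is free.
    if (b0 == 0x00 && b1 == 0x7B && b2 == 0x00) return WireFormat::Utf16BE;
    if (b0 == 0x7B && b1 == 0x00 && b3 == 0x00) return WireFormat::Utf16LE;

    // UTF-8: '{' followed by any three bytes.
    if (b0 == 0x7B) return WireFormat::Utf8;

    return WireFormat::Unknown;
}
```

With this ordering, the bytes 7B 20 22 61 are reported as UTF-8, while
7B 00 00 00 is reported as UTF-32LE rather than raw UTF-16LE.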

Clearly, there is a danger that a raw BSON document, which starts with
a little-endian int32 length rather than the 'bson' prefix, could be
confused with UTF-32LE, UTF-16BE, UTF-16LE, or even UTF-8 (for
example, "7B 20 7D 00", which reads as '{ }' followed by a NUL byte).
To avoid confusion, it is recommended that UTF-16 encodings send a BOM
(U+FEFF) character to start their stream.  When in doubt, the protocol
will select UTF encoding formats before BSON, which means that the
UTF-32LE signature is particularly susceptible to false matches.
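
To make the ambiguity concrete, recall that a true BSON document (one
without the 'bson' prefix) begins with its total length as a
little-endian int32.  The sketch below decodes that length from the
first four bytes; the helper name \c rawBsonLength is hypothetical, and
an unsigned type is used only to keep the bit operations simple.

```cpp
#include <cstdint>

// Hypothetical helper: interprets four bytes as the little-endian
// length field that starts a raw BSON document.
inline std::uint32_t rawBsonLength(std::uint8_t b0, std::uint8_t b1,
                                   std::uint8_t b2, std::uint8_t b3)
{
    return static_cast<std::uint32_t>(b0)
         | (static_cast<std::uint32_t>(b1) << 8)
         | (static_cast<std::uint32_t>(b2) << 16)
         | (static_cast<std::uint32_t>(b3) << 24);
}
```

A raw BSON document of 123 bytes would therefore start with
7B 00 00 00, which is exactly the UTF-32LE signature for '{'.
Likewise, the UTF-8 text "7B 20 7D 00" is also a well-formed length
field (8200315 bytes).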

\warning We will probably disallow UTF-32 encoding formats to resolve
         the ambiguity.  Alternatively, we may require a BSON header
         to be transmitted to avoid confusion.

Packet sizes are limited by this protocol.  If a received packet
exceeds the size limit (typically 65535 bytes), the connection is
dropped.
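
One way a receiver might enforce such a limit is sketched below.  The
class \c PacketBuffer is hypothetical and not part of JsonStream; the
sketch assumes that a packet of exactly the limit is still accepted and
that the caller drops the connection when \c append() returns false.

```cpp
#include <cstddef>
#include <string>

// Hypothetical receive buffer. 65535 is the "typical" limit quoted in
// the text; whether the bound is inclusive is an assumption here.
class PacketBuffer
{
public:
    explicit PacketBuffer(std::size_t maxPacketSize = 65535)
        : m_max(maxPacketSize) {}

    // Appends received bytes to the pending packet. Returns false,
    // leaving the buffer unchanged, if the packet would exceed the limit.
    bool append(const std::string &chunk)
    {
        // m_pending.size() never exceeds m_max, so this cannot underflow.
        if (chunk.size() > m_max - m_pending.size())
            return false;
        m_pending += chunk;
        return true;
    }

    std::size_t size() const { return m_pending.size(); }

private:
    std::size_t m_max;
    std::string m_pending;
};
```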

*/