1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
|
/****************************************************************************
**
** Copyright (C) 2015 The Qt Company Ltd.
** Contact: http://www.qt.io/licensing/
**
** This file is part of the documentation of the Qt Toolkit.
**
** $QT_BEGIN_LICENSE:FDL$
** Commercial License Usage
** Licensees holding valid commercial Qt licenses may use this file in
** accordance with the commercial license agreement provided with the
** Software or, alternatively, in accordance with the terms contained in
** a written agreement between you and The Qt Company. For licensing terms
** and conditions see http://www.qt.io/terms-conditions. For further
** information use the contact form at http://www.qt.io/contact-us.
**
** GNU Free Documentation License Usage
** Alternatively, this file may be used under the terms of the GNU Free
** Documentation License version 1.3 as published by the Free Software
** Foundation and appearing in the file included in the packaging of
** this file. Please review the following information to ensure
** the GNU Free Documentation License version 1.3 requirements
** will be met: http://www.gnu.org/copyleft/fdl.html.
** $QT_END_LICENSE$
**
****************************************************************************/
/*!
\title Qt Quick Scene Graph
\page qtquick-visualcanvas-scenegraph.html
\section1 The Scene Graph in Qt Quick
Qt Quick 2 makes use of a dedicated scene graph based and a series of
adaptations of which the default uses OpenGL ES 2.0 or OpenGL 2.0 for
its rendering. Using a scene graph for graphics rather than the
traditional imperative painting systems (QPainter and
similar), means the scene to be rendered can be retained between
frames and the complete set of primitives to render is known before
rendering starts. This opens up for a number of optimizations, such as
batch rendering to minimize state changes and discarding obscured
primitives.
For example, say a user-interface contains a list of ten items
where each item has a background color, an icon and a text. Using the
traditional drawing techniques, this would result in 30 draw calls and
a similar amount of state changes. A scene graph, on the other hand,
could reorganize the primitives to render such that all backgrounds
are drawn in one call, then all icons, then all the text, reducing the
total amount of draw calls to only 3. Batching and state change
reduction like this can greatly improve performance on some hardware.
The scene graph is closely tied to Qt Quick 2.0 and can not be used
stand-alone. The scene graph is managed and rendered by the
QQuickWindow class and custom Item types can add their graphical
primitives into the scene graph through a call to
QQuickItem::updatePaintNode().
The scene graph is a graphical representation of the Item scene, an
independent structure that contains enough information to render all
the items. Once it has been set up, it can be manipulated and rendered
independently of the state of the items. On many platforms, the scene
graph will even be rendered on a dedicated render thread while the GUI
thread is preparing the next frame's state.
\note Much of the information listed on this page is specific to the
default OpenGL adaptation of the Qt Quick Scene graph. For more information
about the different scene graph adaptations see
\l{qtquick-visualcanvas-adaptations.html}{Scene Graph Adaptations}.
\section1 Qt Quick Scene Graph Structure
The scene graph is composed of a number of predefined node types, each
serving a dedicated purpose. Although we refer to it as a scene graph,
a more precise definition is node tree. The tree is built from
QQuickItem types in the QML scene and internally the scene is then
processed by a renderer which draws the scene. The nodes themselves do
\b not contain any active drawing code nor virtual \c paint()
function.
Even though the node tree is mostly built internally by the existing
Qt Quick QML types, it is possible for users to also add complete
subtrees with their own content, including subtrees that represent 3D
models.
\section2 Nodes
The most important node for users is the \l QSGGeometryNode. It is
used to define custom graphics by defining its geometry and
material. The geometry is defined using \l QSGGeometry and describes
the shape or mesh of the graphical primitive. It can be a line, a
rectangle, a polygon, many disconnected rectangles, or complex 3D
mesh. The material defines how the pixels in this shape are filled.
A node can have any number of children and geometry nodes will be
rendered so they appear in child-order with parents behind their
children. \note This does not say anything about the actual rendering
order in the renderer. Only the visual output is guaranteed.
The available nodes are:
\annotatedlist{qtquick-scenegraph-nodes}
Custom nodes are added to the scene graph by subclassing
QQuickItem::updatePaintNode() and setting the
\l {QQuickItem::ItemHasContents} flag.
\warning It is crucial that OpenGL operations and interaction with the
scene graph happens exclusively on the render thread, primarily
during the updatePaintNode() call. The rule of thumb is to only
use classes with the "QSG" prefix inside the
QQuickItem::updatePaintNode() function.
For more details, see the \l {Scene Graph - Custom Geometry}.
\section3 Preprocessing
Nodes have a virtual QSGNode::preprocess() function, which will be
called before the scene graph is rendered. Node subclasses can set the
flag \l QSGNode::UsePreprocess and override the QSGNode::preprocess()
function to do final preparation of their node. For example, dividing a
bezier curve into the correct level of detail for the current scale
factor or updating a section of a texture.
\section3 Node Ownership
Ownership of the nodes is either done explicitly by the creator or by
the scene graph by setting the flag \l QSGNode::OwnedByParent.
Assigning ownership to the scene graph is often preferable as it
simplifies cleanup when the scene graph lives outside the GUI thread.
\section2 Materials
The material describes how the interior of a geometry in a \l
QSGGeometryNode is filled. It encapsulates an OpenGL shader program
and provides ample flexibility in what can be achieved, though most of
the Qt Quick items themselves only use very basic materials, such as
solid color and texture fills.
For users who just want to apply custom shading to a QML Item type,
it is possible to do this directly in QML using the \l ShaderEffect
type.
Below is a complete list of material classes:
\annotatedlist{qtquick-scenegraph-materials}
For more details, see the \l {Scene Graph - Simple Material}
\section2 Convenience Nodes
The scene graph API is very low-level and focuses on performance
rather than convenience. Writing custom geometries and materials from
scratch, even the most basic ones, requires a non-trivial amount of
code. For this reason, the API includes a few convenience classes to
make the most common custom nodes readily available.
\list
\li \l QSGSimpleRectNode - a QSGGeometryNode subclass which defines a
rectangular geometry with a solid color material.
\li \l QSGSimpleTextureNode - a QSGGeometryNode subclass which defines
a rectangular geometry with a texture material.
\endlist
\section1 Scene Graph and Rendering
The rendering of the scene graph happens internally in the
QQuickWindow class, and there is no public API to access it. There are,
however, a few places in the rendering pipeline where the user can
attach application code. This can be used to add custom scene graph
content or render raw OpenGL content. The integration points are
defined by the render loop.
For detailed description of how the scene graph renderer works, see
\l {Qt Quick Scene Graph Renderer}.
There are three render loop variants available: \c basic, \c windows,
and \c threaded. Out of these, \c basic and \c windows are
single-threaded, while \c threaded performs scene graph rendering on a
dedicated thread. Qt attempts to choose a suitable loop based on the
platform and possibly the graphics drivers in use. When this is not
satisfactory, or for testing purposes, the environment variable
\c QSG_RENDER_LOOP can be used to force the usage of a given loop. To
verify which render loop is in use, enable the \c qt.scenegraph.general
\l {QLoggingCategory}{logging category}.
\note The \c threaded and \c windows render loops rely on the OpenGL
implementation for throttling by requesting a swap interval of 1. Some
graphics drivers allow users to override this setting and turn it off,
ignoring Qt's request. Without blocking in the swap buffers operation
(or elsewhere), the render loop will run animations too fast and spin
the CPU at 100%. If a system is known to be unable to provide
vsync-based throttling, use the \c basic render loop instead by
setting \c {QSG_RENDER_LOOP=basic} in the environment.
\section2 Threaded Render Loop ("threaded")
On many configurations, the scene graph rendering will happen on a
dedicated render thread. This is done to increase parallelism of
multi-core processors and make better use of stall times such as
waiting for a blocking swap buffer call. This offers significant
performance improvements, but imposes certain restrictions on where
and when interaction with the scene graph can happen.
The following is a simple outline of how a frame gets
composed with the threaded render loop.
\image sg-renderloop-threaded.jpg
\list 1
\li A change occurs in the QML scene, causing \c QQuickItem::update()
to be called. This can be the result of for instance an animation or
user input. An event is posted to the render thread to initiate a new
frame.
\li The render thread prepares to draw a new frame and makes the
OpenGL context current and initiates a block on the GUI thread.
\li While the render thread is preparing the new frame, the GUI thread
calls QQuickItem::updatePolish() to do final touch-up of items before
they are rendered.
\li GUI thread is blocked.
\li The QQuickWindow::beforeSynchronizing() signal is emitted.
Applications can make direct connections (using Qt::DirectConnection)
to this signal to do any preparation required before calls to
QQuickItem::updatePaintNode().
\li Synchronization of the QML state into the scene graph. This is
done by calling the QQuickItem::updatePaintNode() function on all
items that have changed since the previous frame. This is the only
time the QML items and the nodes in the scene graph interact.
\li GUI thread block is released.
\li The scene graph is rendered:
\list 1
\li The QQuickWindow::beforeRendering() signal is
emitted. Applications can make direct connections
(using Qt::DirectConnection) to this signal to use custom OpenGL calls
which will then stack visually beneath the QML scene.
\li Items that have specified QSGNode::UsePreprocess, will have their
QSGNode::preprocess() function invoked.
\li The renderer processes the nodes and calls OpenGL functions.
\li The QQuickWindow::afterRendering() signal is
emitted. Applications can make direct connections
(using Qt::DirectConnection) to this signal to use custom OpenGL calls
which will then stack visually over the QML scene.
\li The rendered frame is swapped and QQuickWindow::frameSwapped()
is emitted.
\endlist
\li While the render thread is rendering, the GUI is free to advance
animations, process events, etc.
\endlist
The threaded renderer is currently used by default on Windows with
opengl32.dll, Linux with non-Mesa based drivers, \macos, mobile
platforms, and Embedded Linux with EGLFS but this is subject to
change. It is possible to force use of the threaded renderer by
setting \c {QSG_RENDER_LOOP=threaded} in the environment.
\section2 Non-threaded Render Loops ("basic" and "windows")
The non-threaded render loop is currently used by default on Windows
with ANGLE or a non-default opengl32 implementation and Linux with
Mesa drivers. For the latter this is mostly a precautionary measure,
as not all combinations of OpenGL drivers and windowing systems have
been tested. At the same time implementations like ANGLE or Mesa
llvmpipe are not able to function properly with threaded rendering at
all so not using threaded rendering is essential for these.
By default \c windows is used for non-threaded rendering on Windows
with ANGLE, while \c basic is used for all other platforms when
non-threaded rendering is needed.
Even when using the non-threaded render loop, you should write your
code as if you are using the threaded renderer, as failing to do so
will make the code non-portable.
The following is a simplified illustration of the frame rendering
sequence in the non-threaded renderer.
\image sg-renderloop-singlethreaded.jpg
\section2 Custom control over rendering with QQuickRenderControl
When using QQuickRenderControl, the responsibility for driving the
rendering loop is transferred to the application. In this case no
built-in render loop is used. Instead, it is up to the application to
invoke the polish, synchronize and rendering steps at the appropriate
time. It is possible to implement either a threaded or non-threaded
behavior similar to the ones shown above.
\section2 Mixing Scene Graph and OpenGL
The scene graph offers two methods for integrating OpenGL content:
by calling OpenGL commands directly and by creating a textured node
in the scene graph.
By connecting to the \l QQuickWindow::beforeRendering() and \l
QQuickWindow::afterRendering() signals, applications can make OpenGL
calls directly into the same context as the scene graph is rendering
to. As the signal names indicate, the user can then render OpenGL
content either under a Qt Quick scene or over it. The benefit of
integrating in this manner is that no extra framebuffer nor memory is
needed to perform the rendering. The downside is that Qt Quick decides
when to call the signals and this is the only time the OpenGL
application is allowed to draw.
The \l {Scene Graph - OpenGL Under QML} example gives an example on
how to use these signals.
The other alternative is to create a QQuickFramebufferObject, render
into it, and let it be displayed in the scene graph as a texture.
The \l {Scene Graph - Rendering FBOs} example shows how this can be
done. It is also possible to combine multiple rendering contexts and
multiple threads to create content to be displayed in the scene graph.
The \l {Scene Graph - Rendering FBOs in a thread} examples show how
this can be done.
\warning When mixing OpenGL content with scene graph rendering, it is
important the application does not leave the OpenGL context in a state
with buffers bound, attributes enabled, special values in the z-buffer
or stencil-buffer or similar. Doing so can result in unpredictable
behavior.
\warning The OpenGL rendering code must be thread aware, as the
rendering might be happening outside the GUI thread.
\section2 Custom Items using QPainter
The QQuickItem provides a subclass, QQuickPaintedItem, which allows
the users to render content using QPainter.
\warning Using QQuickPaintedItem uses an indirect 2D surface to render
its content, either using software rasterization or using an OpenGL
framebuffer object (FBO), so the rendering is a two-step
operation. First rasterize the surface, then draw the surface. Using
scene graph API directly is always significantly faster.
\section1 Logging Support
The scene graph has support for a number of logging categories. These
can be useful in tracking down both performance issues and bugs in
addition to being helpful to Qt contributors.
\list
\li \c {qt.scenegraph.time.texture} - logs the time spent doing texture uploads
\li \c {qt.scenegraph.time.compilation} - logs the time spent doing shader compilation
\li \c {qt.scenegraph.time.renderer} - logs the time spent in the various steps of the renderer
\li \c {qt.scenegraph.time.renderloop} - logs the time spent in the various steps of the render loop
\li \c {qt.scenegraph.time.glyph} - logs the time spent preparing distance field glyphs
\li \c {qt.scenegraph.general} - logs general information about various parts of the scene graph and the graphics stack
\li \c {qt.scenegraph.renderloop} - creates a detailed log of the various stages involved in rendering. This log mode is primarily useful for developers working on Qt.
\endlist
\section1 Scene Graph Backend
In addition to the public API, the scene graph has an adaptation layer
which opens up the implementation to do hardware specific
adaptations. This is an undocumented, internal and private plugin API,
which lets hardware adaptation teams make the most of their hardware.
It includes:
\list
\li Custom textures; specifically the implementation of
QQuickWindow::createTextureFromImage and the internal representation
of the texture used by \l Image and \l BorderImage types.
\li Custom renderer; the adaptation layer lets the plugin decide how
the scene graph is traversed and rendered, making it possible to
optimize the rendering algorithm for a specific hardware or to make
use of extensions which improve performance.
\li Custom scene graph implementation of many of the default QML
types, including its text and font rendering.
\li Custom animation driver; allows the animation system to hook
into the low-level display vertical refresh to get smooth rendering.
\li Custom render loop; allows better control over how QML deals
with multiple windows.
\endlist
*/
/*!
\title Qt Quick Scene Graph Renderer
\page qtquick-visualcanvas-scenegraph-renderer.html
This document explains how the scene graph renderer works internally
so that one can write code that uses it in an optimal fashion, both
performance-wise and feature-wise.
One does not need to understand the internals of the renderer to get
good performance. However, it might help when integrating with the
scene graph or to figure out why it is not possible to squeeze the
maximum efficiency out of the graphics chip.
\note Even in the case where every frame is unique and everything is
uploaded from scratch, the default renderer will perform well.
The Qt Quick items in a QML scene populates a tree of QSGNode
instances. Once created, this tree is a complete description of how
a certain frame should be rendered. It does not contain any
references back to the Qt Quick items at all and will on most
platforms be processed and rendered in a separate thread. The
renderer is a self contained part of the scene graph which traverses
the QSGNode tree and uses geometry defined in QSGGeometryNode and
shader state defined in QSGMaterial to schedule OpenGL state change
and draw calls.
If needed, the renderer can be completely replaced using the
internal scene graph back-end API. This is mostly interesting for
platform vendors who wish to take advantage of non-standard hardware
features. For majority of use cases, the default renderer will be
sufficient.
The default renderer focuses on two primary strategies to optimize
the rendering. Batching of draw calls and retention of geometry on
the GPU.
\section1 Batching
Where a traditional 2D API, such as QPainter, Cairo or Context2D, is
written to handle thousands of individual draw calls per frame,
OpenGL is a pure hardware API and performs best when the number of
draw calls is very low and state changes are kept to a
minimum. Consider the following use case:
\image visualcanvas_list.png
The simplest way of drawing this list is on a cell-by-cell basis. First
the background is drawn. This is a rectangle of a specific color. In
OpenGL terms this means selecting a shader program to do solid color
fills, setting up the fill color, setting the transformation matrix
containing the x and y offsets and then using for instance
\c glDrawArrays to draw two triangles making up the rectangle. The icon
is drawn next. In OpenGL terms this means selecting a shader program
to draw textures, selecting the active texture to use, setting the
transformation matrix, enabling alpha-blending and then using for
instance \c glDrawArrays to draw the two triangles making up the
bounding rectangle of the icon. The text and separator line between
cells follow a similar pattern. And this process is repeated for
every cell in the list, so for a longer list, the overhead imposed
by OpenGL state changes and draw calls completely outweighs the
benefit that using a hardware accelerated API could provide.
When each primitive is large, this overhead is negligible, but in
the case of a typical UI, there are many small items which add up to
a considerable overhead.
The default scene graph renderer works within these
limitations and will try to merge individual primitives together
into batches while preserving the exact same visual result. The
result is fewer OpenGL state changes and a minimal amount of draw
calls, resulting in optimal performance.
\section2 Opaque Primitives
The renderer separates between opaque primitives and primitives
which require alpha blending. By using OpenGL's Z-buffer and giving
each primitive a unique z position, the renderer can freely reorder
opaque primitives without any regard for their location on screen
and which other elements they overlap with. By looking at each
primitive's material state, the renderer will create opaque
batches. From Qt Quick core item set, this includes Rectangle items
with opaque colors and fully opaque images, such as JPEGs or BMPs.
Another benefit of using opaque primitives, is that opaque
primitives does not require \c GL_BLEND to be enabled which can be
quite costly, especially on mobile and embedded GPUs.
Opaque primitives are rendered in a front-to-back manner with
\c glDepthMask and \c GL_DEPTH_TEST enabled. On GPUs that internally do
early-z checks, this means that the fragment shader does not need to
run for pixels or blocks of pixels that are obscured. Beware that
the renderer still needs to take these nodes into account and the
vertex shader is still run for every vertex in these primitives, so
if the application knows that something is fully obscured, the best
thing to do is to explicitly hide it using Item::visible or
Item::opacity.
\note The Item::z is used to control an Item's stacking order
relative to its siblings. It has no direct relation to the renderer and
OpenGL's Z-buffer.
\section2 Alpha Blended Primitives
Once opaque primitives have been drawn, the renderer will disable
\c glDepthMask, enable \c GL_BLEND and render all alpha blended primitives
in a back-to-front manner.
Batching of alpha blended primitives requires a bit more effort in
the renderer as elements that are overlapping need to be rendered in
the correct order for alpha blending to look correct. Relying on the
Z-buffer alone is not enough. The renderer does a pass over all
alpha blended primitives and will look at their bounding rect in
addition to their material state to figure out which elements can be
batched and which can not.
\image visualcanvas_overlap.png
In the left-most case, the blue backgrounds can be drawn in one call
and the two text elements in another call, as the texts only overlap
a background which they are stacked in front of. In the right-most
case, the background of "Item 4" overlaps the text of "Item 3" so in
this case, each of backgrounds and texts need to be drawn using
separate calls.
Z-wise, the alpha primitives are interleaved with the opaque nodes
and may trigger early-z when available, but again, setting
Item::visible to false is always faster.
\section2 Mixing with 3D Primitives
The scene graph can support pseudo 3D and proper 3D primitives. For
instance, one can implement a "page curl" effect using a
ShaderEffect or implement a bumpmapped torus using QSGGeometry and a
custom material. While doing so, one needs to take into account that
the default renderer already makes use of the depth buffer.
The renderer modifies the vertex shader returned from
QSGMaterialShader::vertexShader() and compresses the z values of the
vertex after the model-view and projection matrices has been applied
and then adds a small translation on the z to position it the
correct z position.
The compression assumes that the z values are in the range of 0 to
1.
\section2 Texture Atlas
The active texture is a unique OpenGL state, which means that
multiple primitives using different OpenGL textures cannot be
batched. The Qt Quick scene graph for this reason allows multiple
QSGTexture instances to be allocated as smaller sub-regions of a
larger texture; a texture atlas.
The biggest benefit of texture atlases is that multiple QSGTexture
instances now refer to the same OpenGL texture instance. This makes
it possible to batch textured draw calls as well, such as Image
items, BorderImage items, ShaderEffect items and also C++ types such
as QSGSimpleTextureNode and custom QSGGeometryNodes using textures.
\note Large textures do not go into the texture atlas.
Atlas based textures are created by passing
QQuickWindow::TextureCanUseAtlas to the
QQuickWindow::createTextureFromImage().
\note Atlas based textures do not have texture coordinates ranging
from 0 to 1. Use QSGTexture::normalizedTextureSubRect() to get the
atlas texture coordinates.
The scene graph uses heuristics to figure out how large the atlas
should be and what the size threshold for being entered into the
atlas is. If different values are needed, it is possible to override
them using the environment variables \c {QSG_ATLAS_WIDTH=[width]},
\c {QSG_ATLAS_HEIGHT=[height]} and \c
{QSG_ATLAS_SIZE_LIMIT=[size]}. Changing these values will mostly be
interesting for platform vendors.
\section1 Batch Roots
In addition to merging compatible primitives into batches, the
default renderer also tries to minimize the amount of data that
needs to be sent to the GPU for every frame. The default renderer
identifies subtrees which belong together and tries to put these
into separate batches. Once batches are identified, they are merged,
uploaded and stored in GPU memory, using Vertex Buffer Objects.
\section2 Transform Nodes
Each Qt Quick Item inserts a QSGTransformNode into the scene graph
tree to manage its x, y, scale or rotation. Child items will be
populated under this transform node. The default renderer tracks
the state of transform nodes between frames, and will look at
subtrees to decide if a transform node is a good candidate to become
a root for a set of batches. A transform node which changes between
frames and which has a fairly complex subtree, can become a batch
root.
QSGGeometryNodes in the subtree of a batch root are pre-transformed
relative to the root on the CPU. They are then uploaded and retained
on the GPU. When the transform changes, the renderer only needs to
update the matrix of the root, not each individual item, making list
and grid scrolling very fast. For successive frames, as long as
nodes are not being added or removed, rendering the list is
effectively for free. When new content enters the subtree, the batch
that gets it is rebuilt, but this is still relatively fast. There are
usually several unchanging frames for every frame with added or
removed nodes when panning through a grid or list.
Another benefit of identifying transform nodes as batch roots is
that it allows the renderer to retain the parts of the tree that has
not changed. For instance, say a UI consists of a list and a button
row. When the list is being scrolled and delegates are being added
and removed, the rest of the UI, the button row, is unchanged and
can be drawn using the geometry already stored on the GPU.
The node and vertex threshold for a transform node to become a batch
root can be overridden using the environment variables \c
{QSG_RENDERER_BATCH_NODE_THRESHOLD=[count]} and \c
{QSG_RENDERER_BATCH_VERTEX_THRESHOLD=[count]}. Overriding these flags
will be mostly useful for platform vendors.
\note Beneath a batch root, one batch is created for each unique
set of material state and geometry type.
\section2 Clipping
When setting Item::clip to true, it will create a QSGClipNode with a
rectangle in its geometry. The default renderer will apply this clip
by using scissoring in OpenGL. If the item is rotated by a
non-90-degree angle, the OpenGL's stencil buffer is used. Qt Quick
Item only supports setting a rectangle as clip through QML, but the
scene graph API and the default renderer can use any shape for
clipping.
When applying a clip to a subtree, that subtree needs to be rendered
with a unique OpenGL state. This means that when Item::clip is true,
batching of that item is limited to its children. When there are
many children, like a ListView or GridView, or complex children,
like a TextArea, this is fine. One should, however, use clip on
smaller items with caution as it prevents batching. This includes
button label, text field or list delegate and table cells.
\section2 Vertex Buffers
Each batch uses a vertex buffer object (VBO) to store its data on
the GPU. This vertex buffer is retained between frames and updated
when the part of the scene graph that it represents changes.
By default, the renderer will upload data into the VBO using
\c GL_STATIC_DRAW. It is possible to select different upload strategy
by setting the environment variable \c
{QSG_RENDERER_BUFFER_STRATEGY=[strategy]}. Valid values are \c
stream and \c dynamic. Changing this value is mostly useful for
platform vendors.
\section1 Antialiasing
The scene graph supports two types of antialiasing. By default, primitives
such as rectangles and images will be antialiased by adding more
vertices along the edge of the primitives so that the edges fade
to transparent. We call this method \e {vertex antialiasing}. If the
user requests a multisampled OpenGL context, by setting a QSurfaceFormat
with samples greater than \c 0 using QQuickWindow::setFormat(), the
scene graph will prefer multisample based antialiasing (MSAA).
The two techniques will affect how the rendering happens internally
and have different limitations.
It is also possible to override the antialiasing method used by
setting the environment variable \c {QSG_ANTIALIASING_METHOD}
to either \c vertex or \c {msaa}.
Vertex antialiasing can produce seams between edges of adjacent
primitives, even when the two edges are mathmatically the same.
Multisample antialiasing does not.
\section2 Vertex Antialiasing
Vertex antialiasing can be enabled and disabled on a per-item basis
using the Item::antialiasing property. It will work regardless of
what the underlying hardware supports and produces higher quality
antialiasing, both for normally rendered primitives and also for
primitives captured into framebuffer objects, for instance using
the ShaderEffectSource type.
The downside to using vertex antialiasing is that each primitive
with antialiasing enabled will have to be blended. In terms of
batching, this means that the renderer needs to do more work to
figure out if the primitive can be batched or not and due to overlaps
with other elements in the scene, it may also result in less batching,
which could impact performance.
On low-end hardware blending can also be quite expensive so for an
image or rounded rectangle that covers most of the screen, the amount
of blending needed for the interior of these primitives can result
in significant performance loss as the entire primitive must be blended.
\section2 Multisample Antialiasing
Multisample antialiasing is a hardware feature where the hardware
calculates a coverage value per pixel in the primitive. Some hardware
can multisample at a very low cost, while other hardware may
need both more memory and more GPU cycles to render a frame.
Using multisample antialiasing, many primitives, such as rounded
rectangles and image elements can be antialiased and still be
\e opaque in the scene graph. This means the renderer has an easier
job when creating batches and can rely on early-z to avoid overdraw.
When multisample antialiasing is used, content rendered into
framebuffer objects, need additional extensions to support multisampling
of framebuffers. Typically \c GL_EXT_framebuffer_multisample and
\c GL_EXT_framebuffer_blit. Most desktop chips have these extensions
present, but they are less common in embedded chips. When framebuffer
multisampling is not available in the hardware, content rendered into
framebuffer objects will not be antialiased, including the content of
a ShaderEffectSource.
\section1 Performance
As stated in the beginning, understanding the finer details of the
renderer is not required to get good performance. It is written to
optimize for common use cases and will perform quite well under
almost any circumstance.
\list
\li Good performance comes from effective batching, with as little
as possible of the geometry being uploaded again and again. By
setting the environment variable \c {QSG_RENDERER_DEBUG=render}, the
renderer will output statistics on how well the batching goes, how
many batches, which batches are retained and which are opaque and
not. When striving for optimal performance, uploads should happen
only when really needed, batches should be fewer than 10 and at
least 3-4 of them should be opaque.
\li The default renderer does not do any CPU-side viewport clipping
nor occlusion detection. If something is not supposed to be visible,
it should not be shown. Use \c {Item::visible: false} for items that
should not be drawn. The primary reason for not adding such logic is
that it adds additional cost which would also hurt applications that
took care in behaving well.
\li Make sure the texture atlas is used. The Image and BorderImage
items will use it unless the image is too large. For textures
created in C++, pass QQuickWindow::TextureCanUseAtlas when
calling QQuickWindow::createTexture().
By setting the environment variable \c {QSG_ATLAS_OVERLAY} all atlas
textures will be colorized so they are easily identifiable in the
application.
\li Use opaque primitives where possible. Opaque primitives are
faster to process in the renderer and faster to draw on the GPU. For
instance, PNG files will often have an alpha channel, even though
each pixel is fully opaque. JPG files are always opaque. When
providing images to a QQuickImageProvider or creating images with
QQuickWindow::createTextureFromImage(), let the image have
QImage::Format_RGB32, when possible.
\li Be aware of that overlapping compond items, like in the
illustration above, can not be batched.
\li Clipping breaks batching. Never use on a per-item basis, inside
tables cells, item delegates or similar. Instead of clipping text,
use eliding. Instead of clipping an image, create a
QQuickImageProvider that returns a cropped image.
\li Batching only works for 16-bit indices. All built-in items use
16-bit indices, but custom geometry is free to also use 32-bit
indices.
\li Some material flags prevent batching, the most limiting one
being QSGMaterial::RequiresFullMatrix which prevents all batching.
\li Applications with a monochrome background should set it using
QQuickWindow::setColor() rather than using a top-level Rectangle item.
QQuickWindow::setColor() will be used in a call to \c glClear(),
which is potentially faster.
\li Mipmapped Image items are not placed in global atlas and will
not be batched.
\endlist
If an application performs poorly, make sure that rendering is
actually the bottleneck. Use a profiler! The environment variable \c
{QSG_RENDER_TIMING=1} will output a number of useful timing
parameters which can be useful in pinpointing where a problem lies.
\section1 Visualizing
To visualize the various aspects of the scene graph's default renderer, the
\c QSG_VISUALIZE environment variable can be set to one of the values
detailed in each section below. We provide examples of the output of
some of the variables using the following QML code:
\code
import QtQuick 2.2
Rectangle {
width: 200
height: 140
ListView {
id: clippedList
x: 20
y: 20
width: 70
height: 100
clip: true
model: ["Item A", "Item B", "Item C", "Item D"]
delegate: Rectangle {
color: "lightblue"
width: parent.width
height: 25
Text {
text: modelData
anchors.fill: parent
horizontalAlignment: Text.AlignHCenter
verticalAlignment: Text.AlignVCenter
}
}
}
ListView {
id: clippedDelegateList
x: clippedList.x + clippedList.width + 20
y: 20
width: 70
height: 100
clip: true
model: ["Item A", "Item B", "Item C", "Item D"]
delegate: Rectangle {
color: "lightblue"
width: parent.width
height: 25
clip: true
Text {
text: modelData
anchors.fill: parent
horizontalAlignment: Text.AlignHCenter
verticalAlignment: Text.AlignVCenter
}
}
}
}
\endcode
For the ListView on the left, we set its \l {Item::clip}{clip} property to
\c true. For the ListView on right, we also set each delegate's
\l {Item::clip}{clip} property to \c true to illustrate the effects of
clipping on batching.
\image visualize-original.png "Original"
Original
\note The visualized elements do not respect clipping, and rendering order is
arbitrary.
\section2 Visualizing Batches
Setting \c QSG_VISUALIZE to \c batches visualizes batches in the renderer.
Merged batches are drawn with a solid color and unmerged batches are drawn
with a diagonal line pattern. Few unique colors means good batching.
Unmerged batches are bad if they contain many individual nodes.
\image visualize-batches.png "batches"
\c QSG_VISUALIZE=batches
\section2 Visualizing Clipping
Setting \c QSG_VISUALIZE to \c clip draws red areas on top of the scene
to indicate clipping. As Qt Quick Items do not clip by default, no clipping
is usually visualized.
\image visualize-clip.png
\c QSG_VISUALIZE=clip
\section2 Visualizing Changes
Setting \c QSG_VISUALIZE to \c changes visualizes changes in the renderer.
Changes in the scenegraph are visualized with a flashing overlay of a random
color. Changes on a primitive are visualized with a solid color, while
changes in an ancestor, such as matrix or opacity changes, are visualized
with a pattern.
\section2 Visualizing Overdraw
Setting \c QSG_VISUALIZE to \c overdraw visualizes overdraw in the renderer.
Visualize all items in 3D to highlight overdraws. This mode can also be used
to detect geometry outside the viewport to some extent. Opaque items are
rendered with a green tint, while translucent items are rendered with a red
tint. The bounding box for the viewport is rendered in blue. Opaque content
is easier for the scenegraph to process and is usually faster to render.
Note that the root rectangle in the code above is superfluous as the window
is also white, so drawing the rectangle is a waste of resources in this case.
Changing it to an Item can give a slight performance boost.
\image visualize-overdraw-1.png "overdraw-1"
\image visualize-overdraw-2.png "overdraw-2"
\c QSG_VISUALIZE=overdraw
*/
|