aboutsummaryrefslogtreecommitdiffstats
path: root/src
diff options
context:
space:
mode:
authorLaszlo Agocs <laszlo.agocs@qt.io>2021-08-31 18:27:50 +0200
committerLaszlo Agocs <laszlo.agocs@qt.io>2021-09-10 21:14:13 +0200
commit84f30bf17170769795ddd8dbe1e7008ddbef2f6a (patch)
treea8a3dadfcc44f55b5d37e308f107c5ee04530673 /src
parent4cfb323389363bf12f84044b8c7501ef2f0441b9 (diff)
Remove the per-rendercontext srb cache
Remove the srbCache which is effectively a map of unbounded size. Depending on scene structure and changes to the scene that trigger rebuilding the batching structures of the scenegraph, it can grow unexpectedly large, potentially growing continuously on every frame, with animated scenes that manage to trigger the "right" kind of changes to the scene. Other similar structures, such as the graphics pipeline or the sampler tables have a known maximum size, because we know that there can only be a certain number of combinations in the values that form the keys to the map. Whereas with QRhiShaderResourceBindings objects the key is the full binding list, including the QRhiBuffer/Texture/Sampler pointers. This becomes complex pretty quickly when combined with the internal mechanisms of the scenegraph's batching system and the lifecycle of these QRhiResource objects. For instance, uniform buffers live in Batch, which is a pooled, (semi-)transient object: changes to the scene that lead to full or partial rebuild will generate a new set of Batches, meaning a QSGGeometryNode can be associated with a different QRhiBuffer afterwards (not that we create new buffers all the time, they are reused, but they can be associated with different nodes over their lifetime). Which then means that to render the node, we now need a different srb as well (that references the correct QRhiBuffer). Depending on the scene and how it changes over time, this may lead to an explosive number of combinations, thus a "continuously" growing srbCache map, inserting more and more new entries, while leaving perhaps-unused entries in there for ever. Trimming is far from trivial as the costs of managing the purging of unused entries is likely more expensive than not having the "cache" in the first place. Thus, drop the whole data structure and give ownership of the srb to Element (which is the object in the shadow tree for each QSGGeometryNode). To not completely degrade performance, this needs some new enablers in QRhi, namely the ability to have a fast path for replacing the QRhiResources (buffers, textures, samplers) in the binding list of an already baked srb without having to go through full rebaking. (e.g. with Vulkan, we want to reuse all the existing descriptor set layouts and the descriptor sets themselves, it's just the descriptors themselves that need new values written) So now, instead of looking up in a QHash, we first compare the full binding list (this is what one gets with the QHash when there's a hit anyway: at minimum the binding list is hashed and then compared). If this matches then fine (and we saved the time spent on hashing even). Then, if there's a mismatch, we now do a layout-only comparison. If this matches, then we use the new enablers in QRhi to do a "light update" (different resources, but same binding points, types, etc.) to the srb. Finally, if there was a mismatch here too, we go through create() to do a full rebuild. Therefore, the performance effects of this patch depend heavily on the scene. Scenes that do not do changes that trigger re-batching often may see improvements even. Others are assumed to have small or negligible consequences with regards to performance (due to now spending time on a layout-compatible comparison of binding lists which was not there before). In return, we do not need to have the srbCache QHash at all, leading to, with scenes exhibiting the relevant patterns, significantly reduced memory usage. Once srbCache is gone, the following issues need to be handled: First, now that the srb ownership stays with the shadow node (Element), instead of guaranteed to stay around for ever in an srb cache data structure, throwing references into the keys used to look up pipelines is problematic (and was never nice to begin with). Get rid of this, building on the new QRhi API that gives us a list of uints that can be used to test for layout compatibility. The container is implicitly shared, which makes the switch quite simple and efficient. Second, all this does not mean that some sort of reuse of srb objects is not needed. Consider scrolling a list view. Nodes (and so shadow nodes, i.e. Element objects) come and go in this case. Creating full new srb objects for the Elements for the list items becoming visible is not ideal. (even if only really dubious with Vulkan, where there are true Vulkan API objects underneath) To facilitate reuse, reintroduce an srb pool in ShaderManager, but this time with different semantics: this stores unused srb objects, with the serialized layout description as the key. When a new Element needs an srb, we try to pick a (layout compatible) srb from this pool. If there is a match, the Element gets the srb with ownership, and the srb is removed from the pool. When an Element is about to go away, its srb may be "moved" to the pool when the pool is not too large already. The threshold for being too large is currently simply 1024 elements (in the QMultiHash). This can be overridden with an environment variable, although that should pretty much be never needed in practice. Pick-to: 6.2 Fixes: QTBUG-96130 Change-Id: Ie5a49beeb6dea0db6a81fdceebf39374ad192b0e Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org> Reviewed-by: Christian Strømme <christian.stromme@qt.io>
Diffstat (limited to 'src')
-rw-r--r--src/quick/scenegraph/coreapi/qsgbatchrenderer.cpp202
-rw-r--r--src/quick/scenegraph/coreapi/qsgbatchrenderer_p.h15
2 files changed, 147 insertions, 70 deletions
diff --git a/src/quick/scenegraph/coreapi/qsgbatchrenderer.cpp b/src/quick/scenegraph/coreapi/qsgbatchrenderer.cpp
index 588210aeff..d619b887c8 100644
--- a/src/quick/scenegraph/coreapi/qsgbatchrenderer.cpp
+++ b/src/quick/scenegraph/coreapi/qsgbatchrenderer.cpp
@@ -299,11 +299,11 @@ void ShaderManager::invalidated()
qDeleteAll(rewrittenShaders);
rewrittenShaders.clear();
- qDeleteAll(srbCache);
- srbCache.clear();
-
qDeleteAll(pipelineCache);
pipelineCache.clear();
+
+ qDeleteAll(srbPool);
+ srbPool.clear();
}
void ShaderManager::clearCachedRendererData()
@@ -324,24 +324,6 @@ void ShaderManager::clearCachedRendererData()
}
}
-QRhiShaderResourceBindings *ShaderManager::srb(const ShaderResourceBindingList &bindings)
-{
- auto it = srbCache.constFind(bindings);
- if (it != srbCache.constEnd())
- return *it;
-
- QRhiShaderResourceBindings *srb = context->rhi()->newShaderResourceBindings();
- srb->setBindings(bindings.cbegin(), bindings.cend());
- if (srb->create()) {
- srbCache.insert(bindings, srb);
- } else {
- qWarning("Failed to build srb");
- delete srb;
- srb = nullptr;
- }
- return srb;
-}
-
void qsg_dumpShadowRoots(BatchRootInfo *i, int indent)
{
static int extraIndent = 0;
@@ -928,10 +910,11 @@ Renderer::Renderer(QSGDefaultRenderContext *ctx, QSGRendererInterface::RenderMod
m_batchNodeThreshold = qt_sg_envInt("QSG_RENDERER_BATCH_NODE_THRESHOLD", 64);
m_batchVertexThreshold = qt_sg_envInt("QSG_RENDERER_BATCH_VERTEX_THRESHOLD", 1024);
+ m_srbPoolThreshold = qt_sg_envInt("QSG_RENDERER_SRB_POOL_THRESHOLD", 1024);
if (Q_UNLIKELY(debug_build() || debug_render())) {
- qDebug("Batch thresholds: nodes: %d vertices: %d",
- m_batchNodeThreshold, m_batchVertexThreshold);
+ qDebug("Batch thresholds: nodes: %d vertices: %d Srb pool threshold: %d",
+ m_batchNodeThreshold, m_batchVertexThreshold, m_srbPoolThreshold);
}
}
@@ -980,13 +963,8 @@ Renderer::~Renderer()
m_nodeAllocator.release(n);
// Remaining elements...
- for (int i=0; i<m_elementsToDelete.size(); ++i) {
- Element *e = m_elementsToDelete.at(i);
- if (e->isRenderNode)
- delete static_cast<RenderNodeElement *>(e);
- else
- m_elementAllocator.release(e);
- }
+ for (int i=0; i<m_elementsToDelete.size(); ++i)
+ releaseElement(m_elementsToDelete.at(i), true);
destroyGraphicsResources();
@@ -2629,10 +2607,24 @@ static inline bool needsBlendConstant(QRhiGraphicsPipeline::BlendFactor f)
bool Renderer::ensurePipelineState(Element *e, const ShaderManager::Shader *sms, bool depthPostPass)
{
- // In unmerged batches the srbs in the elements are all compatible
- // layout-wise. Note the key's == and qHash implementations: the rp desc and
- // srb are tested for (layout) compatibility, not pointer equality.
- const GraphicsPipelineStateKey k { m_gstate, sms, renderPassDescriptor(), e->srb };
+ // Note the key's == and qHash implementations: the renderpass descriptor is
+ // tested for compatibility, not pointer equality.
+ //
+ // We do not store the srb pointer itself because the ownership stays with
+ // the Element and that can go away more often that we would like it
+ // to. (think scrolling a list view, constantly dropping and creating new
+ // nodes) Rather, use an opaque blob of a few uints and store and compare
+ // that. This works because once the pipeline is built, we will always call
+ // setShaderResources with an explicitly specified srb which is fine even if
+ // e->srb we used here to bake the pipeline is already gone by that point.
+ //
+ // A typical QSGMaterial's serialized srb layout is 8 uints. (uniform buffer
+ // + texture, 4 fields each) Regardless, using an implicitly shared
+ // container is essential here. (won't detach so no more allocs and copies
+ // are done, unless the Element decides to rebake the srb with a different
+ // layout - but then the detach is exactly what we need)
+
+ const GraphicsPipelineStateKey k { m_gstate, sms, renderPassDescriptor(), e->srb->serializedLayoutDescription() };
// Note: dynamic state (viewport rect, scissor rect, stencil ref, blend
// constant) is never a part of GraphicsState/QRhiGraphicsPipeline.
@@ -2826,8 +2818,8 @@ static void materialToRendererGraphicsState(GraphicsState *dst,
void Renderer::updateMaterialDynamicData(ShaderManager::Shader *sms,
QSGMaterialShader::RenderState &renderState,
QSGMaterial *material,
- ShaderManager::ShaderResourceBindingList *bindings,
const Batch *batch,
+ Element *e,
int ubufOffset,
int ubufRegionSize)
{
@@ -2835,6 +2827,8 @@ void Renderer::updateMaterialDynamicData(ShaderManager::Shader *sms,
QSGMaterialShader *shader = sms->programRhi.program;
QSGMaterialShaderPrivate *pd = QSGMaterialShaderPrivate::get(shader);
+ QVarLengthArray<QRhiShaderResourceBinding, 8> bindings;
+
if (pd->ubufBinding >= 0) {
m_current_uniform_data = &pd->masterUniformData;
const bool changed = shader->updateUniformData(renderState, material, m_currentMaterial);
@@ -2843,11 +2837,11 @@ void Renderer::updateMaterialDynamicData(ShaderManager::Shader *sms,
if (changed || !batch->ubufDataValid)
m_resourceUpdates->updateDynamicBuffer(batch->ubuf, ubufOffset, ubufRegionSize, pd->masterUniformData.constData());
- bindings->append(QRhiShaderResourceBinding::uniformBuffer(pd->ubufBinding,
- pd->ubufStages,
- batch->ubuf,
- ubufOffset,
- ubufRegionSize));
+ bindings.append(QRhiShaderResourceBinding::uniformBuffer(pd->ubufBinding,
+ pd->ubufStages,
+ batch->ubuf,
+ ubufOffset,
+ ubufRegionSize));
}
for (int binding = 0; binding < QSGMaterialShaderPrivate::MAX_SHADER_RESOURCE_BINDINGS; ++binding) {
@@ -2901,17 +2895,92 @@ void Renderer::updateMaterialDynamicData(ShaderManager::Shader *sms,
if (!texture)
texture = dummyTexture();
QRhiSampler *sampler = pd->samplerBindingTable[binding];
- bindings->append(QRhiShaderResourceBinding::sampledTexture(binding,
- stages,
- texture,
- sampler));
+ bindings.append(QRhiShaderResourceBinding::sampledTexture(binding,
+ stages,
+ texture,
+ sampler));
}
}
#ifndef QT_NO_DEBUG
- if (bindings->isEmpty())
+ if (bindings.isEmpty())
qWarning("No shader resources for material %p, this is odd.", material);
#endif
+
+ enum class SrbAction {
+ Unknown,
+ DoNothing,
+ UpdateResources,
+ Rebake
+ } srbAction = SrbAction::Unknown;
+
+ // First, if the Element has no srb created at all, then try to find an existing,
+ // currently unused srb that is layout-compatible with our binding list.
+ if (!e->srb) {
+ // reuse a QVector as our work area, thus possibly reusing the underlying allocation too
+ QVector<quint32> &layoutDesc(m_shaderManager->srbLayoutDescSerializeWorkspace);
+ layoutDesc.clear();
+ QRhiShaderResourceBinding::serializeLayoutDescription(bindings.cbegin(), bindings.cend(), std::back_inserter(layoutDesc));
+ e->srb = m_shaderManager->srbPool.take(layoutDesc);
+ if (e->srb) {
+ // Here we know layout compatibility is satisfied, but do not spend time on full
+ // comparison. The chance of getting an srb that refers to the same resources
+ // (buffer, textures) is low in practice. So reuse, but write new resources.
+ srbAction = SrbAction::UpdateResources;
+ }
+ }
+
+ // If the Element had an existing srb, investigate:
+ // - It may be used as-is (when nothing changed in the scene regarding this node compared to the previous frame).
+ // - Otherwise it may be able to go with a lightweight update (replace resources, binding list layout is the same).
+ // - If all else fails rebake the full thing, meaning we reuse the memory allocation but will recreate everything underneath.
+ if (srbAction == SrbAction::Unknown && e->srb) {
+ if (std::equal(e->srb->cbeginBindings(), e->srb->cendBindings(), bindings.cbegin(), bindings.cend())) {
+ srbAction = SrbAction::DoNothing;
+ } else if (std::equal(e->srb->cbeginBindings(), e->srb->cendBindings(), bindings.cbegin(), bindings.cend(),
+ [](const auto &a, const auto &b) { return a.isLayoutCompatible(b); }))
+ {
+ srbAction = SrbAction::UpdateResources;
+ } else {
+ srbAction = SrbAction::Rebake;
+ }
+ }
+
+ // If the Element had no srb associated at all and could not find a layout-compatible
+ // one from the pool, then create a whole new object.
+ if (!e->srb) {
+ e->srb = m_rhi->newShaderResourceBindings();
+ srbAction = SrbAction::Rebake;
+ }
+
+ Q_ASSERT(srbAction != SrbAction::Unknown && e->srb);
+
+ switch (srbAction) {
+ case SrbAction::DoNothing:
+ break;
+ case SrbAction::UpdateResources:
+ {
+ e->srb->setBindings(bindings.cbegin(), bindings.cend());
+ QRhiShaderResourceBindings::UpdateFlags flags;
+ // Due to the way the binding list is built up above, if we have a uniform buffer
+ // at binding point 0 (or none at all) then the sampledTexture bindings are added
+ // with increasing binding points afterwards, so the list is already sorted based
+ // on the binding points, thus we can save some time by telling the QRhi backend
+ // not to sort again.
+ if (pd->ubufBinding <= 0 || bindings.count() <= 1)
+ flags |= QRhiShaderResourceBindings::BindingsAreSorted;
+
+ e->srb->updateResources(flags);
+ }
+ break;
+ case SrbAction::Rebake:
+ e->srb->setBindings(bindings.cbegin(), bindings.cend());
+ if (!e->srb->create())
+ qWarning("Failed to build srb");
+ break;
+ default:
+ Q_ASSERT_X(false, "updateMaterialDynamicData", "No srb action set, this cannot happen");
+ }
}
void Renderer::updateMaterialStaticData(ShaderManager::Shader *sms,
@@ -3029,8 +3098,7 @@ bool Renderer::prepareRenderMergedBatch(Batch *batch, PreparedRenderBatch *rende
bool pendingGStatePop = false;
updateMaterialStaticData(sms, renderState, material, batch, &pendingGStatePop);
- ShaderManager::ShaderResourceBindingList bindings;
- updateMaterialDynamicData(sms, renderState, material, &bindings, batch, 0, ubufSize);
+ updateMaterialDynamicData(sms, renderState, material, batch, e, 0, ubufSize);
#ifndef QT_NO_DEBUG
if (qsg_test_and_clear_material_failure()) {
@@ -3045,8 +3113,6 @@ bool Renderer::prepareRenderMergedBatch(Batch *batch, PreparedRenderBatch *rende
}
#endif
- e->srb = m_shaderManager->srb(bindings);
-
m_gstate.drawMode = QSGGeometry::DrawingMode(g->drawingMode());
m_gstate.lineWidth = g->lineWidth();
@@ -3232,9 +3298,7 @@ bool Renderer::prepareRenderUnmergedBatch(Batch *batch, PreparedRenderBatch *ren
}
QSGMaterialShader::RenderState renderState = state(QSGMaterialShader::RenderState::DirtyStates(int(dirty)));
- ShaderManager::ShaderResourceBindingList bindings;
- updateMaterialDynamicData(sms, renderState,
- material, &bindings, batch, ubufOffset, ubufSize);
+ updateMaterialDynamicData(sms, renderState, material, batch, e, ubufOffset, ubufSize);
#ifndef QT_NO_DEBUG
if (qsg_test_and_clear_material_failure()) {
@@ -3246,8 +3310,6 @@ bool Renderer::prepareRenderUnmergedBatch(Batch *batch, PreparedRenderBatch *ren
}
#endif
- e->srb = m_shaderManager->srb(bindings);
-
ubufOffset += aligned(ubufSize, m_ubufAlignment);
const QSGGeometry::DrawingMode prevDrawMode = m_gstate.drawMode;
@@ -3370,6 +3432,26 @@ void Renderer::setGraphicsPipeline(QRhiCommandBuffer *cb, const Batch *batch, El
cb->setShaderResources(e->srb);
}
+void Renderer::releaseElement(Element *e, bool inDestructor)
+{
+ if (e->isRenderNode) {
+ delete static_cast<RenderNodeElement *>(e);
+ } else {
+ if (e->srb) {
+ if (!inDestructor) {
+ if (m_shaderManager->srbPool.count() < m_srbPoolThreshold)
+ m_shaderManager->srbPool.insert(e->srb->serializedLayoutDescription(), e->srb);
+ else
+ delete e->srb;
+ } else {
+ delete e->srb;
+ }
+ e->srb = nullptr;
+ }
+ m_elementAllocator.release(e);
+ }
+}
+
void Renderer::deleteRemovedElements()
{
if (!m_elementsToDelete.size())
@@ -3386,13 +3468,9 @@ void Renderer::deleteRemovedElements()
*e = nullptr;
}
- for (int i=0; i<m_elementsToDelete.size(); ++i) {
- Element *e = m_elementsToDelete.at(i);
- if (e->isRenderNode)
- delete static_cast<RenderNodeElement *>(e);
- else
- m_elementAllocator.release(e);
- }
+ for (int i=0; i<m_elementsToDelete.size(); ++i)
+ releaseElement(m_elementsToDelete.at(i));
+
m_elementsToDelete.reset();
}
@@ -3959,7 +4037,7 @@ bool operator==(const GraphicsPipelineStateKey &a, const GraphicsPipelineStateKe
return a.state == b.state
&& a.sms->programRhi.program == b.sms->programRhi.program
&& a.compatibleRenderPassDescriptor->isCompatible(b.compatibleRenderPassDescriptor)
- && a.layoutCompatibleSrb->isLayoutCompatible(b.layoutCompatibleSrb);
+ && a.srbLayoutDescription == b.srbLayoutDescription;
}
bool operator!=(const GraphicsPipelineStateKey &a, const GraphicsPipelineStateKey &b) noexcept
diff --git a/src/quick/scenegraph/coreapi/qsgbatchrenderer_p.h b/src/quick/scenegraph/coreapi/qsgbatchrenderer_p.h
index a42c41304f..50c652eec0 100644
--- a/src/quick/scenegraph/coreapi/qsgbatchrenderer_p.h
+++ b/src/quick/scenegraph/coreapi/qsgbatchrenderer_p.h
@@ -653,7 +653,7 @@ struct GraphicsPipelineStateKey
GraphicsState state;
const ShaderManagerShader *sms;
const QRhiRenderPassDescriptor *compatibleRenderPassDescriptor;
- const QRhiShaderResourceBindings *layoutCompatibleSrb;
+ QVector<quint32> srbLayoutDescription;
};
bool operator==(const GraphicsPipelineStateKey &a, const GraphicsPipelineStateKey &b) noexcept;
@@ -687,11 +687,11 @@ public:
void clearCachedRendererData();
- using ShaderResourceBindingList = QVarLengthArray<QRhiShaderResourceBinding, 8>;
- QRhiShaderResourceBindings *srb(const ShaderResourceBindingList &bindings);
-
QHash<GraphicsPipelineStateKey, QRhiGraphicsPipeline *> pipelineCache;
+ QMultiHash<QVector<quint32>, QRhiShaderResourceBindings *> srbPool;
+ QVector<quint32> srbLayoutDescSerializeWorkspace;
+
public Q_SLOTS:
void invalidated();
@@ -705,8 +705,6 @@ private:
QHash<ShaderKey, Shader *> stockShaders;
QSGDefaultRenderContext *context;
-
- QHash<ShaderResourceBindingList, QRhiShaderResourceBindings *> srbCache;
};
struct RenderPassState
@@ -822,8 +820,7 @@ private:
bool ensurePipelineState(Element *e, const ShaderManager::Shader *sms, bool depthPostPass = false);
QRhiTexture *dummyTexture();
void updateMaterialDynamicData(ShaderManager::Shader *sms, QSGMaterialShader::RenderState &renderState,
- QSGMaterial *material, ShaderManager::ShaderResourceBindingList *bindings,
- const Batch *batch, int ubufOffset, int ubufRegionSize);
+ QSGMaterial *material, const Batch *batch, Element *e, int ubufOffset, int ubufRegionSize);
void updateMaterialStaticData(ShaderManager::Shader *sms, QSGMaterialShader::RenderState &renderState,
QSGMaterial *material, Batch *batch, bool *gstateChanged);
void checkLineWidth(QSGGeometry *g);
@@ -858,6 +855,7 @@ private:
inline Batch *newBatch();
void invalidateAndRecycleBatch(Batch *b);
+ void releaseElement(Element *e, bool inDestructor = false);
void setVisualizationMode(const QByteArray &mode) override;
bool hasVisualizationModeWithContinuousUpdate() const override;
@@ -893,6 +891,7 @@ private:
int m_batchNodeThreshold;
int m_batchVertexThreshold;
+ int m_srbPoolThreshold;
Visualizer *m_visualizer;