From ae82e039eb74b092604910a1f3f7fa4887397cbc Mon Sep 17 00:00:00 2001
From: Christian Kandeler
Date: Mon, 10 Jul 2023 15:19:29 +0200
Subject: Loader: Resolve products in parallel

The resolveProduct() function is now executed for several products
simultaneously, with the (relatively few) accesses to common resources
guarded by mutexes.
Using Qt Creator as a mid-to-large-sized test project, we see the
following changes in the time it takes to resolve the project on some
example machines:
    - Linux (36 cores): 10.5s -> 4.8s
    - Linux (8 cores): 17s -> 6.5s
    - macOS (6 cores): 41s -> 16s
    - Windows (8 cores): 20s -> 9s
Unsurprisingly, the speed-up does not scale with the number of
processors, as there are typically lots of inter-product dependencies
and some expensive resources such as Probes are shared globally.
However, we do see a factor of two to three across all the hardware and
OS configurations, which is a good practical result for users.
Note that running with -j1, i.e. forcing the use of only a single core,
takes the same amount of time everywhere as it did without the patch, so
there is no scheduling overhead in the single-core case.
The results of our benchmarker tool look interesting. Here they are for
qbs and Qt Creator, respectively:

========== Performance data for Resolving ========== (qbs)
Old instruction count: 9121688266
New instruction count: 15736125513
Relative change: +72 %
Old peak memory usage: 84155384 Bytes
New peak memory usage: 187776736 Bytes
Relative change: +123 %

========== Performance data for Resolving ========== (QtC)
Old instruction count: 59901017190
New instruction count: 65227937765
Relative change: +8 %
Old peak memory usage: 621560008 Bytes
New peak memory usage: 761732040 Bytes
Relative change: +22 %

The increased peak memory usage is to be expected, as there are now
several JS engines running in parallel. The instruction count increase
is likely due to a higher number of deferrals.
Importantly, the relative increase appears to shrink considerably with
increased project size (+72 % for qbs versus +8 % for Qt Creator), so it
does not seem that the parallelism hides a serious per-thread slowdown.

Change-Id: Ib4d9ca9aa0687c1056ff82f9805b565cc5a35894
Reviewed-by: Ivan Komissarov
---
 src/lib/corelib/language/scriptengine.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

(limited to 'src/lib/corelib/language/scriptengine.h')

diff --git a/src/lib/corelib/language/scriptengine.h b/src/lib/corelib/language/scriptengine.h
index 942b7519c..4a55392e3 100644
--- a/src/lib/corelib/language/scriptengine.h
+++ b/src/lib/corelib/language/scriptengine.h
@@ -58,6 +58,7 @@
 #include
 #include
 
+#include
 #include
 #include
 #include
@@ -360,7 +361,7 @@ private:
     std::unordered_map m_jsFileCache;
     bool m_propertyCacheEnabled = true;
     bool m_active = false;
-    bool m_canceling = false;
+    std::atomic_bool m_canceling = false;
     QHash m_propertyCache;
     PropertySet m_propertiesRequestedInScript;
     QHash m_propertiesRequestedFromArtifact;
-- 
cgit v1.2.3
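
Illustrative note (not part of the patch): the change of m_canceling to
std::atomic_bool fits the pattern described in the commit message, where
several threads do work concurrently, shared state is guarded by a mutex,
and a cancel request may arrive from another thread. The following is a
minimal sketch of that pattern only. ParallelJob, run(), cancel() and
isCanceling() are hypothetical names chosen for this example and are not
taken from the qbs sources; the real resolveProduct() scheduling is not
shown here.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a job whose items are processed by several threads.
class ParallelJob
{
public:
    // May be called from any thread, e.g. in response to a user abort.
    void cancel() { m_canceling.store(true); }
    bool isCanceling() const { return m_canceling.load(); }

    // Processes up to itemCount items on threadCount threads and returns
    // the total number of items processed by this object so far.
    std::size_t run(std::size_t itemCount, unsigned threadCount)
    {
        std::atomic<std::size_t> next{0}; // index of the next unclaimed item
        std::vector<std::thread> workers;
        for (unsigned t = 0; t < threadCount; ++t) {
            workers.emplace_back([&] {
                // The atomic flag can be read without taking a lock.
                while (!isCanceling()) {
                    const std::size_t i = next.fetch_add(1);
                    if (i >= itemCount)
                        break;
                    // Access to the shared counter is guarded by a mutex.
                    std::lock_guard<std::mutex> lock(m_mutex);
                    ++m_processed;
                }
            });
        }
        for (std::thread &w : workers)
            w.join();
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_processed;
    }

private:
    std::atomic_bool m_canceling{false};
    std::mutex m_mutex;
    std::size_t m_processed = 0;
};
```

With a plain bool, a write in cancel() that races with reads in the worker
threads would be a data race under the C++ memory model; the atomic type
makes that access well-defined without requiring a lock for every check.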