| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Implement AVX2 versions of the three optimized paths of bilinear
texture transform.
Change-Id: Ie7199ef7dcce1e3457535fee35822d76afc0e8ba
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Manually vectorizing is significantly faster because we can optimize
for common cases like long stretches of opaque or transparent pixels.
This is both smaller and faster than the auto-vectorized version, it is
also much faster than the autovectorized version for AVX2 which then can
be removed.
Change-Id: I0fa80ce273a8387cc6cd084879822ad9bade385c
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The alpha channel of an RGB32 image was not properly ignored when doing
blending with partial opacity.
Now the alpha value is properly ignored, which is both more correct
and faster. This also makes SSE2 and AVX2 implementations match NEON
which was already doing the right thing (though had dead code for
doing it wrong).
Change-Id: I4613b8d70ed8c2e36ced10baaa7a4a55bd36a940
Reviewed-by: Eirik Aavitsland <eirik.aavitsland@qt.io>
|
|
|
|
|
|
|
|
|
|
| |
Defines a structure that tells the compiler in no uncertain terms the
maximum number of times a loop can be run.
The reduces the size of qdrawhelper_avx2.o from 22kbytes to 11kbytes.
Change-Id: Ie3d6281b04b4be3332497c15f3dfe9f185e20507
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
|
|
|
|
|
|
|
|
| |
The order of the arguments to testc was wrong, it should have been the
other way. Replaced with testz to also get rid of setzero.
Change-Id: Iff968c140f9ca34c6bd7c7f04a3623fd8ec42e1c
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>
|
|
|
|
|
|
|
|
| |
This patch adds AVX2 versions of the fast blending functions that we
already have SSE2 versions of.
Change-Id: Ifd1a22f7891b6208cb74929ad26095d12c5a1efb
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
|
|
|
|
|
|
|
|
| |
Removes the now unused QPixelLayout parameter, simplifies the colorTable
passing and prepares for adding dithering.
Change-Id: Iaf7698b248b857804d8921bf118e7cfabbabff87
Reviewed-by: Gunnar Sletta <gunnar@sletta.org>
|
|
|
|
|
|
|
|
|
|
|
| |
From Qt 5.7 -> LGPL v2.1 isn't an option anymore, see
http://blog.qt.io/blog/2016/01/13/new-agreement-with-the-kde-free-qt-foundation/
Updated license headers to use new LGPL header instead of LGPL21 one
(in those files which will be under LGPL v3)
Change-Id: I046ec3e47b1876cd7b4b0353a576b352e3a946d9
Reviewed-by: Lars Knoll <lars.knoll@theqtcompany.com>
|
|
Following up on using GCC's autovectorizing for faster SSE4.1
premultiply, this patch adds specialized autovectorized versions
of premultiply for AVX2, giving another almost doubling in speed.
To make the speed up for AVX2 and also SSE4_1 available to non-GCC
compilers, the target-specific methods have been moved to separate
files.
Change-Id: I97ce05be67f4adeeb9a096eef80fd5fb662099f3
Reviewed-by: Gunnar Sletta <gunnar@sletta.org>
|