qt/qtbase.git - Qt Base (Core, Gui, Widgets, Network, ...)

author	Thiago Macieira <thiago.macieira@intel.com>	2019-01-09 20:24:32 -0800
committer	Allan Sandfeld Jensen <allan.jensen@qt.io>	2019-01-15 21:52:46 +0000
commit	f7a7a49f9235c9375fc515a3062341f285f3c2c3 (patch)
tree	df90884c90d1d129f05d3ec9657d487c9bab1218 /tests/testserver/vsftpd/vsftpd.sh
parent	cacf2ad9229a6842dbc0e002ed8ba4d04db026ae (diff)

Fix the AVX2 ARGB->ARGB64 conversion code

Commit c8c5ff19de1c34a99b8315e59015d115957b3584 introduced the solution as a simple scaling up of the code in qdrawhelper_sse4.cpp, but it's bad due to the way that the 256-bit unpack instructions work: the unpack-low instruction unpacks the lower half of each half of the 256-bit register. So we fix it up by inserting a permute4x64 that swaps the middle two quarters of the 256-bit register (permute8x32 requires a __m256i parameter, instead of an immediate).

This introduces an instruction that costs 3 cycles in each loop, but since the AVX2 code has double the throughput compared to SSE4 code, it should still be faster.

This problem does not affect the ARGB->ARGB32 code because that repacks at the end.

Change-Id: I4d4dadb709f1482fa8ccfffd1578620b45166a4f
Reviewed-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
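The lane behaviour described above can be illustrated with a small scalar model. This is only a sketch, not the code from this commit: `Reg256`, `emuUnpack8`, `emuPermute4x64` and `convertedPixelOrder` are hypothetical helpers that emulate the documented semantics of `_mm256_unpacklo_epi8` / `_mm256_unpackhi_epi8` and `_mm256_permute4x64_epi64` on a byte array, so it builds without AVX2 support. The immediate `0xD8` (quarters 0, 2, 1, 3) is assumed here as the encoding of "swap the middle two quarters"; the diff itself is not shown on this page.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-in for a __m256i: 32 bytes, i.e. 8 ARGB32 pixels.
using Reg256 = std::array<std::uint8_t, 32>;

// Emulates _mm256_unpacklo_epi8 (high == false) or _mm256_unpackhi_epi8
// (high == true): each 128-bit lane is processed independently, interleaving
// 8 bytes of a with 8 bytes of b taken from that lane's own low or high half.
static Reg256 emuUnpack8(const Reg256 &a, const Reg256 &b, bool high)
{
    Reg256 r{};
    for (std::size_t lane = 0; lane < 2; ++lane) {
        const std::size_t base = lane * 16;
        const std::size_t src = base + (high ? 8 : 0);
        for (std::size_t i = 0; i < 8; ++i) {
            r[base + 2 * i] = a[src + i];
            r[base + 2 * i + 1] = b[src + i];
        }
    }
    return r;
}

// Emulates _mm256_permute4x64_epi64: destination quarter q is copied from the
// source quarter selected by bits [2q+1:2q] of the immediate.
static Reg256 emuPermute4x64(const Reg256 &a, unsigned imm)
{
    assert(imm < 256);
    Reg256 r{};
    for (std::size_t q = 0; q < 4; ++q) {
        const std::size_t from = (imm >> (2 * q)) & 3;
        for (std::size_t i = 0; i < 8; ++i)
            r[q * 8 + i] = a[from * 8 + i];
    }
    return r;
}

// Returns which source pixel (0..7) ends up in each of the four 64-bit output
// pixels after widening with unpack(src, src). Source byte i holds the value i,
// so dividing an output byte by 4 recovers the pixel it came from.
static std::array<int, 4> convertedPixelOrder(bool withPermute, bool high)
{
    Reg256 src{};
    for (std::size_t i = 0; i < src.size(); ++i)
        src[i] = static_cast<std::uint8_t>(i);
    if (withPermute)
        src = emuPermute4x64(src, 0xD8); // assumed immediate: quarters 0, 2, 1, 3
    const Reg256 wide = emuUnpack8(src, src, high);
    std::array<int, 4> order{};
    for (std::size_t p = 0; p < 4; ++p)
        order[p] = wide[p * 8] / 4;
    return order;
}
```

In this model, widening without the permute yields pixels 0, 1, 4, 5 from the unpack-low step and 2, 3, 6, 7 from the unpack-high step, so storing the two results back to back writes the pixels out of order. With the permute applied first, the two steps yield 0, 1, 2, 3 and 4, 5, 6, 7.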

Diffstat (limited to 'tests/testserver/vsftpd/vsftpd.sh')

0 files changed, 0 insertions, 0 deletions

