author: Thiago Macieira <thiago.macieira@intel.com> 2019-01-09 20:24:32 -0800
committer: Allan Sandfeld Jensen <allan.jensen@qt.io> 2019-01-15 21:52:46 +0000
commit: f7a7a49f9235c9375fc515a3062341f285f3c2c3
tree: df90884c90d1d129f05d3ec9657d487c9bab1218 /tests/testserver/vsftpd/vsftpd.sh
parent: cacf2ad9229a6842dbc0e002ed8ba4d04db026ae
Fix the AVX2 ARGB->ARGB64 conversion code

Commit c8c5ff19de1c34a99b8315e59015d115957b3584 introduced the solution as a simple scaling up of the code in qdrawhelper_sse4.cpp, but it's bad due to the way that the 256-bit unpack instructions work: the unpack-low instruction unpacks the lower half of each half of the 256-bit register.

So we fix it up by inserting a permute4x64 that swaps the middle two quarters of the 256-bit register (permute8x32 requires a __m256i parameter, instead of an immediate). This introduces an instruction that costs 3 cycles in each loop, but since the AVX2 code has double the throughput compared to SSE4 code, it should still be faster.

This problem does not affect the ARGB->ARGB32 code because that repacks at the end.

Change-Id: I4d4dadb709f1482fa8ccfffd1578620b45166a4f
Reviewed-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
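
The lane behaviour described in the commit message can be illustrated with a small scalar model. This is only a sketch, not the code from qdrawhelper: `Reg256`, `unpackLo8`, `unpackHi8`, `permute4x64` and `kSwapMiddle` are hypothetical names that mimic what `_mm256_unpacklo_epi8`, `_mm256_unpackhi_epi8` and `_mm256_permute4x64_epi64` do on a 32-byte register, assuming the conversion widens each 8-bit channel by unpacking a register with itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-in for a 256-bit register: 32 bytes, index 0 is the lowest byte.
using Reg256 = std::array<std::uint8_t, 32>;

// Models _mm256_unpacklo_epi8: within EACH 128-bit lane, interleave the low 8 bytes of a and b.
Reg256 unpackLo8(const Reg256 &a, const Reg256 &b)
{
    Reg256 r{};
    for (std::size_t lane = 0; lane < 2; ++lane) {
        for (std::size_t i = 0; i < 8; ++i) {
            r[lane * 16 + 2 * i] = a[lane * 16 + i];
            r[lane * 16 + 2 * i + 1] = b[lane * 16 + i];
        }
    }
    return r;
}

// Models _mm256_unpackhi_epi8: same as above, but with the high 8 bytes of each lane.
Reg256 unpackHi8(const Reg256 &a, const Reg256 &b)
{
    Reg256 r{};
    for (std::size_t lane = 0; lane < 2; ++lane) {
        for (std::size_t i = 0; i < 8; ++i) {
            r[lane * 16 + 2 * i] = a[lane * 16 + 8 + i];
            r[lane * 16 + 2 * i + 1] = b[lane * 16 + 8 + i];
        }
    }
    return r;
}

// Models _mm256_permute4x64_epi64: output quarter q is the input quarter
// selected by bits [2q+1:2q] of the immediate.
Reg256 permute4x64(const Reg256 &v, unsigned imm)
{
    assert(imm <= 0xFF); // the real instruction takes an 8-bit immediate
    Reg256 r{};
    for (std::size_t q = 0; q < 4; ++q) {
        const std::size_t src = (imm >> (2 * q)) & 3;
        for (std::size_t i = 0; i < 8; ++i)
            r[q * 8 + i] = v[src * 8 + i];
    }
    return r;
}

// Equivalent to _MM_SHUFFLE(3, 1, 2, 0): swaps the middle two 64-bit quarters.
constexpr unsigned kSwapMiddle = 0xD8;
```

With input bytes 0..31, `unpackLo8(v, v)` on its own widens bytes 0..7 and then jumps to bytes 16..23, which is the bug. Applying `permute4x64(v, kSwapMiddle)` first makes the low unpack cover bytes 0..15 in order and the high unpack cover bytes 16..31.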
Diffstat (limited to 'tests/testserver/vsftpd/vsftpd.sh')
0 files changed, 0 insertions, 0 deletions