path: root/src/corelib/tools/qsimd_p.h
Commit message | Author | Age | Files | Lines
* Merge remote-tracking branch 'origin/5.6.1' into 5.7.0Liang Qi2016-05-261-26/+0
|\ | | | | | | | | | | | | | | Conflicts: src/corelib/tools/qsimd_p.h src/network/socket/qnativesocketengine_winrt.cpp Change-Id: I2765b671664c2a84839b2f88ba724fdf0c1fa7c6
| * Replace qUnaligned{Load,Store} with the existing q{To,From}Unaligned (tags: v5.6.1-1, v5.6.1) Thiago Macieira 2016-05-25 1 -37/+0
| | | | | | | | | | | | | | | | Move the Q_ALWAYS_INLINE and forcing of __builtin_memcpy to the existing functions. Change-Id: Icaa7fb2a490246bda156ffff143c137e520eea79 Reviewed-by: Lars Knoll <lars.knoll@theqtcompany.com>
* | Check for CRC32 properlyLaszlo Agocs2016-05-261-1/+1
| | | | | | | | | | | | | | | | Just being on ARMv8 does not mean CRC32 (and arm_acle.h) is available. Task-number: QTBUG-53629 Change-Id: I104f643f2d59620e1f4d1ef814a1de71bb484e7b Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* | Merge remote-tracking branch 'origin/5.6' into 5.7Liang Qi2016-04-131-1/+7
|\| | | | | | | | | | | | | | | | | | | Conflicts: config.tests/unix/compile.test src/android/jar/src/org/qtproject/qt5/android/QtActivityDelegate.java src/testlib/qtestcase.cpp src/testlib/qtestcase.qdoc Change-Id: Ied3c471dbc9a076c8de33d673bd557e88575609d
| * wince: Fix intrinsics for X86 platforms when SSE2 is enabledAndreas Holzammer2016-04-111-1/+7
| | | | | | | | | | | | | | | | | | | | | | SSE2 can use intrinsics, which are supported by WEC2013, but for WEC7 they need to be defined. Change-Id: I261f3db4db7abcb0b59598cef9cbad404635c3ec Reviewed-by: Friedemann Kleint <Friedemann.Kleint@theqtcompany.com> Reviewed-by: Gunnar Roth <gunnar.roth@gmx.net> Reviewed-by: Kevin Funk <kevin.funk@kdab.com> Reviewed-by: Lars Knoll <lars.knoll@theqtcompany.com>
* | Remove the traces of the discontinued android-no-sdk platformEirik Aavitsland2016-03-301-1/+1
| | | | | | | | | | | | | | | | | | Cleaning out the workarounds for the discontinued "Embedded Android" platform of Boot2Qt. Change-Id: I0ff9d770e82a43457fb7e5da0428f4597ead4038 Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@theqtcompany.com> Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@theqtcompany.com>
* | Merge remote-tracking branch 'origin/5.6' into 5.7Liang Qi2016-03-111-0/+26
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change partially reverts 1bfc7f68 about QT_HAS_BUILTIN define and undef in src/corelib/tools/qsimd_p.h. This change is also squashed with "Fall back to c++11 standard compiler flag for host builds" which is done by Peter Seiderer. Conflicts: mkspecs/features/default_post.prf src/3rdparty/sqlite/0001-Fixing-the-SQLite3-build-for-WEC2013-again.patch src/3rdparty/sqlite/sqlite3.c src/corelib/tools/qsimd_p.h src/gui/kernel/qevent.cpp src/gui/kernel/qwindowsysteminterface.cpp src/gui/kernel/qwindowsysteminterface_p.h src/plugins/bearer/blackberry/blackberry.pro src/plugins/platforms/cocoa/qcocoasystemsettings.mm src/plugins/platformthemes/gtk2/gtk2.pro src/plugins/styles/bb10style/bb10style.pro src/sql/drivers/sqlite2/qsql_sqlite2.cpp tools/configure/configureapp.cpp Task-number: QTBUG-51644 Done-with: Peter Seiderer <ps.report@gmx.net> Change-Id: I6100d6ace31b2e8d41a95f0b5d5ebf8f1fd88b44
| * QString, QJson, QHash: Fix UBs involving unaligned loadsMarc Mutz2016-03-091-0/+37
| | Found by UBSan:
| |
| |   src/corelib/tools/qstring.cpp:587:42: runtime error: load of misaligned address 0x2acbf4b7551b for type 'const long long int', which requires 8 byte alignment
| |   src/corelib/json/qjson_p.h:405:30: runtime error: store to misaligned address 0x0000019b1e52 for type 'quint64', which requires 8 byte alignment
| |   src/corelib/tools/qhash.cpp:116:27: runtime error: load of misaligned address 0x2b8f9ce80e85 for type 'const qlonglong', which requires 8 byte alignment
| |   src/corelib/tools/qhash.cpp:133:26: runtime error: load of misaligned address 0x2b8f9ce80e8d for type 'const ushort', which requires 2 byte alignment
| |
| | Fix by memcpy()ing into a local variable. Wrap this trick in template functions in qsimd_p.h. These are marked as always-inline and use __builtin_memcpy() where available in an attempt to avoid the memcpy() function call overhead in debug builds.
| |
| | While this looks prohibitively expensive, from the point of view of the C++ abstract machine, it is 100% equivalent, except for the absence of undefined behavior. In one case, the cast produces a local temporary which is then copied into the function, and in the other case, that local variable comes from the return value of qUnalignedLoad(). Consequently, GCC compiles these two versions into identical assembler code (only verified for ucstrncmp, but there's no reason to believe that it wouldn't hold for the other cases, too).
| |
| | Task-number: QTBUG-51651
| | Change-Id: Ia50d4a1d7580b6f803e0895c9f3d89c7da37840c
| | Reviewed-by: Olivier Goffart (Woboq GmbH) <ogoffart@woboq.com>
| | Reviewed-by: Allan Sandfeld Jensen <allan.jensen@theqtcompany.com>
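
The memcpy() trick described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from qsimd_p.h: the names unalignedLoad and unalignedStore are hypothetical, and the always-inline and __builtin_memcpy() refinements mentioned in the message are left out.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not the Qt function): read a T from an address
// that may not satisfy T's alignment requirement.
template <typename T>
inline T unalignedLoad(const void *ptr)
{
    T value;
    std::memcpy(&value, ptr, sizeof(T));   // byte-wise copy, defined for any alignment
    return value;
}

// Hypothetical counterpart: write a T to a possibly misaligned address.
template <typename T>
inline void unalignedStore(void *ptr, T value)
{
    std::memcpy(ptr, &value, sizeof(T));
}
```

With optimization enabled, compilers typically turn the fixed-size memcpy() into a single load or store, which matches the commit's observation that GCC generated identical assembler for both versions.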
* | iOS: Disable usage of crc32 intrinsics.Erik Verbruggen2016-03-081-0/+7
| | To quote http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20160222/151168.html :
| |
| | > AArch64: fix Cyclone CPU features list.
| | > It turns out we don't have CRC after all. Who knew?
| |
| | So clang did define __ARM_FEATURE_CRC32, while the CPU didn't support the crc32 instructions, resulting in EXC_BAD_INSTRUCTION.
| |
| | Change-Id: I4b0123ac5e7fd04696c05bfe7dacce205cffac8f
| | Task-number: QTBUG-51168
| | Reviewed-by: Tor Arne Vestbø <tor.arne.vestbo@theqtcompany.com>
* | Add Intel copyright to files that Intel has had non-trivial contributionThiago Macieira2016-01-211-0/+1
| | | | | | | | | | | | | | | | | | I wrote a script to help find the files, but I reviewed the contributions manually to be sure I wasn't claiming copyright for search & replace, adding Q_DECL_NOTHROW or adding "We mean it" headers. Change-Id: I7a9e11d7b64a4cc78e24ffff142b506368fc8842 Reviewed-by: Lars Knoll <lars.knoll@theqtcompany.com>
* | Updated license headersJani Heikkinen2016-01-151-14/+20
| | | | | | | | | | | | | | | | | | | | | | From Qt 5.7 -> LGPL v2.1 isn't an option anymore, see http://blog.qt.io/blog/2016/01/13/new-agreement-with-the-kde-free-qt-foundation/ Updated license headers to use new LGPL header instead of LGPL21 one (in those files which will be under LGPL v3) Change-Id: I046ec3e47b1876cd7b4b0353a576b352e3a946d9 Reviewed-by: Lars Knoll <lars.knoll@theqtcompany.com>
* | ARMv8: add crc32 feature detection.Erik Verbruggen2016-01-131-2/+22
| | | | | | | | | | Change-Id: I3cfac90dfa137d0bf3d124d87262eb2dbb56459c Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* | Use __builtin_clz/ctz when available.Erik Verbruggen2015-12-011-1/+17
| | | | | | | | | | | | | | | | Nicely ask the compiler if it has a built-in for clz/ctz before resorting to CPU specific brute force measurements. Change-Id: Ifa992267ec4528219d7da14524af738316ceeaea Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
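
A rough sketch of "asking the compiler first", assuming GCC or Clang style builtins. The SKETCH_HAS_BUILTIN macro and both function names are made up for this example, and the portable loops only stand in for the CPU-specific fallbacks the commit refers to.

```cpp
#include <cstdint>

// Hypothetical wrapper: older compilers do not provide __has_builtin at all.
#if defined(__has_builtin)
#  define SKETCH_HAS_BUILTIN(x) __has_builtin(x)
#else
#  define SKETCH_HAS_BUILTIN(x) 0
#endif

// Number of leading zero bits. The builtin is undefined for v == 0,
// so callers are expected to pass a non-zero value.
inline unsigned countLeadingZeros32(std::uint32_t v)
{
#if SKETCH_HAS_BUILTIN(__builtin_clz) || defined(__GNUC__)
    return static_cast<unsigned>(__builtin_clz(v));
#else
    unsigned n = 0;
    for (std::uint32_t mask = 0x80000000u; mask && !(v & mask); mask >>= 1)
        ++n;
    return n;
#endif
}

// Number of trailing zero bits, same precondition (v != 0).
inline unsigned countTrailingZeros32(std::uint32_t v)
{
#if SKETCH_HAS_BUILTIN(__builtin_ctz) || defined(__GNUC__)
    return static_cast<unsigned>(__builtin_ctz(v));
#else
    unsigned n = 0;
    for (std::uint32_t mask = 1u; mask && !(v & mask); mask <<= 1)
        ++n;
    return n;
#endif
}
```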
* | Merge remote-tracking branch 'origin/5.6' into devSimon Hausmann2015-11-271-1/+5
|\| | | | | | | Change-Id: Ib43c6f126998eefcfed9a7c1f2bcbac8b4dd05ec
| * Detect NEON on AArch64Allan Sandfeld Jensen2015-11-261-1/+5
| | __ARM_NEON is the standard define for NEON instruction support; __ARM_NEON__ is only legacy and is specifically not defined in AArch64 builds, which causes us not to detect NEON support there.
| |
| | The NEON assembler files don't build with AArch64, so the NEON drawhelper methods must be excluded for now.
| |
| | Change-Id: Ie32f855bde94ee7efd8a8ddb7766c931778e729b
| | Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
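
The detection described here amounts to testing both spellings of the macro. A minimal sketch, where SKETCH_HAS_NEON and compiledWithNeon are hypothetical names rather than anything defined in qsimd_p.h:

```cpp
// Accept the standard ACLE macro and the legacy one, so that both
// 32-bit ARM and AArch64 builds are recognized.
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#  define SKETCH_HAS_NEON 1
#else
#  define SKETCH_HAS_NEON 0
#endif

// Compile-time answer, usable in ordinary C++ code.
constexpr bool compiledWithNeon()
{
    return SKETCH_HAS_NEON != 0;
}
```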
* | Merge remote-tracking branch 'origin/5.6' into devLiang Qi2015-10-141-4/+3
|\| | | | | | | | | | | | | | | | | Conflicts: tests/auto/corelib/io/qfile/tst_qfile.cpp tests/auto/corelib/io/qprocess/tst_qprocess.cpp tests/auto/corelib/tools/qversionnumber/qversionnumber.pro Change-Id: Ia93ce500349d96a2fbf0b4a37b73f088cc505c6e
| * Revert "Add support for same-file intrinsics with Clang 3.7"Thiago Macieira2015-10-011-3/+2
| | This reverts commit 39c2b8c5c12dfb8560fa04ce346a129adb223e29.
| |
| | The feature is not working:
| |
| |   $ clang -c -o /dev/null -msse2 -include tmmintrin.h -xc /dev/null
| |   In file included from <built-in>:316:
| |   In file included from <command line>:1:
| |   /home/thiago/clang3.7/bin/../lib/clang/3.7.0/include/tmmintrin.h:28:2: error: "SSSE3 instruction set not enabled"
| |
| | For reference:
| |
| |   $ icpc -c -o /dev/null -msse2 -include tmmintrin.h -xc /dev/null; echo $?
| |   0
| |   $ gcc -c -o /dev/null -msse2 -include tmmintrin.h -xc /dev/null; echo $?
| |   0
| |
| | Change-Id: I42e7ef1a481840699a8dffff140844cb8872ed6e
| | Reviewed-by: Sérgio Martins <sergio.martins@kdab.com>
| * Fix ICC warning about use of "defined" in a macroThiago Macieira2015-09-251-1/+1
| | | | | | | | | | | | | | qhash.cpp(89): warning #3199: "defined" is always false in a macro expansion in Microsoft mode Change-Id: I7de033f80b0e4431b7f1ffff13fc960bcbb17352 Reviewed-by: Olivier Goffart (Woboq GmbH) <ogoffart@woboq.com>
* | configure: Add support for detecting AVX512 instructionsThiago Macieira2015-09-251-0/+1
|/
| Tested on Linux with Clang 3.7, GCC 4.9, 5.1 and 6.0, ICC 16 beta; on OS X with Clang-XCode 6.4, ICC 16 beta; on Windows with MSVC 2013 and ICC 15. MinGW is not tested.
|
|   GCC 4.9:      AVX512F AVX512ER AVX512CD AVX512PF
|   GCC 5 & 6:    AVX512F AVX512ER AVX512CD AVX512PF AVX512DQ AVX512BW AVX512VL AVX512IFMA AVX512VBMI
|   Clang 3.7:    AVX512F AVX512ER AVX512CD
|   Clang-XCode:  <none>
|   ICC 15 & 16:  AVX512F AVX512ER AVX512CD AVX512PF AVX512DQ AVX512BW AVX512VL
|   MSVC 2013:    <none>
|
| Change-Id: Ib306f8f647014b399b87ffff13f1da1b161c31d7
| Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@theqtcompany.com>
| Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Expand reporting of the Intel instruction set extensionsThiago Macieira2015-09-221-1/+92
| | | | | | | | | | | | | | | Detection for most of them is free because we're loading the entire registers anyway. The only exception is AVX512VBMI, which is in a new register we hadn't yet read from. I've also added the new GCC names so they can be used with QT_FUNCTION_TARGET. The only two exceptions are "movbe" and "popcnt", which are extremely restricted in use and we are not likely to have code dedicated to using them. Change-Id: Ib306f8f647014b399b87ffff13f1d8fd29e58be0 Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@theqtcompany.com> Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Reorganize the bits for the CPU feature detectionThiago Macieira2015-09-221-24/+36
| | | | | | | | | | | | | | | Instead of trying to detect one bit and set another, let's just use the bits from the x86 CPUID instruction on x86. This makes use of the full 64-bit space now. Since MSVC doesn't like enums bigger than 32-bit, we have to store the bit number instead of the actual bit value in the constant. For that reason, I also renamed the constants, to catch anyone who was using them directly, instead of through qCpuHasFeature. Change-Id: Ib306f8f647014b399b87ffff13f1d587692d827a Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@theqtcompany.com> Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
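
To illustrate why the constants hold a bit number rather than a bit value: the number always fits in a 32-bit enum, while the mask is built on demand in 64 bits. This is only a sketch; the enumerator names and bit positions below are invented and do not match the real constants.

```cpp
#include <cstdint>

// Hypothetical feature list. Each enumerator is a bit *position* in a
// 64-bit feature word, so even positions above 31 fit in a 32-bit enum.
enum CpuFeatureBit {
    FeatureBitSSE2 = 5,
    FeatureBitAVX2 = 37,
    FeatureBitAVX512F = 48
};

// Build the 64-bit mask only when testing.
inline bool hasFeature(std::uint64_t featureWord, CpuFeatureBit bit)
{
    return (featureWord & (std::uint64_t(1) << bit)) != 0;
}
```

Code that used the old constants directly as masks would stop compiling or misbehave visibly after such a change, which is the reason the commit gives for renaming them.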
* Change the CPU feature status word to be 64-bit instead of 32-bitThiago Macieira2015-09-111-5/+15
| | | | | | | | I'm going to need the extra bits for x86. Change-Id: Ib306f8f647014b399b87ffff13f1d3d23e138518 Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@theqtcompany.com> Reviewed-by: Allan Sandfeld Jensen <allan.jensen@theqtcompany.com>
* Add support for same-file intrinsics with Clang 3.7Thiago Macieira2015-08-221-2/+3
| | | | | | | It supports the same feature that GCC does Change-Id: Ib306f8f647014b399b87ffff13f1f3159898741b Reviewed-by: Olivier Goffart (Woboq GmbH) <ogoffart@woboq.com>
* Remove attempt at detecting compile-time HLEThiago Macieira2015-07-201-3/+0
| | | | | | | | There's no __HLE__ macro and there won't be, since the HLE prefix can be run on older CPUs. There's no need for runtime detection. Change-Id: Ib306f8f647014b399b87ffff13f1daba0e654b02 Reviewed-by: Olivier Goffart (Woboq GmbH) <ogoffart@woboq.com>
* Update copyright headersJani Heikkinen2015-02-111-7/+7
| | | | | | | | | | | | | | | | | | Qt copyrights are now in The Qt Company, so we could update the source code headers accordingly. In the same go we should also fix the links to point to qt.io. Outdated header.LGPL removed (use header.LGPL21 instead) Old header.LGPL3 renamed to header.LGPL3-COMM to match actual licensing combination. New header.LGPL-COMM taken in the use file which were using old header.LGPL3 (src/plugins/platforms/android/extract.cpp) Added new header.LGPL3 containing Commercial + LGPLv3 + GPLv2 license combination Change-Id: I6f49b819a8a20cc4f88b794a8f6726d975e8ffbe Reviewed-by: Matti Paaso <matti.paaso@theqtcompany.com>
* Store the GCC version number in Q_CC_GNUThiago Macieira2014-11-051-2/+2
| | | | | | | | | | | The sequence of (__GNUC__ * 100 + __GNUC_MINOR__) was used in quite a few places. Simplify it to make the code more readable. This follows the change done for Clang, which was quite necessary since Apple's version of Clang has different build numbers. Change-Id: I886271a5a5f21ae59485ecf8d140527723345a46 Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@theqtcompany.com>
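
The idea can be sketched as below. SKETCH_CC_GNU and gccAtLeast are hypothetical stand-ins chosen for this example; note that Clang and ICC also define __GNUC__, so a real header has to order its compiler checks carefully.

```cpp
// Encode the GCC version once, e.g. 4.9 becomes 409, instead of repeating
// (__GNUC__ * 100 + __GNUC_MINOR__) at every use.
#if defined(__GNUC__)
#  define SKETCH_CC_GNU (__GNUC__ * 100 + __GNUC_MINOR__)
#else
#  define SKETCH_CC_GNU 0   // assumption for this sketch: 0 means "not GCC-compatible"
#endif

// Version checks then become a single readable comparison.
constexpr bool gccAtLeast(int major, int minor)
{
    return SKETCH_CC_GNU != 0 && SKETCH_CC_GNU >= major * 100 + minor;
}
```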
* Define Q_CC_CLANG to be the version of upstream Clang that's in useTor Arne Vestbø2014-11-051-1/+1
| | | | | | | | | | | | We map the Apple Clang versions to upstream, so that we have one define to compare against. Fixes build break on iOS due to qbasicatomic.h not defining QT_BASIC_ATOMIC_HAS_CONSTRUCTORS on Apple Clang versions, which is needed after 1e9db9f5e18123f2e686c10b Change-Id: I17493c0187c20abc5d22e71944d62bfd16afbad2 Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Fix X86 Wince builds.Bjoern Breitmeyer2014-10-291-1/+1
| | | | | | | | Windows CE does not have all _BitScanReverse intrinsics, so disable those for Q_OS_WINCE. Change-Id: I34a3c02c6ffdfff2a209b2c9c1b80bef4566ee39 Reviewed-by: Friedemann Kleint <Friedemann.Kleint@digia.com>
* Update license headers and add new license filesMatti Paaso2014-09-241-19/+11
| - Renamed LICENSE.LGPL to LICENSE.LGPLv21
| - Added LICENSE.LGPLv3
| - Removed LICENSE.GPL
|
| Change-Id: Iec3406e3eb3f133be549092015cefe33d259a3f2
| Reviewed-by: Iikka Eklund <iikka.eklund@digia.com>
* Merge remote-tracking branch 'origin/5.3' into 5.4Frederik Gladhorn2014-09-231-0/+11
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The isAlwaysAskOption was removed in 38621713150b663355ebeb799a5a50d8e39a3c38 so manually removed code in src/plugins/bearer/connman/qconnmanengine.cpp Conflicts: src/corelib/global/qglobal.h src/corelib/tools/qcollator_macx.cpp src/corelib/tools/qstring.cpp src/gui/kernel/qwindow.cpp src/gui/kernel/qwindow_p.h src/gui/text/qtextengine.cpp src/platformsupport/fontdatabases/fontconfig/qfontenginemultifontconfig_p.h src/plugins/platforms/android/qandroidinputcontext.cpp src/plugins/platforms/xcb/qglxintegration.cpp src/plugins/platforms/xcb/qglxintegration.h src/plugins/platforms/xcb/qxcbconnection_xi2.cpp src/testlib/qtestcase.cpp src/testlib/qtestlog.cpp src/widgets/dialogs/qfiledialog.cpp src/widgets/kernel/qwindowcontainer.cpp tests/auto/corelib/tools/qcollator/tst_qcollator.cpp tests/auto/gui/text/qtextscriptengine/tst_qtextscriptengine.cpp tests/auto/widgets/kernel/qwidget_window/tst_qwidget_window.cpp tests/auto/widgets/widgets/qlineedit/tst_qlineedit.cpp Change-Id: Ic5d4187f682257a17509f6cd28d2836c6cfe2fc8
| * Add missing private headers warningSamuel Gaist2014-09-041-0/+11
| | | | | | | | | | Change-Id: I7a4dd22ea3bcebf4c3ec3ad731628fd8f3c247e0 Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* | Remove the last remnants of iWMMXt in QtThiago Macieira2014-08-051-24/+0
| | | | | | | | | | | | | | | | This code hasn't been tested for at least 4 years. It's not maintained and probably doesn't work. Change-Id: I4b9a5179e34111b400914f91caa6b741b69771bb Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@digia.com>
* | Merge remote-tracking branch 'origin/5.3' into devFrederik Gladhorn2014-07-011-1/+3
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: mkspecs/qnx-x86-qcc/qplatformdefs.h src/corelib/global/qglobal.h src/network/socket/qnativesocketengine_winrt.cpp src/plugins/platforms/android/androidjniaccessibility.cpp src/plugins/platforms/windows/qwindowswindow.cpp Manually adjusted: mkspecs/qnx-armle-v7-qcc/qplatformdefs.h to include 9ce697f2d54be6d94381c72af28dda79cbc027d4 Thanks goes to Sergio for the qnx mkspecs adjustments. Change-Id: I53b1fd6bc5bc884e5ee2c2b84975f58171a1cb8e
| * Fix compilation with the Intel compiler on certain systemsThiago Macieira2014-06-191-1/+3
| | We require the intrinsics from immintrin.h, so include it unconditionally with that compiler.
| |
| | Change-Id: I4a17676631f9d89e2d22e486f40c9b177ca06c1e
| | Reviewed-by: Olivier Goffart <ogoffart@woboq.com>
* | MIPS: Support recognition of the DSP ASE at run-timeAdrian Perez de Castro2014-06-271-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | Add detection of MIPS DSPr2 at run-time in qsimd.cpp. This makes it possible to have generic Qt builds for MIPS that can enable the fast code paths for processors with the DSP ASE at run-time. Also, this makes it possible to manually disable them by setting the environment variable "QT_NO_CPU_FEATURE=dspr2". Last, but not least, functions requiring DSPr2 are not enabled when running in CPUs with version-1 DSP. Change-Id: Ia5a01d84119553c22ab83386c74a6cb8ba5fee53 Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* | Add support for single-file multi-target intrinsics in QtThiago Macieira2014-05-271-17/+102
|/ | | | | | | | | | | | | | | | | | | | | | | GCC 4.9 now allows us to #include any and all intrinsics headers, not just the one for which we're compiling code, a behavior that ICC and MSVC have had for some time. With that, we're able to have the functions for different targets in the same source file. See the GCC manual: http://gcc.gnu.org/onlinedocs/gcc/Function-Multiversioning.html This functionality is notified by the QT_COMPILER_SUPPORTS_HERE(XXX) macro, which indicates that all the intrinsics from QT_COMPILER_SUPPORTS_xxx are available and enabled. To complement, a QT_COMPILER_SUPPORTS(XXX) macro is also added. Unlike ICC and MSVC, GCC requires a special function attribute, which will also cause code optimization. That's the QT_FUNCTION_TARGET macro. Note: because of the absence of the target attribute, ICC and MSVC will not generate instructions with the VEX prefix unless they only exist with the VEX prefix or if -mavx / -arch:AVX are enabled. Change-Id: I0c1880c20324bd8e0fc68a863e36d1fa7755dff0 Reviewed-by: Allan Sandfeld Jensen <allan.jensen@digia.com>
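
A small sketch of what "functions for different targets in the same source file" can look like with the GCC target attribute. The macros and function names here are hypothetical (they are not QT_FUNCTION_TARGET or QT_COMPILER_SUPPORTS_HERE), the runtime check uses GCC's __builtin_cpu_supports() rather than Qt's own detection, and everything x86-specific is compiled out on other compilers and architectures.

```cpp
#include <cstdint>

#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#  include <immintrin.h>
#  define SKETCH_FUNCTION_TARGET(x) __attribute__((target(x)))
#  define SKETCH_HAVE_TARGET_ATTR 1
#else
#  define SKETCH_HAVE_TARGET_ATTR 0
#endif

#if SKETCH_HAVE_TARGET_ATTR
// Compiled for SSSE3 even if the rest of the file is built for plain SSE2.
SKETCH_FUNCTION_TARGET("ssse3")
static std::int32_t sum4Ssse3(const std::int32_t *p)
{
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i *>(p));
    v = _mm_hadd_epi32(v, v);   // pairwise sums: (p0+p1, p2+p3, ...)
    v = _mm_hadd_epi32(v, v);   // low lane now holds p0+p1+p2+p3
    return _mm_cvtsi128_si32(v);
}
#endif

// Baseline version that runs everywhere.
static std::int32_t sum4Generic(const std::int32_t *p)
{
    return p[0] + p[1] + p[2] + p[3];
}

// Pick the implementation at runtime.
std::int32_t sum4(const std::int32_t *p)
{
#if SKETCH_HAVE_TARGET_ATTR
    if (__builtin_cpu_supports("ssse3"))
        return sum4Ssse3(p);
#endif
    return sum4Generic(p);
}
```

The attribute both enables the intrinsics inside the function and lets the compiler optimize that function for the named target, which is the side effect the commit message mentions.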
* Fix compile for embedded Androidaavit2014-03-311-1/+1
| | | | | | | It also has a broken declaration of posix_memalign Change-Id: Ie8f245564f80b04901425729b46953828204efaf Reviewed-by: Andy Nichols <andy.nichols@digia.com>
* Update the macro that MSVC 2013 defines for AVX code generationThiago Macieira2014-02-011-11/+7
| | | | | | | | | | | | | | | | | | http://msdn.microsoft.com/en-us/library/b0084kay(v=vs.120).aspx says: __AVX__ Defined when /arch:AVX is specified. Now we know what flag it is, we don't need to use our _M_AVX flag anymore. We're also now assuming that Microsoft will follow the same pattern for AVX2 (i.e., __AVX2__), so this commit also removes the check for _M_AVX2. The other defines that were defined alongside AVX2 are removed because they have no use currently in Qt. Change-Id: I64a026b2206dbd0d2dffa7c803bee969c9b94a94 Reviewed-by: Friedemann Kleint <Friedemann.Kleint@digia.com> Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@digia.com> Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Improve ucstrncmp with SSE2Thiago Macieira2014-01-311-0/+13
| | | | | | | | | | The benchmarks showed that the basic SSE2-based building block improves performance by about 50% with data extracted from a Qt Creator run. None of the other alternatives provide clear better results -- the best was 3.8% and with only one compiler. Change-Id: I77314785afecfacaf21c41fd79c97cadf357f895 Reviewed-by: Lars Knoll <lars.knoll@digia.com>
* Add support for UTF-8 encoding/decoding with SIMDThiago Macieira2014-01-311-5/+34
| Decoding from UTF-8 is easy: if the high bit is set, we fall back to the byte-by-byte decoding. Encoding to UTF-8 requires a little bit more work: to detect anything between 0x0080 and 0xffff, we have several options but none as easy as above. Multiple alternatives are in the benchmark code.
|
| In both loops, we do two things once we run into a non-ASCII character: first, we continue the loop for the remainder of ASCII characters in the buffer (which we can tell by checking the bits set in the mask), then we find the last non-ASCII character in that 16-character group, so we don't reenter the SSE code too soon.
|
| For the UTF-8 encoding, I have chosen the alternative that results in the best performance. It's closely tied to the alternative running the PMIN instruction, but that requires SSE 4.1. It's not worth the complexity. And quite counter-intuitively, the dedicated string instruction from SSE 4.2 performs most poorly of all solutions. This begs re-visiting the performance of the toLatin1 encoder.
|
| The best of 10 benchmark runs of this code were measured on my SandyBridge CPU @ 2.66 GHz (turbo @ 3.3 GHz), both as CPU cycles and as CPU ticks:
|
|   Compared to:    ICU                  Qt 4.7               non-SSE Qt 5.3
|   Data set        fromUtf8   toUtf8    fromUtf8   toUtf8    fromUtf8   toUtf8
|   ASCII only      7.50x      6.22x     6.94x      7.60x     4.45x      4.90x
|   2-char UTF-8    1.17x      1.33x     1.64x      1.56x     1.01x      1.02x
|   3-char UTF-8    1.08x      1.18x     1.48x      1.33x     0.97x      0.92x
|   4-char UTF-8    1.05x      1.19x     1.20x      1.21x     0.97x      0.97x
|   Creator data    3.62x      2.16x     2.60x      1.25x     1.78x      1.23x
|
| As shown by the numbers, the SSE-based code is slightly worse than the non-SSE code for dense non-ASCII strings. However, as evident in the Qt Creator data, most strings manipulated by applications are either pure ASCII or mostly so, so there's a net gain.
|
| Done-with: H. Peter Anvin <hpa@linux.intel.com>
| Change-Id: Ia74fbdfdcd7b088f6cba5048c03a153c01f5dbc1
| Reviewed-by: Lars Knoll <lars.knoll@digia.com>
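
The "high bit" test on the decoding side can be illustrated with a short SSE2 sketch. This is not the Qt decoder: asciiPrefixLength is a hypothetical helper that only finds where the pure-ASCII prefix ends, which is the point where a real decoder would hand over to byte-by-byte processing. Without SSE2 it falls back to a scalar loop.

```cpp
#include <cstddef>

#if defined(__SSE2__) || defined(_M_X64)
#  include <emmintrin.h>
#  define SKETCH_USE_SSE2 1
#else
#  define SKETCH_USE_SSE2 0
#endif

// Returns how many leading bytes of src have the high bit clear (pure ASCII).
inline std::size_t asciiPrefixLength(const unsigned char *src, std::size_t len)
{
    std::size_t i = 0;
#if SKETCH_USE_SSE2
    for (; i + 16 <= len; i += 16) {
        __m128i chunk = _mm_loadu_si128(reinterpret_cast<const __m128i *>(src + i));
        int mask = _mm_movemask_epi8(chunk);   // bit n = high bit of byte n
        if (mask != 0) {
            std::size_t n = 0;
            while (!(mask & 1)) {              // locate the first non-ASCII byte
                mask >>= 1;
                ++n;
            }
            return i + n;
        }
    }
#endif
    // Scalar tail (and the whole input when SSE2 is unavailable).
    for (; i < len && src[i] < 0x80; ++i) {
    }
    return i;
}
```

The same mask is what the commit message refers to when it says the remaining ASCII characters in a 16-character group can be identified "by checking the bits set in the mask".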
* Replace the qCpuHasFeature function with a macroThiago Macieira2013-12-051-5/+1
| | | | | | | | | | | We want to make sure that there's a constant propagation from the static variable that is filled in with the current code-generation options. With most compilers in debug mode, we'd carry dead code. With MSVC, even inlining is really bad even in release mode, and it doesn't perform constant propagation even with __forceinline. Change-Id: I7a95ff6622b864771243990bb5e205b2df0c33fc Reviewed-by: Marc Mutz <marc.mutz@kdab.com>
* Make the inline CPU detection functions also staticThiago Macieira2013-03-261-2/+2
| | | | | | | | | | | | Since qCpuHasFeature() checks the static qCompilerCpuFeatures variable and that variable's value might change depending on the compiler flags, it's best to ensure that the function is not subject to link-time merging. That would be bad if it happened when qCpuHasFeature() was used from a file with higher CPU compiler settings than the default, as it would incorrectly conclude that certain features are always available. Change-Id: I8bacde056fb89869ec1d306a163742e72522315e Reviewed-by: Tor Arne Vestbø <tor.arne.vestbo@digia.com>
* Merge remote-tracking branch 'origin/stable' into devFrederik Gladhorn2013-02-141-1/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: src/concurrent/doc/qtconcurrent.qdocconf src/corelib/doc/qtcore.qdocconf src/corelib/global/qglobal.h src/dbus/doc/qtdbus.qdocconf src/dbus/qdbusmessage.h src/gui/doc/qtgui.qdocconf src/gui/image/qimagereader.cpp src/network/doc/qtnetwork.qdocconf src/opengl/doc/qtopengl.qdocconf src/opengl/qgl.h src/plugins/platforms/windows/qwindowswindow.cpp src/printsupport/doc/qtprintsupport.qdocconf src/sql/doc/qtsql.qdocconf src/testlib/doc/qttestlib.qdocconf src/tools/qdoc/doc/config/qt-cpp-ignore.qdocconf src/widgets/doc/qtwidgets.qdocconf src/xml/doc/qtxml.qdocconf Change-Id: Ie9a1fa2cc44bec22a0b942e817a1095ca3414629
| * Allow x86intrin.h with ICC 13.1Thiago Macieira2013-02-061-1/+1
| | | | | | | | | | | | | | | | The Intel C++ Composer XE 2013 Update 2 (a.k.a. ICC 13.1) has fixed the bug of the undefined intrinsics. Change-Id: If837a0800725d55fed7eff39b9d52c359dabb073 Reviewed-by: Olivier Goffart <ogoffart@woboq.com>
| * Let ICC 13 build Qt again if the system compiler is GCC 4.7.Thiago Macieira2013-02-011-1/+2
| | | | | | | | | | | | | | | | | | | | GCC 4.7 has new builtins in x86intrin.h that ICC 13 does not (yet) understand, causing compilation errors. /usr/lib/gcc/x86_64-redhat-linux/4.7.2/include/adxintrin.h(36): error: identifier "__builtin_ia32_addcarryx_u32" is undefined Change-Id: I1845ccc3bf3ac15aef063bc3f998c5839fa51866 Reviewed-by: Olivier Goffart <ogoffart@woboq.com>
* | Remove QT_{BEGIN,END}_HEADER macro usageSergio Ahumada2013-01-291-4/+0
|/ | | | | | | | | | | The macro was made empty in ba3dc5f3b56d1fab6fe37fe7ae08096d7dc68bcb and is no longer necessary or used. Discussed-on: http://lists.qt-project.org/pipermail/development/2013-January/009284.html Change-Id: Id2bb2e2cabde059305d4af5f12593344ba30f001 Reviewed-by: Laszlo Papp <lpapp@kde.org> Reviewed-by: Jędrzej Nowacki <jedrzej.nowacki@digia.com> Reviewed-by: hjk <hjk121@nokiamail.com>
* Update copyright year in Digia's license headersSergio Ahumada2013-01-181-1/+1
| | | | | Change-Id: Ic804938fc352291d011800d21e549c10acac66fb Reviewed-by: Lars Knoll <lars.knoll@digia.com>
* Change copyrights from Nokia to DigiaIikka Eklund2012-09-221-24/+24
| | | | | | | | Change copyrights and license headers from Nokia to Digia Change-Id: If1cc974286d29fd01ec6c19dd4719a67f4c3f00e Reviewed-by: Lars Knoll <lars.knoll@digia.com> Reviewed-by: Sergio Ahumada <sergio.ahumada@digia.com>
* Make the CPU detection much more efficient in user codeThiago Macieira2012-07-021-3/+60
| First, check that the option in question hasn't been already enabled by the compiler, via compiler switches. If it has been, then we don't need to verify anything, and we can assume that it's safe to use such instructions. For example, on an x86-64 build, qCpuHasFeature(SSE2) is always a constant true.
|
| If the compile-time check fails, then we proceed to try and detect the processor features at runtime. But instead of insisting on a call to qDetectCPUFeatures, allow the code using the detection to read from a variable and simply test it for values. Only if the variable isn't initialised should it make a function call. The Q_ASSUME allows this code to be very efficient even with multiple uses of qCpuHasFeature.
|
| Change the uninitialised value from -1 to 0 so that simpler instructions can be used to check for non-initialisation.
|
| The qDetectCPUFeatures function is renamed to qDetectCpuFeatures to match the Qt coding style and also to catch uses of this code that need to be adapted.
|
| Change-Id: I24ca5a6ad21075e2e249e1a4f8f5057b8f68ce7c
| Reviewed-by: Bradley T. Hughes <bradley.hughes@nokia.com>
| Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
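
The two-stage check described above could be sketched like this. All names are hypothetical, the feature bits are invented, and the "detection" function is a stub that merely records the compile-time set where the real code would query the CPU. A dedicated "initialized" bit keeps the cached word non-zero once detection has run, so 0 can serve as the uninitialised value.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical feature masks; the top bit marks "detection has run".
enum : std::uint32_t {
    FeatureSSE2 = 1u << 0,
    FeatureAVX = 1u << 1,
    FeatureInitialized = 1u << 31
};

// Stage 1: features guaranteed by the compiler flags for this file.
static const std::uint32_t compilerFeatures = 0
#if defined(__SSE2__) || defined(_M_X64)
    | FeatureSSE2
#endif
#if defined(__AVX__)
    | FeatureAVX
#endif
    ;

// Stage 2: cached runtime result; 0 means "not detected yet".
static std::atomic<std::uint32_t> detectedFeatures{0};

// Stub for the expensive probe (CPUID and friends in a real implementation).
inline std::uint32_t detectCpuFeatures()
{
    std::uint32_t f = compilerFeatures | FeatureInitialized;
    detectedFeatures.store(f, std::memory_order_relaxed);
    return f;
}

inline bool cpuHasFeature(std::uint32_t feature)
{
    if (compilerFeatures & feature)
        return true;                          // constant-folds when enabled at compile time
    std::uint32_t f = detectedFeatures.load(std::memory_order_relaxed);
    if (f == 0)
        f = detectCpuFeatures();              // function call only on first use
    return (f & feature) != 0;
}
```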
* Add support for the x86intrin.h header on GCC systems.Thiago Macieira2012-06-091-0/+6
| This header can be included at any time on x86 systems and has been present since the GCC versions that also support AVX. It contains intrinsics for instructions that have been present in x86 CPUs since the dawn of time.
|
| Change-Id: I9adb066c2c0b56ce8fd5ed7366716038f1254502
| Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@nokia.com>
| Reviewed-by: Samuel Rødal <samuel.rodal@nokia.com>
| Reviewed-by: Bradley T. Hughes <bradley.hughes@nokia.com>