| Commit message (Collapse) | Author | Age | Files | Lines |
Summary:
These all had somewhat custom file headers with different text from the
ones I searched for previously, and so I missed them. Thanks to Hal and
Kristina and others who prompted me to fix this, and sorry it took so
long.
Reviewers: hfinkel
Subscribers: mcrosier, javed.absar, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60406
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@357941 91177308-0d34-0410-b5e6-96231b3b80d8
matching of MSVC behavior with #pragma pack.
Summary:
With MSVC, #pragma pack is ignored when there is explicit alignment. This differs from gcc. Clang emulates this difference when compiling for Windows.
It appears that MSVC and its headers consider the __m128/__m128i/__m128d/etc. types to be explicitly aligned and ignore #pragma pack for them. Since we don't have explicit alignment on them in our headers, we don't match the MSVC behavior here.
This patch adds explicit alignment to match this behavior. I'm hoping this won't cause any problems when we're not emulating MSVC, but if someone knows of something that would be different, we can switch to conditionally adding the alignment based on _MSC_VER.
I had to add explicitly unaligned types as well so we could use them in the loadu/storeu intrinsics, which use __attribute__((__packed__)). Using the now explicitly aligned types wouldn't produce align 1 accesses when targeting Windows.
Reviewers: rnk, erichkeane, spatel, RKSimon
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D57961
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@353555 91177308-0d34-0410-b5e6-96231b3b80d8
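To make the aligned/unaligned split above concrete, here is a minimal sketch using the GCC/Clang vector extensions rather than the real headers. The names vec4i, vec4i_u, unaligned_box and load_unaligned are hypothetical stand-ins for illustration, not the identifiers the patch adds.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical explicitly aligned vector type (stand-in for an __m128i-like type). */
typedef int32_t vec4i __attribute__((vector_size(16), aligned(16)));

/* Hypothetical explicitly unaligned variant of the same vector type. */
typedef int32_t vec4i_u __attribute__((vector_size(16), aligned(1)));

/* Packed wrapper: the member uses the unaligned type, so the access through
   this struct is not assumed to be 16-byte aligned. may_alias lets the
   struct be read through a pointer to arbitrary bytes. */
struct unaligned_box {
  vec4i_u v;
} __attribute__((packed, may_alias));

/* Load four 32-bit lanes from an address with no alignment guarantee. */
static inline vec4i load_unaligned(const void *p) {
  return ((const struct unaligned_box *)p)->v;
}
```

A caller can copy four int32_t values to an odd offset in a byte buffer and read them back with load_unaligned(buf + 1). The sketch only shows the shape of the idea; it says nothing about how MSVC itself lays out such types.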
This adds
_mm_loadu_epi8, _mm256_loadu_epi8, _mm512_loadu_epi8
_mm_loadu_epi16, _mm256_loadu_epi16, _mm512_loadu_epi16
_mm_storeu_epi8, _mm256_storeu_epi8, _mm512_storeu_epi8
_mm_storeu_epi16, _mm256_storeu_epi16, _mm512_storeu_epi16
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@344862 91177308-0d34-0410-b5e6-96231b3b80d8
This patch lowers the _mm[256|512]_cvtepi{64|32|16}_epi{32|16|8} intrinsics to
native IR in cases where the result's length is less than 128 bits.
The resulting IR for 256-bit inputs is folded into VPMOV instructions, while for
128-bit inputs the vpshufb (or, in the 64-to-32-bit case, vinsertps)
instructions are generated instead.
Differential Revision: https://reviews.llvm.org/D48712
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@336643 91177308-0d34-0410-b5e6-96231b3b80d8
width. Add a min_vector_width function attribute and tag all x86 intrinsics with it.
This is part of an ongoing attempt at making 512-bit vectors illegal in the X86 backend type legalizer due to CPU frequency penalties associated with wide vectors on Skylake Server CPUs. We want the loop vectorizer to be able to emit IR containing wide vectors as intermediate operations in vectorized code and allow these wide vectors to be legalized to 256 bits by the X86 backend even though we are targeting a CPU that supports 512-bit vectors. This is similar to what happens with an AVX2 CPU: the vectorizer can emit wide vectors and the backend will split them. We want this splitting behavior, but still be able to use new Skylake instructions that work on 256-bit vectors and support things like masking and gather/scatter.
Of course if the user uses explicit vector code in their source code we need to not split those operations. Especially if they have used any of the 512-bit vector intrinsics from immintrin.h. And we need to make it so that merely using the intrinsics produces the expected code in order to be backwards compatible.
To support this goal, this patch adds a new IR function attribute "min-legal-vector-width" that can indicate the need for a minimum vector width to be legal in the backend. We need to ensure this attribute is set to the largest vector width needed by any intrinsics from immintrin.h that the function uses. The inliner will be responsible for merging this attribute when a function is inlined. We may also need a way to limit inlining in the future, but we can discuss that later.
To make things more complicated, there are two different ways intrinsics are implemented in immintrin.h: either as an always_inline function containing calls to builtins (which can be target specific or target independent) or vector extension code, or as a macro wrapper around a target specific builtin. I believe I've removed all cases where the macro was around a target independent builtin.
To support the always_inline function case this patch adds __attribute__((min_vector_width(128))) that can be used to tag these functions with their vector width. All x86 intrinsic functions that operate on vectors have been tagged with this attribute.
To support the macro case, all x86 specific builtins have also been tagged with the vector width that they require. Use of any builtin with this property will implicitly increase the min_vector_width of the function that calls it. I've done this as a new property in the attribute string for the builtin rather than basing it on the type string so that we can opt into it on a per builtin basis and avoid any impact to target independent builtins.
There will be future work to support vectors passed as function arguments and to support inline assembly, and whatever else we can find that isn't covered by this patch.
Special thanks to Chandler who suggested this direction and reviewed a preview version of this patch. And thanks to Eric Christopher who has had many conversations with me about this issue.
Differential Revision: https://reviews.llvm.org/D48617
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@336583 91177308-0d34-0410-b5e6-96231b3b80d8
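As a rough illustration of the always_inline function case described above, the sketch below tags a small vector function with the attribute. The macro MIN_VECTOR_WIDTH, the type vec4i and the function add_lanes are hypothetical names for this example; the guard assumes the compiler may not know the attribute, in which case the macro expands to nothing.

```c
#include <assert.h>
#include <stdint.h>

/* Use the attribute only where the compiler reports support for it.
   MIN_VECTOR_WIDTH is a hypothetical helper macro, not part of any header. */
#if defined(__has_attribute)
#  if __has_attribute(min_vector_width)
#    define MIN_VECTOR_WIDTH(bits) __attribute__((min_vector_width(bits)))
#  endif
#endif
#ifndef MIN_VECTOR_WIDTH
#  define MIN_VECTOR_WIDTH(bits) /* attribute unavailable: expands to nothing */
#endif

/* Hypothetical 128-bit vector of four 32-bit integers. */
typedef int32_t vec4i __attribute__((vector_size(16)));

/* A 128-bit operation tagged with the width it needs to stay legal. */
static inline MIN_VECTOR_WIDTH(128) vec4i add_lanes(vec4i a, vec4i b) {
  return a + b;
}
```

The attribute does not change what add_lanes computes. Per the description above, it only records the vector width that callers need the backend to keep legal.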
All of these were found by grepping through IR from the builtin tests for extra trunc and zext/sext instructions that shouldn't have been there.
Some of these were real bugs where we lost bits from the user input:
_mm512_mask_broadcast_f32x8
_mm512_maskz_broadcast_f32x8
_mm512_mask_broadcast_i32x8
_mm512_maskz_broadcast_i32x8
_mm256_mask_cvtusepi16_storeu_epi8
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@336042 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@334385 91177308-0d34-0410-b5e6-96231b3b80d8
I think this is a holdover from when we used to declare variables inside the macros. And then it's been copied and pasted forward for years every time a new macro intrinsic gets added.
Interestingly this caused some tests for IRGen to be slightly more optimized. We now return a zeroinitializer directly instead of going through a store+load.
It also removed a bogus error message on another test.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@333613 91177308-0d34-0410-b5e6-96231b3b80d8
Intel Intrinsics Guide.
We had quite a few for different element sizes of integers, sometimes with strange target features attached to them.
We only need a single version for each of __m128i, __m256i, and __m512i with the target feature that first introduced those types.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@333568 91177308-0d34-0410-b5e6-96231b3b80d8
to a single version without masking. Use select builtins with appropriate operand instead.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@333387 91177308-0d34-0410-b5e6-96231b3b80d8
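The "unmasked operation plus select" pattern that this and many of the later commits refer to can be modelled in portable C. This is only a sketch of the semantics: select_lanes, add_lanes, mask_add and maskz_add are hypothetical names, and the loop stands in for whatever select builtin the headers actually call.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit vector of four 32-bit integers. */
typedef int32_t vec4i __attribute__((vector_size(16)));

/* Model of a select: lane i comes from on_true if mask bit i is set,
   otherwise from on_false. */
static inline vec4i select_lanes(uint8_t mask, vec4i on_true, vec4i on_false) {
  vec4i r = on_false;
  for (int i = 0; i < 4; ++i) {
    if ((mask >> i) & 1)
      r[i] = on_true[i];
  }
  return r;
}

/* The single unmasked operation. */
static inline vec4i add_lanes(vec4i a, vec4i b) {
  return a + b;
}

/* Merge-masked form: unselected lanes keep the value from src. */
static inline vec4i mask_add(vec4i src, uint8_t mask, vec4i a, vec4i b) {
  return select_lanes(mask, add_lanes(a, b), src);
}

/* Zero-masked form: unselected lanes become zero. */
static inline vec4i maskz_add(uint8_t mask, vec4i a, vec4i b) {
  vec4i zero = {0, 0, 0, 0};
  return select_lanes(mask, add_lanes(a, b), zero);
}
```

The point of the design is that only one unmasked operation is needed. Both masked variants differ only in the operand passed as the fallback to the select.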
in IR instead.
Someday maybe we'll use selects for all the builtins.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@332825 91177308-0d34-0410-b5e6-96231b3b80d8
introduced in r332266.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@332738 91177308-0d34-0410-b5e6-96231b3b80d8
builtins.
As long as the destination type is a 256 or 128 bit vector with the same number of elements, we can use __builtin_convertvector to directly generate a trunc IR instruction, which will be handled natively by the backend.
Differential Revision: https://reviews.llvm.org/D46742
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@332266 91177308-0d34-0410-b5e6-96231b3b80d8
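A small sketch of the truncation described above, written with generic vector extension types instead of the real intrinsic headers. The names vec4i64, vec4i32 and narrow_i64_to_i32 are hypothetical, and the example assumes a compiler that provides __builtin_convertvector (Clang, or a reasonably recent GCC).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 256-bit source and 128-bit destination with the same lane count. */
typedef int64_t vec4i64 __attribute__((vector_size(32)));
typedef int32_t vec4i32 __attribute__((vector_size(16)));

/* Narrow each 64-bit lane to 32 bits. Because the element count matches,
   this is a plain per-lane conversion that keeps the low 32 bits. */
static inline void narrow_i64_to_i32(const vec4i64 *in, vec4i32 *out) {
  *out = __builtin_convertvector(*in, vec4i32);
}
```

The vectors are passed by pointer here only to keep the sketch independent of which vector registers the target enables. The conversion itself is the single __builtin_convertvector call.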
Change the intrinsic header files to lower the test and testn intrinsics to IR code.
Remove the test and testn builtins from clang.
Differential Revision: https://reviews.llvm.org/D38737
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@318035 91177308-0d34-0410-b5e6-96231b3b80d8
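For readers unfamiliar with test and testn, the following portable sketch models what they compute per lane, as I understand it: AND the two inputs, then compare against zero. The names test_mask and testn_mask are hypothetical, and the loop is only a model of the result, not the code in the headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit vector of four 32-bit integers. */
typedef int32_t vec4i __attribute__((vector_size(16)));

/* test: mask bit i is set when (a[i] & b[i]) is non-zero. */
static inline uint8_t test_mask(vec4i a, vec4i b) {
  vec4i anded = a & b;
  uint8_t k = 0;
  for (int i = 0; i < 4; ++i) {
    if (anded[i] != 0)
      k |= (uint8_t)(1u << i);
  }
  return k;
}

/* testn: mask bit i is set when (a[i] & b[i]) is zero. */
static inline uint8_t testn_mask(vec4i a, vec4i b) {
  vec4i anded = a & b;
  uint8_t k = 0;
  for (int i = 0; i < 4; ++i) {
    if (anded[i] == 0)
      k |= (uint8_t)(1u << i);
  }
  return k;
}
```

Expressed this way, both operations are just an AND followed by a not-equal or equal comparison with zero, which is why they need no dedicated builtin.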
macros that just pass the right comparison predicate value to the regular cmp intrinsic. Remove mask cmpeq/cmpgt builtins that are now unused.
This shortens the intrinsic headers a little and allows us to get rid of the cmpeq and cmpgt handling from CGBuiltin.cpp.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@317506 91177308-0d34-0410-b5e6-96231b3b80d8
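The macro approach above can be sketched in portable C as follows. Everything here is illustrative: cmp_pred, cmp_mask, cmpeq_mask and cmpgt_mask are hypothetical names, and the predicate values are arbitrary for this example rather than taken from the real headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit vector of four 32-bit integers. */
typedef int32_t vec4i __attribute__((vector_size(16)));

/* Hypothetical comparison predicates for the general compare. */
enum cmp_pred { CMP_EQ, CMP_LT, CMP_LE, CMP_NE, CMP_GE, CMP_GT };

/* The one general compare: returns a bit mask with bit i set when the
   predicate holds for lane i (signed comparison). */
static inline uint8_t cmp_mask(vec4i a, vec4i b, enum cmp_pred pred) {
  uint8_t k = 0;
  for (int i = 0; i < 4; ++i) {
    int hit = 0;
    switch (pred) {
    case CMP_EQ: hit = a[i] == b[i]; break;
    case CMP_LT: hit = a[i] <  b[i]; break;
    case CMP_LE: hit = a[i] <= b[i]; break;
    case CMP_NE: hit = a[i] != b[i]; break;
    case CMP_GE: hit = a[i] >= b[i]; break;
    case CMP_GT: hit = a[i] >  b[i]; break;
    }
    if (hit)
      k |= (uint8_t)(1u << i);
  }
  return k;
}

/* The specialised forms are thin macros that only pick the predicate. */
#define cmpeq_mask(a, b) cmp_mask((a), (b), CMP_EQ)
#define cmpgt_mask(a, b) cmp_mask((a), (b), CMP_GT)
```

With this shape, adding or removing a specialised comparison touches only a one-line macro, which matches the header shortening the commit describes.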
This patch, together with a matching llvm patch (https://reviews.llvm.org/D37669), implements the lowering of X86 mask set1 intrinsics to IR.
Differential Revision: https://reviews.llvm.org/D37668
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@313624 91177308-0d34-0410-b5e6-96231b3b80d8
unmasked versions and selects.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@287313 91177308-0d34-0410-b5e6-96231b3b80d8
earlier conversion away from a macro. NFC
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@286756 91177308-0d34-0410-b5e6-96231b3b80d8
them with unmasked builtins and selects.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@285539 91177308-0d34-0410-b5e6-96231b3b80d8
Replace with unmasked builtins and select.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@285516 91177308-0d34-0410-b5e6-96231b3b80d8
unmasked builtins and select instead.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@285505 91177308-0d34-0410-b5e6-96231b3b80d8
with selects and the older unmasked builtins.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@284954 91177308-0d34-0410-b5e6-96231b3b80d8
selects and the older unmasked builtins.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@284935 91177308-0d34-0410-b5e6-96231b3b80d8
and older unmasked builtins.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@284929 91177308-0d34-0410-b5e6-96231b3b80d8
and older unmasked builtins.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@284928 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@284927 91177308-0d34-0410-b5e6-96231b3b80d8
and the older unmasked builtins.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@284925 91177308-0d34-0410-b5e6-96231b3b80d8
and the older non-masked versions instead.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@284924 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@284923 91177308-0d34-0410-b5e6-96231b3b80d8
select in the header file with the older unmasked versions instead.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@284920 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@280597 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@280596 91177308-0d34-0410-b5e6-96231b3b80d8
Thanks to Simon Pilgrim for catching the mistake.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@276564 91177308-0d34-0410-b5e6-96231b3b80d8
64-bit GPRs are available.
Use of these intrinsics in a 32-bit build results in assertions in the backend.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@276249 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@274544 91177308-0d34-0410-b5e6-96231b3b80d8
_mm{|256|512}_mask_cvt{s|us|}epi16_storeu_epi8 intrinsics
Differential Revision: http://reviews.llvm.org/D21729
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@274532 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@273533 91177308-0d34-0410-b5e6-96231b3b80d8
Probably no real functional change.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@273389 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@273386 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@272466 91177308-0d34-0410-b5e6-96231b3b80d8
directly with __builtin_shufflevector and __builtin_ia32_select. Also improve the formatting of the AVX2 version.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@272452 91177308-0d34-0410-b5e6-96231b3b80d8
This will allow us to remove the x86 intrinsics from the backend.
Differential Revision: http://reviews.llvm.org/D21060
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@272141 91177308-0d34-0410-b5e6-96231b3b80d8
not be multiplied by 8.
The 512-bit version was fixed recently but this was missed.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@270970 91177308-0d34-0410-b5e6-96231b3b80d8
Remove leading underscores from macro argument names. Add explicit typecasts to all macro arguments and return values. And finally reformat after all the adjustments.
This is a mostly mechanical change accomplished with a script. I tried to split out any changes to the typecasts that already existed into separate commits.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@269743 91177308-0d34-0410-b5e6-96231b3b80d8
instruction set
Differential Revision: http://reviews.llvm.org/D19766
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@268385 91177308-0d34-0410-b5e6-96231b3b80d8
Differential Revision: http://reviews.llvm.org/D19591
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@267942 91177308-0d34-0410-b5e6-96231b3b80d8
instruction set
Differential Revision: http://reviews.llvm.org/D19588
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@267876 91177308-0d34-0410-b5e6-96231b3b80d8
instruction set
Differential Revision: http://reviews.llvm.org/D19195
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@267380 91177308-0d34-0410-b5e6-96231b3b80d8
VPBROADCASTB/W/D/Q instruction set
Differential Revision: http://reviews.llvm.org/D19012
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@266195 91177308-0d34-0410-b5e6-96231b3b80d8
cvt{b|d|q}2mask{128|256|512} and cvtmask2{b|d|q}{128|256|512} instruction set.
Differential Revision: http://reviews.llvm.org/D19009
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@266188 91177308-0d34-0410-b5e6-96231b3b80d8