diff options
author | Elad Cohen <elad2.cohen@intel.com> | 2017-05-03 12:28:54 +0000 |
---|---|---|
committer | Elad Cohen <elad2.cohen@intel.com> | 2017-05-03 12:28:54 +0000 |
commit | ea59a2471779a76cb73cd4d83bec3a022ff91e21 (patch) | |
tree | 69b5cb1b5ca71d2aecf0829a1bcd41a654bdf988 /docs | |
parent | 6804210951b1ab7c7b807458ae2a19d03cb4e705 (diff) |
Support arbitrary address space pointers in masked gather/scatter intrinsics.
Fixes PR31789 - When loop-vectorize tries to use these intrinsics for a
non-default address space pointer we fail with a "Calling a function with a
bad singature!" assertion. This patch solves this by adding the 'vector of
pointers' argument as an overloaded type which will determine the address
space.
Differential revision: https://reviews.llvm.org/D31490
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302018 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs')
-rw-r--r-- | docs/LangRef.rst | 18 |
1 files changed, 9 insertions, 9 deletions
diff --git a/docs/LangRef.rst b/docs/LangRef.rst index 2eab1b8a3dba..dad99e3352dd 100644 --- a/docs/LangRef.rst +++ b/docs/LangRef.rst @@ -7915,7 +7915,7 @@ makes sense: ; get pointers for 8 elements from array B %ptrs = getelementptr double, double* %B, <8 x i32> %C ; load 8 elements from array B into A - %A = call <8 x double> @llvm.masked.gather.v8f64(<8 x double*> %ptrs, + %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0f64(<8 x double*> %ptrs, i32 8, <8 x i1> %mask, <8 x double> %passthru) Conversion Operations @@ -12024,9 +12024,9 @@ This is an overloaded intrinsic. The loaded data are multiple scalar values of a :: - declare <16 x float> @llvm.masked.gather.v16f32 (<16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>) - declare <2 x double> @llvm.masked.gather.v2f64 (<2 x double*> <ptrs>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>) - declare <8 x float*> @llvm.masked.gather.v8p0f32 (<8 x float**> <ptrs>, i32 <alignment>, <8 x i1> <mask>, <8 x float*> <passthru>) + declare <16 x float> @llvm.masked.gather.v16f32.v16p0f32 (<16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>) + declare <2 x double> @llvm.masked.gather.v2f64.v2p1f64 (<2 x double addrspace(1)*> <ptrs>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>) + declare <8 x float*> @llvm.masked.gather.v8p0f32.v8p0p0f32 (<8 x float**> <ptrs>, i32 <alignment>, <8 x i1> <mask>, <8 x float*> <passthru>) Overview: """"""""" @@ -12049,7 +12049,7 @@ The semantics of this operation are equivalent to a sequence of conditional scal :: - %res = call <4 x double> @llvm.masked.gather.v4f64 (<4 x double*> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> undef) + %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0f64 (<4 x double*> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> undef) ;; The gather with all-true mask is equivalent to the following instruction sequence %ptr0 = extractelement <4 x double*> %ptrs, i32 0 @@ -12078,9 +12078,9 @@ This is an overloaded intrinsic. The data stored in memory is a vector of any in :: - declare void @llvm.masked.scatter.v8i32 (<8 x i32> <value>, <8 x i32*> <ptrs>, i32 <alignment>, <8 x i1> <mask>) - declare void @llvm.masked.scatter.v16f32 (<16 x float> <value>, <16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>) - declare void @llvm.masked.scatter.v4p0f64 (<4 x double*> <value>, <4 x double**> <ptrs>, i32 <alignment>, <4 x i1> <mask>) + declare void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> <value>, <8 x i32*> <ptrs>, i32 <alignment>, <8 x i1> <mask>) + declare void @llvm.masked.scatter.v16f32.v16p1f32 (<16 x float> <value>, <16 x float addrspace(1)*> <ptrs>, i32 <alignment>, <16 x i1> <mask>) + declare void @llvm.masked.scatter.v4p0f64.v4p0p0f64 (<4 x double*> <value>, <4 x double**> <ptrs>, i32 <alignment>, <4 x i1> <mask>) Overview: """"""""" @@ -12101,7 +12101,7 @@ The '``llvm.masked.scatter``' intrinsics is designed for writing selected vector :: ;; This instruction unconditionally stores data vector in multiple addresses - call @llvm.masked.scatter.v8i32 (<8 x i32> %value, <8 x i32*> %ptrs, i32 4, <8 x i1> <true, true, .. true>) + call @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> %value, <8 x i32*> %ptrs, i32 4, <8 x i1> <true, true, .. true>) ;; It is equivalent to a list of scalar stores %val0 = extractelement <8 x i32> %value, i32 0 |