| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| |
| |
| |
| | |
Created using spr 1.3.4
[skip ci]
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
getopt.h (#76137)
We previously were defining _BSD_SOURCE right before including getopt.h.
However, on mingw-w64, getopt.h is also transitively included by
unistd.h, and unistd.h can be transitively included by many headers
(recently, by some libc++ headers).
Therefore, to be safe, we need to define _BSD_SOURCE before including
any header. Thus do this in CMake.
This fixes https://github.com/llvm/llvm-project/issues/76050.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
passthru operand. (#75682)
ISD::VP_MERGE treats the false operand as the source for elements past
VL. The vmerge instruction encodes 3 registers and treats the vd
register as the source for the tail.
This patch adds a new ISD opcode that models the tail source explicitly.
During lowering we copy the false operand to this operand.
I think we can merge RISCVISD::VSELECT_VL with this new opcode by using
an UNDEF passthru, but I'll save that for another patch.
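The difference between the two tail behaviours can be sketched with a small Python model. This is only an illustration under simplifying assumptions: vectors are plain lists, tail policy is ignored, and the function names (`vp_merge`, `vmerge_with_passthru`, `lower_vp_merge`) are hypothetical, not LLVM APIs.

```python
def vp_merge(mask, true_vals, false_vals, vl):
    """Model of VP_MERGE: lanes at or past vl come from the false operand."""
    return [
        (t if m else f) if i < vl else f
        for i, (m, t, f) in enumerate(zip(mask, true_vals, false_vals))
    ]


def vmerge_with_passthru(mask, true_vals, false_vals, passthru, vl):
    """Model of a merge whose tail source is an explicit passthru operand."""
    return [
        (t if m else f) if i < vl else p
        for i, (m, t, f, p) in enumerate(
            zip(mask, true_vals, false_vals, passthru)
        )
    ]


def lower_vp_merge(mask, true_vals, false_vals, vl):
    """Lowering as described above: copy the false operand to the passthru."""
    return vmerge_with_passthru(mask, true_vals, false_vals, false_vals, vl)
```

With the false operand copied into the passthru, the explicit-tail form reproduces the VP_MERGE result; any other passthru changes only the lanes past VL.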
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This commit implements conditional compilation for ASan helper code.
As conveyed to me by @EricWF, string benchmarks with UBSan have been
experiencing a significant performance hit since the commit that added ASan
string annotations. This is likely because no-op ASan code
is not optimized out with UBSan. To address this issue, this commit
conditionalizes the inclusion of ASan helper function bodies using
`#ifdef` directives. This approach allows us to include the ASan code
only when it's actually required, thereby enhancing
optimizations and improving performance.
While the issue was noticed in string benchmarks, I expect the same overhead
(just less noticeable) in other containers, therefore `std::vector` and
`std::deque` have the same changes.
To see the impact of that change, run `string.libcxx.out` with UBSan and
`--benchmark_filter=BM_StringAssign` or
`--benchmark_filter=BM_StringConstruct`.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch fixes the erroneous multiple-target requirement in Fortran
offloading tests. Additionally, it adds two new variables
(test_flags_clang, test_flags_flang) to lit.cfg so that
compiler-specific flags for Clang and Flang can be specified.
This patch re-lands: #74543. The error was caused by having:
```
config.substitutions.append(("%flags", config.test_flags))
config.substitutions.append(("%flags_clang", config.test_flags_clang))
config.substitutions.append(("%flags_flang", config.test_flags_flang))
```
when instead it has to be:
```
config.substitutions.append(("%flags_clang", config.test_flags_clang))
config.substitutions.append(("%flags_flang", config.test_flags_flang))
config.substitutions.append(("%flags", config.test_flags))
```
because LIT applies substitutions in list order, and `%flags` is a prefix
of the other two patterns, so it has to come last.
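The ordering problem can be reproduced with a simplified model. The sketch below uses plain string replacement rather than LIT's actual substitution machinery, and the flag values and the `run_line` text are made up for illustration.

```python
def apply_substitutions(line, substitutions):
    """Apply (pattern, replacement) pairs in list order (simplified model)."""
    for pattern, replacement in substitutions:
        line = line.replace(pattern, replacement)
    return line


# Hypothetical flag values, chosen only to make the output easy to read.
FLAGS = "-O2"
FLAGS_CLANG = "-O2 -DCLANG_ONLY"
FLAGS_FLANG = "-O2 -DFLANG_ONLY"

broken_order = [
    ("%flags", FLAGS),
    ("%flags_clang", FLAGS_CLANG),
    ("%flags_flang", FLAGS_FLANG),
]
fixed_order = [
    ("%flags_clang", FLAGS_CLANG),
    ("%flags_flang", FLAGS_FLANG),
    ("%flags", FLAGS),
]

run_line = "compile %flags_flang test.f90"
broken = apply_substitutions(run_line, broken_order)
fixed = apply_substitutions(run_line, fixed_order)
```

In the broken order, `%flags` consumes the front of `%flags_flang` and leaves a stray `_flang` suffix behind; in the fixed order the longer patterns are replaced first.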
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
LLVM ObjectFile currently records the start offsets of sections as the
start of the section header, whereas most other tools (WABT, emscripten,
wasm-tools) record it as the start of the section content, after the
header. This affects binutils tools such as objdump and nm, but not
compilation/assembly (since that is driven by symbols and assembler
labels, which already have their values inside the section payload rather
than in the header). This patch updates LLVM to match the other tools.
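To make the two conventions concrete, here is a minimal Python sketch that walks the sections of a WebAssembly binary and reports both candidate offsets. It assumes the standard layout (an 8-byte module preamble, then sections made of a one-byte id, a ULEB128 size, and the content); the helper names are illustrative and this is not the LLVM ObjectFile code.

```python
def read_uleb128(data, pos):
    """Decode an unsigned LEB128 value; return (value, position after it)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def section_offsets(data):
    """Yield (section_id, header_start, content_start, size) per section."""
    pos = 8  # skip the 4-byte magic and the 4-byte version
    while pos < len(data):
        header_start = pos
        section_id = data[pos]
        size, content_start = read_uleb128(data, pos + 1)
        yield section_id, header_start, content_start, size
        pos = content_start + size
```

`header_start` corresponds to the offset described above as the old LLVM behaviour, and `content_start` to the one the other tools report.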
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
This diff speeds up CDSplit by not considering any hot-warm splitting
point that could break a fall-through branch from a basic block to its
most likely successor.
Co-authored-by: spupyrev <spupyrev@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
(#76184)
… scalar result. (#75820)"
This reverts commit 701f64790520790f75b1f948a752472d421ddaa3.
The commit breaks some uses of the 'maxloc' intrinsic.
See PR #75820
|
| |
| |
| | |
It makes them easier to read.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
TestGlobalModuleCache.py, a recently added test, tries to update a
source file in the build directory, but it assumes the file is writable.
In our distributed build and test system, this is not always true, so
the test often fails with a write permissions error.
This change fixes that by setting the permissions on the file to be
writable before attempting to write to it.
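A minimal sketch of that approach in Python follows. The helper names are hypothetical and the actual test may do this differently; the sketch only shows adding the owner-write bit before overwriting a file.

```python
import os
import stat


def make_writable(path):
    """Add the owner-write bit to path, leaving other permission bits alone."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IWUSR)


def rewrite_file(path, new_contents):
    """Make path writable, then overwrite it with new_contents."""
    make_writable(path)
    with open(path, "w") as f:
        f.write(new_contents)
```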
|
| |
| |
| |
| | |
Improve tests for atomic loads and stores, mainly by testing 128-bit atomic load and store instructions both with and without natural alignment.
|
| |
| |
| | |
Add SME2 DOT builtins.
|
| |
| |
| |
| |
| |
| |
| | |
Renaming a member variable from "Endoding" to "Encoding".
Also replace inlined code for "isNormalized" with a call to the
function, so that if the definition of normalization ever changes, we
only need to change the one place.
|
| |
| |
| |
| |
| |
| |
| |
| | |
large section name (#74381)"
This reverts commit 19fff858931bf575b63a0078cc553f8f93cced20.
Now that explicit large globals are handled properly in the small code model.
|
| |
| |
| |
| | |
Extends fold-arith-extf-into-vector-contract.mlir by adding a test case
for scalable vectors.
|
| |
| |
| |
| | |
(NFC)
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
By looking at whether a global is large instead of looking at the code
model.
This also fixes references to large data in the small code model.
We now always fold any 32-bit offset into the addressing mode with the
large code model since it uses 64-bit relocations.
|
| |
| |
| |
| |
| |
| | |
This reverts commit 9f0f5587426a4ff24b240018cf8bf3acc3c566ae.
Fix expensive checks failure by properly marking register def for ADR.
|
| |
| |
| |
| |
| |
| |
| |
| | |
This patch fixes:
flang/lib/Optimizer/Transforms/StackArrays.cpp:452:7: error:
ignoring return value of function declared with 'nodiscard'
attribute [-Werror,-Wunused-result]
|
| |
| |
| |
| |
| | |
Depositing value into the lowest byte/word is a common code pattern.
This patch improves the code generation for it to avoid redundant AND
and OR operations.
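For reference, the source-level pattern in question looks like the following Python sketch: clear the low bits of the destination with an AND, then OR in the new value. The 64-bit register width and the function name are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1  # assumed register width for this illustration


def deposit_low_bits(dest, value, width):
    """Replace the lowest `width` bits of dest with those of value."""
    low_mask = (1 << width) - 1
    return ((dest & ~low_mask) | (value & low_mask)) & MASK64
```

For example, `deposit_low_bits(0x1122334455667788, 0xAB, 8)` keeps the upper seven bytes and replaces only the lowest one.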
|
| |
| |
| | |
Xcode 14.3.1 seems to have dropped these flags so we are creating unit tests to reproduce the issue.
|
| |
| |
| |
| |
| | |
This reverts commit 199a0f9f5aaf72ff856f68e3bb708e783252af17.
Fixed the left-shift of signed integer which was causing UB.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Follow up to the discussion from #75258, and serves as an alternate
solution for #74670.
Set the location to Unknown for deduplicated / moved / materialized
constants by OperationFolder. This makes sure that the folded constants
don't end up with the arbitrary location of one of the original ops that
were folded into them, and that hoisted ops don't confuse the stepping order.
|
| |
| |
| |
| | |
Also, for consistency make the ZeroOp lowering switch on the ArmSMETileType,
rather than the element bit width.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add missing constant propagation folder for SNegate, [Logical]Not.
Implement additional folding of !(!x) for all ops.
This helps the readability of code lowered into SPIR-V.
Part of work for #70704
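The two kinds of folding mentioned here can be sketched on a toy expression tree. The tuple encoding and the `fold` function are hypothetical and ignore integer bit widths; they only illustrate constant folding of a unary op and the cancellation of a doubled op.

```python
def fold(expr):
    """Fold constants and doubled unary ops in a tiny expression tree.

    Expressions are constants (bool or int), opaque value names (str),
    or ("not", e) / ("neg", e) tuples. This encoding is illustrative only.
    """
    if not isinstance(expr, tuple):
        return expr
    op, operand = expr
    operand = fold(operand)
    # op(op(x)) -> x: both ops modeled here undo themselves.
    if isinstance(operand, tuple) and operand[0] == op:
        return operand[1]
    # Constant folding for a known operand.
    if op == "not" and isinstance(operand, bool):
        return not operand
    if op == "neg" and isinstance(operand, int) and not isinstance(operand, bool):
        return -operand
    return (op, operand)
```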
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This enum is used by dataflow analyses to indicate whether further
propagation is necessary to reach the fix point. Accidentally discarding
such a value will likely lead to propagation stopping early, leading to
incomplete or incorrect results. The most egregious example is the
duality between `join` on the analysis class, which triggers propagation
internally, and `join` on the lattice class that does not and expects
the caller to trigger it depending on the returned `ChangeResult`.
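Why discarding the result matters can be shown with a toy worklist solver in Python. Everything here (`SetLattice`, `propagate`, the `honor_result` switch) is a hypothetical model rather than the MLIR dataflow API; the lattice-level `join` only reports a change, and the caller has to act on it.

```python
from collections import deque
from enum import Enum


class ChangeResult(Enum):
    NO_CHANGE = 0
    CHANGE = 1


class SetLattice:
    """A lattice of sets ordered by inclusion; join is set union."""

    def __init__(self):
        self.values = set()

    def join(self, other_values):
        """Merge other_values in and report whether this lattice changed."""
        before = len(self.values)
        self.values |= other_values
        if len(self.values) != before:
            return ChangeResult.CHANGE
        return ChangeResult.NO_CHANGE


def propagate(edges, seeds, honor_result=True):
    """Forward-propagate facts along edges until the worklist is empty."""
    nodes = set(seeds) | {n for edge in edges for n in edge}
    state = {n: SetLattice() for n in nodes}
    worklist = deque()
    for node, facts in seeds.items():
        state[node].join(set(facts))
        worklist.append(node)
    while worklist:
        node = worklist.popleft()
        for src, dst in edges:
            if src != node:
                continue
            result = state[dst].join(state[node].values)
            # honor_result=False models accidentally discarding the result:
            # dst changed, but nobody schedules it for further propagation.
            if honor_result and result is ChangeResult.CHANGE:
                worklist.append(dst)
    return {n: state[n].values for n in nodes}
```

On the chain `a -> b -> c` seeded at `a`, the solver reaches `c` only when the returned `ChangeResult` is honored; with the result dropped, propagation stops at `b`.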
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Each vector element is reduced independently, which is a form of
multi-reduction.
The plan is to allow for gradual lowering of multi-reduction that
results in fewer `gpu.shuffle` ops at the end:
1d `vector.multi_reduction` --> 1d `gpu.subgroup_reduce` --> smaller 1d
`gpu.subgroup_reduce` --> packed `gpu.shuffle` over i32
For example we can perform 2 independent f16 reductions with a series of
`gpu.shuffles` over i32, reducing the final number of `gpu.shuffles` by 2x.
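The packing idea can be modeled in Python as below. This is a sketch under assumptions: lanes are list entries, the lane count is a power of two, the reduction is addition, and the helper names are made up; it is not the MLIR lowering itself. Each lane holds two f16 values packed into one 32-bit word, so one shuffle per butterfly step moves both.

```python
import struct


def pack_f16x2(a, b):
    """Pack two values as IEEE half floats into one 32-bit integer."""
    return struct.unpack("<I", struct.pack("<2e", a, b))[0]


def unpack_f16x2(word):
    """Inverse of pack_f16x2: return the two half floats as a tuple."""
    return struct.unpack("<2e", struct.pack("<I", word))


def shuffle_xor(lanes, offset):
    """Butterfly shuffle model: lane i reads the value held by lane i ^ offset."""
    return [lanes[i ^ offset] for i in range(len(lanes))]


def packed_subgroup_add(lane_vectors):
    """Reduce two f16 elements per lane independently across all lanes.

    Assumes a power-of-two lane count. Returns (per-lane results, shuffles).
    """
    lanes = [pack_f16x2(a, b) for a, b in lane_vectors]
    shuffles = 0
    offset = 1
    while offset < len(lanes):
        other = shuffle_xor(lanes, offset)
        shuffles += 1  # one i32 shuffle carries both f16 elements
        lanes = [
            pack_f16x2(
                *(x + y for x, y in zip(unpack_f16x2(mine), unpack_f16x2(theirs)))
            )
            for mine, theirs in zip(lanes, other)
        ]
        offset *= 2
    return [unpack_f16x2(word) for word in lanes], shuffles
```

With four lanes this takes two shuffles for both reductions together, whereas shuffling each f16 element separately would take two shuffles per element, four in total.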
|
| | |
|
| |
| |
| | |
Add SME2 MLA/MLS builtins.
|
| |
| |
| |
| |
| |
| |
| | |
(#76167)
…396)"
This reverts commit 8773c9be3d9868288f1f46957945d50ff58e4e91.
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
IR intrinsics were already defined, but no codegen support had been
added.
I extracted this code from our downstream. Some of it may have come from
https://repo.hca.bsc.es/gitlab/rferrer/llvm-epi/ originally.
|
| |
| |
| |
| |
| |
| | |
This reverts commit 934b1099cbf14fa3f86a269dff957da8e5fb619f.
Buildbot failures on sanitizer-x86_64-linux-fast
|
| |
| |
| |
| |
| |
| | |
This reverts commit 5992ce90b8c0fac06436c3c86621fbf6d5398ee5.
Buildbot failures with expensive checks enabled.
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Currently isGuaranteedNotToBeUndef() is the same as
isGuaranteedNotToBeUndefOrPoison(). This function is used in places
where we only care about undef (due to multi-use issues), not poison.
Make it more precise by only considering instructions that can create
undef (like loads or calls), and ignore those that can only create
poison. In particular, we can ignore poison-generating flags.
This means that inferring more flags has less chance to pessimize other
transforms.
|
| |
| |
| |
| |
| | |
Add m_NNegZext() and m_SExtLike() matchers to make doing these kinds
of changes simpler in the future.
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
This patch enables the following builtins for SME2:
- svld1, svld1_vnum
- svldnt1, svldnt1_vnum
- svst1, svst1_vnum
- svstnt1, svstnt1_vnum
|
| |
| |
| |
| |
| |
| |
| |
| | |
The AIX linker does not support the `--whole-archive` option, so remove
the option if the OS is AIX.
---------
Co-authored-by: Mark Danial <mark.danial@ibm.com>
|
| |
| |
| |
| |
| |
| | |
According to Intel SDE, ADCX reads CF and ADOX reads OF. `Uses` was
set to empty by accident; the bug was not exposed because the compiler never
emits these instructions.
|
| |
| |
| | |
It is used for FLAT atomics as well as Global atomics.
|