Commit message  Author  Age  Files  Lines
* [𝘀𝗽𝗿] changes introduced through rebaseupstream/users/vitalybuka/spr/main.hwasan-print-stack-overflow-underflow-uasVitaly Buka2023-12-21282-3880/+15589
|\ | | | | | | | | | | Created using spr 1.3.4 [skip ci]
| * [hwasan] Respect strip_path_prefix printing locals (#76132)Vitaly Buka2023-12-212-2/+35
| |
| * [ADT] fix grammatical typo in Twine.h docs, NFCCyndy Ishida2023-12-211-1/+1
| |
| * [LLDB] Define _BSD_SOURCE globally, to get optreset available in mingw's ↵Martin Storsjö2023-12-222-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | getopt.h (#76137) We previously were defining _BSD_SOURCE right before including getopt.h. However, on mingw-w64, getopt.h is also transitively included by unistd.h, and unistd.h can be transitively included by many headers (recently, by some libc++ headers). Therefore, to be safe, we need to define _BSD_SOURCE before including any header. Thus do this in CMake. This fixes https://github.com/llvm/llvm-project/issues/76050.
| * [RISCV] Replace RISCVISD::VP_MERGE_VL with a new node that has a separate ↵Craig Topper2023-12-213-67/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | passthru operand. (#75682) ISD::VP_MERGE treats the false operand as the source for elements past VL. The vmerge instruction encodes 3 registers and treats the vd register as the source for the tail. This patch adds a new ISD opcode that models the tail source explicitly. During lowering we copy the false operand to this operand. I think we can merge RISCVISD::VSELECT_VL with this new opcode by using an UNDEF passthru, but I'll save that for another patch.
| * [llvm][docs][X86] Mention code model improvements in ReleaseNotes (#76190)Arthur Eubanks2023-12-211-0/+5
| |
| * [ASan][libc++] Optimization of container annotations (#76082)Tacet2023-12-213-0/+49
| | This commit implements conditional compilation for ASan helper code. As conveyed to me by @EricWF, string benchmarks with UBSan have been experiencing a significant performance hit after the commit with ASan string annotations. This is likely due to the fact that no-op ASan code is not optimized out with UBSan. To address this issue, this commit conditionalizes the inclusion of ASan helper function bodies using `#ifdef` directives. This approach allows us to selectively include only the ASan code when it's actually required, thereby enhancing optimizations and improving performance. While the issue was noticed in string benchmarks, I expect the same overhead (just less noticeable) in other containers, therefore `std::vector` and `std::deque` have the same changes. To see the impact of that change, run `string.libcxx.out` with UBSan and `--benchmark_filter=BM_StringAssign` or `--benchmark_filter=BM_StringConstruct`.
| * Reland [OpenMP][Fix] libomptarget Fortran tests (#76189)Fabian Mora2023-12-2111-15/+21
| | This patch fixes the erroneous multiple-target requirement in Fortran offloading tests. Additionally, it adds two new variables (test_flags_clang, test_flags_flang) to lit.cfg so that compiler-specific flags for Clang and Flang can be specified.
| |
| | This patch re-lands: #74543. The error was caused by having:
| | ```
| | config.substitutions.append(("%flags", config.test_flags))
| | config.substitutions.append(("%flags_clang", config.test_flags_clang))
| | config.substitutions.append(("%flags_flang", config.test_flags_flang))
| | ```
| | when instead it has to be:
| | ```
| | config.substitutions.append(("%flags_clang", config.test_flags_clang))
| | config.substitutions.append(("%flags_flang", config.test_flags_flang))
| | config.substitutions.append(("%flags", config.test_flags))
| | ```
| | because LIT replaces with the first longest sub-string match.
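The ordering problem described in this commit can be illustrated with a minimal sketch. It assumes a simplified model in which substitutions are applied one after another, in list order, using plain string replacement; it is not lit's actual implementation. The helper `apply_substitutions` and the flag values are hypothetical.

```python
def apply_substitutions(line, substitutions):
    """Apply (pattern, replacement) pairs in list order.

    Simplified model only: each pattern is replaced with str.replace,
    so an earlier, shorter pattern can consume the prefix of a longer one.
    """
    for pattern, replacement in substitutions:
        line = line.replace(pattern, replacement)
    return line


# Hypothetical flag values, chosen only to make the effect visible.
broken_order = [
    ("%flags", "-O1"),
    ("%flags_clang", "-CLANG"),
    ("%flags_flang", "-FLANG"),
]
fixed_order = [
    ("%flags_clang", "-CLANG"),
    ("%flags_flang", "-FLANG"),
    ("%flags", "-O1"),
]

# "%flags" is a prefix of "%flags_clang", so it must be registered last.
broken = apply_substitutions("%flags_clang", broken_order)
fixed = apply_substitutions("%flags_clang", fixed_order)
print(broken)
print(fixed)
```

Under this model, registering the shorter `%flags` pattern first leaves a stray `_clang` suffix behind, which is why the longer, more specific patterns are appended first in the fixed version.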
| * [WebAssembly][Object] Record section start offsets at start of payload (#76188)Derek Schuff2023-12-217-58/+58
| | LLVM ObjectFile currently records the start offsets of sections as the start of the section header, whereas most other tools (WABT, emscripten, wasm-tools) record it as the start of the section content, after the header. This affects binutils tools such as objdump and nm, but not compilation/assembly (since that is driven by symbols and assembler labels, which already have their values inside the section payload rather than in the header). This patch updates LLVM to match the other tools.
| * [test][hwasan] Update tests missed by #76130Vitaly Buka2023-12-212-4/+4
| |
| * [BOLT] Don't split likely fallthrough in CDSplit (#76164)ShatianWang2023-12-212-41/+68
| | | | | | | | | | | | | | This diff speeds up CDSplit by not considering any hot-warm splitting point that could break a fall-through branch from a basic block to its most likely successor. Co-authored-by: spupyrev <spupyrev@fb.com>
| * Revert "[Flang] Allow Intrinsic simpification with min/maxloc dim and… ↵Pete Steinfeld2023-12-212-68/+13
| | | | | | | | | | | | | | | | | | | | | | (#76184) … scalar result. (#75820)" This reverts commit 701f64790520790f75b1f948a752472d421ddaa3. The commit breaks some uses of the 'maxloc' intrinsic. See PR #75820
| * [hwasan] Separate sections in report (#76130)Vitaly Buka2023-12-211-4/+6
| | | | | | It makes them easier to read.
| * [LLDB] Fix write permission error in TestGlobalModuleCache.py (#76171)cmtice2023-12-211-0/+7
| | | | | | | | | | | | | | | | | | TestGlobalModuleCache.py, a recently added test, tries to update a source file in the build directory, but it assumes the file is writable. In our distributed build and test system, this is not always true, so the test often fails with a write permissions error. This change fixes that by setting the permissions on the file to be writable before attempting to write to it.
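The fix described above (make the file writable, then write to it) can be sketched as follows. This is an illustration under stated assumptions, not the code from the test: the helper names `make_writable` and `rewrite_source` are hypothetical, and only the owner write bit is added.

```python
import os
import stat
import tempfile


def make_writable(path):
    """Add the owner write bit to the file's current permissions."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IWUSR)


def rewrite_source(path, text):
    """Ensure the file is writable, then replace its contents."""
    make_writable(path)
    with open(path, "w") as f:
        f.write(text)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as build_dir:
        src = os.path.join(build_dir, "main.c")  # hypothetical file name
        with open(src, "w") as f:
            f.write("int main(void) { return 0; }\n")
        os.chmod(src, 0o444)  # simulate a read-only file in the build tree
        rewrite_source(src, "int main(void) { return 1; }\n")
        with open(src) as f:
            print(f.read(), end="")
```

Preserving the existing mode bits and adding only the write bit keeps the change as small as possible; a test could equally set a fixed mode if that suits its environment better.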
| * [SystemZ] Test improvements for atomic load/store instructions (NFC). (#75630)Jonas Paulsson2023-12-216-33/+67
| | Improve tests for atomic loads and stores, mainly by testing 128-bit atomic load and store instructions both with and without natural alignment.
| * [AArch64][SME2] Add builtins for FDOT, BFDOT, SUDOT, USDOT, SDOT, UDOT. (#75737)Dinar Temirbulatov2023-12-216-0/+1755
| | | | | | Add SME2 DOT builtins.
| * [AccelTable][NFC] Fix typos and duplicated code (#76155)Felipe de Azevedo Piovezan2023-12-212-7/+5
| | | | | | | | | | | | | | Renaming a member variable from "Endoding" to "Encoding". Also replace inlined code for "isNormalized" with a call to the function, so that if the definition of normalization ever changes, we only need to change the one place.
| * Reapply "[X86] Set SHF_X86_64_LARGE for globals with explicit well-known ↵Arthur Eubanks2023-12-212-10/+10
| | large section name (#74381)" This reverts commit 19fff858931bf575b63a0078cc553f8f93cced20, now that explicit large globals are handled properly in the small code model.
| * [mlir][vector][nfc] Add a test case for scalable vectors (#76138)Andrzej Warzyński2023-12-211-3/+35
| | | | | | | | Extends fold-arith-extf-into-vector-contract.mlir by adding a test case for scalable vectors.
| * [llvm-profdata] Modernize FuncSampleStats, ValueSitesStats, and HotFuncInfo ↵Kazu Hirata2023-12-211-16/+13
| | | | | | | | (NFC)
| * [X86] Fix more medium code model addressing modes (#75641)Arthur Eubanks2023-12-215-66/+68
| | | | | | | | | | | | | | | | | | By looking at whether a global is large instead of looking at the code model. This also fixes references to large data in the small code model. We now always fold any 32-bit offset into the addressing mode with the large code model since it uses 64-bit relocations.
| * Re-land "[AArch64] Codegen support for FEAT_PAuthLR" (#75947)Tomas Matheson2023-12-2121-25/+752
| | | | | | | | | | | | This reverts commit 9f0f5587426a4ff24b240018cf8bf3acc3c566ae. Fix expensive checks failure by properly marking register def for ADR.
| * [flang] Fix a warningKazu Hirata2023-12-211-1/+1
| | | | | | | | | | | | | | | | This patch fixes: flang/lib/Optimizer/Transforms/StackArrays.cpp:452:7: error: ignoring return value of function declared with 'nodiscard' attribute [-Werror,-Wunused-result]
| * [ISel] Add pattern matching for depositing subreg value (#75978)David Li2023-12-212-0/+110
| | | | | | | | | | Depositing value into the lowest byte/word is a common code pattern. This patch improves the code generation for it to avoid redundant AND and OR operations.
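The source-level pattern this commit refers to, depositing a value into the lowest byte or word of a wider value, can be sketched in plain integer arithmetic. The function names are hypothetical and 32-bit values are an assumption; the sketch shows the AND/OR idiom being matched, not the instructions the backend emits.

```python
MASK32 = 0xFFFFFFFF  # assume 32-bit values for this illustration


def deposit_low_byte(word, value):
    """Replace the lowest 8 bits of `word` with the lowest 8 bits of `value`."""
    return ((word & ~0xFF) | (value & 0xFF)) & MASK32


def deposit_low_word(word, value):
    """Replace the lowest 16 bits of `word` with the lowest 16 bits of `value`."""
    return ((word & ~0xFFFF) | (value & 0xFFFF)) & MASK32


print(hex(deposit_low_byte(0x12345678, 0xAB)))
print(hex(deposit_low_word(0x12345678, 0xBEEF)))
```

When the target has addressable sub-registers, this clear-then-merge sequence amounts to writing only the low part of a register, which is the redundancy the pattern matching aims to remove.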
| * Add tests for driver to propagate module map flags for layering check (#75827)Walter Lee2023-12-211-1/+15
| | | | | | Xcode 14.3.1 seems to have dropped these flags so we are creating unit tests to reproduce the issue.
| * Re-land "[AArch64] Add FEAT_PAuthLR assembler support" (#75947)Tomas Matheson2023-12-2115-4/+518
| | | | | | | | | | This reverts commit 199a0f9f5aaf72ff856f68e3bb708e783252af17. Fixed the left-shift of signed integer which was causing UB.
| * [MLIR] Erase location of folded constants (#75415)Billy Zhu2023-12-217-18/+120
| | Follow-up to the discussion in #75258, serving as an alternate solution for #74670. Set the location to Unknown for deduplicated / moved / materialized constants by OperationFolder. This makes sure that the folded constants don't end up with an arbitrary location of one of the original ops they were folded from, and that hoisted ops don't confuse the stepping order.
| * [mlir][ArmSME] Move creation of load/store intrinsics to helpers (NFC) (#76168)Benjamin Maxwell2023-12-211-119/+108
| | | | | | | | Also, for consistency make the ZeroOp lowering switch on the ArmSMETileType, rather than the element bit width.
| * [mlir][spirv] Add folding for SNegate, [Logical]Not (#74992)Finn Plummer2023-12-215-0/+188
| | Add missing constant propagation folder for SNegate, [Logical]Not. Implement additional folding when !(!x) for all ops. This helps the readability of code lowered into SPIR-V. Part of work for #70704
| * [mlir][python] meta region_op (#75673)Maksim Levental2023-12-2114-13/+429
| |
| * [mlir] mark ChangeResult as nodiscard (#76147)Oleksandr "Alex" Zinenko2023-12-213-4/+4
| | | | | | | | | | | | | | | | | | This enum is used by dataflow analyses to indicate whether further propagation is necessary to reach the fix point. Accidentally discarding such a value will likely lead to propagation stopping early, leading to incomplete or incorrect results. The most egregious example is the duality between `join` on the analysis class, which triggers propagation internally, and `join` on the lattice class that does not and expects the caller to trigger it depending on the returned `ChangeResult`.
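Why discarding such a value is harmful can be shown with a small worklist sketch. It is a language-neutral illustration in Python, not MLIR's API: `ChangeResult`, `SetLattice`, and `propagate` here are hypothetical stand-ins, and Python has no equivalent of `[[nodiscard]]`, so the caller's check is only a convention.

```python
from enum import Enum


class ChangeResult(Enum):
    NO_CHANGE = 0
    CHANGE = 1


class SetLattice:
    """Toy lattice: a set of facts, joined by set union."""

    def __init__(self):
        self.values = set()

    def join(self, other_values):
        """Merge facts and report whether anything changed.

        This join does NOT trigger propagation itself; the caller must
        act on the returned ChangeResult.
        """
        before = len(self.values)
        self.values |= set(other_values)
        changed = len(self.values) != before
        return ChangeResult.CHANGE if changed else ChangeResult.NO_CHANGE


def propagate(edges, seeds):
    """Run to a fixed point; `edges` maps every node to its successors."""
    state = {node: SetLattice() for node in edges}
    worklist = []
    for node, facts in seeds.items():
        if state[node].join(facts) is ChangeResult.CHANGE:
            worklist.append(node)
    while worklist:
        node = worklist.pop()
        for succ in edges[node]:
            # Ignoring this result would stop propagation too early.
            if state[succ].join(state[node].values) is ChangeResult.CHANGE:
                worklist.append(succ)
    return {node: lattice.values for node, lattice in state.items()}


result = propagate({"a": ["b"], "b": ["c"], "c": ["a"]}, {"a": {"x"}})
print(result)
```

If the `join` result inside the loop were dropped, node `c` would never receive the fact seeded at `a`, which is the kind of incomplete result the attribute is meant to catch at compile time.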
| * [mlir][gpu] Allow subgroup reductions over 1-d vector types (#76015)Jakub Kuderski2023-12-216-12/+84
| | | | | | | | | | | | | | | | | | | | | | | | Each vector element is reduced independently, which is a form of multi-reduction. The plan is to allow for gradual lowering of multi-reduction that results in fewer `gpu.shuffle` ops at the end: 1d `vector.multi_reduction` --> 1d `gpu.subgroup_reduce` --> smaller 1d `gpu.subgroup_reduce` --> packed `gpu.shuffle` over i32 For example we can perform 2 independent f16 reductions with a series of `gpu.shuffles` over i32, reducing the final number of `gpu.shuffles` by 2x.
| * [gn build] Port 0ea87560cca4LLVM GN Syncbot2023-12-211-0/+1
| |
| * [AArch64][SME2] Add SME2 MLA/MLS builtins. (#75584)Dinar Temirbulatov2023-12-218-0/+4048
| | | | | | Add SME2 MLA/MLS builtins.
| * Revert "[InstCombine] Extend `foldICmpBinOp` to `add`-like `or`. (#71… ↵Mikhail Gudim2023-12-212-126/+53
| | | | | | | | | | | | | | (#76167) …396)" This reverts commit 8773c9be3d9868288f1f46957945d50ff58e4e91.
| * [gn] port c6f29dbb596fNico Weber2023-12-212-0/+2
| |
| * [gn] port e3627e2690a (TextAPI/BinaryReader)Nico Weber2023-12-212-0/+11
| |
| * [RISCV] Add codegen support for experimental.vp.splice (#74688)Craig Topper2023-12-218-2/+1586
| | | | | | | | | | | | | | IR intrinsics were already defined, but no codegen support had been added. I extracted this code from our downstream. Some of it may have come from https://repo.hca.bsc.es/gitlab/rferrer/llvm-epi/ originally.
| * Revert "[AArch64] Add FEAT_PAuthLR assembler support"Tomas Matheson2023-12-2115-518/+4
| | This reverts commit 934b1099cbf14fa3f86a269dff957da8e5fb619f. Buildbot failures on sanitizer-x86_64-linux-fast
| * Revert "[AArch64] Codegen support for FEAT_PAuthLR"Tomas Matheson2023-12-2121-752/+25
| | This reverts commit 5992ce90b8c0fac06436c3c86621fbf6d5398ee5. Buildbot failures with expensive checks enabled.
| * [clang] Fix typos in documentationKazu Hirata2023-12-214-6/+6
| |
| * [llvm] Use DenseMap::contains (NFC)Kazu Hirata2023-12-215-7/+7
| |
| * [ValueTracking] Make isGuaranteedNotToBeUndef() more precise (#76160)Nikita Popov2023-12-212-33/+52
| | Currently isGuaranteedNotToBeUndef() is the same as isGuaranteedNotToBeUndefOrPoison(). This function is used in places where we only care about undef (due to multi-use issues), not poison. Make it more precise by only considering instructions that can create undef (like loads or calls), and ignore those that can only create poison. In particular, we can ignore poison-generating flags. This means that inferring more flags has less chance to pessimize other transforms.
| * [InstCombine] Support zext nneg in gep of sext add foldNikita Popov2023-12-213-4/+30
| | | | | | | | | | Add m_NNegZext() and m_SExtLike() matchers to make doing these kinds of changes simpler in the future.
| * [InstCombine] Add zext nneg test variant for gep of sext add fold (NFC)Nikita Popov2023-12-211-0/+36
| |
| * [AMDGPU] Remove GDS and GWS for GFX12 (#76148)Jay Foad2023-12-219-11/+45
| |
| * [Clang][SME2] Enable multi-vector loads & stores for SME2 (#75821)Kerry McLaughlin2023-12-215-251/+267
| | | | | | | | | | | | | | This patch enables the following builtins for SME2: - svld1, svld1_vnum - svldnt1, svldnt1_vnum - svst1, svst1_vnum - svstnt1, svstnt1_vnum
| * [Flang] remove whole-archive option for AIX linker (#76039)madanial02023-12-212-3/+4
| | The AIX linker does not support the `--whole-archive` option, so remove the option if the OS is AIX. --------- Co-authored-by: Mark Danial <mark.danial@ibm.com>
| * [X86] Set Uses = [EFLAGS] for ADCX/ADOXShengchen Kan2023-12-211-23/+13
| | According to Intel SDE, ADCX reads CF and ADOX reads OF. `Uses` was set to empty by accident; the bug was not exposed because the compiler never emits these instructions.
| * [AMDGPU] Rename AMDGPUGlobalAtomicRtn -> AMDGPUAtomicRtn (#76157)Jay Foad2023-12-211-14/+14
| | | | | | It is used for FLAT atomics as well as Global atomics.