Commit message | Author | Age | Files | Lines
* [𝘀𝗽𝗿] initial versionupstream/users/ilovepi/spr/riscv-support-global-dynamic-tlsdesc-in-the-risc-v-backendPaul Kirth2024-01-0919-12/+454
| | | | Created using spr 1.3.4
* libclc: generic: add half implementation for erf/erfc (#66901)Romaric Jodin2024-01-092-0/+24
| | | | | libclc does not have a half implementation for erf/erfc. Add one based on the float implementation by extending the input and truncating the output.
* [GVNSink] Skip debug intrinsics when identifying sinking candidates (#77419)Shan Huang2024-01-092-3/+89
| | | Fixes #77147.
* [lldb][libc++] Adds some C++20 calendar data formatters. (#76983)Mark de Wever2024-01-095-0/+221
| This adds a subset of the C++20 calendar data formatters:
| - day,
| - month,
| - year,
| - month_day,
| - month_day_last, and
| - year_month_day.
|
| A followup patch will add the missing calendar data formatters:
| - weekday,
| - weekday_indexed,
| - weekday_last,
| - month_weekday,
| - month_weekday_last,
| - year_month,
| - year_month_day_last,
| - year_month_weekday, and
| - year_month_weekday_last.
* [lldb][Type] Add TypeQuery::SetLanguages API (#75926)Michael Buch2024-01-092-0/+8
| | | | | | | | This is required for users of `TypeQuery` that limit the set of languages of the query using APIs such as `GetSupportedLanguagesForTypes` or `GetSupportedLanguagesForExpressions`. Example usage: https://github.com/apple/llvm-project/pull/7885
* [gn] Make sync script print github URLsNico Weber2024-01-091-1/+2
| | | | Phab no longer knows about new revisions.
* [gn] port 07c9189fcc06Nico Weber2024-01-091-0/+1
|
* [gn] port 07c9189fcc06 (DWARFLinker/Classic)Nico Weber2024-01-097-16/+31
|
* [clang]use correct this scope to evaluate noexcept expr (#77416)Congcong Cai2024-01-093-1/+24
| | | | | | | Fixes: #77411. When substituting a deduced type, the noexcept expression in a method should be instantiated and evaluated. ThisScope should be switched to the method context instead of the original Sema context.
* [mlir][gpu] Use DenseI32Array for NVVM's maxntid and reqntid (NFC) (#77466)Guray Ozen2024-01-095-21/+13
|
* [libc++] Allow running the test suite with optimizations (#68753)Louis Dionne2024-01-0931-74/+157
| | | | | | | | | This patch adds a configuration of the libc++ test suite that enables optimizations when building the tests. It also adds a new CI configuration to exercise this on a regular basis. This is added in the context of [1], which requires building with optimizations in order to hit the bug. [1]: https://github.com/llvm/llvm-project/issues/68552
* [PGO] Exposing PGO's Counter Reset and File Dumping APIs (#76471)Qiongsi Wu2024-01-0912-57/+310
| This PR exposes four PGO functions (`__llvm_profile_set_filename`, `__llvm_profile_reset_counters`, `__llvm_profile_dump`, and `__llvm_orderfile_dump`) to user programs through the new header `instr_prof_interface.h` under `compiler-rt/include/profile`. The user can include the header `profile/instr_prof_interface.h` to introduce these four names into their programs.
|
| Additionally, this PR defines the macro `__LLVM_INSTR_PROFILE_GENERATE` when the program is compiled with profile generation, and the macro `__LLVM_INSTR_PROFILE_USE` when the program is compiled with profile use. `__LLVM_INSTR_PROFILE_GENERATE` together with `instr_prof_interface.h` defines the PGO functions only when the program is compiled with profile generation. When profile generation is off, these PGO functions are defined away and leave no trace in the user's program.
|
| Background: https://discourse.llvm.org/t/pgo-are-the-llvm-profile-functions-stable-c-apis-across-llvm-releases/75832
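A minimal sketch of how a user program might use the interface described in this commit to profile only a region of interest. The header path and the two `__llvm_profile_*` names come from the commit message; the wrapper functions `pgo_reset_counters` and `pgo_dump`, the `HAVE_INSTR_PROF_INTERFACE` macro, and the workload are hypothetical names introduced for illustration. The `__has_include` guard is an assumption so that the sketch also builds with toolchains that do not ship the header.

```cpp
#include <cstdint>

// Include the header named in the commit message when the toolchain ships it.
#if defined(__has_include)
#  if __has_include(<profile/instr_prof_interface.h>)
#    include <profile/instr_prof_interface.h>
#    define HAVE_INSTR_PROF_INTERFACE 1  // hypothetical helper macro
#  endif
#endif

// Hypothetical wrappers: forward to the PGO functions when the header is
// present, otherwise do nothing.
inline void pgo_reset_counters() {
#ifdef HAVE_INSTR_PROF_INTERFACE
  __llvm_profile_reset_counters();
#endif
}

inline void pgo_dump() {
#ifdef HAVE_INSTR_PROF_INTERFACE
  __llvm_profile_dump();
#endif
}

// Illustrative workload: sum of squares 1..n.
std::uint64_t sum_squares(std::uint32_t n) {
  std::uint64_t total = 0;
  for (std::uint32_t i = 1; i <= n; ++i)
    total += static_cast<std::uint64_t>(i) * i;
  return total;
}

// Discard counts gathered during warm-up, then dump the profile right
// after the region of interest has run.
std::uint64_t profile_hot_region(std::uint32_t n) {
  sum_squares(n);        // warm-up whose counts we do not want
  pgo_reset_counters();  // start counting from zero
  std::uint64_t result = sum_squares(n);
  pgo_dump();            // write the profile for the hot region
  return result;
}
```

Per the commit message, when the program is built without profile generation the PGO names are defined away, so a call such as `profile_hot_region(10)` behaves like the plain computation.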
* Disable autolink_private_module.m for z/OS & AIXZibi Sarbinowski2024-01-091-0/+2
| | | | | This change disables it on z/OS and AIX since it fails on both platforms with: fatal error: error in backend: Objective-C support is unimplemented for object file format
* [acc] OpenACC dialect design philosophy and details (#75548)Razvan Lupusoru2024-01-092-7/+450
| | | | | | | | This document captures the design philosophy of the acc dialect. It also shares the rationale behind the design and implementation of various operations - and ties that back to the dialect design goals. Co-authored-by: Valentin Clement <clementval@gmail.com> Co-authored-by: Slava Zakharin <szakharin@nvidia.com>
* [MLIR][NVVM] Add missing `;` when lowering stmatrix Op (#77471)Pradeep Kumar2024-01-092-6/+6
|
* [llvm/unittests] Reset the IsSSA property when using finalizeBundle() (#77469)Sameer Sahasrabuddhe2024-01-091-0/+1
|
* [DAG] XformToShuffleWithZero - use dyn_cast instead of isa/cast pair. NFCI.Simon Pilgrim2024-01-091-4/+4
|
* [GISel] Add RegState::Define to temporary defs in apply patterns (#77425)Sergei Barannikov2024-01-095-19/+75
| | | | Previously, registers created for temporary defs in apply patterns were rendered as uses, resulting in machine verifier errors.
* [SEH][CodeGen] Add test to track CFG optimization bug for SEH (#77441)HaohaiWen2024-01-091-0/+81
| | | | | LiveDebugValues requires that the CFG have only one entry. BranchFolding and MachineBlockPlacement may remove all predecessors of a landing pad, which turns it into a second entry.
* [SelectionDAG] Add and use SDNode::getAsAPIntVal() helper (#77455)Alex Bradbury2024-01-0912-31/+36
| | | | | | | | | This is the logical equivalent for #76710 for APInt and uses the same naming scheme. Converted existing users through: `git grep -l "cast<ConstantSDNode>\(.*\).*getAPIntValueValue" | xargs sed -E -i 's/cast<ConstantSDNode>\((.*)\)->getAPIntValue/\1->getAsAPIntVal/'`
* [PhaseOrdering] Regenerate test checks (NFC)Nikita Popov2024-01-092-45/+45
|
* [JumpThreading] Regenerate test checks (NFC)Nikita Popov2024-01-093-79/+101
|
* [clang][Sema][NFC] Make a few parameters constTimm Bäder2024-01-092-24/+22
|
* [AArch64] Fix regression introduced by c7148467fc08eefaaae876c7d11d629c849f42cf (#77467)David Sherwood2024-01-091-0/+2
|
* [mlir] Add global and program memory space handling to the data layout subsystem (#77367)agozillon2024-01-0912-8/+250
| | | | | | | | | This patch is based on a previous PR https://reviews.llvm.org/D144657 that added alloca address space handling to MLIR's DataLayout and DLTI interface. This patch aims to add identical features to import and access the global and program memory space through MLIR's DataLayout/DLTI system.
* [Flang][Driver] Enable gpulibc/nogpulibc options for Flang, which allows linking of GPU LIBC for the fortran and OpenMP runtime (#77135)agozillon2024-01-094-2/+18
| This patch adds the -gpulibc and -nogpulibc options to Flang, which allow linking the GPU libc library so that memcpy and other useful library functions can be used on GPU.
|
| In particular, this allows the Fortran runtime (written in C++) to be compiled for offload and then linked against the GPU libc library via this option, resolving memcpy and other C library functions that the Fortran runtime depends on for AMD GPU devices (and likely other GPU devices). This is the current method I've tested and found able to utilise the Fortran runtime when compiled for AMD GPU, although it requires compiling libc for GPU and then the Fortran runtime for GPU, so it is not particularly straightforward or user-friendly yet.
|
| Activating this option also allows the subset of C functions to be used on GPU by other C/C++-based Fortran libraries, if any are made, when linking against GPU libc.
* [LoongArch] Implement LoongArchRegisterInfo::canRealignStack() (#76913)wanglei2024-01-093-5/+75
| | | | | | | | | | | | | | This patch fixes the crash issue in the test: CodeGen/LoongArch/can-not-realign-stack.ll Register allocator may spill virtual registers to the stack, which introduces stack alignment requirements (when the size of spilled registers exceeds the default alignment size of the stack). If a function does not have stack alignment requirements before register allocation, registers used for stack alignment will not be preserved. Therefore, we should implement `canRealignStack()` to inform the register allocator whether it is allowed to perform stack realignment operations.
* [LoongArch] Pre-commit test for #76913. NFCwanglei2024-01-091-0/+39
| This test will crash with expensive check. Crash message:
| ```
| *** Bad machine code: Using an undefined physical register ***
| - function: main
| - basic block: %bb.0 entry (0x20fee70)
| - instruction: $r3 = frame-destroy ADDI_D $r22, -288
| - operand 1: $r22
| ```
* [RFC][SelectionDAG] Add and use SDNode::getAsZExtVal() helper (#76710)Alex Bradbury2024-01-0940-186/+159
| | | | | | | | | | This follows on from #76708, allowing `cast<ConstantSDNode>(N)->getZExtValue()` to be replaced with just `N->getAsZExtVal();`. Introduced via `git grep -l "cast<ConstantSDNode>\(.*\).*getZExtValue" | xargs sed -E -i 's/cast<ConstantSDNode>\((.*)\)->getZExtValue/\1->getAsZExtVal/'` and then using `git clang-format` on the result.
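To make the effect of the `sed` substitution in this commit concrete, here is a small sketch that applies the same pattern with `std::regex`. The pattern and replacement are taken from the commit message; the function name `rewrite_zext_call` is hypothetical, and using `std::regex` in place of `sed -E` is an assumption made purely for illustration.

```cpp
#include <regex>
#include <string>

// Apply the commit's substitution to one line of source text:
//   cast<ConstantSDNode>(X)->getZExtValue  ->  X->getAsZExtVal
std::string rewrite_zext_call(const std::string &line) {
  // Same pattern as the sed expression; the capture group is the
  // expression that was being cast.
  static const std::regex pattern(
      R"(cast<ConstantSDNode>\((.*)\)->getZExtValue)");
  return std::regex_replace(line, pattern, "$1->getAsZExtVal");
}
```

For example, `rewrite_zext_call("cast<ConstantSDNode>(N)->getZExtValue()")` yields `N->getAsZExtVal()`. Because `.*` is greedy, a line containing two such calls would be matched as one span, which is presumably one reason the commit follows the rewrite with `git clang-format` and review rather than relying on the substitution alone.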
* [mlir] add a chapter on matchers to the transform dialect tutorial (#76725)Oleksandr "Alex" Zinenko2024-01-0916-3/+1375
| | | | | These operations have been available for a while, but were not described in the tutorial. Add a new chapter on using and defining match operations.
* [mlir] introduce transform.collect_matching (#76724)Oleksandr "Alex" Zinenko2024-01-094-18/+279
| | | | | | Introduce a new match combinator into the transform dialect. This operation collects all operations that are yielded by a satisfactory match into its results. This is a simpler version of `foreach_match` that can be inserted directly into existing transform scripts.
* [AMDGPU][NFC] Update left over tests for COV5 (#76984)Saiyedul Islam2024-01-0913-62/+106
| | | Update AMDGPU CodeGen lit tests to check for COV5 ABI.
* [AMDGPU] Make isScalarLoadLegal a member of AMDGPURegisterBankInfo. NFC.Jay Foad2024-01-092-1/+3
|
* [NewPM] Update `CodeGenPreparePass` reference in `CodeGenPassBuilder.h` (#77446)paperchalice2024-01-092-2/+3
| | | Reland #77054.
* [CodeGen] Fix friend declaration in SSPLayoutAnalysis (#77447)paperchalice2024-01-091-1/+1
|
* [mlir][docs] Fix a broken passes documentation (#77402)Kohei Yamaguchi2024-01-092-2/+6
| - Add EmitC passes into Pass.md
| - Modify header level of the pass description to under the `LegalizeVectorStorage` pass
* [X86] Emit Warnings for frontend options to enable knl/knm specific ISAs. (#75580)Freddy Ye2024-01-097-6/+32
| | | | | | | Since the Knights Landing and Knights Mill microarchitectures are EOL, we would like to remove intrinsic support for their specific ISAs in LLVM 19. In LLVM 18, we will first emit a warning on usage.
* [AArch64] Add an AArch64 pass for loop idiom transformations (#72273)David Sherwood2024-01-0912-0/+2877
| We have added a new pass that looks for loops such as the following:
|
| ```
| while (i != max_len)
|   if (a[i] != b[i])
|     break;
|
| ... use index i ...
| ```
|
| Although similar to a memcmp, this is slightly different because instead of returning the difference between the values of the first non-matching pair of bytes, it returns the index of the first mismatch. As such, we are not able to lower this to a memcmp call.
|
| The new pass can now spot such idioms and transform them into a specialised predicated loop that gives a significant performance improvement for AArch64. It is intended as a stop-gap solution until this can be handled by the vectoriser, which doesn't currently deal with early exits.
|
| This specialised loop makes use of a generic intrinsic that counts the trailing zero elements in a predicate vector. This was added in https://reviews.llvm.org/D159283 and for SVE we end up with brkb & incp instructions.
|
| Although we have added this pass only for AArch64, it was written in a generic way so that in theory it could be used by other targets. Currently the pass requires scalable vector support and needs to know the minimum page size for the target, however it's possible to make it work for fixed-width vectors too. Also, the llvm.experimental.cttz.elts intrinsic used by the pass has generic lowering, but can be made efficient for targets with instructions similar to SVE's brkb, cntp and incp.
|
| Original version of patch was posted on Phabricator: https://reviews.llvm.org/D158291
|
| Patch co-authored by Kerry McLaughlin (@kmclaughlin-arm) and David Sherwood (@david-arm)
|
| See the original discussion on Discourse: https://discourse.llvm.org/t/aarch64-target-specific-loop-idiom-recognition/72383
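The loop shape described in this commit message can be written out as a complete scalar function. This is only a sketch of the source-level idiom, not of the pass or the predicated loop it generates; the function name `first_mismatch` and the explicit `++i` (which the abbreviated loop in the message leaves implicit) are assumptions.

```cpp
#include <cstddef>
#include <cstdint>

// Scalar form of the idiom: walk two byte arrays from index i and stop at
// the first position where they differ, or at max_len if they never do.
// Unlike memcmp, the result is an index, not a byte difference.
std::size_t first_mismatch(const std::uint8_t *a, const std::uint8_t *b,
                           std::size_t i, std::size_t max_len) {
  while (i != max_len) {
    if (a[i] != b[i])
      break;
    ++i;  // assumed increment, omitted in the commit message's excerpt
  }
  return i;  // "... use index i ..."
}
```

Calling `first_mismatch(a, b, 0, n)` on two `n`-byte buffers returns `n` when they are equal over the whole range, which is the early-exit behaviour the message says the vectoriser does not currently handle.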
* [flang] Fix fir::isPolymorphic for TYPE(*) assumed-size arrays (#77339)jeanPerier2024-01-094-18/+37
| | | | | | | | | | | fir::isPolymorphic was returning false for TYPE(*) assumed-size arrays, causing a bad fir.rebox to be created when passing a polymorphic actual argument to such a TYPE(*) dummy. Fix fir::isAssumedSize to return true for fir.ref<fir.array<none>> and fir.ref<none>. @cabreraam, I found this bug when testing your patch, although it is not caused by it, so you may hit it when passing a TYPE(*) deferred-shape array to an assumed-size TYPE(*) dummy with a different rank.
* [CodeGen] Fix -Wmismatched-tags in StackProtector.h (NFC)Jie Fu2024-01-091-1/+1
| llvm-project/llvm/include/llvm/CodeGen/StackProtector.h:69:10: error: class 'AnalysisInfoMixin' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Werror,-Wmismatched-tags]
|    69 |   friend class AnalysisInfoMixin<SSPLayoutAnalysis>;
|       |          ^
| llvm-project/llvm/include/llvm/IR/PassManager.h:414:8: note: previous use is here
|   414 | struct AnalysisInfoMixin : PassInfoMixin<DerivedT> {
|       |        ^
| llvm-project/llvm/include/llvm/CodeGen/StackProtector.h:69:10: note: did you mean struct here?
|    69 |   friend class AnalysisInfoMixin<SSPLayoutAnalysis>;
|       |          ^~~~~
|       |          struct
| 1 error generated.
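The diagnostic above can be reproduced in miniature. The following sketch uses hypothetical stand-in names (`InfoMixin`, `LayoutAnalysis`, `Key`) rather than the real LLVM classes, and shows the form the note suggests: the friend declaration uses the same class-key (`struct`) as the template's definition.

```cpp
// Stand-in for a CRTP mixin that is *defined* with the `struct` key.
template <typename DerivedT> struct InfoMixin {
  // Reads a private member of the derived class, hence the friendship.
  static int id() { return DerivedT::Key; }
};

class LayoutAnalysis : public InfoMixin<LayoutAnalysis> {
  // Writing `friend class InfoMixin<LayoutAnalysis>;` here is valid C++,
  // but it is the mismatch that -Wmismatched-tags reports. Matching the
  // key used at the definition avoids the warning.
  friend struct InfoMixin<LayoutAnalysis>;
  static constexpr int Key = 42;
};

// Small accessor so the friendship is actually exercised.
int layout_analysis_id() { return LayoutAnalysis::id(); }
```

The warning matters mainly because, as the diagnostic text says, the Microsoft C++ ABI can treat the two keys differently at link time; elsewhere the mismatch is harmless but still fails builds that use `-Werror`.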
* [CodeGen] Port `StackProtector` to new pass manager (#75334)paperchalice2024-01-096-80/+176
| | | | | The original `StackProtector` is both a transform and an analysis pass; this patch splits it into two passes. `getAnalysis<StackProtector>()` can now be replaced by `FAM.getResult<SSPLayoutAnalysis>(F)` in the new pass manager.
* [ARM] arm_acle.h add Coprocessor Instrinsics (#75440)hstk30-hw2024-01-095-0/+520
| | | | | https://github.com/llvm/llvm-project/issues/75424 Add Coprocessor Intrinsics
* AMDGPU: Regenerate test checksMatt Arsenault2024-01-091-2523/+2481
| | | | Fix test failures after auto-merge of f9fec402896a90f3b09cea359c330f65a0908649
* [clang] Update cxx_dr_status.html (#77372)Vlad Serebrennikov2024-01-095-186/+552
| | | This patch updates `cxx_dr_status.html` to bring it in sync with Core Issues List Revision 113.
* [LV] Create block in mask up-front if needed. (#76635)Florian Hahn2024-01-0926-136/+148
| | | | | | | | | | | | | | | | | | At the moment, block and edge masks are created on demand, which means that they are inserted at the point where they are demanded and then cached. It is possible that the mask for a block is looked up later at a point that's not dominated by the point where the mask has been inserted. To avoid this, create masks up front on entry to the corresponding basic block and leave it to VPlan simplification to remove unneeded masks. Note that we need to create masks for all blocks, if any of the blocks in the loop needs predication, as computing the mask of a block depends on the masks of its predecessor. Needed for #76090. https://github.com/llvm/llvm-project/pull/76635
* [GISel] Infer the type of an immediate when there is one element in TEC (#77399)Sergei Barannikov2024-01-092-11/+32
| | | | | | When there is just one element in the type equivalence class (TEC), `inferNamedOperandType` fails because it does not consider the passed operand as a suitable one. This is incorrect when inferring the type of an (unnamed) immediate operand.
* [mlir][bufferization][NFC] Clean up Bazel build files (#77429)Matthias Springer2024-01-092-5/+3
| | | `*OpsIncGen` should depend only on the respective `*OpsTdFiles`.
* [AMDGPU] Flip the default value of maybeAtomic. NFCI. (#75220)Jay Foad2024-01-098-14/+5
| | | | | | | In practice maybeAtomic = 0 is used to prevent SIMemoryLegalizer from interfering with instructions that are mayLoad or mayStore but lack MachineMemOperands. These instructions should be the exception not the rule, so this patch sets maybeAtomic = 1 by default and only overrides it to 0 where necessary.
* AMDGPU: Make v32bf16 a legal type (#76679)Matt Arsenault2024-01-093-2/+39
| | | Depends on #76678
* [CodeGen] Port `GCLowering` to new pass manager (#75305)paperchalice2024-01-095-11/+37
|