| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Created using spr 1.3.4
|
|
|
|
|
| |
libclc does not have a half implementation for erf/erfc.
Add one based on the float implementation by extending the input and
truncating the output.
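A minimal sketch of the extend-then-truncate approach, not the libclc code. Standard C++17 has no portable half type, so `float` stands in for half and `double` for float here; the function names are hypothetical.

```cpp
#include <cmath>

// Hypothetical helper: compute erf for a narrow type by widening the input,
// calling the wider implementation, and narrowing the result.
// float/double stand in for half/float in this sketch.
inline float erf_via_wider(float x) {
  double wide = static_cast<double>(x); // extend the input
  double result = std::erf(wide);       // reuse the wider implementation
  return static_cast<float>(result);    // truncate the output
}

// Same pattern for the complementary error function.
inline float erfc_via_wider(float x) {
  return static_cast<float>(std::erfc(static_cast<double>(x)));
}
```

The narrow result is only as accurate as the wide implementation rounded once, which is why reusing the float code is a reasonable way to get a half version.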
|
|
|
| |
Fixes #77147.
| |
This adds a subset of the C++20 calendar data formatters:
- day,
- month,
- year,
- month_day,
- month_day_last, and
- year_month_day.
A followup patch will add the missing calendar data formatters:
- weekday,
- weekday_indexed,
- weekday_last,
- month_weekday,
- month_weekday_last,
- year_month,
- year_month_day_last,
- year_month_weekday, and
- year_month_weekday_last.
|
|
|
|
|
|
|
|
| |
This is required for users of `TypeQuery` that limit the set of
languages of the query using APIs such as
`GetSupportedLanguagesForTypes` or
`GetSupportedLanguagesForExpressions`.
Example usage: https://github.com/apple/llvm-project/pull/7885
|
|
|
|
| |
Phab no longer knows about new revisions.
|
| |
|
| |
|
|
|
|
|
|
|
| |
Fixes: #77411
When substituting a deduced type, the noexcept expression in a method
should be instantiated and evaluated.
ThisScope should be switched to the method context instead of the
original Sema context.
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch adds a configuration of the libc++ test suite that enables
optimizations when building the tests. It also adds a new CI
configuration to exercise this on a regular basis. This is added in the
context of [1], which requires building with optimizations in order to
hit the bug.
[1]: https://github.com/llvm/llvm-project/issues/68552
| |
This PR exposes four PGO functions
- `__llvm_profile_set_filename`
- `__llvm_profile_reset_counters`
- `__llvm_profile_dump`
- `__llvm_orderfile_dump`
to user programs through the new header `instr_prof_interface.h` under
`compiler-rt/include/profile`. This way, the user can include the header
`profile/instr_prof_interface.h` to introduce these four names to their
programs.
Additionally, this PR defines macro `__LLVM_INSTR_PROFILE_GENERATE` when
the program is compiled with profile generation, and defines macro
`__LLVM_INSTR_PROFILE_USE` when the program is compiled with profile
use. `__LLVM_INSTR_PROFILE_GENERATE` together with
`instr_prof_interface.h` define the PGO functions only when the program
is compiled with profile generation. When profile generation is off,
these PGO functions are defined away and leave no trace in the user's
program.
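A sketch of how a user program might use the header to profile a single region, assuming the include path and function names given above. The helper `run_profiled_region` and the `SKETCH_HAVE_PROFILE_INTERFACE` macro are hypothetical and only there so the sketch also builds with toolchains that do not ship the header.

```cpp
#if __has_include(<profile/instr_prof_interface.h>)
#include <profile/instr_prof_interface.h>
#define SKETCH_HAVE_PROFILE_INTERFACE 1
#else
#define SKETCH_HAVE_PROFILE_INTERFACE 0
#endif

// Hypothetical helper (not part of the PR): profile only one region by
// resetting the counters before it and dumping the profile right after it.
template <typename Fn>
void run_profiled_region(const char *profile_path, Fn &&region) {
  (void)profile_path; // unused when the calls below are defined away
#if SKETCH_HAVE_PROFILE_INTERFACE
  __llvm_profile_set_filename(profile_path); // choose the output file
  __llvm_profile_reset_counters();           // discard counts gathered so far
#endif
  region();
#if SKETCH_HAVE_PROFILE_INTERFACE
  __llvm_profile_dump(); // write the profile covering this region
#endif
}
```

When the program is built without profile generation, the calls are expected to vanish as described above, so the helper reduces to invoking `region()`.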
Background:
https://discourse.llvm.org/t/pgo-are-the-llvm-profile-functions-stable-c-apis-across-llvm-releases/75832
|
|
|
|
|
| |
This change disables it on z/OS and AIX since it fails on both platforms with:
fatal error: error in backend: Objective-C support is unimplemented for object file format
|
|
|
|
|
|
|
|
| |
This document captures the design philosophy of the acc dialect. It also
shares the rationale behind the design and implementation of various
operations and ties it back to the dialect design goals.
Co-authored-by: Valentin Clement <clementval@gmail.com>
Co-authored-by: Slava Zakharin <szakharin@nvidia.com>
|
| |
|
| |
|
| |
|
|
|
|
| |
Previously, registers created for temporary defs in apply patterns were
rendered as uses, resulting in machine verifier errors.
|
|
|
|
|
| |
LiveDebugValues requires that the CFG has only one entry. BranchFolding and
MachineBlockPlacement may remove all predecessors of a landing pad, which
leaves it as another entry.
|
|
|
|
|
|
|
|
|
| |
This is the logical equivalent for #76710 for APInt and uses the same
naming scheme.
Converted existing users through:
`git grep -l "cast<ConstantSDNode>\(.*\).*getAPIntValue" | xargs
sed -E -i
's/cast<ConstantSDNode>\((.*)\)->getAPIntValue/\1->getAsAPIntVal/'`
|
| |
|
| |
|
| |
|
|
|
|
|
| |
(#77467)
…9c849f42cf
|
|
|
|
|
|
|
|
|
| |
subsystem (#77367)
This patch is based on a previous PR https://reviews.llvm.org/D144657
that added alloca address space handling to MLIR's DataLayout and DLTI
interface. This patch aims to add identical features to import and
access the global and program memory space through MLIR's
DataLayout/DLTI system.
|
| |
linking of GPU LIBC for the fortran and OpenMP runtime (#77135)
This patch adds the -gpulibc and -nogpulibc options to Flang. They
control linking of the GPU libc library, which makes memcpy and other
useful library functions available on the GPU.
In particular, this allows the Fortran runtime (written in C++) to be
compiled for offload and then linked against the GPU libc library via
this option to resolve memcpy and other C library functions that the
Fortran runtime depends on for AMD GPU devices (and likely other GPU
devices).
This is the method I have tested and found to work for using the
Fortran runtime when compiled for AMD GPU. It requires compiling libc
for GPU and then the Fortran runtime for GPU, so it is not particularly
straightforward or user friendly yet.
Activating this option also allows the subset of C functions to be
used on the GPU by other C/C++ based Fortran libraries, if any are made,
when linking against GPU libc.
|
| |
This patch fixes the crash issue in the test:
CodeGen/LoongArch/can-not-realign-stack.ll
The register allocator may spill virtual registers to the stack, which
introduces stack alignment requirements (when the size of spilled
registers exceeds the default alignment size of the stack). If a
function does not have stack alignment requirements before register
allocation, registers used for stack alignment will not be preserved.
Therefore, we should implement `canRealignStack()` to inform the
register allocator whether it is allowed to perform stack realignment
operations.
|
| |
This test crashes when expensive checks are enabled.
Crash message:
```
*** Bad machine code: Using an undefined physical register ***
- function: main
- basic block: %bb.0 entry (0x20fee70)
- instruction: $r3 = frame-destroy ADDI_D $r22, -288
- operand 1: $r22
```
|
|
|
|
|
|
|
|
|
|
| |
This follows on from #76708, allowing
`cast<ConstantSDNode>(N)->getZExtValue()` to be replaced with just
`N->getAsZExtVal();`
Introduced via `git grep -l "cast<ConstantSDNode>\(.*\).*getZExtValue" |
xargs sed -E -i
's/cast<ConstantSDNode>\((.*)\)->getZExtValue/\1->getAsZExtVal/'` and
then using `git clang-format` on the result.
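The shape of this refactoring can be sketched with stand-in types. `NodeSketch` and `ConstantNodeSketch` are hypothetical and are not the LLVM classes; they only show how a convenience accessor folds the cast into the node itself.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in types for illustration only; these are NOT the LLVM classes.
struct NodeSketch {
  virtual ~NodeSketch() = default;
  // Convenience accessor in the spirit of getAsZExtVal(): the caller claims
  // the node is a constant and reads its value in a single call.
  std::uint64_t getAsZExtVal() const;
};

struct ConstantNodeSketch : NodeSketch {
  explicit ConstantNodeSketch(std::uint64_t V) : Value(V) {}
  std::uint64_t getZExtValue() const { return Value; }
  std::uint64_t Value;
};

inline std::uint64_t NodeSketch::getAsZExtVal() const {
  // Checked downcast, asserting that the node really is a constant.
  const auto *C = dynamic_cast<const ConstantNodeSketch *>(this);
  assert(C && "node is not a constant");
  return C->getZExtValue();
}

// Before: the cast is spelled out at every call site.
inline std::uint64_t readOld(const NodeSketch *N) {
  return static_cast<const ConstantNodeSketch *>(N)->getZExtValue();
}

// After: the call site shrinks to a single member call.
inline std::uint64_t readNew(const NodeSketch *N) { return N->getAsZExtVal(); }
```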
|
|
|
|
|
| |
These operations have been available for a while, but were not described
in the tutorial. Add a new chapter on using and defining match
operations.
|
|
|
|
|
|
| |
Introduce a new match combinator into the transform dialect. This
operation collects all operations that are yielded by a satisfactory
match into its results. This is a simpler version of `foreach_match`
that can be inserted directly into existing transform scripts.
|
|
|
| |
Update AMDGPU CodeGen lit tests to check for COV5 ABI.
|
| |
|
|
|
| |
Reland #77054.
|
| |
|
|
|
|
|
| |
- Add EmitC passes into Pass.md
- Modify header level of the pass description to under the
`LegalizeVectorStorage` pass
|
|
|
|
|
|
|
| |
(#75580)
Since the Knights Landing and Knights Mill microarchitectures are EOL, we
would like to remove intrinsic support for their specific ISAs in LLVM 19.
In LLVM 18, we will first emit a warning for their usage.
| |
We have added a new pass that looks for loops such as the following:
```
while (i != max_len) {
  if (a[i] != b[i])
    break;
  ++i;
}
... use index i ...
```
Although similar to a memcmp, this is slightly different because instead
of returning the difference between the values of the first non-matching
pair of bytes, it returns the index of the first mismatch. As such, we
are not able to lower this to a memcmp call.
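For reference, a self-contained C++ sketch of the scalar idiom, next to an equivalent written with `std::mismatch`. The function names are hypothetical, and both assume `i <= max_len`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Scalar form of the idiom: advance i until the bytes differ or max_len is hit.
inline std::size_t first_mismatch_loop(const std::uint8_t *a,
                                       const std::uint8_t *b, std::size_t i,
                                       std::size_t max_len) {
  while (i != max_len) {
    if (a[i] != b[i])
      break;
    ++i;
  }
  return i; // index of the first mismatch, or max_len if none was found
}

// Same result expressed with the standard library.
inline std::size_t first_mismatch_std(const std::uint8_t *a,
                                      const std::uint8_t *b, std::size_t i,
                                      std::size_t max_len) {
  auto p = std::mismatch(a + i, a + max_len, b + i);
  return static_cast<std::size_t>(p.first - a);
}
```

Both return an index, whereas `memcmp` only reports the sign of the first differing byte pair, which is the difference the paragraph above points out.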
The new pass can now spot such idioms and transform them into a
specialised predicated loop that gives a significant performance
improvement for AArch64. It is intended as a stop-gap solution until
this can be handled by the vectoriser, which doesn't currently deal with
early exits.
This specialised loop makes use of a generic intrinsic that counts the
trailing zero elements in a predicate vector. This was added in
https://reviews.llvm.org/D159283 and for SVE we end up with brkb & incp
instructions.
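A scalar emulation of the idea, as a sketch only: compare a block of lanes at a time, build a predicate of mismatching lanes, and use a count of leading false lanes to recover the index. The fixed lane count `kLanes` and all function names are hypothetical; the real pass emits scalable-vector IR, not code like this.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 4; // hypothetical vector length for the sketch

// Emulates counting trailing zero elements of a predicate: the number of
// false lanes before the first true lane, or kLanes if no lane is set.
inline std::size_t cttz_elts_sketch(const std::array<bool, kLanes> &pred) {
  std::size_t n = 0;
  while (n < kLanes && !pred[n])
    ++n;
  return n;
}

// Block-wise mismatch search; assumes i <= max_len.
inline std::size_t first_mismatch_chunked(const std::uint8_t *a,
                                          const std::uint8_t *b, std::size_t i,
                                          std::size_t max_len) {
  while (i < max_len) {
    std::array<bool, kLanes> mismatch{}; // all lanes start as false
    bool any = false;
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
      // Lanes past max_len are inactive and never read memory.
      bool active = i + lane < max_len;
      mismatch[lane] = active && a[i + lane] != b[i + lane];
      any = any || mismatch[lane];
    }
    if (any)
      return i + cttz_elts_sketch(mismatch); // index of the first mismatch
    i += kLanes;
  }
  return max_len; // no mismatch found
}
```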
Although we have added this pass only for AArch64, it was written in a
generic way so that in theory it could be used by other targets.
Currently the pass requires scalable vector support and needs to know
the minimum page size for the target, however it's possible to make it
work for fixed-width vectors too. Also, the llvm.experimental.cttz.elts
intrinsic used by the pass has generic lowering, but can be made
efficient for targets with instructions similar to SVE's brkb, cntp and
incp.
The original version of this patch was posted on Phabricator:
https://reviews.llvm.org/D158291
Patch co-authored by Kerry McLaughlin (@kmclaughlin-arm) and David
Sherwood (@david-arm)
See the original discussion on Discourse:
https://discourse.llvm.org/t/aarch64-target-specific-loop-idiom-recognition/72383
|
|
|
|
|
|
|
|
|
|
|
|
| |
fir::isPolymorphic was returning false for TYPE(*) assumed-size arrays
causing bad fir.rebox to be created when passing a polymorphic actual
argument to such TYPE(*) dummy.
Fix fir::isAssumedSize to return true for fir.ref<fir.array<none>> and
fir.ref<none>.
@cabreraam, I found this bug when testing your patch, although it is not
caused by it, so you may hit it when passing a TYPE(*) deferred-shape
array to an assumed-size TYPE(*) with a different rank.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
llvm-project/llvm/include/llvm/CodeGen/StackProtector.h:69:10:
error: class 'AnalysisInfoMixin' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Werror,-Wmismatched-tags]
69 | friend class AnalysisInfoMixin<SSPLayoutAnalysis>;
| ^
llvm-project/llvm/include/llvm/IR/PassManager.h:414:8: note: previous use is here
414 | struct AnalysisInfoMixin : PassInfoMixin<DerivedT> {
| ^
llvm-project/llvm/include/llvm/CodeGen/StackProtector.h:69:10: note: did you mean struct here?
69 | friend class AnalysisInfoMixin<SSPLayoutAnalysis>;
| ^~~~~
| struct
1 error generated.
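The diagnostic above can be reproduced in miniature. In this sketch the types are hypothetical stand-ins, and the friend declaration uses the same class-key (`struct`) as the template's own declaration, which is what the compiler note suggests.

```cpp
// Stand-in for a CRTP mixin that is declared with the 'struct' class-key.
template <typename DerivedT> struct AnalysisInfoMixinSketch {
  // Reads a private member of the derived class, so it must be a friend.
  static int id() { return DerivedT::Key; }
};

class LayoutAnalysisSketch
    : public AnalysisInfoMixinSketch<LayoutAnalysisSketch> {
  // Writing 'friend class ...' here would still compile, but it triggers
  // -Wmismatched-tags because the template was declared as a struct.
  friend struct AnalysisInfoMixinSketch<LayoutAnalysisSketch>;
  static constexpr int Key = 42;
};
```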
|
|
|
|
|
| |
The original `StackProtector` is both a transform and an analysis pass;
break it into two passes. `getAnalysis<StackProtector>()` can now be
replaced by `FAM.getResult<SSPLayoutAnalysis>(F)` in the new pass system.
|
|
|
|
|
| |
https://github.com/llvm/llvm-project/issues/75424
Add Coprocessor Intrinsics
|
|
|
|
| |
Fix test failures after auto-merge of f9fec402896a90f3b09cea359c330f65a0908649
|
|
|
| |
This patch updates `cxx_dr_status.html` to bring it in sync with Core Issues List Revision 113.
|
| |
At the moment, block and edge masks are created on demand, which means
that they are inserted at the point where they are demanded and then
cached. It is possible that the mask for a block is looked up later at a
point that's not dominated by the point where the mask has been
inserted.
To avoid this, create masks up front on entry to the corresponding basic
block and leave it to VPlan simplification to remove unneeded masks.
Note that if any block in the loop needs predication, we need to create
masks for all blocks, because computing the mask of a block depends on
the masks of its predecessors.
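A single-lane model of up-front mask computation, as a sketch under simplifying assumptions: blocks are numbered in reverse post-order, the loop body is acyclic, and each incoming edge records whether its condition holds. `EdgeSketch` and `compute_block_masks` are hypothetical names, not VPlan APIs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical incoming edge: predecessor index and whether the edge
// condition is true for the lane being modelled.
struct EdgeSketch {
  std::size_t pred;
  bool taken;
};

// Computes one mask per block in a single forward walk. A block's mask is the
// OR over its incoming edges of (predecessor mask AND edge condition), so
// every predecessor mask must already exist when a block is visited.
inline std::vector<bool>
compute_block_masks(const std::vector<std::vector<EdgeSketch>> &incoming) {
  std::vector<bool> mask(incoming.size(), false);
  for (std::size_t b = 0; b < incoming.size(); ++b) {
    if (incoming[b].empty()) {
      mask[b] = true; // the entry block executes for every lane
      continue;
    }
    bool m = false;
    for (const EdgeSketch &e : incoming[b])
      m = m || (mask[e.pred] && e.taken);
    mask[b] = m;
  }
  return mask;
}
```

Because each mask is defined at the start of its own block, a later lookup can never land at a point that the mask's definition does not dominate.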
Needed for #76090.
https://github.com/llvm/llvm-project/pull/76635
|
|
|
|
|
|
| |
When there is just one element in the type equivalence class (TEC),
`inferNamedOperandType` fails because it does not consider the passed
operand as a suitable one. This is incorrect when inferring the type of
an (unnamed) immediate operand.
|
|
|
| |
`*OpsIncGen` should depend only on the respective `*OpsTdFiles`.
|
|
|
|
|
|
|
| |
In practice maybeAtomic = 0 is used to prevent SIMemoryLegalizer from
interfering with instructions that are mayLoad or mayStore but lack
MachineMemOperands. These instructions should be the exception not the
rule, so this patch sets maybeAtomic = 1 by default and only overrides
it to 0 where necessary.
|
|
|
| |
Depends on #76678
|
| |
|