| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Co-authored-by: Petar Avramovic <Petar.Avramovic@amd.com>
Co-authored-by: Piotr Sobczak <piotr.sobczak@amd.com>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
So that we don't duplicate tests in a later patch.
Reviewers: topperc, dtcxzyw, asb
Reviewed By: asb
Pull Request: https://github.com/llvm/llvm-project/pull/79111
|
|
|
|
|
|
| |
This is an OOO core that has a vector unit. For more information see
https://www.sifive.com/cores/performance-p650-670.
Scheduler model and other tuning will come in separate patches.
|
|
|
| |
C backend is removed in 3.1.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Close https://github.com/llvm/llvm-project/issues/73023
The direct issue of https://github.com/llvm/llvm-project/issues/73023 is
that we entered a header which is marked as pragma once, since the
compiler thinks it is OK to do so if there is a controlling macro.
That doesn't make sense. I feel like it should be sufficient to skip the
header after we see the '#pragma once'.
From the context, it looks like the workaround is primarily for
Objective-C, so we might need reviewers familiar with Objective-C.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
CSR SGPR spilling currently uses the earliest available physical VGPRs.
This imposes high register pressure when trying to allocate
large VGPR tuples within the default register budget.
This patch changes the spilling strategy to pick the VGPRs in
reverse order, highest available VGPR first, and then, after regalloc,
shift them back to the lowest available range. With that, the initial
VGPRs remain available for allocation, and the chance of finding a
large number of contiguous registers is higher.
|
|
|
|
|
|
|
|
|
|
|
| |
Add new pass manager support to `llc`. Users can use
`--passes=pass1,pass2...` to run MIR passes, and use `--enable-new-pm`
to run the default codegen pipeline.
This patch is taken from [D83612](https://reviews.llvm.org/D83612), the
original author is @yuanfang-chen.
---------
Co-authored-by: Yuanfang Chen <455423+yuanfang-chen@users.noreply.github.com>
|
|
|
|
|
| |
Test an absolute relocation referencing a DSO symbol, relocating a
non-SHF_ALLOC section. Also test --gc-sections.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
https://github.com/llvm/llvm-project/commit/f9c2a341b94ca71508dcefa109ece843459f7f13
causes regressions when we have a slice with integer vector type that is
the same size as the partition, and a ptr load/store slice that is not
the size of the element type.
Ref `vector-promotion.ll:ptrLoadStoreTys`.
Before the patch, we would only consider `<4 x i32>` as a candidate type
for vector promotion, and would find that it is a viable type for all
the slices.
After the patch, we now add `<2 x ptr>` as a candidate type due to slice
with user `store ptr %val0, ptr %obj, align 8` -- and flag that we
`HaveVecPtrTy`. The pre-existing behavior of this flag results in
removing the viable `<4 x i32>` and keeping only the unviable `<2 x
ptr>`, which results in a failure to promote.
The end result is failing to promote an alloca that was previously
promoted -- this does not appear to be the intent of that patch, which
has the goal of increasing promotions by providing more promotion
opportunities.
This PR preserves the earlier promotions via a simple reorganization of
the implementation: first try the slice types with the same size as the
partition, then, if there is no promotable type, try the `LoadStoreTys`.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#72962)
Following RISC-V, we fix up the alignment if linker relaxation
changes code size and breaks alignment: insert enough Nops and emit an
R_LARCH_ALIGN relocation so that the linker can satisfy the alignment
by removing Nops.
This is done only in sections with the SHF_EXECINSTR flag.
In LoongArch psABI v2.30, R_LARCH_ALIGN requires a symbol index. The
lowest 8 bits of the addend represent the alignment and the other bits
of the addend represent the maximum number of bytes to emit.
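The addend layout described above can be sketched as a pack/unpack pair. This is only an illustration: the names `AlignAddend`, `packAlignAddend`, and `unpackAlignAddend` are hypothetical and not taken from the patch, and the low 8 bits are treated as an opaque alignment field because the message does not say how the alignment itself is encoded there.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical view of an R_LARCH_ALIGN addend, for illustration only.
struct AlignAddend {
  uint64_t alignField; // lowest 8 bits of the addend
  uint64_t maxBytes;   // remaining bits: maximum number of bytes to emit
};

// Pack the two fields into a single addend value.
uint64_t packAlignAddend(uint64_t alignField, uint64_t maxBytes) {
  assert(alignField <= 0xff && "alignment field must fit in 8 bits");
  return (maxBytes << 8) | alignField;
}

// Split an addend back into its two fields.
AlignAddend unpackAlignAddend(uint64_t addend) {
  return {addend & 0xff, addend >> 8};
}
```

For example, packing an alignment field of 4 with a 12-byte limit gives `0xC04`, and unpacking that value recovers both fields.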
|
|
|
|
|
|
|
|
|
|
| |
The libclang python binding test CI job currently doesn't have any
restrictions on what branches it will run on when something is pushed
and also isn't restricted to the monorepo. This patch adds a branch
restriction for the push event, only running the CI job when something
is pushed to the main branch (and the path filter is met), and also adds
a filter to ensure that the job comes from a PR against the monorepo or
a push to a branch in the monorepo.
|
|
|
|
|
| |
Now that the work embedding PGO information in SHT_LLVM_BB_ADDR_MAP ELF
sections has landed, there is no longer a need to keep around the
mbb-profile-dump flag.
|
|
|
|
| |
Change-Id: I6b2346301f9bd840a0adceba4a0d03e9932af245
|
|
|
|
|
| |
Follow-up from previous pull request. Motivate the API change with an
attribute that decides between sugaring a sub-attribute or using an
alias
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds basic TLSDESC support in the RISC-V backend.
Specifically, we add new relocation types for TLSDESC, as prescribed in
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373, and add a
new pseudo instruction to simplify code generation.
This patch does not try to optimize the local dynamic case, which can be
improved in separate patches.
Linker side changes will also be handled separately.
The current implementation is only enabled when passing the new
`-enable-tlsdesc` codegen flag.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As I worked through changes to another PR
(https://github.com/llvm/llvm-project/pull/74912), I couldn't help but
rewrite a few methods for readability, maintainability, and possibly
some behavior correctness too.
1. Exiting early instead of nested `if`-statements, which:
- Reduces indentation levels for all subsequent lines
- Treats missing pre-conditions similar to an error
- Clearly indicates that the full length of the method is the "happy
path".
2. Explicitly return empty Value Object shared pointers for those error
(like) situations, which
- Reduces the time it takes a maintainer to figure out what the method
actually returns based on those conditions.
3. Converting a mix of `if` and `if`-`else`-statements around an enum
into one `switch` statement, which:
- Consolidates the former branching logic
- Lets the compiler warn you of a (future) missing enum case
- This one may actually change behavior slightly, because what was an
early test for one enum case, now happens later on in the `switch`.
4. Consolidating near-identical, "copy-pasta" logic into one place,
which:
- Separates the common code from the diverging paths.
- Highlights the differences between the code paths.
rdar://119833526
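A minimal sketch of the style points listed above: early exits, explicit empty shared pointers, and a single `switch` over an enum. All names here (`Kind`, `Value`, `ValueSP`, `Describe`) are hypothetical and are not the methods touched by this change.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical types for illustration only; not taken from the patch.
enum class Kind { Scalar, Pointer, Aggregate };

struct Value {
  std::string description;
};
using ValueSP = std::shared_ptr<Value>;

ValueSP Describe(const Kind *kind, const std::string &name) {
  // Early exits: each missing precondition returns an explicit empty pointer.
  if (!kind)
    return ValueSP();
  if (name.empty())
    return ValueSP();

  // Happy path. There is no default label, so a compiler can warn when a
  // new enumerator is added but not handled (for example with -Wswitch).
  std::string prefix;
  switch (*kind) {
  case Kind::Scalar:
    prefix = "scalar ";
    break;
  case Kind::Pointer:
    prefix = "pointer ";
    break;
  case Kind::Aggregate:
    prefix = "aggregate ";
    break;
  }
  assert(!prefix.empty() && "every enumerator sets a prefix");

  // Code common to every case lives in one place.
  return std::make_shared<Value>(Value{prefix + name});
}
```

Callers only need to check the returned pointer for emptiness; everything after the two guards is the normal path.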
|
|
|
|
|
|
| |
This test was originally introduced in
https://github.com/llvm/llvm-project/pull/76976, but it incorrectly
tests braced-list initialization instead of parenthesized
initialization.
|
|
|
|
| |
Change-Id: Ib6f237cc791a097f8f2411bc1d6502f11d4a748e
|
|
|
|
|
| |
Missed cleanup from https://reviews.llvm.org/D134716.
Fixes: #79220
|
|
|
|
|
| |
This is a high-level description and FAQ for what we're doing in
RemoveDIs, and how old code should behave with new debug-info
(exactly the same 99% of the time).
|
|
|
|
|
|
|
| |
This reverts commit cb528ec5e6331ce207c7b835d7ab963bd5e13af7.
Reason: buildbot breakage (https://lab.llvm.org/buildbot/#/builders/5/builds/40364):
SUMMARY: AddressSanitizer: container-overflow /b/sanitizer-x86_64-linux-fast/build/libcxx_build_asan_ubsan/include/c++/v1/string:1870:29 in __get_long_pointer
|
|
|
|
|
|
|
|
|
|
|
|
| |
Here's a raft of minor fixes for the RemoveDIs project that's replacing
dbg.value intrinsics with DPValue objects, all IMO trivial:
* When inserting functions or blocks and calling setIsNewDbgInfoFormat,
do that after setting the Parent pointer, just in case conversion from
(or to) dbg.value mode is triggered.
* When transferring DPValues from an empty range in a splice call, don't
transfer if there are no DPValues attached to the source block at all.
* stripNonLineTableDebugInfo should drop DPValues.
* In insertBefore, don't try to transfer DPValues if there aren't any.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Many machine-learning applications (and most software written at AMD)
expect the operation that truncates floats to 8-bit floats to be
saturating. That is, they expect `truncf 256.0 : f32 to f8E4M3FNUZ` to
yield `240.0`, not `NaN`, and similarly for negative numbers. However,
the underlying hardware instruction that can be used for this truncation
implements overflow-to-NaN semantics.
To enable handling this use case, we add the saturate-fp8-truncf option
to ArithToAMDGPU (off by default), which causes the requisite clamping
code to be emitted. Said clamping code ensures that Inf and NaN are
passed through exactly (and thus truncate to NaN).
Per review feedback, this commit refactors
createScalarOrSplatConstant() to the Arith dialect utilities and uses
it in this code. It also fixes naming of existing patterns and
switches from vector.extractelement/insertelement to
vector.extract/insert.
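The clamping behaviour described above can be sketched on scalar floats as follows. This is not the code the pass emits: the function name `clampForSaturatingTruncf` and the constant name are hypothetical, and the 240.0 bound is taken from the `256.0` to `240.0` example in this message.

```cpp
#include <cassert>
#include <cmath>

// Saturation bound for f8E4M3FNUZ, per the example in the message above.
constexpr float kF8E4M3FNUZMax = 240.0f;

// Clamp a float into [-240, 240] before an overflow-to-NaN truncation.
// Inf and NaN are returned unchanged, so they still truncate to NaN.
float clampForSaturatingTruncf(float x) {
  if (std::isnan(x) || std::isinf(x))
    return x;
  float clamped = std::fmin(std::fmax(x, -kF8E4M3FNUZMax), kF8E4M3FNUZMax);
  assert(std::fabs(clamped) <= kF8E4M3FNUZMax);
  return clamped;
}
```

With this pre-step, finite out-of-range inputs saturate instead of overflowing, while in-range values are left untouched.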
|
| |
|
|
|
|
|
|
|
|
|
|
| |
SANITIZER_WEAK_IMPORT is useful for any call that needs to be
conditionally linked in. This is currently used for the
tsan_dispatch_interceptors, but can be used for other calls introduced
in newer versions of macOS (such as `aligned_alloc` in this PR:
https://github.com/llvm/llvm-project/pull/79198).
This PR moves the definition to a higher level so it can be used in
other sanitizers.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
We can directly call
`clang -c -x cl -target amdgcn -mcpu=gfx90a test.cl -o test.o`
to compile an OpenCL kernel file. However, when `--save-temps` is
enabled, it doesn't work because the preprocessed file (`.i` file) is
taken as a C source file when it is fed to the front end, thus causing
a compilation error because the OpenCL keywords can't be recognized.
This patch fixes the issue.
|
|
|
|
|
|
|
|
|
| |
We were previously allowlisting awaitable types returned by
`await_transform` instead of the type of the operand of the `co_await`
expression.
This gave false positives and did not respect the
`AllowedAwaitablesList` flag when `await_transform` is used. See added
test cases for such examples.
|
|
|
|
|
|
|
|
|
| |
Currently, the UnifiedLTO pipeline seems to have trouble with several
LTO features, like SplitLTO units, which means we cannot use important
optimizations like Whole Program Devirtualization or security hardening
instrumentation like CFI.
This patch reverts FatLTO to using distinct pipelines for Full LTO and
ThinLTO. It still avoids module cloning, since that was error prone.
|
| |
|
| |
|
|
|
|
|
| |
char *) (#79189)
This overload is completely unused.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for translating dense_resource attributes to
LLVMIR Target.
The support added is similar to how DenseElementsAttr is handled, except
we don't need to handle splats.
Another possible way of doing this is adding iteration on
dense_resource, but that is non-trivial as DenseResourceAttr is not
meant to be something you should directly access. It has subclasses
which you are supposed to use to iterate on it.
|
|
|
|
|
|
|
|
|
| |
(#78378)
`--function=<regex>` Include functions matching regex in the output
`--no-function=<regex>` Exclude functions matching regex from the output
If both are specified, `--no-function` has a higher precedence if a
function name matches both filters
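The precedence rule above can be sketched as a small predicate. The helper `shouldIncludeFunction` is hypothetical, an empty pattern string stands for an option that was not given, and the use of `std::regex_search` (substring matching) plus the behaviour when only one option is set are assumptions rather than details from the patch.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical filter: includePattern models --function=<regex> and
// excludePattern models --no-function=<regex>; "" means "not specified".
bool shouldIncludeFunction(const std::string &name,
                           const std::string &includePattern,
                           const std::string &excludePattern) {
  assert(!name.empty() && "function names are expected to be non-empty");
  // Exclusion has higher precedence when a name matches both filters.
  if (!excludePattern.empty() &&
      std::regex_search(name, std::regex(excludePattern)))
    return false;
  // Assumption: with an include filter, only matching names are kept.
  if (!includePattern.empty())
    return std::regex_search(name, std::regex(includePattern));
  // Assumption: with no include filter, everything not excluded is kept.
  return true;
}
```

For instance, with `--function=foo` and `--no-function=foo_test`, a function named `foo_test_helper` matches both filters and is left out.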
|
| |
|
|
|
|
|
| |
Use the LazyDetector also for the remaining toolchains to avoid unnecessarily checking for the CUDA and ROCm installations.
This fixes rdar://121397534.
|
|
|
|
|
| |
There are matching store opcodes (stfd, stxsiwx) for the load opcodes
that make 32-bit and 64-bit vector operations cheap with VSX, so stores
should also be cheap.
|
| |
|
| |
|
|
|
|
|
| |
There is no need to analyze small graphs that contain only a gather
node; skipping them avoids a crash.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The code cleanup in #74794 accidentally broke detection of languages by
reading file content from stdin, e.g. via `clang-format -dump-config - <
/path/to/filename`.
This PR adds unit and integration tests to reproduce the issue and adds
a fix.
Fixes: #79023
|
|
|
|
| |
The nightly libc++ build was incorrectly set up to build at 22:00
Eastern when it was intended to run at 03:00 Eastern. This patch fixes
that.
|
| |
|
|
|
|
|
| |
It was accidentally in the middle of the floating point extensions
after the recent reordering.
|
|
|
|
| |
Reverts llvm/llvm-project#79151, necessary to revert #79128, which broke
all production builds and landed without review.
|