| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
This reverts commit e05c1b46d0d3739cc48ad912dbe6e9affce05927.
|
|
|
|
|
|
|
|
| |
RISCVRegisterBankInfo::getInstrMapping.
This removes the special case for vectors. The default case in the
second switch can handle GPR in addition to vectors. We just won't
use the static ValueMapping entry.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a followup to
https://github.com/llvm/llvm-project/pull/86359
"[lldb] [ObjectFileMachO] LLVM_COV is not mapped into firmware memory
(#86359)"
where I treat LLVM_COV segments in a Mach-O binary as non-loadable.
There is another codepath in
`DynamicLoaderStatic::LoadAllImagesAtFileAddresses` which is called to
set the load addresses for a Module to the file addresses. It has no
logic to detect a segment that is not loaded in virtual memory
(ObjectFileMachO::SectionIsLoadable), so it would set the load address
for this LLVM_COV segment to the file address and shadow actual code,
breaking lldb behavior.
This method currently sets the load address for any section that doesn't
have a load address set already. This presumes that a Module was added
to the Target, some mechanism set the correct load address for SOME
segments, and then this method is going to set the other segments to a
no-slide value, assuming they were forgotten.
ObjectFile base class doesn't, today, vend a SectionIsLoadable method,
but we do have ObjectFile::SetLoadAddress and at a higher level,
Module::SetLoadAddress, when we're setting the same slide to all
segments.
That's the behavior we want in this method. If any section has a load
address, we don't touch this Module. Otherwise we set all sections to
have a load address that is the same as the file address.
I also audited the other parts of lldb that are calling
SectionList::SectionLoadAddress and checked whether they should instead
be using Module::SetLoadAddress for the entire binary. But in
most cases, we have the potential for different slides for different
sections so this section-by-section approach must be taken.
rdar://125800290
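The "all or nothing" rule described above can be sketched as follows. This is a minimal Python model, not lldb code: the `Section` and `Module` classes, the `loadable` flag (standing in for a check like ObjectFileMachO::SectionIsLoadable), and `load_at_file_addresses` are all hypothetical names used for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Section:
    name: str
    file_addr: int
    loadable: bool = True            # hypothetical stand-in for SectionIsLoadable
    load_addr: Optional[int] = None  # None means no load address has been set


@dataclass
class Module:
    sections: List[Section] = field(default_factory=list)


def load_at_file_addresses(module: Module) -> bool:
    """Apply a zero slide to the whole module, or leave it untouched.

    If any section already has a load address, some other mechanism has
    placed this module, so nothing is changed. Otherwise every loadable
    section gets load address == file address. Returns True on change.
    """
    if any(s.load_addr is not None for s in module.sections):
        return False
    for s in module.sections:
        if s.loadable:
            s.load_addr = s.file_addr
    return True


text = Section("__TEXT", 0x1000)
cov = Section("__LLVM_COV", 0x1000, loadable=False)
changed = load_at_file_addresses(Module([text, cov]))
```

In this model the non-loadable coverage segment never receives a load address, so it cannot shadow the code segment that shares its file address.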
|
|
|
|
|
|
| |
A DeclRef to a field must be marked as an LValue to be consistent with how the
field decl will be evaluated.
Calling T->desugar() is unnecessary before calling ->isArrayType().
|
|
|
|
|
|
| |
Reverts llvm/llvm-project#86812.
This commit caused a regression on the x86_64 MacOS buildbot:
https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/lldb-cmake/784/
|
|
|
|
|
|
|
| |
(#87582)
Previously the leading space was added in each string constant. This
patch moves the leading space out of the string constants and adds it
explicitly instead, to make the code clearer.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
solely on leading unit dims. (#85694)
Updates `castAwayContractionLeadingOneDim` to check for leading unit
dimensions before inserting `vector.transpose` ops.
Currently `castAwayContractionLeadingOneDim` removes all leading unit
dims based on the accumulator and transposes any subsequent operands to
match the accumulator indexing. This does not take into account whether
the transpose is strictly necessary, for instance when given this
vector-matrix contract:
```mlir
%result = vector.contract {indexing_maps = [affine_map<(d0, d1, d2, d3) -> (d0, d1, d3)>, affine_map<(d0, d1, d2, d3) -> (d0, d2, d3)>, affine_map<(d0, d1, d2, d3) -> (d1, d2)>], iterator_types = ["parallel", "parallel", "parallel", "reduction"], kind = #vector.kind<add>} %lhs, %rhs, %acc : vector<1x1x8xi32>, vector<1x8x8xi32> into vector<1x8xi32>
```
Passing this through `castAwayContractionLeadingOneDim` pattern produces
the following:
```mlir
%0 = vector.transpose %arg0, [1, 0, 2] : vector<1x1x8xi32> to vector<1x1x8xi32>
%1 = vector.extract %0[0] : vector<1x8xi32> from vector<1x1x8xi32>
%2 = vector.extract %arg2[0] : vector<8xi32> from vector<1x8xi32>
%3 = vector.contract {indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>, affine_map<(d0, d1, d2) -> (d0, d1, d2)>, affine_map<(d0, d1, d2) -> (d1)>], iterator_types = ["parallel", "parallel", "reduction"], kind = #vector.kind<add>} %1, %arg1, %2 : vector<1x8xi32>, vector<1x8x8xi32> into vector<8xi32>
%4 = vector.broadcast %3 : vector<8xi32> to vector<1x8xi32>
```
The `vector.transpose` introduced does not affect the underlying data
layout (it is effectively a no-op), but it cannot be folded automatically.
This change avoids inserting transposes when only leading unit
dimensions are involved.
Fixes #85691
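One way to state the "only leading unit dimensions are involved" condition is that the permutation keeps all non-unit dimensions in their original relative order. The following Python sketch illustrates that check; the function name is hypothetical and this is not the MLIR implementation.

```python
from typing import Sequence


def transpose_only_moves_unit_dims(shape: Sequence[int], perm: Sequence[int]) -> bool:
    """Return True if `perm` leaves the non-unit dims in their original order.

    Follows the vector.transpose convention that result dim i is source
    dim perm[i]. When only size-1 dims change position, the element order
    in memory is unchanged, so the transpose does not move any data.
    """
    if sorted(perm) != list(range(len(shape))):
        raise ValueError("perm must be a permutation of the dimension indices")
    non_unit_sources = [src for src in perm if shape[src] != 1]
    return non_unit_sources == sorted(non_unit_sources)


# The transpose from the example above: vector<1x1x8xi32> with [1, 0, 2].
is_noop = transpose_only_moves_unit_dims([1, 1, 8], [1, 0, 2])
```

For the `vector<1x1x8xi32>` case the only non-unit dimension stays last, so the check reports that the transpose moves no data.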
|
|
|
|
|
|
|
| |
patterns (#86005)
Updates smmla unrolling patterns to handle vecmat contracts where `dimM=1`. This includes explicit vecmats in the form: `<1x8xi8> x <8x8xi8> --> <1x8xi32>` or implied with the leading dim folded: `<8xi8> x <8x8xi8> --> <8xi32>`
Since the smmla operates on two `<2x8xi8>` input vectors to produce `<2x2xi32>` accumulators, half of each 2x2 accumulator tile is dummy data not pertinent to the computation, resulting in half throughput.
|
|
|
|
|
|
| |
This reverts commit 23616c65e7d632e750ddb67d55cc39098a69a8a6
because it breaks Fuchsia Clang toolchain builders.
https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8751656876289840849/overview
|
|
|
|
| |
scalable vector type
|
|
|
|
| |
vector type
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
G_ICMP for scalable vector types
This patch legalizes G_ZEXT, G_SEXT, and G_ANYEXT. If the type is a
legal mask type, then the instruction is legalized as an element-wise
select, where the condition on the select is the mask typed source
operand, the true value is 1 (for zero/any-extension) or -1 (for sign
extension), and the false value is zero. If the type is a legal integer
or vector integer type, then the instruction is marked as legal.
The legalization of the extends may introduce a G_SPLAT_VECTOR, which
needs to be legalized in this patch for the extend test cases to pass.
A G_SPLAT_VECTOR is legal if the vector type is a legal integer or
floating point vector type and the source operand is sXLen type. This is
because the SelectionDAG patterns only support sXLen typed
ISD::SPLAT_VECTORS, and we'd like to reuse those patterns. A
G_SPLAT_VECTOR is custom legalized if it has a legal s1 element vector
type and s1 scalar operand. It is legalized to G_VMSET_VL or G_VMCLR_VL
if the splat is all ones or all zeros respectively. In the case of a
non-constant mask splat, we legalize by promoting the scalar value to
s8.
In order to get the s8 element vector back into s1 vector, we use a
G_ICMP. In order for the splat vector and extend tests to pass, we also
need to legalize G_ICMP in this patch.
A G_ICMP is legal if the destination type is a legal bool vector and the LHS and
RHS are legal integer vector types.
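The extend-as-select lowering for mask types can be illustrated with a small Python model. This is only a sketch of the semantics described above, using plain lists in place of scalable vectors; `extend_mask` is a hypothetical name.

```python
from typing import List, Sequence


def extend_mask(mask: Sequence[bool], signed: bool) -> List[int]:
    """Model extending an s1 mask vector as an element-wise select.

    The mask is the select condition. The true value is -1 for sign
    extension and 1 for zero/any-extension; the false value is 0.
    """
    true_value = -1 if signed else 1
    false_value = 0
    return [true_value if bit else false_value for bit in mask]


zext = extend_mask([True, False, True], signed=False)
sext = extend_mask([True, False, True], signed=True)
```

Sign extension yields -1 for set lanes because an s1 value of 1 has all bits set once widened with sign replication.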
|
|
|
|
|
|
|
|
|
| |
bitfields""" (#87562)
Reverts llvm/llvm-project#87529
Reverts #87518
https://lab.llvm.org/buildbot/#/builders/37/builds/33262 is still broken
|
|
|
|
|
|
|
|
|
|
| |
`atomic_compare_exchange_{weak,strong}` (#87135)
Spotted this minor mistake in the tests as I was looking into testing
`atomic_ref` more thoroughly.
The two-argument overloads are tested just above. The names of the
lambdas clearly indicate that the intent was to test the one-argument
overload.
|
| |
|
|
|
|
|
|
| |
costs for testing
Improves SSE vs AVX test results for #87510
|
|
|
|
|
|
|
|
|
|
|
| |
Compiler can improve analysis for operands of UIToFP/SIToFP instructions
and operands of ICmp instruction.
Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/85966
|
|
|
|
|
| |
Adds OffTType to the Macro lists of fcntl.h and stdio.h in libc/spec/posix.td, as
mentioned here: #87266
|
|
|
|
| |
Some TUs apparently end up with an ambiguity between `::llvm::detail`
and `support::detail`, so we close that gap at the source.
|
|
|
|
|
|
|
| |
instructions."
This reverts commit 899855d2b11856a44e530fffe854d76be69b9008 to fix the
issue reported in https://lab.llvm.org/buildbot/#/builders/165/builds/51659.
|
|
|
|
|
|
|
|
|
|
|
| |
Compiler can improve analysis for operands of UIToFP/SIToFP instructions
and operands of ICmp instruction.
Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/85966
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous diff (and its subsequent fix) were reverted as the tests
didn't work properly on the AArch64 & ARM LLDB buildbots. I made a
couple more minor changes to tests (from @clayborg's feedback) and
disabled them for non Linux-x86(_64) builds, as I don't have the ability
to do anything about an ARM64 Linux failure. If I had to guess, I'd say the
toolchain on the buildbots isn't respecting the `-Wl,--build-id` flag.
Maybe, one day, when I have a Linux AArch64 system I'll dig into it.
From the reverted PR:
I've migrated the tests in my
https://github.com/llvm/llvm-project/pull/79181 from shell to API (at
@JDevlieghere's suggestion) and addressed a couple issues that were
exposed during testing.
The tests first test the "normal" situation (no DebugInfoD involvement,
just normal debug files sitting around), then the "no debug info"
situation (to make sure the test is seeing failure properly), then it
tests to validate that when DebugInfoD returns the symbols, things work
properly. This is duplicated for DWP/split-dwarf scenarios.
---------
Co-authored-by: Kevin Frei <freik@meta.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Allows src1 of VOP3 encoded VOPC to be an SGPR or inline immediate on
GFX1150Plus
The w32 and w64 _e64_dpp assembler only real instructions were unused,
and erroneously constructed in a way that bugged parsing of the new
instructions. They are removed.
This patch is a follow up to PR
https://github.com/llvm/llvm-project/pull/87382
|
|
|
|
| |
We should consistently use PseudoInstr instead of Mnemonic to name
SIMCInstr, even though they may be the same in most cases
|
| |
|
|
|
|
| |
Add zext nneg tests and check we don't fold casts with different src types
|
|
|
|
|
|
|
|
|
|
| |
Implemented long-standing TODO to support commutative intrinsics.
Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/86316
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, all the call probe ids came after the block ids. This change
mixes the call probes and block probes by reordering them in
lexical (line-number) order. For example:
```
main():
BB1
if(...)
BB2 foo(..);
else
BB3 bar(...);
BB4
```
Before, the profile is
```
main
1: ..
2: ..
3: ...
4: ...
5: foo ...
6: bar ...
```
Now the new order is
```
main
1: ..
2: ..
3: foo ...
4: ...
5: bar ...
6: ...
```
This can potentially make the profile more tolerant of mismatch, either from a stale profile or a frontend change. For example, before this change, adding one block, even as the last one, shifted all the call probes and made them mismatch. Moreover, this makes better use of call-anchor based stale profile matching: blocks are matched based on the closest anchor, so more anchors are used for the matching, which reduces the mismatch scope.
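The two numbering schemes can be compared with a short Python sketch. It is an illustrative model only: the `(kind, name)` tuples and both function names are hypothetical, and real probe insertion works on IR rather than on a flat list.

```python
from typing import Dict, List, Tuple

Item = Tuple[str, str]  # (kind, name) where kind is "block" or "call"


def ids_blocks_first(items: List[Item]) -> Dict[str, int]:
    """Old scheme: number all blocks first, then all calls."""
    ordered = [i for i in items if i[0] == "block"] + [i for i in items if i[0] == "call"]
    return {name: idx for idx, (_, name) in enumerate(ordered, start=1)}


def ids_lexical(items: List[Item]) -> Dict[str, int]:
    """New scheme: number blocks and calls together in lexical order."""
    return {name: idx for idx, (_, name) in enumerate(items, start=1)}


# The main() example above, listed in lexical (line-number) order.
main_items = [
    ("block", "BB1"), ("block", "BB2"), ("call", "foo"),
    ("block", "BB3"), ("call", "bar"), ("block", "BB4"),
]
old_ids = ids_blocks_first(main_items)
new_ids = ids_lexical(main_items)

# Appending a block shifts every call id in the old scheme but none in the new one.
grown = main_items + [("block", "BB5")]
old_grown = ids_blocks_first(grown)
new_grown = ids_lexical(grown)
```

With the extra trailing block, `foo` and `bar` move from ids 5 and 6 to 6 and 7 under the old scheme, while they keep ids 3 and 5 under lexical ordering.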
|
|
|
|
|
|
|
|
|
|
| |
By generic intrinsics this means things like dup, ext, zip and bsl that
can always be executed with integer s16 operations and do not require
fullfp16. This makes them always available, and brings them in line with
GCC.
https://godbolt.org/z/azs8eMv54
The relevant test cases have been moved into their own files, to allow
them to be tested with armv8-a and armv8.2-a+fp16.
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#87529)
Reverts llvm/llvm-project#87518
Revert is not needed as the regression was fixed with
1189e87951e59a81ee097eae847c06008276fef1.
I assumed the crash and the warning were different issues, but according to
https://lab.llvm.org/buildbot/#/builders/240/builds/26629
fixing the warning resolves the crash.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
available_externally functions (#87279)
This is to fix an assertion error. Apparently, `pseudo_probe_desc` could
still be available for import functions, and its checksum mismatch state
can be different from the import function's `profile-checksum-mismatch`
attr. This happens when an unstable IR or ODR violation issue occurs: the
definitions of the same function across different translation units
could be different and result in different checksums. During link-time
deduplication, the internal function definition (which the checksum in the
desc is computed from) is substituted by the `available_externally`
definition, which causes the inconsistency. Hence, we fix it by always
checking the state for the new `available_externally` definition, which
is saved in the function attribute.
|
|
|
|
|
|
| |
`DefaultTimingManager::clear()` uses `out` to initialize `TimerImpl`,
but the `out` is `nullptr` by default. This means if
`DefaultTimingManager::setOutput()` is never called,
`DefaultTimingManager` destructor may generate SIGSEGV.
|
|
|
|
|
|
|
|
|
|
| |
Justifications:
- LWG3950: Done in #66206
- LWG3975: Wording changes only
- LWG4011: Wording changes only
- LWG4030: Wording changes only
- LWG4043: Wording changes only
- LWG3036 and P2875R4: We implemented neither, but the latter reverts
the former, so now we implement both without doing anything!
|
|
|
|
|
|
|
| |
This addition catches common cases of malformed `tosa.reshape` ops. This
prevents the `--tosa-to-tensor` pass from asserting when fed invalid
operations, as these will be caught ahead of time by the verifier.
Closes #87396
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
reordering.
If the node has cmp instructions with 3 or more different but swappable
predicates, we need to keep the same kind of main/alternate opcodes to avoid
incorrect detection of opcodes after reordering. Reordering changes the
order and we may erroneously consider swappable opcodes as
non-compatible/alternate, which may lead to a later compiler crash.
Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/87267
|
|
|
|
|
|
|
|
|
|
|
| |
This was accidentally removed in
https://reviews.llvm.org/D137799#4657404 /
https://reviews.llvm.org/D137799#C3933303OL44, and downstream projects
are forced to add it back. For example,
https://git.savannah.gnu.org/cgit/guix.git/commit/?id=4e26331a5ee87928a16888c36d51e270f0f10f90
Fix this, by re-adding it.
Co-authored-by: MarcoFalke <*~=`'#}+{/-|&$^_@721217.xyz>
|
|
|
|
| |
llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rvv/legalize-xor.mir
|
|
|
|
|
|
|
| |
This paper did not add any normative changes for us to check
conformance against. It added a note describing a potential behavioral
difference between compile-time and runtime evaluation of negative
floating-point values in the presence of rounding modes.
|
| |
|
|
|
|
|
| |
Reverts llvm/llvm-project#75481
Breaks multiple bots, see #75481
|
|
|
|
|
| |
Straightforward computation of `A − FLOOR (A / P) * P` should
produce NaN, when P is infinity. The -menable-no-infs lowering
can still use the relaxed operations sequence.
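The NaN result can be reproduced with ordinary IEEE-754 doubles. The Python sketch below evaluates the formula literally; `naive_modulo` is a hypothetical name and this is not the Flang lowering itself.

```python
import math


def naive_modulo(a: float, p: float) -> float:
    """Evaluate A - FLOOR(A / P) * P literally.

    For finite A and infinite P, A / P is zero, FLOOR gives 0, and
    0 * infinity is NaN under IEEE-754, so the whole expression is NaN.
    """
    return a - math.floor(a / p) * p


finite_case = naive_modulo(5.0, 3.0)
negative_case = naive_modulo(-5.0, 3.0)
infinite_case = naive_modulo(5.0, math.inf)
```

The finite cases behave as expected, while the infinite divisor poisons the result through the `0 * inf` product, which is why the relaxed sequence is only suitable when infinities are ruled out.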
|
|
|
|
|
| |
We have to compare the actual type sizes to pick the proper cast operation
opcode.
|
|
|
|
|
|
|
|
| |
The readme only states the goal and has links to further information,
e.g., our meetings.
---------
Co-authored-by: Shilei Tian <i@tianshilei.me>
|
|
|
|
|
|
|
|
| |
This patch fixes:
clang/lib/CodeGen/CGExpr.cpp:5607:11: error: variable 'Result' is
used uninitialized whenever 'if' condition is false
[-Werror,-Wsometimes-uninitialized]
|
|
|
|
| |
Convert math.fpowi to math.powf by converting dtype of power operand to
floating point.
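The idea behind the rewrite can be shown in scalar form. This Python sketch only illustrates the arithmetic; `fpowi_via_powf` is a hypothetical name and the sketch says nothing about how the MLIR pattern handles types or vectors.

```python
import math


def fpowi_via_powf(base: float, exponent: int) -> float:
    """Compute base ** exponent by first converting the integer exponent
    to floating point and then calling a float-float power function."""
    return math.pow(base, float(exponent))


cube = fpowi_via_powf(2.0, 3)
reciprocal = fpowi_via_powf(2.0, -1)
```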
|
|
|
|
|
|
|
|
|
| |
Currently -salvage-stale-profile is a no-op if the profile is not
probe-based. We observed that it can help for regular, non-probe-based
profiles too: some of our internal benchmarks show 0.2-0.3% QPS
improvement.
There seems to be no good reason to limit this flag to only work for
probe-based profiles.
|