clang/llvm.git - Vendor branches of https://github.com/llvm/llvm-project.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	Revert "Reenable external categories (#87357)" [upstream/revert-87357-reenable-external-categories]	Vitaly Buka	2024-04-03	80	-3892/+8845
\| \| \| \|	This reverts commit e05c1b46d0d3739cc48ad912dbe6e9affce05927.
*	[RISCV] Remove G_TRUNC/ZEXT/SEXT/ANYEXT from the first switch in ↵	Craig Topper	2024-04-03	1	-10/+0
\| \| \| \| \| \| \| \|	RISCVRegisterBankInfo::getInstrMapping. This removes the special case for vectors. The default case in the second switch can handle GPR in addition to vectors. We just won't use the static ValueMapping entry.
*	[mlir][vector] Skip 0D vectors in vector linearization. (#87577)	Han-Chung Wang	2024-04-03	2	-0/+13
\|
*	[lldb] Set static Module's load addresses via ObjectFile (#87439)	Jason Molenda	2024-04-03	1	-24/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a followup to https://github.com/llvm/llvm-project/pull/86359 "[lldb] [ObjectFileMachO] LLVM_COV is not mapped into firmware memory (#86359)" where I treat LLVM_COV segments in a Mach-O binary as non-loadable. There is another codepath in `DynamicLoaderStatic::LoadAllImagesAtFileAddresses` which is called to set the load addresses for a Module to the file addresses. It has no logic to detect a segment that is not loaded in virtual memory (ObjectFileMachO::SectionIsLoadable), so it would set the load address for this LLVM_COV segment to the file address and shadow actual code, breaking lldb behavior. This method currently sets the load address for any section that doesn't have a load address set already. This presumes that a Module was added to the Target, some mechanism set the correct load address for SOME segments, and then this method is going to set the other segments to a no-slide value, assuming they were forgotten. ObjectFile base class doesn't, today, vend a SectionIsLoadable method, but we do have ObjectFile::SetLoadAddress and at a higher level, Module::SetLoadAddress, when we're setting the same slide to all segments. That's the behavior we want in this method. If any section has a load address, we don't touch this Module. Otherwise we set all sections to have a load address that is the same as the file address. I also audited the other parts of lldb that are calling SectionList::SectionLoadAddress and looked if they should be more correctly using Module::SetLoadAddress for the entire binary. But in most cases, we have the potential for different slides for different sections so this section-by-section approach must be taken. rdar://125800290
*	[BoundsSafety] Minor fixes on counted_by (#87559)	Yeoul Na	2024-04-03	2	-3/+3
\| \| \| \| \| \|	DeclRef to field must be marked as LValue to be consistent with how the field decl will be evaluated. T->desugar() is unnecessary to call ->isArrayType().
*	Revert "DebugInfoD issues, take 2" (#87583)	Chelsea Cassanova	2024-04-03	10	-506/+17
\| \| \| \| \| \|	Reverts llvm/llvm-project#86812. This commit caused a regression on the x86_64 MacOS buildbot: https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/lldb-cmake/784/
*	[Bounds-Safety][NFC] Clean up leading space emission for CountAttributedType ↵	Dan Liew	2024-04-03	1	-4/+5
\| \| \| \| \| \| \|	(#87582) Previously the leading space was added in each string constant. This patch moves the leading space out of the string constants and is instead explicitly added to add clarity to the code.
*	[mlir][vector] Update `castAwayContractionLeadingOneDim` to omit transposes ↵	Kojo Acquah	2024-04-03	2	-3/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	solely on leading unit dims. (#85694) Updates `castAwayContractionLeadingOneDim` to check for leading unit dimensions before inserting `vector.transpose` ops. Currently `castAwayContractionLeadingOneDim` removes all leading unit dims based on the accumulator and transposes any subsequent operands to match the accumulator indexing. This does not take into account whether the transpose is strictly necessary, for instance when given this vector-matrix contract: ```mlir %result = vector.contract {indexing_maps = [affine_map<(d0, d1, d2, d3) -> (d0, d1, d3)>, affine_map<(d0, d1, d2, d3) -> (d0, d2, d3)>, affine_map<(d0, d1, d2, d3) -> (d1, d2)>], iterator_types = ["parallel", "parallel", "parallel", "reduction"], kind = #vector.kind<add>} %lhs, %rhs, %acc : vector<1x1x8xi32>, vector<1x8x8xi32> into vector<1x8xi32> ``` Passing this through the `castAwayContractionLeadingOneDim` pattern produces the following: ```mlir %0 = vector.transpose %arg0, [1, 0, 2] : vector<1x1x8xi32> to vector<1x1x8xi32> %1 = vector.extract %0[0] : vector<1x8xi32> from vector<1x1x8xi32> %2 = vector.extract %arg2[0] : vector<8xi32> from vector<1x8xi32> %3 = vector.contract {indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>, affine_map<(d0, d1, d2) -> (d0, d1, d2)>, affine_map<(d0, d1, d2) -> (d1)>], iterator_types = ["parallel", "parallel", "reduction"], kind = #vector.kind<add>} %1, %arg1, %2 : vector<1x8xi32>, vector<1x8x8xi32> into vector<8xi32> %4 = vector.broadcast %3 : vector<8xi32> to vector<1x8xi32> ``` The `vector.transpose` introduced does not affect the underlying data layout (it is effectively a no-op), but it cannot be folded automatically. This change avoids inserting transposes when only leading unit dimensions are involved. Fixes #85691
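The claim that a `[1, 0, 2]` transpose of a `1x1x8` vector leaves the data untouched can be checked with a small sketch. This is plain Python on nested lists, not MLIR; `transpose_102` and `flatten` are hypothetical helpers written only for this illustration.

```python
def transpose_102(tensor):
    """Permute dims [1, 0, 2] of a 3-D nested list: out[j][i][k] = in[i][j][k]."""
    d0, d1 = len(tensor), len(tensor[0])
    return [[list(tensor[i][j]) for i in range(d0)] for j in range(d1)]


def flatten(tensor):
    """Row-major element order of a 3-D nested list."""
    return [x for plane in tensor for row in plane for x in row]


# Shape 1x1x8, like the %lhs operand in the commit message.
lhs = [[list(range(8))]]
unit_transposed = transpose_102(lhs)

# Shape 2x2x1: here both swapped dims are non-unit, so element order changes.
non_unit = [[[0], [1]], [[2], [3]]]
non_unit_transposed = transpose_102(non_unit)
```

With only unit dimensions being swapped, `unit_transposed` equals `lhs` in both shape and element order, which is why the transpose can be skipped. The `2x2x1` case is included for contrast: there the row-major order does change.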
*	[mlir][ArmNeon] Updates LowerContractionToSMMLAPattern with vecmat unroll ↵	Kojo Acquah	2024-04-03	2	-31/+191
\| \| \| \| \| \| \|	patterns (#86005) Updates smmla unrolling patterns to handle vecmat contracts where `dimM=1`. This includes explicit vecmats in the form: `<1x8xi8> x <8x8xi8> --> <1x8xi32>` or implied with the leading dim folded: `<8xi8> x <8x8xi8> --> <8xi32>` Since the smmla operates on two `<2x8xi8>` input vectors to produce `<2x2xi32>` accumulators, half of each 2x2 accumulator tile is dummy data not pertinent to the computation, resulting in half throughput.
*	Revert "dsymutil: Re-add missing -latomic (#85380)"	Gulfem Savrun Yeniceri	2024-04-03	1	-1/+1
\| \| \| \| \| \|	This reverts commit 23616c65e7d632e750ddb67d55cc39098a69a8a6 because it breaks Fuchsia Clang toolchain builders. https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8751656876289840849/overview
*	[RISCV][GISEL] Instruction selection for G_ZEXT, G_SEXT, and G_ANYEXT with ↵	Michael Maitland	2024-04-03	3	-0/+2702
\| \| \| \|	scalable vector type
*	[RISCV][GISEL] Regbankselect for G_ZEXT, G_SEXT, and G_ANYEXT with scalable ↵	Michael Maitland	2024-04-03	4	-3/+2469
\| \| \| \|	vector type
*	[RISCV][GISEL] Instruction selection for G_ICMP	Michael Maitland	2024-04-03	1	-0/+534
\|
*	[RISCV][GISEL] Regbank select for scalable vector G_ICMP	Michael Maitland	2024-04-03	2	-1/+679
\|
*	[RISCV][GISEL] Legalize G_ZEXT, G_SEXT, and G_ANYEXT, G_SPLAT_VECTOR, and ↵	Michael Maitland	2024-04-03	14	-16/+7436
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	G_ICMP for scalable vector types This patch legalizes G_ZEXT, G_SEXT, and G_ANYEXT. If the type is a legal mask type, then the instruction is legalized as the element-wise select, where the condition on the select is the mask typed source operand, and the true and false values are 1 or -1 (for zero/any-extension and sign extension) and zero. If the type is a legal integer or vector integer type, then the instruction is marked as legal. The legalization of the extends may introduce a G_SPLAT_VECTOR, which needs to be legalized in this patch for the extend test cases to pass. A G_SPLAT_VECTOR is legal if the vector type is a legal integer or floating point vector type and the source operand is sXLen type. This is because the SelectionDAG patterns only support sXLen typed ISD::SPLAT_VECTORS, and we'd like to reuse those patterns. A G_SPLAT_VECTOR is custom legalized if it has a legal s1 element vector type and s1 scalar operand. It is legalized to G_VMSET_VL or G_VMCLR_VL if the splat is all ones or all zeros respectively. In the case of a non-constant mask splat, we legalize by promoting the scalar value to s8. In order to get the s8 element vector back into s1 vector, we use a G_ICMP. In order for the splat vector and extend tests to pass, we also need to legalize G_ICMP in this patch. A G_ICMP is legal if the destination type is a legal bool vector and the LHS and RHS are legal integer vector types.
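The mask-extension rule described above (an element-wise select with true value 1 or -1 and false value 0) can be sketched in a few lines. This is a Python model of the semantics only, not the GlobalISel legalizer code; `extend_mask` is a hypothetical name, and Python lists stand in for scalable vectors.

```python
def extend_mask(mask, sign_extend):
    """Model extending an s1 mask vector as an element-wise select.

    The mask is the select condition. The true value is -1 for sign
    extension and 1 for zero/any extension; the false value is 0.
    """
    true_value = -1 if sign_extend else 1
    false_value = 0
    return [true_value if bit else false_value for bit in mask]


mask = [True, False, True, True]
zext_result = extend_mask(mask, sign_extend=False)
sext_result = extend_mask(mask, sign_extend=True)
```

Here `zext_result` holds 1 wherever the mask bit is set and `sext_result` holds -1 (all bits set in two's complement), matching what extending a single bit would give.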
*	Revert "Revert "Revert "[clang][UBSan] Add implicit conversion check for ↵	Vitaly Buka	2024-04-03	11	-493/+73
\| \| \| \| \| \| \| \| \|	bitfields""" (#87562) Reverts llvm/llvm-project#87529 Reverts #87518 https://lab.llvm.org/buildbot/#/builders/37/builds/33262 is still broken
*	[libc++] Fix copy/pasta error in atomic tests for ↵	Damien L-G	2024-04-03	2	-4/+4
\| \| \| \| \| \| \| \| \| \|	`atomic_compare_exchange_{weak,strong}` (#87135) Spotted this minor mistake in the tests as I was looking into testing `atomic_ref` more thoroughly. The two-argument overloads are tested just above. The names of the lambdas clearly indicate that the intent was to test the one-argument overload.
*	[flang][runtime] Enable I/O APIs in F18 runtime offload builds. (#87543)	Slava Zakharin	2024-04-03	8	-213/+235
\|
*	[VectorCombine][X86] shuffle-of-casts.ll - adjust zext nneg tests to improve ↵	Simon Pilgrim	2024-04-03	1	-16/+16
\| \| \| \| \| \|	costs for testing Improves SSE vs AVX test results for #87510
*	[SLP]Improve minbitwidth analysis for operands of IToFP and ICmp instructions.	Alexey Bataev	2024-04-03	3	-16/+50
\| \| \| \| \| \| \| \| \| \| \|	Compiler can improve analysis for operands of UIToFP/SIToFP instructions and operands of ICmp instruction. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/85966
*	[libc] Added transitive bindings for OffsetType (#87397)	Shourya Goel	2024-04-03	7	-12/+30
\| \| \| \| \|	Adding OffTType to the macro lists of fcntl.h and stdio.h in libc/spec/posix.td, as mentioned here: #87266
*	fully qualifies use of `detail` namespace (#87536)	Christopher Di Bella	2024-04-03	1	-4/+6
\| \| \| \|	Some TUs apparently end up with an ambiguity between `::llvm::detail` and `support::detail`, so we close that gap at the source.
*	Revert "[SLP]Improve minbitwidth analysis for operands of IToFP and ICmp ↵	Alexey Bataev	2024-04-03	3	-48/+16
\| \| \| \| \| \| \|	instructions." This reverts commit 899855d2b11856a44e530fffe854d76be69b9008 to fix the issue reported in https://lab.llvm.org/buildbot/#/builders/165/builds/51659.
*	[SLP]Improve minbitwidth analysis for operands of IToFP and ICmp instructions.	Alexey Bataev	2024-04-03	3	-16/+48
\| \| \| \| \| \| \| \| \| \| \|	Compiler can improve analysis for operands of UIToFP/SIToFP instructions and operands of ICmp instruction. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/85966
*	[AMDGPU] Add a missing COV6 case to getAMDHSACodeObjectVersion() (#87492)	Emma Pilkington	2024-04-03	2	-0/+9
\|
*	DebugInfoD issues, take 2 (#86812)	Kevin Frei	2024-04-03	10	-17/+506
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The previous diff (and its subsequent fix) were reverted as the tests didn't work properly on the AArch64 & ARM LLDB buildbots. I made a couple more minor changes to tests (from @clayborg's feedback) and disabled them for non Linux-x86(_64) builds, as I don't have the ability to do anything about an ARM64 Linux failure. If I had to guess, I'd say the toolchain on the buildbots isn't respecting the `-Wl,--build-id` flag. Maybe, one day, when I have a Linux AArch64 system I'll dig into it. From the reverted PR: I've migrated the tests in my https://github.com/llvm/llvm-project/pull/79181 from shell to API (at @JDevlieghere's suggestion) and addressed a couple of issues that were exposed during testing. The tests first test the "normal" situation (no DebugInfoD involvement, just normal debug files sitting around), then the "no debug info" situation (to make sure the test is seeing failure properly), then it tests to validate that when DebugInfoD returns the symbols, things work properly. This is duplicated for DWP/split-dwarf scenarios. --------- Co-authored-by: Kevin Frei <freik@meta.com>
*	[AMDGPU][MC] Allow VOP3C dpp src1 to be imm or SGPR (#87418)	Joe Nash	2024-04-03	14	-86/+3218
\| \| \| \| \| \| \| \| \| \| \|	Allows src1 of VOP3 encoded VOPC to be an SGPR or inline immediate on GFX1150Plus The w32 and w64 _e64_dpp assembler only real instructions were unused, and erroneously constructed in a way that bugged parsing of the new instructions. They are removed. This patch is a follow up to PR https://github.com/llvm/llvm-project/pull/87382
*	AMDGPU: Use PseudoInstr to name SIMCInstr for DSDIR and SOPs, NFC (#87537)	Changpeng Fang	2024-04-03	2	-40/+40
\| \| \| \|	We should consistently use PseudoInstr instead of Mnemonic to name SIMCInstr, even though they may be the same in most cases
*	[AArch64] Add a test for non-temporal masked loads / stores. NFC	David Green	2024-04-03	1	-0/+75
\|
*	[VectorCombine][X86] Add additional tests for #87510	Simon Pilgrim	2024-04-03	1	-0/+42
\| \| \| \|	Add zext nneg tests and check we don't fold casts with different src types
*	[SLP]Add support for commutative intrinsics.	Alexey Bataev	2024-04-03	6	-27/+54
\| \| \| \| \| \| \| \| \| \|	Implemented long-standing TODO to support commutative intrinsics. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/86316
*	[PseudoProbe] Mix block and call probe ID in lexical order (#75092)	Lei Wang	2024-04-03	13	-79/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, all call probe IDs came after the block IDs. This change mixes call probes and block probes by ordering them in lexical (line-number) order. For example: ``` main(): BB1 if(...) BB2 foo(..); else BB3 bar(...); BB4 ``` Before, the profile is ``` main 1: .. 2: .. 3: ... 4: ... 5: foo ... 6: bar ... ``` Now the new order is ``` main 1: .. 2: .. 3: foo ... 4: ... 5: bar ... 6: ... ``` This can potentially make it more tolerant of profile mismatch, either from a stale profile or a frontend change. For example, before this change, adding one block, even as the last one, shifted and mismatched all the call probes. Moreover, this makes better use of call-anchor-based stale profile matching: blocks are matched based on the closest anchor, so more anchors are used for the matching, which reduces the mismatch scope.
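The two numbering schemes from the example above can be reproduced with a short sketch. This is an illustrative Python model, not the PseudoProbe implementation; the function names and the `(name, kind)` tuple representation are assumptions made for this example.

```python
def number_probes_blocks_first(probes):
    """Old scheme: all block probes get IDs first, then all call probes."""
    blocks = [name for name, kind in probes if kind == "block"]
    calls = [name for name, kind in probes if kind == "call"]
    return {name: i for i, name in enumerate(blocks + calls, start=1)}


def number_probes_lexical(probes):
    """New scheme: block and call probes share one lexical-order sequence."""
    return {name: i for i, (name, _kind) in enumerate(probes, start=1)}


# Probes of main() from the commit message, listed in lexical order.
probes = [
    ("BB1", "block"),
    ("BB2", "block"),
    ("foo", "call"),
    ("BB3", "block"),
    ("bar", "call"),
    ("BB4", "block"),
]
old_ids = number_probes_blocks_first(probes)
new_ids = number_probes_lexical(probes)

# Appending one trailing block shifts every call probe in the old scheme only.
grown = probes + [("BB5", "block")]
old_ids_grown = number_probes_blocks_first(grown)
new_ids_grown = number_probes_lexical(grown)
```

In the old scheme `foo` and `bar` move from 5 and 6 to 6 and 7 once `BB5` is appended, while in the lexical scheme they stay at 3 and 5, which is the mismatch tolerance the message describes.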
*	[AArch64][ARM] Make neon fp16 generic intrinsics always available. (#87467)	David Green	2024-04-03	6	-676/+1100
\| \| \| \| \| \| \| \| \| \|	By generic intrinsics this means things like dup, ext, zip and bsl that can always be executed with integer s16 operations and do not require fullfp16. This makes them always available, and brings them in line with GCC. https://godbolt.org/z/azs8eMv54 The relevant test cases have been moved into their own files, to allow them to be tested with armv8-a and armv8.2-a+fp16.
*	Revert "Revert "[clang][UBSan] Add implicit conversion check for bitfields"" ↵	Vitaly Buka	2024-04-03	11	-73/+493
\| \| \| \| \| \| \| \| \| \| \| \|	(#87529) Reverts llvm/llvm-project#87518 Revert is not needed as the regression was fixed with 1189e87951e59a81ee097eae847c06008276fef1. I assumed the crash and warning are different issues, but according to https://lab.llvm.org/buildbot/#/builders/240/builds/26629 fixing warning resolves the crash.
*	Always check the function attribute to determine checksum mismatch for ↵	Lei Wang	2024-04-03	2	-10/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	available_externally functions (#87279) This is to fix an assertion error. Apparently, `pseudo_probe_desc` could still be available for import functions, and its checksum mismatch state can be different from the import function's `profile-checksum-mismatch` attr. This happens when an unstable IR or ODR violation issue occurs: the definitions of the same function across different translation units could be different and result in different checksums. During link time deduplication, the internal function definition (which the checksum in desc is computed from) is substituted by the `available_externally` definition, which causes the inconsistency. Hence, we fix it by always checking the state for the new `available_externally` definition, which is saved in the function attribute.
*	[mlir] Initialize DefaultTimingManager::out. (#87522)	Chenguang Wang	2024-04-03	1	-1/+2
\| \| \| \| \| \|	`DefaultTimingManager::clear()` uses `out` to initialize `TimerImpl`, but the `out` is `nullptr` by default. This means if `DefaultTimingManager::setOutput()` is never called, `DefaultTimingManager` destructor may generate SIGSEGV.
*	[libc++] Mark some recent LWG issues and papers as done (#87502)	Louis Dionne	2024-04-03	4	-7/+8
\| \| \| \| \| \| \| \| \| \|	Justifications: - LWG3950: Done in #66206 - LWG3975: Wording changes only - LWG4011: Wording changes only - LWG4030: Wording changes only - LWG4043: Wording changes only - LWG3036 and P2875R4: We implemented neither, but the latter reverts the former, so now we implement both without doing anything!
*	Updates to 'tosa.reshape' verifier (#87416)	Rafael Ubal	2024-04-03	2	-17/+58
\| \| \| \| \| \| \|	This addition catches common cases of malformed `tosa.reshape` ops. This prevents the `--tosa-to-tensor` pass from asserting when fed invalid operations, as these will be caught ahead of time by the verifier. Closes #87396
*	[SLP]Fix PR87133: crash because of different altopcodes for cmps after ↵	Alexey Bataev	2024-04-03	3	-23/+104
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	reordering. If the node has cmp instructions with 3 or more different but swappable predicates, we need to keep the same kind of main/alternate opcodes to avoid incorrect detection of opcodes after reordering. Reordering changes the order and we may erroneously consider swappable opcodes as non-compatible/alternate, which may lead to a later compiler crash. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/87267
*	dsymutil: Re-add missing -latomic (#85380)	maflcko	2024-04-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	This was accidentally removed in https://reviews.llvm.org/D137799#4657404 / https://reviews.llvm.org/D137799#C3933303OL44, and downstream projects are forced to add it back. For example, https://git.savannah.gnu.org/cgit/guix.git/commit/?id=4e26331a5ee87928a16888c36d51e270f0f10f90 Fix this, by re-adding it. Co-authored-by: MarcoFalke <*~=`'#}+{/-\|&$^_@721217.xyz>
*	[RISCV][GISEL] Run update_mir_test_checks on ↵	Michael Maitland	2024-04-03	1	-44/+44
\| \| \| \|	llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rvv/legalize-xor.mir
*	[C23] Remove WG14 N2416 from the C status page	Aaron Ballman	2024-04-03	1	-5/+0
\| \| \| \| \| \| \|	This paper did not add any normative changes for us to check conformance against. It added a note describing a potential behavioral difference between compile-time and runtime evaluation of negative floating-point values in the presence of rounding modes.
*	[clang] Precommit test for `llvm.allow.ubsan.check()` (#87435)	Vitaly Buka	2024-04-03	1	-0/+207
\|
*	Revert "[clang][UBSan] Add implicit conversion check for bitfields" (#87518)	Vitaly Buka	2024-04-03	11	-493/+73
\| \| \| \| \|	Reverts llvm/llvm-project#75481 Breaks multiple bots, see #75481
*	[flang] Fixed MODULO(x, inf) to produce NaN. (#86145)	Slava Zakharin	2024-04-03	5	-15/+105
\| \| \| \| \|	Straightforward computation of `A − FLOOR (A / P) * P` should produce NaN, when P is infinity. The -menable-no-infs lowering can still use the relaxed operations sequence.
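The NaN result follows directly from IEEE-754 arithmetic, which a small sketch can confirm. This uses Python floats as a stand-in for Fortran REAL values; `modulo_straightforward` is a hypothetical name and is not part of the flang runtime.

```python
import math


def modulo_straightforward(a: float, p: float) -> float:
    """Evaluate A - FLOOR(A / P) * P step by step."""
    quotient = a / p                  # finite / inf gives 0.0 (or -0.0)
    floored = math.floor(quotient)    # floor of a zero quotient is 0
    return a - floored * p            # 0 * inf is NaN, so the result is NaN


finite_case = modulo_straightforward(5.5, 2.0)
negative_case = modulo_straightforward(-1.0, 3.0)
infinite_case = modulo_straightforward(5.0, math.inf)
```

For finite `P` the formula gives the usual result with the sign of `P`; for infinite `P` the product `0 * inf` is NaN and propagates through the subtraction.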
*	[SLP]Fix PR87477: fix alternate node cast cost/codegen.	Alexey Bataev	2024-04-03	2	-25/+74
\| \| \| \| \|	Have to compare actual type size to pick up proper cast operation opcode.
*	[Offload][NFC] Add offload subfolder and README (#77154)	Johannes Doerfert	2024-04-03	1	-0/+20
\| \| \| \| \| \| \| \|	The readme only states the goal and has links to further information, e.g., our meetings. --------- Co-authored-by: Shilei Tian <i@tianshilei.me>
*	[CodeGen] Fix a warning	Kazu Hirata	2024-04-03	1	-1/+1
\| \| \| \| \| \| \| \|	This patch fixes: clang/lib/CodeGen/CGExpr.cpp:5607:11: error: variable 'Result' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
*	[mlir][math] Convert math.fpowi to math.powf in case of non constant (#87472)	Prashant Kumar	2024-04-03	2	-5/+63
\| \| \| \|	Convert math.fpowi to math.powf by converting dtype of power operand to floating point.
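The conversion amounts to casting the integer power operand to floating point and calling the floating-point power function. A minimal Python analogue, assuming `math.pow` as a stand-in for `math.powf` (the helper name `fpowi_via_powf` is hypothetical and this is not the MLIR pattern itself):

```python
import math


def fpowi_via_powf(base: float, exponent: int) -> float:
    """Compute base**exponent by first converting the exponent to float."""
    float_exponent = float(exponent)   # the dtype conversion of the power operand
    return math.pow(base, float_exponent)


cube = fpowi_via_powf(2.0, 3)
inverse_square = fpowi_via_powf(2.0, -2)
```

Per the commit title, this path applies when the power operand is not a constant.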
*	[SamplePGO] Support -salvage-stale-profile without probes too (#86116)	Krzysztof Pszeniczny	2024-04-03	3	-4/+256
\| \| \| \| \| \| \| \| \|	Currently -salvage-stale-profile is a no-op if the profile is not probe-based. We observed that it can help for regular, non-probe-based profiles too: some of our internal benchmarks show 0.2-0.3% QPS improvement. There seems to be no good reason to limit this flag to only work for probe-based profiles.