Commit message | Author | Age | Files | Lines
* [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (#77795) (upstream/users/mbrkusanin/gfx12-wmma-swmmac-backport) | Mirko Brkušanin | 2024-01-25 | 65 | -111/+17708
|   Co-authored-by: Petar Avramovic <Petar.Avramovic@amd.com>
|   Co-authored-by: Piotr Sobczak <piotr.sobczak@amd.com>
* Use rc version suffix | Tom Stellard | 2024-01-23 | 1 | -1/+1
|
* Bump version to 18.1.0 | Tom Stellard | 2024-01-23 | 4 | -4/+4
|
* [RISCV][MC] Split tests for A into Zaamo and Zalrsc parts | Wang Pengcheng | 2024-01-24 | 8 | -71/+92
|   So that we don't duplicate tests in later patch.
|
|   Reviewers: topperc, dtcxzyw, asb
|   Reviewed By: asb
|   Pull Request: https://github.com/llvm/llvm-project/pull/79111
* [RISCV] Add sifive-p670 processor (#79015) | Michael Maitland | 2024-01-23 | 4 | -3/+88
|   This is an OOO core that has a vector unit. For more information see https://www.sifive.com/cores/performance-p650-670.
|
|   Scheduler model and other tuning will come in separate patches.
* [llc] Remove C backend support (#79237) | paperchalice | 2024-01-24 | 1 | -11/+3
|   The C backend was removed in 3.1.
* [Modules] [HeaderSearch] Don't reenter headers if it is pragma once (#76119) | Chuanqi Xu | 2024-01-24 | 2 | -39/+58
|   Close https://github.com/llvm/llvm-project/issues/73023
|
|   The direct issue of https://github.com/llvm/llvm-project/issues/73023 is that we entered a header marked as pragma once because the compiler thinks that is OK if there is a controlling macro. That doesn't make sense. I feel like it should be sufficient to skip the header after we see the '#pragma once'.
|
|   From the context, it looks like the workaround is primarily for ObjectiveC, so we might need reviewers from OC.
* [gn build] port 7e50f006f7f6 | Nico Weber | 2024-01-23 | 3 | -2/+9
|
* [LSR] Fix incorrect comment. NFC (#79207) | Craig Topper | 2024-01-23 | 1 | -1/+1
|
* [AMDGPU] Pick available high VGPR for CSR SGPR spilling (#78669) | Christudasan Devadasan | 2024-01-24 | 31 | -4231/+4513
|   CSR SGPR spilling currently uses the earliest available physical VGPRs, which imposes high register pressure when trying to allocate large VGPR tuples within the default register budget. This patch changes the spilling strategy to pick VGPRs in reverse order, highest available VGPR first, and to shift them back to the lowest available range after regalloc. The initial VGPRs then remain available for allocation, which makes it more likely that a large number of contiguous registers can be found.
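The two-phase strategy described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: registers are plain integers, and the helper names `pick_highest_free` and `shift_to_lowest` are hypothetical and do not correspond to functions in the AMDGPU backend.

```python
def pick_highest_free(free, count):
    """Phase 1: reserve `count` spill registers, highest index first."""
    return sorted(free, reverse=True)[:count]


def shift_to_lowest(spill_regs, allocated, total):
    """Phase 2 (after register allocation): remap the reserved spill
    registers onto the lowest indices that the allocator left unused."""
    free_low = [r for r in range(total) if r not in allocated]
    return {old: new for old, new in zip(sorted(spill_regs), free_low)}


# Toy example with 8 registers and 2 spill slots.
spills = pick_highest_free(set(range(8)), 2)          # reserved from the top
remap = shift_to_lowest(spills, allocated={0, 1, 2}, total=8)
```

In this model the low registers stay untouched while allocation runs, and the spill registers only move down once the allocator's choices are known.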
* [NewPM][CodeGen][llc] Add NPM support (#70922) | paperchalice | 2024-01-24 | 17 | -203/+411
|   Add new pass manager support to `llc`. Users can use `--passes=pass1,pass2...` to run mir passes, and use `--enable-new-pm` to run default codegen pipeline.
|
|   This patch is taken from [D83612](https://reviews.llvm.org/D83612), the original author is @yuanfang-chen.
|
|   ---------
|
|   Co-authored-by: Yuanfang Chen <455423+yuanfang-chen@users.noreply.github.com>
* [ELF,test] Improve dead-reloc-in-nonalloc.s | Fangrui Song | 2024-01-23 | 1 | -12/+23
|   Test an absolute relocation referencing a DSO symbol, relocating a non-SHF_ALLOC section. Also test --gc-sections.
* [SROA] Only try additional vector type candidates when needed (#77678) | Jeffrey Byrnes | 2024-01-23 | 2 | -75/+75
|   https://github.com/llvm/llvm-project/commit/f9c2a341b94ca71508dcefa109ece843459f7f13 causes regressions when we have a slice with integer vector type that is the same size as the partition, and a ptr load/store slice that is not the size of the element type.
|
|   Ref `vector-promotion.ll:ptrLoadStoreTys`. Before the patch, we would only consider `<4 x i32>` as a candidate type for vector promotion, and would find that it is a viable type for all the slices. After the patch, we now add `<2 x ptr>` as a candidate type due to slice with user `store ptr %val0, ptr %obj, align 8` -- and flag that we `HaveVecPtrTy`. The pre-existing behavior of this flag results in removing the viable `<4 x i32>` and keeping only the unviable `<2 x ptr>`, which results in a failure to promote.
|
|   The end result is failing to promote an alloca that was previously promoted -- this does not appear to be the intent of that patch, which has the goal of increasing promotions by providing more promotion opportunities.
|
|   This PR preserves this behavior via a simple reorganization of the implementation: try first the slice types with same size as the partition, then, if there is no promotable type, try the `LoadStoreTys`.
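The reorganized candidate ordering can be sketched as a two-stage search. This is a simplified model, not the SROA implementation: types are plain strings, and `choose_promotion_type` and its `is_viable` callback are hypothetical names introduced for illustration.

```python
def choose_promotion_type(same_size_types, load_store_types, is_viable):
    """Try candidates whose size matches the partition first; fall back
    to the load/store-derived candidates only if none of those is viable."""
    for group in (same_size_types, load_store_types):
        for ty in group:
            if is_viable(ty):
                return ty
    return None  # no promotable vector type


# Scenario from the commit message: <4 x i32> is viable, <2 x ptr> is not.
viable = {"<4 x i32>"}
chosen = choose_promotion_type(["<4 x i32>"], ["<2 x ptr>"], viable.__contains__)
```

Because the second group is consulted only when the first yields nothing, adding extra candidates cannot displace a type that was already promotable.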
* [LoongArch] Insert nops and emit align reloc when handle alignment directive (#72962) | Jinyang He | 2024-01-24 | 6 | -2/+203
|   Following RISCV, we fix up the alignment if linker relaxation changes code size and breaks alignment. Insert enough nops and emit an R_LARCH_ALIGN relocation so that the linker can satisfy the alignment by removing nops. This is done only in sections with the SHF_EXECINSTR flag.
|
|   In LoongArch psABI v2.30, R_LARCH_ALIGN requires a symbol index. The lowest 8 bits of the addend represent the alignment and the remaining bits represent the maximum number of bytes to emit.
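The addend layout described above can be sketched as simple bit packing. This is an illustration only: the helper names are hypothetical, and the commit message does not say whether the 8-bit alignment field holds a byte count or an exponent, so the sketch treats it as an opaque value.

```python
ALIGN_FIELD_BITS = 8  # low bits of the addend, per the commit message
ALIGN_FIELD_MASK = (1 << ALIGN_FIELD_BITS) - 1


def encode_align_addend(align_field, max_bytes):
    """Pack the alignment field into the low 8 bits and the maximum
    number of bytes to emit into the remaining bits."""
    if not 0 <= align_field <= ALIGN_FIELD_MASK:
        raise ValueError("alignment field must fit in 8 bits")
    if max_bytes < 0:
        raise ValueError("max_bytes must be non-negative")
    return (max_bytes << ALIGN_FIELD_BITS) | align_field


def decode_align_addend(addend):
    """Split an addend back into (align_field, max_bytes)."""
    return addend & ALIGN_FIELD_MASK, addend >> ALIGN_FIELD_BITS


addend = encode_align_addend(4, 12)
```

Decoding is the exact inverse of encoding, so a consumer of the relocation can recover both fields from the single addend value.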
* [Github] Only run libclang-python-tests on monorepo main | Aiden Grossman | 2024-01-23 | 1 | -0/+3
|   The libclang python binding test CI job currently doesn't have any restrictions on what branches it will run on when something is pushed and also isn't restricted to the monorepo. This patch adds a branch restriction for the push event, only running the CI job when something is pushed to the main branch (and the path filter is met), and also adds a filter to ensure that the job comes from a PR against the monorepo or a push to a branch in the monorepo.
* [AsmPrinter] Remove mbb-profile-dump flag (#76595) | Aiden Grossman | 2024-01-23 | 3 | -114/+0
|   Now that the work embedding PGO information in SHT_LLVM_BB_ADDR_MAP ELF sections has landed, there is no longer a need to keep around the mbb-profile-dump flag.
* [SROA] NFC: Precommit test for pull/77678 | Jeffrey Byrnes | 2024-01-23 | 1 | -0/+168
|   Change-Id: I6b2346301f9bd840a0adceba4a0d03e9932af245
* [mlir] Add example of `printAlias` to test dialect (NFC) (#79232) | Jeff Niu | 2024-01-23 | 3 | -2/+45
|   Follow-up from previous pull request. Motivate the API change with an attribute that decides between sugaring a sub-attribute or using an alias.
* [RISCV] Support TLSDESC in the RISC-V backend (#66915) | Paul Kirth | 2024-01-23 | 24 | -19/+479
|   This patch adds basic TLSDESC support in the RISC-V backend.
|
|   Specifically, we add new relocation types for TLSDESC, as prescribed in https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373, and add a new pseudo instruction to simplify code generation.
|
|   This patch does not try to optimize the local dynamic case, which can be improved in separate patches.
|
|   Linker side changes will also be handled separately.
|
|   The current implementation is only enabled when passing the new `-enable-tlsdesc` codegen flag.
* [lldb] Improve maintainability and readability for ValueObject methods (#75865) | Pete Lawrence | 2024-01-23 | 1 | -166/+164
|   As I worked through changes to another PR (https://github.com/llvm/llvm-project/pull/74912), I couldn't help but rewrite a few methods for readability, maintainability, and possibly some behavior correctness too.
|
|   1. Exiting early instead of nested `if`-statements, which:
|      - Reduces indentation levels for all subsequent lines
|      - Treats missing pre-conditions similar to an error
|      - Clearly indicates that the full length of the method is the "happy path".
|   2. Explicitly return empty Value Object shared pointers for those error (like) situations, which:
|      - Reduces the time it takes a maintainer to figure out what the method actually returns based on those conditions.
|   3. Converting a mix of `if` and `if`-`else`-statements around an enum into one `switch` statement, which:
|      - Consolidates the former branching logic
|      - Lets the compiler warn you of a (future) missing enum case
|      - This one may actually change behavior slightly, because what was an early test for one enum case, now happens later on in the `switch`.
|   4. Consolidating near-identical, "copy-pasta" logic into one place, which:
|      - Separates the common code from the diverging paths.
|      - Highlights the differences between the code paths.
|
|   rdar://119833526
* [nfc][clang] Fix test in new-array-init.cpp (#79225) | Alan Zhao | 2024-01-23 | 1 | -1/+1
|   This test was originally introduced in https://github.com/llvm/llvm-project/pull/76976, but it incorrectly tests braced-list initialization instead of parenthesized initialization.
* [SROA] NFC: Extract code to checkVectorTypesForPromotion | Jeffrey Byrnes | 2024-01-23 | 1 | -82/+99
|   Change-Id: Ib6f237cc791a097f8f2411bc1d6502f11d4a748e
* [libc] remove redundant call_once (#79226) | Nick Desaulniers | 2024-01-23 | 2 | -78/+0
|   Missed cleanup from https://reviews.llvm.org/D134716.
|
|   Fixes: #79220
* [Docs][DebugInfo][RemoveDIs] Document some debug-info transition info (#79167) | Jeremy Morse | 2024-01-23 | 2 | -0/+111
|   This is a high level description and FAQ for what we're doing in RemoveDIs, and how old code should behave with new debug-info (exactly the same 99% of the time).
* Revert "[ASan][libc++] Turn on ASan annotations for short strings (#79049)" | Thurston Dang | 2024-01-23 | 5 | -429/+34
|   This reverts commit cb528ec5e6331ce207c7b835d7ab963bd5e13af7.
|
|   Reason: buildbot breakage (https://lab.llvm.org/buildbot/#/builders/5/builds/40364):
|   SUMMARY: AddressSanitizer: container-overflow /b/sanitizer-x86_64-linux-fast/build/libcxx_build_asan_ubsan/include/c++/v1/string:1870:29 in __get_long_pointer
* [DebugInfo][RemoveDIs] "Final" cleanup for non-instr debug-info (#79121) | Jeremy Morse | 2024-01-23 | 6 | -9/+21
|   Here's a raft of minor fixes for the RemoveDIs project that's replacing dbg.value intrinsics with DPValue objects, all IMO trivial:
|   * When inserting functions or blocks and calling setIsNewDbgInfoFormat, do that after setting the Parent pointer, just in case conversion from (or to) dbg.value mode is triggered.
|   * When transferring DPValues from an empty range in a splice call, don't transfer if there are no DPValues attached to the source block at all.
|   * stripNonLineTableDebugInfo should drop DPValues.
|   * In insertBefore, don't try to transfer DPValues if there aren't any.
* [mlir][ArithToAMDGPU] Add option for saturating truncation to fp8 (#74153) | Krzysztof Drewniak | 2024-01-23 | 9 | -77/+208
|   Many machine-learning applications (and most software written at AMD) expect the operation that truncates floats to 8-bit floats to be saturating. That is, they expect `truncf 256.0 : f32 to f8E4M3FNUZ` to yield `240.0`, not `NaN`, and similarly for negative numbers. However, the underlying hardware instruction that can be used for this truncation implements overflow-to-NaN semantics.
|
|   To enable handling this use case, we add the saturate-fp8-truncf option to ArithToAMDGPU (off by default), which causes the requisite clamping code to be emitted. Said clamping code ensures that Inf and NaN are passed through exactly (and thus truncate to NaN).
|
|   Per review feedback, this commit refactors createScalarOrSplatConstant() to the Arith dialect utilities and uses it in this code. It also fixes naming of existing patterns and switches from vector.extractelement/insertelement to vector.extract/insert.
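The clamping behavior described above can be modeled on scalar values. This is a sketch of the semantics only, not the MLIR lowering: the function name `saturate_before_truncf` is hypothetical, and the bound of 240.0 is taken from the `f8E4M3FNUZ` example in the commit message.

```python
import math

F8E4M3FNUZ_MAX = 240.0  # largest finite value, per the commit message


def saturate_before_truncf(x, max_finite=F8E4M3FNUZ_MAX):
    """Clamp finite inputs to [-max_finite, max_finite]; pass Inf and NaN
    through unchanged so the later truncation still turns them into NaN."""
    if math.isnan(x) or math.isinf(x):
        return x
    return max(-max_finite, min(max_finite, x))


clamped = saturate_before_truncf(256.0)
```

Applying the clamp before an overflow-to-NaN truncation gives saturating results for large finite inputs while leaving non-finite inputs to follow the hardware's behavior.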
* [mlir][sparse] adjust compression scheme for example (#79212) | Aart Bik | 2024-01-23 | 1 | -3/+3
|
* [NFCI] Move SANITIZER_WEAK_IMPORT to sanitizer_common (#79208) | Chris Apple | 2024-01-23 | 2 | -8/+8
|   SANITIZER_WEAK_IMPORT is useful for any call that needs to be conditionally linked in. This is currently used for the tsan_dispatch_interceptors, but can be used for other calls introduced in newer versions of MacOS (such as `aligned_alloc` in this PR https://github.com/llvm/llvm-project/pull/79198).
|
|   This PR moves the definition to a higher level so it can be used in other sanitizers.
* AMDGPU: Add SourceOfDivergence for int_amdgcn_global_load_tr (#79218) | Changpeng Fang | 2024-01-23 | 2 | -0/+82
|
* [Clang][Driver] Fix `--save-temps` for OpenCL AoT compilation (#78333) | Shilei Tian | 2024-01-23 | 3 | -3/+26
|   We can directly call `clang -c -x cl -target amdgcn -mcpu=gfx90a test.cl -o test.o` to compile an OpenCL kernel file. However, when `--save-temps` is enabled, it doesn't work because the preprocessed file (`.i` file) is treated as a C source file when it is fed to the front end, which causes a compilation error because the OpenCL keywords can't be recognized. This patch fixes the issue.
* [misc-coroutine-hostile-raii] Use getOperand instead of getCommonExpr. (#79206) | Utkarsh Saxena | 2024-01-23 | 2 | -2/+34
|   We were previously allowlisting awaitable types returned by `await_transform` instead of the type of the operand of the `co_await` expression.
|
|   This used to give false positives and did not respect the `AllowedAwaitablesList` flag when `await_transform` is used. See added test cases for such examples.
* [clang][FatLTO] Avoid UnifiedLTO until it can support WPD/CFI (#79061) | Paul Kirth | 2024-01-23 | 14 | -72/+92
|   Currently, the UnifiedLTO pipeline seems to have trouble with several LTO features, like SplitLTO units, which means we cannot use important optimizations like Whole Program Devirtualization or security hardening instrumentation like CFI. This patch reverts FatLTO to using distinct pipelines for Full LTO and ThinLTO. It still avoids module cloning, since that was error prone.
* [libc++] Fix outdated release procedure for release notes | Louis Dionne | 2024-01-23 | 1 | -1/+1
|
* [Preprocessor][test] Test ARM64EC definitions (#78916) | Billy Laws | 2024-01-23 | 1 | -0/+370
|
* [lldb][NFCI] Remove unused method BreakpointIDList::AddBreakpointID(const char *) (#79189) | Alex Langford | 2024-01-23 | 2 | -11/+0
|   This overload is completely unused.
* [mlir][Target] Teach dense_resource conversion to LLVMIR Target (#78958) | Kunwar Grover | 2024-01-23 | 3 | -0/+174
|   This patch adds support for translating dense_resource attributes to LLVMIR Target. The support added is similar to how DenseElementsAttr is handled, except we don't need to handle splats.
|
|   Another possible way of doing this is adding iteration on dense_resource, but that is non-trivial as DenseResourceAttr is not meant to be something you should directly access. It has subclasses which you are supposed to use to iterate on it.
* Added feature in llvm-profdata merge to filter functions from the profile (#78378) | William Junda Huang | 2024-01-23 | 4 | -3/+151
|   `--function=<regex>` Include functions matching regex in the output
|
|   `--no-function=<regex>` Exclude functions matching regex from the output
|
|   If both are specified, `--no-function` has a higher precedence if a function name matches both filters
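The precedence rule between the two filters can be sketched as follows. This models only the behavior stated in the commit message and is not llvm-profdata's code: `keep_function` is a hypothetical helper, and using a partial match (`re.search`) rather than a full match is an assumption.

```python
import re


def keep_function(name, include=None, exclude=None):
    """Decide whether a function stays in the merged profile.
    The exclude filter wins when a name matches both patterns."""
    if exclude is not None and re.search(exclude, name):
        return False
    if include is not None:
        return re.search(include, name) is not None
    return True  # no filters given: keep everything


names = ["foo", "foo_bar", "baz"]
kept = [n for n in names if keep_function(n, include="foo", exclude="bar")]
```

Checking the exclude pattern first is what gives it higher precedence: a name that matches both patterns is rejected before the include pattern is consulted.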
* [libc] Fix implicit conversion in FEnvImpl for arm32 targets. (#79210) | lntue | 2024-01-23 | 1 | -2/+2
|
* [clang] Use LazyDetector for all toolchains. (#79073) | Juergen Ributzka | 2024-01-23 | 6 | -19/+22
|   Use the LazyDetector also for the remaining toolchains to avoid unnecessarily checking for the Cuda and Rocm installations.
|
|   This fixes rdar://121397534.
* [PowerPC] lower partial vector store cost (#78358) | RolandF77 | 2024-01-23 | 2 | -2/+22
|   There are matching store opcodes (stfd, stxsiwx) for the load opcodes that make 32-bit and 64-bit vector operations cheap with VSX, so stores should also be cheap.
* [ELF,test] Actually fix defsym.ll | Fangrui Song | 2024-01-23 | 1 | -1/+1
|
* [ELF,test] Fix defsym.ll | Fangrui Song | 2024-01-23 | 1 | -10/+12
|
* [SLP]Fix PR79193: skip analysis of gather nodes for minbitwidth. | Alexey Bataev | 2024-01-23 | 2 | -1/+17
|   There is no need to analyze small graphs that contain only a gather node; skipping them avoids a crash.
* [IndVars] Add NUW variants to iv-poison.ll and variants with extra uses. | Florian Hahn | 2024-01-23 | 1 | -0/+242
|
* [Format] Fix detection of languages when reading from stdin (#79051) | Ben Hamilton (Ben Gertzfield) | 2024-01-23 | 3 | -11/+26
|   The code cleanup in #74794 accidentally broke detection of languages by reading file content from stdin, e.g. via `clang-format -dump-config - < /path/to/filename`.
|
|   This PR adds unit and integration tests to reproduce the issue and adds a fix.
|
|   Fixes: #79023
* [libc++] Run the nightly libc++ build at 03:00 Eastern for real (#79184) | Louis Dionne | 2024-01-23 | 1 | -2/+2
|   The nightly libc++ build was incorrectly set up to run at 22:00 Eastern when it was intended to run at 03:00 Eastern. This patch fixes that.
* [libc] Fix aliasing function name got accidentally deleted in #79128. (#79203) | lntue | 2024-01-23 | 1 | -0/+1
|
* [RISCV] Move FeatureStdExtH in RISCVFeatures.td. NFC | Craig Topper | 2024-01-23 | 1 | -8/+10
|   It was accidentally placed in the middle of the floating point extensions after the recent reordering.
* Revert "[libc] Fix forward arm32 buildbot" (#79201) | Roland McGrath | 2024-01-23 | 1 | -2/+2
|   Reverts llvm/llvm-project#79151, necessary to revert #79128, which broke all production builds and landed without review.