| Commit message (Collapse) | Author | Age | Files | Lines |
|
While implementing the UTC clock it turned out that the implementation of
the leap seconds was not correct: it should store the individual value,
not the sum.
It also looks like LWG3359 has not been fully implemented.
Implements parts of:
- LWG3359 <chrono> leap second support should allow for negative leap seconds
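A minimal sketch of the distinction, not the libc++ implementation: each entry keeps its own value (+1s, or -1s as LWG3359 permits), and the accumulated offset is derived by summing on demand. The `LeapEntry` type, the `cumulative_offset` helper, and the sample dates used with it are hypothetical.

```cpp
#include <chrono>
#include <vector>

// Hypothetical table entry: when the leap second takes effect and its
// individual value (+1s for an inserted, -1s for a removed leap second).
struct LeapEntry {
  std::chrono::seconds date;   // seconds since some epoch (illustrative)
  std::chrono::seconds value;  // individual value, NOT a running total
};

// Derives the accumulated offset at time `t` by summing the individual
// values of all entries at or before `t`. Assumes `table` is sorted by date.
inline std::chrono::seconds
cumulative_offset(const std::vector<LeapEntry>& table, std::chrono::seconds t) {
  std::chrono::seconds sum{0};
  for (const LeapEntry& e : table) {
    if (e.date > t)
      break;
    sum += e.value;
  }
  return sum;
}
```

Storing the individual value keeps a negative leap second representable as a single entry; a stored running sum would hide the sign of each step.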
|
Implements parts of:
- P0355 Extending to Calendars and Time Zones
|
Testing with the get_info() returning a local_info revealed some issues
in the reverse lookup. This needed an additional quirk. Also, the
optimization that skips entries outside the current continuation was
wrong: it prevented merging two sys_info objects.
|
This adds a new test fixture class FEnvSafeTest (usable as a base
class for other fixtures) that ensures each test doesn't perturb
the `fenv_t` state that the next test will start with. It also
provides types and methods tests can use to explicitly wrap code
under test either to check that it doesn't perturb the state or
to save and restore the state around particular test code.
All the fenv and math tests are updated to use this so that none
can affect another. Expectations that code under test and/or
tests themselves don't perturb state can be added later.
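The save-and-restore idea can be sketched with the standard `<cfenv>` calls. This is only an illustration under assumed names: `ScopedFEnvRestore` and `rounding_mode_survives_perturbation` are hypothetical and are not the types FEnvSafeTest actually provides.

```cpp
#include <cfenv>

// Hypothetical RAII helper: captures the floating-point environment on
// construction and restores it on destruction.
class ScopedFEnvRestore {
public:
  ScopedFEnvRestore() { std::fegetenv(&saved_); }
  ~ScopedFEnvRestore() { std::fesetenv(&saved_); }
  ScopedFEnvRestore(const ScopedFEnvRestore&) = delete;
  ScopedFEnvRestore& operator=(const ScopedFEnvRestore&) = delete;

private:
  std::fenv_t saved_;
};

// Changes the rounding mode inside a guarded scope and reports whether the
// original mode is back in effect once the guard has been destroyed.
inline bool rounding_mode_survives_perturbation() {
  const int before = std::fegetround();
  {
    ScopedFEnvRestore guard;
    std::fesetround(before == FE_TOWARDZERO ? FE_TONEAREST : FE_TOWARDZERO);
  }
  return std::fegetround() == before;
}
```

A fixture could hold such a guard for the duration of each test so that one test's changes never leak into the next.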
|
This test records the current behavior of HWASan, which doesn't utilize
the fixed shadow intrinsics of
https://github.com/llvm/llvm-project/commit/365bddf634993d5ea357e9715d8aacd7ee40c4b5
It is intended to be updated in future work ("Optimize outlined
memaccess for fixed shadow on Aarch64";
https://github.com/llvm/llvm-project/pull/88544)
|
The interesting bit is the zext folding. This is the first case where we
end up with a profitable fold of shNadd (zext x), y to shNadd.uw x, y.
See zext_mul68 from rv64zba.ll.
The test differences are cases where we can legally fold (only because
there's no one-use check). These are neither profitable nor harmful, but
we can't add a one-use check without breaking the zext_mul68 case.
Note that XTHeadBa doesn't appear to have the equivalent patterns so
this only shows up in Zba.
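Why the fold is legal can be seen from the arithmetic of the two instructions. The sketch below models the Zba semantics in plain C++; the function names `shNadd`, `shNadd_uw`, and `zext32` are illustrative helpers, not LLVM APIs.

```cpp
#include <cstdint>

// Model of Zba shNadd on RV64: rd = (rs1 << N) + rs2.
inline uint64_t shNadd(unsigned n, uint64_t rs1, uint64_t rs2) {
  return (rs1 << n) + rs2;
}

// Model of shNadd.uw: the low 32 bits of rs1 are zero-extended first.
inline uint64_t shNadd_uw(unsigned n, uint64_t rs1, uint64_t rs2) {
  return (static_cast<uint64_t>(static_cast<uint32_t>(rs1)) << n) + rs2;
}

// Model of a zext from i32 to i64 applied to the low half of a register.
inline uint64_t zext32(uint64_t x) { return static_cast<uint32_t>(x); }
```

Under this model, `shNadd(n, zext32(x), y)` and `shNadd_uw(n, x, y)` yield the same value for every `x` and `y`, so the explicit zero extension can be absorbed into the `.uw` form.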
|
CodeGenFunction::generateAwaitSuspendWrapper (#89731)
Fixes https://github.com/llvm/llvm-project/issues/89723
|
Resolves #88065
Added macros and functions.
|
Test that ld.lld --debug-names (#86508) built per-module index can be
consumed by lldb. This has uncovered a bug during the development of the
lld feature.
|
This PR adds the following options to the AddDebugInfo pass.
1. IsOptimized flag.
2. Level of debug info to generate.
3. Name of the source file.
This enables us to remove the hard coded values from the code. It also
allows us to test the pass with different options. The tests have been
modified to take advantage of that.
The calling convention flag and producer name have also been improved.
|
This opens the door to reusing reassociation optimizations on
target-specific binary operations with a non-standard operand list.
This is effectively an NFC.
|
CMake has landed experimental support for using the Standard modules.
This will be part of the CMake 3.30 release. This updates the build
instructions to use modules with CMake.
The changes have been tested locally.
---------
Co-authored-by: Will Hawkins <whh8b@obs.cr>
|
Motivation: LLDB is able to report errors about these scenarios whereas
LLVM's DWARF parser only gives a boolean success/fail. I want to migrate
LLDB to using LLVM's DWARFUnitHeader class, but I don't want to lose
some of the error reporting, so I'm adding it to the LLVM class first.
|
The clang-tidy selection has been made automatic recently so this is no
longer needed.
Thanks to Louis for spotting this.
|
This patch adds the PFM counter definitions for Intel alderlake CPUs.
|
When trying to express a time before the epoch (e.g. "one nanosecond
before 00:01:40 on 1900-01-01"), the date would be shown as:
1900-01-01 00:01:39.-00000001
After this patch, that time would be correctly shown as:
1900-01-01 00:01:39.999999999
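One common way to avoid a negative fractional field is to split the nanosecond count with floor semantics, so the fraction is always non-negative. The sketch below shows that technique in isolation; `SplitTime`, `split_floor`, and `format_fraction` are hypothetical names, and this is not claimed to be the code changed by the patch.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

struct SplitTime {
  int64_t seconds;      // whole seconds, rounded toward negative infinity
  int64_t nanoseconds;  // always in [0, 999999999]
};

// Splits a signed nanosecond count relative to an epoch into whole seconds
// and a non-negative sub-second remainder.
inline SplitTime split_floor(int64_t total_ns) {
  const int64_t kNsPerSec = 1000000000;
  int64_t sec = total_ns / kNsPerSec;  // C++ division truncates toward zero
  int64_t ns = total_ns % kNsPerSec;   // remainder has the sign of total_ns
  if (ns < 0) {                        // borrow one second for negative input
    ns += kNsPerSec;
    sec -= 1;
  }
  return {sec, ns};
}

// Formats only the fractional part as nine zero-padded digits.
inline std::string format_fraction(int64_t total_ns) {
  char buf[16];
  std::snprintf(buf, sizeof buf, "%09lld",
                static_cast<long long>(split_floor(total_ns).nanoseconds));
  return buf;
}
```

With this split, one nanosecond before the epoch becomes second -1 plus 999999999 ns, which matches the corrected display above.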
|
This patch exports the `std::ranges::range_adaptor_closure` class
template implemented in #89148 from the C++ Modules file.
|
Newer versions allow `pure`, `elemental` and `recursive` on device
subprograms.
|
We increment `NumOfCSPGOFunc` and `NumOfPGOFunc` in
`PGOUseFunc::readCounters()` already. We should do the same in
`PGOUseFunc::populateCoverage`.
https://github.com/llvm/llvm-project/blob/83bc7b57714dc2f6b33c188f2b95a0025468ba51/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp#L1331
|
In case the first element of a zip/uzp mask is undef, the isZIPMask and
isUZPMask functions have a 50% chance of picking the wrong
"WhichResult", meaning they don't match a zip/uzp where they could. This
patch alters the matching code to first check for the first non-undef
element, to try and get WhichResult correct.
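The idea can be sketched outside of LLVM as follows. This is a simplified stand-alone model, not the AArch64 backend code: `is_zip_mask` is a hypothetical function over a plain `std::vector<int>`, negative entries stand for undef, and treating an all-undef mask as zip1 is a choice made for this sketch only.

```cpp
#include <cstddef>
#include <vector>

// Returns true if `mask` matches zip1 (which_result = 0) or zip2
// (which_result = 1) of two N-element inputs, where N == mask.size().
// Negative mask entries denote undef lanes and match anything.
inline bool is_zip_mask(const std::vector<int>& mask, unsigned& which_result) {
  const std::size_t n = mask.size();
  if (n == 0 || n % 2 != 0)
    return false;

  // Locate the first defined lane instead of trusting lane 0.
  std::size_t first = 0;
  while (first < n && mask[first] < 0)
    ++first;
  if (first == n) {  // all undef: either form matches; pick zip1 here
    which_result = 0;
    return true;
  }

  // The value zip1 would place in lane `first`.
  const int zip1_expected =
      static_cast<int>(first / 2 + (first % 2 != 0 ? n : 0));
  which_result = (mask[first] == zip1_expected) ? 0 : 1;

  // Check every defined lane against the chosen form.
  std::size_t idx = which_result * n / 2;
  for (std::size_t i = 0; i < n; i += 2, ++idx) {
    if (mask[i] >= 0 && static_cast<std::size_t>(mask[i]) != idx)
      return false;
    if (mask[i + 1] >= 0 && static_cast<std::size_t>(mask[i + 1]) != idx + n)
      return false;
  }
  return true;
}
```

Deciding `which_result` from the first defined lane removes the guess that an undef lane 0 would otherwise force.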
|
to use FDIV
Use of FDIV allows us to show a definite cost improvement with #88899
|
Legalizer (#88469)
It does not make sense to scalarize G_FREEZE as it leads to the generation
of pairs of G_UNMERGE_VALUES and G_BUILD_VECTORs which are difficult to
optimize especially when operations like G_TRUNC operate before G_FREEZE
but after G_UNMERGE_VALUES.
Instead, it is better to legalize G_FREEZE like any other vector type
would be, as it gets lowered to a COPY during instruction selection
anyways.
This is an issue that was encountered when looking at the TSVC
benchmark, where the legalization of G_FREEZE would cause generation of
unnecessary MOVs that adversely affected the performance.
|
Reduce diffs in #88899
|
Complete support for rsqrt.approx with rsqrt.approx.f64 ([PTX ISA
9.7.3.17. Floating Point Instructions:
rsqrt.approx.ftz.f64](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#floating-point-instructions-rsqrt-approx-ftz-f64)).
Additionally, add support for folding `sqrt` into `rsqrt`, with an
optional flag to disable.
|
Reverts llvm/llvm-project#89342 due to build failure
|
The comment is misleading because `propertiesAttr` is not actually
ignored when the operation isn't unregistered.
|
(#89263)"
Changes since original commit:
* Rebase over improved test coverage for theadba
* Revert change to use TargetConstant as it appears to prevent the uimm2
clause from matching in the XTheadBa patterns.
* Fix an order of operands bug in the THeadBa pattern visible in the new
test coverage.
Original commit message follows:
This implements a RISCV specific version of the SHL_ADD node proposed in
https://github.com/llvm/llvm-project/pull/88791.
If that lands, the infrastructure from this patch should seamlessly
switch over to the generic DAG node. I'm posting this separately because
I've run out of useful multiply strength reduction work to do without
having a way to represent MUL X, 3/5/9 as a single instruction.
The majority of this change is moving two sets of patterns out of
tablegen and into the post-legalize combine. The major reason for this is
that I have an upcoming change which needs to reuse the expansion logic,
but it also helps common up some code between zba and the THeadBa
variants.
On the test changes, there are a few major categories:
* We chose a different lowering for mul x, 25. The new lowering involves
one fewer register and the same critical path, so this seems like a win.
* The order of the two multiplies changes in (3,5,9)*(3,5,9) in some
cases. I don't believe this matters.
* I'm removing the one use restriction on the multiply. This restriction
doesn't really make sense to me, and the test changes appear positive.
|
|
(#89642)
For ASan, users already manually have to pass in the path to the lib,
and for other libraries they have to pass in the path to the libpath.
With LLVM's unreliable name of the lib (due to
LLVM_ENABLE_PER_TARGET_RUNTIME_DIR confusion and whatnot), it's useful
to be able to opt in to just explicitly passing the paths to the libs
everywhere.
Follow-up of sorts to https://reviews.llvm.org/D65543, and to #87866.
|
This commit implements runtime verification for LinalgStructuredOps
using the existing `RuntimeVerifiableOpInterface`. The verification
checks that the runtime sizes of the operands match the runtime sizes
inferred by composing the loop ranges with the op's indexing maps.
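The check can be illustrated with a heavily simplified model. It assumes each indexing map is a projected permutation, written as the list of loop dimensions that index the operand; real Linalg indexing maps are general affine maps, and the names `IndexingMap`, `infer_operand_shape`, and `verify_operand` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified indexing map: operand dimension i is indexed by loop map[i].
using IndexingMap = std::vector<std::size_t>;

// Composes the loop ranges with an indexing map to infer the operand shape.
inline std::vector<int64_t>
infer_operand_shape(const std::vector<int64_t>& loop_ranges,
                    const IndexingMap& map) {
  std::vector<int64_t> shape;
  shape.reserve(map.size());
  for (std::size_t dim : map)
    shape.push_back(loop_ranges.at(dim));
  return shape;
}

// Runtime check: the actual operand shape must equal the inferred one.
inline bool verify_operand(const std::vector<int64_t>& loop_ranges,
                           const IndexingMap& map,
                           const std::vector<int64_t>& actual_shape) {
  return infer_operand_shape(loop_ranges, map) == actual_shape;
}
```

For a matmul with loops (m, n, k) = (4, 5, 6), the maps (m, k), (k, n), and (m, n) infer the shapes 4x6, 6x5, and 4x5; any operand whose runtime shape differs would fail verification.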
|
This patch finalizes the std::ranges::range_adaptor_closure
class template from https://wg21.link/P2387R3.
// [range.adaptor.object], range adaptor objects
template<class D>
requires is_class_v<D> && same_as<D, remove_cv_t<D>>
class range_adaptor_closure { };
The current implementation of __range_adaptor_closure was introduced
in ee44dd8062a26541808fc0d3fd5c6703e19f6016 and has served as the
foundation for the range adaptors in libc++ for a while. This patch
keeps its implementation, with the exception of the following changes:
- __range_adaptor_closure now includes the missing constraints
`is_class_v<D> && same_as<D, remove_cv_t<D>>` to restrict the
type of class that can inherit from it. (https://eel.is/c++draft/ranges.syn)
- The operator| of __range_adaptor_closure no longer requires its
first argument to model viewable_range. (https://eel.is/c++draft/range.adaptor.object#1)
- The _RangeAdaptorClosure concept is refined to exclude cases where
T models range or where T has base classes of type range_adaptor_closure<U>
for another type U. (https://eel.is/c++draft/range.adaptor.object#2)
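What the added constraint admits and rejects can be checked with plain type traits. The sketch below restates the requires-clause as a C++17 variable template so it compiles without C++23 library support; `satisfies_closure_constraints` and `MyClosure` are hypothetical names, not part of libc++.

```cpp
#include <type_traits>

// C++17 restatement of: is_class_v<D> && same_as<D, remove_cv_t<D>>.
template <class D>
constexpr bool satisfies_closure_constraints =
    std::is_class_v<D> && std::is_same_v<D, std::remove_cv_t<D>>;

struct MyClosure {};  // hypothetical user-defined adaptor closure type

// A cv-unqualified class type is accepted.
static_assert(satisfies_closure_constraints<MyClosure>);
// cv-qualified class types and non-class types are rejected.
static_assert(!satisfies_closure_constraints<const MyClosure>);
static_assert(!satisfies_closure_constraints<int>);
```

In other words, only a cv-unqualified class type `D` may be used as the template argument of the base class.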
|
Fixes #87394.
PR: https://github.com/llvm/llvm-project/pull/89160
|
This change adds the z/OS personality function to the list of known EH
personality functions. It enables removing of the EH data/labels if the
personality function is not invoked.
|
size (#83124)" (#89036)
When in-place new-ing a local variable of an array of trivial type, the
generated code now calls 'memset' with the correct size of the array.
Earlier it was generating a wrong size (the square of the typedef array
size + size).
The cause: given typedef TYPE TArray[8]; TArray x; the type of the
declarator is TArray[8]. In SemaExprCXX.cpp, BuildCXXNew checks whether it
is a typedef of constant size and, if so, gets the original type, which
works fine for non-dependent cases.
But in the template case we go through TransformCXXNewExpr in
TreeTransform.h, where we again check the allocated type, which is
TArray[8], and it stays that way, so ArraySize=(TArray[8] type, alloc
TArray[8*type]), hence the squared size allocation.
ArraySize gets calculated earlier in TreeTransform.h, so the
if(!ArraySize) condition was failing.
Fix: I changed that condition to if(ArraySize).
Fixes https://github.com/llvm/llvm-project/issues/41441
---------
Co-authored-by: erichkeane <ekeane@nvidia.com>
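The scenario described above can be reproduced with a small template. This is a hedged sketch rather than the regression test from the patch: the function name and the oversized guard buffer are made up for illustration, and the buffer exists only so that an over-long zeroing would be observable.

```cpp
#include <cstddef>
#include <cstring>
#include <new>

// Value-initializes a dependent typedef'd array in place and reports whether
// exactly sizeof(TArray) bytes were zeroed, leaving the guard bytes intact.
template <typename T>
bool placement_new_zeroes_exactly_the_array() {
  typedef T TArray[8];
  alignas(T) unsigned char storage[16 * sizeof(TArray)];
  std::memset(storage, 0xAB, sizeof storage);

  new (storage) TArray();  // in-place new of an array of trivial type

  for (std::size_t i = 0; i < sizeof storage; ++i) {
    const unsigned char expected = i < sizeof(TArray) ? 0x00 : 0xAB;
    if (storage[i] != expected)
      return false;
  }
  return true;
}
```

On a compiler with the fix, calling `placement_new_zeroes_exactly_the_array<int>()` should return true, since only the first `sizeof(int[8])` bytes are cleared.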
|
There is no need to try to vectorize a graph that consists of a single
gather/buildvector node with alternate opcodes; it is not profitable. In
other cases, the last instruction must be used as the insertion point for
the vectorized code.
|
As well as flipping the sense of the bit, GFX12 moved it from bit 0 to
bit 1 in the encoded simm16 operand.
|
We are almost ready to enable the use of debug records everywhere in
LLVM by default; part of the prep-work for this means ensuring that
every tool supports them. Every tool in the `llvm/` project supports
them, front-ends that use the `DIBuilder` will support them, and as far
as I can tell, the only other tool in the LLVM repo that needs to
support them but doesn't is `mlir-translate`. This patch trivially
unblocks them by converting from debug records to debug intrinsics
before translating a module.
|
https://github.com/llvm/llvm-project/pull/78295 dropped private headers
in the top-level directory from libcxx.imp.
This PR re-adds them to libcxx.imp.
|
Implement helper functions to identify leaf, composite, and combined
constructs.
|
This will save later code from commuting it.
|
Ignore incoming values with constant false masks when trying to simplify
VPBlendRecipes.
As a follow-on optimization, we should also be able to drop all incoming
values with false masks by creating a new VPBlendRecipe with those
operands dropped.
PR: https://github.com/llvm/llvm-project/pull/89384
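The simplification can be modelled in a few lines. This is a stand-alone sketch, not VPlan code: `Mask`, `Incoming`, and `simplify_blend` are hypothetical, values are represented by integer ids, and the assumed rule is that a blend collapses to a single value when every incoming value that is not masked by a constant false is the same.

```cpp
#include <optional>
#include <vector>

// Hypothetical three-state view of an incoming mask.
enum class Mask { ConstFalse, ConstTrue, Unknown };

struct Incoming {
  int value_id;  // stand-in for the incoming value
  Mask mask;     // mask guarding that value
};

// Returns the id the blend reduces to if all incoming values whose mask is
// not a constant false agree; returns nullopt if they differ or none remain.
inline std::optional<int> simplify_blend(const std::vector<Incoming>& incoming) {
  std::optional<int> common;
  for (const Incoming& in : incoming) {
    if (in.mask == Mask::ConstFalse)
      continue;  // this value can never be selected, so ignore it
    if (!common)
      common = in.value_id;
    else if (*common != in.value_id)
      return std::nullopt;
  }
  return common;
}
```

The follow-on optimization mentioned above would go one step further and rebuild the blend without the ignored operands even when the remaining values differ.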
|
declaration (#89494)
Since
[6163aa9](https://github.com/llvm/llvm-project/commit/6163aa96799cbad7f2f58e02c5bebee9647056a5#diff-3a7ef0bff7d2b73b4100de636f09ea68b72eda191b39c8091a6a1765d917c1a2),
we have introduced an optimization that almost always destroys
TemplateIdAnnotations at the end of a function declaration. This doesn't
always work properly: a lambda within a default template argument could
also result in such deallocation and hence a use-after-free bug while
building a type constraint on the template parameter.
This patch adds another flag to the parser to tell apart cases when we
shouldn't do such cleanups eagerly. A bit complicated as it is, this retains
the optimization on a highly templated function with lots of generic lambdas.
Note that the test doesn't always trigger a conspicuous bug/crash, even
with a debug build, but I believe a sanitizer build can detect it.
Fixes https://github.com/llvm/llvm-project/issues/67235
Fixes https://github.com/llvm/llvm-project/issues/89127
|
Also just get the value type from the SDValue instead of passing it separately.
|