forked from llvm/llvm-project
[pull] main from llvm:main #983
Merged
+14,856
−1,195
This weird pattern was introduced by LoopVectorize. But it was placed in an unreachable path, so we cannot assert that the indices are always valid in InstCombine. Closes #180233.
…embling (#180468) Reland #174731 with the cyclic dependency issue resolved. Using LLVM_Object in LLVM_Util would cause a cyclic dependency; fix it by reimplementing `getFeatureSetFromEFlag()`. Original description: --- This PR updates llvm-objdump to detect the specific AVR architecture from the ELF header flags when no specific CPU is provided. Fixes: #146451 Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>
Fixes #179700 Simple fix, if we are in batch mode, don't go into an interactive session after checking if there are commands to run. Testing it is more tricky. I tried a shell test as I thought it would be simplest. However to be able to FileCheck I had to pipe and the pipe turns off the prompt because it's non-interactive. The prompt is the thing that must not be printed. So I've just spawned lldb as a subprocess. If it doesn't quit quickly then something is wrong. The timeout is high not because it should normally take that long, but because sometimes a process will get stalled for a while and I don't want this to be flaky. (though in theory it can get stalled for much longer than a minute) If it does time out, the process will be cleaned up automatically. See https://docs.python.org/3/library/subprocess.html#subprocess.run > A timeout may be specified in seconds, it is internally > passed on to Popen.communicate(). If the timeout expires, > the child process will be killed and waited for.
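The subprocess-with-timeout approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual test from the patch: the helper name `exits_within` and the stand-in commands are assumptions, and a Python child process is used in place of lldb so the sketch runs anywhere.

```python
import subprocess
import sys


def exits_within(cmd, timeout_s):
    """Return True if cmd exits on its own before timeout_s seconds elapse.

    subprocess.run() kills and waits for the child when the timeout expires,
    so a stuck process does not linger after the check.
    """
    try:
        subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,  # non-interactive, like a batch-mode run
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    return True


# Hypothetical stand-ins: one process that quits immediately, one that hangs.
QUICK = [sys.executable, "-c", "pass"]
STUCK = [sys.executable, "-c", "import time; time.sleep(30)"]
```

A generous timeout on the passing case mirrors the reasoning in the commit message: the limit only needs to distinguish "quit" from "stayed in an interactive session", so it can be far larger than the expected run time.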
) ForceTargetInstructionCost in the legacy cost model overrides any costs from InstsToScalarize. Match the behavior in the VPlan-based cost model. This fixes a crash with -force-target-instr-cost for the added test case. PR: #168269
VS Code and node versions should be synchronized. We use node >= 18.19, so the expected VS Code version is 1.90.0. I checked it using this [info](https://github.com/ewanharris/vscode-versions).
Fix a bug where Fast ISel incorrectly set `IncomingArgSize` to `0` for functions with no arguments; `MIPS O32` uses _the reserved argument area_ of 16 bytes even for functions with no arguments at all.
…179977) In some cases we decide to vectorise loops with first-order recurrences using VF=1, IC>1. We then attempt to unroll a vplan in replicateByVF, however when trying to erase the list of values from the parent we trigger the following assert:
```
virtual llvm::VPRecipeValue::~VPRecipeValue(): Assertion `Users.empty() && "trying to delete a VPRecipeValue with remaining users"' failed.
```
The problem seems to stem from this code:
```
DefR->replaceUsesWithIf(LaneDefs[0], [DefR](VPUser &U, unsigned) {
  return U.usesFirstLaneOnly(DefR);
});
```
since usesFirstLaneOnly returns false and we fail to replace uses of DefR with LaneDefs[0]. Upon inspection the only VPUser objects that return false are VPInstruction::FirstOrderRecurrenceSplice and VPFirstOrderRecurrencePHIRecipe. Since the values are all scalar it's simply not possible for us to be using anything other than the first lane. I've fixed this by bailing out of replicateByVF early for plans with only a scalar VF. Fixes #179671
4096cb6 removed the quotes around PluginInterface
#178642 added `lldb/test/Shell/DAP/TestSTDINConsole.test` with an incorrect `%lldb-dap` expansion. This patch fixes it.
Fixes a minor test regression introduced by #180226 in file llvm/test/Transforms/LoopVectorize/phi-with-fastflags-vplan.ll
This patch adds cross-platform (Darwin, Linux, Windows) commands to `Makefile.rules`, which is used to build lldb test targets. It maps POSIX commands like `mkdir -p` to their Windows equivalents, which makes it possible to write cross-platform `Makefile`s for lldb's test targets. This is currently not needed by any test but might become useful later as we are working on enabling more lldb Windows tests. This was originally done in the `swiftlang/llvm-project` fork (swiftlang#12127)
selectConst() was asserting for constants wider than 64 bits. Add APInt overloads of getOrCreateConstInt and getOrCreateConstVector that avoid the uint64_t truncation.
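To illustrate why a `uint64_t`-based path cannot represent such constants, here is a small sketch of the truncation. The helper `truncate_to_u64` and the sample value are illustrative assumptions, not code from the patch; Python's arbitrary-precision integers stand in for APInt.

```python
U64_MASK = (1 << 64) - 1


def truncate_to_u64(value):
    """Keep only the low 64 bits, as storing the value in a uint64_t would."""
    return value & U64_MASK


# A hypothetical 128-bit constant whose high bits are set.
WIDE_CONSTANT = (1 << 100) + 7

# The low 64 bits survive; everything above bit 63 is silently lost.
LOW_BITS = truncate_to_u64(WIDE_CONSTANT)
```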
…`y` is power-of-2 (#180148) This PR adds a small, targeted InstCombine fold for the pattern:
```
%idx = srem i64 %x, 2^k
%p = getelementptr inbounds nuw i8, ptr %base, i64 %idx
```
When the GEP is inbounds + nuw, and the divisor is a non-zero power-of-two constant, the signed remainder cannot produce a negative offset without violating the inbounds/nuw constraints. In that case we can canonicalize the index to a non-negative form and expose the common power-of-two rewrite:
- Rewrite the GEP index from `srem %x, 2^k` to `urem %x, 2^k`
- Create a new GEP with the new index and replace the original GEP
- The `urem %x, 2^k` will further fold to `and %x, (2^k-1)`, resulting in the following pattern:
```
%idx = and i64 %x, (2^k-1)
%p = getelementptr inbounds nuw i8, ptr %base, i64 %idx
```
Fixes #180097. Generalized alive2 proof: https://alive2.llvm.org/ce/z/8EBxug
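The arithmetic behind this fold can be checked with a short sketch. It models `srem`, `urem`, and `and` on 64-bit values with plain Python integers; the helper names are assumptions made for illustration and are not InstCombine code.

```python
MASK64 = (1 << 64) - 1


def srem(x, d):
    """Signed remainder with LLVM semantics: the result takes the dividend's sign."""
    r = abs(x) % abs(d)
    return -r if x < 0 else r


def urem_i64(x, d):
    """Unsigned remainder of x reinterpreted as a 64-bit unsigned value."""
    return (x & MASK64) % d


def and_mask_i64(x, k):
    """x & (2^k - 1) on the 64-bit unsigned reinterpretation of x."""
    return (x & MASK64) & ((1 << k) - 1)


K = 3  # divisor 2^3 = 8

# For non-negative x all three forms agree, which is the case the fold relies on.
AGREE_NON_NEGATIVE = all(
    srem(x, 1 << K) == urem_i64(x, 1 << K) == and_mask_i64(x, K)
    for x in range(1000)
)
```

For a negative `x` such as `-5`, `srem` yields `-5` while `urem` yields `3`, so the rewrite is only sound because the source states that a negative offset would already violate the inbounds/nuw constraints.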
Verify that produced messages/fixes are located in the right place. With this patch, we can proceed to do #180344
…9705) This is a follow-up PR of #169045 and the second part of #179086. In #179086, we added support for defining regions in Python-defined ops, but its usefulness was quite limited because we still couldn’t mark an op as a `Terminator` or `NoTerminator`. In this PR, we port the `DynamicOpTrait` (introduced on the C++ side for `DynamicDialect` in #177735) to Python, so we can dynamically attach traits to Python-defined ops.
…180387) `getLit64Encoding` uses a different approach to determine whether 64-bit literal encoding is used, which caused a size mismatch between the `MachineInstr` and the `MCInst`. For `!isValid32BitLiteral`, it is effectively `!(isInt<32>(Val) || isUInt<32>(Val))`, which is `!isInt<32>(Val) && !isUInt<32>(Val)`, but in `getLit64Encoding`, it is `!isInt<32>(Val) || !isUInt<32>(Val)`.
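The mismatch between the two predicates can be made concrete with a minimal sketch. The function names below are hypothetical Python stand-ins for `isInt<32>`, `isUInt<32>`, and the two conditions quoted above; they are not the AMDGPU backend code.

```python
def is_int32(v):
    """Stand-in for isInt<32>(Val)."""
    return -(1 << 31) <= v < (1 << 31)


def is_uint32(v):
    """Stand-in for isUInt<32>(Val)."""
    return 0 <= v < (1 << 32)


def needs_lit64_and(v):
    """!isValid32BitLiteral: fits neither as a signed nor as an unsigned 32-bit value."""
    return not is_int32(v) and not is_uint32(v)


def needs_lit64_or(v):
    """The differing check: fails at least one of the two 32-bit range tests."""
    return not is_int32(v) or not is_uint32(v)


# Values that fit exactly one of the two ranges are where the checks disagree.
DISAGREE = [v for v in (-1, 5, 0xFFFFFFFF, 1 << 40)
            if needs_lit64_and(v) != needs_lit64_or(v)]
```

Values such as `-1` (signed only) and `0xFFFFFFFF` (unsigned only) fit in a 32-bit literal under the first predicate but are treated as 64-bit under the second, which matches the size mismatch the commit describes.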
Commit a94060c ("[ELF] Pass Ctx & to Relocations") swapped the InputSectionBase &c argument for an InputSectionBase *sec member, and so "c." was replaced with "sec->". However, this must have been done in such a way that "Local-Exec." was transformed to "Local-Exesec->" and "RISCV::relocateAlloc." to "RISCV::relocateAllosec->", i.e. without the use of something like clangd, and without appropriate word boundaries in a regex.
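The failure mode is easy to reproduce. The sketch below assumes the rename was done with a plain textual substitution; the exact tool and pattern used in the original commit are not known, so both patterns here are illustrative.

```python
import re


def rename_naive(text):
    """Replace every "c." with "sec->", with no word-boundary check."""
    return re.sub(r"c\.", "sec->", text)


def rename_bounded(text):
    """Only replace "c." when "c" starts a word, i.e. is a standalone identifier."""
    return re.sub(r"\bc\.", "sec->", text)


# The two strings quoted in the commit message, plus a genuine use of the variable.
SAMPLES = ["Local-Exec.", "RISCV::relocateAlloc.", "c.file"]
NAIVE = [rename_naive(s) for s in SAMPLES]
BOUNDED = [rename_bounded(s) for s in SAMPLES]
```

With the `\b` anchor the comment text is left alone and only the real member access is rewritten; a semantic rename through a tool like clangd avoids the problem entirely because it never touches comments or unrelated identifiers.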
…_size benchmarks (#179922) Testing a bunch of sizes has relatively little value. This reduces the number of benchmarks so we can run them on a regular basis. This saves ~8 minutes when running the benchmarks.
…bled (#166310) When `-pass-remarks=loop-vectorize` is specified, the subsequent logic is executed to display detailed debug messages even if no PreHeader exists in the loop. Therefore, an assert occurs when the `getLoopPreHeader()` function is called. This commit resolves that issue. Fixed: #165377
…0524) We can only use block pointers here.
Make it clear that the returned object in the case where a variable offset is found is the first value to introduce a non-constant offset, not necessarily the actual underlying object. Found while investigating #180361.
…180172) Fixes [#179425](#179425). The allocate clause is allowed inside DO and parallel DO constructs as per [13.6.2](https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-6-0.pdf), but flang emitted a diagnostic for it. This patch enables initial support for the allocate clause in the DO construct.
…rams in DIBuilder (#180294) Fix a regression introduced by #165032, where DIBuilder could attach local metadata nodes to the wrong subprogram during finalization. DIBuilder records freshly created local variables, labels, and types in `DIBuilder::SubprogramTrackedNodes`, and later attaches them to their parent subprogram's retainedNodes in `finalizeSubprogram()`. However, a temporary local type created via `createReplaceableCompositeType()` may later be replaced by a type with a different scope. DIBuilder does not currently verify that the scopes of the original and replacement types match. As a result, local types can be incorrectly attached to the retainedNodes of an unrelated subprogram. This issue is observable in clang with limited debug info mode (see `clang/test/DebugInfo/CXX/ctor-homing-local-type.cpp`). This patch updates `DIBuilder::finalizeSubprogram()` to verify that tracked metadata nodes still belong to the subprogram being finalized, and avoids adding nodes whose scopes no longer match to retainedNodes field of an unrelated subprogram.
…lvm-profgen (#180581) #66164 changed the hashing in `SampleContextFrame` from `std::hash` to `MD5` in a very hot function (ContextTrieNode::getOrCreateChildContext()) in llvm-profgen. This causes an over 2x run time regression when running llvm-profgen with the csspgo preinliner enabled, since the MD5 computation is tripled compared to the Murmur hash in the std library. An llvm-profgen run time comparison shows the following:
```
$ time llvm-profgen -binary $BINARY --perfscript $SAMPLES --populate-profile-symbol-list --show-density --output=XXX

# MD5 hash
real 105m31.644s
user 104m51.334s
sys 0m35.033s

# std::hash
real 46m0.340s
user 45m17.998s
sys 0m38.420s
```
This patch recovers the run time regression in llvm-profgen, and perf testing in our internal services is neutral.
Simplify the conditional compilation and skip the problematic warnings only on 32-bit Arm.
Follow-up to #178739. The locality check assumed that immediately after the initial symbol resolution (i.e. prior to the OpenMP code in resolve-directives.cpp), the scope that owns a given symbol is the scope which owns the symbol's storage. It turns out that this isn't necessarily true, as illustrated by the included testcase, roughly something like:
```
program main
  integer :: j          ! host j (storage-owning)
contains
  subroutine f
    !$omp parallel      ! scope that owns j, but j is host-associated
    do j = ...
    end do
    !$omp end parallel
  end
end program
```
In such cases, the locality should be checked for the symbol that owns storage, i.e. a clone of the symbol that has been privatized or a symbol that is not host- or use-associated. This is similar to obtaining the ultimate symbol (i.e. from the end of the association chain), except the chain traversal would stop at a privatized symbol, potentially before reaching the end. This fixes a few regressions in the Fujitsu test suite:
Fujitsu/Fortran/0160/Fujitsu-Fortran-0160_0000.test
Fujitsu/Fortran/0160/Fujitsu-Fortran-0160_0012.test
Fujitsu/Fortran/0160/Fujitsu-Fortran-0160_0013.test
Fujitsu/Fortran/0660/Fujitsu-Fortran-0660_0096.test
Fujitsu/Fortran/0660/Fujitsu-Fortran-0660_0097.test
Fujitsu/Fortran/1052/Fujitsu-Fortran-1052_0108.test
Fujitsu/Fortran/1052/Fujitsu-Fortran-1052_0112.test
Some LLDB tests will only run if compiler-rt is built. This includes at least two tsan tests that passed in a PR (#179115) but then failed on other PRs that included compiler-rt in the build.
Include VPWidenPHIRecipe in phi simplification if there's a single incoming value.
This was failing validation against main and sending everyone emails. Try adding the fix that was suggested in the workflow run.
…ng release notes (#180299)" This reverts commit b6ee085. This reverts commit e624d50. This was causing failures like the following: https://github.com/llvm/llvm-project/actions/runs/21842945533. The follow up fix is also reverted as it did not actually fix the issue.
…79312) This patch implements the SPIR-V lowering for the following HLSL intrinsics:
- SampleBias
- SampleGrad
- SampleLevel
- SampleCmp
- SampleCmpLevelZero

It defines the required LLVM intrinsics in 'IntrinsicsDirectX.td' and 'IntrinsicsSPIRV.td'. It updates 'SPIRVInstructionSelector.cpp' to handle the new intrinsics and generates the correct 'OpImageSample*' instructions with the required operands (Bias, Grad, Lod, ConstOffset, MinLod, etc.). CodeGen tests are added to verify the implementation for images with dimension 1D, 2D, 3D, and Cube. Assisted-by: Gemini
This is similar to etext/_etext in the ELF linker. It's useful in emscripten to know where the RO data ends and the data begins (even though the Wasm format itself has no concept of RO data). See emscripten-core/emscripten#25939 (reply in thread)
This PR adds a new preprocessor callback that's invoked whenever the single-module-parse-mode skips over a module import. This will be used later on from the dependency scanner.
Compressing to a single shuffle doesn't remove any information and the backend can better apply specific optimizations to a single shuffle. Addresses #176218. --------- Co-authored-by: Luke Lau <luke_lau@igalia.com>
…9985) Extend operands when computing ub - lb to avoid overflow in signed arithmetic. E.g., i8: ub=127, lb=-128 yields 255, which overflows without extension.
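The i8 example above can be worked through with a small sketch of two's-complement wrap-around. The helper `wrap_signed` is an assumption introduced for illustration; it models the result of fixed-width signed arithmetic and is not code from the patch.

```python
def wrap_signed(value, bits):
    """Interpret value modulo 2^bits as a signed two's-complement integer."""
    mask = (1 << bits) - 1
    value &= mask
    # If the sign bit is set, the bit pattern represents a negative number.
    return value - (1 << bits) if value >> (bits - 1) else value


UB, LB = 127, -128

# Computed directly in i8, the difference 255 does not fit and wraps to -1.
DIFF_I8 = wrap_signed(UB - LB, 8)

# After extending the operands to i16, the difference is representable.
DIFF_I16 = wrap_signed(UB - LB, 16)
```

Extending by a single bit would already be enough for this subtraction, since the difference of two N-bit signed values always fits in N+1 bits; i16 is used here only as a convenient wider type.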
Adds missing test coverage for reductions with intermediate stores, including partial reductions with intermediate stores, as well as chained min/max reductions with intermediate stores.
The Clinger fast path bloats baremetal targets, which are constrained in binary size. Disable it for baremetal libc builds.
The code seems to have considered the potential problem but did not quite succeed in solving it ;)
Use new UTC support to re-generate check lines.
An AI told me these were missing and helped me add them.
Add a set of FindLast tests where the selected expression is based on an IV and could be sunk.
…serves X15 (#179738) The target function to be checked by the Control Flow Guard Check function is stored in `X15` on AArch64. This register is guaranteed to be preserved by that function (on success), thus after it returns `X15` can be used to branch to the target function instead of having to load it from another register or the stack.
#180347) Update stale links and remove duplication in table.
…ared-libsan` (#164842) This PR contains two commits:
- Add required dependencies when using `-shared-libsan` and fuzzer. Since libFuzzer is a static library we need to make sure that we add its dependencies when building with `-shared-libsan`. E.g., libFuzzer uses `ceilf()` from `libm.so` when building with the GNU toolchain. Previously, the resulting command did not contain the required link libraries, giving build failures (only a static sanitizer runtime would trigger the call to `linkSanitizerRuntimeDeps`).
- Correct the dependency order when using fuzzer. When building using `-shared-libsan` the sanitizer library needs to be first in link order. Since the fuzzer requires `-lstdc++` we have to make sure that the sanitizer library is added before `-lstdc++`.
---------
Signed-off-by: Björn Svensson <bjorn.a.svensson@est.tech>
Get the shared cache filepath and uuid that the inferior process is using from debugserver, try to open that shared cache on the lldb host mac and if the UUID matches, index all of the binaries in that shared cache. When looking for binaries loaded in the process, get them from the already-indexed shared cache.

Every time a binary is loaded, PlatformMacOSX may query the shared cache filepath and uuid from the Process, and pass that to HostInfo::GetSharedCacheImageInfo() if available (else fall back to the old HostInfo::GetSharedCacheImageInfo method which only looks at lldb's own shared cache), to get the file being requested.

ProcessGDBRemote caches the shared cache filepath and uuid from the inferior, once it has a non-zero UUID. I added a lock for this ivar specifically, so I don't have 20 threads all asking for the shared cache information from debugserver and updating the cached answer. If we never get back a non-zero UUID shared cache reply, we will re-query at every library loaded notification. debugserver has been providing the shared cache UUID since 2013, although I only added the shared cache filepath field last November.

Note that a process will not report its shared cache filepath or uuid at initial launch. As dyld gets a chance to execute a bit, it will start returning binaries -- it will be available at the point when libraries start loading. (It won't be available yet when the binary & dyld are the only two binaries loaded in the process.)

I tested this by disabling lldb's scan of its own shared cache pre-execution -- only loading the system shared cache when the inferior process reports that it is using that. I got 6-7 additional testsuite failures running lldb like that, because no system binaries were loaded before execution start, and the tests assumed they would be.

rdar://148939795
---------
Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
…se notes (#180299) (#180650) We were using one token both for pushing to the llvmbot fork and for creating a pull request against the www-releases repository. Since the fork and the repository have different owners, we were using a classic access token, which has very coarse-grained permissions. By using two separate tokens, we limit the permissions to just what we need to do the task. This is a re-commit of b6ee085 minus the environment changes which were causing the workflow to fail.
This adds atomicrmw `uinc_wrap` and `udec_wrap` operations support for SPIR-V. Since SPIR-V doesn't provide dedicated instructions for those two operations, we have to use the `AtomicExpand` pass to expand the operations into CAS forms. Closes #177204.
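As a rough illustration of what expanding these operations "into CAS forms" means, the sketch below models the two update rules as defined in the LLVM Language Reference and wraps them in a compare-and-swap retry loop. The `Cell` class and `atomic_rmw` helper are hypothetical stand-ins; this is not the output of the `AtomicExpand` pass, and values are assumed to be non-negative (unsigned).

```python
import threading


class Cell:
    """A toy memory location offering load and compare-exchange."""

    def __init__(self, value):
        self.value = value
        self._lock = threading.Lock()

    def load(self):
        return self.value

    def compare_exchange(self, expected, desired):
        """Store desired only if the current value equals expected."""
        with self._lock:
            if self.value == expected:
                self.value = desired
                return True
            return False


def uinc_wrap(old, limit):
    """Increment, wrapping to 0 once old reaches or exceeds limit."""
    return 0 if old >= limit else old + 1


def udec_wrap(old, limit):
    """Decrement, wrapping to limit when old is 0 or above limit."""
    return limit if old == 0 or old > limit else old - 1


def atomic_rmw(cell, op, operand):
    """CAS loop: recompute and retry until no other writer intervened."""
    while True:
        old = cell.load()
        if cell.compare_exchange(old, op(old, operand)):
            return old  # atomicrmw yields the value seen before the update
```

For example, `atomic_rmw(Cell(3), uinc_wrap, 3)` returns `3` and leaves the cell at `0`, while `atomic_rmw(Cell(0), udec_wrap, 3)` returns `0` and leaves the cell at `3`.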
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )