[llvm] [SLP] NFC. Replace TreeEntry::setOperandsInOrder with VLOperands. (PR #113880)
Han-Kuan Chen via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 3 02:07:45 PST 2024
HanKuanChen wrote:
> > > > Why do we need to check the ID while the old code does not have to?
> > >
> > >
> > > Because the load code explicitly worked only for CallInst
> >
> >
> > But we have a `if (auto *CI = dyn_cast<CallInst>(VL[0])) {` check here.
> > > > Because only the commutative intrinsics check it in the old code.
> > >
> > >
> > > The non-commutative intrinsics also need to have a check for isVectorIntrinsicWithScalarOpAtArg
> >
> >
> > If we add `isVectorIntrinsicWithScalarOpAtArg` check for non-commutative intrinsics, `llvm/test/Transforms/SLPVectorizer/X86/extract_in_tree_user.ll` will fail even without this PR. The bug is not caused by this PR.
> > > > I think we should NOT add isVectorIntrinsicWithScalarOpAtArg check for setOperand
> > >
> > >
> > > The check should be there or this whole patch should be dropped. We should not create pseudo tree node for the scalar instructions
> >
> >
> > If LLVM has an intrinsic in the future and
> >
> > * the intrinsic is commutative
> > * one of the operand is `isVectorIntrinsicWithScalarOpAtArg`, and the operand is NOT `Constant`
> >
> > SLP will get fail because the operand is NOT in `TreeEntry` and it will trigger the assert in
> > ```
> > #if !defined(NDEBUG) || defined(EXPENSIVE_CHECKS)
> > // Check that all schedulable entities got scheduled
> > for (auto *I = BS->ScheduleStart; I != BS->ScheduleEnd; I = I->getNextNode()) {
> > ScheduleData *SD = BS->getScheduleData(I);
> > if (SD && SD->isSchedulingEntity() && SD->hasValidDependencies())
> > assert(SD->IsScheduled && "must be scheduled at this point");
> > }
> > #endif
> > ```
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > Have [664f2d3](https://github.com/llvm/llvm-project/commit/664f2d333ee051c30693b95ff8c59cf262c35cce) will cause a potential bug in the future.
> > In addition, the current code already has scalar instructions in `TreeEntry` because non-commutative intrinsics use `setOperandsInOrder` (and `setOperandsInOrder` does not have `isVectorIntrinsicWithScalarOpAtArg` check).
>
> It is not in tree, it is just set as the operand of the treeentry, but there is no treeentry for such operands. Such operands will end up as just broadcast gather treeentries and won't be scheduled anyway
SLP will call `tryScheduleBundle` and `calculateDependencies`. The latter one will calculate the dependency of `ScheduleData`.
```
// Handle def-use chain dependencies.
for (User *U : BundleMember->Inst->users()) {
if (ScheduleData *UseSD = getScheduleData(cast<Instruction>(U))) {
BundleMember->Dependencies++;
ScheduleData *DestBundle = UseSD->FirstInBundle;
if (!DestBundle->IsScheduled)
BundleMember->incrementUnscheduledDeps(1);
if (!DestBundle->hasValidDependencies())
WorkList.push_back(DestBundle);
}
}
```
After that, it will call `scheduleBlock`, `BS->resetSchedule()` (it will update `ScheduleData.UnscheduledDeps` to `ScheduleData.Dependencies`) and do
```
// Do the "real" scheduling.
while (!ReadyInsts.empty()) {
ScheduleData *Picked = *ReadyInsts.begin();
ReadyInsts.erase(ReadyInsts.begin());
// Move the scheduled instruction(s) to their dedicated places, if not
// there yet.
for (ScheduleData *BundleMember = Picked; BundleMember;
BundleMember = BundleMember->NextInBundle) {
Instruction *PickedInst = BundleMember->Inst;
if (PickedInst->getNextNonDebugInstruction() != LastScheduledInst)
PickedInst->moveAfter(LastScheduledInst->getPrevNode());
LastScheduledInst = PickedInst;
}
BS->schedule(Picked, ReadyInsts);
}
```
Then the assert will be triggered because the `UnscheduledDeps` of `%add1` is 5 (because it has 5 usage) but the second operand (`@llvm.powi.f32.i32(float %fp1,i32 %add1)`) is NOT in the `TreeEntry` (if we use `isVectorIntrinsicWithScalarOpAtArg` to filter that). It will make `DecrUnsched` cannot be called and the bundle of `%add1` cannot do schedule eventually.
```
for (unsigned OpIdx = 0, NumOperands = TE->getNumOperands();
OpIdx != NumOperands; ++OpIdx)
if (auto *I = dyn_cast<Instruction>(TE->getOperand(OpIdx)[Lane]))
DecrUnsched(I);
```
Maybe we need to modify `BundleMember->Dependencies++` like this?
```
@@ -16921,9 +16920,14 @@ void BoUpSLP::BlockScheduling::calculateDependencies(ScheduleData *SD,
BundleMember->resetUnscheduledDeps();
// Handle def-use chain dependencies.
- for (User *U : BundleMember->Inst->users()) {
- if (ScheduleData *UseSD = getScheduleData(cast<Instruction>(U))) {
- BundleMember->Dependencies++;
+ for (Use &U : BundleMember->Inst->uses()) {
+ User *User = U.getUser();
+ if (ScheduleData *UseSD = getScheduleData(cast<Instruction>(User))) {
+ auto *CI = dyn_cast<CallInst>(User);
+ if (!CI ||
+ !isVectorIntrinsicWithScalarOpAtArg(
+ getVectorIntrinsicIDForCall(CI, SLP->TLI), U.getOperandNo()))
+ BundleMember->Dependencies++;
ScheduleData *DestBundle = UseSD->FirstInBundle;
if (!DestBundle->IsScheduled)
BundleMember->incrementUnscheduledDeps(1);
```
https://github.com/llvm/llvm-project/pull/113880
More information about the llvm-commits
mailing list