[PATCH] D28637: [PPC] Inline expansion of memcmp

Mon Jan 30 06:20:14 PST 2017

nemanjai added a comment.

I find the logic in this patch quite difficult to follow. For example, I had trouble figuring out exactly what happens when we have a 1-byte load (i.e. branching past the block that calculates the return value, etc.). I also find that a rather peculiar special case.
Furthermore, it is not clear to me that it would be easy to extend this setup to [say] do this expansion into vector code.

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:85
+STATISTIC(NumMemCmpNotConstant, "Number of memcmp calls without constant size");
+STATISTIC(NumMemCmpGreaterThanMax, "Number of memcmp calls with size greater than max size");
+STATISTIC(NumMemCmpInlined, "Number of inlined memcmp calls");
----------------
Line too long?

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:1878
+// Populates the EndBlock with a sequence to calculate the memcmp result.
+Value *getResult(LLVMContext &C, PHINode *PhiXor, PHINode *PhiSrc1,
+                 PHINode *PhiSrc2, Type *LoadType, BasicBlock *EndBlock) {
----------------
The name is far too general.

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:1879
+Value *getResult(LLVMContext &C, PHINode *PhiXor, PHINode *PhiSrc1,
+                 PHINode *PhiSrc2, Type *LoadType, BasicBlock *EndBlock) {
+
----------------
It appears that `LoadType` is poorly named. From the body of the function, it appears that this is not a pointer type but the type of the [presumably] loaded value.

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:1882
+  IRBuilder<> Builder(C);
+  Function *F;
+  Value *Res;
----------------
I don't see a reason to declare this here and initialize it later. Same with `Res`. In fact, I don't see a real need to declare either at all.

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:1893
+  uint64_t Mask;
+  Mask = (LoadType == Type::getInt64Ty(C)) ? UINT64_MAX << 3 : UINT_MAX << 3;
+  CntZerosMasked = Builder.CreateAnd(
----------------
It is not at all obvious why we're masking out the bottom 3 bits here. Please explain (in a comment).

And (this is a personal preference) it seems clearer what you're doing if you write this as something like `~ 7LU` or `~ 7U`.

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:1896
+      CntZerosMasked,
+      ConstantInt::get((LoadType == Type::getInt64Ty(C)) ? LoadType
+                                                         : Type::getInt32Ty(C),
----------------
This appears to make an implicit assumption that the type of `PhiXor` is exactly `LoadType` and that furthermore, they can be either `i32` or `i64`.
I think an assert to make sure of that is in order.

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:1909
+// Populates the load compare block for the given LoadType.
+// If LoadType is i8, we can just subtract and return.
+// If LoadType is greater than i8, we need to populate the EndBlock
----------------
Why? Why is this not the case for `i16`, `i32`, etc...

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:1912
+// with a sequence that calculates the memcmp result.
+void EmitLoadCompareBlock(LLVMContext &C, BasicBlock *LoadBlockCurr,
+                          BasicBlock *LoadBlockNext, Value *Source1,
----------------
It appears likely that a function with this many parameters might need refactoring. I imagine it is even hard to keep track of whether all of these are input parameters or if some of them are output or input/output. At the very least, I'd like to see a Doxygen comment that makes this explicit.

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:1923
+
+  Source1Cast = Builder.CreateBitCast(Source1, LoadPtrTy);
+  Source2Cast = Builder.CreateBitCast(Source2, LoadPtrTy);
----------------
What if `Source1` or `Source2` are already of type `LoadPtrTy`?

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:1946
+
+  // If LoadType is i8, we can just subtract and return
+  if (LoadType == Type::getInt8Ty(C)) {
----------------
I don't believe this comment is accurate. It does not at all appear that you "subtract and return". In fact, you'll still emit a compare against zero and branch if this not the last load block.

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:1968
+
+Type *getPtrTypeFromSize(LLVMContext &C, unsigned Size) {
+  Type *LoadPtrTy;
----------------
I imagine this isn't needed. We have `IntegerType::get(LLVMContext &C, unsigned NumBits)`.

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:1991
+
+Type *getTypeFromSize(LLVMContext &C, unsigned Size) {
+  Type *LoadType;
----------------
This seems entirely unnecessary since there is `Type::getPointerTo()`.

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:1993
+  Type *LoadType;
+  switch (Size) {
+  case 8: {
----------------
`default:`?
I would imagine an `llvm_unreachable` is in order.

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:2089
+  uint64_t SizeVal = SizeCast->getZExtValue();
+  if (SizeVal > TLI->getMaxLoadSizeMemcmp(0)) {
+    NumMemCmpGreaterThanMax++;
----------------
Please, use `false` for values of type `bool`. Besides, don't we want to actually pass the correct argument here? If we *are* compiling with `-Os` or `-Oz`, this should be `true` I imagine. Or even more likely, we should exit early from this function when compiling with `-Oz`.

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:2136
+  int Remainder = SizeVal;
+  GEPIndex = 0;
+
----------------
There is absolutely no reason to separate the initialization of this from its declaration.

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:2397
+  bool Expanded = false;
+  if( TLInfo->getLibFunc(F->getName(), Func) && Func==LibFunc_memcmp){
+      bool ByteSwapLoads, AllowUnalignedLoads;
----------------
The formatting in this entire block is messed up. Please run it through clang-format.

https://reviews.llvm.org/D28637