[PATCH] D38196: [AArch64] Avoid interleaved SIMD store instructions for Exynos

Abderrazek Zaafrani via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 6 17:09:43 PST 2017


az added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64VectorByElementOpt.cpp:71-73
+  // This is used to cache instruction replacement decisions within function
+  // units and across function units.
+  std::map<unsigned, bool> SIMDInstrTable;
----------------
kristof.beyls wrote:
> The cache here will also need to take into account the TargetSubtargetInfo as the key into the cache.
> Otherwise, wouldn't you get unexpected results for e.g. the following .ll file with a mixture of targeting different subtargets (see "target-cpu" function attributes)?
> 
> ```
> declare <2 x float> @llvm.fma.v2f32(<2 x float>, <2 x float>, <2 x float>)
> 
> define <2 x float> @test_vfma_lane_f32_a57(<2 x float> %a, <2 x float> %b, <2 x float> %v)
> "target-cpu"="cortex-a57" {
> entry:
>   %lane = shufflevector <2 x float> %v, <2 x float> undef, <2 x i32> <i32 1, i32 1>
>   %0 = tail call <2 x float> @llvm.fma.v2f32(<2 x float> %lane, <2 x float> %b, <2 x float> %a)
>   ret <2 x float> %0
> }
> 
> define <2 x float> @test_vfma_lane_f32_exynos(<2 x float> %a, <2 x float> %b, <2 x float> %v)
> "target-cpu"="exynos-m1" {
> entry:
>   %lane = shufflevector <2 x float> %v, <2 x float> undef, <2 x i32> <i32 1, i32 1>
>   %0 = tail call <2 x float> @llvm.fma.v2f32(<2 x float> %lane, <2 x float> %b, <2 x float> %a)
>   ret <2 x float> %0
> }
> ```
> 
> I have no idea how to get a stable key from TargetSubtargetInfo to use in this cache.
> 
> At this point, I'd measure the overhead of this pass e.g. using CTMark to make sure that the caching mechanism is actually needed before investing more time in it.
> 
I am adding subtarget as part of key. 


================
Comment at: llvm/lib/Target/AArch64/AArch64VectorByElementOpt.cpp:163-170
+bool AArch64SIMDInstrOpt::
+shouldReplaceInst(MachineFunction *MF, const MCInstrDesc *InstDesc,
+                  SmallVectorImpl<const MCInstrDesc*> &InstDescRepl) {
+  // Check if replacement decision is already available in the cached table.
   // if so, return it.
-  if (!VecInstElemTable.empty() &&
-      VecInstElemTable.find(InstDesc->getOpcode()) != VecInstElemTable.end())
-    return VecInstElemTable[InstDesc->getOpcode()];
+  if (!SIMDInstrTable.empty() &&
+      SIMDInstrTable.find(InstDesc->getOpcode()) != SIMDInstrTable.end())
----------------
kristof.beyls wrote:
> Is the only reason you need to drop const from shouldReplaceInst and other methods that you're using operator[] here to get to an element that's guaranteed to be in the map?
> If so, wouldn't it be better to use syntax "*SIMDInstrTable.find(InstDesc->getOpcode())", so that the const can remain on all those methods?
Yes, I was having trouble setting SIMDInstrTable[...] to true or false (few lines below  the line of code you mention above). It thinks that the map member variable I added to the const class has the constant modifier too.


================
Comment at: llvm/lib/Target/AArch64/AArch64VectorByElementOpt.cpp:333-353
+      return false;
+    ReplInstrMCID.clear();
+    OriginalMCID = &TII->get(AArch64::ST4Fourv16b);
+    for (unsigned i=0; i<4; i++) {
+      ReplInstrMCID.push_back(&TII->get(AArch64::ZIP1v16i8));
+      ReplInstrMCID.push_back(&TII->get(AArch64::ZIP2v16i8));
+    }
----------------
kristof.beyls wrote:
> From a logical point of view, this is a lot of code that repeats information encoded in a different way in the big switch statements in optimizeLdStInterleave.
> It'll be easy to update one location in the future and forget the other location.
> Couldn't the logic in the switch statements in optimizeLdStInterleave be reused here somehow, so that the information is only encoded once?
Good suggestion to simplify that code. There are several ways to improve it and here is one way: store the re-writing rules in a table member variable of the class and use that table in shouldExitEarly and in optimizeLdStInterleave. Note that I have not changed optimizeLdStInterleave yet (only used new table in shouldExitEarly). I would like to make sure that what I did is the kind of changes you had originally in mind and if so, then I will use it in optimizeLdStInterleave.



https://reviews.llvm.org/D38196





More information about the llvm-commits mailing list