[llvm-dev] target triple in 3.8
Frank Winter via llvm-dev
llvm-dev at lists.llvm.org
Fri Feb 19 12:08:32 PST 2016
I am having trouble making the SIMD vector length visible to the passes.
My application works at roughly the level of 'opt'.
What I did in version 3.6 was:
functionPassManager->add(new llvm::TargetLibraryInfo(llvm::Triple(Mod->getTargetTriple())));
functionPassManager->add(new llvm::DataLayoutPass());
After that, -basicaa and -loop-vectorize were able to vectorize the
input IR for AVX.
With 3.8 that no longer compiles. What I do instead is just set the
data layout on the Module (taken from the Kaleidoscope example):
Mod->setDataLayout( targetMachine->createDataLayout() );
So I don't add anything to the pass manager anymore, right? In
particular, I don't set the target triple anywhere?
However, the SIMD width is not being picked up. The debug output of the
loop vectorizer says:
LV: Checking a loop in "main" from module
LV: Loop hints: force=? width=0 unroll=0
LV: Found a loop: L3
LV: Found an induction variable.
LV: We can vectorize this loop!
LV: Found trip count: 8
LV: The Smallest and Widest types: 32 / 32 bits.
LV: The Widest register is: 32 bits.
LV: Found an estimated cost of 0 for VF 1 For instruction: %6 = phi i64 [ %14, %L3 ], [ 0, %L5 ]
LV: Found an estimated cost of 1 for VF 1 For instruction: %7 = add nsw i64 %19, %6
LV: Found an estimated cost of 0 for VF 1 For instruction: %8 = getelementptr float, float* %arg1, i64 %7
LV: Found an estimated cost of 1 for VF 1 For instruction: %9 = load float, float* %8
LV: Found an estimated cost of 0 for VF 1 For instruction: %10 = getelementptr float, float* %arg2, i64 %7
LV: Found an estimated cost of 1 for VF 1 For instruction: %11 = load float, float* %10
LV: Found an estimated cost of 1 for VF 1 For instruction: %12 = fadd float %11, %9
LV: Found an estimated cost of 0 for VF 1 For instruction: %13 = getelementptr float, float* %arg0, i64 %7
LV: Found an estimated cost of 1 for VF 1 For instruction: store float %12, float* %13
LV: Found an estimated cost of 1 for VF 1 For instruction: %14 = add nsw i64 %6, 1
LV: Found an estimated cost of 1 for VF 1 For instruction: %15 = icmp sge i64 %14, 8
LV: Found an estimated cost of 1 for VF 1 For instruction: br i1 %15, label %L4, label %L3
LV: Scalar loop costs: 8.
LV: Selecting VF: 1.
LV: Vectorization is possible but not beneficial.
LV: Interleaving is not beneficial.
The problematic line is:
LV: The Widest register is: 32 bits.
Before, with 3.6 on the same hardware, it showed 256 bits (which is
correct).
Something is amiss here. I know there were some changes to the target
triple handling, but I didn't follow them too closely. Does anyone know
how this is done now?
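To illustrate why that one line matters, here is a minimal sketch. It assumes the vectorizer caps the vectorization factor (VF) at the widest register width divided by the widest element type width; that is my reading of the log, not something verified against the vectorizer source. The helper name maxVectorizationFactor is hypothetical and is not an LLVM function.

```cpp
#include <cassert>

// Hypothetical helper (not part of LLVM): upper bound on the vectorization
// factor under the ASSUMPTION that it is widest-register-bits divided by
// widest-type-bits, with a floor of 1 (scalar).
unsigned maxVectorizationFactor(unsigned widestRegisterBits,
                                unsigned widestTypeBits) {
    assert(widestTypeBits != 0 && "element type width must be non-zero");
    unsigned vf = widestRegisterBits / widestTypeBits;
    return vf == 0 ? 1 : vf;  // never report less than scalar execution
}
```

With the values from the two runs, maxVectorizationFactor(256, 32) gives 8 float lanes (the AVX case I saw with 3.6), while maxVectorizationFactor(32, 32) gives 1, which would be consistent with the "Selecting VF: 1" result above.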
Thanks,
Frank