<html><head><style>body{font-family:Helvetica,Arial;font-size:13px}</style></head><body style="word-wrap:break-word;line-break:after-white-space"><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px;margin:0px;line-height:auto"><div id="bloop_customfont" style="margin:0px">Hi,</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">I've been using LLVM coroutines with great success over the last couple months. Everything works nicely, but I encountered the following strange optimization quirk. Take a look at the following two cases, which I'd expect to produce identical optimized code:</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">### Case 1: Constant argument</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">Code (simple integer iterator up to a specified value):</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">```</div><div id="bloop_customfont" style="margin:0px">def range(n: Int) -> Int {</div><div id="bloop_customfont" style="margin:0px">  var i = 0</div><div id="bloop_customfont" style="margin:0px">  while i < n {</div><div id="bloop_customfont" style="margin:0px">    yield i</div><div id="bloop_customfont" style="margin:0px">    i = i + 1</div><div id="bloop_customfont" style="margin:0px">  }</div><div id="bloop_customfont" style="margin:0px">}</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">for i in range(2) {</div><div id="bloop_customfont" style="margin:0px">  print i</div><div id="bloop_customfont" style="margin:0px">}</div><div id="bloop_customfont" style="margin:0px">```</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">Generated IR (similar structure to the examples in the coroutine 
docs):</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">```</div><div id="bloop_customfont" style="margin:0px">define private void @main([...]) {</div><div id="bloop_customfont" style="margin:0px">preamble:</div><div id="bloop_customfont" style="margin:0px">  [...]</div><div id="bloop_customfont" style="margin:0px">  %2 = alloca i64, i64 1</div><div id="bloop_customfont" style="margin:0px">  br label %entry</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">entry:                                            ; preds = %preamble</div><div id="bloop_customfont" style="margin:0px">  %3 = call i8* @range(i64 2)</div><div id="bloop_customfont" style="margin:0px">  br label %for</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">for_cont:                                         ; No predecessors!</div><div id="bloop_customfont" style="margin:0px">  br label %for</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">for:                                              ; preds = %body, %for_cont, %entry</div><div id="bloop_customfont" style="margin:0px">  call void @llvm.coro.resume(i8* %3)</div><div id="bloop_customfont" style="margin:0px">  %4 = call i1 @llvm.coro.done(i8* %3)</div><div id="bloop_customfont" style="margin:0px">  br i1 %4, label %cleanup, label %body</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">body:                                             ; preds = %for</div><div id="bloop_customfont" style="margin:0px">  %5 = call i8* @llvm.coro.promise(i8* %3, i32 8, i1 false)</div><div id="bloop_customfont" style="margin:0px">  %6 = bitcast i8* %5 to i64*</div><div id="bloop_customfont" style="margin:0px">  %7 = load i64, i64* %6</div><div id="bloop_customfont" style="margin:0px">  store 
i64 %7, i64* %2</div><div id="bloop_customfont" style="margin:0px">  %8 = load i64, i64* %2</div><div id="bloop_customfont" style="margin:0px">  call void @print_int(i64 %8)</div><div id="bloop_customfont" style="margin:0px">  br label %for</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">cleanup:                                          ; preds = %for</div><div id="bloop_customfont" style="margin:0px">  call void @llvm.coro.destroy(i8* %3)</div><div id="bloop_customfont" style="margin:0px">  br label %exit</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">exit:                                             ; preds = %cleanup</div><div id="bloop_customfont" style="margin:0px">  ret void</div><div id="bloop_customfont" style="margin:0px">}</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">define private i8* @range(i64) {</div><div id="bloop_customfont" style="margin:0px">preamble:</div><div id="bloop_customfont" style="margin:0px">  %promise = alloca i64, i64 1</div><div id="bloop_customfont" style="margin:0px">  %1 = bitcast i64* %promise to i8*</div><div id="bloop_customfont" style="margin:0px">  %id = call token @llvm.coro.id(i32 0, i8* %1, i8* null, i8* null)</div><div id="bloop_customfont" style="margin:0px">  %2 = alloca i64, i64 1</div><div id="bloop_customfont" style="margin:0px">  store i64 %0, i64* %2</div><div id="bloop_customfont" style="margin:0px">  %3 = alloca i64, i64 1</div><div id="bloop_customfont" style="margin:0px">  %4 = call i1 @llvm.coro.alloc(token %id)</div><div id="bloop_customfont" style="margin:0px">  br i1 %4, label %alloc, label %entry</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">alloc:                                            ; preds = %preamble</div><div 
id="bloop_customfont" style="margin:0px">  %5 = call i64 @llvm.coro.size.i64()</div><div id="bloop_customfont" style="margin:0px">  %6 = call i8* @alloc(i64 %5)</div><div id="bloop_customfont" style="margin:0px">  br label %entry</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">entry:                                            ; preds = %preamble, %alloc</div><div id="bloop_customfont" style="margin:0px">  %7 = phi i8* [ null, %preamble ], [ %6, %alloc ]</div><div id="bloop_customfont" style="margin:0px">  %hdl = call i8* @llvm.coro.begin(token %id, i8* %7)</div><div id="bloop_customfont" style="margin:0px">  %8 = call i8 @llvm.coro.suspend(token none, i1 false)</div><div id="bloop_customfont" style="margin:0px">  switch i8 %8, label %suspend [</div><div id="bloop_customfont" style="margin:0px">    i8 0, label %10</div><div id="bloop_customfont" style="margin:0px">    i8 1, label %cleanup</div><div id="bloop_customfont" style="margin:0px">  ]</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">final:                                            ; preds = %exit</div><div id="bloop_customfont" style="margin:0px">  %9 = call i8 @llvm.coro.suspend(token none, i1 true)</div><div id="bloop_customfont" style="margin:0px">  switch i8 %9, label %suspend [</div><div id="bloop_customfont" style="margin:0px">    i8 0, label %21</div><div id="bloop_customfont" style="margin:0px">    i8 1, label %cleanup</div><div id="bloop_customfont" style="margin:0px">  ]</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">; &lt;label&gt;:10:                                     ; preds = %entry</div><div id="bloop_customfont" style="margin:0px">  store i64 0, i64* %3</div><div id="bloop_customfont" style="margin:0px">  br label %while</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" 
style="margin:0px">while:                                            ; preds = %18, %10</div><div id="bloop_customfont" style="margin:0px">  %11 = load i64, i64* %3</div><div id="bloop_customfont" style="margin:0px">  %12 = load i64, i64* %2</div><div id="bloop_customfont" style="margin:0px">  %13 = icmp slt i64 %11, %12</div><div id="bloop_customfont" style="margin:0px">  %14 = zext i1 %13 to i8</div><div id="bloop_customfont" style="margin:0px">  %15 = trunc i8 %14 to i1</div><div id="bloop_customfont" style="margin:0px">  br i1 %15, label %body, label %exit</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">body:                                             ; preds = %while</div><div id="bloop_customfont" style="margin:0px">  %16 = load i64, i64* %3</div><div id="bloop_customfont" style="margin:0px">  store i64 %16, i64* %promise</div><div id="bloop_customfont" style="margin:0px">  %17 = call i8 @llvm.coro.suspend(token none, i1 false)</div><div id="bloop_customfont" style="margin:0px">  switch i8 %17, label %suspend [</div><div id="bloop_customfont" style="margin:0px">    i8 0, label %18</div><div id="bloop_customfont" style="margin:0px">    i8 1, label %cleanup</div><div id="bloop_customfont" style="margin:0px">  ]</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">; &lt;label&gt;:18:                                     ; preds = %body</div><div id="bloop_customfont" style="margin:0px">  %19 = load i64, i64* %3</div><div id="bloop_customfont" style="margin:0px">  %20 = add i64 %19, 1</div><div id="bloop_customfont" style="margin:0px">  store i64 %20, i64* %3</div><div id="bloop_customfont" style="margin:0px">  br label %while</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">exit:                                             ; preds = %while</div><div id="bloop_customfont" style="margin:0px">  br label 
%final</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">; &lt;label&gt;:21:                                     ; preds = %final</div><div id="bloop_customfont" style="margin:0px">  unreachable</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">cleanup:                                          ; preds = %final, %body, %entry</div><div id="bloop_customfont" style="margin:0px">  %22 = call i8* @llvm.coro.free(token %id, i8* %hdl)</div><div id="bloop_customfont" style="margin:0px">  br label %suspend</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">suspend:                                          ; preds = %final, %body, %entry, %cleanup</div><div id="bloop_customfont" style="margin:0px">  %23 = call i1 @llvm.coro.end(i8* %hdl, i1 false)</div><div id="bloop_customfont" style="margin:0px">  ret i8* %hdl</div><div id="bloop_customfont" style="margin:0px">}</div><div id="bloop_customfont" style="margin:0px">```</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">Optimized IR:</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">```</div><div id="bloop_customfont" style="margin:0px">define i32 @main([...]) local_unnamed_addr {</div><div id="bloop_customfont" style="margin:0px">entry:</div><div id="bloop_customfont" style="margin:0px">  [...]</div><div id="bloop_customfont" style="margin:0px">  tail call void @print_int(i64 0)</div><div id="bloop_customfont" style="margin:0px">  tail call void @print_int(i64 1)</div><div id="bloop_customfont" style="margin:0px">  ret i32 0</div><div id="bloop_customfont" style="margin:0px">}</div><div id="bloop_customfont" style="margin:0px">```</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">Everything works 
as expected in this case.</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">### Case 2: Inline constant</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">Almost identical code, but get rid of the argument and just paste the `2` into the function:</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">```</div><div id="bloop_customfont" style="margin:0px">def range() -> Int {</div><div id="bloop_customfont" style="margin:0px">  var i = 0</div><div id="bloop_customfont" style="margin:0px">  while i < 2 {</div><div id="bloop_customfont" style="margin:0px">    yield i</div><div id="bloop_customfont" style="margin:0px">    i = i + 1</div><div id="bloop_customfont" style="margin:0px">  }</div><div id="bloop_customfont" style="margin:0px">}</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">for i in range() {</div><div id="bloop_customfont" style="margin:0px">  print i</div><div id="bloop_customfont" style="margin:0px">}</div><div id="bloop_customfont" style="margin:0px">```</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">Unoptimized LLVM code is almost identical, except `range` is called without arguments (and the corresponding `alloca`+`store` in the first block of `range` is gone) and the 2nd argument of the `icmp` is a constant `2`.</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">But strangely the optimized code is very different:</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">```</div><div id="bloop_customfont" style="margin:0px">define i32 @main([...]) local_unnamed_addr {</div><div id="bloop_customfont" 
style="margin:0px">entry:</div><div id="bloop_customfont" style="margin:0px">  [...]</div><div id="bloop_customfont" style="margin:0px">  br label %body.i</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">body.i:                                           ; preds = %body.i.backedge, %exit</div><div id="bloop_customfont" style="margin:0px">  %.sroa.6.1.i = phi i64 [ 0, %exit ], [ %.sroa.6.1.i.be, %body.i.backedge ]</div><div id="bloop_customfont" style="margin:0px">  %.sroa.11.1.i = phi i2 [ 1, %exit ], [ %.sroa.11.1.i.be, %body.i.backedge ]</div><div id="bloop_customfont" style="margin:0px">  tail call void @print_int(i64 %.sroa.6.1.i)</div><div id="bloop_customfont" style="margin:0px">  switch i2 %.sroa.11.1.i, label %unreachable.i8.i [</div><div id="bloop_customfont" style="margin:0px">    i2 0, label %body.i.backedge</div><div id="bloop_customfont" style="margin:0px">    i2 1, label %AfterCoroSuspend11.i5.i</div><div id="bloop_customfont" style="margin:0px">    i2 -2, label %main.exit</div><div id="bloop_customfont" style="margin:0px">  ]</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">AfterCoroSuspend11.i5.i:                          ; preds = %body.i</div><div id="bloop_customfont" style="margin:0px">  br label %body.i.backedge</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">body.i.backedge:                                  ; preds = %AfterCoroSuspend11.i5.i, %body.i</div><div id="bloop_customfont" style="margin:0px">  %.sroa.6.1.i.be = phi i64 [ 1, %AfterCoroSuspend11.i5.i ], [ 0, %body.i ]</div><div id="bloop_customfont" style="margin:0px">  %.sroa.11.1.i.be = phi i2 [ -2, %AfterCoroSuspend11.i5.i ], [ 1, %body.i ]</div><div id="bloop_customfont" 
style="margin:0px">  br label %body.i</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">unreachable.i8.i:                                 ; preds = %body.i</div><div id="bloop_customfont" style="margin:0px">  unreachable</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">main.exit:                                        ; preds = %body.i</div><div id="bloop_customfont" style="margin:0px">  ret i32 0</div><div id="bloop_customfont" style="margin:0px">}</div><div id="bloop_customfont" style="margin:0px">```</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">What's going on here? Am I doing something wrong when optimizing through the C++ API? This is my optimization code:</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">```</div><div id="bloop_customfont" style="margin:0px">codegen(module);</div><div id="bloop_customfont" style="margin:0px">// verifyModule etc.</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">std::unique_ptr&lt;legacy::PassManager&gt; pm(new legacy::PassManager());</div><div id="bloop_customfont" style="margin:0px">std::unique_ptr&lt;legacy::FunctionPassManager&gt; fpm(new legacy::FunctionPassManager(module));</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">unsigned optLevel = 3;</div><div id="bloop_customfont" style="margin:0px">unsigned sizeLevel = 0;</div><div id="bloop_customfont" style="margin:0px">PassManagerBuilder builder;</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">if (!debug) {</div><div id="bloop_customfont" style="margin:0px"><span class="Apple-tab-span" style="white-space:pre">    </span>builder.OptLevel = optLevel;</div><div id="bloop_customfont" 
style="margin:0px"><span class="Apple-tab-span" style="white-space:pre">    </span>builder.SizeLevel = sizeLevel;</div><div id="bloop_customfont" style="margin:0px"><span class="Apple-tab-span" style="white-space:pre">    </span>builder.Inliner = createFunctionInliningPass(optLevel, sizeLevel, false);</div><div id="bloop_customfont" style="margin:0px"><span class="Apple-tab-span" style="white-space:pre">    </span>builder.DisableUnitAtATime = false;</div><div id="bloop_customfont" style="margin:0px"><span class="Apple-tab-span" style="white-space:pre">    </span>builder.DisableUnrollLoops = false;</div><div id="bloop_customfont" style="margin:0px"><span class="Apple-tab-span" style="white-space:pre">    </span>builder.LoopVectorize = true;</div><div id="bloop_customfont" style="margin:0px"><span class="Apple-tab-span" style="white-space:pre">    </span>builder.SLPVectorize = true;</div><div id="bloop_customfont" style="margin:0px">}</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">builder.MergeFunctions = true;</div><div id="bloop_customfont" style="margin:0px">addCoroutinePassesToExtensionPoints(builder);</div><div id="bloop_customfont" style="margin:0px">builder.populateModulePassManager(*pm);</div><div id="bloop_customfont" style="margin:0px">builder.populateFunctionPassManager(*fpm);</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">fpm->doInitialization();</div><div id="bloop_customfont" style="margin:0px">for (Function& f : *module)</div><div id="bloop_customfont" style="margin:0px"><span class="Apple-tab-span" style="white-space:pre">    </span>fpm->run(f);</div><div id="bloop_customfont" style="margin:0px">fpm->doFinalization();</div><div id="bloop_customfont" style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">pm->run(*module);</div><div id="bloop_customfont" style="margin:0px">```</div><div id="bloop_customfont" 
style="margin:0px"><br></div><div id="bloop_customfont" style="margin:0px">Any help would be much appreciated.</div><div><br></div><div>Many Thanks,</div><div>Ariya</div><div><br></div></div><br><div id="bloop_sign_1538849588450667008" class="bloop_sign"></div></body></html>