[llvm] r330706 - Correct dwarf unwind information in function epilogue

Tue May 8 06:57:28 PDT 2018

Looks like Petar added the flag in r331635. I've verified there's no
error with Vlad's test case now.

On Thu, May 3, 2018 at 8:09 PM, Petar Jovanovic via llvm-commits
<llvm-commits at lists.llvm.org> wrote:
> Hi Craig,
>
>
> Can you take a look at the newly uploaded patch [1]? It should resolve the
> issue you are seeing.
>
> It does not fix the issue Vlad reported; that will be looked at by Violeta
> when she gets back in the office on Monday.
>
> Generally, we may want to make CFIInstrInserter::verify() optional, similar
> to how it is done for MachineVerifier.
> These issues should not break anyone's build; that's too strict.
>
> Petar
>
> [1] https://reviews.llvm.org/D46399
>
>
> ________________________________
> From: Craig Topper <craig.topper at gmail.com>
> Sent: Thursday, May 3, 2018 1:09 AM
> To: Vlad Tsyrklevich
> Cc: Petar Jovanovic; violeta.vukobrat at rt-rk.com;
> djordje.lj.kovacevic at rt-rk.com; llvm-commits
> Subject: Re: [llvm] r330706 - Correct dwarf unwind information in function
> epilogue
>
> We're seeing a similar error I've reduced to this IR.  You just need to run
> with "llc -O0"
>
> ; ModuleID = 'bugpoint-reduced-simplified.ll'
> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
> target triple = "x86_64-unknown-linux-gnu"
>
> ; Function Attrs: cold noinline nounwind optnone uwtable
> define hidden void @foo() #0 {
> bb:
>   br label %bb1
>
> bb1:                                              ; preds = %bb3, %bb
>   %tmp = icmp ne i32 0, 0
>   br i1 %tmp, label %bb2, label %bb3
>
> bb2:                                              ; preds = %bb1
>   br label %bb3
>
> bb3:                                              ; preds = %bb2, %bb1
>   br label %bb1
> }
>
> attributes #0 = { noinline nounwind optnone uwtable
> "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf"
> "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" }
>
>
> ~Craig
>
>
> On Tue, May 1, 2018 at 1:35 PM Vlad Tsyrklevich via llvm-commits
> <llvm-commits at lists.llvm.org> wrote:
>>
>> Hello, this change is causing a build failure during a Chromium build with
>> a particular build configuration. The build fails with:
>> *** Inconsistent CFA register and/or offset between pred and succ ***
>> Pred:  outgoing CFA Reg:6
>> Pred:  outgoing CFA Offset:16
>> Succ:  incoming CFA Reg:7
>> Succ:  incoming CFA Offset:8
>>
>> A minimized test case is included below that fails to compile with clang++
>> -fuse-ld=lld -fsanitize=cfi -fwhole-program-vtables -flto
>> -fvisibility=hidden -g -O2 -fno-omit-frame-pointer
>>
>> I'm not very familiar with how debug information is handled in LLVM;
>> however, it appears that call frame information might not be correctly
>> preserved when calls to TRAP instructions are inlined.
>>
>> The test case:
>> #include <stdio.h>
>>
>> class A {
>>  public:
>>   virtual void f() { printf("A\n"); }
>> };
>> class B {
>>  public:
>>   virtual void f() { printf("B\n"); }
>> };
>>
>> void c(void) {
>>   volatile void *b = (void*)new A;
>>   // Unsatisfiable with cfi-vcall, hence always results in a call to
>> llvm.trap()
>>   ((B*)b)->f();
>> }
>>
>> int main(int argc, char *argv[]) {
>>   if (argv) {
>>     char foo[8];
>>     printf("a");
>>     c();
>>   } else {
>>     c();
>>   }
>> }
>>
>> The failing MIR:
>> # Machine code for function main: NoPHIs, TracksLiveness, NoVRegs
>> Frame Objects:
>>   fi#-1: size=8, align=16, fixed, at location [SP-8]
>> Function Live Ins: $rsi
>>
>> bb.0 (%ir-block.2):
>>   successors: %bb.2(0x40000000), %bb.1(0x40000000); %bb.2(50.00%),
>> %bb.1(50.00%)
>>   liveins: $rsi
>>   DBG_VALUE debug-use $edi, debug-use $noreg, !"argc", !DIExpression(),
>> debug-location !29; line no:18
>>   DBG_VALUE debug-use $rsi, debug-use $noreg, !"argv", !DIExpression(),
>> debug-location !30; line no:18
>>   DBG_VALUE debug-use $rsi, debug-use $noreg, !"argv", !DIExpression(),
>> debug-location !30; line no:18
>>   TEST64rr killed renamable $rsi, renamable $rsi, implicit-def $eflags,
>> debug-location !31
>>   JE_1 %bb.2, implicit $eflags, debug-location !32
>>
>> bb.1 (%ir-block.4):
>> ; predecessors: %bb.0
>>   successors: %bb.2(0x80000000); %bb.2(100.00%)
>>
>>   DBG_VALUE debug-use $rsi, debug-use $noreg, !"argv", !DIExpression(),
>> debug-location !30; line no:18
>>   DBG_VALUE debug-use $edi, debug-use $noreg, !"argc", !DIExpression(),
>> debug-location !29; line no:18
>>   frame-setup PUSH64r killed $rbp, implicit-def $rsp, implicit $rsp
>>   CFI_INSTRUCTION def_cfa_offset 16
>>   CFI_INSTRUCTION offset $rbp, -16
>>   $rbp = frame-setup MOV64rr $rsp
>>   CFI_INSTRUCTION def_cfa_register $rbp
>>   $edi = MOV32ri 97, debug-location !33
>>   CALL64pcrel32 @putchar, <regmask $bh $bl $bp $bpl $bx $ebp $ebx $hbp
>> $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12d $r13d $r14d
>> $r15d $r12w $r13w $r14w $r15w>, implicit $rsp, implicit $ssp, implicit
>> killed $edi, implicit-def $rsp, implicit-def $ssp, implicit-def dead $eax,
>> debug-location !33
>>
>> bb.2 (%ir-block.6):
>> ; predecessors: %bb.0, %bb.1
>>
>>   TRAP debug-location !43
>>
>> # End machine code for function main.
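To make the reported mismatch concrete, here is a minimal standalone sketch (not the pass itself) that models the three blocks of the MIR above and checks each pred/succ edge. Everything in it is an assumption for illustration: `BlockCFA`, `countMismatches` and `failingMain` are hypothetical names, the register numbers are taken to be the x86-64 DWARF numbers (6 = rbp, 7 = rsp), and the incoming state of bb.2 is assumed to come from bb.0 because that edge is visited first.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the per-block CFA bookkeeping.
struct BlockCFA {
  std::string Name;
  unsigned InReg, OutReg;   // assumed DWARF numbers: 6 = rbp, 7 = rsp
  int InOffset, OutOffset;  // CFA offset at block entry / exit
  std::vector<int> Succs;   // indices of successor blocks
};

// Count CFG edges where the predecessor's outgoing CFA state differs
// from the successor's incoming CFA state.
unsigned countMismatches(const std::vector<BlockCFA> &Blocks) {
  unsigned Errors = 0;
  for (const BlockCFA &Pred : Blocks) {
    for (int S : Pred.Succs) {
      const BlockCFA &Succ = Blocks[S];
      if (Pred.OutReg != Succ.InReg || Pred.OutOffset != Succ.InOffset) {
        std::printf("%s -> %s: pred out Reg:%u Offset:%d, "
                    "succ in Reg:%u Offset:%d\n",
                    Pred.Name.c_str(), Succ.Name.c_str(), Pred.OutReg,
                    Pred.OutOffset, Succ.InReg, Succ.InOffset);
        ++Errors;
      }
    }
  }
  return Errors;
}

// The function from the failing MIR: bb.1 sets up a frame (rbp, 16) and
// falls into bb.2, which bb.0 reaches with the initial state (rsp, 8).
std::vector<BlockCFA> failingMain() {
  return {
      {"bb.0", 7, 7, 8, 8, {2, 1}},
      {"bb.1", 7, 6, 8, 16, {2}},
      {"bb.2", 7, 7, 8, 8, {}},
  };
}
```

Calling `countMismatches(failingMain())` flags only the bb.1 -> bb.2 edge, with the same register and offset values as in the error text: bb.2 is entered with two different CFA rules, and a single directive at its start cannot be right for both paths.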
>>
>> On Tue, Apr 24, 2018 at 3:35 AM Petar Jovanovic via llvm-commits
>> <llvm-commits at lists.llvm.org> wrote:
>>>
>>> Author: petarj
>>> Date: Tue Apr 24 03:32:08 2018
>>> New Revision: 330706
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=330706&view=rev
>>> Log:
>>> Correct dwarf unwind information in function epilogue
>>>
>>> This patch aims to provide correct dwarf unwind information in function
>>> epilogue for X86.
>>> It consists of two parts. The first part inserts CFI instructions that
>>> set
>>> appropriate cfa offset and cfa register in emitEpilogue() in
>>> X86FrameLowering. This part is X86 specific.
>>>
>>> The second part is platform independent and ensures that:
>>>
>>> * CFI instructions do not affect code generation (they are not counted as
>>>   instructions when tail duplicating or tail merging)
>>> * Unwind information remains correct when a function is modified by
>>>   different passes. This is done in a late pass by analyzing information
>>>   about cfa offset and cfa register in BBs and inserting additional CFI
>>>   directives where necessary.
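The first bullet can be illustrated with a small standalone sketch, assuming a toy instruction model rather than `MachineInstr`. The types `Kind` and `Instr` and the functions `commonTailLength` and `tailExample` are hypothetical; only the name `countsAsInstruction` mirrors the helper added to BranchFolding.cpp in this patch. The example blocks follow the BB1/BB2 example in the patch's own comment.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction model (an assumption for this sketch).
enum class Kind { Real, DebugValue, CFI };

struct Instr {
  Kind K;
  std::string Text;
};

// Debug values and CFI directives emit no machine code, so they are
// ignored when comparing block tails.
bool countsAsInstruction(const Instr &I) { return I.K == Kind::Real; }

// Number of real instructions shared at the end of both blocks.
unsigned commonTailLength(const std::vector<Instr> &A,
                          const std::vector<Instr> &B) {
  unsigned Len = 0;
  std::size_t I = A.size(), J = B.size();
  while (true) {
    // Step backwards over anything that does not count.
    while (I > 0 && !countsAsInstruction(A[I - 1])) --I;
    while (J > 0 && !countsAsInstruction(B[J - 1])) --J;
    if (I == 0 || J == 0) break;
    if (A[I - 1].Text != B[J - 1].Text) break;
    ++Len;
    --I;
    --J;
  }
  return Len;
}

// BB1: INSTRUCTION_A, ADD32ri8
// BB2: INSTRUCTION_B, CFI_INSTRUCTION, ADD32ri8
unsigned tailExample() {
  std::vector<Instr> BB1 = {{Kind::Real, "INSTRUCTION_A"},
                            {Kind::Real, "ADD32ri8"}};
  std::vector<Instr> BB2 = {{Kind::Real, "INSTRUCTION_B"},
                            {Kind::CFI, "CFI_INSTRUCTION"},
                            {Kind::Real, "ADD32ri8"}};
  return commonTailLength(BB1, BB2);
}
```

Here `tailExample()` yields a common tail of one instruction: the CFI directive in BB2 neither lengthens nor shortens the match. The sketch only counts; where the actual pass splits a block relative to a leading CFI directive is a separate concern that the patch handles after the comparison loop.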
>>>
>>> Added CFIInstrInserter pass:
>>>
>>> * analyzes each basic block to determine which cfa offset and register
>>>   are valid at its entry and exit
>>> * verifies that outgoing cfa offset and register of predecessor blocks
>>> match
>>>   incoming values of their successors
>>> * inserts additional CFI directives at basic block beginning to correct
>>> the
>>>   rule for calculating CFA
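A rough sketch of the analyze-then-insert idea, under simplifying assumptions: each block contains at most one CFA-changing directive, offsets are kept as plain positive values (the actual patch negates them when building directives), and blocks are stored in layout order. `CFAState`, `Block`, `makeBlock`, `propagate`, `layoutFixups` and `example` are hypothetical names, not the pass's API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct CFAState {
  unsigned Reg;
  int Offset;
};

struct Block {
  bool DefinesCFA = false;   // block contains a CFA-changing directive
  CFAState Defined{0, 0};    // state that directive establishes
  std::vector<int> Succs;    // CFG successors (indices into the layout)
  CFAState In{0, 0}, Out{0, 0};
  bool Processed = false;
};

Block makeBlock(bool Defines, CFAState Def, std::vector<int> Succs) {
  Block B;
  B.DefinesCFA = Defines;
  B.Defined = Def;
  B.Succs = Succs;
  return B;
}

// Depth-first propagation along CFG edges; the first edge to reach a
// block determines its incoming state.
void propagate(std::vector<Block> &Blocks, int Idx, CFAState In) {
  Block &B = Blocks[Idx];
  if (B.Processed) return;
  B.Processed = true;
  B.In = In;
  B.Out = B.DefinesCFA ? B.Defined : In;
  for (int S : B.Succs) propagate(Blocks, S, B.Out);
}

// Walk blocks in layout order. The unwinder reads directives linearly,
// so a block whose incoming state differs from what the block laid out
// before it leaves behind needs a corrective directive at its start.
std::vector<std::string> layoutFixups(const std::vector<Block> &Blocks) {
  std::vector<std::string> Fix(Blocks.size());
  for (std::size_t I = 1; I < Blocks.size(); ++I) {
    const CFAState &Prev = Blocks[I - 1].Out;
    const CFAState &Cur = Blocks[I].In;
    bool RegDiff = Prev.Reg != Cur.Reg;
    bool OffDiff = Prev.Offset != Cur.Offset;
    if (RegDiff && OffDiff) Fix[I] = "def_cfa";
    else if (OffDiff) Fix[I] = "def_cfa_offset";
    else if (RegDiff) Fix[I] = "def_cfa_register";
  }
  return Fix;
}

// Layout: prologue, early-return epilogue, then the body. The body is
// reached from the prologue but is laid out after the epilogue.
std::vector<std::string> example() {
  std::vector<Block> Blocks;
  Blocks.push_back(makeBlock(true, {6, 16}, {1, 2}));  // sets (rbp, 16)
  Blocks.push_back(makeBlock(true, {7, 8}, {}));       // restores (rsp, 8)
  Blocks.push_back(makeBlock(false, {0, 0}, {}));      // plain body
  propagate(Blocks, 0, {7, 8});                        // assumed initial state
  return layoutFixups(Blocks);
}
```

In this example the epilogue block needs nothing, while the body block gets a `def_cfa`, because the epilogue laid out before it left both register and offset different from what the body expects. The verification step is left out here; it would compare `Out` of each predecessor with `In` of each successor before any directives are inserted.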
>>>
>>> Having CFI instructions in function epilogue can cause incorrect CFA
>>> calculation rule for some basic blocks. This can happen if, due to basic
>>> block reordering, or the existence of multiple epilogue blocks, some of
>>> the
>>> blocks have wrong cfa offset and register values set by the epilogue
>>> block
>>> above them.
>>> CFIInstrInserter is currently run only on X86, but can be used by any
>>> target
>>> that implements support for adding CFI instructions in epilogue.
>>>
>>> Patch by Violeta Vukobrat.
>>>
>>> Differential Revision: https://reviews.llvm.org/D42848
>>>
>>> Added:
>>>     llvm/trunk/lib/CodeGen/CFIInstrInserter.cpp
>>>     llvm/trunk/test/CodeGen/X86/cfi-inserter-check-order.ll
>>>     llvm/trunk/test/CodeGen/X86/epilogue-cfi-fp.ll
>>>     llvm/trunk/test/CodeGen/X86/epilogue-cfi-no-fp.ll
>>>     llvm/trunk/test/CodeGen/X86/merge-sp-updates-cfi.ll
>>>     llvm/trunk/test/CodeGen/X86/throws-cfi-fp.ll
>>>     llvm/trunk/test/CodeGen/X86/throws-cfi-no-fp.ll
>>> Modified:
>>>     llvm/trunk/include/llvm/CodeGen/Passes.h
>>>     llvm/trunk/include/llvm/CodeGen/TargetFrameLowering.h
>>>     llvm/trunk/include/llvm/InitializePasses.h
>>>     llvm/trunk/lib/CodeGen/BranchFolding.cpp
>>>     llvm/trunk/lib/CodeGen/CMakeLists.txt
>>>     llvm/trunk/lib/CodeGen/CodeGen.cpp
>>>     llvm/trunk/lib/CodeGen/TargetFrameLoweringImpl.cpp
>>>     llvm/trunk/lib/Target/X86/X86FrameLowering.cpp
>>>     llvm/trunk/lib/Target/X86/X86FrameLowering.h
>>>     llvm/trunk/lib/Target/X86/X86TargetMachine.cpp
>>>     llvm/trunk/test/CodeGen/AArch64/taildup-cfi.ll
>>>     llvm/trunk/test/CodeGen/X86/2009-03-16-PHIElimInLPad.ll
>>>     llvm/trunk/test/CodeGen/X86/2011-10-19-widen_vselect.ll
>>>     llvm/trunk/test/CodeGen/X86/GlobalISel/brcond.ll
>>>     llvm/trunk/test/CodeGen/X86/GlobalISel/callingconv.ll
>>>     llvm/trunk/test/CodeGen/X86/GlobalISel/frameIndex.ll
>>>     llvm/trunk/test/CodeGen/X86/O0-pipeline.ll
>>>     llvm/trunk/test/CodeGen/X86/O3-pipeline.ll
>>>     llvm/trunk/test/CodeGen/X86/TruncAssertZext.ll
>>>     llvm/trunk/test/CodeGen/X86/avoid-sfb.ll
>>>     llvm/trunk/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
>>>     llvm/trunk/test/CodeGen/X86/avx512-regcall-Mask.ll
>>>     llvm/trunk/test/CodeGen/X86/avx512-regcall-NoMask.ll
>>>     llvm/trunk/test/CodeGen/X86/avx512-schedule.ll
>>>     llvm/trunk/test/CodeGen/X86/avx512-select.ll
>>>     llvm/trunk/test/CodeGen/X86/avx512-vbroadcast.ll
>>>     llvm/trunk/test/CodeGen/X86/avx512bw-intrinsics-fast-isel.ll
>>>     llvm/trunk/test/CodeGen/X86/avx512bw-intrinsics-upgrade.ll
>>>     llvm/trunk/test/CodeGen/X86/avx512vl-vbroadcast.ll
>>>     llvm/trunk/test/CodeGen/X86/bool-vector.ll
>>>     llvm/trunk/test/CodeGen/X86/cmp.ll
>>>     llvm/trunk/test/CodeGen/X86/cmpxchg-i128-i1.ll
>>>     llvm/trunk/test/CodeGen/X86/emutls-pie.ll
>>>     llvm/trunk/test/CodeGen/X86/emutls.ll
>>>     llvm/trunk/test/CodeGen/X86/fast-isel-int-float-conversion.ll
>>>     llvm/trunk/test/CodeGen/X86/fast-isel-store.ll
>>>     llvm/trunk/test/CodeGen/X86/fmaxnum.ll
>>>     llvm/trunk/test/CodeGen/X86/fminnum.ll
>>>     llvm/trunk/test/CodeGen/X86/fp-arith.ll
>>>     llvm/trunk/test/CodeGen/X86/frame-lowering-debug-intrinsic-2.ll
>>>     llvm/trunk/test/CodeGen/X86/frame-lowering-debug-intrinsic.ll
>>>     llvm/trunk/test/CodeGen/X86/h-registers-1.ll
>>>     llvm/trunk/test/CodeGen/X86/haddsub-2.ll
>>>     llvm/trunk/test/CodeGen/X86/hipe-cc64.ll
>>>     llvm/trunk/test/CodeGen/X86/illegal-bitfield-loadstore.ll
>>>     llvm/trunk/test/CodeGen/X86/imul.ll
>>>     llvm/trunk/test/CodeGen/X86/lea-opt-cse1.ll
>>>     llvm/trunk/test/CodeGen/X86/lea-opt-cse2.ll
>>>     llvm/trunk/test/CodeGen/X86/lea-opt-cse3.ll
>>>     llvm/trunk/test/CodeGen/X86/lea-opt-cse4.ll
>>>     llvm/trunk/test/CodeGen/X86/legalize-shift-64.ll
>>>     llvm/trunk/test/CodeGen/X86/legalize-shl-vec.ll
>>>     llvm/trunk/test/CodeGen/X86/live-out-reg-info.ll
>>>     llvm/trunk/test/CodeGen/X86/load-combine.ll
>>>     llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll
>>>     llvm/trunk/test/CodeGen/X86/memset-nonzero.ll
>>>     llvm/trunk/test/CodeGen/X86/merge-consecutive-loads-128.ll
>>>     llvm/trunk/test/CodeGen/X86/mmx-arith.ll
>>>     llvm/trunk/test/CodeGen/X86/movtopush.ll
>>>     llvm/trunk/test/CodeGen/X86/mul-constant-result.ll
>>>     llvm/trunk/test/CodeGen/X86/mul-i256.ll
>>>     llvm/trunk/test/CodeGen/X86/mul128.ll
>>>     llvm/trunk/test/CodeGen/X86/musttail-varargs.ll
>>>     llvm/trunk/test/CodeGen/X86/pr21792.ll
>>>     llvm/trunk/test/CodeGen/X86/pr29061.ll
>>>     llvm/trunk/test/CodeGen/X86/pr29112.ll
>>>     llvm/trunk/test/CodeGen/X86/pr30430.ll
>>>     llvm/trunk/test/CodeGen/X86/pr32241.ll
>>>     llvm/trunk/test/CodeGen/X86/pr32256.ll
>>>     llvm/trunk/test/CodeGen/X86/pr32282.ll
>>>     llvm/trunk/test/CodeGen/X86/pr32284.ll
>>>     llvm/trunk/test/CodeGen/X86/pr32329.ll
>>>     llvm/trunk/test/CodeGen/X86/pr32345.ll
>>>     llvm/trunk/test/CodeGen/X86/pr32451.ll
>>>     llvm/trunk/test/CodeGen/X86/pr34088.ll
>>>     llvm/trunk/test/CodeGen/X86/pr34592.ll
>>>     llvm/trunk/test/CodeGen/X86/pr34653.ll
>>>     llvm/trunk/test/CodeGen/X86/pr9743.ll
>>>     llvm/trunk/test/CodeGen/X86/push-cfi-debug.ll
>>>     llvm/trunk/test/CodeGen/X86/push-cfi-obj.ll
>>>     llvm/trunk/test/CodeGen/X86/push-cfi.ll
>>>     llvm/trunk/test/CodeGen/X86/rdtsc.ll
>>>     llvm/trunk/test/CodeGen/X86/return-ext.ll
>>>     llvm/trunk/test/CodeGen/X86/rtm.ll
>>>     llvm/trunk/test/CodeGen/X86/schedule-x86_32.ll
>>>     llvm/trunk/test/CodeGen/X86/select-mmx.ll
>>>     llvm/trunk/test/CodeGen/X86/setcc-lowering.ll
>>>     llvm/trunk/test/CodeGen/X86/shrink_vmul.ll
>>>     llvm/trunk/test/CodeGen/X86/stack-probe-red-zone.ll
>>>     llvm/trunk/test/CodeGen/X86/statepoint-call-lowering.ll
>>>     llvm/trunk/test/CodeGen/X86/statepoint-gctransition-call-lowering.ll
>>>     llvm/trunk/test/CodeGen/X86/statepoint-invoke.ll
>>>     llvm/trunk/test/CodeGen/X86/statepoint-vector.ll
>>>     llvm/trunk/test/CodeGen/X86/swift-return.ll
>>>     llvm/trunk/test/CodeGen/X86/test-shrink-bug.ll
>>>     llvm/trunk/test/CodeGen/X86/test-vs-bittest.ll
>>>     llvm/trunk/test/CodeGen/X86/vector-arith-sat.ll
>>>     llvm/trunk/test/CodeGen/X86/vector-sext.ll
>>>     llvm/trunk/test/CodeGen/X86/vector-shuffle-avx512.ll
>>>     llvm/trunk/test/CodeGen/X86/wide-integer-cmp.ll
>>>     llvm/trunk/test/CodeGen/X86/x86-64-psub.ll
>>>     llvm/trunk/test/CodeGen/X86/x86-framelowering-trap.ll
>>>     llvm/trunk/test/CodeGen/X86/x86-interleaved-access.ll
>>>     llvm/trunk/test/CodeGen/X86/x86-no_caller_saved_registers-preserve.ll
>>>
>>> Modified: llvm/trunk/include/llvm/CodeGen/Passes.h
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/Passes.h?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/include/llvm/CodeGen/Passes.h (original)
>>> +++ llvm/trunk/include/llvm/CodeGen/Passes.h Tue Apr 24 03:32:08 2018
>>> @@ -434,6 +434,9 @@ namespace llvm {
>>>    // This pass expands indirectbr instructions.
>>>    FunctionPass *createIndirectBrExpandPass();
>>>
>>> +  /// Creates CFI Instruction Inserter pass. \see CFIInstrInserter.cpp
>>> +  FunctionPass *createCFIInstrInserter();
>>> +
>>>  } // End llvm namespace
>>>
>>>  #endif
>>>
>>> Modified: llvm/trunk/include/llvm/CodeGen/TargetFrameLowering.h
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/TargetFrameLowering.h?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/include/llvm/CodeGen/TargetFrameLowering.h (original)
>>> +++ llvm/trunk/include/llvm/CodeGen/TargetFrameLowering.h Tue Apr 24
>>> 03:32:08 2018
>>> @@ -345,6 +345,14 @@ public:
>>>            return false;
>>>      return true;
>>>    }
>>> +
>>> +  /// Return initial CFA offset value i.e. the one valid at the
>>> beginning of the
>>> +  /// function (before any stack operations).
>>> +  virtual int getInitialCFAOffset(const MachineFunction &MF) const;
>>> +
>>> +  /// Return initial CFA register value i.e. the one valid at the
>>> beginning of
>>> +  /// the function (before any stack operations).
>>> +  virtual unsigned getInitialCFARegister(const MachineFunction &MF)
>>> const;
>>>  };
>>>
>>>  } // End llvm namespace
>>>
>>> Modified: llvm/trunk/include/llvm/InitializePasses.h
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/InitializePasses.h?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/include/llvm/InitializePasses.h (original)
>>> +++ llvm/trunk/include/llvm/InitializePasses.h Tue Apr 24 03:32:08 2018
>>> @@ -91,6 +91,7 @@ void initializeCFGOnlyViewerLegacyPassPa
>>>  void initializeCFGPrinterLegacyPassPass(PassRegistry&);
>>>  void initializeCFGSimplifyPassPass(PassRegistry&);
>>>  void initializeCFGViewerLegacyPassPass(PassRegistry&);
>>> +void initializeCFIInstrInserterPass(PassRegistry&);
>>>  void initializeCFLAndersAAWrapperPassPass(PassRegistry&);
>>>  void initializeCFLSteensAAWrapperPassPass(PassRegistry&);
>>>  void initializeCallGraphDOTPrinterPass(PassRegistry&);
>>>
>>> Modified: llvm/trunk/lib/CodeGen/BranchFolding.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/BranchFolding.cpp?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/CodeGen/BranchFolding.cpp (original)
>>> +++ llvm/trunk/lib/CodeGen/BranchFolding.cpp Tue Apr 24 03:32:08 2018
>>> @@ -296,6 +296,11 @@ static unsigned HashEndOfMBB(const Machi
>>>    return HashMachineInstr(*I);
>>>  }
>>>
>>> +///  Whether MI should be counted as an instruction when calculating
>>> common tail.
>>> +static bool countsAsInstruction(const MachineInstr &MI) {
>>> +  return !(MI.isDebugValue() || MI.isCFIInstruction());
>>> +}
>>> +
>>>  /// ComputeCommonTailLength - Given two machine basic blocks, compute
>>> the number
>>>  /// of instructions they actually have in common together at their end.
>>> Return
>>>  /// iterators for the first shared instruction in each block.
>>> @@ -310,26 +315,27 @@ static unsigned ComputeCommonTailLength(
>>>    while (I1 != MBB1->begin() && I2 != MBB2->begin()) {
>>>      --I1; --I2;
>>>      // Skip debugging pseudos; necessary to avoid changing the code.
>>> -    while (I1->isDebugValue()) {
>>> +    while (!countsAsInstruction(*I1)) {
>>>        if (I1==MBB1->begin()) {
>>> -        while (I2->isDebugValue()) {
>>> -          if (I2==MBB2->begin())
>>> +        while (!countsAsInstruction(*I2)) {
>>> +          if (I2==MBB2->begin()) {
>>>              // I1==DBG at begin; I2==DBG at begin
>>> -            return TailLen;
>>> +            goto SkipTopCFIAndReturn;
>>> +          }
>>>            --I2;
>>>          }
>>>          ++I2;
>>>          // I1==DBG at begin; I2==non-DBG, or first of DBGs not at begin
>>> -        return TailLen;
>>> +        goto SkipTopCFIAndReturn;
>>>        }
>>>        --I1;
>>>      }
>>>      // I1==first (untested) non-DBG preceding known match
>>> -    while (I2->isDebugValue()) {
>>> +    while (!countsAsInstruction(*I2)) {
>>>        if (I2==MBB2->begin()) {
>>>          ++I1;
>>>          // I1==non-DBG, or first of DBGs not at begin; I2==DBG at begin
>>> -        return TailLen;
>>> +        goto SkipTopCFIAndReturn;
>>>        }
>>>        --I2;
>>>      }
>>> @@ -368,6 +374,37 @@ static unsigned ComputeCommonTailLength(
>>>      }
>>>      ++I1;
>>>    }
>>> +
>>> +SkipTopCFIAndReturn:
>>> +  // Ensure that I1 and I2 do not point to a CFI_INSTRUCTION. This can
>>> happen if
>>> +  // I1 and I2 are non-identical when compared and then one or both of
>>> them ends
>>> +  // up pointing to a CFI instruction after being incremented. For
>>> example:
>>> +  /*
>>> +    BB1:
>>> +    ...
>>> +    INSTRUCTION_A
>>> +    ADD32ri8  <- last common instruction
>>> +    ...
>>> +    BB2:
>>> +    ...
>>> +    INSTRUCTION_B
>>> +    CFI_INSTRUCTION
>>> +    ADD32ri8  <- last common instruction
>>> +    ...
>>> +  */
>>> +  // When INSTRUCTION_A and INSTRUCTION_B are compared as not equal,
>>> after
>>> +  // incrementing the iterators, I1 will point to ADD, however I2 will
>>> point to
>>> +  // the CFI instruction. Later on, this leads to BB2 being 'hacked off'
>>> at the
>>> +  // wrong place (in ReplaceTailWithBranchTo()) which results in losing
>>> this CFI
>>> +  // instruction.
>>> +  while (I1 != MBB1->end() && I1->isCFIInstruction()) {
>>> +    ++I1;
>>> +  }
>>> +
>>> +  while (I2 != MBB2->end() && I2->isCFIInstruction()) {
>>> +    ++I2;
>>> +  }
>>> +
>>>    return TailLen;
>>>  }
>>>
>>> @@ -454,7 +491,7 @@ static unsigned EstimateRuntime(MachineB
>>>                                  MachineBasicBlock::iterator E) {
>>>    unsigned Time = 0;
>>>    for (; I != E; ++I) {
>>> -    if (I->isDebugValue())
>>> +    if (!countsAsInstruction(*I))
>>>        continue;
>>>      if (I->isCall())
>>>        Time += 10;
>>> @@ -814,12 +851,12 @@ mergeOperations(MachineBasicBlock::itera
>>>      assert(MBBI != MBBIE && "Reached BB end within common tail
>>> length!");
>>>      (void)MBBIE;
>>>
>>> -    if (MBBI->isDebugValue()) {
>>> +    if (!countsAsInstruction(*MBBI)) {
>>>        ++MBBI;
>>>        continue;
>>>      }
>>>
>>> -    while ((MBBICommon != MBBIECommon) && MBBICommon->isDebugValue())
>>> +    while ((MBBICommon != MBBIECommon) &&
>>> !countsAsInstruction(*MBBICommon))
>>>        ++MBBICommon;
>>>
>>>      assert(MBBICommon != MBBIECommon &&
>>> @@ -859,7 +896,7 @@ void BranchFolder::mergeCommonTails(unsi
>>>    }
>>>
>>>    for (auto &MI : *MBB) {
>>> -    if (MI.isDebugValue())
>>> +    if (!countsAsInstruction(MI))
>>>        continue;
>>>      DebugLoc DL = MI.getDebugLoc();
>>>      for (unsigned int i = 0 ; i < NextCommonInsts.size() ; i++) {
>>> @@ -869,7 +906,7 @@ void BranchFolder::mergeCommonTails(unsi
>>>        auto &Pos = NextCommonInsts[i];
>>>        assert(Pos != SameTails[i].getBlock()->end() &&
>>>            "Reached BB end within common tail");
>>> -      while (Pos->isDebugValue()) {
>>> +      while (!countsAsInstruction(*Pos)) {
>>>          ++Pos;
>>>          assert(Pos != SameTails[i].getBlock()->end() &&
>>>              "Reached BB end within common tail");
>>>
>>> Added: llvm/trunk/lib/CodeGen/CFIInstrInserter.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/CFIInstrInserter.cpp?rev=330706&view=auto
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/CodeGen/CFIInstrInserter.cpp (added)
>>> +++ llvm/trunk/lib/CodeGen/CFIInstrInserter.cpp Tue Apr 24 03:32:08 2018
>>> @@ -0,0 +1,308 @@
>>> +//===------ CFIInstrInserter.cpp - Insert additional CFI instructions
>>> -----===//
>>> +//
>>> +//                     The LLVM Compiler Infrastructure
>>> +//
>>> +// This file is distributed under the University of Illinois Open Source
>>> +// License. See LICENSE.TXT for details.
>>> +//
>>>
>>> +//===----------------------------------------------------------------------===//
>>> +//
>>> +/// \file This pass verifies incoming and outgoing CFA information of
>>> basic
>>> +/// blocks. CFA information is information about offset and register set
>>> by CFI
>>> +/// directives, valid at the start and end of a basic block. This pass
>>> checks
>>> +/// that outgoing information of predecessors matches incoming
>>> information of
>>> +/// their successors. Then it checks if blocks have correct CFA
>>> calculation rule
>>> +/// set and inserts additional CFI instruction at their beginnings if
>>> they
>>> +/// don't. CFI instructions are inserted if basic blocks have incorrect
>>> offset
>>> +/// or register set by previous blocks, as a result of a non-linear
>>> layout of
>>> +/// blocks in a function.
>>>
>>> +//===----------------------------------------------------------------------===//
>>> +
>>> +#include "llvm/CodeGen/MachineFunctionPass.h"
>>> +#include "llvm/CodeGen/MachineInstrBuilder.h"
>>> +#include "llvm/CodeGen/MachineModuleInfo.h"
>>> +#include "llvm/CodeGen/Passes.h"
>>> +#include "llvm/CodeGen/TargetFrameLowering.h"
>>> +#include "llvm/CodeGen/TargetInstrInfo.h"
>>> +#include "llvm/CodeGen/TargetSubtargetInfo.h"
>>> +#include "llvm/Target/TargetMachine.h"
>>> +using namespace llvm;
>>> +
>>> +namespace {
>>> +class CFIInstrInserter : public MachineFunctionPass {
>>> + public:
>>> +  static char ID;
>>> +
>>> +  CFIInstrInserter() : MachineFunctionPass(ID) {
>>> +    initializeCFIInstrInserterPass(*PassRegistry::getPassRegistry());
>>> +  }
>>> +
>>> +  void getAnalysisUsage(AnalysisUsage &AU) const override {
>>> +    AU.setPreservesAll();
>>> +    MachineFunctionPass::getAnalysisUsage(AU);
>>> +  }
>>> +
>>> +  bool runOnMachineFunction(MachineFunction &MF) override {
>>> +    if (!MF.getMMI().hasDebugInfo() &&
>>> +        !MF.getFunction().needsUnwindTableEntry())
>>> +      return false;
>>> +
>>> +    MBBVector.resize(MF.getNumBlockIDs());
>>> +    calculateCFAInfo(MF);
>>> +#ifndef NDEBUG
>>> +    if (unsigned ErrorNum = verify(MF))
>>> +      report_fatal_error("Found " + Twine(ErrorNum) +
>>> +                         " in/out CFI information errors.");
>>> +#endif
>>> +    bool insertedCFI = insertCFIInstrs(MF);
>>> +    MBBVector.clear();
>>> +    return insertedCFI;
>>> +  }
>>> +
>>> + private:
>>> +  struct MBBCFAInfo {
>>> +    MachineBasicBlock *MBB;
>>> +    /// Value of cfa offset valid at basic block entry.
>>> +    int IncomingCFAOffset = -1;
>>> +    /// Value of cfa offset valid at basic block exit.
>>> +    int OutgoingCFAOffset = -1;
>>> +    /// Value of cfa register valid at basic block entry.
>>> +    unsigned IncomingCFARegister = 0;
>>> +    /// Value of cfa register valid at basic block exit.
>>> +    unsigned OutgoingCFARegister = 0;
>>> +    /// If in/out cfa offset and register values for this block have
>>> already
>>> +    /// been set or not.
>>> +    bool Processed = false;
>>> +  };
>>> +
>>> +  /// Contains cfa offset and register values valid at entry and exit of
>>> basic
>>> +  /// blocks.
>>> +  std::vector<MBBCFAInfo> MBBVector;
>>> +
>>> +  /// Calculate cfa offset and register values valid at entry and exit
>>> for all
>>> +  /// basic blocks in a function.
>>> +  void calculateCFAInfo(MachineFunction &MF);
>>> +  /// Calculate cfa offset and register values valid at basic block exit
>>> by
>>> +  /// checking the block for CFI instructions. Block's incoming CFA info
>>> remains
>>> +  /// the same.
>>> +  void calculateOutgoingCFAInfo(MBBCFAInfo &MBBInfo);
>>> +  /// Update in/out cfa offset and register values for successors of the
>>> basic
>>> +  /// block.
>>> +  void updateSuccCFAInfo(MBBCFAInfo &MBBInfo);
>>> +
>>> +  /// Check if incoming CFA information of a basic block matches
>>> outgoing CFA
>>> +  /// information of the previous block. If it doesn't, insert CFI
>>> instruction
>>> +  /// at the beginning of the block that corrects the CFA calculation
>>> rule for
>>> +  /// that block.
>>> +  bool insertCFIInstrs(MachineFunction &MF);
>>> +  /// Return the cfa offset value that should be set at the beginning of
>>> a MBB
>>> +  /// if needed. The negated value is needed when creating CFI
>>> instructions that
>>> +  /// set absolute offset.
>>> +  int getCorrectCFAOffset(MachineBasicBlock *MBB) {
>>> +    return -MBBVector[MBB->getNumber()].IncomingCFAOffset;
>>> +  }
>>> +
>>> +  void report(const MBBCFAInfo &Pred, const MBBCFAInfo &Succ);
>>> +  /// Go through each MBB in a function and check that outgoing offset
>>> and
>>> +  /// register of its predecessors match incoming offset and register of
>>> that
>>> +  /// MBB, as well as that incoming offset and register of its
>>> successors match
>>> +  /// outgoing offset and register of the MBB.
>>> +  unsigned verify(MachineFunction &MF);
>>> +};
>>> +}  // namespace
>>> +
>>> +char CFIInstrInserter::ID = 0;
>>> +INITIALIZE_PASS(CFIInstrInserter, "cfi-instr-inserter",
>>> +                "Check CFA info and insert CFI instructions if needed",
>>> false,
>>> +                false)
>>> +FunctionPass *llvm::createCFIInstrInserter() { return new
>>> CFIInstrInserter(); }
>>> +
>>> +void CFIInstrInserter::calculateCFAInfo(MachineFunction &MF) {
>>> +  // Initial CFA offset value i.e. the one valid at the beginning of the
>>> +  // function.
>>> +  int InitialOffset =
>>> +      MF.getSubtarget().getFrameLowering()->getInitialCFAOffset(MF);
>>> +  // Initial CFA register value i.e. the one valid at the beginning of
>>> the
>>> +  // function.
>>> +  unsigned InitialRegister =
>>> +      MF.getSubtarget().getFrameLowering()->getInitialCFARegister(MF);
>>> +
>>> +  // Initialize MBBMap.
>>> +  for (MachineBasicBlock &MBB : MF) {
>>> +    MBBCFAInfo MBBInfo;
>>> +    MBBInfo.MBB = &MBB;
>>> +    MBBInfo.IncomingCFAOffset = InitialOffset;
>>> +    MBBInfo.OutgoingCFAOffset = InitialOffset;
>>> +    MBBInfo.IncomingCFARegister = InitialRegister;
>>> +    MBBInfo.OutgoingCFARegister = InitialRegister;
>>> +    MBBVector[MBB.getNumber()] = MBBInfo;
>>> +  }
>>> +
>>> +  // Set in/out cfa info for all blocks in the function. This traversal
>>> is based
>>> +  // on the assumption that the first block in the function is the entry
>>> block
>>> +  // i.e. that it has initial cfa offset and register values as incoming
>>> CFA
>>> +  // information.
>>> +  for (MachineBasicBlock &MBB : MF) {
>>> +    if (MBBVector[MBB.getNumber()].Processed) continue;
>>> +    calculateOutgoingCFAInfo(MBBVector[MBB.getNumber()]);
>>> +    updateSuccCFAInfo(MBBVector[MBB.getNumber()]);
>>> +  }
>>> +}
>>> +
>>> +void CFIInstrInserter::calculateOutgoingCFAInfo(MBBCFAInfo &MBBInfo) {
>>> +  // Outgoing cfa offset set by the block.
>>> +  int SetOffset = MBBInfo.IncomingCFAOffset;
>>> +  // Outgoing cfa register set by the block.
>>> +  unsigned SetRegister = MBBInfo.IncomingCFARegister;
>>> +  const std::vector<MCCFIInstruction> &Instrs =
>>> +      MBBInfo.MBB->getParent()->getFrameInstructions();
>>> +
>>> +  // Determine cfa offset and register set by the block.
>>> +  for (MachineInstr &MI : *MBBInfo.MBB) {
>>> +    if (MI.isCFIInstruction()) {
>>> +      unsigned CFIIndex = MI.getOperand(0).getCFIIndex();
>>> +      const MCCFIInstruction &CFI = Instrs[CFIIndex];
>>> +      switch (CFI.getOperation()) {
>>> +      case MCCFIInstruction::OpDefCfaRegister:
>>> +        SetRegister = CFI.getRegister();
>>> +        break;
>>> +      case MCCFIInstruction::OpDefCfaOffset:
>>> +        SetOffset = CFI.getOffset();
>>> +        break;
>>> +      case MCCFIInstruction::OpAdjustCfaOffset:
>>> +        SetOffset += CFI.getOffset();
>>> +        break;
>>> +      case MCCFIInstruction::OpDefCfa:
>>> +        SetRegister = CFI.getRegister();
>>> +        SetOffset = CFI.getOffset();
>>> +        break;
>>> +      case MCCFIInstruction::OpRememberState:
>>> +        // TODO: Add support for handling cfi_remember_state.
>>> +#ifndef NDEBUG
>>> +        report_fatal_error(
>>> +            "Support for cfi_remember_state not implemented! Value of
>>> CFA "
>>> +            "may be incorrect!\n");
>>> +#endif
>>> +        break;
>>> +      case MCCFIInstruction::OpRestoreState:
>>> +        // TODO: Add support for handling cfi_restore_state.
>>> +#ifndef NDEBUG
>>> +        report_fatal_error(
>>> +            "Support for cfi_restore_state not implemented! Value of CFA
>>> may "
>>> +            "be incorrect!\n");
>>> +#endif
>>> +        break;
>>> +      // Other CFI directives do not affect CFA value.
>>> +      case MCCFIInstruction::OpSameValue:
>>> +      case MCCFIInstruction::OpOffset:
>>> +      case MCCFIInstruction::OpRelOffset:
>>> +      case MCCFIInstruction::OpEscape:
>>> +      case MCCFIInstruction::OpRestore:
>>> +      case MCCFIInstruction::OpUndefined:
>>> +      case MCCFIInstruction::OpRegister:
>>> +      case MCCFIInstruction::OpWindowSave:
>>> +      case MCCFIInstruction::OpGnuArgsSize:
>>> +        break;
>>> +      }
>>> +    }
>>> +  }
>>> +
>>> +  MBBInfo.Processed = true;
>>> +
>>> +  // Update outgoing CFA info.
>>> +  MBBInfo.OutgoingCFAOffset = SetOffset;
>>> +  MBBInfo.OutgoingCFARegister = SetRegister;
>>> +}
>>> +
>>> +void CFIInstrInserter::updateSuccCFAInfo(MBBCFAInfo &MBBInfo) {
>>> +  for (MachineBasicBlock *Succ : MBBInfo.MBB->successors()) {
>>> +    MBBCFAInfo &SuccInfo = MBBVector[Succ->getNumber()];
>>> +    if (SuccInfo.Processed) continue;
>>> +    SuccInfo.IncomingCFAOffset = MBBInfo.OutgoingCFAOffset;
>>> +    SuccInfo.IncomingCFARegister = MBBInfo.OutgoingCFARegister;
>>> +    calculateOutgoingCFAInfo(SuccInfo);
>>> +    updateSuccCFAInfo(SuccInfo);
>>> +  }
>>> +}
>>> +
>>> +bool CFIInstrInserter::insertCFIInstrs(MachineFunction &MF) {
>>> +  const MBBCFAInfo *PrevMBBInfo = &MBBVector[MF.front().getNumber()];
>>> +  const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
>>> +  bool InsertedCFIInstr = false;
>>> +
>>> +  for (MachineBasicBlock &MBB : MF) {
>>> +    // Skip the first MBB in a function
>>> +    if (MBB.getNumber() == MF.front().getNumber()) continue;
>>> +
>>> +    const MBBCFAInfo &MBBInfo = MBBVector[MBB.getNumber()];
>>> +    auto MBBI = MBBInfo.MBB->begin();
>>> +    DebugLoc DL = MBBInfo.MBB->findDebugLoc(MBBI);
>>> +
>>> +    if (PrevMBBInfo->OutgoingCFAOffset != MBBInfo.IncomingCFAOffset) {
>>> +      // If both outgoing offset and register of a previous block don't match
>>> +      // incoming offset and register of this block, add a def_cfa instruction
>>> +      // with the correct offset and register for this block.
>>> +      if (PrevMBBInfo->OutgoingCFARegister != MBBInfo.IncomingCFARegister) {
>>> +        unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createDefCfa(
>>> +            nullptr, MBBInfo.IncomingCFARegister, getCorrectCFAOffset(&MBB)));
>>> +        BuildMI(*MBBInfo.MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
>>> +            .addCFIIndex(CFIIndex);
>>> +        // If outgoing offset of a previous block doesn't match incoming offset
>>> +        // of this block, add a def_cfa_offset instruction with the correct
>>> +        // offset for this block.
>>> +      } else {
>>> +        unsigned CFIIndex =
>>> +            MF.addFrameInst(MCCFIInstruction::createDefCfaOffset(
>>> +                nullptr, getCorrectCFAOffset(&MBB)));
>>> +        BuildMI(*MBBInfo.MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
>>> +            .addCFIIndex(CFIIndex);
>>> +      }
>>> +      InsertedCFIInstr = true;
>>> +      // If outgoing register of a previous block doesn't match incoming
>>> +      // register of this block, add a def_cfa_register instruction with the
>>> +      // correct register for this block.
>>> +    } else if (PrevMBBInfo->OutgoingCFARegister !=
>>> +               MBBInfo.IncomingCFARegister) {
>>> +      unsigned CFIIndex =
>>> +          MF.addFrameInst(MCCFIInstruction::createDefCfaRegister(
>>> +              nullptr, MBBInfo.IncomingCFARegister));
>>> +      BuildMI(*MBBInfo.MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
>>> +          .addCFIIndex(CFIIndex);
>>> +      InsertedCFIInstr = true;
>>> +    }
>>> +    PrevMBBInfo = &MBBInfo;
>>> +  }
>>> +  return InsertedCFIInstr;
>>> +}
>>> +
>>> +void CFIInstrInserter::report(const MBBCFAInfo &Pred,
>>> +                              const MBBCFAInfo &Succ) {
>>> +  errs() << "*** Inconsistent CFA register and/or offset between pred and succ "
>>> +            "***\n";
>>> +  errs() << "Pred: " << Pred.MBB->getName()
>>> +         << " outgoing CFA Reg:" << Pred.OutgoingCFARegister << "\n";
>>> +  errs() << "Pred: " << Pred.MBB->getName()
>>> +         << " outgoing CFA Offset:" << Pred.OutgoingCFAOffset << "\n";
>>> +  errs() << "Succ: " << Succ.MBB->getName()
>>> +         << " incoming CFA Reg:" << Succ.IncomingCFARegister << "\n";
>>> +  errs() << "Succ: " << Succ.MBB->getName()
>>> +         << " incoming CFA Offset:" << Succ.IncomingCFAOffset << "\n";
>>> +}
>>> +
>>> +unsigned CFIInstrInserter::verify(MachineFunction &MF) {
>>> +  unsigned ErrorNum = 0;
>>> +  for (MachineBasicBlock &CurrMBB : MF) {
>>> +    const MBBCFAInfo &CurrMBBInfo = MBBVector[CurrMBB.getNumber()];
>>> +    for (MachineBasicBlock *Succ : CurrMBB.successors()) {
>>> +      const MBBCFAInfo &SuccMBBInfo = MBBVector[Succ->getNumber()];
>>> +      // Check that incoming offset and register values of successors match the
>>> +      // outgoing offset and register values of CurrMBB
>>> +      if (SuccMBBInfo.IncomingCFAOffset != CurrMBBInfo.OutgoingCFAOffset ||
>>> +          SuccMBBInfo.IncomingCFARegister != CurrMBBInfo.OutgoingCFARegister) {
>>> +        report(CurrMBBInfo, SuccMBBInfo);
>>> +        ErrorNum++;
>>> +      }
>>> +    }
>>> +  }
>>> +  return ErrorNum;
>>> +}
>>>
>>> Modified: llvm/trunk/lib/CodeGen/CMakeLists.txt
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/CMakeLists.txt?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/CodeGen/CMakeLists.txt (original)
>>> +++ llvm/trunk/lib/CodeGen/CMakeLists.txt Tue Apr 24 03:32:08 2018
>>> @@ -10,6 +10,7 @@ add_llvm_library(LLVMCodeGen
>>>    BuiltinGCs.cpp
>>>    CalcSpillWeights.cpp
>>>    CallingConvLower.cpp
>>> +  CFIInstrInserter.cpp
>>>    CodeGen.cpp
>>>    CodeGenPrepare.cpp
>>>    CriticalAntiDepBreaker.cpp
>>>
>>> Modified: llvm/trunk/lib/CodeGen/CodeGen.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/CodeGen.cpp?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/CodeGen/CodeGen.cpp (original)
>>> +++ llvm/trunk/lib/CodeGen/CodeGen.cpp Tue Apr 24 03:32:08 2018
>>> @@ -23,6 +23,7 @@ void llvm::initializeCodeGen(PassRegistr
>>>    initializeAtomicExpandPass(Registry);
>>>    initializeBranchFolderPassPass(Registry);
>>>    initializeBranchRelaxationPass(Registry);
>>> +  initializeCFIInstrInserterPass(Registry);
>>>    initializeCodeGenPreparePass(Registry);
>>>    initializeDeadMachineInstructionElimPass(Registry);
>>>    initializeDetectDeadLanesPass(Registry);
>>>
>>> Modified: llvm/trunk/lib/CodeGen/TargetFrameLoweringImpl.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/TargetFrameLoweringImpl.cpp?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/CodeGen/TargetFrameLoweringImpl.cpp (original)
>>> +++ llvm/trunk/lib/CodeGen/TargetFrameLoweringImpl.cpp Tue Apr 24 03:32:08 2018
>>> @@ -124,3 +124,12 @@ unsigned TargetFrameLowering::getStackAl
>>>
>>>    return 0;
>>>  }
>>> +
>>> +int TargetFrameLowering::getInitialCFAOffset(const MachineFunction &MF) const {
>>> +  llvm_unreachable("getInitialCFAOffset() not implemented!");
>>> +}
>>> +
>>> +unsigned TargetFrameLowering::getInitialCFARegister(const MachineFunction &MF)
>>> +    const {
>>> +  llvm_unreachable("getInitialCFARegister() not implemented!");
>>> +}
>>> \ No newline at end of file
>>>
>>> Modified: llvm/trunk/lib/Target/X86/X86FrameLowering.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86FrameLowering.cpp?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Target/X86/X86FrameLowering.cpp (original)
>>> +++ llvm/trunk/lib/Target/X86/X86FrameLowering.cpp Tue Apr 24 03:32:08 2018
>>> @@ -399,28 +399,30 @@ int X86FrameLowering::mergeSPUpdates(Mac
>>>      return 0;
>>>
>>>    MachineBasicBlock::iterator PI = doMergeWithPrevious ? std::prev(MBBI) : MBBI;
>>> -  MachineBasicBlock::iterator NI = doMergeWithPrevious ? nullptr
>>> -                                                       : std::next(MBBI);
>>> +
>>>    PI = skipDebugInstructionsBackward(PI, MBB.begin());
>>> -  if (NI != nullptr)
>>> -    NI = skipDebugInstructionsForward(NI, MBB.end());
>>> +  // It is assumed that ADD/SUB/LEA instruction is succeeded by one CFI
>>> +  // instruction, and that there are no DBG_VALUE or other instructions between
>>> +  // ADD/SUB/LEA and its corresponding CFI instruction.
>>> +  /* TODO: Add support for the case where there are multiple CFI instructions
>>> +    below the ADD/SUB/LEA, e.g.:
>>> +    ...
>>> +    add
>>> +    cfi_def_cfa_offset
>>> +    cfi_offset
>>> +    ...
>>> +  */
>>> +  if (doMergeWithPrevious && PI != MBB.begin() && PI->isCFIInstruction())
>>> +    PI = std::prev(PI);
>>>
>>>    unsigned Opc = PI->getOpcode();
>>>    int Offset = 0;
>>>
>>> -  if (!doMergeWithPrevious && NI != MBB.end() &&
>>> -      NI->getOpcode() == TargetOpcode::CFI_INSTRUCTION) {
>>> -    // Don't merge with the next instruction if it has CFI.
>>> -    return Offset;
>>> -  }
>>> -
>>>    if ((Opc == X86::ADD64ri32 || Opc == X86::ADD64ri8 ||
>>>         Opc == X86::ADD32ri || Opc == X86::ADD32ri8) &&
>>>        PI->getOperand(0).getReg() == StackPtr){
>>>      assert(PI->getOperand(1).getReg() == StackPtr);
>>> -    Offset += PI->getOperand(2).getImm();
>>> -    MBB.erase(PI);
>>> -    if (!doMergeWithPrevious) MBBI = NI;
>>> +    Offset = PI->getOperand(2).getImm();
>>>    } else if ((Opc == X86::LEA32r || Opc == X86::LEA64_32r) &&
>>>               PI->getOperand(0).getReg() == StackPtr &&
>>>               PI->getOperand(1).getReg() == StackPtr &&
>>> @@ -428,17 +430,19 @@ int X86FrameLowering::mergeSPUpdates(Mac
>>>               PI->getOperand(3).getReg() == X86::NoRegister &&
>>>               PI->getOperand(5).getReg() == X86::NoRegister) {
>>>      // For LEAs we have: def = lea SP, FI, noreg, Offset, noreg.
>>> -    Offset += PI->getOperand(4).getImm();
>>> -    MBB.erase(PI);
>>> -    if (!doMergeWithPrevious) MBBI = NI;
>>> +    Offset = PI->getOperand(4).getImm();
>>>    } else if ((Opc == X86::SUB64ri32 || Opc == X86::SUB64ri8 ||
>>>                Opc == X86::SUB32ri || Opc == X86::SUB32ri8) &&
>>>               PI->getOperand(0).getReg() == StackPtr) {
>>>      assert(PI->getOperand(1).getReg() == StackPtr);
>>> -    Offset -= PI->getOperand(2).getImm();
>>> -    MBB.erase(PI);
>>> -    if (!doMergeWithPrevious) MBBI = NI;
>>> -  }
>>> +    Offset = -PI->getOperand(2).getImm();
>>> +  } else
>>> +    return 0;
>>> +
>>> +  PI = MBB.erase(PI);
>>> +  if (PI != MBB.end() && PI->isCFIInstruction()) PI = MBB.erase(PI);
>>> +  if (!doMergeWithPrevious)
>>> +    MBBI = skipDebugInstructionsForward(PI, MBB.end());
>>>
>>>    return Offset;
>>>  }
>>> @@ -1573,6 +1577,11 @@ void X86FrameLowering::emitEpilogue(Mach
>>>    bool HasFP = hasFP(MF);
>>>    uint64_t NumBytes = 0;
>>>
>>> +  bool NeedsDwarfCFI =
>>> +      (!MF.getTarget().getTargetTriple().isOSDarwin() &&
>>> +       !MF.getTarget().getTargetTriple().isOSWindows()) &&
>>> +      (MF.getMMI().hasDebugInfo() || MF.getFunction().needsUnwindTableEntry());
>>> +
>>>    if (IsFunclet) {
>>>      assert(HasFP && "EH funclets without FP not yet implemented");
>>>      NumBytes = getWinEHFuncletFrameSize(MF);
>>> @@ -1595,6 +1604,13 @@ void X86FrameLowering::emitEpilogue(Mach
>>>      BuildMI(MBB, MBBI, DL, TII.get(Is64Bit ? X86::POP64r : X86::POP32r),
>>>              MachineFramePtr)
>>>          .setMIFlag(MachineInstr::FrameDestroy);
>>> +    if (NeedsDwarfCFI) {
>>> +      unsigned DwarfStackPtr =
>>> +          TRI->getDwarfRegNum(Is64Bit ? X86::RSP : X86::ESP, true);
>>> +      BuildCFI(MBB, MBBI, DL, MCCFIInstruction::createDefCfa(
>>> +                                  nullptr, DwarfStackPtr, -SlotSize));
>>> +      --MBBI;
>>> +    }
>>>    }
>>>
>>>    MachineBasicBlock::iterator FirstCSPop = MBBI;
>>> @@ -1658,6 +1674,11 @@ void X86FrameLowering::emitEpilogue(Mach
>>>    } else if (NumBytes) {
>>>      // Adjust stack pointer back: ESP += numbytes.
>>>      emitSPUpdate(MBB, MBBI, DL, NumBytes, /*InEpilogue=*/true);
>>> +    if (!hasFP(MF) && NeedsDwarfCFI) {
>>> +      // Define the current CFA rule to use the provided offset.
>>> +      BuildCFI(MBB, MBBI, DL, MCCFIInstruction::createDefCfaOffset(
>>> +                                  nullptr, -CSSize - SlotSize));
>>> +    }
>>>      --MBBI;
>>>    }
>>>
>>> @@ -1670,6 +1691,23 @@ void X86FrameLowering::emitEpilogue(Mach
>>>    if (NeedsWin64CFI && MF.hasWinCFI())
>>>      BuildMI(MBB, MBBI, DL, TII.get(X86::SEH_Epilogue));
>>>
>>> +  if (!hasFP(MF) && NeedsDwarfCFI) {
>>> +    MBBI = FirstCSPop;
>>> +    int64_t Offset = -CSSize - SlotSize;
>>> +    // Mark callee-saved pop instruction.
>>> +    // Define the current CFA rule to use the provided offset.
>>> +    while (MBBI != MBB.end()) {
>>> +      MachineBasicBlock::iterator PI = MBBI;
>>> +      unsigned Opc = PI->getOpcode();
>>> +      ++MBBI;
>>> +      if (Opc == X86::POP32r || Opc == X86::POP64r) {
>>> +        Offset += SlotSize;
>>> +        BuildCFI(MBB, MBBI, DL,
>>> +                 MCCFIInstruction::createDefCfaOffset(nullptr, Offset));
>>> +      }
>>> +    }
>>> +  }
>>> +
>>>    if (Terminator == MBB.end() || !isTailCallOpcode(Terminator->getOpcode())) {
>>>      // Add the return addr area delta back since we are not tail calling.
>>>      int Offset = -1 * X86FI->getTCReturnAddrDelta();
>>> @@ -2719,7 +2757,6 @@ eliminateCallFramePseudoInstr(MachineFun
>>>
>>>      // Add Amount to SP to destroy a frame, or subtract to setup.
>>>      int64_t StackAdjustment = isDestroy ? Amount : -Amount;
>>> -    int64_t CfaAdjustment = -StackAdjustment;
>>>
>>>      if (StackAdjustment) {
>>>        // Merge with any previous or following adjustment instruction. Note: the
>>> @@ -2744,6 +2781,7 @@ eliminateCallFramePseudoInstr(MachineFun
>>>        // offset to be correct at each call site, while for debugging we want
>>>        // it to be more precise.
>>>
>>> +      int64_t CfaAdjustment = -StackAdjustment;
>>>        // TODO: When not using precise CFA, we also need to adjust for the
>>>        // InternalAmt here.
>>>        if (CfaAdjustment) {
>>> @@ -2874,6 +2912,15 @@ MachineBasicBlock::iterator X86FrameLowe
>>>    return MBBI;
>>>  }
>>>
>>> +int X86FrameLowering::getInitialCFAOffset(const MachineFunction &MF) const {
>>> +  return TRI->getSlotSize();
>>> +}
>>> +
>>> +unsigned X86FrameLowering::getInitialCFARegister(const MachineFunction &MF)
>>> +    const {
>>> +  return TRI->getDwarfRegNum(StackPtr, true);
>>> +}
>>> +
>>>  namespace {
>>>  // Struct used by orderFrameObjects to help sort the stack objects.
>>>  struct X86FrameSortingObject {
>>>
>>> Modified: llvm/trunk/lib/Target/X86/X86FrameLowering.h
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86FrameLowering.h?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Target/X86/X86FrameLowering.h (original)
>>> +++ llvm/trunk/lib/Target/X86/X86FrameLowering.h Tue Apr 24 03:32:08 2018
>>> @@ -168,6 +168,10 @@ public:
>>>                                MachineBasicBlock::iterator MBBI,
>>>                                const DebugLoc &DL, bool RestoreSP = false) const;
>>>
>>> +  int getInitialCFAOffset(const MachineFunction &MF) const override;
>>> +
>>> +  unsigned getInitialCFARegister(const MachineFunction &MF) const override;
>>> +
>>>  private:
>>>    uint64_t calculateMaxStackAlign(const MachineFunction &MF) const;
>>>
>>>
>>> Modified: llvm/trunk/lib/Target/X86/X86TargetMachine.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetMachine.cpp?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Target/X86/X86TargetMachine.cpp (original)
>>> +++ llvm/trunk/lib/Target/X86/X86TargetMachine.cpp Tue Apr 24 03:32:08 2018
>>> @@ -495,4 +495,10 @@ void X86PassConfig::addPreEmitPass() {
>>>
>>>  void X86PassConfig::addPreEmitPass2() {
>>>    addPass(createX86RetpolineThunksPass());
>>> +  // Verify basic block incoming and outgoing cfa offset and register values and
>>> +  // correct CFA calculation rule where needed by inserting appropriate CFI
>>> +  // instructions.
>>> +  const Triple &TT = TM->getTargetTriple();
>>> +  if (!TT.isOSDarwin() && !TT.isOSWindows())
>>> +    addPass(createCFIInstrInserter());
>>>  }
>>>
>>> Modified: llvm/trunk/test/CodeGen/AArch64/taildup-cfi.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/taildup-cfi.ll?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/AArch64/taildup-cfi.ll (original)
>>> +++ llvm/trunk/test/CodeGen/AArch64/taildup-cfi.ll Tue Apr 24 03:32:08 2018
>>> @@ -2,8 +2,6 @@
>>>  ; RUN: llc -mtriple=arm64-unknown-linux-gnu -debug-only=tailduplication %s -o /dev/null 2>&1 | FileCheck %s --check-prefix=LINUX
>>>  ; RUN: llc -mtriple=arm64-apple-darwin -debug-only=tailduplication %s -o /dev/null 2>&1 | FileCheck %s --check-prefix=DARWIN
>>>
>>> -; ModuleID = 'taildup-cfi.c'
>>> -source_filename = "taildup-cfi.c"
>>>  target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>>>
>>>  @g = common local_unnamed_addr global i32 0, align 4
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/2009-03-16-PHIElimInLPad.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2009-03-16-PHIElimInLPad.ll?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/2009-03-16-PHIElimInLPad.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/2009-03-16-PHIElimInLPad.ll Tue Apr 24 03:32:08 2018
>>> @@ -23,6 +23,7 @@ lpad:         ; preds = %cont, %entry
>>>  }
>>>
>>>  ; CHECK: lpad
>>> +; CHECK-NEXT: .cfi_def_cfa_offset 16
>>>  ; CHECK-NEXT: Ltmp
>>>
>>>  declare i32 @__gxx_personality_v0(...)
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/2011-10-19-widen_vselect.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2011-10-19-widen_vselect.ll?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/2011-10-19-widen_vselect.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/2011-10-19-widen_vselect.ll Tue Apr 24 03:32:08 2018
>>> @@ -87,6 +87,7 @@ define void @full_test() {
>>>  ; X32-NEXT:    movss %xmm4, {{[0-9]+}}(%esp)
>>>  ; X32-NEXT:    movss %xmm0, {{[0-9]+}}(%esp)
>>>  ; X32-NEXT:    addl $60, %esp
>>> +; X32-NEXT:    .cfi_def_cfa_offset 4
>>>  ; X32-NEXT:    retl
>>>  ;
>>>  ; X64-LABEL: full_test:
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/GlobalISel/brcond.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/GlobalISel/brcond.ll?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/GlobalISel/brcond.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/GlobalISel/brcond.ll Tue Apr 24 03:32:08 2018
>>> @@ -36,6 +36,7 @@ define i32 @test_1(i32 %a, i32 %b, i32 %
>>>  ; X32-NEXT:    movl %eax, (%esp)
>>>  ; X32-NEXT:    movl (%esp), %eax
>>>  ; X32-NEXT:    popl %ecx
>>> +; X32-NEXT:    .cfi_def_cfa_offset 4
>>>  ; X32-NEXT:    retl
>>>  entry:
>>>    %retval = alloca i32, align 4
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/GlobalISel/callingconv.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/GlobalISel/callingconv.ll?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/GlobalISel/callingconv.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/GlobalISel/callingconv.ll Tue Apr 24 03:32:08 2018
>>> @@ -117,6 +117,7 @@ define <8 x i32> @test_v8i32_args(<8 x i
>>>  ; X32-NEXT:    movups {{[0-9]+}}(%esp), %xmm1
>>>  ; X32-NEXT:    movaps %xmm2, %xmm0
>>>  ; X32-NEXT:    addl $12, %esp
>>> +; X32-NEXT:    .cfi_def_cfa_offset 4
>>>  ; X32-NEXT:    retl
>>>  ;
>>>  ; X64-LABEL: test_v8i32_args:
>>> @@ -135,6 +136,7 @@ define void @test_trivial_call() {
>>>  ; X32-NEXT:    .cfi_def_cfa_offset 16
>>>  ; X32-NEXT:    calll trivial_callee
>>>  ; X32-NEXT:    addl $12, %esp
>>> +; X32-NEXT:    .cfi_def_cfa_offset 4
>>>  ; X32-NEXT:    retl
>>>  ;
>>>  ; X64-LABEL: test_trivial_call:
>>> @@ -143,6 +145,7 @@ define void @test_trivial_call() {
>>>  ; X64-NEXT:    .cfi_def_cfa_offset 16
>>>  ; X64-NEXT:    callq trivial_callee
>>>  ; X64-NEXT:    popq %rax
>>> +; X64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; X64-NEXT:    retq
>>>    call void @trivial_callee()
>>>    ret void
>>> @@ -160,6 +163,7 @@ define void @test_simple_arg_call(i32 %i
>>>  ; X32-NEXT:    movl %eax, {{[0-9]+}}(%esp)
>>>  ; X32-NEXT:    calll simple_arg_callee
>>>  ; X32-NEXT:    addl $12, %esp
>>> +; X32-NEXT:    .cfi_def_cfa_offset 4
>>>  ; X32-NEXT:    retl
>>>  ;
>>>  ; X64-LABEL: test_simple_arg_call:
>>> @@ -171,6 +175,7 @@ define void @test_simple_arg_call(i32 %i
>>>  ; X64-NEXT:    movl %eax, %esi
>>>  ; X64-NEXT:    callq simple_arg_callee
>>>  ; X64-NEXT:    popq %rax
>>> +; X64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; X64-NEXT:    retq
>>>    call void @simple_arg_callee(i32 %in1, i32 %in0)
>>>    ret void
>>> @@ -193,6 +198,7 @@ define void @test_simple_arg8_call(i32 %
>>>  ; X32-NEXT:    movl %eax, {{[0-9]+}}(%esp)
>>>  ; X32-NEXT:    calll simple_arg8_callee
>>>  ; X32-NEXT:    addl $44, %esp
>>> +; X32-NEXT:    .cfi_def_cfa_offset 4
>>>  ; X32-NEXT:    retl
>>>  ;
>>>  ; X64-LABEL: test_simple_arg8_call:
>>> @@ -208,6 +214,7 @@ define void @test_simple_arg8_call(i32 %
>>>  ; X64-NEXT:    movl %edi, %r9d
>>>  ; X64-NEXT:    callq simple_arg8_callee
>>>  ; X64-NEXT:    addq $24, %rsp
>>> +; X64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; X64-NEXT:    retq
>>>    call void @simple_arg8_callee(i32 %in0, i32 %in0, i32 %in0, i32 %in0,i32 %in0, i32 %in0, i32 %in0, i32 %in0)
>>>    ret void
>>> @@ -224,6 +231,7 @@ define i32 @test_simple_return_callee()
>>>  ; X32-NEXT:    calll simple_return_callee
>>>  ; X32-NEXT:    addl %eax, %eax
>>>  ; X32-NEXT:    addl $12, %esp
>>> +; X32-NEXT:    .cfi_def_cfa_offset 4
>>>  ; X32-NEXT:    retl
>>>  ;
>>>  ; X64-LABEL: test_simple_return_callee:
>>> @@ -234,6 +242,7 @@ define i32 @test_simple_return_callee()
>>>  ; X64-NEXT:    callq simple_return_callee
>>>  ; X64-NEXT:    addl %eax, %eax
>>>  ; X64-NEXT:    popq %rcx
>>> +; X64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; X64-NEXT:    retq
>>>    %call = call i32 @simple_return_callee(i32 5)
>>>    %r = add i32 %call, %call
>>> @@ -254,6 +263,7 @@ define <8 x i32> @test_split_return_call
>>>  ; X32-NEXT:    paddd (%esp), %xmm0 # 16-byte Folded Reload
>>>  ; X32-NEXT:    paddd {{[0-9]+}}(%esp), %xmm1 # 16-byte Folded Reload
>>>  ; X32-NEXT:    addl $44, %esp
>>> +; X32-NEXT:    .cfi_def_cfa_offset 4
>>>  ; X32-NEXT:    retl
>>>  ;
>>>  ; X64-LABEL: test_split_return_callee:
>>> @@ -268,6 +278,7 @@ define <8 x i32> @test_split_return_call
>>>  ; X64-NEXT:    paddd (%rsp), %xmm0 # 16-byte Folded Reload
>>>  ; X64-NEXT:    paddd {{[0-9]+}}(%rsp), %xmm1 # 16-byte Folded Reload
>>>  ; X64-NEXT:    addq $40, %rsp
>>> +; X64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; X64-NEXT:    retq
>>>    %call = call <8 x i32> @split_return_callee(<8 x i32> %arg2)
>>>    %r = add <8 x i32> %arg1, %call
>>> @@ -281,6 +292,7 @@ define void @test_indirect_call(void()*
>>>  ; X32-NEXT:    .cfi_def_cfa_offset 16
>>>  ; X32-NEXT:    calll *{{[0-9]+}}(%esp)
>>>  ; X32-NEXT:    addl $12, %esp
>>> +; X32-NEXT:    .cfi_def_cfa_offset 4
>>>  ; X32-NEXT:    retl
>>>  ;
>>>  ; X64-LABEL: test_indirect_call:
>>> @@ -289,6 +301,7 @@ define void @test_indirect_call(void()*
>>>  ; X64-NEXT:    .cfi_def_cfa_offset 16
>>>  ; X64-NEXT:    callq *%rdi
>>>  ; X64-NEXT:    popq %rax
>>> +; X64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; X64-NEXT:    retq
>>>    call void %func()
>>>    ret void
>>> @@ -317,8 +330,11 @@ define void @test_abi_exts_call(i8* %add
>>>  ; X32-NEXT:    movl %esi, (%esp)
>>>  ; X32-NEXT:    calll take_char
>>>  ; X32-NEXT:    addl $4, %esp
>>> +; X32-NEXT:    .cfi_def_cfa_offset 12
>>>  ; X32-NEXT:    popl %esi
>>> +; X32-NEXT:    .cfi_def_cfa_offset 8
>>>  ; X32-NEXT:    popl %ebx
>>> +; X32-NEXT:    .cfi_def_cfa_offset 4
>>>  ; X32-NEXT:    retl
>>>  ;
>>>  ; X64-LABEL: test_abi_exts_call:
>>> @@ -335,6 +351,7 @@ define void @test_abi_exts_call(i8* %add
>>>  ; X64-NEXT:    movl %ebx, %edi
>>>  ; X64-NEXT:    callq take_char
>>>  ; X64-NEXT:    popq %rbx
>>> +; X64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; X64-NEXT:    retq
>>>    %val = load i8, i8* %addr
>>>    call void @take_char(i8 %val)
>>> @@ -357,6 +374,7 @@ define void @test_variadic_call_1(i8** %
>>>  ; X32-NEXT:    movl %ecx, {{[0-9]+}}(%esp)
>>>  ; X32-NEXT:    calll variadic_callee
>>>  ; X32-NEXT:    addl $12, %esp
>>> +; X32-NEXT:    .cfi_def_cfa_offset 4
>>>  ; X32-NEXT:    retl
>>>  ;
>>>  ; X64-LABEL: test_variadic_call_1:
>>> @@ -368,6 +386,7 @@ define void @test_variadic_call_1(i8** %
>>>  ; X64-NEXT:    movb $0, %al
>>>  ; X64-NEXT:    callq variadic_callee
>>>  ; X64-NEXT:    popq %rax
>>> +; X64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; X64-NEXT:    retq
>>>
>>>    %addr = load i8*, i8** %addr_ptr
>>> @@ -393,6 +412,7 @@ define void @test_variadic_call_2(i8** %
>>>  ; X32-NEXT:    movl %ecx, 4(%eax)
>>>  ; X32-NEXT:    calll variadic_callee
>>>  ; X32-NEXT:    addl $12, %esp
>>> +; X32-NEXT:    .cfi_def_cfa_offset 4
>>>  ; X32-NEXT:    retl
>>>  ;
>>>  ; X64-LABEL: test_variadic_call_2:
>>> @@ -405,6 +425,7 @@ define void @test_variadic_call_2(i8** %
>>>  ; X64-NEXT:    movb $1, %al
>>>  ; X64-NEXT:    callq variadic_callee
>>>  ; X64-NEXT:    popq %rax
>>> +; X64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; X64-NEXT:    retq
>>>
>>>    %addr = load i8*, i8** %addr_ptr
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/GlobalISel/frameIndex.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/GlobalISel/frameIndex.ll?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/GlobalISel/frameIndex.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/GlobalISel/frameIndex.ll Tue Apr 24 03:32:08 2018
>>> @@ -18,6 +18,7 @@ define i32* @allocai32() {
>>>  ; X32-NEXT:    .cfi_def_cfa_offset 8
>>>  ; X32-NEXT:    movl %esp, %eax
>>>  ; X32-NEXT:    popl %ecx
>>> +; X32-NEXT:    .cfi_def_cfa_offset 4
>>>  ; X32-NEXT:    retl
>>>  ;
>>>  ; X32ABI-LABEL: allocai32:
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/O0-pipeline.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/O0-pipeline.ll?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/O0-pipeline.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/O0-pipeline.ll Tue Apr 24 03:32:08 2018
>>> @@ -61,6 +61,7 @@
>>>  ; CHECK-NEXT:       Insert XRay ops
>>>  ; CHECK-NEXT:       Implement the 'patchable-function' attribute
>>>  ; CHECK-NEXT:       X86 Retpoline Thunks
>>> +; CHECK-NEXT:       Check CFA info and insert CFI instructions if needed
>>>  ; CHECK-NEXT:       Lazy Machine Block Frequency Analysis
>>>  ; CHECK-NEXT:       Machine Optimization Remark Emitter
>>>  ; CHECK-NEXT:       X86 Assembly Printer
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/O3-pipeline.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/O3-pipeline.ll?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/O3-pipeline.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/O3-pipeline.ll Tue Apr 24 03:32:08 2018
>>> @@ -160,6 +160,7 @@
>>>  ; CHECK-NEXT:       Insert XRay ops
>>>  ; CHECK-NEXT:       Implement the 'patchable-function' attribute
>>>  ; CHECK-NEXT:       X86 Retpoline Thunks
>>> +; CHECK-NEXT:       Check CFA info and insert CFI instructions if needed
>>>  ; CHECK-NEXT:       Lazy Machine Block Frequency Analysis
>>>  ; CHECK-NEXT:       Machine Optimization Remark Emitter
>>>  ; CHECK-NEXT:       X86 Assembly Printer
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/TruncAssertZext.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/TruncAssertZext.ll?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/TruncAssertZext.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/TruncAssertZext.ll Tue Apr 24 03:32:08 2018
>>> @@ -25,6 +25,7 @@ define i64 @main() {
>>>  ; CHECK-NEXT:    subq %rcx, %rax
>>>  ; CHECK-NEXT:    shrq $32, %rax
>>>  ; CHECK-NEXT:    popq %rcx
>>> +; CHECK-NEXT:    .cfi_def_cfa_offset 8
>>>  ; CHECK-NEXT:    retq
>>>    %b = call i64 @foo()
>>>    %or = and i64 %b, 18446744069414584575 ; this is 0xffffffff000000ff
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/avoid-sfb.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avoid-sfb.ll?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/avoid-sfb.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/avoid-sfb.ll Tue Apr 24 03:32:08 2018
>>> @@ -854,10 +854,15 @@ define void @test_limit_all(%struct.S* n
>>>  ; CHECK-NEXT:    movups (%rbx), %xmm0
>>>  ; CHECK-NEXT:    movups %xmm0, (%r12)
>>>  ; CHECK-NEXT:    popq %rbx
>>> +; CHECK-NEXT:    .cfi_def_cfa_offset 40
>>>  ; CHECK-NEXT:    popq %r12
>>> +; CHECK-NEXT:    .cfi_def_cfa_offset 32
>>>  ; CHECK-NEXT:    popq %r14
>>> +; CHECK-NEXT:    .cfi_def_cfa_offset 24
>>>  ; CHECK-NEXT:    popq %r15
>>> +; CHECK-NEXT:    .cfi_def_cfa_offset 16
>>>  ; CHECK-NEXT:    popq %rbp
>>> +; CHECK-NEXT:    .cfi_def_cfa_offset 8
>>>  ; CHECK-NEXT:    retq
>>>  ;
>>>  ; DISABLED-LABEL: test_limit_all:
>>> @@ -896,10 +901,15 @@ define void @test_limit_all(%struct.S* n
>>>  ; DISABLED-NEXT:    movups (%rbx), %xmm0
>>>  ; DISABLED-NEXT:    movups %xmm0, (%r12)
>>>  ; DISABLED-NEXT:    popq %rbx
>>> +; DISABLED-NEXT:    .cfi_def_cfa_offset 40
>>>  ; DISABLED-NEXT:    popq %r12
>>> +; DISABLED-NEXT:    .cfi_def_cfa_offset 32
>>>  ; DISABLED-NEXT:    popq %r14
>>> +; DISABLED-NEXT:    .cfi_def_cfa_offset 24
>>>  ; DISABLED-NEXT:    popq %r15
>>> +; DISABLED-NEXT:    .cfi_def_cfa_offset 16
>>>  ; DISABLED-NEXT:    popq %rbp
>>> +; DISABLED-NEXT:    .cfi_def_cfa_offset 8
>>>  ; DISABLED-NEXT:    retq
>>>  ;
>>>  ; CHECK-AVX2-LABEL: test_limit_all:
>>> @@ -938,10 +948,15 @@ define void @test_limit_all(%struct.S* n
>>>  ; CHECK-AVX2-NEXT:    vmovups (%rbx), %xmm0
>>>  ; CHECK-AVX2-NEXT:    vmovups %xmm0, (%r12)
>>>  ; CHECK-AVX2-NEXT:    popq %rbx
>>> +; CHECK-AVX2-NEXT:    .cfi_def_cfa_offset 40
>>>  ; CHECK-AVX2-NEXT:    popq %r12
>>> +; CHECK-AVX2-NEXT:    .cfi_def_cfa_offset 32
>>>  ; CHECK-AVX2-NEXT:    popq %r14
>>> +; CHECK-AVX2-NEXT:    .cfi_def_cfa_offset 24
>>>  ; CHECK-AVX2-NEXT:    popq %r15
>>> +; CHECK-AVX2-NEXT:    .cfi_def_cfa_offset 16
>>>  ; CHECK-AVX2-NEXT:    popq %rbp
>>> +; CHECK-AVX2-NEXT:    .cfi_def_cfa_offset 8
>>>  ; CHECK-AVX2-NEXT:    retq
>>>  ;
>>>  ; CHECK-AVX512-LABEL: test_limit_all:
>>> @@ -980,10 +995,15 @@ define void @test_limit_all(%struct.S* n
>>>  ; CHECK-AVX512-NEXT:    vmovups (%rbx), %xmm0
>>>  ; CHECK-AVX512-NEXT:    vmovups %xmm0, (%r12)
>>>  ; CHECK-AVX512-NEXT:    popq %rbx
>>> +; CHECK-AVX512-NEXT:    .cfi_def_cfa_offset 40
>>>  ; CHECK-AVX512-NEXT:    popq %r12
>>> +; CHECK-AVX512-NEXT:    .cfi_def_cfa_offset 32
>>>  ; CHECK-AVX512-NEXT:    popq %r14
>>> +; CHECK-AVX512-NEXT:    .cfi_def_cfa_offset 24
>>>  ; CHECK-AVX512-NEXT:    popq %r15
>>> +; CHECK-AVX512-NEXT:    .cfi_def_cfa_offset 16
>>>  ; CHECK-AVX512-NEXT:    popq %rbp
>>> +; CHECK-AVX512-NEXT:    .cfi_def_cfa_offset 8
>>>  ; CHECK-AVX512-NEXT:    retq
>>>  entry:
>>>    %d = getelementptr inbounds %struct.S, %struct.S* %s1, i64 0, i32 3
>>> @@ -1047,10 +1067,15 @@ define void @test_limit_one_pred(%struct
>>>  ; CHECK-NEXT:    movl 12(%rbx), %eax
>>>  ; CHECK-NEXT:    movl %eax, 12(%r14)
>>>  ; CHECK-NEXT:    addq $8, %rsp
>>> +; CHECK-NEXT:    .cfi_def_cfa_offset 40
>>>  ; CHECK-NEXT:    popq %rbx
>>> +; CHECK-NEXT:    .cfi_def_cfa_offset 32
>>>  ; CHECK-NEXT:    popq %r12
>>> +; CHECK-NEXT:    .cfi_def_cfa_offset 24
>>>  ; CHECK-NEXT:    popq %r14
>>> +; CHECK-NEXT:    .cfi_def_cfa_offset 16
>>>  ; CHECK-NEXT:    popq %r15
>>> +; CHECK-NEXT:    .cfi_def_cfa_offset 8
>>>  ; CHECK-NEXT:    retq
>>>  ;
>>>  ; DISABLED-LABEL: test_limit_one_pred:
>>> @@ -1086,10 +1111,15 @@ define void @test_limit_one_pred(%struct
>>>  ; DISABLED-NEXT:    movups (%rbx), %xmm0
>>>  ; DISABLED-NEXT:    movups %xmm0, (%r12)
>>>  ; DISABLED-NEXT:    addq $8, %rsp
>>> +; DISABLED-NEXT:    .cfi_def_cfa_offset 40
>>>  ; DISABLED-NEXT:    popq %rbx
>>> +; DISABLED-NEXT:    .cfi_def_cfa_offset 32
>>>  ; DISABLED-NEXT:    popq %r12
>>> +; DISABLED-NEXT:    .cfi_def_cfa_offset 24
>>>  ; DISABLED-NEXT:    popq %r14
>>> +; DISABLED-NEXT:    .cfi_def_cfa_offset 16
>>>  ; DISABLED-NEXT:    popq %r15
>>> +; DISABLED-NEXT:    .cfi_def_cfa_offset 8
>>>  ; DISABLED-NEXT:    retq
>>>  ;
>>>  ; CHECK-AVX2-LABEL: test_limit_one_pred:
>>> @@ -1129,10 +1159,15 @@ define void @test_limit_one_pred(%struct
>>>  ; CHECK-AVX2-NEXT:    movl 12(%rbx), %eax
>>>  ; CHECK-AVX2-NEXT:    movl %eax, 12(%r14)
>>>  ; CHECK-AVX2-NEXT:    addq $8, %rsp
>>> +; CHECK-AVX2-NEXT:    .cfi_def_cfa_offset 40
>>>  ; CHECK-AVX2-NEXT:    popq %rbx
>>> +; CHECK-AVX2-NEXT:    .cfi_def_cfa_offset 32
>>>  ; CHECK-AVX2-NEXT:    popq %r12
>>> +; CHECK-AVX2-NEXT:    .cfi_def_cfa_offset 24
>>>  ; CHECK-AVX2-NEXT:    popq %r14
>>> +; CHECK-AVX2-NEXT:    .cfi_def_cfa_offset 16
>>>  ; CHECK-AVX2-NEXT:    popq %r15
>>> +; CHECK-AVX2-NEXT:    .cfi_def_cfa_offset 8
>>>  ; CHECK-AVX2-NEXT:    retq
>>>  ;
>>>  ; CHECK-AVX512-LABEL: test_limit_one_pred:
>>> @@ -1172,10 +1207,15 @@ define void @test_limit_one_pred(%struct
>>>  ; CHECK-AVX512-NEXT:    movl 12(%rbx), %eax
>>>  ; CHECK-AVX512-NEXT:    movl %eax, 12(%r14)
>>>  ; CHECK-AVX512-NEXT:    addq $8, %rsp
>>> +; CHECK-AVX512-NEXT:    .cfi_def_cfa_offset 40
>>>  ; CHECK-AVX512-NEXT:    popq %rbx
>>> +; CHECK-AVX512-NEXT:    .cfi_def_cfa_offset 32
>>>  ; CHECK-AVX512-NEXT:    popq %r12
>>> +; CHECK-AVX512-NEXT:    .cfi_def_cfa_offset 24
>>>  ; CHECK-AVX512-NEXT:    popq %r14
>>> +; CHECK-AVX512-NEXT:    .cfi_def_cfa_offset 16
>>>  ; CHECK-AVX512-NEXT:    popq %r15
>>> +; CHECK-AVX512-NEXT:    .cfi_def_cfa_offset 8
>>>  ; CHECK-AVX512-NEXT:    retq
>>>  entry:
>>>    %d = getelementptr inbounds %struct.S, %struct.S* %s1, i64 0, i32 3
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll Tue Apr 24
>>> 03:32:08 2018
>>> @@ -24,6 +24,7 @@ define zeroext i16 @test_mm512_kunpackb(
>>>  ; X32-NEXT:    movzwl %ax, %eax
>>>  ; X32-NEXT:    movl %ebp, %esp
>>>  ; X32-NEXT:    popl %ebp
>>> +; X32-NEXT:    .cfi_def_cfa %esp, 4
>>>  ; X32-NEXT:    vzeroupper
>>>  ; X32-NEXT:    retl
>>>  ;
>>> @@ -75,6 +76,7 @@ define i32 @test_mm512_kortestc(<8 x i64
>>>  ; X32-NEXT:    movzbl %al, %eax
>>>  ; X32-NEXT:    movl %ebp, %esp
>>>  ; X32-NEXT:    popl %ebp
>>> +; X32-NEXT:    .cfi_def_cfa %esp, 4
>>>  ; X32-NEXT:    vzeroupper
>>>  ; X32-NEXT:    retl
>>>  ;
>>> @@ -123,6 +125,7 @@ define i32 @test_mm512_kortestz(<8 x i64
>>>  ; X32-NEXT:    movzbl %al, %eax
>>>  ; X32-NEXT:    movl %ebp, %esp
>>>  ; X32-NEXT:    popl %ebp
>>> +; X32-NEXT:    .cfi_def_cfa %esp, 4
>>>  ; X32-NEXT:    vzeroupper
>>>  ; X32-NEXT:    retl
>>>  ;
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/avx512-regcall-Mask.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-regcall-Mask.ll?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/avx512-regcall-Mask.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/avx512-regcall-Mask.ll Tue Apr 24
>>> 03:32:08 2018
>>> @@ -194,11 +194,15 @@ define i64 @caller_argv64i1() #0 {
>>>  ; LINUXOSX64-NEXT:    .cfi_adjust_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    callq test_argv64i1
>>>  ; LINUXOSX64-NEXT:    addq $24, %rsp
>>> -; LINUXOSX64-NEXT:    .cfi_adjust_cfa_offset -16
>>> +; LINUXOSX64-NEXT:    .cfi_adjust_cfa_offset -24
>>>  ; LINUXOSX64-NEXT:    popq %r12
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 32
>>>  ; LINUXOSX64-NEXT:    popq %r13
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 24
>>>  ; LINUXOSX64-NEXT:    popq %r14
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 16
>>>  ; LINUXOSX64-NEXT:    popq %r15
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>  entry:
>>>    %v0 = bitcast i64 4294967298 to <64 x i1>
>>> @@ -271,6 +275,7 @@ define <64 x i1> @caller_retv64i1() #0 {
>>>  ; LINUXOSX64-NEXT:    kmovq %rax, %k0
>>>  ; LINUXOSX64-NEXT:    vpmovm2b %k0, %zmm0
>>>  ; LINUXOSX64-NEXT:    popq %rax
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>  entry:
>>>    %call = call x86_regcallcc <64 x i1> @test_retv64i1()
>>> @@ -381,7 +386,9 @@ define x86_regcallcc i32 @test_argv32i1(
>>>  ; LINUXOSX64-NEXT:    vmovaps {{[0-9]+}}(%rsp), %xmm14 # 16-byte Reload
>>>  ; LINUXOSX64-NEXT:    vmovaps {{[0-9]+}}(%rsp), %xmm15 # 16-byte Reload
>>>  ; LINUXOSX64-NEXT:    addq $128, %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 16
>>>  ; LINUXOSX64-NEXT:    popq %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    vzeroupper
>>>  ; LINUXOSX64-NEXT:    retq
>>>  entry:
>>> @@ -435,6 +442,7 @@ define i32 @caller_argv32i1() #0 {
>>>  ; LINUXOSX64-NEXT:    movl $1, %edx
>>>  ; LINUXOSX64-NEXT:    callq test_argv32i1
>>>  ; LINUXOSX64-NEXT:    popq %rcx
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>  entry:
>>>    %v0 = bitcast i32 1 to <32 x i1>
>>> @@ -497,6 +505,7 @@ define i32 @caller_retv32i1() #0 {
>>>  ; LINUXOSX64-NEXT:    callq test_retv32i1
>>>  ; LINUXOSX64-NEXT:    incl %eax
>>>  ; LINUXOSX64-NEXT:    popq %rcx
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>  entry:
>>>    %call = call x86_regcallcc <32 x i1> @test_retv32i1()
>>> @@ -610,7 +619,9 @@ define x86_regcallcc i16 @test_argv16i1(
>>>  ; LINUXOSX64-NEXT:    vmovaps {{[0-9]+}}(%rsp), %xmm14 # 16-byte Reload
>>>  ; LINUXOSX64-NEXT:    vmovaps {{[0-9]+}}(%rsp), %xmm15 # 16-byte Reload
>>>  ; LINUXOSX64-NEXT:    addq $128, %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 16
>>>  ; LINUXOSX64-NEXT:    popq %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>    %res = call i16 @test_argv16i1helper(<16 x i1> %x0, <16 x i1> %x1, <16
>>> x i1> %x2)
>>>    ret i16 %res
>>> @@ -662,6 +673,7 @@ define i16 @caller_argv16i1() #0 {
>>>  ; LINUXOSX64-NEXT:    movl $1, %edx
>>>  ; LINUXOSX64-NEXT:    callq test_argv16i1
>>>  ; LINUXOSX64-NEXT:    popq %rcx
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>  entry:
>>>    %v0 = bitcast i16 1 to <16 x i1>
>>> @@ -730,6 +742,7 @@ define i16 @caller_retv16i1() #0 {
>>>  ; LINUXOSX64-NEXT:    incl %eax
>>>  ; LINUXOSX64-NEXT:    # kill: def $ax killed $ax killed $eax
>>>  ; LINUXOSX64-NEXT:    popq %rcx
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>  entry:
>>>    %call = call x86_regcallcc <16 x i1> @test_retv16i1()
>>> @@ -843,7 +856,9 @@ define x86_regcallcc i8 @test_argv8i1(<8
>>>  ; LINUXOSX64-NEXT:    vmovaps {{[0-9]+}}(%rsp), %xmm14 # 16-byte Reload
>>>  ; LINUXOSX64-NEXT:    vmovaps {{[0-9]+}}(%rsp), %xmm15 # 16-byte Reload
>>>  ; LINUXOSX64-NEXT:    addq $128, %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 16
>>>  ; LINUXOSX64-NEXT:    popq %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>    %res = call i8 @test_argv8i1helper(<8 x i1> %x0, <8 x i1> %x1, <8 x
>>> i1> %x2)
>>>    ret i8 %res
>>> @@ -895,6 +910,7 @@ define i8 @caller_argv8i1() #0 {
>>>  ; LINUXOSX64-NEXT:    movl $1, %edx
>>>  ; LINUXOSX64-NEXT:    callq test_argv8i1
>>>  ; LINUXOSX64-NEXT:    popq %rcx
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>  entry:
>>>    %v0 = bitcast i8 1 to <8 x i1>
>>> @@ -968,9 +984,11 @@ define <8 x i1> @caller_retv8i1() #0 {
>>>  ; LINUXOSX64-NEXT:    vpmovm2w %k0, %zmm0
>>>  ; LINUXOSX64-NEXT:    # kill: def $xmm0 killed $xmm0 killed $zmm0
>>>  ; LINUXOSX64-NEXT:    popq %rax
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    vzeroupper
>>>  ; LINUXOSX64-NEXT:    retq
>>>  entry:
>>>    %call = call x86_regcallcc <8 x i1> @test_retv8i1()
>>>    ret <8 x i1> %call
>>>  }
>>> +
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/avx512-regcall-NoMask.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-regcall-NoMask.ll?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/avx512-regcall-NoMask.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/avx512-regcall-NoMask.ll Tue Apr 24
>>> 03:32:08 2018
>>> @@ -63,6 +63,7 @@ define x86_regcallcc i1 @test_CallargRet
>>>  ; LINUXOSX64-NEXT:    callq test_argReti1
>>>  ; LINUXOSX64-NEXT:    incb %al
>>>  ; LINUXOSX64-NEXT:    popq %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>    %b = add i1 %a, 1
>>>    %c = call x86_regcallcc i1 @test_argReti1(i1 %b)
>>> @@ -130,6 +131,7 @@ define x86_regcallcc i8 @test_CallargRet
>>>  ; LINUXOSX64-NEXT:    callq test_argReti8
>>>  ; LINUXOSX64-NEXT:    incb %al
>>>  ; LINUXOSX64-NEXT:    popq %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>    %b = add i8 %a, 1
>>>    %c = call x86_regcallcc i8 @test_argReti8(i8 %b)
>>> @@ -200,6 +202,7 @@ define x86_regcallcc i16 @test_CallargRe
>>>  ; LINUXOSX64-NEXT:    incl %eax
>>>  ; LINUXOSX64-NEXT:    # kill: def $ax killed $ax killed $eax
>>>  ; LINUXOSX64-NEXT:    popq %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>    %b = add i16 %a, 1
>>>    %c = call x86_regcallcc i16 @test_argReti16(i16 %b)
>>> @@ -261,6 +264,7 @@ define x86_regcallcc i32 @test_CallargRe
>>>  ; LINUXOSX64-NEXT:    callq test_argReti32
>>>  ; LINUXOSX64-NEXT:    incl %eax
>>>  ; LINUXOSX64-NEXT:    popq %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>    %b = add i32 %a, 1
>>>    %c = call x86_regcallcc i32 @test_argReti32(i32 %b)
>>> @@ -327,6 +331,7 @@ define x86_regcallcc i64 @test_CallargRe
>>>  ; LINUXOSX64-NEXT:    callq test_argReti64
>>>  ; LINUXOSX64-NEXT:    incq %rax
>>>  ; LINUXOSX64-NEXT:    popq %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>    %b = add i64 %a, 1
>>>    %c = call x86_regcallcc i64 @test_argReti64(i64 %b)
>>> @@ -406,7 +411,9 @@ define x86_regcallcc float @test_Callarg
>>>  ; LINUXOSX64-NEXT:    vaddss %xmm8, %xmm0, %xmm0
>>>  ; LINUXOSX64-NEXT:    vmovaps (%rsp), %xmm8 # 16-byte Reload
>>>  ; LINUXOSX64-NEXT:    addq $16, %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 16
>>>  ; LINUXOSX64-NEXT:    popq %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>    %b = fadd float 1.0, %a
>>>    %c = call x86_regcallcc float @test_argRetFloat(float %b)
>>> @@ -486,7 +493,9 @@ define x86_regcallcc double @test_Callar
>>>  ; LINUXOSX64-NEXT:    vaddsd %xmm8, %xmm0, %xmm0
>>>  ; LINUXOSX64-NEXT:    vmovaps (%rsp), %xmm8 # 16-byte Reload
>>>  ; LINUXOSX64-NEXT:    addq $16, %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 16
>>>  ; LINUXOSX64-NEXT:    popq %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>    %b = fadd double 1.0, %a
>>>    %c = call x86_regcallcc double @test_argRetDouble(double %b)
>>> @@ -548,6 +557,7 @@ define x86_regcallcc x86_fp80 @test_Call
>>>  ; LINUXOSX64-NEXT:    callq test_argRetf80
>>>  ; LINUXOSX64-NEXT:    fadd %st(0), %st(0)
>>>  ; LINUXOSX64-NEXT:    popq %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>    %b = fadd x86_fp80 %a, %a
>>>    %c = call x86_regcallcc x86_fp80 @test_argRetf80(x86_fp80 %b)
>>> @@ -611,6 +621,7 @@ define x86_regcallcc [4 x i32]* @test_Ca
>>>  ; LINUXOSX64-NEXT:    callq test_argRetPointer
>>>  ; LINUXOSX64-NEXT:    incl %eax
>>>  ; LINUXOSX64-NEXT:    popq %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>    %b = ptrtoint [4 x i32]* %a to i32
>>>    %c = add i32 %b, 1
>>> @@ -694,7 +705,9 @@ define x86_regcallcc <4 x i32> @test_Cal
>>>  ; LINUXOSX64-NEXT:    vmovdqa32 %xmm8, %xmm0 {%k1}
>>>  ; LINUXOSX64-NEXT:    vmovaps (%rsp), %xmm8 # 16-byte Reload
>>>  ; LINUXOSX64-NEXT:    addq $16, %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 16
>>>  ; LINUXOSX64-NEXT:    popq %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>    %b = call x86_regcallcc <4 x i32> @test_argRet128Vector(<4 x i32> %a,
>>> <4 x i32> %a)
>>>    %c = select <4 x i1> undef , <4 x i32> %a, <4 x i32> %b
>>> @@ -768,7 +781,9 @@ define x86_regcallcc <8 x i32> @test_Cal
>>>  ; LINUXOSX64-NEXT:    vmovdqu (%rsp), %ymm1 # 32-byte Reload
>>>  ; LINUXOSX64-NEXT:    vmovdqa32 %ymm1, %ymm0 {%k1}
>>>  ; LINUXOSX64-NEXT:    addq $48, %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 16
>>>  ; LINUXOSX64-NEXT:    popq %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>    %b = call x86_regcallcc <8 x i32> @test_argRet256Vector(<8 x i32> %a,
>>> <8 x i32> %a)
>>>    %c = select <8 x i1> undef , <8 x i32> %a, <8 x i32> %b
>>> @@ -842,7 +857,9 @@ define x86_regcallcc <16 x i32> @test_Ca
>>>  ; LINUXOSX64-NEXT:    vmovdqu64 (%rsp), %zmm1 # 64-byte Reload
>>>  ; LINUXOSX64-NEXT:    vmovdqa32 %zmm1, %zmm0 {%k1}
>>>  ; LINUXOSX64-NEXT:    addq $112, %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 16
>>>  ; LINUXOSX64-NEXT:    popq %rsp
>>> +; LINUXOSX64-NEXT:    .cfi_def_cfa_offset 8
>>>  ; LINUXOSX64-NEXT:    retq
>>>    %b = call x86_regcallcc <16 x i32> @test_argRet512Vector(<16 x i32>
>>> %a, <16 x i32> %a)
>>>    %c = select <16 x i1> undef , <16 x i32> %a, <16 x i32> %b
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/avx512-schedule.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-schedule.ll?rev=330706&r1=330705&r2=330706&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/avx512-schedule.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/avx512-schedule.ll Tue Apr 24 03:32:08
>>> 2018
>>> @@ -8702,6 +8702,7 @@ define <16 x float> @broadcast_ss_spill(
>>>  ; GENERIC-NEXT:    callq func_f32
>>>  ; GENERIC-NEXT:    vbroadcastss (%rsp), %zmm0 # 16-byte Folded Reload
>>> sched: [6:1.00]
>>>  ; GENERIC-NEXT:    addq $24, %rsp # sched: [1:0.33]
>>> +; GENERIC-NEXT:    .cfi_def_cfa_offset 8
>>>  ; GENERIC-NEXT:    retq # sched: [1:1.00]
>>>  ;
>>>  ; SKX-LABEL: broadcast_ss_spill:
>>> @@ -8713,6 +8714,7 @@ define <16 x float> @broadcast_ss_spill(
>>>  ; SKX-NEXT:    callq func_f32
>>>  ; SKX-NEXT:    vbroadcastss (%rsp), %zmm0 # 16-byte Folded Reload sched:
>>> [8:0.50]
>>>  ; SKX-NEXT:    addq $24, %rsp # sched: [1:0.25]
>>> +; SKX-NEXT:    .cfi_def_cfa_offset 8
>>>  ; SKX-NEXT:    retq # sched: [7:1.00]
>>>    %a  = fadd float %x, %x
>>>    call void @func_f32(float %a)
>>> @@ -8732,6 +8734,7 @@ define <8 x double> @broadcast_sd_spill(
>>>  ; GENERIC-NEXT:    callq func_f64
>>>  ; GENERIC-NEXT:    vbroadcastsd (%rsp), %zmm0 # 16-byte Folded Reload
>>> sched: [6:1.00]
>>>  ; GENERIC-NEXT:    addq $24, %rsp # sched: [1:0.33]
>>> +; GENERIC-NEXT:    .cfi_def_cfa_offset 8
>>>  ; GENERIC-NEXT:    retq # sched: [1:1.00]
>>>  ;
>>>  ; SKX-LABEL: broadcast_sd_spill:
>>> @@ -8743,6 +8746,7 @@ define <8 x double> @broadcast_sd_spill(
>>>  ; SKX-NEXT:    callq func_f64
>>>  ; SKX-NEXT:    vbroadcastsd (%rsp), %zmm0 # 16-byte Folded Reload sched:
>>> [8:0.50]
>>>  ; SKX-NEXT:    addq $24, %rsp # sched: [1:0.25]
>>> +; SKX-NEXT:    .cfi_def_cfa_offset 8
>>>  ; SKX-NEXT:    retq # sched: [7:1.00]
>>>    %a  = fadd double %x,
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits