[llvm-dev] Potential missed optimisation with SEH funclets

Thu Jun 27 11:34:52 PDT 2019

The main reason frame pointer elimination is disabled for funclets is so
that frame index resolution just works inside them. Otherwise, we'd have to
code up some logic to use a different base register for stack object
offsets inside funclets. Which, when you say it that way, seems pretty easy
to implement: it's just a matter of changing
X86FrameLowering::getFrameIndexReference.
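
To make the "different base register" idea concrete, here is a minimal toy
sketch. It is not LLVM code: ToyFrame, FrameRef, Reg and
toyFrameIndexReference are hypothetical names invented for illustration, and
the real X86FrameLowering::getFrameIndexReference has a different signature
and many more cases. The numbers come from the clang-cl output quoted below,
where the funclet computes rbp = rdx + 48 and then addresses the object at
[rbp - 16].

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; none of these are LLVM types.
enum class Reg { RBP, RDX };

// A resolved stack reference: base register plus byte offset.
struct FrameRef {
    Reg base;
    int64_t offset;
};

// Toy description of one stack object in the parent function's frame.
struct ToyFrame {
    int64_t objectOffsetFromFP;   // object lives at [parent_fp + this]
    int64_t fpOffsetFromFunclet;  // parent_fp == rdx + this at funclet entry
};

// In the parent function, address the object relative to the frame pointer.
// In a funclet, fold the two offsets together and address it relative to
// rdx (the establisher frame value the funclet receives), so that rbp never
// has to be re-materialised.
FrameRef toyFrameIndexReference(const ToyFrame &frame, bool isFunclet) {
    if (!isFunclet)
        return {Reg::RBP, frame.objectOffsetFromFP};
    return {Reg::RDX, frame.fpOffsetFromFunclet + frame.objectOffsetFromFP};
}
```

With the values from the quoted assembly, toyFrameIndexReference(ToyFrame{-16,
48}, true) resolves to rdx + 32, which is the same address the funclet
currently reaches through rbp. One caveat the sketch ignores: rdx is a
volatile register in the Windows x64 calling convention, so a scheme like
this would only be valid until the first call inside the funclet unless the
value is kept somewhere else.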

On Thu, Jun 27, 2019 at 5:05 AM David Chisnall via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> A quick skim of this code looks as if we are explicitly disabling frame
> pointer elimination for funclets in the back end.  It looks as if this
> is done because FP-elim sometimes breaks funclets - if anyone has a test
> case for this then that would probably help tracking it down.
>
> David
>
> On 26/06/2019 21:17, Reid Kleckner via llvm-dev wrote:
> > Yes, not much effort has been applied to optimizing Windows exception
> > handling. We were primarily concerned with making it correct, and
> > improving it hasn't been a priority. You can follow the code path
> > through X86FrameLowering::emitPrologue with IsFunclet=true and see that
> > it mechanically emits all the extra instructions mentioned above without
> > any logic to skip such steps when not necessary.
> >
> > However, while the mid-level representation we chose makes it hard to
> > write these types of micro-level code quality optimizations, it allows
> > the optimizers to do a variety of fancy things like heap to stack
> > promotion on unique_ptr in the presence of exceptional control flow.
> >
> > On Tue, Jun 25, 2019 at 4:08 AM Hamza Sood via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
> >
> >     I’ve been experimenting with SEH handling in LLVM, and it seems like
> >     the unwind funclets generated by LLVM are much larger than those
> >     generated by Microsoft’s CL compiler.
> >
> >     I used the following code as a test:
> >
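> >     struct MyClass { ~MyClass(); };  // assumed declaration, not in the original post
> >     void externalFunction();         // assumed declaration, not in the original post
> >
> >     void test() {
> >        MyClass x;
> >        externalFunction();
> >     }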
> >
> >     Compiling with CL, the unwind funclet that destroys ‘x’ is just two
> >     lines of asm:
> >
> >     lea rcx, QWORD PTR x$[rdx]
> >     jmp ??1MyClass@@QEAA@XZ
> >
> >     However when compiling with clang-cl, it seems like it sets up an
> >     entire function frame just for the destructor call:
> >
> >     mov qword ptr [rsp + 16], rdx
> >     push rbp
> >     .seh_pushreg 5
> >     sub rsp, 32
> >     .seh_stackalloc 32
> >     lea rbp, [rdx + 48]
> >     .seh_endprologue
> >     lea rcx, [rbp - 16]
> >     call "??1MyClass@@QEAA@XZ"
> >     nop
> >     add rsp, 32
> >     pop rbp
> >     ret
> >
> >     Both were compiled with “/c /O2 /MD /EHsc”
> >
> >     Is LLVM missing a major optimisation here?
> >     _______________________________________________
> >     LLVM Developers mailing list
> >     llvm-dev at lists.llvm.org
> >     https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev