[cfe-users] Avoiding duplications within an interpreter switch

Eliot Miranda via cfe-users cfe-users at lists.llvm.org
Thu Oct 29 11:39:14 PDT 2020


Hi All,

    I'm a happy clang user across several platforms for Smalltalk VM
development.

One version of the VM is an interpreter that supports two bytecode sets.
Its main dispatch loop combines dispatches for two bytecode sets, one
offset by 256 from the other.  So the range of cases in the switch is from
0 to 511.  Some of the entries in the switch are common to both bytecode
sets.  These have (generated) C source looking like:

    while (1) {
        VM_LABEL(Dispatch);
        switch (currentBytecode) {
        CASE(0)
        CASE(256)
            /* pushReceiverVariableBytecode */
            {
                sqInt object;

                VM_LABEL(pushReceiverVariable);
                /* begin fetchNextBytecode */
                currentBytecode = (byteAtPointer(++localIP)) + bytecodeSetSelector;
                /* begin pushReceiverVariable: */
                object = longAt(((longAt(localFP + FoxReceiver)) + BaseHeaderSize));
                longAtPointerput((localSP -= BytesPerOop), object);
            }
            BREAK;
        CASE(1)
        CASE(257)
            /* pushReceiverVariableBytecode */
            {
                sqInt object;

                VM_LABEL(pushReceiverVariable1);
                /* begin fetchNextBytecode */
                currentBytecode = (byteAtPointer(++localIP)) + bytecodeSetSelector;
                /* begin pushReceiverVariable: */
                object = longAt(((longAt(localFP + FoxReceiver)) + BaseHeaderSize)
                                + 8 /* (currentBytecode bitAnd: 15) << self shiftForWord */);
                longAtPointerput((localSP -= BytesPerOop), object);
            }
            BREAK;
etc

CASE and BREAK are macros which allow for gcc's first-class labels to be
used to speed up dispatch. VM_LABEL is a macro which is used to insert a
global label so that in a profiler one can see each bytecode separately,
rather than have all of interpret as one big lump.
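
For readers unfamiliar with this style of dispatch, here is a minimal sketch of how CASE and BREAK might be defined. The post does not show the real definitions, so everything below is an assumption for illustration: the macro bodies, the jumpTable name, the FETCH helper, the run_toy function, and the toy bytecode sets offset by 4 rather than 256.

```c
#include <assert.h>

enum { SetOffset = 4 };   /* toy value; the VM in the post uses 256 */

/* Hypothetical macro definitions, not taken from the real VM. */
#if defined(__GNUC__)
/* GNU C "labels as values": each case also gets an ordinary label, and
 * BREAK jumps straight to the next bytecode's label via a table. */
# define CASE(n) case n: l_##n:
# define BREAK   goto *jumpTable[currentBytecode]
#else
/* Portable fallback: a plain switch inside a loop. */
# define CASE(n) case n:
# define BREAK   break
#endif

/* Fetch the next bytecode, offset into the selected bytecode set.
 * The toy assumes every program byte is in 0..SetOffset-1. */
#define FETCH() do { \
        assert(localIP[1] < SetOffset); \
        currentBytecode = *++localIP + bytecodeSetSelector; \
    } while (0)

/* Toy interpreter over an accumulator. Opcodes per set:
 * 0 increment, 1 double (set 0) or triple (set 1), 2 halt, 3 negate.
 * Programs must end with the halt opcode. */
int run_toy(const unsigned char *code, int bytecodeSetSelector)
{
    const unsigned char *localIP = code;
    int acc = 0;
    int currentBytecode;
#if defined(__GNUC__)
    static void *jumpTable[] = {
        &&l_0, &&l_1, &&l_2, &&l_3, &&l_4, &&l_5, &&l_6, &&l_7
    };
#endif
    assert(bytecodeSetSelector == 0 || bytecodeSetSelector == SetOffset);
    assert(*localIP < SetOffset);
    currentBytecode = *localIP + bytecodeSetSelector;
    while (1) {
        switch (currentBytecode) {   /* used for the first dispatch */
        CASE(0)
        CASE(4)                      /* increment: shared by both sets */
            FETCH();
            acc += 1;
            BREAK;
        CASE(1)                      /* double: set 0 only */
            FETCH();
            acc *= 2;
            BREAK;
        CASE(5)                      /* triple: set 1 only */
            FETCH();
            acc *= 3;
            BREAK;
        CASE(3)
        CASE(7)                      /* negate: shared by both sets */
            FETCH();
            acc = -acc;
            BREAK;
        CASE(2)
        CASE(6)                      /* halt: shared by both sets */
            return acc;
        }
    }
}
```

The point of the sketch is that two CASE lines in front of one body give both bytecode values a single copy of the code, which is the sharing the generated source relies on.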

Alas, with -Os, clang on macOS (Apple LLVM version 10.0.0
(clang-1000.11.45.5)) chooses to split the common code in the switch: in
the example above it generates a case for 0 and a separate case for 256, a
case for 1 and a separate case for 257, and so on.  This breaks the use of
VM_LABEL, because the labels _LpushReceiverVariable,
_LpushReceiverVariable1, et al. that the VM_LABEL macro inserts are
duplicated whenever the shared code in a case is duplicated.  So if
VM_LABEL is defined to insert a label, compilation fails with many
duplicate-label errors.
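
To make the failure mode concrete, here is a minimal sketch of what a label-inserting VM_LABEL might look like. The real macro is not shown in the post, so the definition below, the ENABLE_VM_LABELS switch, and the shared_case function are all assumptions. The idea is that the macro emits a global symbol through inline assembly; if the optimizer clones the block containing the asm statement, the symbol is emitted once per copy and the assembler rejects the second definition.

```c
#include <assert.h>

/* Hypothetical VM_LABEL; the post does not show the real definition.
 * With ENABLE_VM_LABELS defined it emits a global assembler symbol at the
 * point of use; otherwise it expands to a no-op statement. */
#if defined(ENABLE_VM_LABELS) && defined(__APPLE__)
/* Mach-O: symbols visible to tools carry a leading underscore. */
# define VM_LABEL(name) __asm__ volatile ("\n.globl _L" #name "\n_L" #name ":")
#elif defined(ENABLE_VM_LABELS)
# define VM_LABEL(name) __asm__ volatile ("\n.globl L" #name "\nL" #name ":")
#else
# define VM_LABEL(name) ((void)0)
#endif

/* Two case values share one body, as in CASE(0)/CASE(256) above. */
int shared_case(int bytecode, int operand)
{
    switch (bytecode) {
    case 0:
    case 256:
        VM_LABEL(sharedBody);
        /* If the optimizer clones this body (one copy per case value),
         * the asm statement is cloned with it, and the assembler then
         * sees the same symbol defined twice. */
        return operand + 1;
    default:
        return -1;
    }
}
```

With the labels switched off the function compiles regardless of how the compiler lays out the cases, which matches the observation that the build only fails when VM_LABEL actually inserts a label.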

Is there a way to turn off the duplication?  I've tried a few switches and
looked at the voluminous -foptimization-record-file=optimizations output,
but can't find any clues.  -fno-reroll-loops for example had no effect.

I note that, given -Os (optimize for space as well as speed), duplicating
the code in the switch is exactly the opposite of what a developer asks for
with that flag.  I know that keeping these cases together will give me
better icache density and improved performance, etc.

Is there a way to get the assembly source out before the compiler attempts
to generate machine code?  This would give me the ability to remove the
duplicated labels by editing the generated assembler.  Currently asking for
-S generates nothing because the code is being compiled to machine code
before assembler is generated.  Inconvenient for those of us used to being
able to abuse the output of the compiler for nefarious means (which led to
the invention of gcc's first-class labels, but that's an old and long
story).

And thank you for an otherwise glorious compiler.
_,,,^..^,,,_
best, Eliot