[llvm-commits] [PATCH 2/6] revise/enhance atomic primitive code generation

Michael Liao michael.liao at intel.com
Tue Aug 21 22:05:26 PDT 2012


here is the revised patch. Following your valueable suggestions, new helpers are added to remove big switchs and significantly simplify the code.

Yours
- Michael

On Tue, 2012-08-21 at 13:46 -0700, Chandler Carruth wrote:
> FYI, I'm commenting on each individual patch by replying to the email
> where you attached that patch. So on to #2...
> 
> 
>         From: Michael Liao <michael.hliao at gmail.com>
>         Date: Wed, 25 Jul 2012 17:15:02 -0700
>         Subject: [PATCH 2/6] refine code generation for atomic
>         operations with
>          spin-loop
> 
> 
> For commit logs, I would encourage you to write them in a bit more of
> a prose form. Try to include the motivation or end goal, some amount
> of context. No need to go crazy with it, just helps the reviewers.
> 
> 
>         - replace EmitAtomicMinMaxWithCustomInserter &
>           EmitAtomicBitwiseWithCustomInserter into a single
>         EmitAtomicLoadArith
>           for all pseudo atomic operations, i.e. atomic operations
>         based on
>           spin-loop or compare-exchange loop
>         - replace EmitAtomicBit6432WithCustomInserter with
>           EmitAtomicLoadArith6432 for 64-bit atomic operations on
>         32-bit target
>         - remove unncessary modifiers for pseudo atomic instruction td
> 
> 
> Can you split these three bullet points into their own three small
> patches? It looks like none of them change behavior, they're just
> refactorings.

I cannot split them into 3 individual ones as these changes depends on
each other. Even though the last part is little independent from the
above two, it's quite trivial (<9 lines changes) to be included into
this patch.

Yours
- Michael

> 
> 
> Also, when you are fusing routines like this together, please be
> cautious about growing the functions too large. It is important to
> factor them heavily. Some comments about this below....
>  
>         - reduce one unnecessary load in spin-loop 
>         - modify existing tests due to spin-loop change
> 
> 
> If you can give a quick before/after summary in the change log for
> this patch, that would be good.
> 
> 
> 
> 
> Some comments on the patch itself. I'm going to hold off going into
> too much detail until after you break it appart into the refactoring
> patches followed by the functional ones.
> 
> 
> +// private utility function
> 
> 
> 
> Please, much more and detailed comments. =] This is very tricky code,
> so try to help future maintainers.
> 
> 
> If possible, use Doxygen-style comments for all of the functions:
> 
> 
> /// \brief Utility function to compute the whizz bang widgets.
> ///
> /// This function computes whizz bang widgets using the following
> algorithm:
> /// ...
> ///
> /// An example of the output:
> /// ...
> 
> 
> +  switch (VT.getSimpleVT().SimpleTy) {
> +  default:
> +    llvm_unreachable("Invalid atomic-load-op operand size!");
> +  case MVT::i8:
> +    LCMPXCHGOpc = X86::LCMPXCHG8;
> +    LOADOpc = X86::MOV8rm;
> +    break;
> +  case MVT::i16:
> +    LCMPXCHGOpc = X86::LCMPXCHG16;
> +    LOADOpc = X86::MOV16rm;
> +    break;
> +  case MVT::i32:
> +    LCMPXCHGOpc = X86::LCMPXCHG32;
> +    LOADOpc = X86::MOV32rm;
> +    break;
> +  case MVT::i64:
> +    LCMPXCHGOpc = X86::LCMPXCHG64;
> +    LOADOpc = X86::MOV64rm;
> +    break;
> +  }
> 
> 
> When ever you find yourself with a big switch that just sets variables
> to values, consider writing a small static helper function that
> returns the values. Here, I think two functions, one to return the
> cmxchg variant and one to return the load variant would be better.
> 
> 
> 
> 
> +  unsigned Opc = MI->getOpcode();
> +  switch (Opc) {
> +  default:
> +    llvm_unreachable("Unhandled atomic-load-op opcode!");
> +  case X86::ATOMAND8:
> +  case X86::ATOMAND16:
> +  case X86::ATOMAND32:
> +  case X86::ATOMAND64:
> +  case X86::ATOMOR8:
> +  case X86::ATOMOR16:
> +  case X86::ATOMOR32:
> +  case X86::ATOMOR64:
> +  case X86::ATOMXOR8:
> +  case X86::ATOMXOR16:
> +  case X86::ATOMXOR32:
> +  case X86::ATOMXOR64: {
> 
> 
> Here we have another big switch that should probably be hoisted into a
> function, but it also has this:
> 
> 
> +    unsigned ARITHOpc = (Opc == X86::ATOMAND8)  ? X86::AND8rr :
> +                        (Opc == X86::ATOMAND16) ? X86::AND16rr :
> +                        (Opc == X86::ATOMAND32) ? X86::AND32rr :
> +                        (Opc == X86::ATOMAND64) ? X86::AND64rr :
> +                        (Opc == X86::ATOMOR8)   ? X86::OR8rr :
> +                        (Opc == X86::ATOMOR16)  ? X86::OR16rr :
> +                        (Opc == X86::ATOMOR32)  ? X86::OR32rr :
> +                        (Opc == X86::ATOMOR64)  ? X86::OR64rr :
> +                        (Opc == X86::ATOMXOR8)  ? X86::XOR8rr :
> +                        (Opc == X86::ATOMXOR16) ? X86::XOR16rr :
> +                        (Opc == X86::ATOMXOR32) ? X86::XOR32rr :
> +                        (Opc == X86::ATOMXOR64) ? X86::XOR64rr :
> +                        0;
> +    assert(ARITHOpc && "Invalid atomic-load-op transformation!");
> 
> 
> This is a really hard to read (and terribly inefficient) way to build
> a switch that produces a value. A function with a switch would seem
> much clearer.
> 
> 
> That said, why re-do a walk over the opcodes? It would seem more
> obvious to handle this in the one big switch.
> 
> 
> Is there any way to turn this into a table, or better yet, have
> tablegen pre-compute the mapping from atomfoo -> foo? That would
> eliminate a *lot* of the boilerplate in this patch.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-refine-code-generation-for-atomic-operations-with-sp.patch
Type: text/x-patch
Size: 58713 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120821/01a7d451/attachment.bin>


More information about the llvm-commits mailing list