[llvm-commits] [llvm] r77903 - in /llvm/trunk: include/llvm/Intrinsics.td include/llvm/IntrinsicsBlackfin.td lib/Target/Blackfin/BlackfinInstrInfo.td test/CodeGen/Blackfin/load-intr.ll test/CodeGen/Blackfin/sync-intr.ll

Sun Aug 2 12:40:23 PDT 2009

On Sun, Aug 2, 2009 at 11:28 AM, Jakob Stoklund Olesen<stoklund at 2pi.dk> wrote:
> +let TargetPrefix = "bfin" in {
> +
> +  // Almost identical to ctpop except for the type signature
> +  def int_bfin_ones : GCCBuiltin<"__builtin_bfin_ones">,
> +          Intrinsic<[llvm_i16_ty], [llvm_i32_ty], [IntrNoMem]>;
> +
> +  // Load unaligned pointer, ignoring the low bits. Like *(p&~3).
> +  // This uses the disalignexcpt instruction
> +  def int_bfin_loadbytes : GCCBuiltin<"__builtin_bfin_loadbytes">,
> +          Intrinsic<[llvm_i32_ty], [llvm_ptr_ty], [IntrReadArgMem]>;
> +
> +}

If it's easy to represent the behavior of an intrinsic on top of
current IR instructions, we generally prefer to do that rather than
add intrinsics.  __builtin_bfin_ones is equivalent to truncating the
result of a ctpop, and __builtin_bfin_loadbytes is equivalent to
masking the argument pointer, then loading from it.  The reason for
preferring this is that the transformation passes can reason much more
accurately about standard IR than about target-specific intrinsics.
We can constant-fold ctpop, but transformation passes don't have any
clue what calculation a __builtin_bfin_ones does.  If we can prove the
argument to a masked load is 4-byte aligned, we can eliminate the mask
instruction, but transformation passes don't have any clue what memory
a __builtin_bfin_loadbytes reads from.
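
To make the two equivalences concrete, here is a rough C sketch of the
semantics described above.  It is only an illustration, not what the
front end would actually emit: the function names ones_via_ctpop and
loadbytes_via_mask are made up for this example, and
__builtin_popcount assumes a GCC-compatible compiler standing in for
ctpop.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __builtin_bfin_ones: a 32-bit population
 * count (playing the role of ctpop) truncated to 16 bits. */
static uint16_t ones_via_ctpop(uint32_t x)
{
    return (uint16_t)__builtin_popcount(x);
}

/* Hypothetical stand-in for __builtin_bfin_loadbytes: clear the low
 * two bits of the pointer, then do an ordinary 32-bit load from the
 * resulting 4-byte-aligned address, i.e. *(p & ~3). */
static uint32_t loadbytes_via_mask(const void *p)
{
    uintptr_t addr = (uintptr_t)p & ~(uintptr_t)3;
    uint32_t value;

    /* memcpy keeps the load well-defined under C aliasing rules. */
    memcpy(&value, (const void *)addr, sizeof value);
    return value;
}
```

Written this way, a call with a constant argument can be folded, and
the mask can be dropped once the pointer is known to be 4-byte aligned,
because each step is an operation the optimizer already understands.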

See llvm-i386.cpp in llvm-gcc for examples of how this is done.

-Eli