[PATCH] D70243: Lowering CPI/JTI/BA to assembly

Jason Liu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 19 09:11:17 PST 2019


jasonliu added a comment.



>> For comparison, gcc generates the value of constant pool data directly into the TC entry. It's faster and garbage collectable if not used.
> 
> I did some investigation about LLVM constant pool, as I remember, except for int value which will be put directly into TC entry, others like double, float etc. will be put into constant pool in LLVM. If we want to mimic GCC behavior, it probably would be a large piece of refactoring.

I don't think LLVM actually tries to put an int directly into the TC entry: it always puts a label in the TC entry, or we just use the integer value directly in the assembly. But please correct me if I'm wrong.
We don't need to switch to that behavior right now, but later, when we care about performance, we might want to think about it.
Here's my test case:

  float foo(){
    return 1.2;
  }
  double bar(){
    return 3.4;
  }

GCC generates:

  LC..0:
    .tc FS_3f99999a[TC],0x3f99999a
  lfs 0,LC..0(2)
  
  LC..1:
    .tc FD_400b3333_33333333[TC],0x400b3333,0x33333333
  lfd 0,LC..1(2)

- Notice that the value is stored directly in the TC entry.

xlc:

    .csect  H.20.NO_SYMBOL{RO}, 3
    .long 0x3ff33333              # "?\36333"
    .long 0x33333333              # "3333"
    .long 0x400b3333              # "@\v33"
    .long 0x33333333              # "3333"
  
  T.18.NO_SYMBOL:
    .tc H.18.NO_SYMBOL{TC},H.20.NO_SYMBOL{RO}
  
    l          r31,T.18.NO_SYMBOL(RTOC)
    lfd        fp1,0(r31)
  
    l          r31,T.18.NO_SYMBOL(RTOC)
    lfd        fp1,8(r31)



- Notice that xlc uses only one TC entry to access this RO data. It uses a dedicated csect for the constant pool instead of mixing it with the other read-only data. (You could add a "const int a = 10;" to the test case, and it won't use the same csect as the constant pool.)



> 
> 
>> xlC generates all the constant pool data in a single csect, but reference different data with one TC entry. Less efficient, but still less TC entry generated, and it does not mess with the other rodata.
> 
> Sorry, but can you list your testcase and some explanatory emitted results here, I am not able to reproduce what you said. And based on my observation, xlC generates one RO csect and one TC entry for each global const value [eg. `float a = 5.56`].

See my test case and explanations above.

>> So with this implementation, it would work functionally. But performance and space wise, could not compare to the other two compilers.
>>  I'm Okay with moving forward with some TODOs in our mind.
> 
> Thank you for your advice.  Some plan I have in mind for emitting jump table index is that once we start supporting unique sections something like `-fdata-sections`, then we can emit each JTI to unique RO section. So I put an assertion in `getSectionForJumpTable`. And based on your suggestion I think I can do something similar in `getSectionForConstant`. Please let me know if this will resolve your concerns.

-fdata-sections would help with the garbage-collection case. But performance-wise, we might still want to investigate whether we could do something similar to what GCC does (at least for the constant pool).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70243/new/

https://reviews.llvm.org/D70243




