[PATCH] D70243: Lowering CPI/JTI/BA to assembly
Jason Liu via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 19 09:11:17 PST 2019
jasonliu added a comment.
>> For comparison, gcc generates the value of constant pool data directly into the TC entry. It's faster and garbage collectable if not used.
>
> I did some investigation about LLVM constant pool, as I remember, except for int value which will be put directly into TC entry, others like double, float etc. will be put into constant pool in LLVM. If we want to mimic GCC behavior, it probably would be a large piece of refactoring.
I don't think LLVM actually tries to put an int directly into the TC entry; it always puts a label in the TC entry, or we just use that integer value directly in the assembly. But please correct me if I'm wrong.
We don't need to switch to that behavior right now, but later, when we care about performance, we might want to think about it.
Here's my test case:
float foo() {
  return 1.2;
}
double bar() {
  return 3.4;
}
GCC generates:
LC..0:
.tc FS_3f99999a[TC],0x3f99999a
lfs 0,LC..0(2)
LC..1:
.tc FD_400b3333_33333333[TC],0x400b3333,0x33333333
lfd 0,LC..1(2)
- Notice that the value is stored directly into the TC entry.
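As a sanity check on the GCC output above, the immediates in the two `.tc` entries appear to be the IEEE-754 bit patterns of the constants themselves. A minimal C sketch, assuming an IEEE-754 `float`/`double` host; the helper names `float_bits` and `double_bits` are hypothetical and are not part of the patch or of GCC:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the host uses 32-bit and 64-bit IEEE-754 formats. */
static_assert(sizeof(float) == sizeof(uint32_t), "expects a 32-bit float");
static_assert(sizeof(double) == sizeof(uint64_t), "expects a 64-bit double");

/* Hypothetical helper: raw bit pattern of a single-precision value. */
uint32_t float_bits(float f) {
  uint32_t bits;
  memcpy(&bits, &f, sizeof bits); /* memcpy avoids aliasing problems */
  return bits;
}

/* Hypothetical helper: raw bit pattern of a double-precision value. */
uint64_t double_bits(double d) {
  uint64_t bits;
  memcpy(&bits, &d, sizeof bits);
  return bits;
}
```

Under that assumption, `float_bits(1.2f)` is 0x3f99999a and `double_bits(3.4)` is 0x400b333333333333, which matches the `FS_3f99999a` and `FD_400b3333_33333333` entries.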
xlc:
.csect H.20.NO_SYMBOL{RO}, 3
.long 0x3ff33333 # "?\36333"
.long 0x33333333 # "3333"
.long 0x400b3333 # "@\v33"
.long 0x33333333 # "3333"
T.18.NO_SYMBOL:
.tc H.18.NO_SYMBOL{TC},H.20.NO_SYMBOL{RO}
l r31,T.18.NO_SYMBOL(RTOC)
lfd fp1,0(r31)
l r31,T.18.NO_SYMBOL(RTOC)
lfd fp1,8(r31)
- Notice that xlc uses only one TC entry to access this RO data. It uses a dedicated csect for the constant pool instead of mixing it with the other read-only data. (You could add a "const int a = 10;" to the test case, and it won't be placed in the same csect as the constant pool.)
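To make the xlc layout above concrete, here is a rough C model of the pooled csect: one base address (standing in for what the single TC entry provides) and a fixed offset per constant. This is only a sketch; `const_pool` and `pool_load_double` are hypothetical names, and each pair of `.long` words is assumed to form one big-endian double, as it would on AIX:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The four .long words of the H.20.NO_SYMBOL{RO} csect, in emitted order. */
static const uint32_t const_pool[4] = {
  0x3ff33333u, 0x33333333u, /* offset 0: 1.2 stored as a double */
  0x400b3333u, 0x33333333u, /* offset 8: 3.4 stored as a double */
};

/* Hypothetical model of "lfd fp1,offset(r31)", where r31 holds the pool
   base. The two words are combined high-word-first, so the result does
   not depend on the endianness of the machine running this sketch. */
double pool_load_double(size_t offset) {
  assert(offset % 4 == 0 && offset + 8 <= sizeof const_pool);
  uint64_t hi = const_pool[offset / 4];
  uint64_t lo = const_pool[offset / 4 + 1];
  uint64_t bits = (hi << 32) | lo;
  double d;
  memcpy(&d, &bits, sizeof d);
  return d;
}
```

In this model `pool_load_double(0)` yields 1.2 and `pool_load_double(8)` yields 3.4, mirroring the two `lfd` instructions that share one TC entry. Note that the first constant is held as a double even though `foo` returns `float`.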
>
>
>> xlC generates all the constant pool data in a single csect, but reference different data with one TC entry. Less efficient, but still less TC entry generated, and it does not mess with the other rodata.
>
> Sorry, but can you list your testcase and some explanatory emitted results here, I am not able to reproduce what you said. And based on my observation, xlC generates one RO csect and one TC entry for each global const value [eg. `float a = 5.56`].
See my test case and explanations above.
>> So with this implementation, it would work functionally. But performance and space wise, could not compare to the other two compilers.
>> I'm Okay with moving forward with some TODOs in our mind.
>
> Thank you for your advice. Some plan I have in mind for emitting jump table index is that once we start supporting unique sections something like `-fdata-sections`, then we can emit each JTI to unique RO section. So I put an assertion in `getSectionForJumpTable`. And based on your suggestion I think I can do something similar in `getSectionForConstant`. Please let me know if this will resolve your concerns.
-fdata-sections would help with the garbage collection case. But performance-wise, we might still want to investigate whether we could do something similar to what GCC does (at least for the constant pool).
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D70243/new/
https://reviews.llvm.org/D70243