<div dir="ltr">I think that 32-bit systems (especially ARM) may be short on memory so doubling the size of the table containing (potentially) all the functions may give a tangible overhead. I would even align the entries to 4 bytes (so 12 bytes per entry) on 32-bit platforms and to 8 bytes (so 24-bytes per entry) on 64-bit platforms, to improve CPU cache hits. What do you think?<div><br></div><div>Cheers,</div><div>Serge</div></div><div class="gmail_extra"><br><div class="gmail_quote">On 8 August 2016 at 06:34, Dean Michael Berris <span dir="ltr"><<a href="mailto:dean.berris@gmail.com" target="_blank">dean.berris@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br>
> On 6 Aug 2016, at 04:06, Serge Rogatch <<a href="mailto:serge.rogatch@gmail.com">serge.rogatch@gmail.com</a>> wrote:<br>
><br>
> Hi Dean,<br>
><br>
</span><div><div class="h5">> I have a question for 32-bit platforms. I see in the code that you used the following in compiler-rt/trunk/lib/xray/<wbr>xray_interface_internal.h :<br>
> struct XRaySledEntry {<br>
> uint64_t Address;<br>
> uint64_t Function;<br>
> unsigned char Kind;<br>
> unsigned char AlwaysInstrument;<br>
> unsigned char Padding[14]; // Need 32 bytes<br>
> };<br>
><br>
> And the peer code in llvm/trunk/lib/Target/X86/<wbr>X86MCInstLower.cpp :<br>
><br>
> void X86AsmPrinter::EmitXRayTable() {<br>
> if (Sleds.empty())<br>
> return;<br>
> if (Subtarget->isTargetELF()) {<br>
> auto *Section = OutContext.getELFSection(<br>
> "xray_instr_map", ELF::SHT_PROGBITS,<br>
> ELF::SHF_ALLOC | ELF::SHF_GROUP | ELF::SHF_MERGE, 0,<br>
> CurrentFnSym->getName());<br>
> auto PrevSection = OutStreamer-><wbr>getCurrentSectionOnly();<br>
> OutStreamer->SwitchSection(<wbr>Section);<br>
> for (const auto &Sled : Sleds) {<br>
> OutStreamer->EmitSymbolValue(<wbr>Sled.Sled, 8);<br>
> OutStreamer->EmitSymbolValue(<wbr>CurrentFnSym, 8);<br>
> auto Kind = static_cast<uint8_t>(Sled.<wbr>Kind);<br>
> OutStreamer->EmitBytes(<br>
> StringRef(reinterpret_cast<<wbr>const char *>(&Kind), 1));<br>
> OutStreamer->EmitBytes(<br>
> StringRef(reinterpret_cast<<wbr>const char *>(&Sled.AlwaysInstrument), 1));<br>
> OutStreamer->EmitZeros(14);<br>
> }<br>
> OutStreamer->SwitchSection(<wbr>PrevSection);<br>
> }<br>
> Sleds.clear();<br>
> }<br>
><br>
> So useful part of your entry is 18 bytes, and you add 14 bytes of padding to achieve 32-byte alignment. But for 32-bit CPUs I understood that XRaySledEntry::Address and XRaySledEntry::Function can be 32-bit too, right? So the entry can fit 16 bytes, with 10 useful bytes and 6 bytes of padding. Can I use 16-byte entries or is there some external (OS? ELF? Linker?) requirement that one entry must be 32-bytes, or aligned at 32 bytes, etc.?<br>
><br>
<br>
</div></div>Good question Serge -- technically there isn't any specific external requirement here, but that supporting 32-bit x86 isn't a priority for me right now. I suspect it's possible to support 32-bit x86 with a similar approach (modifying both the LLVM back-end to emit the right assembly for 32-bit x86 and maybe for 32-bit non-x86, as well as compiler-rt to work on 32-bit x86) but that I haven't had the time to explore this yet.<br>
<br>
I'm positive that it's doable though and that we know the right places where the changes have to happen. There's some work being done on the tooling side of things and I suspect once we have a standardised log format, things like endianness and sizes of certain values start becoming an issue. For example, if I build a log analysis tool to work on 64-bit systems, whether it should be able to handle log files generated in 32-bit systems (and be able to read differently-sized instrumentation map sections).<br>
<br>
For this reason, I think it's better to stay consistent and forward-compatible (i.e. not have special cases for 32-bit platforms). I do think it's important to support 32-bit systems too, and I'd be more than happy to review patches that would make it possible (until say I get the time to support 32-bit platforms later on).<br>
<br>
Does that make sense?<br>
<br>
Cheers<br>
<span class="HOEnZb"><font color="#888888"><br>
-- Dean<br>
<br>
</font></span></blockquote></div><br></div>