[llvm-dev] XRay: Demo on x86_64/Linux almost done; some questions.
Serge Rogatch via llvm-dev
llvm-dev at lists.llvm.org
Mon Aug 8 06:27:23 PDT 2016
I think that 32-bit systems (especially ARM) may be short on memory so
doubling the size of the table containing (potentially) all the functions
may give a tangible overhead. I would even align the entries to 4 bytes (so
12 bytes per entry) on 32-bit platforms and to 8 bytes (so 24-bytes per
entry) on 64-bit platforms, to improve CPU cache hits. What do you think?
Cheers,
Serge
On 8 August 2016 at 06:34, Dean Michael Berris <dean.berris at gmail.com>
wrote:
>
> > On 6 Aug 2016, at 04:06, Serge Rogatch <serge.rogatch at gmail.com> wrote:
> >
> > Hi Dean,
> >
> > I have a question for 32-bit platforms. I see in the code that you used
> the following in compiler-rt/trunk/lib/xray/xray_interface_internal.h :
> > struct XRaySledEntry {
> > uint64_t Address;
> > uint64_t Function;
> > unsigned char Kind;
> > unsigned char AlwaysInstrument;
> > unsigned char Padding[14]; // Need 32 bytes
> > };
> >
> > And the peer code in llvm/trunk/lib/Target/X86/X86MCInstLower.cpp :
> >
> > void X86AsmPrinter::EmitXRayTable() {
> > if (Sleds.empty())
> > return;
> > if (Subtarget->isTargetELF()) {
> > auto *Section = OutContext.getELFSection(
> > "xray_instr_map", ELF::SHT_PROGBITS,
> > ELF::SHF_ALLOC | ELF::SHF_GROUP | ELF::SHF_MERGE, 0,
> > CurrentFnSym->getName());
> > auto PrevSection = OutStreamer->getCurrentSectionOnly();
> > OutStreamer->SwitchSection(Section);
> > for (const auto &Sled : Sleds) {
> > OutStreamer->EmitSymbolValue(Sled.Sled, 8);
> > OutStreamer->EmitSymbolValue(CurrentFnSym, 8);
> > auto Kind = static_cast<uint8_t>(Sled.Kind);
> > OutStreamer->EmitBytes(
> > StringRef(reinterpret_cast<const char *>(&Kind), 1));
> > OutStreamer->EmitBytes(
> > StringRef(reinterpret_cast<const char
> *>(&Sled.AlwaysInstrument), 1));
> > OutStreamer->EmitZeros(14);
> > }
> > OutStreamer->SwitchSection(PrevSection);
> > }
> > Sleds.clear();
> > }
> >
> > So useful part of your entry is 18 bytes, and you add 14 bytes of
> padding to achieve 32-byte alignment. But for 32-bit CPUs I understood that
> XRaySledEntry::Address and XRaySledEntry::Function can be 32-bit too,
> right? So the entry can fit 16 bytes, with 10 useful bytes and 6 bytes of
> padding. Can I use 16-byte entries or is there some external (OS? ELF?
> Linker?) requirement that one entry must be 32-bytes, or aligned at 32
> bytes, etc.?
> >
>
> Good question Serge -- technically there isn't any specific external
> requirement here, but that supporting 32-bit x86 isn't a priority for me
> right now. I suspect it's possible to support 32-bit x86 with a similar
> approach (modifying both the LLVM back-end to emit the right assembly for
> 32-bit x86 and maybe for 32-bit non-x86, as well as compiler-rt to work on
> 32-bit x86) but that I haven't had the time to explore this yet.
>
> I'm positive that it's doable though and that we know the right places
> where the changes have to happen. There's some work being done on the
> tooling side of things and I suspect once we have a standardised log
> format, things like endianness and sizes of certain values start becoming
> an issue. For example, if I build a log analysis tool to work on 64-bit
> systems, whether it should be able to handle log files generated in 32-bit
> systems (and be able to read differently-sized instrumentation map
> sections).
>
> For this reason, I think it's better to stay consistent and
> forward-compatible (i.e. not have special cases for 32-bit platforms). I do
> think it's important to support 32-bit systems too, and I'd be more than
> happy to review patches that would make it possible (until say I get the
> time to support 32-bit platforms later on).
>
> Does that make sense?
>
> Cheers
>
> -- Dean
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160808/9be68189/attachment.html>
More information about the llvm-dev
mailing list