[LLVMdev] GSoC on LLVM usability?

Wed Mar 28 07:44:59 PDT 2012

Hi Tobias,

Thank you for replying!

> 1. Relating LLVM-IR to the source code is difficult

To relate LLVM-IR to the source code, I am simply considering using the
debug information that remains valid across optimizations and analyses.
For transformations that do not relate to any original source code, I think
this has to be solved case by case, but it will hopefully be greatly eased by
the debug information format (and if a transformation turns out to be
impossible to explain, that will be valuable feedback on the debug format!).
Also, some transformations might simply not need to be exposed to the user, as
they do not refer to a standard compilation technique. This is why my proposal
involves enumerating the optimizations which will yield a message to the user,
and then composing these messages.

> 2. There will be a lot of messages

About the potentially huge number of messages, I would assign a "type"
(represented by enum INFO_INDEX in the prototype for info_printf) to each
message. The compiler would remember which types previously appeared, and favor
the types which appeared the least. It would also handle simple dependencies,
like not introducing Loop-Closed SSA before SSA itself. As you may be concerned
about performance, the system would not store all messages before selecting the
ones to display, but rather display or discard each one as it comes.
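
To make the selection idea concrete, here is a minimal sketch of such a
streaming filter. Everything in it is an assumption for illustration: the
enumerators, the info_should_display name, the prerequisite table and the
per-type cap are hypothetical, and the cap is only a crude approximation of
"favor the types which appeared the least". The decision is made message by
message, so nothing is stored beyond one counter per type.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical message types; the real set would be enumerated first. */
enum INFO_INDEX { INFO_SSA, INFO_LCSSA, INFO_INLINING, INFO_COUNT };

/* Assumed policy: a given type is displayed at most this many times. */
#define INFO_MAX_PER_TYPE 2

/* Prerequisite of each type, or INFO_COUNT when there is none
 * (assumption: at most one prerequisite per type). */
static const enum INFO_INDEX info_prereq[INFO_COUNT] = {
    [INFO_SSA]      = INFO_COUNT,
    [INFO_LCSSA]    = INFO_SSA,   /* do not introduce LCSSA before SSA */
    [INFO_INLINING] = INFO_COUNT,
};

/* How many messages of each type have been displayed so far. */
static unsigned info_shown[INFO_COUNT];

/* Decide on the spot whether a message of this type is displayed.
 * Accepting a message updates the counter; nothing else is stored. */
bool info_should_display(enum INFO_INDEX type)
{
    assert((unsigned)type < (unsigned)INFO_COUNT);

    enum INFO_INDEX dep = info_prereq[type];
    if (dep != INFO_COUNT && info_shown[dep] == 0)
        return false;               /* prerequisite concept not shown yet */

    if (info_shown[type] >= INFO_MAX_PER_TYPE)
        return false;               /* this type has appeared often enough */

    info_shown[type]++;
    return true;
}
```

With this table, an INFO_LCSSA message arriving before any INFO_SSA message
is dropped, and a third INFO_SSA message is dropped as well.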

> (3). making this actually work seems to be difficult

As for the difficulty of implementing this work, I have tried to propose the
simplest reasonable approach:
_ The necessary code to write (the heuristic selecting types to display) is
  contained in a single independent function.
_ I do not intend to browse and update every optimizing component. The original
  programmers are best placed to explain their own creations, so I will just
  compose detailed sample messages and provide usability insights.

You might think that this task is underestimated in terms of coding, but the
actual difficulty will be to compose the messages. For example, how can one
give a glimpse of SSA by relating directly to the user code? In that case I
would make a reassignment trigger the message:
test.c:xx:x: info: Static Single Assignment: Reassignment of 'a' is considered a new variable; consequently, a store operation was saved by discarding its previous value

The point is not to give an exhaustive description of the optimization
concerned, but rather to foster the curiosity of the user, who can then
"google" the term we put at the head of the message.

A revised work plan could be:
_ Enumerating all possible messages in the messages set.
_ Implementing a function receiving feedback from each optimization unit and
  choosing whether to display it: info_printf(enum INFO_INDEX, const char*, ...);
_ Updating one or two optimization blocks to display such feedback, to test
  info_printf.
_ Writing a formatting guide for adding messages to the set.
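
As a rough illustration of the second item, info_printf could be a thin
variadic wrapper around vfprintf. This is only a sketch under assumptions:
the int return value, the output to stderr, the enumerators and the "first
occurrence of each type only" filter are placeholders, not part of the
proposal; the real filter would be the selection heuristic.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message types, for illustration only. */
enum INFO_INDEX { INFO_SSA, INFO_LOOP_UNROLL, INFO_COUNT };

/* One bit per type that has already been displayed (placeholder filter). */
static unsigned info_seen_mask;

/* Formats and prints the message if its type passes the filter.
 * Returns the number of characters written by vfprintf,
 * or 0 when the message is discarded. */
int info_printf(enum INFO_INDEX type, const char *fmt, ...)
{
    assert((unsigned)type < (unsigned)INFO_COUNT);

    unsigned bit = 1u << (unsigned)type;
    if (info_seen_mask & bit)
        return 0;                   /* discarded immediately, never stored */
    info_seen_mask |= bit;

    va_list args;
    va_start(args, fmt);
    int written = vfprintf(stderr, fmt, args);
    va_end(args);
    fputc('\n', stderr);
    return written;
}
```

An optimization block would then call, for instance,
info_printf(INFO_SSA, "%s:%d:%d: info: %s", file, line, col, text);
where file, line, col and text come from the debug information.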

Regards,
Thibault