[LLVMdev] [Proposal] Parallelize post-IPO stage.

Wan, Xiaofei xiaofei.wan at intel.com
Wed Jul 17 18:56:57 PDT 2013

From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Nick Kledzik
Sent: Thursday, July 18, 2013 7:54 AM
To: Shuxin Yang
Cc: LLVM Developers Mailing List
Subject: Re: [LLVMdev] [Proposal] Parallelize post-IPO stage.

On Jul 17, 2013, at 4:29 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:

On 7/17/13 4:12 PM, Nick Kledzik wrote:
On Jul 14, 2013, at 7:07 PM, Andrew Trick <atrick at apple.com> wrote:
The partitioning should be deterministic. It's just that the linker output now depends on the partitioning heuristics. As long that decision is based on the input (not the host system), then it still meets Eric's requirements. I just think it's unfortunate that post-IPO partitioning (or more generally, parallel codegen) affects the output, but may be hard to avoid. It would be nice to be able to tune the partitioning for compile time without worrying about code quality.
I also want to chime in on the importance of stable binary outputs.  And not just same compiler and same sources produces same binary, but that minor changes to either should cause minor changes to the output binary.   For software updates, Apple updater tries to download only the delta to the binaries, so we want those to be as small as possible.  In addition, it often happens late in an OS release cycle that some critical bug is found and the fix is in the compiler.  To qualify it, we rebuild the whole OS with the new compiler, then compare all the binaries in the OS, making sure only things related to the bug are changed.

We can view partitioning as a "transformation".  Unless the transformation is an absolute no-op,
it will change something.  If we care about consistency in the binaries, we should either consistently
use partitioning or consistently not use it.
But doesn't "consistently not use partitioning" mean "don't use the optimization you are working on"?   Isn't there some way to get the same output no matter how it is partitioned?
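To make the determinism point above concrete, here is a minimal sketch of a partitioning heuristic keyed only to the input, never to the host: each function is assigned to a partition by a stable hash of its name. The function names and the FNV-1a hash are illustrative assumptions, not anything the thread or LLVM specifies; the point is only that the same merged module always produces the same partitioning regardless of core count or thread timing.

```cpp
#include <cstdint>
#include <string>

// Hypothetical sketch: FNV-1a hash of the function name, a property of the
// input alone (nothing about the host machine feeds into it).
static uint32_t stableHash(const std::string &Name) {
  uint32_t H = 2166136261u;   // FNV-1a offset basis
  for (unsigned char C : Name) {
    H ^= C;
    H *= 16777619u;           // FNV-1a prime
  }
  return H;
}

// Map a function name to one of NumParts partitions, deterministically.
unsigned partitionOf(const std::string &FnName, unsigned NumParts) {
  return stableHash(FnName) % NumParts;
}
```

A heuristic like this still changes the output relative to no partitioning at all, which is the tension discussed above, but it at least satisfies the requirement that the result depends only on the input.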

 The compiler used to generate a single object file from the merged
IR; now it will generate multiple object files, one for each partition.
I have not studied the MC interface, but why does each partition need to generate a separate object file?  Why can't the first partition that finishes create an object file, and as the other partitions finish, they just append to that object file?

We could merge the object files as an alternative.
However, how would we know the /path/to/ld from the existing interface ABIs?
How would we know the flags to feed to ld (more often than not, "-r" alone is enough,
but some linkers may need more)?

In my rudimentary implementation, I hacked around this by hardcoding /usr/bin/ld.
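The merge step being debated here is a relocatable link of the per-partition objects. As a hedged sketch, assuming the hardcoded linker path from the prototype (which, as noted, the compiler has no portable way to discover), the command line it would need to build looks like this; the helper name and the file names are illustrative, not from the thread:

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: build the argv for merging per-partition object
// files back into a single relocatable object with `ld -r`. The default
// linker path mirrors the hardcoded /usr/bin/ld mentioned in the thread;
// some linkers may need flags beyond "-r".
std::vector<std::string>
buildMergeCommand(const std::vector<std::string> &PartObjs,
                  const std::string &OutObj,
                  const std::string &Linker = "/usr/bin/ld") {
  std::vector<std::string> Argv = {Linker, "-r", "-o", OutObj};
  Argv.insert(Argv.end(), PartObjs.begin(), PartObjs.end());
  return Argv;
}
```

The real implementation would still have to spawn this command and handle failures, which is exactly the extra machinery (and the path/flag discovery problem) that makes the single-writer alternative below attractive.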

I think adding the objects one by one back to the linker is better, as the linker already has
enough information.
I think you missed my point, or you are really thinking from the multi-process point of view.   In LLVM there is an MCWriter used to produce object files.   Your model is that if there are three partitions, then there will be three MCWriter objects, each producing an object file.  What I am saying is to have only one MCWriter object and have all three partitions stream their content out through the one MCWriter, producing one object file.
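Nick's single-MCWriter model can be illustrated with a small stand-in (not the real MC API): each partition produces its content into a private buffer in parallel, and then one writer streams all buffers out in partition order, yielding a single, deterministic output instead of one object file per partition. All names here are assumptions for illustration:

```cpp
#include <string>
#include <thread>
#include <vector>

// Hypothetical sketch: N partitions are "compiled" concurrently, but a
// single writer emits the results, in partition order, into one output.
// The bracketed string stands in for per-partition codegen through MC.
std::string streamThroughOneWriter(const std::vector<std::string> &Partitions) {
  std::vector<std::string> Buffers(Partitions.size());
  std::vector<std::thread> Workers;
  for (size_t I = 0; I < Partitions.size(); ++I)
    Workers.emplace_back([&Buffers, &Partitions, I] {
      Buffers[I] = "[" + Partitions[I] + "]";  // stand-in for codegen
    });
  for (auto &T : Workers)
    T.join();
  // The single "MCWriter": concatenate in fixed partition order, so the
  // output is independent of which thread finished first.
  std::string Out;
  for (const auto &B : Buffers)
    Out += B;
  return Out;
}
```

Because the emission order is fixed by partition index rather than completion order, this shape also helps with the stable-binary-output requirement raised earlier in the thread.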

[Xiaofei] If you only target parallelizing the post-LTO passes, there is no need to partition; parallelizing the function passes is enough and could achieve better performance (no need to link the partitions back together, since they share the AsmPrinter & MCWriter).
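The alternative above, running function-at-a-time passes in parallel over one merged module, can be sketched as follows. This is a generic fork-join pattern, not LLVM's pass-manager API; the function list, the pass signature, and the static round-robin schedule are all illustrative assumptions. Each worker mutates only its own functions, so no partitioning and no re-linking is needed, and the downstream AsmPrinter/MCWriter still sees a single module:

```cpp
#include <string>
#include <thread>
#include <vector>

// Hypothetical sketch: apply a per-function pass in parallel across the
// functions of one module. Thread T handles functions T, T+N, T+2N, ...,
// so every function is visited exactly once and by exactly one thread.
void parallelForEachFunction(std::vector<std::string> &Functions,
                             unsigned NumThreads,
                             void (*Pass)(std::string &)) {
  std::vector<std::thread> Pool;
  for (unsigned T = 0; T < NumThreads; ++T)
    Pool.emplace_back([&Functions, NumThreads, Pass, T] {
      for (size_t I = T; I < Functions.size(); I += NumThreads)
        Pass(Functions[I]);  // touches only this thread's functions
    });
  for (auto &Th : Pool)
    Th.join();
}
```

A static schedule like this keeps the set of transformations independent of thread timing, though real function passes would also have to avoid racing on shared module-level state (constants, globals, and so on), which is the hard part this sketch elides.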


