[PATCH] D35334: ThinLTO Minimized Bitcode File Size Reduction

Haojie Wang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 13 17:07:21 PDT 2017


haojiewang added inline comments.


================
Comment at: lib/Bitcode/Writer/BitcodeWriter.cpp:4031
+
+  writeModuleInfo();
+
----------------
pcc wrote:
> haojiewang wrote:
> > pcc wrote:
> > > tejohnson wrote:
> > > > pcc wrote:
> > > > > haojiewang wrote:
> > > > > > pcc wrote:
> > > > > > > haojiewang wrote:
> > > > > > > > pcc wrote:
> > > > > > > > > With this call you're emitting a lot of unnecessary information into the minimized bitcode file. As I was suggesting on the other thread I wonder whether you can just emit "short" module-level records that just contain the name and the linkage (maybe see if you can drop the linkage, as it should already be available in the summary).
> > > > > > > > Yes, this is what I'm going to do after this patch released. So do you suggest I do it in this patch or leave it to next patch?
> > > > > > > I would suggest writing an initial patch that changes the index bitcode reader to read the linkage from the summary instead of from the module, and then once that lands do what I suggested in this patch.
> > > > > > If it is ok that we do that after this patch? Because I think module info have too much redundancy info for minimized bitcode file, most of which is necessary for IR bitcode file. We can add a new writeModuleInfo for minimized bitcode file after this patch, and then also change the bitcode reader. That can get 5%-10% benefit on size reduction.
> > > > > Up to you but keep in mind that your code will be easier to review if you do things in the order I suggested, because we won't have to think about the design that we know you're going to replace, just the one that you're going to replace it with.
> > > > That change seems like an incremental follow on to this one. I.e. it affects the reader, whereas this patch just takes a first cut at reducing what we emit in the writer. So I am personally happy with this one going in first. Peter, are there things here that would be throwaway work if this went in first and that was a follow on?
> > > I can accept that it may be easier to develop the patches in this order, but as I mentioned it makes things harder to review (which I guess is the "throwaway work" -- if we go with this first I would need to carefully look at the body of writeModuleInfo to make sure that running it wouldn't cause problems in this new flow, even though we know that it won't make a difference in the end).
> > > 
> > > Generally when I write patches I try to make things easier for the reviewer. But as I say, up to you folks.
> > I checked the code and found a problem for getting linkage from the summary instead of module info. The current logic is that we first insert the ValueID-ValueInfo map, which requires GUID(getting from name, linkage and source file name) and ValueID. This will be done after reading the module info. Then when reading summary, we can get GUID from that map. If we want to get the linkage for a value when reading summary, we cannot get the GUID until we read that value's summary. But this value may appear before this in a ref list or call list of other value, which requires the value info for it, while we don't have that yet. So if we want to do it this way, the logic may become more complex: we have to store all the ref list and call list to a ValueID-XXList map, and update them to SummaryIndex when we finish reading all the summary part. Is there another way to solve this problem?
> I see. In that case I think it makes sense to forget about moving the linkage as an initial step and change this patch so that it creates short module records containing just the name and the linkage. That would address my main design concern with this patch.
I have two questions for shortening the module records:
1. If we only record the name and linkage, the index of linkage in module record will be changed(from 5 to 2). Should we change it only for summary or also change the original module info(put linkage right after the name)? We have to also support reading summary from full IR bitcode file, so I guess we should let them have the same format. 
2. Currently there are two versions of module record(one is getting name from VST and another is getting name from strtab). Should we add a third version? Or obsolete the old ones?



https://reviews.llvm.org/D35334





More information about the llvm-commits mailing list