<div dir="ltr">On Wed, Aug 10, 2016 at 9:30 AM, Kevin Enderby <span dir="ltr"><<a href="mailto:enderby@apple.com" target="_blank">enderby@apple.com</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><br><div><span class=""><blockquote type="cite"><div>On Aug 9, 2016, at 7:36 PM, Saleem Abdulrasool <<a href="mailto:compnerd@compnerd.org" target="_blank">compnerd@compnerd.org</a>> wrote:</div><br><div><div dir="ltr">On Tue, Aug 9, 2016 at 2:04 PM, Kevin Enderby via llvm-commits <span dir="ltr"><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">Hi Puyan,<div><br></div><div>Adding Jim Grosbach to get his thoughts.</div><div><div><br></div><div>Stepping back and thinking a bit more about the functionality you want it doesn’t seen clear to me that llvm-objdump is really the right place to do this. I can see that in the llvm tree this would be an easy place to add this. However it seems that llvm-objdump is more about printing the contents of objects and operating on them and taking them apart and putting parts of them in separate files.</div><div><br></div><div>On Mac OS X this functionality really seems to belong to the tool segedit(1) with its -extract option. And you want an extract “all sections” into conventional named output suffixed files. Maybe it would be better to add the functionality to segedit(1) or create an llvm-segedit(1) tool.</div><div><br></div><div>Part of my thinking is how to handle the error cases of multiple files on the command line, archives and multiple slices in a fat file. Reading your code I think multiple files might work as part off your "conventional named” files includes the input file name. But for archives and fat files I think you would get the odd error message "The file was not recognized as a valid object file” which is what ones gets for object_error::invalid_file<wbr>_type. In this case your error really should be more like “The -extract-macho-sections option only works on thin Mach-O files” or something.</div><div><br></div><div>Jim, what do you think about adding this functionality to llvm-objdump?</div></div></div></blockquote><div><br></div><div>Just a by-stander, but, I think that this should go into a new tool `llvm-objcopy`.</div></div></div></div></div></blockquote><div><br></div></span>I’ve never use a system that had an objcopy(1) tool, but looking at the online objcopy(1) - Linux man page it doesn’t seem to have an option to copy out a section’s contents into a plain file. It seems that the output is always another object file.</div></div></blockquote><div> </div><div>Yes, usually, the output is written to another object file. However, you could specify the binary object to dump it to a raw file.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><span class=""><blockquote type="cite"><div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> This would be another one of the missing pieces that we have in the LLVM toolset. `objcopy` allows you to copy out various sections in the ELF world. We could handle this similar to a multi call binary</div></div></div></div></div></blockquote><div><br></div></span>Could you explain a bit more what a "multi call binary” is? Sorry it that is obvious for a Linux person, my back ground is traditional old school Unix and Mac OS X.</div></div></blockquote><div><br></div><div>Sure. Basically switching the behavior of the binary based on the name with which it was invoked. So, we would have a symlink for llvm-segedit which would give you the behavior of segedit, otherwise it would behave like objcopy.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><div class="h5"><div><blockquote type="cite"><div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>and provide a similar option set to `segedit` for MachO and the normal binutils-like command line for `objcopy` which can also do MachO, but does ELF and COFF as well.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><div><br></div><div>Other comments:</div><div><br></div><div>Other things to check is your handling of zero fill section that have no contents. I think your code will write junk or get errors as the offsets for these are zero but they have non-zero section sizes. I think it is best not to write any file for an S_ZEROFILL section type.</div><div><br></div><div>The name of the routine ExtractLoadCommand() seems very odd. The logic seems it would be better if just folded in to your loop over the load commands:</div><div><br></div><div>+ for (const auto &Load : InputObject->load_commands()) {<br>+ if (!ExtractLoadCommand(Filename, *InputObject, Load))<br>+ break;<br>+ }<br><br><div>And since you are really extracting section contents maybe the names of the routines ExtractSegmentCommand<wbr>() and ExtractSegment64Command() would be better as ExtractSections() and ExtractSections64().</div><div><br></div><div>Also the test is writing into the directory with the inputs (I think) because of the use of "-extract-macho-sections-dir ./“ I think it would be better to some how direct this to the build directory so things can be cleaned. Not sure exactly how to do this but I know lld writes its tests to some “Output” directory in the build directory.</div><div><br></div><div>Also I still think a much smaller test that a 4000+ character line of hex would be nicer. And also checking that more then just the one file for one section gets created. Why not create a little Mach-O file with multiple sections and check all the files for the sections are created and contain the correct contents. I would create a Mach-O file that contains ascii section contents and little files that can be diff(1)’ed or cmp(1)’ed. I did this with archives and malformed Mach-O files. But a small file under 100 bytes could be a simple test case.</div><div><br></div><div>My thoughts,</div><div>Kev</div><div><br><div><blockquote type="cite"><div><div><div>On Aug 9, 2016, at 11:51 AM, Puyan Lotfi <<a href="mailto:puyan.lotfi.llvm@gmail.com" target="_blank">puyan.lotfi.llvm@gmail.com</a>> wrote:</div><br></div></div><div><div><div><div dir="ltr">Revised patch again.<div><br></div><div>I removed the hex printing code, let the test handle that. Still using hexdump and filecheck to verify the contents of the binary. </div><div>Added better error handling for file access, and now using all of LLVM's file IO libraries.</div><div>No more error messages on those segment types. </div><div><br></div><div>PL</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Aug 8, 2016 at 2:36 PM, Kevin Enderby <span dir="ltr"><<a href="mailto:enderby@apple.com" target="_blank">enderby@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><br><div><span><blockquote type="cite"><div>On Aug 8, 2016, at 2:29 PM, Puyan Lotfi <<a href="mailto:plotfi@apple.com" target="_blank">plotfi@apple.com</a>> wrote:</div><br><div><div style="word-wrap:break-word"><br><div><blockquote type="cite"><div>On Aug 8, 2016, at 1:53 PM, Kevin Enderby <<a href="mailto:enderby@apple.com" target="_blank">enderby@apple.com</a>> wrote:</div><br><div><div style="word-wrap:break-word"><br><div><blockquote type="cite"><div>On Aug 8, 2016, at 11:27 AM, Puyan Lotfi <<a href="mailto:puyan.lotfi.llvm@gmail.com" target="_blank">puyan.lotfi.llvm@gmail.com</a>> wrote:</div><br><div><div dir="ltr">Kevin<div><br></div><div>I have a new patch attached.</div><div><br></div><div>I am fine with only supporting <span style="font-size:13px">LC_SEGMENT_64 or LC_SEGMENT_64</span><span style="font-size:13px"> for the time being but I adding a message in case the extract code encounters a segment other than those types.</span></div></div></div></blockquote><div><br></div>I’m not understanding why you consider it an error to have load commands other than an LC_SEGMENT_64 or LC_SEGMENT_64 used with Mach-O files used with your option. Is this not expected work with something an x86_64 Mach-O compiled for a Mac OS X? It would have other load commands for like LC_SYMTAB, LC_UUID, LC_FU<wbr>NCTION_STARTS, etc. Or is this option only to work in some GPU Mach-O files with only LC_SEGMENT_64 or LC_SEGMENT_64 load commands?</div><div><br></div></div></div></blockquote><div><br></div><div>The latter is correct. I can add support for those other segments in another commit, but right now I haven't written the code to do that. I don't consider it an error, but I figured the extract should print if it encounters a segment it doesn't support.</div></div></div></div></blockquote><div><br></div></span>I guess maybe I don’t understand what you are saying. There are only two segment load commands, the LC_SEGMENT_64 and LC_SEGMENT_64. Are you using the word “segment” to mean “load command”? If not what do you mean by a "encounters a segment it doesn't support”?</div><div><span><br><blockquote type="cite"><div><div style="word-wrap:break-word"><div><div> Would you like me to add support for all of the possible segment types??</div></div></div></div></blockquote><div><br></div></span>What other “segment types” are there? Sorry I might be lost I’m just not following you.</div><div><span><br><blockquote type="cite"><div><div style="word-wrap:break-word"><div><br><blockquote type="cite"><div><div style="word-wrap:break-word"><div><blockquote type="cite"><div><div dir="ltr"><div><span style="font-size:13px"><br></span></div><div><span style="font-size:13px">I have added some additional flags to narrow down which segment to dump.</span></div></div></div></blockquote><div><br></div><blockquote type="cite"><div><div dir="ltr"><div><span style="font-size:13px">Also, I added a flag and some code to print the section to stderr in a hex format for testing</span></div></div></div></blockquote><div><br></div>For test wouldn’t you just extract the sections using your option and test that it is the same as a file one you add to the test suite?</div><div><br></div></div></div></blockquote><div><br></div><div>I could, but how should I do that? I mean, should I just run cmp on both files?</div></div></div></div></blockquote><div><br></div></span>That’s what I would do. As that is the functionality you are testing/</div><div><span><br><blockquote type="cite"><div><div style="word-wrap:break-word"><div><div> Can I even do that if the test is run on a system without cmp? Sorry, I was just trying to do the thing where I could rely on FileCheck the most here.</div></div></div></div></blockquote><div><br></div></span>I’m not a test expert. Maybe there is a way to test that to files are the same with FileCheck. I guess you might want to see if that is possible.</div><span><div><br><blockquote type="cite"><div><div style="word-wrap:break-word"><div><br><blockquote type="cite"><div><div style="word-wrap:break-word"><div>I don’t really understand the need for the code to dump the section to stderr. There is already the option -section to dump Mach-O sections. Also seems odd that you use the ‘C’ printf style of coding here. Maybe something more C++ like would fit better with the llvm code if you really want to keep this bit.</div><div><br></div></div></div></blockquote><div><br></div><div>Ah, yeah you're right what I am doing here is weird.</div><br><blockquote type="cite"><div><div style="word-wrap:break-word"><div><blockquote type="cite"><div><div dir="ltr"><div><span style="font-size:13px"> and I've added a test case as well.</span></div></div></div></blockquote><div><br></div>Seems like this could be a bit smaller and test the extracted file and use your -extract-macho-sections-dir option. Without that it seems like the test will write into the location of the input file.</div><div><br></div></div></div></blockquote><div><br></div><div>Ah right I think I forgot to add that flag to the test. </div><div><br></div><div>PL</div><br><blockquote type="cite"><div><div style="word-wrap:break-word"><div><blockquote type="cite"><div><div dir="ltr"><div><span style="font-size:13px"><br></span></div><div><span style="font-size:13px">Let me known how this looks to you.</span></div><div><span style="font-size:13px"><br></span></div><div><span style="font-size:13px">Thanks</span></div><div><span style="font-size:13px"><br></span></div><div><span style="font-size:13px">PL</span></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Aug 5, 2016 at 4:22 PM, Kevin Enderby <span dir="ltr"><<a href="mailto:enderby@apple.com" target="_blank">enderby@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Puyan,<br>
<br>
Seems like the functionality would be easier to test if it didn’t write the output files to the same directory as the input file. And of course you’ll want to add a test case as the llvm tradition with any changes.<br>
<br>
Also I’m not an llvm style expert but you can likely shorten the logic and not use the a switch statement for your return value from ExtractLoadCommand(). Also it looks like it should get an error with any Mach-O file that has non LC_SEGMENT_64 or LC_SEGMENT_64 commands? Is that correct and what you want.<br>
<br>
You also might just want to return a bool for success of failure from ExtractSectionData() and your other routines.<br>
<br>
Also it llvm seems to like the shortest bit if code so I don’t think you need the “int Res” and could just use an if statement.<br>
<br>
Kev<br>
<div><div><br>
> On Aug 5, 2016, at 4:06 PM, Puyan Lotfi <<a href="mailto:puyan.lotfi.llvm@gmail.com" target="_blank">puyan.lotfi.llvm@gmail.com</a>> wrote:<br>
><br>
> Hi Kevin, All:<br>
><br>
> I have a macho-extract tool kicking around locally that I threw together a while back that simply dumps each section in a MachO to disk, and I've decided it's probably high-time to upstream the functionality into llvm-objdump.<br>
><br>
> Let me know if this looks good enough to check in, or otherwise feedback would be nice too.<br>
><br>
> Thanks<br>
><br>
> PL<br>
><br>
><br>
</div></div>> <<a href="http://llvm-objdump-macho-extract.pa/" target="_blank">llvm-objdump-macho-extract.pa</a><wbr>tch><br>
<br>
</blockquote></div><br></div>
<span><llvm-objdump-macho-extract2.p<wbr>atch></span></div></blockquote></div><br></div></div></blockquote></div><br></div></div></blockquote></div><br></span></div></blockquote></div><br></div>
</div></div><span><llvm-objdump-macho-extract3.p<wbr>atch></span></div></blockquote></div><br></div></div></div></div><br>______________________________<wbr>_________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-commits</a><br>
<br></blockquote></div><br><br clear="all"><div><br></div>-- <br><div data-smartmail="gmail_signature">Saleem Abdulrasool<br>compnerd (at) compnerd (dot) org</div>
</div></div>
</div></blockquote></div><br></div></div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature">Saleem Abdulrasool<br>compnerd (at) compnerd (dot) org</div>
</div></div>