[PATCH] D87272: [lld] Buffer writes when composing a single diagnostic

Thu Sep 9 18:36:21 PDT 2021

MaskRay added a comment.

In D87272#2993146 <https://reviews.llvm.org/D87272#2993146>, @dblaikie wrote:

>>> Presumably there's a reason that's not the only option/hardcoded, though?
>>
>> Unavailable for Windows<10?
>
> Perhaps that's it, I'm not sure.
>
>>>>> If people are getting pages and pages of errors they're ignoring/want to perform quickly - that seems like something that deserves a different solution: turning off the warnings they are ignoring.
>>>>
>>>> This is separate: whether we need an option like `--warn-limit=N`.
>>>
>>> I don't think it is separate, though - this change seems to add extra complications to the code to improve performance for a case that doesn't seem like it should matter (generally I don't think error code paths are optimized like this/planned for performance of large numbers of errors being produced, the general design guidance is that once you have to show a message a human has to read and think about, the actual runtime it takes to produce that message is always going to be smaller than the human time required to understand and act on it)
>>
>> The output can be a distributed build farm. Making diagnostic 3x / 4x faster can be sufficient to allow a build not to time out while it could time out previously.
>
> It still doesn't seem like the right solution to me - we shouldn't be generating reams of output that the user won't use. And it sounds like that's the case here. Scrolling through pages of output doesn't seem like a beneficial user experience to me, so far as I can tell?

Users who analyze output with out-of-range relocations may need every relocation.
I am unsure the number of warnings needs a limit (like `--error-limit`).

Say, with a link of 400s, the build just timed out (due to large amount of output) in a build farm.
With 100s, it may succeed with large amount of output.
The 3x/4x output speedup makes a large difference.

>>>> Separate options turning off warnings seem less useful to me. There are some warnings which don't deserve an option (which would make things more complicate). Off-by-default options need the users to rerun the link to turn off it.
>>>
>>> But you've mentioned multiple cases (both in the patch description ("For an invocation of ld.lld with 62000+ lines of..."), and in the renewed motivation to commit this patch (" I noticed another internal user complaining about link timeout due to spending too much time dumping the messages.")).
>>
>> The "symbol ordering file" warning motivation was somewhat weak. It was due to incorrect usage. I don't think it needs dedicated warning option.
>
> I'm not suggesting that lld must have a flag - just that if this issue is going to be addressed, I think that direction is more suitable than the one taken here. Doing neither is OK to me. But doing this rather than fixing the diagnostic experience in a way that's helpful to the user is something I don't think is the right direction.

I have some idea about the frequency of various error/warning diagnostics.
I think it is overkill to introduce separate -W and -Wno like options now, beside some features which already have a disabling mechanism.

>> The new motivation is `relocation XXX out of range` with `--noinhibit-exec`.
>> This is stronger. Faster output can be useful when analyzing large output.
>
> What sort of analysis do you have in mind? It certainly doesn't seem like this would be useful to a normal user running their linker. If they have to write scripts to try to deduplicate, find the largest value, or do some other analysis over the output - I don't think the performance of lld's diagnostic printing is going to be the expensive part of that process.

See a paragraph above.
I am thinking of some analysis which can figure out output section ordering.

>> I don't think it needs dedicated warning option, though.
>> The slightly generic `--warn-limit=N` may have some marginal use case but I haven't made up my mind to add it.
>>
>>>> This patch addresses another problem: I have seen interleaved diagnostic (something like `msg_from_process_0 msg_from_process_1 msg_from_process_1` on one line) from either Clang or ld.lld from multiple processes.
>>>> This patch will avoid such issue.
>>>
>>> That motivation isn't mentioned in the commit message - the scalability of 10s of thousands of diagnostics is mentioned (& the convenience of strace - also doesn't seem like a huge motivation to me).
>>
>> Sure, it is subjective sometimes. To me `reportDiagnostic` is better because with the previous form it was difficult to ensure the different kinds of diagnostics have the same style.
>
> Having consistent error output is a good thing, yes - I don't object to having a common function - though I'd prefer it either be (or wrap/use) https://llvm.org/doxygen/classllvm_1_1WithColor.html#a59d59f7f8aa89b08f44ad6a87e8ebb1a for instance, which handles the colour changing and formatting in a common way across the whole LLVM project, for instance.

lld needs more full-blown diagnostic builder like `clang::DiagnosticBuilder`.
Currently `WithColor` lacks quite a few things. Perhaps making it support line buffered diagnostics may allow lld to migrate to its interface.
There are other lld specific needs like `lld::errs()`.
Overall I am unclear sharing code would be the best solution across the whole LLVM project.

That said, making `WithColor::error(errs(), ToolName) << ... << ... << ...` line buffered can help some utilities (llvm-objdump/llvm-readelf/etc) to avoid line interleaved output.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87272/new/

https://reviews.llvm.org/D87272