[cfe-dev] [llvm-dev] Rewriting calls to varargs functions

Dávid Bolvanský via cfe-dev cfe-dev at lists.llvm.org
Tue May 22 17:25:46 PDT 2018


Interesting ideas, thanks!

But if only the printf("Hello, %s", "world") to printf("%s", "Hello,
world")-like transformation makes some sense, I think it is not worth
doing at all.


Anyway, thank you for all your suggestions.
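
For reference, here is a minimal sketch of the rewrites discussed in the quoted thread below. It uses snprintf into local buffers instead of printf so that the outputs can be compared. The function names are made up for illustration and are not part of D47159. The '%' case is my own addition, not something raised in the thread: a string argument containing '%' would have to be escaped as "%%" before it is folded into the format string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the three spellings discussed in the thread format to the
   same bytes, 0 otherwise. (Hypothetical helper, for illustration only.) */
int hello_variants_agree(void) {
    char original[32], folded[32], hoisted[32];
    snprintf(original, sizeof original, "Hello, %s", "world"); /* as written */
    snprintf(folded, sizeof folded, "Hello, world");           /* arg folded in */
    snprintf(hoisted, sizeof hoisted, "%s", "Hello, world");   /* text hoisted out */
    return strcmp(original, folded) == 0 && strcmp(original, hoisted) == 0;
}

/* Returns 1 if folding the argument "100%" into the format string gives the
   same bytes once the '%' is written as "%%". (Assumption: not from the thread.) */
int percent_needs_escaping(void) {
    char original[32], folded[32];
    snprintf(original, sizeof original, "%s done", "100%");
    snprintf(folded, sizeof folded, "100%% done");
    return strcmp(original, folded) == 0;
}

/* Small self-check that ties the two cases together. */
void check_rewrites(void) {
    assert(hello_variants_agree());
    assert(percent_needs_escaping());
}
```

This only shows that the outputs agree for plain string arguments. It says nothing about which form is faster or smaller, which is the open question in the thread.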

2018-05-23 2:11 GMT+02:00 Richard Smith <richard at metafoo.co.uk>:

> Converting to puts is usually not possible: puts appends a newline to its
> output. The only really appropriate thing to convert to, that works in
> general, is fwrite. But we can't convert to that because we can't form the
> 'stdout' parameter (stdout might be a macro rather than a global, or might
> have a nontrivial mangling, so LLVM can't synthesize it). Also, converting
> printf("Hello, %s", "world") to printf("Hello, world") is likely a
> pessimization rather than an optimization for performance: printing a
> string via %s just needs to write the string, whereas printing a format
> string needs to scan for %s.
>
> Having said all that, the opposite conversion (from printf("Hello, %s",
> "world") to printf("%s", "Hello, world")) may be marginally worthwhile. And
> there are some non-trivial tradeoffs here if you want to optimize for size.
> (Eg, some format string refactorings may permit more string constant reuse.)
>
> On 22 May 2018 at 10:26, Hubert Tong via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
>> On Tue, May 22, 2018 at 12:59 PM, Dávid Bolvanský via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> It could save needless format-string parsing in printf/fprintf/sprintf
>>> at run time.
>>>
>> A mix of calls to puts and calls to printf with format strings containing
>> just a conversion specifier can help towards such a goal without mutating
>> constants beyond the format string.
>>
>>
>>>
>>> E.g. for code that calls fprintf heavily, like
>>> fprintf(f, "%s: %s", TAG, msg);, I think it could be quite useful.
>>> After this transformation (with TAG being "ABC") we would get
>>> fprintf(f, "ABC: %s", msg);. We could save one push/mov instruction and
>>> do less parsing in fprintf every time we call it. We would just replace
>>> the string constant "%s: %s" with "ABC: %s", and a possibly orphaned
>>> "ABC" constant could be removed completely.
>>>
>>>
>>>
>>> 2018-05-22 18:36 GMT+02:00 Hal Finkel <hfinkel at anl.gov>:
>>>
>>>>
>>>> On 05/22/2018 10:42 AM, Dávid Bolvanský wrote:
>>>>
>>>> Thanks.
>>>>
>>>> Yes, to substitute only some of the arguments. The formatting used by
>>>> printf depends on the locale, but only for double and float types, I
>>>> think. So yes, I would not place double/float constants into the format
>>>> string.
>>>>
>>>>
>>>> Okay. I think it's true that integers will be the same regardless of
>>>> locale (so long as the ' flag is not used, as that brings in a dependence
>>>> on LC_NUMERIC).
>>>>
>>>>
>>>> Why? To reduce the number of constants (some of them could be merged into
>>>> the format string) and the number of arguments when calling
>>>> printf/fprintf/sprintf, etc.
>>>>
>>>>
>>>> Sure, but it seems to me unlikely that this will affect performance. Is
>>>> it a code-size optimization (this actually isn't obvious to me because the
>>>> string representation might be longer than the binary form of the constant
>>>> plus the extra instructions)?
>>>>
>>>>  -Hal
>>>>
>>>>
>>>>
>>>> 2018-05-22 16:22 GMT+02:00 Hal Finkel <hfinkel at anl.gov>:
>>>>
>>>>>
>>>>> On 05/22/2018 04:32 AM, Dávid Bolvanský via llvm-dev wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> A new patch:
>>>>> https://reviews.llvm.org/D47159
>>>>>
>>>>> proposes transformations like:
>>>>> printf("Hello, %s %d", "world", 123) -> printf("Hello, world 123")
>>>>>
>>>>>
>>>>> To clarify, the real question here comes up when you can only
>>>>> substitute some of the arguments? If you can substitute all of the
>>>>> arguments, then you can turn this into a call to puts.
>>>>>
>>>>> In any case, why do you want to do this? Also, doesn't the formatting
>>>>> used by printf depend on the process's current locale?
>>>>>
>>>>>  -Hal
>>>>>
>>>>>
>>>>> As Eli noted:
>>>>>
>>>>> "I'm not sure we can rewrite calls to varargs functions safely in
>>>>> general given the current state of the C ABI rules in LLVM.
>>>>>
>>>>> Sometimes clang does weird things to conform with the ABI rules,
>>>>> because the LLVM type system isn't the same as the C system. For most
>>>>> functions, it's pretty easy to tell it happened: if the IR signature of the
>>>>> function doesn't match the expected signature, something weird happened, so
>>>>> we can just bail out. But varargs functions don't specify a complete
>>>>> signature, so we can't tell if the clang ABI code was forced to do
>>>>> something weird, like split an argument into multiple values, or insert a
>>>>> padding value. For example, for the target mips64-unknown-linux-gnu, a call
>>>>> like printf("asdf%Lf", 1.0L); gets lowered to the following:
>>>>>
>>>>> %call = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([5 x
>>>>> i8], [5 x i8]* @.str, i32 0, i32 0), i64 undef, fp128
>>>>> 0xL00000000000000003FFF000000000000) #2"
>>>>>
>>>>>
>>>>> I would like to hear more suggestions on whether it is safe or not. It
>>>>> seems that Clang produces some weird IR for mips, but e.g. the x86 IR
>>>>> seems OK.
>>>>>
>>>>> Can any folks from Clang/LLVM bring more information about "varargs vs
>>>>> ABI vs LLVM vs Clang"?
>>>>> And can we rewrite calls to varargs functions safely under some
>>>>> conditions?
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>
>>>>>
>>>>> --
>>>>> Hal Finkel
>>>>> Lead, Compiler Technology and Programming Languages
>>>>> Leadership Computing Facility
>>>>> Argonne National Laboratory
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
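
On the locale question in the exchange with Hal quoted above: the decimal-point character that "%f" prints comes from LC_NUMERIC, while a plain "%d" (without the ' flag) does not depend on it. Below is a small sketch of how a constant folded at compile time could be checked against what the run-time locale would print. The helper names are hypothetical, the locale name passed in is whatever the caller's system provides, and for simplicity the helper resets LC_NUMERIC to "C" instead of restoring the previous setting.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if "%.1f" of 1.5 under `locale_name` equals `folded`, 0 if it
   differs, and -1 if setlocale rejects the locale name.
   (Hypothetical helper; resets LC_NUMERIC to "C" before returning.) */
int double_matches_under(const char *locale_name, const char *folded) {
    char buf[32];
    if (setlocale(LC_NUMERIC, locale_name) == NULL)
        return -1;
    snprintf(buf, sizeof buf, "%.1f", 1.5);
    setlocale(LC_NUMERIC, "C");
    return strcmp(buf, folded) == 0;
}

/* Returns 1 if "%d" of 123 equals `folded` under the current locale. */
int int_matches(const char *folded) {
    char buf[32];
    snprintf(buf, sizeof buf, "%d", 123);
    return strcmp(buf, folded) == 0;
}

/* Self-check for the "C" locale only, which every implementation provides. */
void check_c_locale(void) {
    assert(double_matches_under("C", "1.5") == 1);
    assert(int_matches("123"));
}
```

Calling double_matches_under with a locale whose decimal point is a comma (for example a German locale, if one is installed) would be expected to return 0 for the folded constant "1.5", which is why double/float constants are left out of the folding.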