[libc-commits] [PATCH] D158774: [libc] Add GPU support for `printf` and `fprintf`
    Joseph Huber via Phabricator via libc-commits 
    libc-commits at lists.llvm.org
       
    Wed Sep  6 13:35:42 PDT 2023
    
    
  
jhuber6 added inline comments.
================
Comment at: libc/src/stdio/gpu/parser.h:134
+    ++cur_pos;
+
+  if (format[cur_pos] != '\0')
----------------
michaelrj wrote:
> jhuber6 wrote:
> > michaelrj wrote:
> > > I realize that this was probably for size reasons, but this should probably add back something for parsing flags, widths, and precisions. The current design will accept all valid format specifiers, but won't reject all invalid format specifiers (e.g. `%?d`).
> > I originally had code for that but it added an absurd amount of resource usage that I don't think we really need. The parsing here will definitely pick this up as an argument and copy the argument to the server. We use a real parser there so the actual printing will pick it up, but I guess the number of arguments might be off?
> > 
> > I'm guessing the standard solution is to just ignore it?
> In the case of an invalid format specifier (like `%?d`), the behavior is undefined. The way we generally handle it is by treating the invalid specifier (in this case `%?`) as a raw string. The current micro-parser would mistakenly skip past the `?` and take `d` as the conversion, when it is really just a literal letter `d`. This would cause argument misalignment, since there would be an extra `int` argument where there shouldn't be one.

So it's similar to any other architecture: you have registers that store intermediate values. The GPU differs in that it has different kinds of registers, and a lot of them. Because the GPU uses a SIMD architecture at its core, there is a separation between scalar registers and vector registers. These conditions are heavy on the vector registers because they're values that can be divergent across the SIMD group. E.g. we have 32 threads in a "warp" all executing in lock-step, so a vector register contains 32 values. If we're doing a printf where the format string differs between threads (uncommon but possible), then we'll hit the vector registers harder.

Unrelated, @arsenm, do we have a way to assert that some of these values will be uniform if the format string is uniform? It would probably save us a good amount of vector registers if we knew that this internal state would all be the same.
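
For illustration only, here is a minimal sketch of the argument misalignment described in the quoted comment. It is not the code in `parser.h`: the names `count_args_lenient`, `count_args_strict`, `in_set`, `is_digit`, and `is_conversion` are hypothetical, and the sketch ignores `*` widths and `%n`. The lenient counter models a micro-parser that skips ahead to the next conversion letter; the strict counter models a full parser that treats an invalid specifier as raw text.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: true if c is one of the characters in set.
// The explicit '\0' check matters because strchr also matches the terminator.
static bool in_set(char c, const char *set) {
  return c != '\0' && std::strchr(set, c) != nullptr;
}

static bool is_digit(char c) { return c >= '0' && c <= '9'; }

// Conversion letters that consume one argument (assumption: %n is left out).
static bool is_conversion(char c) { return in_set(c, "diouxXeEfFgGaAcsp"); }

// Lenient model: after '%', skip anything until a conversion letter appears.
std::size_t count_args_lenient(const char *fmt) {
  std::size_t count = 0;
  for (std::size_t i = 0; fmt[i] != '\0'; ++i) {
    if (fmt[i] != '%')
      continue;
    ++i;
    if (fmt[i] == '%') // "%%" is a literal percent sign
      continue;
    while (fmt[i] != '\0' && !is_conversion(fmt[i]))
      ++i;
    if (fmt[i] == '\0')
      break;
    ++count;
  }
  return count;
}

// Strict model: only flags, width, precision, and length modifiers may sit
// between '%' and the conversion letter; otherwise '%' is treated as raw text.
std::size_t count_args_strict(const char *fmt) {
  std::size_t count = 0;
  std::size_t i = 0;
  while (fmt[i] != '\0') {
    if (fmt[i] != '%') {
      ++i;
      continue;
    }
    std::size_t j = i + 1;
    if (fmt[j] == '%') { // "%%" is a literal percent sign
      i = j + 1;
      continue;
    }
    while (in_set(fmt[j], "-+ #0")) // flags
      ++j;
    while (is_digit(fmt[j])) // width
      ++j;
    if (fmt[j] == '.') { // precision
      ++j;
      while (is_digit(fmt[j]))
        ++j;
    }
    while (in_set(fmt[j], "hlLjzt")) // length modifiers
      ++j;
    if (is_conversion(fmt[j])) {
      ++count;
      i = j + 1;
    } else {
      ++i; // invalid specifier: resume scanning right after the '%'
    }
  }
  return count;
}
```

For `"%?d"` the lenient counter reports one argument and the strict counter reports zero, which is the mismatch in question. For valid strings such as `"%d %s"` or `"%-08.3ld"` the two agree.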
Repository:
  rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158774/new/
https://reviews.llvm.org/D158774
    
    