[PATCH] D127462: [Clang] Begin implementing Plan 9 C extensions

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Nov 21 05:37:16 PST 2022


aaron.ballman added a comment.

In D127462#3939243 <https://reviews.llvm.org/D127462#3939243>, @ksaunders wrote:

> Hi Aaron. Unfortunately, I don't feel I can make a great case for why these extensions should be in Clang. Although there are users of Plan 9 C extensions, I don't see these features being adopted widely enough to warrant their inclusion in Clang, which would violate the inclusion policy.

Just to check -- do you think (some of) these features are something you wish to propose to WG14 for adoption into C? e.g., are you aiming to get multiple compilers to implement Plan 9 extensions to demonstrate to WG14 that this is existing practice in C compilers?

> To this effect, I tried using libTooling to rewrite Plan 9 C to standard C that can be correctly compiled with Clang, but because the AST creation requires semantic analysis to run it leaves the AST in a state of disrepair (it can parse Plan 9 C, but the analyzer gets confused with duplicate fields and so on).
>
> I'll have to decide if I am going to keep these changes in a Clang fork or modify another C compiler for LLVM. Regardless, I believe my diffs for adding the Plan 9 calling convention to LLVM still apply (they are simple), so I will send them upstream when I feel they are ready.

SGTM

> ---
>
> I think it also makes sense to address your questions here for the sake of completeness.

Thank you, I appreciate the education. :-)

>> I'm wondering if you could go into a bit more detail about what Automatic embedded structure type decay and Embedded structure type name accesses mean in practice (some code examples would be helpful).
>
> Absolutely. "Automatic embedded structure type decay" and "Embedded structure type name accesses" are features best described by example:
>
>   typedef struct Lock Lock;
>   typedef struct Rc Rc;
>   typedef struct Resource Resource;
>   
>   struct Lock
>   {
>     int hold;
>   };
>   
>   struct Rc
>   {
>     int references;
>   };
>   
>   struct Resource
>   {
>     Rc;
>     Lock;
>     void *buffer;
>     size_t size;
>   };
>
> Now with "Embedded structure type name accesses" enabled, if we have a value like `Resource *r`, we can do `r->Lock`. This simply returns the field as if `Lock;` was declared as `Lock Lock;`, but this special declaration also brings all names into scope (like an anonymous struct) so we can do `r->hold`. This also does NOT work if you declare the field as `struct Lock;`, it must be a typedef name.

What an interesting extension! What happens with something like this?

  typedef struct Lock FirstLock;
  typedef struct Lock SecondLock;
  typedef struct Rc Rc;
  typedef struct Resource Resource;
  
  struct Lock
  {
    int hold;
  };
   
  struct Rc
  {
    int references;
  };
  
  struct Resource
  {
    Rc;
    FirstLock;
    SecondLock;
    void *buffer;
    size_t size;
  };

Does this work for accessing `r->FirstLock` but give an ambiguous lookup for `r->hold`? Or do you only allow one member of the underlying canonical type?

Also, why does it require a typedef name?
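
To make the quoted description concrete, here is a rough approximation in standard C11, not the extension itself. Each embedded member is modeled as an anonymous union that overlays a named member (`Lock Lock;`) with an anonymous struct repeating its fields, so both `r->Lock` and `r->hold` name the same storage. The helper `hold_via_both_names` is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Lock Lock;
typedef struct Rc Rc;
typedef struct Resource Resource;

struct Lock { int hold; };
struct Rc { int references; };

/* Approximation only: each embedded struct becomes an anonymous union that
   overlays a named member with an anonymous struct repeating the same
   fields, so both r->Lock and r->hold are valid member accesses. */
struct Resource {
    union { Rc Rc; struct { int references; }; };
    union { Lock Lock; struct { int hold; }; };
    void *buffer;
    size_t size;
};

/* Hypothetical helper: writes through the promoted field name and reads
   the value back through the member named after the typedef. */
int hold_via_both_names(int value)
{
    Resource r = {0};
    r.hold = value;                    /* promoted field */
    assert(&r.hold == &r.Lock.hold);   /* both names share one object */
    return r.Lock.hold;                /* access via the type name */
}
```

With the extension as described, the plain `Lock;` declaration would provide both spellings without repeating the fields by hand.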

> Further, with "Automatic embedded structure type decay", structure pointers are automatically converted into an access of an embedded compatible structure. So if we have a function like `void lock(Lock *);`, we can call it with `lock(r);` and the compiler will automatically search all unnamed structures in `Resource` recursively until it finds a matching type. Note that `Lock;` is declared after `Rc;`; this is intentional. In standard C it is possible to have a pointer to a struct decay to a pointer to the first field in the struct. That is completely separate from this extension.

Ah, interesting. So this is another case where multiple members of the same type would be a problem. Does this only find structure/union members, or does this also work for other members? e.g. `void size(size_t *)` being called with `size(r)`? And if it works for other members... what does it do for bit-field members which share an allocation unit?
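
A minimal sketch of the conversion as described in the quoted paragraph, written in standard C so it builds without the extension. The member names `rc` and `lk` and the functions `lock_resource` and `demo_decay` are hypothetical stand-ins. The assumption is that `lock(r)` behaves like passing the address of the embedded `Lock`, which is not at offset zero here, so this differs from the standard first-member pointer conversion.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Lock Lock;
typedef struct Rc Rc;
typedef struct Resource Resource;

struct Lock { int hold; };
struct Rc { int references; };

/* Standard C stand-in: the embedded members get explicit (hypothetical)
   names because plain C has no unnamed typedef-name members. */
struct Resource {
    Rc rc;
    Lock lk;      /* deliberately not the first member */
    void *buffer;
    size_t size;
};

void lock(Lock *l) { l->hold = 1; }

/* What lock(r) is assumed to amount to under the extension:
   the address of the embedded Lock is passed, not a cast of r. */
void lock_resource(Resource *r)
{
    lock(&r->lk);
}

/* Returns 1 if the embedded Lock was updated, the Rc in front of it was
   left untouched, and the Lock really lives at a nonzero offset. */
int demo_decay(void)
{
    Resource r = {0};
    lock_resource(&r);
    return r.lk.hold == 1
        && r.rc.references == 0
        && offsetof(Resource, lk) != 0;
}
```

A plain `(Lock *)r` cast would instead alias the `Rc` at the start of the object, which is why the ordering of `Rc;` and `Lock;` in the quoted example matters.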

> If that was unclear, GCC also supports this functionality and it is documented here for a different explanation: https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html
>
>> Are you planning to add a new driver for Clang to emulate the Plan 9 compiler driver (similar to how we have a driver for MSVC compatibility)?
>
> For now, no.
>
> Adding the Plan 9 object format to LLD is out-of-scope for this project (this was discussed previously <https://discourse.llvm.org/t/plan-9-a-out-executables-with-lld/61438>), so I don't think it's necessary to add a new driver; we can just use the generic ELF driver.
>
> Similarly, adding the Plan 9 assembler syntax is not necessary either: most programs are C, and the assembler can be trivially converted, since the idea is that programs will be compiled with the Plan 9 calling convention and C ABI.
>
>> Are there other extensions planned, or is the list in the summary pretty complete?
>
> No, the listing above is complete.
>
>> Do I understand correctly that plan 9 compatibility mode forces C89 as the language standard mode, or is it expected that users can do things like -std=c2x -fplan9-extensions?
>
> Plan 9 C extensions are not mutually exclusive with C2x, so I think that you should be allowed to write C2x Plan 9 C. If we did have a Plan 9 driver though, it would set `-fplan9-extensions -std=c89` to be as close as possible to the Plan 9 compiler's functionality.
>
> Cheers

Thanks for the extra details!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127462/new/

https://reviews.llvm.org/D127462


