[PATCH] [OPENMP] CodeGen for "omp atomic read [seq_cst]" directive.

Mon Jan 5 13:18:04 PST 2015

Hi John, thanks for the review.

> How are you planning to implement stores for any of the non-simple l-value cases?  Compare-and-swap loops?

Yes, that's the plan. The exception is global registers: I did not find 
a compare-and-swap operation in LLVM IR for them, so I decided to use 
global locks for them.
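
For illustration, a compare-and-swap loop for a store to a bitfield could 
look roughly like the sketch below. It uses C11 atomics rather than LLVM IR, 
and the field layout (FIELD_SHIFT, FIELD_WIDTH) and both function names are 
hypothetical, chosen only for this example; it is not the code in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout: a 5-bit field at bit offset 3 of a 32-bit storage unit. */
enum { FIELD_SHIFT = 3, FIELD_WIDTH = 5 };
#define FIELD_MASK (((UINT32_C(1) << FIELD_WIDTH) - 1u) << FIELD_SHIFT)

/* Atomically replace only the field bits; neighbouring bits are preserved. */
static void atomic_store_field(_Atomic uint32_t *unit, uint32_t value)
{
    assert(value <= (FIELD_MASK >> FIELD_SHIFT)); /* value must fit the field */

    uint32_t old = atomic_load_explicit(unit, memory_order_relaxed);
    uint32_t desired;
    do {
        /* Rebuild the whole unit from the latest observed contents. */
        desired = (old & ~FIELD_MASK) | (value << FIELD_SHIFT);
        /* On failure, 'old' is refreshed with the current value and we retry. */
    } while (!atomic_compare_exchange_weak_explicit(
                 unit, &old, desired,
                 memory_order_seq_cst, memory_order_relaxed));
}

/* Atomic read of the field: load the whole unit, then extract the bits. */
static uint32_t atomic_load_field(_Atomic uint32_t *unit)
{
    uint32_t whole = atomic_load_explicit(unit, memory_order_seq_cst);
    return (whole & FIELD_MASK) >> FIELD_SHIFT;
}
```

The read needs no loop: a single atomic load of the containing unit followed 
by a shift and mask is enough. Only the store has to retry.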

> ... You need to use narrower bounds than that because you need something that's guaranteed stable: ...

Hmm, I do not understand why bitfields would cause trouble. Why would one 
compiler use a 12-byte atomic access while another one produces a 
4-byte access? I think all atomic accesses will be the same. According 
to the OpenMP spec we cannot perform an atomic operation on the whole 
bitfield structure, only on its individual bitfields (atomic ops are 
allowed only for scalar values). So I expect that all atomic operations on 
bitfields will be performed on the same bounds. Or do you mean something else?
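
To make the "same bounds" question concrete, here is a sketch of one 
possible fixed rule: access a bitfield through the smallest naturally 
aligned power-of-two unit (1, 2, 4 or 8 bytes) that fully contains it, 
regardless of which other bitfields are adjacent. This rule, the struct 
access_unit type and the stable_unit function are assumptions made up for 
this example; nothing in this thread has settled on them.

```c
#include <stddef.h>

/* Byte range used to access a bitfield; byte_size == 0 means "no unit fits". */
struct access_unit {
    size_t byte_offset;
    size_t byte_size;
};

/*
 * Pick the access unit from the field's own position only.
 * bit_offset is measured from the start of the struct; bit_width must be > 0.
 */
static struct access_unit stable_unit(size_t bit_offset, size_t bit_width)
{
    size_t first = bit_offset / 8;                   /* first byte touched */
    size_t last = (bit_offset + bit_width - 1) / 8;  /* last byte touched  */

    for (size_t size = 1; size <= 8; size *= 2) {
        size_t start = first - first % size;         /* align down to size */
        if (last < start + size)
            return (struct access_unit){ start, size };
    }
    /* The field straddles an 8-byte boundary; a lock would be needed. */
    return (struct access_unit){ 0, 0 };
}
```

Because the result depends only on the field's offset and width, a 10-bit 
field at bit offset 70 always maps to the 2-byte unit at byte offset 8, 
even when the surrounding 12 bytes of bitfields would otherwise be merged 
into a single wide access.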

> Also, both bitfields and vector elements can often be accessed more efficiently than just a libcall, depending on how much space they need.

I thought about it. I agree, but it may also significantly complicate 
the code itself. That's why I decided to use only libcalls, given that 
atomic operations on bitfields/vector elements are very rarely used, if 
at all. I have not actually seen any, but it is good to have a working 
solution for all kinds of lvalues.

> This ends up being an inadvertently confusing variable name, since it ends in "six".

Ok, I'll try to improve it after our holidays.

Best regards,
Alexey Bataev
=============
Software Engineer
Intel Compiler Team

05.01.2015 11:08, John McCall wrote:
> How are you planning to implement stores for any of the non-simple l-value cases?  Compare-and-swap loops?
>
> Bitfields are interesting because IRGen actually uses larger-than-strictly-necessary accesses: if you have a struct containing 12 bytes of adjacent bitfields, we will join them all into one large i96 access.  You need to use narrower bounds than that because you need something that's guaranteed stable: you can't have one version of the compiler trying to access the bitfield with a 12-byte atomic access and another accessing it with a 4-byte access, because the atomic runtime functions don't promise that such accesses will actually be atomic w.r.t. each other.  You'll need to invent a rule here that you're willing to stick to forever.
>
> Also, both bitfields and vector elements can often be accessed more efficiently than just a libcall, depending on how much space they need.
>
>
> ================
> Comment at: test/OpenMP/atomic_read_codegen.c:28
> @@ +27,3 @@
> +typedef int v4si __attribute__((__vector_size__(16)));
> +v4si v4six;
> +
> ----------------
> This ends up being an inadvertently confusing variable name, since it ends in "six".
>
> http://reviews.llvm.org/D6431