[PATCH] D38431: [ProfileData] Fix data racing in merging indexed profiles

Mon Oct 2 11:40:47 PDT 2017

On Mon, Oct 2, 2017 at 11:20 AM, Vedant Kumar <vsk at apple.com> wrote:

>
> On Oct 2, 2017, at 10:58 AM, Rong Xu <xur at google.com> wrote:
>
>
>
> On Fri, Sep 29, 2017 at 4:32 PM, Vedant Kumar via Phabricator <
> reviews at reviews.llvm.org> wrote:
>
>> vsk added inline comments.
>>
>>
>> ================
>> Comment at: lib/ProfileData/InstrProfReader.cpp:736
>>  Error IndexedInstrProfReader::readNextRecord(NamedInstrProfRecord
>> &Record) {
>> -  static unsigned RecordIndex = 0;
>> +  thread_local static unsigned RecordIndex = 0;
>>
>> ----------------
>> Why isn't this just a member variable?
>>
>> If we used "thread_local", wouldn't there still be a bug if you
>> instantiated and used two IndexedInstrProfReaders?
>>
>
> I think making this member variable would also work -- and this might be a
> better way.
>
>
> But why you think "thread_local" won't work? IndexedInstrProfReader::readNextRecord
> can not be multithreaded in either way. If we instantiate two
> IndexedInstrProfReaders, each should be in it own thread and accesses its
> own RecordIndex. There is not racing.
>
>
> I was concerned about the case where two IndexedInstrProfReaders are used
> by the same thread. Or two IndexedInstrProfReaders are created, but are
> used intermittently by two different threads. Admittedly we're not doing
> these things now, but because the API permits this type of usage, it needs
> to work properly.
>
> Yes. If two Readers are created by the same thread, then 'thread_local"
will break. But at the same time, if two threads are access the one Reader,
member variable solution will break too. API does not mention any of the
permitted usage as of now. Maybe we should document it.

>
>
>>
>>
>> ================
>> Comment at: tools/llvm-profdata/llvm-profdata.cpp:230
>>      // Load the inputs in parallel (N/NumThreads serial steps).
>> -    unsigned Ctx = 0;
>> -    for (const auto &Input : Inputs) {
>> -      Pool.async(loadInput, Input, Contexts[Ctx].get());
>> -      Ctx = (Ctx + 1) % NumThreads;
>> +    unsigned ChunkSize = Inputs.size() / NumThreads;
>> +    unsigned ChunkRemainder = Inputs.size() % NumThreads;
>> ----------------
>> I don't understand the problem here. Each time a WriterContext is
>> accessed, the accessing thread immediately acquires a unique lock on the
>> context. Where does the racing access occur?
>>
>
> Where the lock to WriterContext comes from? I only see the lock for the
> queue in ThreadPool. I thought it was the user's responsibility to make
> sure the share data is race free.
>
>
> Sorry, I see now that locking the WriterContext isn't relevant to this
> problem.
>
> I think I understand where you're coming from now. Your solution would get
> rid of the race, but I don't think users of IndexedInstrProfReader should
> have to take care to only use one instance per thread. If we can make
> "RecordIndex" a member variable, we wouldn't have to change how the API is
> used, so I'd prefer that solution.
>

Sorry I don't follow here. I was talking the race to WriterContext. It was
not clear to me that there is lock to WriterContext. Can you explain if
there is one?

Thanks,

-Rong

>
> vedant
>
>
>
>>
>>
>> https://reviews.llvm.org/D38431
>>
>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20171002/ec1fc55a/attachment.html>