[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info
Diego Novillo
dnovillo at google.com
Thu Jan 16 10:02:45 PST 2014
On Thu, Jan 16, 2014 at 9:26 AM, Nadav Rotem <nrotem at apple.com> wrote:
> Hi Diego,
>
> It looks like the problem is with the code in the vectorizer that tries to estimate the most profitable vectorization factor:
>
>> LV: Found an estimated cost of 6 for VF 2 For instruction: %3 = load i64* %state, align 8, !dbg !58, !tbaa !61
>
>
> It looks like a cost model problem. The vectorizer thinks that loading %3 (above) is non-consecutive and would require scatter/gather. Is that correct? I wonder what SCEV is reporting. Is there an index overflow problem that is preventing us from loading consecutive elements?
Yes, I forgot to mention that. The access is non-consecutive:
  for(i=0; i<reg->size; i++)
    {
      /* Flip the target bit of each basis state */
      reg->node[i].state ^= ((MAX_UNSIGNED) 1 << target);
    }
The code is writing to the 'state' field of each array element, so successive iterations touch addresses that are one whole struct apart. The data structures look like this:
  typedef struct quantum_matrix_struct quantum_matrix;

  struct quantum_reg_node_struct
  {
    COMPLEX_FLOAT amplitude; /* alpha_j */
    MAX_UNSIGNED state;      /* j */
  };

  typedef struct quantum_reg_node_struct quantum_reg_node;

  /* The quantum register */
  struct quantum_reg_struct
  {
    int width;    /* number of qubits in the qureg */
    int size;     /* number of non-zero vectors */
    int hashw;    /* width of the hash array */
    quantum_reg_node *node;
    int *hash;
  };
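
To make the stride concrete, the layout can be checked directly. This is only a sketch under stated assumptions: the typedefs for COMPLEX_FLOAT and MAX_UNSIGNED are my guesses (the i64 load in the vectorizer output suggests a 64-bit state), and state_stride, state_offset and state_at are hypothetical helpers, not libquantum functions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definitions; the snippet above does not show them. */
typedef float _Complex COMPLEX_FLOAT;
typedef unsigned long long MAX_UNSIGNED;

struct quantum_reg_node_struct
{
  COMPLEX_FLOAT amplitude; /* alpha_j */
  MAX_UNSIGNED state;      /* j */
};
typedef struct quantum_reg_node_struct quantum_reg_node;

/* Bytes between node[i].state and node[i+1].state. */
static size_t state_stride(void)
{
  return sizeof(quantum_reg_node);
}

/* Byte offset of 'state' inside one node. */
static size_t state_offset(void)
{
  return offsetof(quantum_reg_node, state);
}

/* Address of node[i].state computed by hand from stride and offset. */
static MAX_UNSIGNED *state_at(quantum_reg_node *node, size_t i)
{
  unsigned char *p = (unsigned char *) node
                     + i * state_stride() + state_offset();

  /* The hand-computed address must match the compiler's own. */
  assert((void *) p == (void *) &node[i].state);
  return (MAX_UNSIGNED *) p;
}
```

Because every node also carries an amplitude, state_stride() is always larger than sizeof(MAX_UNSIGNED), so a vector load of several 'state' values cannot be one contiguous load. On a typical x86-64 ABI I would expect a 16-byte stride for an 8-byte element, but that depends on the assumed typedefs.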
If you do the trick of writing to a separate array, then the loop can
be vectorized.
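
A minimal sketch of one reading of that trick, where the 'state' values live in their own contiguous array. The typedefs are assumptions as before, and flip_in_place and flip_separate are hypothetical names used only for illustration; this is not necessarily the exact change that was tried.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumed definitions; not shown in the original snippet. */
typedef float _Complex COMPLEX_FLOAT;
typedef unsigned long long MAX_UNSIGNED;

typedef struct
{
  COMPLEX_FLOAT amplitude;
  MAX_UNSIGNED state;
} quantum_reg_node;

/* Original form: each access is one whole struct away from the last. */
static void flip_in_place(quantum_reg_node *node, int size, int target)
{
  int i;

  assert(target >= 0 && (size_t) target < sizeof(MAX_UNSIGNED) * CHAR_BIT);
  for (i = 0; i < size; i++)
    node[i].state ^= ((MAX_UNSIGNED) 1 << target);
}

/* Hypothetical variant: the states are stored contiguously, so the
   loads and stores in the loop are unit-stride. */
static void flip_separate(MAX_UNSIGNED *state, int size, int target)
{
  int i;
  MAX_UNSIGNED mask;

  assert(target >= 0 && (size_t) target < sizeof(MAX_UNSIGNED) * CHAR_BIT);
  mask = (MAX_UNSIGNED) 1 << target;
  for (i = 0; i < size; i++)
    state[i] ^= mask;
}
```

Both loops compute the same values. The difference is only in the memory access pattern the vectorizer sees: the second loop reads and writes adjacent elements.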
Diego.