The compiler automatically peeled iterations from the vector loop into a scalar loop so that the vector loop starts aligned with a particular memory reference; however, this optimization is not always beneficial. Disabling automatic peel generation may improve performance. To do so, place the following directive immediately before the loop: #pragma vector nodynamic_align
...
#pragma vector nodynamic_align
for (int i = 0; i < len; i++)
...

void f(float * a, float * b, float * c, int len)
{
    #pragma vector nodynamic_align
    for (int i = 0; i < len; i++)
    {
        a[i] = b[i] * c[i];
    }
}
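
For intuition about what the directive turns off, the following hand-written sketch mimics a peeled loop in plain C. It is only an illustration, not the code the compiler actually generates: the helper name peel_count, the function name f_peeled, and the 32-byte alignment target are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of leading scalar iterations needed before
   a + i reaches an `align`-byte boundary, capped at len.
   Assumes `align` is a power of two and `a` is aligned to sizeof(float). */
static size_t peel_count(const float *a, size_t align, size_t len)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    uintptr_t misalign = (uintptr_t)a & (align - 1);
    size_t peel = misalign ? (align - misalign) / sizeof(float) : 0;
    return peel < len ? peel : len;
}

/* Hand-written equivalent of a peeled loop.
   The 32-byte target is an assumed vector width, not a documented value. */
void f_peeled(float *a, const float *b, const float *c, int len)
{
    size_t n = len > 0 ? (size_t)len : 0;
    size_t peel = peel_count(a, 32, n);
    size_t i = 0;

    /* Peel loop: scalar iterations until a + i is 32-byte aligned. */
    for (; i < peel; i++)
        a[i] = b[i] * c[i];

    /* Main loop: the part a compiler could vectorize with aligned stores to a. */
    for (; i < n; i++)
        a[i] = b[i] * c[i];
}
```

The peel loop costs a run-time alignment check plus up to seven scalar iterations in this sketch. For short trip counts that overhead can outweigh the gain from aligned accesses, which is the situation where disabling peel generation may help.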