Force scalar remainder generation

The compiler generated a masked vectorized remainder loop that has too few iterations to benefit from vector processing. The remainder loop handles the iterations left over when the trip count is not a multiple of the vector length, so a scalar loop may perform better here. To fix: Force scalar remainder generation with the directive #pragma vector novecremainder.

Example

void add_floats(float *a, float *b, float *c, float *d, float *e, int n)
{
    int i;
    // Force the compiler to not vectorize the remainder loop
    #pragma vector novecremainder
    for (i=0; i<n; i++)
    {
        a[i] = a[i] + b[i] + c[i] + d[i] + e[i];
    }
}
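
To make the term "remainder" concrete, the following is a minimal hand-written sketch of how a loop like the one above may be split into a full-width body and a scalar remainder. It is an illustration only, not the code the compiler actually emits: the vector length VL of 8 floats and the names remainder_iterations and add_floats_split are assumptions introduced for this sketch, and the inner lane loop merely stands in for a single vector instruction.

```c
#include <assert.h>

/* Assumed vector length for illustration: 8 floats (e.g. a 256-bit register). */
#define VL 8

/* Hypothetical helper: iterations left over after the full-width body. */
static int remainder_iterations(int n, int vl)
{
    assert(vl > 0);
    return n > 0 ? n % vl : 0;
}

/* Hypothetical hand-split version of add_floats: vector-style body + scalar remainder. */
static void add_floats_split(float *a, const float *b, const float *c,
                             const float *d, const float *e, int n)
{
    int i = 0;
    int main_end = n > 0 ? n - remainder_iterations(n, VL) : 0;

    /* Main body: each outer step stands in for one full-width vector operation. */
    for (; i < main_end; i += VL) {
        for (int lane = 0; lane < VL; lane++) {
            a[i + lane] = a[i + lane] + b[i + lane] + c[i + lane]
                        + d[i + lane] + e[i + lane];
        }
    }

    /* Scalar remainder: at most VL - 1 iterations, processed one at a time. */
    for (; i < n; i++) {
        a[i] = a[i] + b[i] + c[i] + d[i] + e[i];
    }
}
```

With n = 11 and the assumed VL of 8, the main body covers elements 0 through 7 and the remainder covers the last 3. A masked vector remainder would spend a full-width operation on those 3 elements, which is the case where a scalar remainder may be the cheaper choice.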
