Consider outer loop vectorization

Without an explicit directive, the compiler typically targets only innermost loops, so it vectorized the inner loop but did not vectorize the outer loop. However, outer loop vectorization could be more profitable because of a better memory access pattern, higher trip counts, or a better dependencies profile.
To enforce outer loop vectorization:

Target        Directive
Outer loop    #pragma omp simd
Inner loop    #pragma novector

This issue only flags an opportunity to vectorize the outer loop. To prove profitability, you need to perform a deeper analysis (Memory Access Patterns, Trip Counts, Dependencies).

Example

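/* sum is accumulated across outer iterations, so declare it as a reduction */
#pragma omp simd reduction(+:sum)
for(i=0; i<N; i++)
{
    /* keep the inner loop scalar so the outer loop is the one vectorized */
    #pragma novector
    for(j=0; j<N; j++)
    {
        sum += A[i]*A[j];
    }
}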
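
As a rough illustration of the memory access pattern argument, the sketch below computes column sums of a small matrix. The function name column_sums, the size N = 8, and the matrix layout are assumptions made for this example and are not part of the original recommendation. Across outer iterations, consecutive values of i read adjacent elements m[j][i], while the inner loop walks j with a stride of N elements, so vectorizing the outer loop may give unit-stride accesses. The novector pragma is guarded because it is specific to Intel compilers; other compilers would need their own equivalent.

```c
#define N 8  /* hypothetical problem size, chosen for illustration */

/* Column sums: col_sum[i] = m[0][i] + m[1][i] + ... + m[N-1][i]. */
void column_sums(float m[N][N], float col_sum[N])
{
    /* Request vectorization of the outer loop over columns. */
#pragma omp simd
    for (int i = 0; i < N; i++) {
        /* Declared inside the loop body, so each SIMD lane has its own copy. */
        float acc = 0.0f;

        /* Intel-specific hint: leave the inner loop scalar. */
#if defined(__INTEL_COMPILER) || defined(__INTEL_LLVM_COMPILER)
#pragma novector
#endif
        for (int j = 0; j < N; j++) {
            acc += m[j][i];  /* stride N along j, unit stride along i */
        }
        col_sum[i] = acc;
    }
}
```

Whether this is actually faster depends on the target and the data sizes, so it is worth confirming with the Memory Access Patterns and Trip Counts analyses before keeping the directives. Note that OpenMP SIMD directives usually take effect only when the corresponding compiler option is enabled (for example -qopenmp-simd with Intel compilers or -fopenmp-simd with GCC and Clang).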
