The compiler did not vectorize the loop as the code exceeds the compilers complexity criteria. You might get higher performance if you enforce the loop vectorization. Use a directive right before your loop block in the source code.
| IFORT Directive |
|---|
| !$OMP SIMD |
Given issue is only about opportunity to vectorize outer loop, to prove profitability you need perform deeper dive analysis (MAP, Trip Counts, Dependencies)