OpenMP runtime calls inside a loop body prevent automatic vectorization when the compiler cannot move the calls outside the loop, for example when the calls are not loop-invariant. To fix:
- Split the OpenMP parallel loop directive into two directives:

  Target    Directive
  Outer     !$OMP PARALLEL [clause[[,] clause] ... ]
  Inner     !$OMP DO [clause[[,] clause] ... ]

- Move the OpenMP calls outside the loop when possible.

Original code:

!$OMP PARALLEL DO PRIVATE(tid, nthreads)
do k = 1, N
   tid = omp_get_thread_num()        ! this call inside the loop prevents vectorization
   nthreads = omp_get_num_threads()  ! this call inside the loop prevents vectorization
   ...
enddo

Revised code:

!$OMP PARALLEL PRIVATE(tid, nthreads)
! Move OpenMP calls here
tid = omp_get_thread_num()
nthreads = omp_get_num_threads()
!$OMP DO
do k = 1, N
   ...
enddo
! In Fortran, NOWAIT goes on the END DO directive (required before OpenMP 5.2)
!$OMP END DO NOWAIT
!$OMP END PARALLEL
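
The revised pattern can be sketched as a complete program. This is a minimal illustration, not a definitive implementation: it assumes a compiler with OpenMP support (for example, gfortran with -fopenmp), and the program name, the arrays a, b, and owner, the loop length, and the loop body are hypothetical, chosen only to give the loop some work. NOWAIT is placed on the END DO directive, which is the placement Fortran requires in OpenMP versions before 5.2. Whether the loop is actually vectorized still depends on the compiler and its options.

```fortran
program omp_hoist_demo
  use omp_lib                        ! omp_get_thread_num, omp_get_num_threads, omp_get_max_threads
  implicit none
  integer, parameter :: n = 1000     ! hypothetical loop length
  integer :: k, tid, nthreads
  integer :: owner(n)                ! hypothetical: records which thread handled each iteration
  real    :: a(n), b(n)              ! hypothetical work arrays

  b = 1.0
  owner = -1

  !$omp parallel private(tid, nthreads)
  ! OpenMP runtime calls are made once per thread, outside the loop body
  tid = omp_get_thread_num()
  nthreads = omp_get_num_threads()

  !$omp do
  do k = 1, n
     ! The loop body contains no OpenMP runtime calls
     a(k) = 2.0 * b(k)
     owner(k) = tid
  end do
  !$omp end do nowait

  if (tid == 0) print *, 'team size:', nthreads
  !$omp end parallel

  ! Self-check: every element was updated by a valid thread
  if (any(a /= 2.0)) error stop 'unexpected value in a'
  if (minval(owner) < 0 .or. maxval(owner) >= omp_get_max_threads()) &
     error stop 'unexpected thread id in owner'
  print *, 'all', n, 'elements updated'
end program omp_hoist_demo
```

Because tid and nthreads are PRIVATE and assigned before the worksharing loop, each thread reads its own copy inside the loop without calling the OpenMP runtime on every iteration.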