Addr of buf1 = 0x7f0225bb7010
Offs of buf1 = 0x7f0225bb7180
Addr of buf2 = 0x7f0223bb6010
Offs of buf2 = 0x7f0223bb61c0
Addr of buf3 = 0x7f0221bb5010
Offs of buf3 = 0x7f0221bb5100
Addr of buf4 = 0x7f021fbb4010
Offs of buf4 = 0x7f021fbb4140
Threads #: 16 Pthreads
Matrix size: 2048
Using multiply kernel: multiply1
Execution time = 7.090 seconds
