Collect Roofline for all memory levels

Run the Roofline analysis for all memory levels to get a detailed view of memory-bound loops and functions.
The Memory-Level Roofline uses cache simulation data to evaluate the traffic between memory levels (for example, L1, L2, L3, and DRAM), so you can see which level limits each loop or function.
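
To make the idea concrete, here is a minimal sketch of the arithmetic behind a per-level roofline. It is not the tool's implementation: the names LevelTraffic and roofline_per_level, the traffic figures, the bandwidths, and the peak compute value are all hypothetical and chosen only for illustration. For each memory level, arithmetic intensity is the operation count divided by the bytes moved at that level, and the attainable performance is capped by the smaller of the compute peak and intensity times bandwidth.

```python
from dataclasses import dataclass


@dataclass
class LevelTraffic:
    """Traffic observed at one memory level (hypothetical data)."""
    name: str
    bytes_moved: float      # bytes transferred at this level
    bandwidth_gbs: float    # assumed peak bandwidth of this level, GB/s


def roofline_per_level(flop, peak_gflops, levels):
    """Return {level name: (arithmetic intensity, attainable GFLOP/s)}."""
    results = {}
    for lvl in levels:
        # Arithmetic intensity at this level: FLOP per byte moved.
        ai = flop / lvl.bytes_moved
        # Classic roofline bound: min(compute roof, memory roof).
        attainable = min(peak_gflops, ai * lvl.bandwidth_gbs)
        results[lvl.name] = (ai, attainable)
    return results


# Made-up numbers for a single loop: traffic shrinks at each level
# farther from the core because the caches absorb part of it.
levels = [
    LevelTraffic("L1", 8.0e9, 400.0),
    LevelTraffic("L2", 4.0e9, 200.0),
    LevelTraffic("L3", 2.0e9, 100.0),
    LevelTraffic("DRAM", 1.0e9, 20.0),
]

result = roofline_per_level(flop=2.0e9, peak_gflops=100.0, levels=levels)

# The level with the lowest attainable performance is the tightest bound.
limiting = min(result, key=lambda name: result[name][1])

for name, (ai, attainable) in result.items():
    print(f"{name}: AI = {ai:.2f} FLOP/byte, bound = {attainable:.1f} GFLOP/s")
print("Tightest bound:", limiting)
```

With these illustrative inputs, the cache levels all reach the compute roof, while DRAM caps the loop at 40 GFLOP/s. A loop that looks this way would be DRAM-bound rather than cache-bound.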