----------------------------------------
bottomup columns
----------------------------------------
access_pattern: Information About Stride Types Detected In The Site
address_distance_confidence: Distance Confidence
all_cache_misses: Number of memory load operations served by a memory subsystem higher than cache. Calculated for all loop instances (assuming "cold" CPU cache). Value is a result of virtual cache modeling, which might not match the exact counter reported by hardware for this particular run.
all_dirty_evictions: Number Of Evicted Cache Lines With a Modified State Introducing Upstream Memory Traffic to a Higher Memory Subsystem
all_eliminated_loaded_bytes: All Eliminated Loaded Bytes
all_eliminated_loads: All Eliminated Loads
all_eliminated_memory_operations: All Eliminated Memory Operations
all_eliminated_stored_bytes: All Eliminated Stored Bytes
all_eliminated_stores: All Eliminated Stores
all_eliminated_transferred_bytes: All Eliminated Transferred Bytes
all_expected_loaded_bytes: All Expected Loaded Bytes
all_expected_loads: All Expected Loads
all_expected_memory_operations: All Expected Memory Operations
all_expected_stored_bytes: All Expected Stored Bytes
all_expected_stores: All Expected Stores
all_expected_transferred_bytes: All Expected Transferred Bytes
all_memory_loads: Number Of Memory Load Operations In All Loop Instances
all_memory_stores: Number Of Memory Store Operations In All Loop Instances
all_rfo_cache_misses: Number of Read for Ownership Cache Lines (Cache Lines Loaded to the Cache Due to a Data Modification Request)
all_total_loaded_bytes: All Total Loaded Bytes
all_total_loads: All Total Loads
all_total_memory_operations: All Total Memory Operations
all_total_stored_bytes: All Total Stored Bytes
all_total_stores: All Total Stores
all_total_transferred_bytes: All Total Transferred Bytes
architecture: Instruction Set Architecture
average_trip_count: Loop Trip Count Average
cache_line_utilization: Simulated Cache Lines Utilization For Data Transfer Operations
cache_misses: Number of memory load operations served by a memory subsystem higher than cache. Calculated for the first instance of the loop (assuming "cold" CPU cache). Value is a result of virtual cache modeling, which might not match the exact counter reported by hardware for this particular run.
call_count: Number of Times Loop/Function Is Invoked
cfg_index_modified_in_body: Control Flow Graph Data: Index Modified In Body
cfg_jumps_outside: Control Flow Graph Data: Jumps Outside
cfg_no_index_register: Control Flow Graph Data: No Index Register
cfg_no_loop_step: Control Flow Graph Data: No Loop Step
compilation_flags: Compilation flags (per compilation unit)
compiler_estimated_gain: Theoretical Compiler Estimate of Relative Loop Performance Speed-up Achieved or Achievable Due to Vectorization
compiler_name: Compiler Name
compiler_version: Compiler Version
compute_instructions: Compute Instructions Number
constant_accesses: Number of Memory Accesses That Consistently Change By N Elements
constant_stride_percent: Percent Of Constant Strides For Selected Site
constant_strides: Constant Strides For Selected Site
data_types: Data Types Provided by Binary Static Analysis
dirty_evictions: Number Of Evicted Cache Lines With A Modified State Introducing Upstream Memory Traffic To A Higher Memory Subsystem
dynamic_abs: Dynamic abs Instructions Number
dynamic_add: Dynamic add Instructions Number
dynamic_all_instructions: Dynamic All Instructions Number
dynamic_avg: Dynamic avg Instructions Number
dynamic_avx2_scalar_compute: Dynamic avx2 Scalar Compute Instructions Number
dynamic_avx2_scalar_compute_with_memory: Dynamic avx2 Scalar Compute With Memory Instructions Number
dynamic_avx2_scalar_memory: Dynamic avx2 Scalar Memory Instructions Number
dynamic_avx2_vector_compute: Dynamic avx2 Vector Compute Instructions Number
dynamic_avx2_vector_compute_with_memory: Dynamic avx2 Vector Compute With Memory Instructions Number
dynamic_avx2_vector_memory: Dynamic avx2 Vector Memory Instructions Number
dynamic_avx512_scalar_compute: Dynamic avx512 Scalar Compute Instructions Number
dynamic_avx512_scalar_compute_with_memory: Dynamic avx512 Scalar Compute With Memory Instructions Number
dynamic_avx512_scalar_memory: Dynamic avx512 Scalar Memory Instructions Number
dynamic_avx512_vector_compute: Dynamic avx512 Vector Compute Instructions Number
dynamic_avx512_vector_compute_with_memory: Dynamic avx512 Vector Compute With Memory Instructions Number
dynamic_avx512_vector_memory: Dynamic avx512 Vector Memory Instructions Number
dynamic_avx512phi_vector_compute: Dynamic avx512phi Vector Compute Instructions Number
dynamic_avx512phi_vector_compute_with_memory: Dynamic avx512phi Vector Compute With Memory Instructions Number
dynamic_avx512phi_vector_memory: Dynamic avx512phi Vector Memory Instructions Number
dynamic_avx_scalar_compute: Dynamic avx Scalar Compute Instructions Number
dynamic_avx_scalar_compute_with_memory: Dynamic avx Scalar Compute With Memory Instructions Number
dynamic_avx_scalar_memory: Dynamic avx Scalar Memory Instructions Number
dynamic_avx_vector_compute: Dynamic avx Vector Compute Instructions Number
dynamic_avx_vector_compute_with_memory: Dynamic avx Vector Compute With Memory Instructions Number
dynamic_avx_vector_memory: Dynamic avx Vector Memory Instructions Number
dynamic_call: Dynamic call_column_descr
dynamic_compute: Dynamic Compute Instructions Number
dynamic_compute_with_memory: Dynamic Compute With Memory Instructions Number
dynamic_div: Dynamic div Instructions Number
dynamic_dp_compute: Dynamic dp_compute_column_descr
dynamic_dp_compute_with_memory: Dynamic dp_compute_with_memory_column_descr
dynamic_fma: Dynamic fma Instructions Number
dynamic_fma_scalar_compute: Dynamic fma Scalar Compute Instructions Number
dynamic_fma_scalar_compute_with_memory: Dynamic fma Scalar Compute With Memory Instructions Number
dynamic_fma_vector_compute: Dynamic fma Vector Compute Instructions Number
dynamic_fma_vector_compute_with_memory: Dynamic fma Vector Compute With Memory Instructions Number
dynamic_func_inst: Dynamic Function Instance Id
dynamic_int_compute: Dynamic int_compute_column_descr
dynamic_int_compute_with_memory: Dynamic int_compute_with_memory_column_descr
dynamic_loaded_bytes: Dynamic Loaded Bytes Number
dynamic_loads: Dynamic loads Instructions Number
dynamic_max: Dynamic max Instructions Number
dynamic_memory: Dynamic Memory Instructions Number
dynamic_min: Dynamic min Instructions Number
dynamic_mul: Dynamic mul Instructions Number
dynamic_reccp: Dynamic reccp Instructions Number
dynamic_sad: Dynamic sad Instructions Number
dynamic_scale: Dynamic scale Instructions Number
dynamic_sign: Dynamic sign Instructions Number
dynamic_sp_compute: Dynamic sp_compute_column_descr
dynamic_sp_compute_with_memory: Dynamic sp_compute_with_memory_column_descr
dynamic_sqrt: Dynamic sqrt Instructions Number
dynamic_sse_scalar_compute: Dynamic sse Scalar Compute Instructions Number
dynamic_sse_scalar_compute_with_memory: Dynamic sse Scalar Compute With Memory Instructions Number
dynamic_sse_scalar_memory: Dynamic sse Scalar Memory Instructions Number
dynamic_sse_vector_compute: Dynamic sse Vector Compute Instructions Number
dynamic_sse_vector_compute_with_memory: Dynamic sse Vector Compute With Memory Instructions Number
dynamic_sse_vector_memory: Dynamic sse Vector Memory Instructions Number
dynamic_stored_bytes: Dynamic Stored Bytes Number
dynamic_stores: Dynamic stores Instructions Number
dynamic_sub: Dynamic sub Instructions Number
dynamic_vector_compute: Dynamic Vector Compute Instructions Number
dynamic_vector_compute_with_memory: Dynamic Vector Compute With Memory Instructions Number
dynamic_vector_memory: Dynamic Vector Memory Instructions Number
dynamic_vpconflict: Dynamic vpconflict_column_descr
dynamic_vplzcnt: Dynamic vplzcnt_column_descr
dynamic_x86_scalar_compute: Dynamic x86 Scalar Compute Instructions Number
dynamic_x86_scalar_compute_with_memory: Dynamic x86 Scalar Compute With Memory Instructions Number
dynamic_x86_scalar_memory: Dynamic x86 Scalar Memory Instructions Number
dynamic_x87_scalar_compute: Dynamic x87 Scalar Compute Instructions Number
dynamic_x87_scalar_compute_with_memory: Dynamic x87 Scalar Compute With Memory Instructions Number
dynamic_x87_scalar_memory: Dynamic x87 Scalar Memory Instructions Number
efficiency: Estimate of Loop Vectorization Efficiency, equals (Estimated Gain / Vector Length) * 100.0
efficiency_confidence: Efficiency Confidence
eliminated_loaded_bytes: Eliminated Loaded Bytes
eliminated_loads: Eliminated Loads
eliminated_memory_operations: Eliminated Memory Operations
eliminated_stored_bytes: Eliminated Stored Bytes
eliminated_stores: Eliminated Stores
eliminated_transferred_bytes: Eliminated Transferred Bytes
expected_loaded_bytes: Expected Loaded Bytes
expected_loads: Expected Loads
expected_memory_operations: Expected Memory Operations
expected_stored_bytes: Expected Stored Bytes
expected_stores: Expected Stores
expected_transferred_bytes: Expected Transferred Bytes
first_child: First Child
first_instance_site_footprint: Memory Footprint For The First Instance Of The Loop (Calculated In Bytes)
fit_into_cache: Dataset Fits Into Cache in Given Loop
footprint_estimate: Memory Footprint Estimates For This Loop
fp_mask_utilization: Ratio of Utilized Vector Elements (dynamically collected using actual mask values during execution) to Total Vector Elements for Floating Point instructions.
Ranges from 0% (all elements are suppressed by mask) to 100% (all elements are utilized).
Elements of different size (double and single precision) are counted equally.
function: Function Name
function_call_sites_and_loops: Top-Down Call Tree Of Target Functions And Loops
function_instance_type: Function Instance Type
gain_estimate: Calculated Advisor Estimate of Relative Loop Performance Speed-up Achieved Due to Vectorization
gather_accesses: Number Of Accesses Detected for Gather Instructions on AVX2 Instruction Set
hlo_unroll_type: Unroll Type Applied By Compiler
instruction_sets: Instruction Set Architecture Usage for Individual Instructions
int_mask_utilization: Ratio of Utilized Vector Elements (dynamically collected using actual mask values during execution) to Total Vector Elements for Integer instructions.
Ranges from 0% (all elements are suppressed by mask) to 100% (all elements are utilized).
Elements of different size are counted equally.
is_compiler_vector_length: Nonzero if vector length is provided by compiler, zero if identified by static analysis (less reliable)
is_system_module: Is System Module
is_vectorized: Is Vectorized
issue_ids: Issue Ids
iteration_duration: Average Loop Iteration Time
key_column: key_column_descr
library_name: General name of library where code resides (empty for user code or system libraries)
line: Line Number in Source File
load_constant_accesses: Number of Load Memory Accesses That Consistently Change By N Elements
load_random_accesses: Number of Load Memory Accesses That Change by an Unpredictable Number of Elements
load_uniform_accesses: Number of Load Accesses Into The Same Memory
load_unit_accesses: Number of Load Memory Accesses That Consistently Change By One Element
loop_annotation: Use this checkbox to select loops for Trip Counts and FLOP, Memory Access Patterns and Dependencies analyses
loop_carried_dependencies: Loop-Carried Dependencies: RAW - Read After Write (Flow Dependency), WAR - Write After Read (Anti Dependency), WAW - Write After Write (Output Dependency)
loop_function_id: Loop/Function instance index
loop_height: Loops with height equal to zero are innermost. As loop height decreases, the call tree nesting level increases
loop_instance_total_time: Average Loop Instance Total Time
loop_name: Loop Name
loop_overrides: Loop Overrides
main_vectorization_type: Main Vectorization Type
mangled_name: Mangled Function or Loop Name
map_executed_first_instance_iterations: First Site Instance Iterations Counted by MAP Collector
map_executed_instances: Site Instances Counted by MAP Collector
map_executed_max_iterations: Maximum Instance Site Iterations Counted by MAP Collector
map_executed_min_iterations: Minimum Instance Site Iterations Counted by MAP Collector
map_executed_total_iterations: Total Site Iterations Counted by MAP Collector
mask_utilization: Ratio of Utilized Vector Elements (dynamically collected using actual mask values during execution) to Total Vector Elements for FP and INT instructions.
Ranges from 0% (all elements are suppressed by mask) to 100% (all elements are utilized).
Elements of different size are counted equally.
max_depth: Max Depth Of Loop/Function in Call-Stack
max_trip_count: Loop Trip Count Maximum
maximum_per_instruction_address_range: Maximum Distance (Among All Instances Of The Loop) Between Min And Max Memory Address Values That Were Accessed By The Loop (Calculated In Bytes)
memory_instructions: Memory Instructions Number
memory_loads: Number Of Memory Load Operations In The First Instance Of The Loop
memory_stores: Number Of Memory Store Operations In The First Instance Of The Loop
metadata_types: Metadata Types (bitmask)
min_trip_count: Loop Trip Count Minimum
module: Name Of Program Module Containing Loop
multi_pumping_factor: Loop Multi-Pumping Factor Applied by Compiler to Extend Vector Length
multiversion: Multiversioning Applied By Compiler
multiversion_type: Multiversion Type
nesting_level: Nesting Level
no_site_start_dependency: No Site Start Dependency
non_unit_stride_percent: Percent Of Non-Unit Strides For Selected Site
non_unit_strides: Non Unit Strides For Selected Site
number_of_vector_registers: Number of Vector Registers Used In Loop
offload_device_to_host: Data transferred from device to host in bytes
offload_host_to_device: Data transferred from host to device in bytes
offload_private: Data private to device in bytes
offload_shared: Data shared between host and device in bytes
optimization_details: Compiler Optimization Details
parent_id: ID of Parent Of Loop/Function in Call-Stack
performance_issues: Recommendations Related To Deeper Analysis
potential_reduction_column: potential_reduction_column_descr
potential_war_dependency: Potential WAR (Write After Read) Dependency
potential_waw_dependency: Potential WAW (Write After Write) Dependency
random_accesses: Number of Memory Accesses That Change by an Unpredictable Number of Elements
raw_dependencies: RAW (Read After Write) Dependencies
read_constant_stride_percent: Percent Of Constant Strides Among Memory Read Strides For Selected Site
read_constant_strides: Memory Read Constant Strides For Selected Site
read_non_unit_stride_percent: Percent Of Non-Unit Strides Among Memory Read Strides For Selected Site
read_non_unit_strides: Memory Read Non-Unit Strides For Selected Site
read_strides_distribution: Unit/Constant/Variable Stride Ratio Among Memory Read Strides For Selected Site
read_total_strides: Total Read Strides For Selected Site
read_unit_stride_percent: Percent Of Unit Strides Among Memory Read Strides For Selected Site
read_unit_strides: Memory Read Unit Strides For Selected Site
rfo_cache_misses: Read For Ownership - Number Of Cache Lines Loaded To The Cache Due To A Data Modification Request
scatter_accesses: Number Of Accesses Detected for Scatter Instructions on AVX2 Instruction Set
self_abs: Self abs Instructions Number
self_add: Self add Instructions Number
self_ai: Self AI - Self Arithmetic Intensity - Ratio Of Self Floating-Point Operations To Self L1 Transferred Bytes
self_all_instructions: Self All Instructions Number
self_avg: Self avg Instructions Number
self_avx2_scalar_compute: Self avx2 Scalar Compute Instructions Number
self_avx2_scalar_compute_with_memory: Self avx2 Scalar Compute With Memory Instructions Number
self_avx2_scalar_memory: Self avx2 Scalar Memory Instructions Number
self_avx2_vector_compute: Self avx2 Vector Compute Instructions Number
self_avx2_vector_compute_with_memory: Self avx2 Vector Compute With Memory Instructions Number
self_avx2_vector_memory: Self avx2 Vector Memory Instructions Number
self_avx512_scalar_compute: Self avx512 Scalar Compute Instructions Number
self_avx512_scalar_compute_with_memory: Self avx512 Scalar Compute With Memory Instructions Number
self_avx512_scalar_memory: Self avx512 Scalar Memory Instructions Number
self_avx512_vector_compute: Self avx512 Vector Compute Instructions Number
self_avx512_vector_compute_with_memory: Self avx512 Vector Compute With Memory Instructions Number
self_avx512_vector_memory: Self avx512 Vector Memory Instructions Number
self_avx512phi_vector_compute: Self avx512phi Vector Compute Instructions Number
self_avx512phi_vector_compute_with_memory: Self avx512phi Vector Compute With Memory Instructions Number
self_avx512phi_vector_memory: Self avx512phi Vector Memory Instructions Number
self_avx_scalar_compute: Self avx Scalar Compute Instructions Number
self_avx_scalar_compute_with_memory: Self avx Scalar Compute With Memory Instructions Number
self_avx_scalar_memory: Self avx Scalar Memory Instructions Number
self_avx_vector_compute: Self avx Vector Compute Instructions Number
self_avx_vector_compute_with_memory: Self avx Vector Compute With Memory Instructions Number
self_avx_vector_memory: Self avx Vector Memory Instructions Number
self_call: Self call_column_descr
self_compute: Self Compute Instructions Number
self_compute_with_memory: Self Compute With Memory Instructions Number
self_div: Self div Instructions Number
self_dp_compute: Self dp_compute_column_descr
self_dp_compute_with_memory: Self dp_compute_with_memory_column_descr
self_dram_gb: Data Transferred Between Last Level Cache And DRAM In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_dram_loaded_gb: Data Loaded From DRAM In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_dram_stored_gb: Data Stored To DRAM In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_elapsed_time: Elapsed Time Is The Exclusive (Self-Time-Based) Wall Time From The Beginning To The End Of Loop/Function Execution. For Single-Threaded Applications Elapsed Time Is Equal To Self-Time
self_fma: Self fma Instructions Number
self_fma_scalar_compute: Self fma Scalar Compute Instructions Number
self_fma_scalar_compute_with_memory: Self fma Scalar Compute With Memory Instructions Number
self_fma_vector_compute: Self fma Vector Compute Instructions Number
self_fma_vector_compute_with_memory: Self fma Vector Compute With Memory Instructions Number
self_func_inst: Self Function Instance Id
self_gb_s: Data transfers between CPU and memory sub-system (traffic for caches and DRAM) in Giga Bytes per Second (Self GBs / Self Elapsed Time) for loop or function excluding traffic from its callees.
self_gflop: Giga Floating-Point Operations, Not Including GFLOP For Functions Called In The Loop Or Function
self_gflops: Self GFLOPS = Self GFLOP / Self Elapsed Time
self_giga_op: Giga Integer And Floating-Point Operations (INT+FLOAT Giga OP), Not Including Giga OP For Functions Called In The Loop Or Function
self_giga_ops: Self INT+FLOAT Giga OPS = Self INT+FLOAT Giga OP / Self Elapsed Time.
self_gintop: Giga Integer Operations, Not Including GINT For Functions Called In The Loop Or Function
self_gintops: Self GINTOPS = Self GINTOP / Self Elapsed Time
self_int_ai: Self INT AI - Self INT Arithmetic Intensity - Ratio Of Self Integer Operations To Self L1 Transferred Bytes
self_int_compute: Self int_compute_column_descr
self_int_compute_with_memory: Self int_compute_with_memory_column_descr
self_l2_gb: Data Transferred Between L1 And L2 Caches In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_l2_loaded_gb: Data Loaded From L2 In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_l2_stored_gb: Data Stored To L2 In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_l3_gb: Data Transferred Between L2 and L3 Caches In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_l3_loaded_gb: Data Loaded From L3 In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_l3_stored_gb: Data Stored To L3 In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_l4_gb: Data Transferred Between L3 and L4 Caches In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_l4_loaded_gb: Data Loaded From L4 In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_l4_stored_gb: Data Stored To L4 In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_loaded_bytes: Self Loaded Bytes Number
self_loaded_gb: Data Transferred From Memory Sub-System To CPU (Total Traffic, Including L1, L2, LLC, And DRAM Traffic) In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_loads: Self loads Instructions Number
self_max: Self max Instructions Number
self_memory: Self Memory Instructions Number
self_memory_gb: Data Transfers Between CPU And Memory Sub-System (Total Traffic, Including L1, L2, LLC, And DRAM Traffic) In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_min: Self min Instructions Number
self_mul: Self mul Instructions Number
self_overall_ai: Self AI - Self Arithmetic Intensity - Ratio Of Self FP and INT Operations To Self L1 Transferred Bytes.
self_reccp: Self reccp Instructions Number
self_sad: Self sad Instructions Number
self_scale: Self scale Instructions Number
self_sign: Self sign Instructions Number
self_sp_compute: Self sp_compute_column_descr
self_sp_compute_with_memory: Self sp_compute_with_memory_column_descr
self_sqrt: Self sqrt Instructions Number
self_sse_scalar_compute: Self sse Scalar Compute Instructions Number
self_sse_scalar_compute_with_memory: Self sse Scalar Compute With Memory Instructions Number
self_sse_scalar_memory: Self sse Scalar Memory Instructions Number
self_sse_vector_compute: Self sse Vector Compute Instructions Number
self_sse_vector_compute_with_memory: Self sse Vector Compute With Memory Instructions Number
self_sse_vector_memory: Self sse Vector Memory Instructions Number
self_stored_bytes: Self Stored Bytes Number
self_stored_gb: Data Transferred From CPU To Memory Sub-System (Total Traffic, Including L1, L2, LLC, And DRAM Traffic) In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_stores: Self stores Instructions Number
self_sub: Self sub Instructions Number
self_time: Time Actively Executing a Function or Loop, Not Including Time for Functions Called in the Loop or Function
self_time_percent: Self Time (Percent)
self_vector_compute: Self Vector Compute Instructions Number
self_vector_compute_with_memory: Self Vector Compute With Memory Instructions Number
self_vector_memory: Self Vector Memory Instructions Number
self_vpconflict: Self vpconflict_column_descr
self_vplzcnt: Self vplzcnt_column_descr
self_x86_scalar_compute: Self x86 Scalar Compute Instructions Number
self_x86_scalar_compute_with_memory: Self x86 Scalar Compute With Memory Instructions Number
self_x86_scalar_memory: Self x86 Scalar Memory Instructions Number
self_x87_scalar_compute: Self x87 Scalar Compute Instructions Number
self_x87_scalar_compute_with_memory: Self x87 Scalar Compute With Memory Instructions Number
self_x87_scalar_memory: Self x87 Scalar Memory Instructions Number
simulated_load_memory_footprint: Simulated Load Memory Footprint For The First Instance Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
simulated_memory_footprint: Simulated Memory Footprint For The First Instance Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
simulated_store_memory_footprint: Simulated Store Memory Footprint For The First Instance Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
site_id: Site Id
site_location: Information About Parent Function, Source File, And Line Where Site (Or Loop) Begins
site_name: Site Name If Using Source Annotations/Sequence ID If Marking Loops For Deeper Analysis In Survey Report
site_source_file: Site Source File Full Path
site_source_line: Site Source Line
source_full_path: Source Full Path
source_location: Source Location (File and Line)
source_size: source_size_descr
static_abs: Static abs Instructions Number
static_add: Static add Instructions Number
static_all_instructions: Static All Instructions Number
static_avg: Static avg Instructions Number
static_avx2_scalar_compute: Static avx2 Scalar Compute Instructions Number
static_avx2_scalar_compute_with_memory: Static avx2 Scalar Compute With Memory Instructions Number
static_avx2_scalar_memory: Static avx2 Scalar Memory Instructions Number
static_avx2_vector_compute: Static avx2 Vector Compute Instructions Number
static_avx2_vector_compute_with_memory: Static avx2 Vector Compute With Memory Instructions Number
static_avx2_vector_memory: Static avx2 Vector Memory Instructions Number
static_avx512_scalar_compute: Static avx512 Scalar Compute Instructions Number
static_avx512_scalar_compute_with_memory: Static avx512 Scalar Compute With Memory Instructions Number
static_avx512_scalar_memory: Static avx512 Scalar Memory Instructions Number
static_avx512_vector_compute: Static avx512 Vector Compute Instructions Number
static_avx512_vector_compute_with_memory: Static avx512 Vector Compute With Memory Instructions Number
static_avx512_vector_memory: Static avx512 Vector Memory Instructions Number
static_avx512phi_vector_compute: Static avx512phi Vector Compute Instructions Number
static_avx512phi_vector_compute_with_memory: Static avx512phi Vector Compute With Memory Instructions Number
static_avx512phi_vector_memory: Static avx512phi Vector Memory Instructions Number
static_avx_scalar_compute: Static avx Scalar Compute Instructions Number
static_avx_scalar_compute_with_memory: Static avx Scalar Compute With Memory Instructions Number
static_avx_scalar_memory: Static avx Scalar Memory Instructions Number
static_avx_vector_compute: Static avx Vector Compute Instructions Number
static_avx_vector_compute_with_memory: Static avx Vector Compute With Memory Instructions Number
static_avx_vector_memory: Static avx Vector Memory Instructions Number
static_call: Static call_column_descr
static_compute: Static Compute Instructions Number
static_compute_with_memory: Static Compute With Memory Instructions Number
static_div: Static div Instructions Number
static_dp_compute: Static dp_compute_column_descr
static_dp_compute_with_memory: Static dp_compute_with_memory_column_descr
static_fma: Static fma Instructions Number
static_fma_scalar_compute: Static fma Scalar Compute Instructions Number
static_fma_scalar_compute_with_memory: Static fma Scalar Compute With Memory Instructions Number
static_fma_vector_compute: Static fma Vector Compute Instructions Number
static_fma_vector_compute_with_memory: Static fma Vector Compute With Memory Instructions Number
static_func_inst: Static Function Instance Id
static_int_compute: Static int_compute_column_descr
static_int_compute_with_memory: Static int_compute_with_memory_column_descr
static_loaded_bytes: Static Loaded Bytes Number
static_loads: Static loads Instructions Number
static_max: Static max Instructions Number
static_memory: Static Memory Instructions Number
static_min: Static min Instructions Number
static_mul: Static mul Instructions Number
static_reccp: Static reccp Instructions Number
static_sad: Static sad Instructions Number
static_scale: Static scale Instructions Number
static_sign: Static sign Instructions Number
static_sp_compute: Static sp_compute_column_descr
static_sp_compute_with_memory: Static sp_compute_with_memory_column_descr
static_sqrt: Static sqrt Instructions Number
static_sse_scalar_compute: Static sse Scalar Compute Instructions Number
static_sse_scalar_compute_with_memory: Static sse Scalar Compute With Memory Instructions Number
static_sse_scalar_memory: Static sse Scalar Memory Instructions Number
static_sse_vector_compute: Static sse Vector Compute Instructions Number
static_sse_vector_compute_with_memory: Static sse Vector Compute With Memory Instructions Number
static_sse_vector_memory: Static sse Vector Memory Instructions Number
static_stored_bytes: Static Stored Bytes Number
static_stores: Static stores Instructions Number
static_sub: Static sub Instructions Number
static_vector_compute: Static Vector Compute Instructions Number
static_vector_compute_with_memory: Static Vector Compute With Memory Instructions Number
static_vector_memory: Static Vector Memory Instructions Number
static_vpconflict: Static vpconflict_column_descr
static_vplzcnt: Static vplzcnt_column_descr
static_x86_scalar_compute: Static x86 Scalar Compute Instructions Number
static_x86_scalar_compute_with_memory: Static x86 Scalar Compute With Memory Instructions Number
static_x86_scalar_memory: Static x86 Scalar Memory Instructions Number
static_x87_scalar_compute: Static x87 Scalar Compute Instructions Number
static_x87_scalar_compute_with_memory: Static x87 Scalar Compute With Memory Instructions Number
static_x87_scalar_memory: Static x87 Scalar Memory Instructions Number
store_constant_accesses: Number of Store Memory Accesses That Consistently Change By N Elements
store_random_accesses: Number of Store Memory Accesses That Change by an Unpredictable Number of Elements
store_uniform_accesses: Number of Store Memory Accesses Into The Same Memory
store_unit_accesses: Number of Store Memory Accesses That Consistently Change By One Element
strides_distribution: Unit/Constant/Variable Stride Ratio For Selected Site
thread: Thread Name or ID
thread_count: Number Of Executed Threads
total_abs: Total abs Instructions Number
total_add: Total add Instructions Number
total_all_instructions: Total All Instructions Number
total_arithmetic_intensity: Total AI - Total Arithmetic Intensity - Ratio Of Total Floating-Point Operations To Total L1 Transferred Bytes
total_avg: Total avg Instructions Number
total_avx2_scalar_compute: Total avx2 Scalar Compute Instructions Number
total_avx2_scalar_compute_with_memory: Total avx2 Scalar Compute With Memory Instructions Number
total_avx2_scalar_memory: Total avx2 Scalar Memory Instructions Number
total_avx2_vector_compute: Total avx2 Vector Compute Instructions Number
total_avx2_vector_compute_with_memory: Total avx2 Vector Compute With Memory Instructions Number
total_avx2_vector_memory: Total avx2 Vector Memory Instructions Number
total_avx512_scalar_compute: Total avx512 Scalar Compute Instructions Number
total_avx512_scalar_compute_with_memory: Total avx512 Scalar Compute With Memory Instructions Number
total_avx512_scalar_memory: Total avx512 Scalar Memory Instructions Number
total_avx512_vector_compute: Total avx512 Vector Compute Instructions Number
total_avx512_vector_compute_with_memory: Total avx512 Vector Compute With Memory Instructions Number
total_avx512_vector_memory: Total avx512 Vector Memory Instructions Number
total_avx512phi_vector_compute: Total avx512phi Vector Compute Instructions Number
total_avx512phi_vector_compute_with_memory: Total avx512phi Vector Compute With Memory Instructions Number
total_avx512phi_vector_memory: Total avx512phi Vector Memory Instructions Number
total_avx_scalar_compute: Total avx Scalar Compute Instructions Number
total_avx_scalar_compute_with_memory: Total avx Scalar Compute With Memory Instructions Number
total_avx_scalar_memory: Total avx Scalar Memory Instructions Number
total_avx_vector_compute: Total avx Vector Compute Instructions Number
total_avx_vector_compute_with_memory: Total avx Vector Compute With Memory Instructions Number
total_avx_vector_memory: Total avx Vector Memory Instructions Number
total_call: Total call_column_descr
total_compute: Total Compute Instructions Number
total_compute_with_memory: Total Compute With Memory Instructions Number
total_div: Total div Instructions Number
total_dp_compute: Total dp_compute_column_descr
total_dp_compute_with_memory: Total dp_compute_with_memory_column_descr
total_dram_gb: Data Transferred Between Last Level Cache And DRAM In Giga Bytes Of Function/Loop And Its Callees
total_dram_loaded_gb: Data Loaded From DRAM In Giga Bytes Of Function/Loop And Its Callees
total_dram_stored_gb: Data Stored To DRAM In Giga Bytes Of Function/Loop And Its Callees
total_elapsed_time: Total Elapsed Time Is The Inclusive (Total-Time-Based) Wall Time From The Beginning To The End Of Loop/Function Execution. For Single-Threaded Applications Total Elapsed Time Is Equal To Total-Time
total_fma: Total fma Instructions Number
total_fma_scalar_compute: Total fma Scalar Compute Instructions Number
total_fma_scalar_compute_with_memory: Total fma Scalar Compute With Memory Instructions Number
total_fma_vector_compute: Total fma Vector Compute Instructions Number
total_fma_vector_compute_with_memory: Total fma Vector Compute With Memory Instructions Number
total_func_inst: Total Function Instance Id
total_gb_s: Data transfers between CPU and memory sub-system (traffic for caches and DRAM) in Giga Bytes Per Second (Total GBs / Total Elapsed Time) for loop or function and its callees.
total_gflop: Giga Floating-Point Operations Of Function/Loop And Its Callees
total_gflops: Total GFLOPS = Total GFLOP / Total Elapsed Time
total_giga_op: Giga INT+FLOAT Operations Of Function/Loop And Its Callees
total_giga_ops: Total INT+FLOAT Giga OPS = Total INT+FLOAT Giga OP / Total Elapsed Time.
total_gintop: Giga Integer Operations Of Function/Loop And Its Callees
total_gintops: Total GINTOPS = Total GINTOP / Total Elapsed Time
total_instructions: Total Instructions Number
total_int_ai: Total INT AI - Total INT Arithmetic Intensity - Ratio Of Total Integer Operations To Total L1 Transferred Bytes
total_int_compute: Total int_compute_column_descr
total_int_compute_with_memory: Total int_compute_with_memory_column_descr
total_l2_gb: Data Transferred Between L1 And L2 Caches In Giga Bytes Of Function/Loop And Its Callees
total_l2_loaded_gb: Data Loaded From L2 In Giga Bytes Of Function/Loop And Its Callees
total_l2_stored_gb: Data Stored To L2 In Giga Bytes Of Function/Loop And Its Callees
total_l3_gb: Data Transferred Between L2 And L3 Caches In Giga Bytes Of Function/Loop And Its Callees
total_l3_loaded_gb: Data Loaded From L3 In Giga Bytes Of Function/Loop And Its Callees
total_l3_stored_gb: Data Stored To L3 In Giga Bytes Of Function/Loop And Its Callees
total_l4_gb: Data Transferred Between L3 And L4 Caches In Giga Bytes Of Function/Loop And Its Callees
total_l4_loaded_gb: Data Loaded From L4 In Giga Bytes Of Function/Loop And Its Callees
total_l4_stored_gb: Data Stored To L4 In Giga Bytes Of Function/Loop And Its Callees
total_loaded_bytes: Total Loaded Bytes
total_loaded_gb: Data Transferred From Memory Sub-System To CPU (Total Traffic, Including L1, L2, LLC, And DRAM Traffic) In Giga Bytes Of Function/Loop And Its Callees
total_loads: Total Loads
total_max: Total max Instructions Number
total_memory: Total Memory Instructions Number
total_memory_gb: Data Transfers Between CPU And Memory Sub-System (Total Traffic, Including L1, L2, LLC, And DRAM Traffic) In Giga Bytes Of Function/Loop And Its Callees
total_memory_operations: Total Memory Operations
total_min: Total min Instructions Number
total_mul: Total mul Instructions Number
total_overall_ai: Total AI - Total Arithmetic Intensity - Ratio Of Total FP and INT Operations To Total L1 Transferred Bytes.
total_reccp: Total reccp Instructions Number
total_sad: Total sad Instructions Number
total_scale: Total scale Instructions Number
total_sign: Total sign Instructions Number
total_simulated_load_memory_footprint: Simulated Load Memory Footprint For All Instances Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
total_simulated_memory_footprint: Simulated Memory Footprint For All Instances Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
total_simulated_store_memory_footprint: Simulated Store Memory Footprint For All Instances Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
total_sp_compute: Total sp_compute_column_descr
total_sp_compute_with_memory: Total sp_compute_with_memory_column_descr
total_sqrt: Total sqrt Instructions Number
total_sse_scalar_compute: Total sse Scalar Compute Instructions Number
total_sse_scalar_compute_with_memory: Total sse Scalar Compute With Memory Instructions Number
total_sse_scalar_memory: Total sse Scalar Memory Instructions Number
total_sse_vector_compute: Total sse Vector Compute Instructions Number
total_sse_vector_compute_with_memory: Total sse Vector Compute With Memory Instructions Number
total_sse_vector_memory: Total sse Vector Memory Instructions Number
total_stored_bytes: Total Stored Bytes
total_stored_gb: Data Transferred From CPU To Memory Sub-System (Total Traffic, Including L1, L2, LLC, And DRAM Traffic) In Giga Bytes Of Function/Loop And Its Callees
total_stores: Total Stores
total_strides: Total Strides
total_sub: Total sub Instructions Number
total_time: Time Actively Executing Function/Loop and Its Callees (All Threads)
total_time_percent: % of Time Actively Executing Function/Loop and its callees. Starts at 100% (top)
total_transferred_bytes: Total Transferred Bytes
total_vector_compute: Total Vector Compute Instructions Number
total_vector_compute_with_memory: Total Vector Compute With Memory Instructions Number
total_vector_memory: Total Vector Memory Instructions Number
total_vpconflict: Total vpconflict_column_descr
total_vplzcnt: Total vplzcnt_column_descr
total_x86_scalar_compute: Total x86 Scalar Compute Instructions Number
total_x86_scalar_compute_with_memory: Total x86 Scalar Compute With Memory Instructions Number
total_x86_scalar_memory: Total x86 Scalar Memory Instructions Number
total_x87_scalar_compute: Total x87 Scalar Compute Instructions Number
total_x87_scalar_compute_with_memory: Total x87 Scalar Compute With Memory Instructions Number
total_x87_scalar_memory: Total x87 Scalar Memory Instructions Number
traits: Scalar and Vectorization Characteristics That May Impact Performance
transformation_ids: Optimization Transformations Ids By Compiler
transformations: Transformations Applied By Compiler
trip_count_reliable: Flag Indicating Whether Trip Count Data Is Reported As Reliable By Collector
trip_count_total: Total Trip Count
type: Problem Type of Problem(s) in Problem Set
uniform_accesses: Number of Accesses Into The Same Memory
unique_index: Unique Index
unit_accesses: Number of Memory Accesses That Consistently Change By One Element
unit_stride_percent: Percent Of Unit Strides For Selected Site
unit_strides: Unit Strides For Selected Site
unroll_factor: Loop Unroll Factor Applied By Compiler
used_stack_instructions: Used Stack Instructions
vector_isa: Best Vector Instruction Set Architecture Type
vector_length: Loop Vector Length Estimated by Binary Static Analysis or Intel Compiler (Available Only for Intel Compiler 16.x or Higher)
vector_widths: Vector Widths In Bits
vectorization_details: Compiler Notes On Vectorization
vectorization_message_code: Vectorization Message Code By Compiler
vectorization_overhead: Vectorization Overhead Estimation By Compiler
vectorization_trip_count_type: Vectorization Trip Count Type By Compiler
war_dependencies: WAR (Write After Read) Dependencies
waw_dependencies: WAW (Write After Write) Dependencies
why_no_vectorization: Intel Compiler Reason Why Loop Was Not Vectorized
write_constant_stride_percent: Percent Of Constant Strides Among Memory Write Strides For Selected Site
write_constant_strides: Memory Write Constant Strides For Selected Site
write_non_unit_stride_percent: Percent Of Non-Unit Strides Among Memory Write Strides For Selected Site
write_non_unit_strides: Memory Write Non-Unit Strides For Selected Site
write_strides_distribution: Unit/Constant/Variable Stride Ratio Among Memory Write Strides For Selected Site
write_total_strides: Total Write Strides For Selected Site
write_unit_stride_percent: Percent Of Unit Strides Among Memory Write Strides For Selected Site
write_unit_strides: Memory Write Unit Strides For Selected Site
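Several descriptions above define a column as arithmetic over other columns (for example, Total GFLOPS = Total GFLOP / Total Elapsed Time, arithmetic intensity as operations per transferred byte, and simulated footprint as unique cache lines * cache line size). The following is a minimal sketch of that arithmetic, not the tool's implementation. The `row` dictionary, its values, the helper names, the 64-byte cache line, and the assumption that "Giga" means 1e9 are all illustrative; which byte counter corresponds to "L1 Transferred Bytes" is likewise an assumption.

```python
def rate(giga_amount, elapsed_seconds):
    """Giga-units per second; None when elapsed time is zero or missing."""
    if not elapsed_seconds:
        return None
    return giga_amount / elapsed_seconds


def arithmetic_intensity(giga_operations, transferred_bytes):
    """Operations per transferred byte (assumes 1 giga-operation = 1e9 operations)."""
    if not transferred_bytes:
        return None
    return giga_operations * 1e9 / transferred_bytes


def simulated_footprint_bytes(unique_cache_lines, cache_line_size=64):
    """Unique cache lines * cache line size; the 64-byte default is an assumption."""
    return unique_cache_lines * cache_line_size


# Hypothetical row with made-up values, keyed by column names from the list above.
row = {
    "total_gflop": 12.0,
    "total_memory_gb": 48.0,
    "total_elapsed_time": 4.0,
    "total_transferred_bytes": 48e9,  # assumed here to stand in for L1 transferred bytes
}

total_gflops = rate(row["total_gflop"], row["total_elapsed_time"])
total_gb_s = rate(row["total_memory_gb"], row["total_elapsed_time"])
total_ai = arithmetic_intensity(row["total_gflop"], row["total_transferred_bytes"])

print(total_gflops, total_gb_s, total_ai)
```

Returning None for a zero elapsed time is a design choice of this sketch; it keeps rows for loops that were never timed from raising a division error.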
----------------------------------------
topdown columns
----------------------------------------
average_trip_count: Loop Trip Count Average
call_count: Number of Times Loop/Function Is Invoked
call_order_column: call_order_column_descr
compiler_estimated_gain: Theoretical Compiler Estimate of Relative Loop Performance Speed-up Achieved or Achievable Due to Vectorization
compiler_name: Compiler Name
compiler_version: Compiler Version
data_types: Data Types Provided by Binary Static Analysis
function_call_sites_and_loops: Top-Down Call Tree Of Target Functions And Loops
function_instance_type: Function Instance Type
function_type: Function Type (0 - function, 2 - loop)
hlo_unroll_type: Unroll Type Applied By Compiler
instruction_sets: Instruction Set Architecture Usage for Individual Instructions
is_compiler_vector_length: Nonzero if vector length provided by compiler, zero if identified by static analysis (less reliable)
is_vectorized: Is Vectorized
iteration_duration: Average Loop Iteration Time
key_column: key_column_descr
library_name: General name of library where code resides (empty for user code or system libraries)
line: Line Number in Source File
loop_function_id: Loop/Function instance index
loop_height: Loops with height equal to zero are innermost. As loop height decreases, the call tree nesting level increases
main_vectorization_type: Main Vectorization Type
mangled_name: Mangled Function or Loop Name
max_depth: Max Depth Of Loop/Function in Call-Stack
max_trip_count: Loop Trip Count Maximum
metadata_types: Metadata Types (bitmask)
min_trip_count: Loop Trip Count Minimum
module: Name Of Program Module Containing Loop
multi_pumping_factor: Loop Multi-Pumping Factor Applied by Compiler to Extend Vector Length
nesting_level: Nesting Level
optimization_details: Compiler Optimization Details
self_abs: Self abs Instructions Number
self_add: Self add Instructions Number
self_ai: Self AI - Self Arithmetic Intensity - Ratio Of Self Floating-Point Operations To Self L1 Transferred Bytes
self_all_instructions: Self All Instructions Number
self_avg: Self avg Instructions Number
self_avx2_scalar_compute: Self avx2 Scalar Compute Instructions Number
self_avx2_scalar_compute_with_memory: Self avx2 Scalar Compute With Memory Instructions Number
self_avx2_scalar_memory: Self avx2 Scalar Memory Instructions Number
self_avx2_vector_compute: Self avx2 Vector Compute Instructions Number
self_avx2_vector_compute_with_memory: Self avx2 Vector Compute With Memory Instructions Number
self_avx2_vector_memory: Self avx2 Vector Memory Instructions Number
self_avx512_scalar_compute: Self avx512 Scalar Compute Instructions Number
self_avx512_scalar_compute_with_memory: Self avx512 Scalar Compute With Memory Instructions Number
self_avx512_scalar_memory: Self avx512 Scalar Memory Instructions Number
self_avx512_vector_compute: Self avx512 Vector Compute Instructions Number
self_avx512_vector_compute_with_memory: Self avx512 Vector Compute With Memory Instructions Number
self_avx512_vector_memory: Self avx512 Vector Memory Instructions Number
self_avx512phi_vector_compute: Self avx512phi Vector Compute Instructions Number
self_avx512phi_vector_compute_with_memory: Self avx512phi Vector Compute With Memory Instructions Number
self_avx512phi_vector_memory: Self avx512phi Vector Memory Instructions Number
self_avx_scalar_compute: Self avx Scalar Compute Instructions Number
self_avx_scalar_compute_with_memory: Self avx Scalar Compute With Memory Instructions Number
self_avx_scalar_memory: Self avx Scalar Memory Instructions Number
self_avx_vector_compute: Self avx Vector Compute Instructions Number
self_avx_vector_compute_with_memory: Self avx Vector Compute With Memory Instructions Number
self_avx_vector_memory: Self avx Vector Memory Instructions Number
self_call: Self call_column_descr
self_compute: Self Compute Instructions Number
self_compute_with_memory: Self Compute With Memory Instructions Number
self_div: Self div Instructions Number
self_dp_compute: Self dp_compute_column_descr
self_dp_compute_with_memory: Self dp_compute_with_memory_column_descr
self_dram_gb: Data Transferred Between Last Level Cache And DRAM In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_dram_loaded_gb: Data Loaded From DRAM In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_dram_stored_gb: Data Stored To DRAM In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_elapsed_time: Elapsed Time Is The Exclusive (Self-Time-Based) Wall Time From The Beginning To The End Of Loop/Function Execution. For Single-Threaded Applications Elapsed Time Is Equal To Self-Time
self_fma: Self fma Instructions Number
self_fma_scalar_compute: Self fma Scalar Compute Instructions Number
self_fma_scalar_compute_with_memory: Self fma Scalar Compute With Memory Instructions Number
self_fma_vector_compute: Self fma Vector Compute Instructions Number
self_fma_vector_compute_with_memory: Self fma Vector Compute With Memory Instructions Number
self_func_inst: Self Function Instance Id
self_gb_s: Data transfers between CPU and memory sub-system (traffic for caches and DRAM) in Giga Bytes per Second (Self GBs / Self Elapsed Time) for loop or function excluding traffic from its callees.
self_gflop: Giga Floating-Point Operations, Not Including GFLOP For Functions Called In The Loop Or Function
self_gflops: Self GFLOPS = Self GFLOP / Self Elapsed Time
self_giga_op: Giga INT+FLOAT Operations, Not Including Operations For Functions Called In The Loop Or Function
self_giga_ops: Self INT+FLOAT Giga OPS = Self INT+FLOAT Giga OP / Self Elapsed Time.
self_gintop: Giga Integer Operations, Not Including GINTOP For Functions Called In The Loop Or Function
self_gintops: Self GINTOPS = Self GINTOP / Self Elapsed Time
self_int_ai: Self INT AI - Self INT Arithmetic Intensity - Ratio Of Self Integer Operations To Self L1 Transferred Bytes
self_int_compute: Self int_compute_column_descr
self_int_compute_with_memory: Self int_compute_with_memory_column_descr
self_l2_gb: Data Transferred Between L1 And L2 Caches In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_l2_loaded_gb: Data Loaded From L2 In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_l2_stored_gb: Data Stored To L2 In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_l3_gb: Data Transferred Between L2 and L3 Caches In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_l3_loaded_gb: Data Loaded From L3 In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_l3_stored_gb: Data Stored To L3 In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_l4_gb: Data Transferred Between L3 and L4 Caches In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_l4_loaded_gb: Data Loaded From L4 In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_l4_stored_gb: Data Stored To L4 In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_loaded_bytes: Self Loaded Bytes Number
self_loaded_gb: Data Transferred From Memory Sub-System To CPU (Total Traffic, Including L1, L2, LLC, And DRAM Traffic) In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_loads: Self loads Instructions Number
self_max: Self max Instructions Number
self_memory: Self Memory Instructions Number
self_memory_gb: Data Transfers Between CPU And Memory Sub-System (Total Traffic, Including L1, L2, LLC, And DRAM Traffic) In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_min: Self min Instructions Number
self_mul: Self mul Instructions Number
self_overall_ai: Self AI - Self Arithmetic Intensity - Ratio Of Self FP and INT Operations To Self L1 Transferred Bytes.
self_reccp: Self reccp Instructions Number
self_sad: Self sad Instructions Number
self_scale: Self scale Instructions Number
self_sign: Self sign Instructions Number
self_sp_compute: Self sp_compute_column_descr
self_sp_compute_with_memory: Self sp_compute_with_memory_column_descr
self_sqrt: Self sqrt Instructions Number
self_sse_scalar_compute: Self sse Scalar Compute Instructions Number
self_sse_scalar_compute_with_memory: Self sse Scalar Compute With Memory Instructions Number
self_sse_scalar_memory: Self sse Scalar Memory Instructions Number
self_sse_vector_compute: Self sse Vector Compute Instructions Number
self_sse_vector_compute_with_memory: Self sse Vector Compute With Memory Instructions Number
self_sse_vector_memory: Self sse Vector Memory Instructions Number
self_stored_bytes: Self Stored Bytes Number
self_stored_gb: Data Transferred From CPU To Memory Sub-System (Total Traffic, Including L1, L2, LLC, And DRAM Traffic) In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function
self_stores: Self stores Instructions Number
self_sub: Self sub Instructions Number
self_time: Time Actively Executing a Function or Loop, Not Including Time for Functions Called in the Loop or Function
self_time_percent: % of Time Actively Executing a Function or Loop, Not Including Time for Functions Called in the Loop or Function
self_vector_compute: Self Vector Compute Instructions Number
self_vector_compute_with_memory: Self Vector Compute With Memory Instructions Number
self_vector_memory: Self Vector Memory Instructions Number
self_vpconflict: Self vpconflict_column_descr
self_vplzcnt: Self vplzcnt_column_descr
self_x86_scalar_compute: Self x86 Scalar Compute Instructions Number
self_x86_scalar_compute_with_memory: Self x86 Scalar Compute With Memory Instructions Number
self_x86_scalar_memory: Self x86 Scalar Memory Instructions Number
self_x87_scalar_compute: Self x87 Scalar Compute Instructions Number
self_x87_scalar_compute_with_memory: Self x87 Scalar Compute With Memory Instructions Number
self_x87_scalar_memory: Self x87 Scalar Memory Instructions Number
source_location: Source Location (File and Line)
source_size: source_size_descr
total_abs: Total abs Instructions Number
total_add: Total add Instructions Number
total_all_instructions: Total All Instructions Number
total_arithmetic_intensity: Total AI - Total Arithmetic Intensity - Ratio Of Total Floating-Point Operations To Total L1 Transferred Bytes
total_avg: Total avg Instructions Number
total_avx2_scalar_compute: Total avx2 Scalar Compute Instructions Number
total_avx2_scalar_compute_with_memory: Total avx2 Scalar Compute With Memory Instructions Number
total_avx2_scalar_memory: Total avx2 Scalar Memory Instructions Number
total_avx2_vector_compute: Total avx2 Vector Compute Instructions Number
total_avx2_vector_compute_with_memory: Total avx2 Vector Compute With Memory Instructions Number
total_avx2_vector_memory: Total avx2 Vector Memory Instructions Number
total_avx512_scalar_compute: Total avx512 Scalar Compute Instructions Number
total_avx512_scalar_compute_with_memory: Total avx512 Scalar Compute With Memory Instructions Number
total_avx512_scalar_memory: Total avx512 Scalar Memory Instructions Number
total_avx512_vector_compute: Total avx512 Vector Compute Instructions Number
total_avx512_vector_compute_with_memory: Total avx512 Vector Compute With Memory Instructions Number
total_avx512_vector_memory: Total avx512 Vector Memory Instructions Number
total_avx512phi_vector_compute: Total avx512phi Vector Compute Instructions Number
total_avx512phi_vector_compute_with_memory: Total avx512phi Vector Compute With Memory Instructions Number
total_avx512phi_vector_memory: Total avx512phi Vector Memory Instructions Number
total_avx_scalar_compute: Total avx Scalar Compute Instructions Number
total_avx_scalar_compute_with_memory: Total avx Scalar Compute With Memory Instructions Number
total_avx_scalar_memory: Total avx Scalar Memory Instructions Number
total_avx_vector_compute: Total avx Vector Compute Instructions Number
total_avx_vector_compute_with_memory: Total avx Vector Compute With Memory Instructions Number
total_avx_vector_memory: Total avx Vector Memory Instructions Number
total_call: Total call_column_descr
total_compute: Total Compute Instructions Number
total_compute_with_memory: Total Compute With Memory Instructions Number
total_div: Total div Instructions Number
total_dp_compute: Total dp_compute_column_descr
total_dp_compute_with_memory: Total dp_compute_with_memory_column_descr
total_dram_gb: Data Transferred Between Last Level Cache And DRAM In Giga Bytes Of Function/Loop And Its Callees
total_dram_loaded_gb: Data Loaded From DRAM In Giga Bytes Of Function/Loop And Its Callees
total_dram_stored_gb: Data Stored To DRAM In Giga Bytes Of Function/Loop And Its Callees
total_elapsed_time: Total Elapsed Time Is The Inclusive (Total-Time-Based) Wall Time From The Beginning To The End Of Loop/Function Execution. For Single-Threaded Applications Total Elapsed Time Is Equal To Total-Time
total_fma: Total fma Instructions Number
total_fma_scalar_compute: Total fma Scalar Compute Instructions Number
total_fma_scalar_compute_with_memory: Total fma Scalar Compute With Memory Instructions Number
total_fma_vector_compute: Total fma Vector Compute Instructions Number
total_fma_vector_compute_with_memory: Total fma Vector Compute With Memory Instructions Number
total_func_inst: Total Function Instance Id
total_gb_s: Data transfers between CPU and memory sub-system (traffic for caches and DRAM) in Giga Bytes Per Second (Total GBs / Total Elapsed Time) for loop or function and its callees.
total_gflop: Giga Floating-Point Operations Of Function/Loop And Its Callees
total_gflops: Total GFLOPS = Total GFLOP / Total Elapsed Time
total_giga_op: Giga INT+FLOAT Operations Of Function/Loop And Its Callees
total_giga_ops: Total INT+FLOAT Giga OPS = Total INT+FLOAT Giga OP / Total Elapsed Time.
total_gintop: Giga Integer Operations Of Function/Loop And Its Callees
total_gintops: Total GINTOPS = Total GINTOP / Total Elapsed Time
total_int_ai: Total INT AI - Total INT Arithmetic Intensity - Ratio Of Total Integer Operations To Total L1 Transferred Bytes
total_int_compute: Total int_compute_column_descr
total_int_compute_with_memory: Total int_compute_with_memory_column_descr
total_l2_gb: Data Transferred Between L1 And L2 Caches In Giga Bytes Of Function/Loop And Its Callees
total_l2_loaded_gb: Data Loaded From L2 In Giga Bytes Of Function/Loop And Its Callees
total_l2_stored_gb: Data Stored To L2 In Giga Bytes Of Function/Loop And Its Callees
total_l3_gb: Data Transferred Between L2 And L3 Caches In Giga Bytes Of Function/Loop And Its Callees
total_l3_loaded_gb: Data Loaded From L3 In Giga Bytes Of Function/Loop And Its Callees
total_l3_stored_gb: Data Stored To L3 In Giga Bytes Of Function/Loop And Its Callees
total_l4_gb: Data Transferred Between L3 And L4 Caches In Giga Bytes Of Function/Loop And Its Callees
total_l4_loaded_gb: Data Loaded From L4 In Giga Bytes Of Function/Loop And Its Callees
total_l4_stored_gb: Data Stored To L4 In Giga Bytes Of Function/Loop And Its Callees
total_loaded_bytes: Total Loaded Bytes
total_loaded_gb: Data Transferred From Memory Sub-System To CPU (Total Traffic, Including L1, L2, LLC, And DRAM Traffic) In Giga Bytes Of Function/Loop And Its Callees
total_loads: Total Loads
total_max: Total max Instructions Number
total_memory: Total Memory Instructions Number
total_memory_gb: Data Transfers Between CPU And Memory Sub-System (Total Traffic, Including L1, L2, LLC, And DRAM Traffic) In Giga Bytes Of Function/Loop And Its Callees
total_min: Total min Instructions Number
total_mul: Total mul Instructions Number
total_overall_ai: Total AI - Total Arithmetic Intensity - Ratio Of Total FP and INT Operations To Total L1 Transferred Bytes.
total_reccp: Total reccp Instructions Number
total_sad: Total sad Instructions Number
total_scale: Total scale Instructions Number
total_sign: Total sign Instructions Number
total_sp_compute: Total sp_compute_column_descr
total_sp_compute_with_memory: Total sp_compute_with_memory_column_descr
total_sqrt: Total sqrt Instructions Number
total_sse_scalar_compute: Total sse Scalar Compute Instructions Number
total_sse_scalar_compute_with_memory: Total sse Scalar Compute With Memory Instructions Number
total_sse_scalar_memory: Total sse Scalar Memory Instructions Number
total_sse_vector_compute: Total sse Vector Compute Instructions Number
total_sse_vector_compute_with_memory: Total sse Vector Compute With Memory Instructions Number
total_sse_vector_memory: Total sse Vector Memory Instructions Number
total_stored_bytes: Total Stored Bytes
total_stored_gb: Data Transferred From CPU To Memory Sub-System (Total Traffic, Including L1, L2, LLC, And DRAM Traffic) In Giga Bytes Of Function/Loop And Its Callees
total_stores: Total Stores
total_sub: Total sub Instructions Number
total_time: Time Actively Executing Function/Loop and Its Callees (All Threads)
total_time_percent: % of Time Actively Executing Function/Loop and its callees. Starts at 100% (top)
total_vector_compute: Total Vector Compute Instructions Number
total_vector_compute_with_memory: Total Vector Compute With Memory Instructions Number
total_vector_memory: Total Vector Memory Instructions Number
total_vpconflict: Total vpconflict_column_descr
total_vplzcnt: Total vplzcnt_column_descr
total_x86_scalar_compute: Total x86 Scalar Compute Instructions Number
total_x86_scalar_compute_with_memory: Total x86 Scalar Compute With Memory Instructions Number
total_x86_scalar_memory: Total x86 Scalar Memory Instructions Number
total_x87_scalar_compute: Total x87 Scalar Compute Instructions Number
total_x87_scalar_compute_with_memory: Total x87 Scalar Compute With Memory Instructions Number
total_x87_scalar_memory: Total x87 Scalar Memory Instructions Number
traits: Scalar and Vectorization Characteristics That May Impact Performance
transformations: Transformations Applied By Compiler
trip_count_reliable: Flag Indicating Whether Trip Count Data Is Reported As Reliable By Collector
trip_count_total: Total Trip Count
type: Problem Type of Problem(s) in Problem Set
unique_index: Unique Index
unroll_factor: Loop Unroll Factor Applied By Compiler
vector_isa: Best Vector Instruction Set Architecture Type
vector_length: Loop Vector Length Estimated by Binary Static Analysis or Intel Compiler (Available Only for Intel Compiler 16.x or Higher)
vector_widths: Vector Widths In Bits
vectorization_details: Compiler Notes On Vectorization
vectorization_trip_count_type: Vectorization Trip Count Type By Compiler
why_no_vectorization: Intel Compiler Reason Why Loop Was Not Vectorized
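The self_* and total_* columns above differ in whether callees are included: self values exclude functions called in the loop or function, while total values cover the function/loop and its callees, and total_time_percent starts at 100% at the top of the tree. A minimal sketch of that relationship on a toy top-down tree follows. The `Node` class, the tree, and the timings are hypothetical, and the sketch assumes a nonzero root total and simple inclusive accounting (total = self + sum of callee totals); the tool may compute these differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical top-down tree entry; not a type from any real API."""
    name: str
    self_time: float
    children: List["Node"] = field(default_factory=list)


def total_time(node: Node) -> float:
    """Inclusive time: own self time plus the inclusive time of every callee."""
    return node.self_time + sum(total_time(child) for child in node.children)


def time_percent_rows(root: Node):
    """Rows of (depth, name, self_time, total_time, percent of root total)."""
    root_total = total_time(root)  # assumed nonzero in this sketch
    rows = []

    def visit(node: Node, depth: int) -> None:
        total = total_time(node)
        rows.append((depth, node.name, node.self_time, total, 100.0 * total / root_total))
        for child in node.children:
            visit(child, depth + 1)

    visit(root, 0)
    return rows


# Made-up timings in seconds.
tree = Node("main", 1.0, [
    Node("loop_a", 2.0, [Node("kernel", 5.0)]),
    Node("loop_b", 2.0),
])

rows = time_percent_rows(tree)
for depth, name, self_t, total_t, percent in rows:
    print(f"{'  ' * depth}{name}: self={self_t:.1f}s total={total_t:.1f}s ({percent:.0f}%)")
```

The root row carries 100% because its inclusive time is the denominator for every other row.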
----------------------------------------
dependencies columns
----------------------------------------
access_footprint: Maximum distance (among all instances of the loop) between Min And Max memory address values accessed by the instructions. Generated from the current source line with the current Stride Type. (Calculated In bytes.)
access_pattern: Information About Stride Types Detected In The Site
address_distance_confidence: Distance Confidence
all_cache_misses: Number of memory load operations served by a memory subsystem higher than cache. Calculated for all loop instances (assuming "cold" CPU cache). Value is a result of virtual cache modeling, which might not match the exact counter reported by hardware for this particular run.
all_dirty_evictions: Number Of Evicted Cache Lines With a Modified State Introducing Upstream Memory Traffic to a Higher Memory Subsystem
all_eliminated_loaded_bytes: All Eliminated Loaded Bytes
all_eliminated_loads: All Eliminated Loads
all_eliminated_memory_operations: All Eliminated Memory Operations
all_eliminated_stored_bytes: All Eliminated Stored Bytes
all_eliminated_stores: All Eliminated Stores
all_eliminated_transferred_bytes: All Eliminated Transferred Bytes
all_expected_loaded_bytes: All Expected Loaded Bytes
all_expected_loads: All Expected Loads
all_expected_memory_operations: All Expected Memory Operations
all_expected_stored_bytes: All Expected Stored Bytes
all_expected_stores: All Expected Stores
all_expected_transferred_bytes: All Expected Transferred Bytes
all_memory_loads: Number Of Memory Load Operations In All Loop Instances
all_memory_stores: Number Of Memory Store Operations In All Loop Instances
all_rfo_cache_misses: Number of Read for Ownership Cache Lines (Cache Lines Loaded to the Cache Due to a Data Modification Request)
all_total_loaded_bytes: All Total Loaded Bytes
all_total_loads: All Total Loads
all_total_memory_operations: All Total Memory Operations
all_total_stored_bytes: All Total Stored Bytes
all_total_stores: All Total Stores
all_total_transferred_bytes: All Total Transferred Bytes
cache_line_utilization: Simulated Cache Lines Utilization For Data Transfer Operations
cache_misses: Number of memory load operations served by a memory subsystem higher than cache. Calculated for the first instance of the loop (assuming "cold" CPU cache). Value is a result of virtual cache modeling, which might not match the exact counter reported by hardware for this particular run.
constant_accesses: Number of Memory Accesses That Consistently Change By N Elements
constant_stride_percent: Percent Of Constant Strides For Selected Site
constant_strides: Constant Strides For Selected Site
dirty_evictions: Number Of Evicted Cache Lines With A Modified State Introducing Upstream Memory Traffic To A Higher Memory Subsystem
eliminated_loaded_bytes: Eliminated Loaded Bytes
eliminated_loads: Eliminated Loads
eliminated_memory_operations: Eliminated Memory Operations
eliminated_stored_bytes: Eliminated Stored Bytes
eliminated_stores: Eliminated Stores
eliminated_transferred_bytes: Eliminated Transferred Bytes
expected_loaded_bytes: Expected Loaded Bytes
expected_loads: Expected Loads
expected_memory_operations: Expected Memory Operations
expected_stored_bytes: Expected Stored Bytes
expected_stores: Expected Stores
expected_transferred_bytes: Expected Transferred Bytes
first_instance_site_footprint: Memory Footprint For The First Instance Of The Loop (Calculated In Bytes)
fit_into_cache: Dataset Fits Into Cache in Given Loop
footprint_estimate: Memory Footprint Estimates For This Loop
function: Function Name
gather_accesses: Number Of Accesses Detected for Gather Instructions on AVX2 Instruction Set
is_vectorized: Is Vectorized
load_constant_accesses: Number of Load Memory Accesses That Consistently Change By N Elements
load_random_accesses: Number of Load Memory Accesses That Change by an Unpredictable Number of Elements
load_uniform_accesses: Number of Load Accesses Into The Same Memory
load_unit_accesses: Number of Load Memory Accesses That Consistently Change By One Element
map_executed_first_instance_iterations: First Site Instance Iterations Counted by MAP Collector
map_executed_instances: Site Instances Counted by MAP Collector
map_executed_max_iterations: Maximum Instance Site Iterations Counted by MAP Collector
map_executed_min_iterations: Minimum Instance Site Iterations Counted by MAP Collector
map_executed_total_iterations: Total Site Iterations Counted by MAP Collector
memory_loads: Number Of Memory Load Operations In The First Instance Of The Loop
memory_stores: Number Of Memory Store Operations In The First Instance Of The Loop
no_site_start_dependency: No Site Start Dependency
non_unit_stride_percent: Percent Of Non-Unit Strides For Selected Site
non_unit_strides: Non Unit Strides For Selected Site
potential_reduction_column: potential_reduction_column_descr
potential_war_dependency: Potential WAR (Write After Read) Dependency
potential_waw_dependency: Potential WAW (Write After Write) Dependency
random_accesses: Number of Memory Accesses That Change by an Unpredictable Number of Elements
raw_dependencies: RAW (Read After Write) Dependencies
read_constant_stride_percent: Percent Of Constant Strides Among Memory Read Strides For Selected Site
read_constant_strides: Memory Read Constant Strides For Selected Site
read_non_unit_stride_percent: Percent Of Non-Unit Strides Among Memory Read Strides For Selected Site
read_non_unit_strides: Memory Read Non-Unit Strides For Selected Site
read_total_strides: Total Read Strides For Selected Site
read_unit_stride_percent: Percent Of Unit Strides Among Memory Read Strides For Selected Site
read_unit_strides: Memory Read Unit Strides For Selected Site
rfo_cache_misses: Read For Ownership - Number Of Cache Lines Loaded To The Cache Due To A Data Modification Request
scatter_accesses: Number Of Accesses Detected for Scatter Instructions on AVX2 Instruction Set
simulated_load_memory_footprint: Simulated Load Memory Footprint For The First Instance Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
simulated_memory_footprint: Simulated Memory Footprint For The First Instance Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
simulated_store_memory_footprint: Simulated Store Memory Footprint For The First Instance Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
site_id: Site Id
site_name: Site Name If Using Source Annotations/Sequence ID If Marking Loops For Deeper Analysis In Survey Report
site_source_file: Site Source File Full Path
site_source_line: Site Source Line
source: Source File Location
store_constant_accesses: Number of Store Memory Accesses That Consistently Change By N Elements
store_random_accesses: Number of Store Memory Accesses That Change by an Unpredictable Number of Elements
store_uniform_accesses: Number of Store Memory Accesses Into The Same Memory Location
store_unit_accesses: Number of Store Memory Accesses That Consistently Change By One Element
total_loaded_bytes: Total Loaded Bytes
total_loads: Total Loads
total_memory_operations: Total Memory Operations
total_simulated_load_memory_footprint: Simulated Load Memory Footprint For All Instances Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
total_simulated_memory_footprint: Simulated Memory Footprint For All Instances Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
total_simulated_store_memory_footprint: Simulated Store Memory Footprint For All Instances Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
total_stored_bytes: Total Stored Bytes
total_stores: Total Stores
total_strides: Total Strides
total_transferred_bytes: Total Transferred Bytes
uniform_accesses: Number of Memory Accesses Into The Same Memory Location
unit_accesses: Number of Memory Accesses That Consistently Change By One Element
unit_stride_percent: Percent Of Unit Strides For Selected Site
unit_strides: Unit Strides For Selected Site
war_dependencies: WAR (Write After Read) Dependencies
waw_dependencies: WAW (Write After Write) Dependencies
write_constant_stride_percent: Percent Of Constant Strides Among Memory Write Strides For Selected Site
write_constant_strides: Memory Write Constant Strides For Selected Site
write_non_unit_stride_percent: Percent Of Non-Unit Strides Among Memory Write Strides For Selected Site
write_non_unit_strides: Memory Write Non-Unit Strides For Selected Site
write_total_strides: Total Write Strides For Selected Site
write_unit_stride_percent: Percent Of Unit Strides Among Memory Write Strides For Selected Site
write_unit_strides: Memory Write Unit Strides For Selected Site
----------------------------------------
dependency problem columns
----------------------------------------
id: Memory Access Location ID
is_filter_match: Is Filter Match
modules: Application Modules Where The Memory Access Is Issued
severity: Memory Access Severity
site_name: Site Name If Using Source Annotations/Sequence ID If Marking Loops For Deeper Analysis In Survey Report
sources: Source File Location(s)
state: State Of Most Severe Problem In Problem Set: Regression, New, Not Fixed, Confirmed, Fixed, Not A Problem
type: Type Of Problem(s) In Problem Set
----------------------------------------
dependency observation columns
----------------------------------------
description: Code Location Classification
function: Function Name
id: Memory Access Location ID
instruction_address: Instruction Address In Memory
module: Name Of Program Module Containing Loop
source: Source File Location
state: State Of Most Severe Problem In Problem Set: Regression, New, Not Fixed, Confirmed, Fixed, Not A Problem
variable_references: Name Of The Variable For Which The Memory Access Stride Is Detected
----------------------------------------
map columns
----------------------------------------
access_footprint: Maximum distance in bytes (among all instances of the loop) between the minimum and maximum memory addresses accessed by the instructions generated from the current source line with the current stride type.
access_pattern: Information About Stride Types Detected In The Site
address_distance_confidence: Distance Confidence
all_cache_misses: Number of memory load operations served by a memory subsystem higher than cache. Calculated for all loop instances (assuming "cold" CPU cache). Value is a result of virtual cache modeling, which might not match the exact counter reported by hardware for this particular run.
all_dirty_evictions: Number Of Evicted Cache Lines With a Modified State Introducing Upstream Memory Traffic to a Higher Memory Subsystem
all_eliminated_loaded_bytes: All Eliminated Loaded Bytes
all_eliminated_loads: All Eliminated Loads
all_eliminated_memory_operations: All Eliminated Memory Operations
all_eliminated_stored_bytes: All Eliminated Stored Bytes
all_eliminated_stores: All Eliminated Stores
all_eliminated_transferred_bytes: All Eliminated Transferred Bytes
all_expected_loaded_bytes: All Expected Loaded Bytes
all_expected_loads: All Expected Loads
all_expected_memory_operations: All Expected Memory Operations
all_expected_stored_bytes: All Expected Stored Bytes
all_expected_stores: All Expected Stores
all_expected_transferred_bytes: All Expected Transferred Bytes
all_memory_loads: Number Of Memory Load Operations In All Loop Instances
all_memory_stores: Number Of Memory Store Operations In All Loop Instances
all_rfo_cache_misses: Number of Read for Ownership Cache Lines (Cache Lines Loaded to the Cache Due to a Data Modification Request)
all_total_loaded_bytes: All Total Loaded Bytes
all_total_loads: All Total Loads
all_total_memory_operations: All Total Memory Operations
all_total_stored_bytes: All Total Stored Bytes
all_total_stores: All Total Stores
all_total_transferred_bytes: All Total Transferred Bytes
cache_line_utilization: Simulated Cache Lines Utilization For Data Transfer Operations
cache_misses: Number of memory load operations served by a memory subsystem higher than cache. Calculated for the first instance of the loop (assuming "cold" CPU cache). Value is a result of virtual cache modeling, which might not match the exact counter reported by hardware for this particular run.
constant_accesses: Number of Memory Accesses That Consistently Change By N Elements
constant_stride_percent: Percent Of Constant Strides For Selected Site
constant_strides: Constant Strides For Selected Site
dirty_evictions: Number Of Evicted Cache Lines With A Modified State Introducing Upstream Memory Traffic To A Higher Memory Subsystem
eliminated_loaded_bytes: Eliminated Loaded Bytes
eliminated_loads: Eliminated Loads
eliminated_memory_operations: Eliminated Memory Operations
eliminated_stored_bytes: Eliminated Stored Bytes
eliminated_stores: Eliminated Stores
eliminated_transferred_bytes: Eliminated Transferred Bytes
expected_loaded_bytes: Expected Loaded Bytes
expected_loads: Expected Loads
expected_memory_operations: Expected Memory Operations
expected_stored_bytes: Expected Stored Bytes
expected_stores: Expected Stores
expected_transferred_bytes: Expected Transferred Bytes
first_instance_site_footprint: Memory Footprint For The First Instance Of The Loop (Calculated In Bytes)
fit_into_cache: Dataset Fits Into Cache in Given Loop
footprint_estimate: Memory Footprint Estimates For This Loop
function: Function Name
gather_accesses: Number Of Accesses Detected for Gather Instructions on AVX2 Instruction Set
is_vectorized: Is Vectorized
load_constant_accesses: Number of Load Memory Accesses That Consistently Change By N Elements
load_random_accesses: Number of Load Memory Accesses That Change by an Unpredictable Number of Elements
load_uniform_accesses: Number of Load Memory Accesses Into The Same Memory Location
load_unit_accesses: Number of Load Memory Accesses That Consistently Change By One Element
map_executed_first_instance_iterations: First Site Instance Iterations Counted by MAP Collector
map_executed_instances: Site Instances Counted by MAP Collector
map_executed_max_iterations: Maximum Instance Site Iterations Counted by MAP Collector
map_executed_min_iterations: Minimum Instance Site Iterations Counted by MAP Collector
map_executed_total_iterations: Total Site Iterations Counted by MAP Collector
memory_loads: Number Of Memory Load Operations In The First Instance Of The Loop
memory_stores: Number Of Memory Store Operations In The First Instance Of The Loop
no_site_start_dependency: No Site Start Dependency
non_unit_stride_percent: Percent Of Non-Unit Strides For Selected Site
non_unit_strides: Non-Unit Strides For Selected Site
potential_reduction_column: potential_reduction_column_descr
potential_war_dependency: Potential WAR (Write After Read) Dependency
potential_waw_dependency: Potential WAW (Write After Write) Dependency
random_accesses: Number of Memory Accesses That Change by an Unpredictable Number of Elements
raw_dependencies: RAW (Read After Write) Dependencies
read_constant_stride_percent: Percent Of Constant Strides Among Memory Read Strides For Selected Site
read_constant_strides: Memory Read Constant Strides For Selected Site
read_non_unit_stride_percent: Percent Of Non-Unit Strides Among Memory Read Strides For Selected Site
read_non_unit_strides: Memory Read Non-Unit Strides For Selected Site
read_total_strides: Total Read Strides For Selected Site
read_unit_stride_percent: Percent Of Unit Strides Among Memory Read Strides For Selected Site
read_unit_strides: Memory Read Unit Strides For Selected Site
rfo_cache_misses: Read For Ownership - Number Of Cache Lines Loaded To The Cache Due To A Data Modification Request
scatter_accesses: Number Of Accesses Detected for Scatter Instructions on AVX2 Instruction Set
simulated_load_memory_footprint: Simulated Load Memory Footprint For The First Instance Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
simulated_memory_footprint: Simulated Memory Footprint For The First Instance Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
simulated_store_memory_footprint: Simulated Store Memory Footprint For The First Instance Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
site_id: Site Id
site_name: Site Name If Using Source Annotations/Sequence ID If Marking Loops For Deeper Analysis In Survey Report
site_source_file: Site Source File Full Path
site_source_line: Site Source Line
source: Source File Location
store_constant_accesses: Number of Store Memory Accesses That Consistently Change By N Elements
store_random_accesses: Number of Store Memory Accesses That Change by an Unpredictable Number of Elements
store_uniform_accesses: Number of Store Memory Accesses Into The Same Memory Location
store_unit_accesses: Number of Store Memory Accesses That Consistently Change By One Element
total_loaded_bytes: Total Loaded Bytes
total_loads: Total Loads
total_memory_operations: Total Memory Operations
total_simulated_load_memory_footprint: Simulated Load Memory Footprint For All Instances Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
total_simulated_memory_footprint: Simulated Memory Footprint For All Instances Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
total_simulated_store_memory_footprint: Simulated Store Memory Footprint For All Instances Of The Loop, Calculated In Bytes (Number Of Unique Cache Lines Accessed During The CPU Cache Simulation * Cache Line Size)
total_stored_bytes: Total Stored Bytes
total_stores: Total Stores
total_strides: Total Strides
total_transferred_bytes: Total Transferred Bytes
uniform_accesses: Number of Memory Accesses Into The Same Memory Location
unit_accesses: Number of Memory Accesses That Consistently Change By One Element
unit_stride_percent: Percent Of Unit Strides For Selected Site
unit_strides: Unit Strides For Selected Site
war_dependencies: WAR (Write After Read) Dependencies
waw_dependencies: WAW (Write After Write) Dependencies
write_constant_stride_percent: Percent Of Constant Strides Among Memory Write Strides For Selected Site
write_constant_strides: Memory Write Constant Strides For Selected Site
write_non_unit_stride_percent: Percent Of Non-Unit Strides Among Memory Write Strides For Selected Site
write_non_unit_strides: Memory Write Non-Unit Strides For Selected Site
write_total_strides: Total Write Strides For Selected Site
write_unit_stride_percent: Percent Of Unit Strides Among Memory Write Strides For Selected Site
write_unit_strides: Memory Write Unit Strides For Selected Site
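
The simulated_*_memory_footprint descriptions above give the footprint as (number of unique cache lines accessed * cache line size). The sketch below illustrates only that arithmetic. The function name, the address trace, the 4-byte access size, and the 64-byte cache line size are assumptions made for illustration; this is not the tool's cache simulator.

```python
def simulated_footprint(addresses, access_size=4, line_size=64):
    """Return the bytes covered by the unique cache lines the accesses touch.

    addresses   -- byte addresses, one per access (hypothetical trace)
    access_size -- bytes read or written per access (assumed)
    line_size   -- cache line size in bytes (assumed 64)
    """
    lines = set()
    for addr in addresses:
        first = addr // line_size
        last = (addr + access_size - 1) // line_size
        # An access that straddles a line boundary touches both lines.
        lines.update(range(first, last + 1))
    return len(lines) * line_size


# 16 four-byte elements read with unit stride share one 64-byte line.
unit = simulated_footprint([i * 4 for i in range(16)])
# The same 16 reads with a stride of 16 elements touch 16 separate lines.
strided = simulated_footprint([i * 64 for i in range(16)])
```

Under these assumptions the same number of loads covers 16 times more cache lines in the strided case, which is the kind of gap the footprint and cache_line_utilization columns are meant to expose.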
----------------------------------------
map problem columns
----------------------------------------
access_footprint: Maximum distance in bytes (among all instances of the loop) between the minimum and maximum memory addresses accessed by the instructions generated from the current source line with the current stride type.
access_pattern_type: Memory Access Pattern Types: Unit Stride (Stride 1), Stride 0, Constant Stride (Stride N), Irregular Stride
access_type: Access Type
address_distance_confidence: Distance Confidence
id: Memory Access Location ID
is_filter_match: Is Filter Match
modules: Application Modules Where The Memory Access Is Issued
nested_function: Function (Invoked From Site) Where Stride Diagnostic Was Detected
severity: Memory Access Severity
site_name: Site Name If Using Source Annotations/Sequence ID If Marking Loops For Deeper Analysis In Survey Report
source: Source File Location
stride: Physical Distance In Elements Between Memory Accesses In Two Adjacent Iterations
variable_references: Name Of The Variable For Which The Memory Access Stride Is Detected
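
The access_pattern_type and stride columns above can be read together: the stride is the distance in elements between accesses in two adjacent iterations, and the pattern type names how that distance behaves. The sketch below shows one possible way to derive both from a per-iteration address trace. The function name, the trace, and the element size are hypothetical, and counting a negative constant stride as "Constant Stride (Stride N)" is an assumption of this sketch, not documented tool behavior.

```python
def classify_stride(addresses, element_size):
    """Classify one access across iterations; return (pattern_type, stride).

    addresses    -- byte address used in each consecutive iteration (hypothetical)
    element_size -- size of one element in bytes (must be positive)
    """
    if element_size <= 0:
        raise ValueError("element_size must be positive")
    if len(addresses) < 2:
        return ("Unknown", None)  # a stride needs at least two iterations
    deltas = [b - a for a, b in zip(addresses, addresses[1:])]
    first = deltas[0]
    # The distance must be identical in every iteration and a whole
    # number of elements; otherwise the access is treated as irregular.
    if any(d != first for d in deltas) or first % element_size != 0:
        return ("Irregular Stride", None)
    stride = first // element_size
    if stride == 0:
        return ("Stride 0", 0)
    if stride == 1:
        return ("Unit Stride (Stride 1)", 1)
    return ("Constant Stride (Stride N)", stride)
```

For example, with 4-byte elements the trace [0, 32, 64] classifies as a constant stride of 8 elements, while [0, 4, 20] is irregular because the distance changes between iterations.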
----------------------------------------
map observation columns
----------------------------------------
description: Code Location Classification
function: Function Name
id: Memory Access Location ID
instruction_address: Instruction Address In Memory
module: Name Of Program Module Containing Loop
source: Source File Location
state: State Of Most Severe Problem In Problem Set: Regression, New, Not Fixed, Confirmed, Fixed, Not A Problem
variable_references: Name Of The Variable For Which The Memory Access Stride Is Detected