Why is Looping Over 8192 Elements So Much Slower Than 8191 or 8193?

Susan Sarandon
Release: 2024-12-11 08:56:10

Performance Impact When Looping Over 8192 Elements

Certain matrix operations slow down sharply when the matrix dimension (here, the row length) is a multiple of 2048, such as 8192. This phenomenon, referred to as super-alignment, comes from how CPU caches map addresses: when consecutive rows are separated by a large power-of-two number of bytes, elements in the same column compete for the same few cache sets and keep evicting one another.

The code in question computes a matrix res[][] from a matrix img[][], both of dimension SIZE x SIZE. Timing it for SIZE = 8191, 8192, and 8193 shows a significant slowdown at 8192, even though the three sizes differ by only one element per row.

Super-Alignment Effects

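The performance variations stem from non-sequential memory access: the nested loops iterate column-wise over img[][]. C and C++ store two-dimensional arrays row by row, so moving down a column jumps a full row (SIZE elements) on every step instead of reading the next adjacent element. Modern CPUs are far more efficient with sequential access, so this pattern is slow at every size, and it is slowest when the row length is a large power of two, because the strided accesses then collide in the cache.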
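To make the collision concrete, the sketch below counts how many distinct cache sets a single column touches. It is only a simplified model: the cache geometry (64-byte lines, 64 sets), the float element type, an array that starts at offset 0, and the helper name distinct_sets_in_column are all assumptions made for illustration, not details taken from the article.

```cpp
#include <cstddef>
#include <set>

// Hypothetical cache geometry, chosen only for illustration.
constexpr std::size_t kLineBytes = 64;  // bytes per cache line (assumed)
constexpr std::size_t kNumSets = 64;    // number of cache sets (assumed)

// Counts the distinct cache sets touched when reading element 0 of each of
// the first `rows` rows of a row-major matrix with `size` floats per row.
std::size_t distinct_sets_in_column(std::size_t size, std::size_t rows) {
    std::set<std::size_t> sets;
    const std::size_t row_stride = size * sizeof(float);  // bytes between rows
    for (std::size_t r = 0; r < rows; ++r) {
        const std::size_t byte_offset = r * row_stride;
        // Set index = cache line number modulo the number of sets.
        sets.insert((byte_offset / kLineBytes) % kNumSets);
    }
    return sets.size();
}
```

Under these assumptions, distinct_sets_in_column(8192, 1024) returns 1, while sizes 8191 and 8193 each spread the same 1024 accesses over all 64 sets. Real caches differ in size and associativity, so the exact numbers will vary by machine.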

Resolution: Interchanging Outer Loops

The solution is to interchange the two outer loops so that the inner loop walks along a row (index i) while the outer loop steps from row to row (index j). Memory access then becomes sequential, which improves performance significantly:

for (j = 1; j < SIZE - 1; j++) {        // outer loop: one row at a time
    for (i = 1; i < SIZE - 1; i++) {    // inner loop: adjacent elements of row j
        // Code to compute res[j][i]
    }
}
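The two loop orders can be compared side by side with a small, self-contained sketch. The article does not show the loop body, so the 3x3 averaging used here is an assumption, as are the flat std::vector storage and the names mean3x3, filter_column_wise, and filter_row_wise. Both functions compute the same result; only the traversal order differs.

```cpp
#include <cstddef>
#include <vector>

// n x n matrix stored row-major: element (row, col) lives at row * n + col.
using Matrix = std::vector<float>;

// Assumed loop body: average of the 3x3 neighbourhood centred on (row, col).
// Callers must pass an interior cell (1 <= row, col <= n - 2).
static float mean3x3(const Matrix& img, std::size_t n,
                     std::size_t row, std::size_t col) {
    float sum = 0.0f;
    for (std::size_t r = row - 1; r <= row + 1; ++r)
        for (std::size_t c = col - 1; c <= col + 1; ++c)
            sum += img[r * n + c];
    return sum / 9.0f;
}

// Column-wise order: the inner loop changes the row, so every step
// jumps n floats ahead in memory.
Matrix filter_column_wise(const Matrix& img, std::size_t n) {
    Matrix res(n * n, 0.0f);
    for (std::size_t i = 1; i + 1 < n; ++i)        // column
        for (std::size_t j = 1; j + 1 < n; ++j)    // row
            res[j * n + i] = mean3x3(img, n, j, i);
    return res;
}

// Row-wise order (loops interchanged): the inner loop changes the column,
// so consecutive iterations touch adjacent floats.
Matrix filter_row_wise(const Matrix& img, std::size_t n) {
    Matrix res(n * n, 0.0f);
    for (std::size_t j = 1; j + 1 < n; ++j)        // row
        for (std::size_t i = 1; i + 1 < n; ++i)    // column
            res[j * n + i] = mean3x3(img, n, j, i);
    return res;
}
```

Because each output cell is computed independently of the others, swapping the loops does not change the result, which is what makes the interchange safe here.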

Performance Results

The following performance results demonstrate the improvement achieved by interchanging the outer loops:

Matrix Size | Original Code (s) | Interchanged Loops (s)
8191        | 1.499             | 0.376
8192        | 2.122             | 0.357
8193        | 1.582             | 0.351

Interchanging the loops makes every size roughly 4x to 6x faster and removes the penalty at 8192: the original code is about 40% slower at 8192 than at 8191, whereas the interchanged version runs in about 0.35 to 0.38 seconds at all three sizes.
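Timings of this kind can be gathered with a small wall-clock helper. The sketch below is one possible way to do it, not the method used to produce the numbers above; time_seconds is a hypothetical name, and std::chrono::steady_clock is chosen because it is monotonic.

```cpp
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in seconds.
template <typename F>
double time_seconds(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

A call might look like time_seconds([&] { /* run the loop under test */ }). For a fair comparison, compile with optimizations enabled, make sure the result is actually used so the compiler cannot drop the loop, and repeat each measurement several times.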

source:php.cn