Question
What is the difference between cache-unfriendly code and cache-friendly code?
How can I write code that uses the CPU cache efficiently and avoids slow memory access patterns?
Short Answer
By the end of this page, you will understand what CPU cache-friendly code means, why memory access patterns can affect performance, and how to improve code by using sequential access, better data layout, and fewer unnecessary memory lookups.
Concept
CPU caches are small, very fast memory areas located close to the processor. They store copies of recently used data so the CPU does not need to fetch everything from much slower main memory (RAM).
When code is cache-friendly, it accesses memory in a way that works well with these caches. When code is cache-unfriendly, it causes frequent cache misses, forcing the CPU to wait for data to be loaded from RAM.
Why this matters
Modern CPUs are extremely fast at executing instructions, but memory access is comparatively slow. In many programs, performance is limited not by arithmetic, but by how data is laid out and accessed.
What usually makes code cache-friendly
- Accessing memory sequentially
- Reusing recently accessed data
- Keeping related data close together in memory
- Reducing random pointer chasing
- Working on smaller chunks of data that fit in cache
What usually makes code cache-unfriendly
- Jumping around memory unpredictably
- Repeatedly following pointers to scattered objects
- Using data structures with poor locality for the task
- Scanning very large memory regions repeatedly
- Storing related data far apart
Spatial and temporal locality
Two key ideas explain cache-friendly code:
- Spatial locality: if you access one memory location, nearby locations are likely to be useful soon.
- Temporal locality: if you access data now, you are likely to access the same data again soon.
CPU caches are built to exploit these patterns. Code that matches them tends to run faster.
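As a rough sketch of spatial locality, the snippet below sums the same matrix in two different orders. The matrix size and the function names are illustrative assumptions, not something the page specifies. Both functions return the same total, but the row-by-row version reads adjacent elements while the column-by-column version jumps `cols` elements on every step.

```javascript
// Hypothetical example: a rows x cols matrix stored row by row in one flat typed array.
const rows = 512;
const cols = 512;
const matrix = new Float64Array(rows * cols).fill(1);

// Row-major traversal: consecutive iterations touch adjacent elements.
function sumRowMajor(m, rowCount, colCount) {
  let sum = 0;
  for (let r = 0; r < rowCount; r++) {
    for (let c = 0; c < colCount; c++) {
      sum += m[r * colCount + c];
    }
  }
  return sum;
}

// Column-major traversal: consecutive iterations are colCount elements apart.
function sumColumnMajor(m, rowCount, colCount) {
  let sum = 0;
  for (let c = 0; c < colCount; c++) {
    for (let r = 0; r < rowCount; r++) {
      sum += m[r * colCount + c];
    }
  }
  return sum;
}

console.log(sumRowMajor(matrix, rows, cols));
console.log(sumColumnMajor(matrix, rows, cols));
```

On larger matrices the row-major version is often faster, but the size of the gap depends on the engine and the hardware, so measure before relying on it.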
Example intuition
An array is often cache-friendly because its elements are stored next to each other. A linked list is often less cache-friendly because each node may live in a completely different place in memory.
That means looping through an array is usually much faster than walking through a linked list, even if both contain the same values.
Mental Model
Think of the CPU cache like a small tray on your desk and RAM like a large storage room in another part of the building.
- Cache-friendly code is like taking a stack of papers from the storage room and processing them one by one while they are already on your desk.
- Cache-unfriendly code is like walking back to the storage room for one random paper at a time.
The less often you need to leave the desk, the faster you work.
Another useful analogy is reading a book:
- Reading page 10, then 11, then 12 is cache-friendly.
- Reading page 10, then 400, then 23, then 275 is cache-unfriendly.
The CPU prefers predictable, nearby access patterns.
Syntax and Examples
There is no special language syntax for cache-friendly code. It is mainly about how you organize and access data.
Example: sequential array access
const numbers = new Array(1_000_000).fill(1);
let sum = 0;
for (let i = 0; i < numbers.length; i++) {
  sum += numbers[i];
}
console.log(sum);
This is relatively cache-friendly because JavaScript engines typically store dense numeric arrays contiguously or near-contiguously in memory, and the loop reads the elements in order.
Example: poor locality through indirection
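const items = Array.from({ length: 100000 }, (_, i) => ({ value: i }));
// Quick, statistically biased shuffle; good enough to scramble the visiting order.
const shuffled = [...items].sort(() => Math.random() - 0.5);
let sum = 0;
for (let i = 0; i < shuffled.length; i++) {
  sum += shuffled[i].value;
}
console.log(sum);
This is less cache-friendly because each element is a reference to a separate object, and after shuffling, consecutive iterations may read objects that sit far apart in memory.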
Step by Step Execution
Consider this example:
const data = [10, 20, 30, 40];
let total = 0;
for (let i = 0; i < data.length; i++) {
  total += data[i];
}
console.log(total);
Step by step:
- data is created as an array of numbers.
- total starts at 0.
- The loop starts with i = 0.
- The program reads data[0], which is 10, and adds it to total.
- Then it reads data[1], which is 20, and adds it.
- Then it reads data[2], which is 30, and adds it.
- Then it reads data[3], which is 40, and adds it.
- The final value of total is 100.
Real World Use Cases
Cache-friendly thinking matters most in performance-sensitive code.
Common examples
- Game engines: updating positions, physics, and collisions for thousands of entities
- Data processing: scanning large arrays, logs, matrices, or analytics data
- Image and video processing: iterating through pixels in row order
- Scientific computing: matrix operations and simulations
- Databases: scanning rows, indexes, and in-memory structures efficiently
- Machine learning systems: operating on dense arrays and tensors
- Backend services: processing large batches of records or metrics
Example scenario
Suppose you are processing 10 million user scores.
- A cache-friendly version stores scores in a flat array and loops through them once.
- A cache-unfriendly version stores each score inside a separate object, with extra metadata, and accesses them in irregular order.
Both programs may do the same logical work, but the first often performs better because it uses memory more efficiently.
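The two layouts in this scenario might look like the sketch below. It uses 200,000 scores instead of 10 million so it runs quickly, and the field names (`userId`, `label`, `score`) are made up for illustration. It shows only the layout difference and leaves out the irregular access order.

```javascript
// Smaller than the 10 million in the scenario so the sketch runs quickly.
const count = 200_000;

// Flat layout: one contiguous buffer of 64-bit floats.
const flatScores = new Float64Array(count);
for (let i = 0; i < count; i++) {
  flatScores[i] = i % 100;
}

// Object layout: one heap object per score, each carrying extra metadata.
const scoreObjects = Array.from({ length: count }, (_, i) => ({
  userId: i,
  label: "user-" + i,
  score: i % 100,
}));

function sumFlat(scores) {
  let total = 0;
  for (let i = 0; i < scores.length; i++) {
    total += scores[i];
  }
  return total;
}

function sumObjects(objects) {
  let total = 0;
  for (let i = 0; i < objects.length; i++) {
    total += objects[i].score; // one extra lookup per element
  }
  return total;
}

console.log(sumFlat(flatScores), sumObjects(scoreObjects));
```

Both functions return the same total. The flat version reads 8 bytes per score, while the object version also pulls each object's other fields and bookkeeping through the cache.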
Real Codebase Usage
In real projects, developers usually improve cache efficiency through data layout and iteration patterns rather than by writing special cache commands.
Common patterns
Process data in order
When possible, loop through arrays or buffers from start to finish.
for (let i = 0; i < records.length; i++) {
  process(records[i]);
}
Avoid unnecessary object indirection
If performance matters, prefer flatter data structures over deep object graphs.
// More indirect
const users = [{ profile: { age: 20 } }, { profile: { age: 25 } }];
// Flatter
const ages = [20, 25];
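One possible way to get from the nested form to the flat one is a single extraction pass before the hot loop. The sketch below assumes every age fits in the range 0 to 255, which is why it uses a `Uint8Array`. The `averageAge` helper is a hypothetical name.

```javascript
// Hypothetical nested records, shaped like the users example above.
const users = [
  { profile: { age: 20 } },
  { profile: { age: 25 } },
  { profile: { age: 30 } },
];

// One-time extraction: follow the nested references once, up front.
// Uint8Array assumes every age fits in 0..255.
const ages = Uint8Array.from(users, (user) => user.profile.age);

// The hot loop now reads one contiguous buffer with no object lookups.
function averageAge(values) {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    total += values[i];
  }
  return values.length === 0 ? 0 : total / values.length;
}

console.log(averageAge(ages)); // 25
```

The extraction itself costs one pass over the objects, so this tends to pay off only when the flat data is read many times afterwards.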
Work in batches
Large datasets are often processed in chunks so active data fits better in cache.
const chunkSize = 1000;
for (let start = 0; start < data.length; start += chunkSize) {
  // Clamp the end of the last chunk to the array length.
  const end = Math.min(start + chunkSize, data.length);
  for (let i = start; i < end; i++) {
    process(data[i]);
  }
}
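Chunking helps most when each chunk is touched more than once, or when reads and writes would otherwise stride across memory. A common illustration is a tiled matrix transpose, sketched below. The tile size of 32 is an assumption to tune by measurement, and `transposeTiled` is a hypothetical helper name.

```javascript
// Transposes an n x n matrix stored row by row in a flat typed array.
// Work proceeds tile by tile so the source and destination regions in use stay small.
function transposeTiled(src, n, tile = 32) {
  const dst = new Float64Array(n * n);
  for (let rowStart = 0; rowStart < n; rowStart += tile) {
    const rowEnd = Math.min(rowStart + tile, n);
    for (let colStart = 0; colStart < n; colStart += tile) {
      const colEnd = Math.min(colStart + tile, n);
      // Copy one tile: element (r, c) of src becomes element (c, r) of dst.
      for (let r = rowStart; r < rowEnd; r++) {
        for (let c = colStart; c < colEnd; c++) {
          dst[c * n + r] = src[r * n + c];
        }
      }
    }
  }
  return dst;
}

// n is deliberately not a multiple of the tile size to exercise the clamping.
const n = 100;
const source = Float64Array.from({ length: n * n }, (_, i) => i);
const transposed = transposeTiled(source, n);
console.log(transposed[1 * n + 0] === source[0 * n + 1]); // true
```

Without tiling, either the reads or the writes walk down a column, touching a new region of memory on almost every step. With tiling, both stay inside a small block at a time.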
Common Mistakes
1. Assuming only algorithms matter
A program can have a good Big-O complexity and still be slow because of poor memory locality.
- O(n) with sequential array access can be very fast
- O(n) with pointer-heavy random access can be much slower
2. Using linked structures for sequential processing
Beginners sometimes assume all collections with the same data are equally fast to scan.
In practice, arrays are often much better for iteration than pointer-based structures.
3. Accessing large data randomly
Poor pattern for cache efficiency:
// bigArray is assumed to be a large array defined elsewhere.
const indexes = [5000, 2, 900, 100000, 7];
let total = 0;
for (let i = 0; i < indexes.length; i++) {
  total += bigArray[indexes[i]];
}
This may be necessary in some programs, but random access is usually worse for cache use than ordered access.
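When the visiting order does not affect the result, as with a sum, one option is to sort the indexes first so the reads move forward through memory. This is a sketch under that assumption. `bigArray` here is a hypothetical stand-in filled with the value 2.

```javascript
// Hypothetical stand-in for the large array in the example above.
const bigArray = new Float64Array(200_000).fill(2);
const indexes = [5000, 2, 900, 100000, 7];

// Copy into a typed array and sort ascending. TypedArray sort is numeric by
// default, unlike Array.prototype.sort, which compares elements as strings.
const orderedIndexes = Uint32Array.from(indexes).sort();

let total = 0;
for (let i = 0; i < orderedIndexes.length; i++) {
  total += bigArray[orderedIndexes[i]];
}
console.log(total); // 10
```

Sorting has its own cost, so this is only likely to help when the index list is large or is reused across many passes.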
4. Splitting related data too much
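const item = {
  config: {
    display: {
      metrics: {
        score: 0 // placeholder value
      }
    }
  }
};
Reading item.config.display.metrics.score has to pass through four objects (item, config, display, metrics) before it reaches the value, and each hop may land in a different part of memory. If the score is read in a hot loop, keep it closer to the top level.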
Comparisons
| Concept | Cache-friendly version | Cache-unfriendly version | Why |
|---|---|---|---|
| Data traversal | Sequential array access | Random indexing | Nearby data is more likely to already be loaded |
| Data structure | Flat arrays | Linked lists / scattered objects | Arrays usually have better memory locality |
| Object layout | Compact related fields | Deep nested indirection | Each extra dereference can cost another memory access |
| Processing style | One pass over data | Many repeated passes | Reusing loaded data reduces memory traffic |
| Working set size | Small chunks | Huge active dataset | Smaller active data fits cache better |
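The "one pass versus many repeated passes" row can be sketched as follows. The data and the function names are illustrative assumptions. Both functions compute the same minimum, maximum, and sum, but the second loads each element once instead of three times.

```javascript
// Hypothetical input: 100,000 values between 0 and 999.
const values = Float64Array.from({ length: 100_000 }, (_, i) => (i * 7) % 1000);

// Three separate passes: the whole array streams through the cache three times.
function statsThreePasses(v) {
  let min = Infinity;
  for (let i = 0; i < v.length; i++) if (v[i] < min) min = v[i];
  let max = -Infinity;
  for (let i = 0; i < v.length; i++) if (v[i] > max) max = v[i];
  let sum = 0;
  for (let i = 0; i < v.length; i++) sum += v[i];
  return { min, max, sum };
}

// One fused pass: each element is loaded once and used for all three results.
function statsOnePass(v) {
  let min = Infinity;
  let max = -Infinity;
  let sum = 0;
  for (let i = 0; i < v.length; i++) {
    const x = v[i];
    if (x < min) min = x;
    if (x > max) max = x;
    sum += x;
  }
  return { min, max, sum };
}

console.log(statsThreePasses(values));
console.log(statsOnePass(values));
```

For arrays that fit comfortably in cache the difference may be small. It tends to grow once the array is larger than the cache.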
Array vs linked list
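A minimal sketch of the two traversals, assuming a hand-rolled singly linked list (JavaScript has no built-in one). Both hold the same values and produce the same sum. The array loop steps through one contiguous buffer, while the list loop must load each node before it knows where the next one is.

```javascript
const count = 100_000;

// Contiguous storage: element i sits right after element i - 1.
const array = new Float64Array(count);
for (let i = 0; i < count; i++) {
  array[i] = i;
}

// Singly linked list with the same values; every node is a separate heap object.
let head = null;
for (let i = count - 1; i >= 0; i--) {
  head = { value: i, next: head };
}

function sumArray(arr) {
  let sum = 0;
  for (let i = 0; i < arr.length; i++) {
    sum += arr[i];
  }
  return sum;
}

function sumList(node) {
  let sum = 0;
  while (node !== null) {
    sum += node.value;
    node = node.next; // the next address is only known after this node is loaded
  }
  return sum;
}

console.log(sumArray(array), sumList(head));
```

Nodes allocated back to back, as here, may still end up near each other in memory. In a long-running program, where nodes are created and freed at different times, they are more likely to be scattered.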
Cheat Sheet
Quick definition
Cache-friendly code accesses memory in ways that let the CPU reuse fast cached data.
Good patterns
- Iterate through arrays in order
- Keep related data close together
- Reuse data soon after loading it
- Prefer flatter structures in hot loops
- Process large data in chunks
- Reduce extra pointer chasing
Bad patterns
- Random memory access
- Scattered objects
- Linked-list-style traversal for large scans
- Repeated full passes over large datasets
- Large working sets that do not fit in cache
Key terms
- Cache miss: data was not in cache, so the CPU must fetch it from slower memory
- Spatial locality: nearby memory is likely to be used soon
- Temporal locality: recently used data is likely to be used again soon
- Working set: the amount of data actively used during a computation
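To get a feel for working set size, you can compare a buffer's size in bytes with typical cache sizes. The cache sizes below are assumptions for illustration only, since real values vary by CPU, and `describeWorkingSet` is a hypothetical helper.

```javascript
// Assumed cache sizes for illustration only; real values vary by CPU.
const ASSUMED_L1_BYTES = 32 * 1024;
const ASSUMED_L2_BYTES = 1024 * 1024;

// Reports where a typed array's backing data would fit under the assumed sizes.
function describeWorkingSet(typedArray) {
  const bytes = typedArray.byteLength;
  if (bytes <= ASSUMED_L1_BYTES) return bytes + " bytes: fits the assumed L1 size";
  if (bytes <= ASSUMED_L2_BYTES) return bytes + " bytes: fits the assumed L2 size";
  return bytes + " bytes: larger than the assumed L2 size";
}

console.log(describeWorkingSet(new Float64Array(1_000)));     // 8000 bytes
console.log(describeWorkingSet(new Float64Array(100_000)));   // 800000 bytes
console.log(describeWorkingSet(new Float64Array(1_000_000))); // 8000000 bytes
```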
Practical rules
// Usually better
for (let i = 0; i < arr.length; i++) {
  use(arr[i]);
}
// Often worse for large data
for (let i = 0; i < indexes.length; i++) {
  use(arr[indexes[i]]);
}
Important reminder
Cache friendliness is about memory access patterns, not special syntax.
FAQ
What does cache-friendly code mean?
It means code accesses data in ways that work well with the CPU cache, usually by reading nearby memory sequentially and reusing data soon.
Why can memory access be slower than computation?
The CPU is much faster than RAM. If needed data is not in cache, the CPU may wait for it to be fetched from slower memory.
Are arrays more cache-friendly than linked lists?
Usually yes. Arrays tend to store elements next to each other, while linked lists often scatter nodes across memory.
Does cache-friendly code always matter?
No. It matters most in performance-critical code, large loops, data-heavy processing, games, simulations, and systems code.
Can I control CPU cache directly in JavaScript?
Not directly. JavaScript does not expose low-level cache control, but you can still write code with better memory access patterns.
Is cache-friendly code the same as efficient algorithms?
Not exactly. Algorithmic complexity and cache behavior are different. Good performance often needs both.
How do I know whether cache issues are hurting performance?
Profile and benchmark your code using realistic inputs. Do not guess based only on intuition.
Mini Project
Description
Build a small benchmark script that compares sequential access with random access over a large array. This helps you see how memory access patterns can affect runtime, even when both approaches do similar work.
Goal
Create and run a program that measures the time difference between sequential and random array reads.
Requirements
- Create a large numeric array with many elements.
- Sum the array once using sequential access.
- Create a shuffled list of indexes and sum using that order.
- Measure the runtime of both approaches.
- Print the totals and elapsed times.
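One possible starting point is sketched below. The array size, the helper names, and the single-run timing are all assumptions. It relies on the global `performance.now()`, which is available in browsers and in recent Node.js versions.

```javascript
const size = 2_000_000; // assumed size; raise it if both runs finish too quickly
const data = new Float64Array(size).fill(1);

// Build the index list 0..size-1, then shuffle it with Fisher-Yates.
const order = new Uint32Array(size);
for (let i = 0; i < size; i++) {
  order[i] = i;
}
for (let i = size - 1; i > 0; i--) {
  const j = Math.floor(Math.random() * (i + 1));
  const tmp = order[i];
  order[i] = order[j];
  order[j] = tmp;
}

function sumSequential(values) {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

function sumByIndex(values, indexList) {
  let total = 0;
  for (let i = 0; i < indexList.length; i++) {
    total += values[indexList[i]];
  }
  return total;
}

// Runs fn once and reports its result with the elapsed wall-clock time.
function timeMs(fn) {
  const start = performance.now();
  const result = fn();
  return { result, ms: performance.now() - start };
}

const sequential = timeMs(() => sumSequential(data));
const random = timeMs(() => sumByIndex(data, order));
console.log("sequential: total=" + sequential.result + ", " + sequential.ms.toFixed(2) + " ms");
console.log("random:     total=" + random.result + ", " + random.ms.toFixed(2) + " ms");
```

A single run is noisy, and JIT warm-up can distort the first measurement. Running each version several times and comparing the medians gives a more trustworthy picture.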