Commit 5a4035e

Merge pull request #424 from vekatze/docs-memory-optimizations
clarify memory optimization behavior
2 parents 1f320de + f6bd8d6 commit 5a4035e

2 files changed

Lines changed: 118 additions & 4 deletions

book/src/basis.md

Lines changed: 111 additions & 3 deletions
@@ -5,6 +5,7 @@
55
- [On Executing Types](#on-executing-types)
66
- [Free-Malloc Canceling](#free-malloc-canceling)
77
- [Malloc-Free Canceling](#malloc-free-canceling)
8+
- [Order of Memory Optimizations](#order-of-memory-optimizations)
89
- [Name Resolution](#name-resolution)
910
- [Leading Bars and Trailing Commas](#leading-bars-and-trailing-commas)
1011
- [Compiler Configuration](#compiler-configuration)
@@ -144,7 +145,73 @@ However, since the size of `Cons(x, rest)` and `Cons(add-int(x, 1), increment(re
144145
2. calculate `add-int(x, 1)` and `increment(rest)`
145146
3. store the calculated values to `xs` (overwrite)
146147

147-
Neut performs this optimization. When a `free` is required, Neut looks for a `malloc` of the same size and optimizes away such a pair if one exists. The resulting assembly code thus performs in-place updates.
148+
Neut performs this optimization. When a `free` is required, Neut looks for a later `malloc` whose allocation can fit in the freed region and optimizes away such a pair if one exists. The resulting assembly code thus performs in-place updates.
149+
150+
### Size Matching
151+
152+
A known-size `free` can be canceled with a later known-size `malloc` when the freed region is large enough for the allocation.
153+
154+
For example, the following lowered pseudocode can be optimized because the sizes are the same:
155+
156+
```neut
157+
free(p, 16);
158+
let q = malloc(16);
159+
cont
160+
161+
// ↓
162+
163+
let q = p;
164+
cont
165+
```
166+
167+
The next one can also be optimized because the freed region is larger than the later allocation:
168+
169+
```neut
170+
free(p, 16);
171+
let q = malloc(8);
172+
cont
173+
174+
// ↓
175+
176+
let q = p;
177+
cont
178+
```
179+
180+
### Search Order
181+
182+
When there are multiple possible allocations in the same lowered continuation, the freed region is reused for the earliest suitable allocation.
183+
184+
For example, in the following lowered pseudocode, the region pointed to by `p` is reused for `q`:
185+
186+
```neut
187+
free(p, 16);
188+
let q = malloc(16);
189+
let r = malloc(16);
190+
cont
191+
192+
// ↓
193+
194+
let q = p;
195+
let r = malloc(16);
196+
cont
197+
```
198+
199+
The compiler doesn't skip `q` and reuse `p` for `r`, since `q` is already a suitable allocation.
200+
201+
The compiler prefers exact size matches over merely compatible ones. Thus, in the following case, the region pointed to by `p` is reused for `r`:
202+
203+
```neut
204+
free(p, 16);
205+
let q = malloc(8);
206+
let r = malloc(16);
207+
cont
208+
209+
// ↓
210+
211+
let q = malloc(8);
212+
let r = p;
213+
cont
214+
```
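
The selection rule described in this section can be modeled with a small Python sketch. It is only an illustration of one reading of the text, not the compiler's implementation: the function name `pick_reuse_target` and the representation of a continuation as a list of allocation sizes are hypothetical, and the fallback to the earliest fitting allocation when no exact match exists is an assumption drawn from the examples above.

```python
from typing import Optional, Sequence


def pick_reuse_target(freed_size: int, later_mallocs: Sequence[int]) -> Optional[int]:
    """Return the index of the later malloc that would reuse the freed region.

    `later_mallocs` lists the sizes of the mallocs that follow the free,
    in program order. Returns None when no allocation fits.
    """
    # First preference: the earliest allocation of exactly the freed size.
    for index, size in enumerate(later_mallocs):
        if size == freed_size:
            return index
    # Assumed fallback: the earliest allocation that fits in the freed region.
    for index, size in enumerate(later_mallocs):
        if size <= freed_size:
            return index
    return None
```

With this model, `pick_reuse_target(16, [16, 16])` selects `q` (index 0), and `pick_reuse_target(16, [8, 16])` selects `r` (index 1), matching the two examples in this section.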
148215

149216
### Free-Malloc Canceling and Branching
150217

@@ -166,7 +233,7 @@ define insert(v: int, xs: int-list) -> int-list {
166233
}
167234
```
168235

169-
At point `(X)`, `free` against `xs` is required. However, this `free` can be canceled since `malloc`s of the same size can be found in all the possible branches (here, `(Y)` and `(Z)`). Thus, in the code above, the deallocation of `xs` at `(X)` is removed, and the memory region of `xs` is reused at `(Y)` and `(Z)`, resulting in an in-place update of `xs`.
236+
At point `(X)`, `free` against `xs` is required. However, this `free` can be canceled since suitable `malloc`s can be found in all the reachable branches (here, `(Y)` and `(Z)`). Thus, in the code above, the deallocation of `xs` at `(X)` is removed, and the memory region of `xs` is reused at `(Y)` and `(Z)`, resulting in an in-place update of `xs`.
170237

171238
On the other hand, consider rewriting the code above into something like the following:
172239

@@ -185,7 +252,9 @@ define foo(v: int, xs: int-list) -> int-list {
185252
}
186253
```
187254

188-
At this point, the `free` against `xs` at `(X')` can't be optimized away since there is a branch (namely, `(Y')`) that doesn't perform a `malloc` of the same size as `xs`.
255+
At this point, the `free` against `xs` at `(X')` can't be optimized away since there is a branch (namely, `(Y')`) that doesn't perform a suitable `malloc`.
256+
257+
The same rule is also applied at branch joins. If a later `malloc` appears after a branch, and each reachable branch frees a suitable region before reaching the join, Neut can pass those freed regions through the join and reuse them for that later `malloc`. Unreachable branches do not prevent this optimization.
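
The branching condition for free-malloc canceling can be sketched in Python as follows. This is a simplified model rather than the compiler's actual analysis: the name `can_cancel_free`, the `Branch` representation, and the size `16` used in the usage note are hypothetical, and treating "no reachable branch" as non-cancelable is an assumption of this sketch.

```python
from typing import List, Tuple

# Each branch is (is_reachable, sizes of the mallocs performed in that branch).
Branch = Tuple[bool, List[int]]


def can_cancel_free(freed_size: int, branches: List[Branch]) -> bool:
    """Check whether every reachable branch has a malloc that fits the freed region."""
    reachable = [sizes for is_reachable, sizes in branches if is_reachable]
    # Assumption of this sketch: without a reachable branch, nothing reuses the region.
    if not reachable:
        return False
    # Unreachable branches were filtered out above, so they cannot block the optimization.
    return all(any(size <= freed_size for size in sizes) for sizes in reachable)
```

For instance, `can_cancel_free(16, [(True, [16]), (True, [16])])` holds, as in `insert` where both `(Y)` and `(Z)` allocate, while `can_cancel_free(16, [(True, []), (True, [16])])` does not, as in `foo` where `(Y')` performs no suitable `malloc`.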
189258

190259
## Malloc-Free Canceling
191260

@@ -216,6 +285,45 @@ define foo() -> int {
216285

217286
That is, the compiler removes the `malloc`/`free` pair and uses a stack slot instead.
218287

288+
This optimization can also work through branch joins. If each reachable branch creates a temporary allocation and the joined result is later freed without escaping, those allocations are candidates for stack allocation.
289+
290+
## Order of Memory Optimizations
291+
292+
Malloc-free canceling is applied before free-malloc canceling. For example, consider the following pseudocode:
293+
294+
```neut
295+
let tmp = malloc(8);
296+
store-int(42, tmp);
297+
let n = load-int(tmp);
298+
free(tmp, 8);
299+
free(old, 16);
300+
let result = malloc(16);
301+
cont
302+
```
303+
304+
Malloc-free canceling first rewrites `tmp` into a stack slot:
305+
306+
```neut
307+
let tmp = alloca(8);
308+
store-int(42, tmp);
309+
let n = load-int(tmp);
310+
free(old, 16);
311+
let result = malloc(16);
312+
cont
313+
```
314+
315+
Then, free-malloc canceling rewrites the later allocation so that `result` uses the region pointed to by `old`:
316+
317+
```neut
318+
let tmp = alloca(8);
319+
store-int(42, tmp);
320+
let n = load-int(tmp);
321+
let result = old;
322+
cont
323+
```
324+
325+
These optimizations are applied within each definition. They do not directly cancel a `malloc` in one definition with a `free` in another definition. This is why destination-passing style can matter: it can move the relevant allocation and deallocation into the same definition.
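
The two-pass order described in this section can be illustrated with a toy Python model that operates on a flat list of instructions inside one definition. Everything here is a sketch under stated assumptions: the tuple encoding of instructions, the function names, and the `alias` instruction (standing for `let result = old`) are hypothetical, and the model skips the escape analysis and branch handling that a real compiler would need.

```python
from typing import List, Tuple

# Hypothetical encoding: ("malloc", name, size), ("free", name, size),
# ("alloca", name, size), ("alias", new_name, old_name), or any other tuple.
Instr = Tuple


def malloc_free_cancel(code: List[Instr]) -> List[Instr]:
    """Turn a malloc whose pointer is freed later into a stack slot."""
    out = list(code)
    for i in range(len(out)):
        if i < len(out) and out[i][0] == "malloc":
            _, name, size = out[i]
            matching_free = ("free", name, size)
            # Assumption: the pointer does not escape before it is freed.
            if matching_free in out[i + 1:]:
                out[i] = ("alloca", name, size)
                del out[out.index(matching_free, i + 1)]
    return out


def free_malloc_cancel(code: List[Instr]) -> List[Instr]:
    """Reuse a freed region for a later malloc, preferring an exact size match."""
    out = list(code)
    i = 0
    while i < len(out):
        if out[i][0] == "free":
            _, old, freed = out[i]
            later = [j for j in range(i + 1, len(out)) if out[j][0] == "malloc"]
            exact = [j for j in later if out[j][2] == freed]
            fits = [j for j in later if out[j][2] <= freed]
            candidates = exact or fits
            if candidates:
                target = candidates[0]
                out[target] = ("alias", out[target][1], old)
                del out[i]
                continue  # re-examine the instruction now at index i
        i += 1
    return out


def optimize(code: List[Instr]) -> List[Instr]:
    # Malloc-free canceling runs first, then free-malloc canceling.
    return free_malloc_cancel(malloc_free_cancel(code))
```

Applied to an encoding of the pseudocode above, `optimize` first replaces the `tmp` allocation with an `alloca` and then rewrites `result` to alias `old`, mirroring the two rewriting steps shown in this section.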
326+
219327
## Name Resolution
220328

221329
### Resolving Module Aliases

book/src/static-memory-management.md

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -138,7 +138,9 @@ Using this knowledge, the compiler translates the code so that it reuses the mem
138138
2. Calculate `foo` and `bar`
139139
3. Store the calculated values to `Cons(y, ys)`
140140

141-
In other words, when a `free` is required, the compiler looks for a `malloc` in the continuation that is the same size and optimizes away such a pair if one exists. The resulting assembly code thus performs in-place updates.
141+
In other words, when a `free` is required, the compiler looks for a later `malloc` whose allocation can fit in the freed region and optimizes away such a pair if one exists. The resulting assembly code thus performs in-place updates.
142+
143+
For more precise rules, see [Free-Malloc Canceling](./basis.md#free-malloc-canceling).
142144

143145
## Optimization: Malloc-Free Canceling
144146

@@ -167,6 +169,10 @@ define foo() -> int {
167169
}
168170
```
169171

172+
Malloc-free canceling is applied before free-malloc canceling.
173+
174+
For more precise rules, see [Malloc-Free Canceling](./basis.md#malloc-free-canceling).
175+
170176
## Destination-Passing Style
171177

172178
Malloc-free canceling is especially useful when combined with destination-passing style.
