@@ -144,7 +145,73 @@ However, since the size of `Cons(x, rest)` and `Cons(add-int(x, 1), increment(re
 2. calculate `add-int(x, 1)` and `increment(rest)`
 3. store the calculated values to `xs` (overwrite)

-Neut performs this optimization. When a `free` is required, Neut looks for a `malloc` of the same size and optimizes away such a pair if one exists. The resulting assembly code thus performs in-place updates.
+Neut performs this optimization. When a `free` is required, Neut looks for a later `malloc` whose allocation can fit in the freed region and optimizes away such a pair if one exists. The resulting assembly code thus performs in-place updates.
+
+### Size Matching
+
+A known-size `free` can be canceled with a later known-size `malloc` when the freed region is large enough for the allocation.
+
+For example, the following lowered pseudocode can be optimized because the sizes are the same:
+
+```neut
+free(p, 16);
+let q = malloc(16);
+cont
+
+// ↓
+
+let q = p;
+cont
+```
+
+The next one can also be optimized because the freed region is larger than the later allocation:
+
+```neut
+free(p, 16);
+let q = malloc(8);
+cont
+
+// ↓
+
+let q = p;
+cont
+```
+
+### Search Order
+
+When there are multiple possible allocations in the same lowered continuation, the freed region is reused for the earlier suitable allocation.
+
+For example, in the following lowered pseudocode, the region pointed to by `p` is reused for `q`:
+
+```neut
+free(p, 16);
+let q = malloc(16);
+let r = malloc(16);
+cont
+
+// ↓
+
+let q = p;
+let r = malloc(16);
+cont
+```
+
+The compiler doesn't skip `q` and reuse `p` for `r`, since `q` is already a suitable allocation.
+
+The compiler prefers exact size matches over merely compatible ones. Thus, in the following case, the region pointed to by `p` is reused for `r`:
-At point `(X)`, `free` against `xs` is required. However, this `free` can be canceled since `malloc`s of the same size can be found in all the possible branches (here, `(Y)` and `(Z)`). Thus, in the code above, the deallocation of `xs` at `(X)` is removed, and the memory region of `xs` is reused at `(Y)` and `(Z)`, resulting in an in-place update of `xs`.
+At point `(X)`, `free` against `xs` is required. However, this `free` can be canceled since suitable `malloc`s can be found in all the reachable branches (here, `(Y)` and `(Z)`). Thus, in the code above, the deallocation of `xs` at `(X)` is removed, and the memory region of `xs` is reused at `(Y)` and `(Z)`, resulting in an in-place update of `xs`.

 On the other hand, consider rewriting the code above into something like the following:
-At this point, the `free` against `xs` at `(X')` can't be optimized away since there is a branch (namely, `(Y')`) that doesn't perform a `malloc` of the same size as `xs`.
+At this point, the `free` against `xs` at `(X')` can't be optimized away since there is a branch (namely, `(Y')`) that doesn't perform a suitable `malloc`.
+
+The same rule is also applied at branch joins. If a later `malloc` appears after a branch, and each reachable branch frees a suitable region before reaching the join, Neut can pass those freed regions through the join and reuse them for that later `malloc`. Unreachable branches do not prevent this optimization.

 ## Malloc-Free Canceling

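The size-matching, search-order, and branch rules added in the hunk above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not Neut's implementation: the names `pick_reuse_target` and `can_cancel_free` are hypothetical, a continuation is reduced to the list of its `malloc`s in program order, and the way "prefer exact matches" combines with "earliest suitable allocation" is my reading of the text.

```python
from typing import List, Optional, Tuple

# Hypothetical model: one (variable, size) pair per `malloc`, in program order.
Malloc = Tuple[str, int]


def pick_reuse_target(freed_size: int, mallocs: List[Malloc]) -> Optional[str]:
    """Return the variable whose `malloc` would reuse a freed region, if any."""
    # Assumption: exact size matches are preferred; take the earliest one.
    for var, size in mallocs:
        if size == freed_size:
            return var
    # Otherwise fall back to the earliest allocation that fits in the region.
    for var, size in mallocs:
        if size <= freed_size:
            return var
    return None


def can_cancel_free(freed_size: int, branches: List[List[Malloc]]) -> bool:
    """A `free` before a branch is cancelable only if every reachable
    branch contains a suitable `malloc`."""
    return all(pick_reuse_target(freed_size, b) is not None for b in branches)


# Two equally sized allocations: the earlier one is chosen.
print(pick_reuse_target(16, [("q", 16), ("r", 16)]))  # q
# A smaller allocation comes first, but the exact match wins.
print(pick_reuse_target(16, [("q", 8), ("r", 16)]))  # r
# One branch has no suitable malloc, so the free must stay.
print(can_cancel_free(16, [[("y", 16)], []]))  # False
```

The two-pass search keeps the "exact first, then earliest compatible" preference explicit. A real compiler pass would also have to track known versus unknown sizes and reachability, which this model leaves out.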
@@ -216,6 +285,45 @@ define foo() -> int {

 That is, the compiler removes the `malloc`/`free` pair and uses a stack slot instead.

+This optimization can also work through branch joins. If each reachable branch creates a temporary allocation and the joined result is later freed without escaping, those allocations are candidates for stack allocation.
+
+## Order of Memory Optimizations
+
+Malloc-free canceling is applied before free-malloc canceling. For example, consider the following pseudocode:
+
+```neut
+let tmp = malloc(8);
+store-int(42, tmp);
+let n = load-int(tmp);
+free(tmp, 8);
+free(old, 16);
+let result = malloc(16);
+cont
+```
+
+Malloc-free canceling first rewrites `tmp` into a stack slot:
+
+```neut
+let tmp = alloca(8);
+store-int(42, tmp);
+let n = load-int(tmp);
+free(old, 16);
+let result = malloc(16);
+cont
+```
+
+Then, free-malloc canceling rewrites the later allocation so that `result` uses the region pointed to by `old`:
+
+```neut
+let tmp = alloca(8);
+store-int(42, tmp);
+let n = load-int(tmp);
+let result = old;
+cont
+```
+
+These optimizations are applied within each definition. They do not directly cancel a `malloc` in one definition with a `free` in another definition. This is why destination-passing style can matter: it can move the relevant allocation and deallocation into the same definition.
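To make the pass ordering in the hunk above concrete, the following Python sketch replays the `tmp`/`old`/`result` example on a toy instruction list. Everything here is an assumption made for illustration: the tuple encoding, the function names, and the simplification that `tmp` never escapes. The toy free-malloc pass also takes the first allocation that fits and ignores the exact-match preference.

```python
def malloc_free_canceling(code):
    """Turn a malloc whose matching free appears later into a stack slot.

    Assumption: the allocated variable does not escape in between.
    """
    code = list(code)
    i = 0
    while i < len(code):
        op, *args = code[i]
        if op == "malloc":
            var, size = args
            free = ("free", var, size)
            if free in code[i + 1:]:
                j = code.index(free, i + 1)
                code[i] = ("alloca", var, size)
                del code[j]
        i += 1
    return code


def free_malloc_canceling(code):
    """Reuse a freed region for the first later malloc that fits in it."""
    code = list(code)
    i = 0
    while i < len(code):
        op, *args = code[i]
        if op == "free":
            ptr, size = args
            for j in range(i + 1, len(code)):
                if code[j][0] == "malloc" and code[j][2] <= size:
                    code[j] = ("let", code[j][1], ptr)  # let var = ptr
                    del code[i]
                    i -= 1  # re-examine whatever slid into slot i
                    break
        i += 1
    return code


program = [
    ("malloc", "tmp", 8),
    ("store-int", 42, "tmp"),
    ("load-int", "n", "tmp"),
    ("free", "tmp", 8),
    ("free", "old", 16),
    ("malloc", "result", 16),
]

# Apply the passes in the order stated above.
optimized = free_malloc_canceling(malloc_free_canceling(program))
for instr in optimized:
    print(instr)
```

Under these assumptions the result has no `malloc` or `free` left: `tmp` becomes an `alloca` and `result` is bound to `old`, matching the final pseudocode in the hunk.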
book/src/static-memory-management.md (7 additions & 1 deletion)
@@ -138,7 +138,9 @@ Using this knowledge, the compiler translates the code so that it reuses the mem
 2. Calculate `foo` and `bar`
 3. Store the calculated values to `Cons(y, ys)`

-In other words, when a `free` is required, the compiler looks for a `malloc` in the continuation that is the same size and optimizes away such a pair if one exists. The resulting assembly code thus performs in-place updates.
+In other words, when a `free` is required, the compiler looks for a later `malloc` whose allocation can fit in the freed region and optimizes away such a pair if one exists. The resulting assembly code thus performs in-place updates.
+
+For more precise rules, see [Free-Malloc Canceling](./basis.md#free-malloc-canceling).

 ## Optimization: Malloc-Free Canceling

@@ -167,6 +169,10 @@ define foo() -> int {
 }
 ```

+Malloc-free canceling is applied before free-malloc canceling.
+
+For more precise rules, see [Malloc-Free Canceling](./basis.md#malloc-free-canceling).
+
 ## Destination-Passing Style

 Malloc-free canceling is especially useful when combined with destination-passing style.