Merged
65 changes: 49 additions & 16 deletions design.md
Original file line number Diff line number Diff line change
Expand Up @@ -872,8 +872,8 @@ concrete monomorphic dispatcher type has already determined the owner.

A numeric literal whose target type is a non-builtin nominal type converts
through that type's `from_numeral` method, and a string literal converts
through `from_quote` (receiving the literal's post-escape UTF-8 bytes as
`List(U8)`). Every such conversion with a concrete target type is a
through `from_quote` (receiving the literal's post-escape contents as `Str`).
Every such conversion with a concrete target type is a
compile-time root (`numeral_conversion` / `quote_conversion`), no matter
where the literal sits in the AST: checking finalization evaluates the raw
dispatch call, stores its `Try` result through `ConstStore`, unwraps `Ok` into
Expand Down Expand Up @@ -901,20 +901,53 @@ encoding.

### String Interpolation

An interpolated string literal is canonicalization sugar. It desugars into
ordinary CIR: the interpolated expressions bind to locals in source order,
each literal segment stays a real string literal (so each converts through
`from_quote`), and the result is
`seg0.from_interpolation([].iter().prepended((interp_n, seg_n+1))...)` — the
iterator yields each interpolated value paired with the literal segment that
follows it. `from_interpolation : val, Iter((interpolated, val)) -> val` is an
ordinary method: each implementing type chooses its `interpolated` type (`Str`
chooses `Str`; a `Url`-style type can interpolate `Str` rather than itself),
and a type that needs to validate assembled values simply does not implement
it. The synthesized call node is recorded so checking unifies the call result
with the receiver, pinning the literal's target type from the use site before
string defaulting runs. No post-canonicalization stage knows interpolation
exists.
An interpolated string literal is its own CIR expression. It is not
desugared as receiver method-call syntax, because interpolation method
selection is owned by the expression result type, not by the first literal
segment. The interpolated expressions bind to locals in source order. Literal
segments are always builtin `Str` values, and the interpolation expression
passes the first segment plus an `Iter((interpolated, Str))` of the remaining
interpolated values paired with the literal segment that follows each one.
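
The shape described above can be modelled outside the compiler. The following is a minimal Python sketch, not the compiler's code: `prepended`, `build_rest`, and `str_from_interpolation` are hypothetical names that mirror `Iter.prepended`, the synthesized iterator chain, and `Str.from_interpolation`. It assumes every interpolated value is already a string, and it shows why prepending pairs back to front yields them in source order.

```python
def prepended(item, rest):
    """Yield `item` first, then everything `rest` yields (models Iter.prepended)."""
    yield item
    yield from rest


def build_rest(interpolated, following_segments):
    """Build the iterator of (interpolated, following_segment) pairs.

    Pairs are prepended back to front onto an empty iterator, so the
    finished chain yields them in source order.
    """
    chain = iter(())  # models `[].iter()`
    pairs = list(zip(interpolated, following_segments))
    for pair in reversed(pairs):
        chain = prepended(pair, chain)
    return chain


def str_from_interpolation(first, rest):
    """Assemble the final string from the first segment and the pairs."""
    out = first
    for value, segment in rest:
        out += value + segment
    return out


# Model of "a${x}b${y}c" with x = "X" and y = "Y".
result = str_from_interpolation("a", build_rest(["X", "Y"], ["b", "c"]))
```

Here the first segment is `"a"` and the remaining pairs are `("X", "b")` and `("Y", "c")`, matching the pairing of each interpolated value with the literal segment that follows it.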

For an unsuffixed interpolation, checking gives the expression this type:

```roc
val where [
val.from_interpolation : Str, Iter((_interpolated, Str)) -> val,
]
```

The static dispatch owner is `val`, the interpolation result type. If `val`
remains unconstrained, it defaults to `Str`, which selects:

```roc
Str.from_interpolation : Str, Iter((Str, Str)) -> Str
```
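
As a rough illustration of result-owned dispatch with defaulting, the Python sketch below picks the owner of `from_interpolation` from the expected result type and falls back to a `Str` stand-in when nothing constrains it. All names here (`Str`, `Shout`, `check_interpolation`) are hypothetical and only model the rule; the real selection happens during type checking, not at runtime.

```python
class Str:
    """Stand-in for the builtin Str type."""

    @staticmethod
    def from_interpolation(first, rest):
        return first + "".join(value + segment for value, segment in rest)


class Shout:
    """Hypothetical custom type that upper-cases interpolated values."""

    def __init__(self, text):
        self.text = text

    @staticmethod
    def from_interpolation(first, rest):
        body = "".join(value.upper() + segment for value, segment in rest)
        return Shout(first + body)


def check_interpolation(first, rest, expected_type=None):
    """Dispatch on the result type; an unconstrained result defaults to Str."""
    owner = expected_type if expected_type is not None else Str
    return owner.from_interpolation(first, rest)


plain = check_interpolation("hi ", [("roc", "!")])
loud = check_interpolation("hi ", [("roc", "!")], expected_type=Shout)
```

The first literal segment plays no part in choosing the owner: both calls pass the same `first` and `rest`, and only the expected result type differs.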

Types that want checked interpolation through `Try` implement their own
`from_interpolation` and rely on `Try` forwarding:

```roc
Try.from_interpolation : Str, Iter((interpolated, Str)) -> Try(ok, err)
where [
ok.from_interpolation : Str, Iter((interpolated, Str)) -> Try(ok, err),
]
```

For a suffixed interpolation such as `"a${x}b".Regex`, the suffix is not a
static-dispatch owner. It is a direct associated-function call to
`Regex.from_interpolation`; the function's argument types constrain the
literal segments and interpolated expressions, and the function's return type is
the type of the whole interpolation expression. Missing suffixed interpolation
functions are reported as missing associated functions on the suffix type.
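
The suffixed case can be sketched the same way. In this hedged Python model the suffix names a type directly and the call goes straight to that type's `from_interpolation`; a type without one produces a missing-function error. `Regex`, `Plain`, and `call_suffixed` are illustrative names, and the `LookupError` merely stands in for the compiler diagnostic.

```python
class Regex:
    """Hypothetical suffix type with an associated from_interpolation."""

    def __init__(self, pattern):
        self.pattern = pattern

    @staticmethod
    def from_interpolation(first, rest):
        return Regex(first + "".join(value + segment for value, segment in rest))


class Plain:
    """Hypothetical suffix type that does not define from_interpolation."""


def call_suffixed(suffix_type, first, rest):
    """Call suffix_type.from_interpolation directly, with no dispatch on the result."""
    fn = getattr(suffix_type, "from_interpolation", None)
    if fn is None:
        # Stands in for the "missing associated function" diagnostic.
        raise LookupError(
            f"{suffix_type.__name__} has no associated function from_interpolation"
        )
    return fn(first, rest)


regex = call_suffixed(Regex, "a", [("x", "b")])
```

Calling `call_suffixed(Plain, "a", [])` raises the error, which mirrors reporting the problem against the suffix type rather than against the literal.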

Interpolation deliberately does not parameterize literal segments over an
arbitrary `literal` type with a `literal.from_quote` constraint. That design
would defer quoted-segment conversion errors until monomorphic specializations
are known. `roc check` must report all compile-time conversion errors without
monomorphizing the program, so interpolation segments use builtin `Str`
directly. Normal non-interpolated quoted literals still convert through
`from_quote` as described above.

## Shared Post-Check Model

Expand Down
32 changes: 19 additions & 13 deletions src/build/roc/Builtin.roc
Expand Up @@ -333,20 +333,15 @@ Builtin :: [].{
## ```
from_utf8 : List(U8) -> Try(Str, [BadUtf8({ problem : Str.Utf8Problem, index : U64 }), ..])

## Converts the UTF-8 bytes of a string literal to a [Str].
## Converts a string literal to a [Str].
##
## The compiler calls this when a string literal's type is [Str], passing
## the literal's bytes after escape processing. It can also be called
## directly, in which case invalid UTF-8 returns `Err`.
## the literal's contents after escape processing.
## ```roc
## expect Str.from_quote([82, 111, 99]) == Ok("Roc")
## expect Str.from_quote([255]).is_err()
## expect Str.from_quote("Roc") == Ok("Roc")
## ```
from_quote : List(U8) -> Try(Str, [BadQuotedBytes(Str)])
from_quote = |bytes| match Str.from_utf8(bytes) {
Ok(str) => Ok(str)
Err(_) => Err(BadQuotedBytes("the bytes were not valid UTF-8"))
}
from_quote : Str -> Try(Str, [BadQuotedBytes(Str)])
from_quote = |str| Ok(str)

## Assembles an interpolated string literal.
##
Expand Down Expand Up @@ -490,9 +485,8 @@ Builtin :: [].{
## Returns an iterator that yields the given item first, followed by
## everything the given iterator yields.
##
## The compiler uses this to assemble the iterator it passes to a type's
## `from_interpolation` method when an interpolated string literal
## targets that type.
## The compiler uses this to assemble the iterator it passes to
## `from_interpolation` when checking an interpolated string literal.
## ```roc
## expect Iter.fold([2, 3].iter().prepended(1), [], |acc, item| acc.append(item)) == [1, 2, 3]
## ```
Expand Down Expand Up @@ -1899,6 +1893,18 @@ Builtin :: [].{
Err(_) => True
}

## Forwards interpolated string literal assembly through an inner type
## whose `from_interpolation` method returns the same [Try].
from_interpolation : Str, Iter((interpolated, Str)) -> Try(ok, err)
where [
ok.from_interpolation : Str, Iter((interpolated, Str)) -> Try(ok, err),
]
from_interpolation = |first, rest| {
OkType : ok

OkType.from_interpolation(first, rest)
}

## If the result is `Ok`, returns the value it holds. Otherwise, returns
## the given default value.
##
Expand Down
61 changes: 38 additions & 23 deletions src/canonicalize/Can.zig
Original file line number Diff line number Diff line change
Expand Up @@ -12425,25 +12425,24 @@ fn processEscapeSequences(allocator: std.mem.Allocator, input: []const u8) std.m
return result.toOwnedSlice(allocator);
}

/// Helper function to create a string literal expression and add it to the scratch stack
/// Desugar an interpolated string literal into ordinary CIR:
/// Canonicalize an interpolated string literal.
///
/// ```roc
/// "a${x}b${y}c"
/// ```
/// becomes
/// becomes a block which evaluates interpolated expressions in source order and
/// finishes with a result-owned interpolation dispatch:
/// ```roc
/// {
/// #interp_0 = x
/// #interp_1 = y
/// "a".from_interpolation([].iter().prepended((#interp_1, "c")).prepended((#interp_0, "b")))
/// <interpolation first="a" rest=[].iter().prepended((#interp_1, "c")).prepended((#interp_0, "b"))>
/// }
/// ```
/// The interpolated expressions bind to locals first so they evaluate in
/// source order; the iterator yields each interpolated value paired with the
/// literal segment that follows it. Every literal segment stays a real string
/// literal expression, so each converts through `from_quote` like any other,
/// and the receiver's type suffix (when present) pins the target type.
/// literal `Str` segment that follows it. With a type suffix, the final
/// expression is a direct call to `Suffix.from_interpolation(first, rest)`.
fn desugarInterpolatedString(
self: *Self,
span: CIR.Expr.Span,
Expand Down Expand Up @@ -12508,7 +12507,8 @@ fn desugarInterpolatedString(
}
const stmts_span = try self.env.store.statementSpanFrom(stmts_top);

// Wrap each literal segment in its own string expression.
// Interpolation segments are always builtin Str, so keep them as raw
// string-segment expressions instead of wrapping them in quote literals.
const seg_exprs = try gpa.alloc(Expr.Idx, segments.items.len);
defer gpa.free(seg_exprs);
for (segments.items, 0..) |maybe_segment, i| {
Expand All @@ -12518,18 +12518,7 @@ fn desugarInterpolatedString(
.literal = empty_literal,
} }, region);
};
const seg_region = self.env.store.getNodeRegion(ModuleEnv.nodeIdxFrom(segment_idx));
const seg_scratch_top = self.env.store.scratchExprTop();
try self.env.store.addScratchExpr(segment_idx);
const seg_span = try self.env.store.exprSpanFrom(seg_scratch_top);
seg_exprs[i] = try self.env.addExpr(CIR.Expr{ .e_str = .{
.span = seg_span,
} }, seg_region);
}

// The receiver's type suffix (e.g. `"a${x}b".MyType`) pins the target type.
if (type_ident) |suffix_ident| {
try self.recordTypedNumericSuffix(seg_exprs[0], suffix_ident);
seg_exprs[i] = segment_idx;
}

const iter_method = try self.env.insertIdent(Ident.for_text("iter"));
Expand All @@ -12539,6 +12528,8 @@ fn desugarInterpolatedString(
// [].iter()
const empty_list_idx = try self.env.addExpr(CIR.Expr{ .e_empty_list = .{} }, region);
var chain_idx = try self.addSyntheticMethodCall(empty_list_idx, iter_method, &.{}, region);
const part_exprs = try gpa.alloc(Expr.Idx, interps.items.len * 2);
defer gpa.free(part_exprs);

// Prepend (interpolation, following-segment) pairs back to front so the
// iterator yields them in source order.
Expand All @@ -12549,6 +12540,8 @@ fn desugarInterpolatedString(
const tmp_lookup_idx = try self.env.addExpr(CIR.Expr{ .e_lookup_local = .{
.pattern_idx = tmp_patterns[pair_i],
} }, interp_region);
part_exprs[pair_i * 2] = tmp_lookup_idx;
part_exprs[pair_i * 2 + 1] = seg_exprs[pair_i + 1];
const elems_top = self.env.store.scratchExprTop();
try self.env.store.addScratchExpr(tmp_lookup_idx);
try self.env.store.addScratchExpr(seg_exprs[pair_i + 1]);
Expand All @@ -12558,13 +12551,35 @@ fn desugarInterpolatedString(
} }, interp_region);
chain_idx = try self.addSyntheticMethodCall(chain_idx, prepended_method, &.{pair_idx}, interp_region);
}
const parts_span = try self.env.store.appendExprSpan(part_exprs);

const final_idx = if (type_ident) |suffix_ident| suffix_blk: {
const fn_expr = try self.canonicalizeTypeAssociatedLookup(suffix_ident, from_interpolation_method, region) orelse
try self.canonicalizedMalformedExpr(Diagnostic{ .undeclared_type = .{
.name = suffix_ident,
.region = region,
} });

const args_top = self.env.store.scratchExprTop();
try self.env.store.addScratchExpr(seg_exprs[0]);
try self.env.store.addScratchExpr(chain_idx);
const args_span = try self.env.store.exprSpanFrom(args_top);

const call_idx = try self.addSyntheticMethodCall(seg_exprs[0], from_interpolation_method, &.{chain_idx}, region);
try self.env.recordInterpolationCallNode(ModuleEnv.nodeIdxFrom(call_idx));
break :suffix_blk try self.env.addExpr(CIR.Expr{ .e_call = .{
.func = fn_expr.idx,
.args = args_span,
.called_via = CalledVia.apply,
} }, region);
} else try self.env.addExpr(CIR.Expr{ .e_interpolation = .{
.first = seg_exprs[0],
.parts = parts_span,
.rest = chain_idx,
.method_name_region = region,
} }, region);

return try self.env.addExpr(CIR.Expr{ .e_block = .{
.stmts = stmts_span,
.final_expr = call_idx,
.final_expr = final_idx,
} }, region);
}

Expand Down
6 changes: 6 additions & 0 deletions src/canonicalize/DependencyGraph.zig
Original file line number Diff line number Diff line change
Expand Up @@ -193,6 +193,12 @@ fn collectExprDependencies(
}
try pending.append(stack_allocator, call.receiver);
},
.e_interpolation => |interpolation| {
for (cir.store.sliceExpr(interpolation.parts)) |part_idx| {
try pending.append(stack_allocator, part_idx);
}
try pending.append(stack_allocator, interpolation.first);
},
.e_structural_eq => |eq| {
try pending.append(stack_allocator, eq.rhs);
try pending.append(stack_allocator, eq.lhs);
Expand Down
49 changes: 49 additions & 0 deletions src/canonicalize/Expression.zig
Original file line number Diff line number Diff line change
Expand Up @@ -357,6 +357,25 @@ pub const Expr = union(enum) {
args: Expr.Span,
constraint_fn_var: TypeVar,
},
/// Compiler-created interpolation dispatch.
///
/// Unlike an ordinary method call, this dispatch is owned by the result
/// type of the whole interpolation expression. Runtime arguments are the
/// first `Str` segment and the iterator of interpolated values paired with
/// following `Str` segments.
e_interpolation: struct {
first: Expr.Idx,
/// Flat `(interpolated, following_segment)` pairs. The span length is
/// always even, with `following_segment` expressions already typed as
/// builtin `Str` segments.
parts: Expr.Span,
/// Synthetic iterator chain for the custom-dispatch path. The builtin
/// `Str` path consumes `parts` directly and does not check or lower
/// this expression.
rest: Expr.Idx,
method_name_region: base.Region,
constraint_fn_var: ?TypeVar = null,
},
/// Structural equality chosen explicitly by the checker.
///
/// This is not method dispatch. It represents the semantic case where
Expand Down Expand Up @@ -1261,6 +1280,36 @@ pub const Expr = union(enum) {

try tree.endNode(begin, attrs);
},
.e_interpolation => |e| {
const begin = tree.beginNode();
try tree.pushStaticAtom("e-interpolation");
const region = ir.store.getExprRegion(expr_idx);
try ir.appendRegionInfoToSExprTreeFromRegion(tree, region);
if (e.constraint_fn_var) |constraint_fn_var| {
try tree.pushU64Pair("constraint-fn-var", @intFromEnum(constraint_fn_var));
}
const attrs = tree.beginNode();

{
const first_begin = tree.beginNode();
try tree.pushStaticAtom("first");
const first_attrs = tree.beginNode();
try ir.store.getExpr(e.first).pushToSExprTree(ir, tree, e.first);
try tree.endNode(first_begin, first_attrs);
}

{
const parts_begin = tree.beginNode();
try tree.pushStaticAtom("parts");
const parts_attrs = tree.beginNode();
for (ir.store.sliceExpr(e.parts)) |part_idx| {
try ir.store.getExpr(part_idx).pushToSExprTree(ir, tree, part_idx);
}
try tree.endNode(parts_begin, parts_attrs);
}

try tree.endNode(begin, attrs);
},
.e_structural_eq => |e| {
const begin = tree.beginNode();
try tree.pushStaticAtom("e-structural-eq");
Expand Down