Skip to content

Add multi module compilation for elm#8076

Merged
devongovett merged 34 commits into
parcel-bundler:v2from
ChristophP:add-multi-module-compilation-for-elm
Jul 31, 2022
Merged

Add multi module compilation for elm#8076
devongovett merged 34 commits into
parcel-bundler:v2from
ChristophP:add-multi-module-compilation-for-elm

Conversation

@ChristophP

@ChristophP ChristophP commented May 9, 2022

Copy link
Copy Markdown
Contributor

Fixes #2508
Fixes #8263

↪️ Pull Request

Add multi module compilation for Elm](58174b5)

The Elm compiler can compile multiple entrypoints into one bundle, which
helps keep the asset size small since they all share the same runtime.

The approach used here has been inspired by the elm plugin for vite: https://github.com/hmsk/vite-plugin-elm#combine-multiple-main-files-experimental-from-270-beta1

💻 Examples

import { Elm } from
'./Main.elm?with=./AnotherModule.elm,./YetAnotherModule.elm'

Elm.Main.init();
Elm.AnotherModule.init();
Elm.YetAnotherModule.init();

🚨 Test instructions

Use a js file like this:

// src/index.js
import { Elm } from "../../../Main.elm?with=./ModuleB.elm";

document.body.innerHTML = '<div id="elm-1"></div><div id="elm-2"></div>'

Elm.Main.init({ node: document.getElementById('elm-1') });
Elm.ModuleB.init({ node: document.getElementById('elm-2') });
module Main exposing (main)

import Browser
import Html exposing (Html, button, div, text)
import Html.Events exposing (onClick)


type Msg
    = Increment
    | Decrement


update msg model =
    case msg of
        Increment ->
           ({ model | count = model.count + 1 }, Cmd.none)

        Decrement ->
           ({ model | count = model.count - 1 }, Cmd.none)


view model =
    div []
        [ button [ onClick Increment ] [ text "+1" ]
        , div [] [ text <| String.fromInt model.count ]
        , button [ onClick Decrement ] [ text "-1" ]
        ]


main =
    Browser.sandbox
        { init = \_ -> ({ count = 0 }, Cmd.none)
        , view = view
        , update = update
        , subscriptions = always Sub.none
        }

and another Elm module

module ModuleB exposing (main)

import Browser
import Html exposing (Html, button, div, text)
import Html.Events exposing (onClick)


type Msg
    = Increment
    | Decrement


update msg model =
    case msg of
        Increment ->
           ({ model | count = model.count + 1 }, Cmd.none)

        Decrement ->
           ({ model | count = model.count - 1 }, Cmd.none)


view model =
    div []
        [ button [ onClick Increment ] [ text "+1" ]
        , div [] [ text <| String.fromInt model.count ]
        , button [ onClick Decrement ] [ text "-1" ]
        ]


main =
    Browser.sandbox
        { init = \_ -> ({ count = 0 }, Cmd.none)
        , view = view
        , update = update
        , subscriptions = always Sub.none
        }

ChristophP added 4 commits May 8, 2022 21:53
The Elm compiler can compile multiple entrypoints into one bundle, which
helps keep the asset size small since they all share the same runtime.

Use like this:
import { Elm } from
'./Main.elm?with=./AnotherModule.elm,./YetAnotherModule.elm'

Elm.Main.init();
Elm.AnotherModule.init();
Elm.YetAnotherModule.init();
@devongovett

Copy link
Copy Markdown
Member

I'm not very familiar with Elm, but what's the difference between doing this vs making a new entry file and importing things there instead? The syntax here of importing files with a query string seems kinda strange to me.

@ChristophP

Copy link
Copy Markdown
Contributor Author

@devongovett Let me give you a bit more background on this. The difference is basically creating smaller bundles.

Multiple compile steps vs a single compile step

import { Elm } from "src/Main.elm";
import { Elm as Elm2 } from "src/Main2.elm";

Elm.Main.init();
Elm2.Main2.init();

and

import { Elm } from "src/Main.elm?with=src/Main2.elm";

Elm.Main.init();
Elm.Main2.init();

is that

  • the former creates two JS assets in two compilation steps, effectively duplicating the Elm runtime + any Elm code that may be shared between the entrypoints and their dependencies.
  • the latter creates one JS asset in a single compilation step, combining multiple Elm entrypoints into the same output file. This way, the Elm runtime + any shared code gets shared in the compiled output between all entrypoints.

When using the elm compiler on the command line those cases be equivalent to this

elm make src/Main.elm --output=elm.js
elm make src/Main2.elm --output=elm2.js
# vs
elm make src/Main.elm src/Main2.elm --output=elm.js

How common is this?

I wouldn't say this is super common, but if anyone wants to do this they couldn't use Elm with parcel right now.

Query string syntax

I agree that it seems a bit odd. I was wondering how to do this and found that this is how it's done with Vite. https://github.com/hmsk/vite-plugin-elm#combine-multiple-main-files-experimental-from-270-beta1
If there's a better way of specifying that I'd be happy to change it.

@ChristophP

Copy link
Copy Markdown
Contributor Author

For reference, webpack handles it like this.
https://github.com/elm-community/elm-webpack-loader#files-default---path-to-required-file
It provides some extra config, but one of the things I love about parcel is that you specify most things where you import it.

@devongovett

Copy link
Copy Markdown
Member

Sure. I guess what I was asking is if you could make a new Elm file that imports all of your entry points and re-exports them somehow, and then import that from JS instead?

@teppix

teppix commented May 17, 2022

Copy link
Copy Markdown

if you could make a new Elm file that imports all of your entry points and re-exports them somehow

Elm-applications are imported by their module names, and for a module to be considered an application, it has to contain a function named main. That means that you can't really define more than one application in the same module.

@ChristophP

Copy link
Copy Markdown
Contributor Author

Yes, what @teppix said. The Elm compiler doesn't allow more than one main per entry point.

@devongovett

Copy link
Copy Markdown
Member

So you can't do something like this?

module Main exposing (ModuleA, ModuleB)

import ModuleA
import ModuleB

and then from JS:

import { Elm } from './Main.elm';

Elm.ModuleA.init();
Elm.ModuleB.init();

I have no idea if this is valid syntax, but that's what I meant. If possible, that seems nicer than doing it in the query string.

@ChristophP

ChristophP commented May 17, 2022

Copy link
Copy Markdown
Contributor Author

Unfortunately not. In order to have an .init() function on the JS side you need to define a main function on the Elm side. Currently Elm only allows one main function per entry point module and it has to be in the topmost module (not imported by any other module).

@devongovett

Copy link
Copy Markdown
Member

Got it. Maybe that would be a good issue to open in Elm then. In the meantime, I guess this is fine. Would you mind adding a test? We'll need to add this to the documentation as well.

@ChristophP

Copy link
Copy Markdown
Contributor Author

@devongovett

Tests

sure, I'll add a test for this.

Docs

I can also prepare some documentation if you'd like. Would that be another PR in this repo over here https://github.com/parcel-bundler/website/blob/v2/src/languages/elm.md ?

Syntax / API design

I've been thinking a bit more on the not-so-perfect syntactical way of using this.

import { Elm } from './Main.elm?with=./AnotherModule.elm,./YetAnotherModule.elm'

When there are more than one extra modules I separate them with a , withing the value of the query param with. This is kind of noisy and hard to read.

I thought maybe this could be improved a little by specifying the with param once for each module, like this.

import { Elm } from './Main.elm?with=./AnotherModule.elm&with=./YetAnotherModule.elm'

and then get those values with asset.query.getAll('with').
The upside is it seem a bit more readable and more natural with how URL params are used normally, the downside is that instead of one extra character per module (,) its 6 (&with=).

Do you have any thoughts on this? Improvement or just as bad?

@devongovett

Copy link
Copy Markdown
Member

Better I think!

@ChristophP ChristophP force-pushed the add-multi-module-compilation-for-elm branch from 3c9eb60 to 4e8eec7 Compare May 20, 2022 18:40
@ChristophP

Copy link
Copy Markdown
Contributor Author

I made the changes regarding multiple with params as well as adding a test. I was still planning on improving error handling when one of the extra sources isn't there but haven't gotten around to it yet. Will hopefully do it this week.

@ChristophP

ChristophP commented May 26, 2022

Copy link
Copy Markdown
Contributor Author

I have a question that I just thought about:

What would happen if someone imported this twice?
Assume Elm code is imported with parcel in two files like this.

// in file1.js
import { Elm } from './Main.elm?with=./AnotherModule.elm&with=./YetAnotherModule.elm';
// in file2
import { Elm } from './Main.elm?with=./AnotherModule.elm&with=./YetAnotherModule.elm';

I would assume the asset is compiled once and imported in both files, right?
Would the same be true if someone did the same but changed the order of the query params?

// in file1.js
import { Elm } from './Main.elm?with=./AnotherModule.elm&with=./YetAnotherModule.elm';
// in file2
import { Elm } from './Main.elm?with=./YetAnotherModule.elm&with=./AnotherModule.elm';

How does parcel treat the specifiers where the only difference is the order of the query params, i.e. when the name is the same, when the queries are the same, when the exact string is the same? Would parcel run the transformer twice and include the compiled JS twice in the resulting bundle?

@mischnic

Copy link
Copy Markdown
Member

Here's where the query comes from in the resolver:

query: url.search ? new URLSearchParams(url.search) : undefined,

and it's then turned into a string to store it in the cache.

query: result.query?.toString(),

It looks like this won't normalize the order. So there will be two assets.

And theoretically you could also write a transformer where this order matters, so there's not much we can do here.

@ChristophP

ChristophP commented May 27, 2022

Copy link
Copy Markdown
Contributor Author

@mischnic thanks for the clarification.

I see how that could be difficult to change, but also surprising at the same time. (i.e. when importing ./image.jpeg?width=100&height=200 vs ./image.jpeg?height=200&width=100 would be different assets).

I wonder if the current solution with the query string is the way to go then, because I could see that leading to unintentionally increasing the bundle size significantly without warning or specifying the path slightly diffently (i.e. no leading ./).

So I'd like to propose a different solution for mutli-module compilation for Elm: In the package.json, one could add something like this:

  "@parcel/transformer-elm": {
    // this means: whenever Main.elm is imported, also compile AnotherModule.elm and YetAnotherModule.elm with it
    "extraSources": {
      "./src/Main.elm": ["./src/AnotherModule.elm", "./src/YetAnotherModule.elm"]
    }
  }

This is more of a static config but it has a few benefits IMO

  1. No pitfalls: It avoids the unintentional asset duplication problem
  2. Cleaner imports: When importing the compiled Elm code multiple times you only have to import { Elm } from "./src/Main.elm"; instead of import { Elm } from './Main.elm?with=./AnotherModule.elm&with=./YetAnotherModule.elm'; every time. Users have a single place to look at which Elm assets contain which modules.
  3. It scales better: Having 10+ modules in an array in the package.json works, while having 10 extra files in the query string becomes unreadable.

Any thoughts on this new approach? If you're in favor of this different approach, would you prefer me closing this PR and opening a new one or add commits to this one?

@ChristophP

ChristophP commented May 29, 2022

Copy link
Copy Markdown
Contributor Author

I went ahead and implemented the approach described above and it works. nicely e716224
Please let me know if you're okay with the approach, then I can add tests and documentation.

@mischnic

Copy link
Copy Markdown
Member

The biggest difference I see between query param and the central config: is there a usecase for importing A.elm?with=B.elm and in some other file importing A.elm?with=C.elm (which wouldn't be possible with the config the with parameter is essentially fixed for all imports)?

Apart from the other advantages you've listed, if the only concern is deduplicating two different but equivalent with lists, then this would work:

transform({asset}) {
    if(isQueryNormalized(asset.query)) {
        return whatTheElmTransformerDoesAtTheMoment(...);
    } else {
        let specifier = "./" + path.filename(asset.filePath) + "?" +
                        // Sort, normalize leading "./", etc.
                        normalizeQuery(asset.query).toString();
        asset.setCode(`export * from "${specifier}";`);
        asset.type = "js";
        return [asset];
    }
}

@ChristophP

ChristophP commented May 29, 2022

Copy link
Copy Markdown
Contributor Author

Hi @mischnic ,

is there a usecase for importing A.elm?with=B.elm and in some other file importing A.elm?with=C.elm (which wouldn't be possible with the config the with parameter is essentially fixed for all imports)?

I don't see one to be honest. The whole point of this multi-module compilation is saving on bundle size. If one needs A.elm and B.elm somewhere and A.elm and C.elm somewhere else, it would be better to have the three of them in one bundle together, because otherwise A.elm would be duplicated, which kinda defeats the purpose of doing mulit-module compilation in the first place.

Apart from the other advantages you've listed, if the only concern is deduplicating two different but equivalent with lists, then this would work:

This is very interesting. Nice trick to sort the query string and turn it into a JS asset with a normalized query. Maybe that would also be good for the image transformer.

Despite of there being a solution to the duplication problem, I think advantages of the config approach outweigh the ones of the query string approach (which is mostly not having to have a config). Especially the fact that the query approach gets really tedious with long file paths and/or many extra sources. So I'm proposing to go with the config solution.

@ChristophP

ChristophP commented Jun 17, 2022

Copy link
Copy Markdown
Contributor Author

So I ended up removing the query string deduplication logic for two reasons

  • I couldn't get it to work as I wanted it in the test. The Main.elm was always shown multiple times in the integration test.
  • The deduplication would have prevented unintentional asset duplication when things are imported more than once with different order of query params. If people use this feature sparingly or with the special resolver like this Add multi module compilation for elm #8076 (comment), unintentional asset duplication won't be a huge deal anyway.

Other than adding documentation for this feature and publishing the resolver package I think the work here is done. Does this look like something we could merge?

@mischnic

Copy link
Copy Markdown
Member

If they're duplicated, then there will be two assets with the filepath Main.elm:

The Main.elm was always shown multiple times in the integration test.

(Sorry, I might have confused you there. There will be multiple assets called Main.elm, but only one of them will be the actual Elm bundle, and all other ones will just contain export * from "...";)

@ChristophP

Copy link
Copy Markdown
Contributor Author

@mischnic

(Sorry, I might have confused you there. There will be multiple assets called Main.elm, but only one of them will be the actual Elm bundle, and all other ones will just contain export * from "...";)

Ah ok, yeah I got confused a bit. Would there be a good way to test that the other ones are just export * from ... then?

If so I could do that and bring back the deduplication. But overall, I guess the deduplication logic is not that important when the recommended way to use this feature is the resolver you described.

@mischnic

Copy link
Copy Markdown
Member

I can't think of a better way than asset.getCode().length < ???.

But yeah, I don't think that the deduplication is essential

@ChristophP

Copy link
Copy Markdown
Contributor Author

Ok, if you're fine not having the deduplication logic then I guess this PR is ready.

Would you like me to create another PR for the documentation?

@devongovett devongovett left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Yes updating the documentation would be really helpful. 😄

@ChristophP

Copy link
Copy Markdown
Contributor Author

Added some docs here parcel-bundler/website#1038

Will still finish up the extra third party resolver package @mischnic was proposing. here #8076 (comment)

@ChristophP

ChristophP commented Jul 31, 2022

Copy link
Copy Markdown
Contributor Author

Thanks @mischnic will look at some of the suggestions

@ChristophP

ChristophP commented Jul 31, 2022

Copy link
Copy Markdown
Contributor Author

I also bumped terser to the latest version to fix some issues around production builds that hang forever
see terser/terser#1212 and passiomatic/elm-designer#58

@mischnic

Copy link
Copy Markdown
Member

You'll need to run yarn to update the lockfile after the Terser bump. And please also bump the version in https://github.com/parcel-bundler/parcel/blob/v2/packages/optimizers/terser/package.json

@devongovett

Copy link
Copy Markdown
Member

Handled terser bump separately

@devongovett devongovett merged commit fca5c8c into parcel-bundler:v2 Jul 31, 2022
@ChristophP ChristophP deleted the add-multi-module-compilation-for-elm branch July 31, 2022 19:52
gorakong pushed a commit that referenced this pull request Nov 3, 2022
* upstream/v2: (22 commits)
  Cross compile toolchains are built into docker image already
  Also fix release build
  Update centos node version
  v2.7.0
  Changelog for v2.7.0
  Use placeholder expression when replacing unused symbols (#8358)
  Lint (#8359)
  Add support for errorRecovery option in @parcel/transformer-css (#8352)
  VS Code Extension for Parcel (#8139)
  Add multi module compilation for elm (#8076)
  Bump terser from 5.7.2 to 5.14.2 (#8322)
  Bump node-forge from 1.2.1 to 1.3.0 (#8271)
  allow cjs config files on type module projects (#8253)
  inject script for hmr when there is only normal script in html (#8330)
  feat: support react refresh for @emotion/react (#8205)
  Update index.d.ts (#8293)
  remove charset from jsloader script set (#8346)
  Log resolved targets in verbose log level (#8254)
  Fix missing module in large app from Experimental Bundler (#8303)
  [Symbol Propagation] Non-deterministic bundle hashes (#8212)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Elm code with Debug.log fails with Javascript error Elm: Bulk compile assets to avoid duplication

4 participants