dependency pinning has always been a funny thing to me because it seems that people take one of two flawed approaches. To be up-front, i'm a type-2.
too strict: "glass menagerie" of exact-pin builds
for each built, published package, include a requirements.txt showing the output of e.g. conda list in a built+activated environment. this lists all packages in the build environment with their exact version and build.
the upside: you can explain the need and motivation to a beginner, and it intuitively makes sense. it feels like a warm blanket of security...
...until the very strong downside of almost-absent flexibility for builds reveals it's actually a strait-jacket. want to update something? time to interact with dependencies-of-dependencies that one definitely has never heard of nor cares about. a user is only able to install such a build if their own environment is compatible with every. dang. thing. in. that. list. literal 0 flexibility.
the next step to compensate in this extreme is to create a large number of unique builds with exactly pinned dependencies. it won't work! now we have N different builds for v X.Y.Z, each with its own set of M restrictions to match exactly, and each one breaks when they are not matched. Hence, "glass menagerie".
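to make the brittleness concrete, here's a minimal sketch. the pin list, the libfoo package, and all version numbers are made up for illustration, not taken from any real build. one transitive dependency the user has never heard of differs by a patch release, and the whole build is uninstallable:

```python
# hypothetical exact-pin list, as might be captured from `conda list` at build time
EXACT_PINS = {
    "numpy": "1.26.4",
    "netcdf4": "1.7.1",
    "cmor": "3.15.0",
    "libfoo": "2.3.1",  # made-up transitive dependency
}


def unmet_exact_pins(pins, env):
    """Return the sorted names of packages whose installed version differs from the pin."""
    return sorted(name for name, version in pins.items() if env.get(name) != version)


# the user's environment matches everything except one patch release of libfoo
user_env = dict(EXACT_PINS, libfoo="2.3.2")

print(unmet_exact_pins(EXACT_PINS, user_env))
```

with M pins per build, any one of the M entries can block the install, which is why multiplying the number of builds doesn't fix anything.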
too flexible: "surprise!" noarch builds
in this approach we correctly assume the environment resolver exists for good reasons, but overly rely on it across all settings: developer, user, and package channel. no versions in the reqs are capped, so we are aware of every possible update as soon as the CI runs.
the upside is flexibility and a "heads-up" structurally provided by having to resolve the environment every time: if it doesn't resolve, something changed! that's a very strong and reliable signal to "dig-in" for developers. users benefit too, they experience easy installations across a wide variety of potential environments.
the downside shows up when something changes upstream, and neither the CI nor a developer working with the unpinned builds/code has found the problem yet. without a ceiling on certain major packages, things may break without warning on the user end, and users will not have much to go on to figure out what went wrong.
i.e., the versioned/packaged build breaks, saying, "surprise!" And the confused user is left there saying, "this worked yesterday...". for versioned packages that we strive to bring to a high level of professional quality, this will not remain acceptable over time, even if we are getting away with it right now.
just right: "goldilocks" noarch with ranged dependency pins
a balanced approach in the middle would accept that we keep a single, universal noarch build on the conda-forge channel, but we set safe upper AND lower boundaries on the most important/fragile dependencies the package relies on, letting others "float" un-pinned in the build for the sake of both users' and developers' sanity. the environment.yaml should probably remain a super-set of the last versioned release, i.e. un-bound for the CI, where something can safely tank and let us know that's the case.
an example to illustrate the point: fremor is very dependent on netcdf4 and cmor, so we could make a cmor>=3.15.0,<4.0.0 + netcdf4>=1.7,<2.0 build, leaving all other requirements the same. we could scan over all minor-versioned combos of the two for numpy=2.* and then again for numpy=1.*.
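a rough sketch of what that scan could look like. the two ranges come from the example above. the candidate version lists are hypothetical placeholders (a real scan would ask the channel what actually exists), and the tuple-based comparison only handles plain numeric versions, so it is no substitute for a real version parser:

```python
from itertools import product


def parse(version):
    """Turn '3.15.0' into (3, 15, 0) for tuple comparison (plain numeric versions only)."""
    return tuple(int(part) for part in version.split("."))


def in_range(version, lower, upper):
    """True if lower <= version < upper, i.e. '>=lower,<upper'."""
    return parse(lower) <= parse(version) < parse(upper)


# ranges from the example: cmor>=3.15.0,<4.0.0 and netcdf4>=1.7,<2.0
RANGES = {"cmor": ("3.15.0", "4.0.0"), "netcdf4": ("1.7", "2.0")}

# hypothetical candidate versions, for illustration only
CANDIDATES = {
    "cmor": ["3.14.0", "3.15.0", "3.16.0", "4.0.0"],
    "netcdf4": ["1.6", "1.7", "1.8", "2.0"],
}
NUMPY_SERIES = ["1.*", "2.*"]


def build_matrix():
    """List every (cmor, netcdf4, numpy series) combo that falls inside the ranges."""
    allowed = {
        name: [v for v in CANDIDATES[name] if in_range(v, *RANGES[name])]
        for name in RANGES
    }
    return list(product(allowed["cmor"], allowed["netcdf4"], NUMPY_SERIES))


for combo in build_matrix():
    print(combo)
```

each combo would then get its own CI environment. the matrix grows multiplicatively, which is one reason to bound only the handful of dependencies that really matter.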
upside: we get the best of both extremes above.
downside: it takes considered and targeted effort to figure out which reqs should be bounded, and how wide/narrow those safe version ranges should be.