Skip to content

Add MDL Compiler dependency for [genmdl] tests#2254

Merged
jstone-lucasfilm merged 127 commits intoAcademySoftwareFoundation:mainfrom
krohmerNV:krohmer/mdl_sdk_integration
Nov 6, 2025
Merged

Add MDL Compiler dependency for [genmdl] tests#2254
jstone-lucasfilm merged 127 commits intoAcademySoftwareFoundation:mainfrom
krohmerNV:krohmer/mdl_sdk_integration

Conversation

@krohmerNV
Copy link
Copy Markdown
Contributor

@krohmerNV krohmerNV commented Feb 28, 2025

Extend test for MaterialXGenMdl.

This commit extends source/MaterialXTest/MaterialXGenMdl as a test for
source/MaterialXGenMdl, plus the necessary work on the cmake scripts and github
actions to add the MDL SDK as a dependency.

New cmake variables:

  • MATERIALX_MDL_SDK_DIR: if set and mdlc is found the test in
    source/MaterialXTest/MaterialXGenMdl is extended to run mdlc on the generated
    MDL files.
  • MATERIALX_MDL_BINARY_MDLC: name of the mdlc binary, derived from MATERIALX_MDL_SDK_DIR

The CI tests use vcpkg to install the MDL SDK, but a manually built one also
works (alternative for developers).

To avoid rebuilding the MDL SDK with every MR, the github actions use the
github cache to reuse binaries from earlier MRs. The ports glslang[tools] and
openimageio from other steps are also included in that cache.

Right now, the MDL SDK dependency is enabled only for Windows_VS2022_x64_Python313 in the extended build.

@krohmerNV krohmerNV force-pushed the krohmer/mdl_sdk_integration branch from 2cec37e to a3e9b36 Compare March 4, 2025 15:14
@krohmerNV krohmerNV force-pushed the krohmer/mdl_sdk_integration branch from a3e9b36 to 9ef10f1 Compare March 11, 2025 11:29
@krohmerNV krohmerNV force-pushed the krohmer/mdl_sdk_integration branch from 53174b2 to 0ecc251 Compare April 22, 2025 15:41
jstone-lucasfilm pushed a commit that referenced this pull request Jun 15, 2025
This PR is mostly about the added tests materials.

- MDL should be correct as far as I can tell. SSS is implemented using a random walk in the path tracer (which will be added to #2254) 
- For the GLSL implementation the radius seems to not have a big impact which might be okay due to the real-time approximation. But we are wondering about  the `max(mfp, 0.1)` [here ](https://github.com/AcademySoftwareFoundation/MaterialX/blob/0e6b5bfb46c7d3c4ad804e4394c8333d740e6d01/libraries/pbrlib/genglsl/lib/mx_microfacet_diffuse.glsl#L175). This means at least 10cm radius if the scene is modelled in meters, right?
- OSL matches GLSL but I haven't checked the source.
@linux-foundation-easycla
Copy link
Copy Markdown

linux-foundation-easycla Bot commented Jun 16, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@krohmerNV krohmerNV force-pushed the krohmer/mdl_sdk_integration branch 2 times, most recently from 1d11bcd to 20c58fe Compare June 26, 2025 13:51
@krohmerNV krohmerNV force-pushed the krohmer/mdl_sdk_integration branch from dd499f4 to e3003a0 Compare June 30, 2025 09:37
Comment thread .github/workflows/main.yml Outdated
@krohmerNV krohmerNV force-pushed the krohmer/mdl_sdk_integration branch from 029801a to 469b891 Compare July 4, 2025 11:49
@krohmerNV krohmerNV changed the title WIP: add MDL Compiler and Render Add MDL Compiler dependency for [genmdl] tests Jul 15, 2025
Comment thread .github/workflows/main.yml Outdated
@jstone-lucasfilm
Copy link
Copy Markdown
Member

@krohmerNV I agree that we should review and merge your work on MDL testing first, followed by the related PRs that affect MDL shader generation, but my sense is that all of these updates should wait until after the release of MaterialX 1.39.4.

In order to get this work on MDL testing ready to merge, we'll need to address the significant increase in build times, which rises from roughly 12 minutes to more than an hour.

One path forward would be to add a nightly build of MaterialX, with the MDL test suite included in only that build. This approach would also extend to the future inclusion of MaterialX testing with OpenImageIO, which similarly takes too long to be included in our builds that are run for every commit.

@jreichel-nvidia
Copy link
Copy Markdown
Contributor

About build times: this is why we added caching for vcpkg artefacts. The build time of ~1h in your commit occurs when there are no cache hits (probably because the vcpkg version changed in the last three weeks). With a warm cache the CI run takes 15 minutes (see my dummy commit above).

The images change roughly once per week (https://github.com/actions/runner-images/commits/main/images/ubuntu/Ubuntu2404-Readme.md), which likely includes the vcpkg version. AIUI once per week the first CI run after an image change would experience the longer build times, but not the others. Is that good enough to address the build time increase?

A more thorough CI run in a nightly build might make sense, though. I just wonder the genmdl test with mdlc needs to be distabled in the regular MR runs ...

@jstone-lucasfilm
Copy link
Copy Markdown
Member

@jreichel-nvidia I wonder if the presence of the cache just makes our build times more unpredictable, which is potentially a big disadvantage over the nightly build approach. Thinking of a new developer getting started with MaterialX for the first time, their first GitHub Actions run would always take the maximum possible time, since their cache would always be cold, and I'm concerned about the effect that this would have on Dev Days participants and other learners.

We definitely want to integrate your MDL testing work as soon as possible, ideally at the beginning of the development cycle for MaterialX 1.39.5, and I wonder if the best approach for now would be to make MDL builds opt-in through a GitHub Actions matrix flag that defaults to false.

This would allow teams to easily enable MDL builds on demand in the short term, and over time we would implement proper nightly builds that include both MDL and OIIO testing. This would create a framework that would extend naturally to OSL testing as well, where that build process is even longer than the one-hour build processes that we expect for MDL or OIIO.

What are your thoughts?

@jreichel-nvidia
Copy link
Copy Markdown
Contributor

@jstone-lucasfilm About new developers: The cache is not per account, but per project and shared between MRs. Each CI run should face the same risk of an outdated cache. But the build times are more unpredictable, yes.

My biggest concern with the nightly test is that one does not get the per-MR feedback and might unknowingly break tests. (Plus someone would regularly need to review test results, follow up on failures etc.)

Disabling the test matrix flags would be ok for me if that allows the MR getting merged.

What do you think about this experiment: Give it a try and merge the MR with the flags enabled. If the unpredictable build time becomes problematic you could still decide to disable it. And moving that test to the nightly tests later would also still be an option.

@jstone-lucasfilm
Copy link
Copy Markdown
Member

@jreichel-nvidia Is the cache shared between the main repository and all forks of that repository? The case I'm concerned about is a new developer working in a fork of MaterialX, and experiencing a one hour build that discourages them from using GitHub Actions. I've had that experience with other projects with GitHub Actions workflows that are so unwieldy that I don't end up using them at all.

@jreichel-nvidia
Copy link
Copy Markdown
Contributor

@jstone-lucasfilm Right, I forgot about forks. Their cache is empty on the very first MR.

I guess that means disabling the feature for now, and enabling it only later in the nightly tests. Is the MR fine apart from that (and merging/resolving conflicts)?

@jstone-lucasfilm
Copy link
Copy Markdown
Member

Bringing an internal NVIDIA thread back to the main PR, here are the next steps that I would recommend:

  • Let's reframe this as a validation step that will occur in a nightly build, and remove all of the infrastructure for CI caching.
  • For now, you can just include a build_mdl flag in CI, along with the build step that it invokes, without setting the flag in the matrix.
  • Once these changes are made, go ahead and highlight your updated PR in the MaterialX channel of the ASWF Slack, so that we can get the largest section of the community reviewing your work.

@jreichel-nvidia
Copy link
Copy Markdown
Contributor

  • I can only recommend to keep the vcpkg cache. Even if vcpkg is only used in nightly builds, devs will have to test changes for that mode and a quick CI turnaround time is very useful (working on github actions is already enough pain by itself). I can remove caching again, but for CI turnaround time I would suggest to delay that until it is clear that no other larger changes are required.
  • There is already a build_mdl_sdk flag. Is that what you meant? Do you want me to rename it? Still to be disabled of course. (There is also the additional cmake_config argument. The reason is that its value depends on the OS, but in the matrix I can't reference the variables above. This would be trivial in any reasonable scripting language, but YAML ...)

@jstone-lucasfilm
Copy link
Copy Markdown
Member

@jreichel-nvidia My sense is that the most important parts of this changelist are the new capabilities for MDL validation, and in order to get this merged soon, I think it will really help to make the PR as minimal as it can possibly be. We can then write up an independent PR where ideas for CI caching can be independently reviewed and tested across multiple forks of MaterialX, without slowing down the review and integration of this important work.

I've put together a first PR that actually implements a nightly MaterialX build, and my recommendation would be to structure your new MDL test as a parallel of the existing OIIO test, with a new extended_build_mdl flag that mirrors the existing extended_build_oiio:

#2621

We'll focus on getting the nightly build PR merged ASAP, so that you can test it directly with your MDL PR, making sure that the MDL test passes when you manually dispatch the main workflow in your fork of MaterialX.

Copy link
Copy Markdown
Member

@jstone-lucasfilm jstone-lucasfilm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship it!

@jstone-lucasfilm jstone-lucasfilm merged commit 3585b1a into AcademySoftwareFoundation:main Nov 6, 2025
32 checks passed
jreichel-nvidia added a commit to jreichel-nvidia/MaterialX that referenced this pull request Nov 25, 2025
Wrapper code based on that for the OSL test renderer and intermediate code
from AcademySoftwareFoundation#2254.

The new CMake variable MATERIALX_MDL_BINARY_TESTRENDER contains the name of the
df_vulkan binary, and is derived from MATERIALX_MDL_SDK_DIR.

Reserve name of the shader upfront, such that it becomes predictable (no "1"
suffix). The test renderer interface needs to know that name (and seems to be
useful in general).

TODO: Update CI integration once MDL SDK 2025.0.3 is on vcpkg and the df_vulkan
feature has been merged.
jreichel-nvidia added a commit to jreichel-nvidia/MaterialX that referenced this pull request Jan 5, 2026
Wrapper code based on that for the OSL test renderer and intermediate code
from AcademySoftwareFoundation#2254.

The new CMake variable MATERIALX_MDL_BINARY_TESTRENDER contains the name of the
df_vulkan binary, and is derived from MATERIALX_MDL_SDK_DIR.
jreichel-nvidia added a commit to jreichel-nvidia/MaterialX that referenced this pull request Jan 5, 2026
Wrapper code based on that for the OSL test renderer and intermediate code
from AcademySoftwareFoundation#2254.

The new CMake variable MATERIALX_MDL_BINARY_TESTRENDER contains the name of the
df_vulkan binary, and is derived from MATERIALX_MDL_SDK_DIR.
jreichel-nvidia added a commit to jreichel-nvidia/MaterialX that referenced this pull request Jan 8, 2026
Wrapper code based on that for the OSL test renderer and intermediate code
from AcademySoftwareFoundation#2254.

The new CMake variable MATERIALX_MDL_BINARY_TESTRENDER contains the path of the
df_vulkan binary, and is derived from MATERIALX_MDL_SDK_DIR.

Use x64-windows-release triplet for the MDL SDK to save disk space on the
runners (and build time).
jstone-lucasfilm pushed a commit that referenced this pull request Jan 28, 2026
Wrapper code based on that for the OSL test renderer and intermediate code from #2254.

The new CMake variable MATERIALX_MDL_BINARY_TESTRENDER contains the path of the df_vulkan binary, and is derived from MATERIALX_MDL_SDK_DIR.

Use x64-windows-release triplet for the MDL SDK to save disk space on the runners (and build time).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants