StashCalculation: a new CalcJob plugin #6772

Merged
khsrali merged 8 commits into aiidateam:main from khsrali:stashing/calcjob
Apr 30, 2025

Conversation

Collaborator

@khsrali khsrali commented Feb 26, 2025

Historically, stashing was only possible if it was requested before running a generic calcjob. The instruction had to be attached to the original calcjob, for example:

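from aiida.common.datastructures import StashMode
from aiida.engine import run
from aiida.orm import Computer

inputs = {
    'MyInputs': <MyInputs>,  # placeholder for the plugin-specific inputs
    'metadata': {
        'computer': Computer.collection.get(label="localhost"),
        'options': {
            'resources': {'num_machines': 1},
            'stash': {
                'stash_mode': StashMode.COPY.value,
                'target_base': '/scratch/',
                'source_list': ['heavy_data.xyz'],
            },
        },
    },
}
run(MyCalculation, **inputs)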

However, if a user realized only after running the calcjob that they needed to stash something, this was not possible.

This PR defines a new calcjob that can perform a stashing operation even after a calculation has finished.
The usage is very similar and, for consistency and user-friendliness, the instruction stays part of the metadata. The only additional input is a source node, for example:

from aiida import orm
from aiida.plugins import CalculationFactory

StashCalculation_ = CalculationFactory('core.stash')

MyCalculation = orm.load_node(pk=<PK>)
inputs = {
    'metadata': {
        'computer': Computer.collection.get(label="localhost"),
        'options': {
            'resources': {'num_machines': 1},
            'stash': {
                'stash_mode': StashMode.COPY.value,
                'target_base': '/scratch/',
                'source_list': ['heavy_data.xyz'],
            },
        },
    },
    'source_node': MyCalculation,
}

result = run(StashCalculation_, **inputs)

P.S. This PR is part of the upcoming changes referenced in #6764.

@khsrali khsrali mentioned this pull request Feb 26, 2025
7 tasks

codecov Bot commented Feb 26, 2025

Codecov Report

❌ Patch coverage is 92.10526% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.31%. Comparing base (eb34b06) to head (6c44e6b).
⚠️ Report is 60 commits behind head on main.

Files with missing lines | Patch % | Lines
src/aiida/engine/daemon/execmanager.py | 72.73% | 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6772      +/-   ##
==========================================
+ Coverage   78.29%   78.31%   +0.02%     
==========================================
  Files         566      567       +1     
  Lines       42766    42796      +30     
==========================================
+ Hits        33481    33510      +29     
- Misses       9285     9286       +1     

Collaborator

@agoscinski agoscinski left a comment

First quick review.

Comment thread src/aiida/calculations/stash.py Outdated
spec.inputs.pop('code', None)

# Ideally one could use the same computer as the one of the `source_node`.
# However, if another computer has access to the directory, we don't want to restrict.`
Collaborator

I am not sure how I would pass another computer here. Maybe you could also explain the scenario in more detail.

Suggested change
- # However, if another computer has access to the directory, we don't want to restrict.`
+ # However if you cannot access the stash storage from the same computer anymore but you have access to it from another computer, you can specify the computer with this commented block below

Collaborator Author

Thanks, I updated the comment and moved it to the docstring.
Passing a computer is a very routine procedure; it is done via metadata.computer.
It should now be very clear, even for users.

Comment thread src/aiida/common/datastructures.py Outdated
Comment thread src/aiida/engine/daemon/execmanager.py Outdated

if calculation.process_type == 'aiida.calculations:core.stash':
    remote_node = load_node(calculation.inputs.source_node.pk)
    uuid = calculation.inputs.source_node.uuid
Collaborator

might increase readability

Suggested change
- uuid = calculation.inputs.source_node.uuid
+ uuid = remote_node.uuid

@khsrali khsrali closed this Mar 5, 2025
@khsrali khsrali force-pushed the stashing/calcjob branch from 239ff8e to c535928 Compare March 5, 2025 11:40
@khsrali khsrali reopened this Mar 5, 2025
Collaborator Author

khsrali commented Mar 5, 2025

> First quick review.

applied!

@khsrali khsrali force-pushed the stashing/calcjob branch from 84f49a7 to 4c1504b Compare March 7, 2025 14:03
@khsrali khsrali requested a review from agoscinski March 7, 2025 14:21
@khsrali khsrali changed the title StashCalculation: new CalcJob plugin StashCalculation: a new CalcJob plugin Mar 7, 2025
Comment thread src/aiida/calculations/stash.py Outdated

spec.input(
    'source_node',
    valid_type=(orm.RemoteData, orm.SinglefileData),
Collaborator

FolderData?

Collaborator Author

🤔 yeah, you're probably right

Collaborator

did you change it?

Collaborator Author

Alright, I added orm.FolderData to the list.

Collaborator

did you? I still don't see it here

Comment thread src/aiida/engine/daemon/execmanager.py

  if stash_mode == StashMode.COPY.value:
-     target_basepath = Path(stash_options['target_base']) / uuid[:2] / uuid[2:4] / uuid[4:]
+     target_basepath = target_base / uuid[:2] / uuid[2:4] / uuid[4:]
Collaborator

Can you make the construction of the path uuid[:2] / uuid[2:4] / uuid[4:] part of the CalcJob or CalcInfo? (I am not sure which makes more sense, but I see that the uuid is also taken from the CalcInfo.) This should make the path less magical and easier to understand. You would then also need to use that function in engine.daemon.execmanager:upload_calculation (AFAIK this is where the calculation folder is created).

Collaborator Author

Yeah, but that would only work if stashing is done as a separate calcjob.
If stashing is performed as part of a generic calcjob, e.g. qe, then this is not possible, and creating the folder has to be done in execmanager.

Collaborator

Then write a utility function that is used in both cases and takes a uuid as input; in the docstring you can explain how it should be used. The current state is highly suboptimal.

Collaborator Author

I understand, but it has been like this since before this PR.
It is used only here and in execmanager.py::upload_calculation, though not in exactly the same way.

So I'd leave that for another PR, if we have to turn it into a function.

Collaborator

I understand, but this PR continues the bad design. It should be easy to fix by creating a utility function. Please explain why this change would need significantly more effort than just creating a utility function, enough to justify a fix in a separate PR.

Collaborator Author

So I made a function that is being used only once?

Collaborator Author

Ok, I'm convinced. It's actually two places. :) also here:

        workdir = Path(remote_working_directory).joinpath(calc_info.uuid[:2], calc_info.uuid[2:4])

That is line 135 of execmanager.py.

Is it OK if I do this in a separate PR? I'll open a sub-issue for this.

Collaborator

Yes, an issue is also okay, or a direct PR might be easier since it should be an easy fix (if I am not mistaken).

Collaborator Author

I just had a look; the logic in upload_calculation is not straightforward. I'd leave this for another PR.
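
The utility function requested in this thread could look roughly like the sketch below. The name `uuid_sharded_path` and its `include_remainder` flag are hypothetical and not part of aiida-core. The flag is there because the two call sites mentioned above differ: the stash target uses `uuid[:2] / uuid[2:4] / uuid[4:]`, while the `workdir` line quoted from `upload_calculation` joins only the first two segments.

```python
from pathlib import Path


def uuid_sharded_path(base, uuid, include_remainder=True):
    """Return ``base/<uuid[:2]>/<uuid[2:4]>[/<uuid[4:]>]`` (hypothetical helper).

    Splitting the UUID into short prefixes keeps any single directory
    from collecting one entry per calculation.
    """
    if len(uuid) < 5:
        raise ValueError(f'UUID too short to shard: {uuid!r}')
    path = Path(base) / uuid[:2] / uuid[2:4]
    if include_remainder:
        # Stash targets use the full three-level layout.
        path = path / uuid[4:]
    return path


# Example with a made-up UUID string.
stash_target = uuid_sharded_path('/scratch', 'abcdef12-3456')
workdir_parent = uuid_sharded_path('/scratch', 'abcdef12-3456', include_remainder=False)
```

Keeping the slicing in one documented place would make the layout explicit for both the stashing code and the working-directory code, which is the concern raised by the reviewer.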


  elif self._command == STASH_COMMAND:
-     if node.get_option('stash') is not None:
+     if (node.get_option('stash') is not None) or (type(self.process).__name__ == 'StashCalculation'):
Collaborator

why not isinstance(self.process, StashCalculation)?

Collaborator Author

I don't remember why that didn't work.
I have to try it again 🤔

Collaborator

Still open. It could be that you want to check explicitly that it is a StashCalculation and not a child class, but that would also indicate to me that something is wrong.

Collaborator Author

Actually, now I realize the second condition is not needed at all.
The first condition would also serve for stashing via a calcjob. The instruction is similar:

        inputs = {
            'metadata': {
                'computer': Computer.collection.get(label="localhost"),
                'options': {
                    'resources': {'num_machines': 1},
                    'stash': {
                        'stash_mode': StashMode.COPY.value,
                        'target_base': '/scratch/my_stashing/',
                        'source_list': ['aiida.in', '_aiidasubmit.sh'],
                    },
                },
            },
            'source_node': node_1,
        }

Therefore node.get_option('stash') is not None will be True also in this case.
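
For readers following the `isinstance` question in this thread, the sketch below shows how the two checks differ once a subclass is involved. The classes are stand-ins defined only for this illustration (they are not the aiida-core classes), and `CustomStashCalculation` is a hypothetical subclass.

```python
class CalcJob:
    """Stand-in base class for this sketch only."""


class StashCalculation(CalcJob):
    """Stand-in for the plugin class discussed in this PR."""


class CustomStashCalculation(StashCalculation):
    """Hypothetical subclass, used only to show the difference."""


def is_stash_by_name(process):
    # Exact class-name match: subclasses are not recognised.
    return type(process).__name__ == 'StashCalculation'


def is_stash_by_isinstance(process):
    # Type check that also accepts subclasses.
    return isinstance(process, StashCalculation)


process = CustomStashCalculation()
print(is_stash_by_name(process))        # False
print(is_stash_by_isinstance(process))  # True
```

As concluded above, neither check is needed when the decision can be made from the `stash` option alone.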

Collaborator Author

khsrali commented Apr 14, 2025

Hi @agoscinski

Thanks a lot for the nice review.
Especially, the ASCII art 😉 I applied your review.

@khsrali khsrali requested a review from agoscinski April 14, 2025 15:11
Comment thread src/aiida/calculations/stash.py Outdated

class StashCalculation(CalcJob):
    """
    Utility to stash files from a remote folder.
Collaborator

not only remote since you also support SinglefileData (and hopefully FolderData)

Comment thread src/aiida/calculations/stash.py
Comment thread src/aiida/engine/daemon/execmanager.py
Comment thread src/aiida/engine/processes/calcjobs/tasks.py
'target_base': str(target_base),
'source_list': ['*'],
},
},
Collaborator

what happens when I forget to add stash options for the StashCalculation? Would like to know if the error message is clear to the user in this case.

Collaborator Author

Very good point!
I totally missed that. I have now made those fields compulsory via:

spec.inputs['metadata']['options']['stash'].required = True
spec.inputs['metadata']['options']['stash']['stash_mode'].required = True
spec.inputs['metadata']['options']['stash']['target_base'].required = True
spec.inputs['metadata']['options']['stash']['source_list'].required = True
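
To illustrate what "compulsory" means for a user, here is a minimal plain-Python sketch of a pre-submission check on the nested inputs dictionary. It is not how AiiDA validates ports (that is done through the `required` flags shown above); `missing_stash_options` is a hypothetical helper, and the string `'copy'` is only assumed to correspond to `StashMode.COPY.value`.

```python
REQUIRED_STASH_KEYS = ('stash_mode', 'target_base', 'source_list')


def missing_stash_options(inputs):
    """Return the required stash keys that are absent from ``inputs``."""
    stash = inputs.get('metadata', {}).get('options', {}).get('stash')
    if stash is None:
        # No stash block at all: every required key is missing.
        return list(REQUIRED_STASH_KEYS)
    return [key for key in REQUIRED_STASH_KEYS if key not in stash]


incomplete = {
    'metadata': {
        'options': {
            'stash': {'stash_mode': 'copy', 'target_base': '/scratch/'},
        },
    },
}
print(missing_stash_options(incomplete))  # ['source_list']
```

A check like this could produce a targeted message such as "missing stash option: source_list" before submission, which is the kind of clarity asked for in the review.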

Comment thread src/aiida/calculations/stash.py Outdated
@khsrali khsrali requested a review from agoscinski April 16, 2025 08:10
Collaborator

@agoscinski agoscinski left a comment

This point is important and needs a short discussion: #6772 (comment) (maybe in person is easier). Otherwise this is almost there.

Collaborator Author

khsrali commented Apr 24, 2025

> Yes, an issue is also okay, or a direct PR might be easier since it should be an easy fix (if I am not mistaken).

I just had a look; the logic in upload_calculation is not straightforward. I'd leave this for another PR.

Opened an issue:
#6836

Collaborator Author

khsrali commented Apr 25, 2025

@agoscinski has opened PR #6837, which would allow a CalcJobNode to be an input for another CalcJobNode.

It would be better and more intuitive if StashCalculation directly accepted a CalcJobNode as an input.
Therefore, this PR has to wait until that change is merged.

Update:
That change is no longer justified; see #6837 (comment).

@khsrali khsrali added the pr/blocked PR is blocked by another PR that should be merged first label Apr 25, 2025
@khsrali khsrali removed the pr/blocked PR is blocked by another PR that should be merged first label Apr 29, 2025
Collaborator Author

khsrali commented Apr 29, 2025

@agoscinski please see #6837 (comment)
Given that, I think this PR is ready to go.

@khsrali khsrali requested a review from agoscinski April 29, 2025 14:58
Collaborator

@agoscinski agoscinski left a comment

In PR #6837 we explored the possibility of passing a CalcJobNode as an input to a CalcJob, so that its path could be used for stashing. The discussion there concluded that a CalcJob already returns a RemoteData node that contains the calcjob working directory. This RemoteData node can then be passed to the StashCalculation implemented in this PR, which allows the same use of the stash options as when stashing is requested up front while running the CalcJob. Using RemoteData as the input is therefore also user-friendly. However, this needs to be documented on Read the Docs in a future PR, as it is not obvious to users (or developers^^).
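
A minimal sketch of the usage described in this review: take the RemoteData output of a finished calcjob and use it as `source_node`. The helper `build_stash_inputs` is hypothetical, the output is assumed to be exposed under the usual `remote_folder` label, and `'copy'` is assumed to equal `StashMode.COPY.value`. Stand-in objects replace real nodes so the sketch runs without an AiiDA profile.

```python
from types import SimpleNamespace


def build_stash_inputs(calcjob_node, computer, target_base, source_list, stash_mode='copy'):
    """Assemble StashCalculation inputs from a finished calcjob node (sketch)."""
    return {
        'metadata': {
            'computer': computer,
            'options': {
                'resources': {'num_machines': 1},
                'stash': {
                    'stash_mode': stash_mode,
                    'target_base': target_base,
                    'source_list': list(source_list),
                },
            },
        },
        # The RemoteData output that points at the calcjob working directory.
        'source_node': calcjob_node.outputs.remote_folder,
    }


# Stand-ins for a loaded calcjob node and its RemoteData output.
fake_remote = object()
fake_calcjob = SimpleNamespace(outputs=SimpleNamespace(remote_folder=fake_remote))
inputs = build_stash_inputs(fake_calcjob, 'localhost', '/scratch/', ['heavy_data.xyz'])
```

With a real profile, `calcjob_node` would come from `orm.load_node(...)`, `computer` would be a `Computer` instance, and the resulting dictionary would be passed to `run(CalculationFactory('core.stash'), **inputs)` as in the PR description.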

@khsrali khsrali merged commit bc25323 into aiidateam:main Apr 30, 2025
37 of 38 checks passed
@khsrali khsrali deleted the stashing/calcjob branch May 2, 2025 13:27

2 participants