Is your feature request related to a problem? Please describe.
This issue is related to #1565, where a user encountered a problem importing user-defined modules in a Jupyter notebook while running a FederatedRuntime experiment.
In the current implementation of the Workflow API, the Jupyter notebook is expected to define the Federated Learning experiment in its entirety. If the user attempts to import a user-defined module from a different Python script, the experiment will fail for the following reasons:
- When the notebook is exported, the script inside the generated_workspace does not contain the user-defined code
- This leads to a ModuleNotFoundError failure during execution on participants in a distributed infrastructure
Describe the solution you'd like
Enable users to import helper functions from a separate Python script. For example, the FL experiment tutorial crowd_guard.ipynb imports some helper functions / classes from the user-defined script validation.py, both located in a folder named workspace:
workspace
├── crowd_guard.ipynb
└── validation.py
validation.py (contains helper class)
class CrowdGuardClientValidation:
    def __distance_global_model_final_metric(distance_type: str, prediction_matrix,
                                             prediction_global_model, sample_indices_by_label,
                                             own_index):
        ...
    def __predict_for_single_model(model, local_data, device):
        ...
    def __do_predictions(models, global_model, local_data, device):
        ...
    def __prune_poisoned_models(num_layers, total_number_of_clients, own_client_index,
                                distances_by_metric, verbose=False):
        ...
    def validate_models(global_model, models, own_client_index, local_data, device):
        ...
CrowdGuard.ipynb (Jupyter notebook for Workflow API experiment)
#| export
from validation import CrowdGuardClientValidation

class FederatedFlow_CrowdGuard(FLSpec):
    @aggregator
    def start(self):
        ...
    @collaborator
    def train(self):
        ...
    @collaborator
    def local_validation(self):
        ...
        detected_suspicious_models = CrowdGuardClientValidation.validate_models(self.global_model,
                                                                                all_models,
                                                                                own_client_index,
                                                                                self.train_loader,
                                                                                self.device)
        ...
    @aggregator
    def end(self):
        ...
To support this use case, the existing export process in notebook_tools needs to be enhanced to:
- Analyse all imports in the Jupyter notebook and identify user-defined imports
- Copy the user-defined Python scripts / folders containing these imports into the generated_workspace
This ensures that the generated_workspace (shown below) includes all user-defined code and works on the distributed infrastructure:
workspace
├── generated_workspace
│   ├── src
│   │   ├── __init__.py
│   │   ├── experiment.py
│   │   └── validation.py
│   ├── .workspace
│   ├── plan
│   │   └── plan.yaml
│   └── requirements.txt
├── crowd_guard.ipynb
└── validation.py
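The two export steps described above (identify user-defined imports, then copy the backing scripts or folders) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual notebook_tools implementation: the function names `find_user_defined_imports` and `copy_user_defined_code` are hypothetical, and a module is treated as user-defined simply when a matching `.py` file or folder sits next to the notebook.

```python
import ast
import shutil
from pathlib import Path


def find_user_defined_imports(source: str, notebook_dir: Path) -> list[Path]:
    """Return scripts/folders beside the notebook that back imports in `source`.

    Hypothetical helper: assumes user-defined modules live in `notebook_dir`.
    """
    found: list[Path] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            script = notebook_dir / f"{top_level}.py"
            package = notebook_dir / top_level
            # Anything not found beside the notebook is assumed to be a
            # standard-library or installed third-party module.
            if script.is_file():
                candidate = script
            elif package.is_dir():
                candidate = package
            else:
                continue
            if candidate not in found:
                found.append(candidate)
    return found


def copy_user_defined_code(paths: list[Path], src_dir: Path) -> None:
    """Copy the given scripts/folders into `src_dir` (e.g. generated_workspace/src)."""
    src_dir.mkdir(parents=True, exist_ok=True)
    for path in paths:
        target = src_dir / path.name
        if path.is_dir():
            # dirs_exist_ok lets a repeated export overwrite an earlier copy.
            shutil.copytree(path, target, dirs_exist_ok=True)
        else:
            shutil.copy2(path, target)
```

With the workspace layout above, calling `find_user_defined_imports` on the exported script text with the notebook's folder would be expected to return the path of validation.py, which `copy_user_defined_code` then places under generated_workspace/src.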
Describe alternatives you've considered
N.A.
Additional context
This enhancement shall be based on the following requirements and guidelines:
- Export Directives
  - User-defined imports should be present in a notebook cell that is annotated with the #| export directive as its first line
  - Rationale: #| export directives are required for the user-defined imports to be written to the exported script and processed further
- User-defined scripts should not install any packages
  - Rationale: While the exported script is analyzed to identify dependencies and build the requirements.txt for the FL experiment, user-defined scripts are not analyzed by the infrastructure to identify dependencies, so packages installed from within them would not be captured in requirements.txt
- Location of user-defined Python scripts
  - User-defined modules must be placed in the same directory as the Jupyter notebook to enable the infrastructure to correctly locate and copy these modules into the generated_workspace
  - Rationale:
    - Custom path dependencies: consider a user-defined module located outside the notebook directory, for example
/home/users
└── fl_helper
    ├── __init__.py
    └── validation.py
CrowdGuard.ipynb (Jupyter notebook for Workflow API experiment)
...
import sys
sys.path.append('/home/users/fl_helper')
from validation import CrowdGuardClientValidation
    - Importing from a custom path requires explicit modification of sys.path, which is not recommended and can lead to inconsistencies across a distributed system
    - Relying on custom paths or module locations outside the notebook directory adds complexity for the infrastructure in identifying and accessing the required user-defined modules
    - Placing the modules in the same directory as the Jupyter notebook streamlines the process, simplifies access, and eliminates the need to modify sys.path
  Example:
workspace
├── crowd_guard.ipynb
└── fl_helper
    ├── __init__.py
    └── validation.py
- Restrictions on user-defined imports
  - User-defined code should not modify sys.path to enable Python to find the scripts to import. For example, given the layout
workspace
├── crowd_guard.ipynb
└── helper
    └── validation.py
  the following usage should be avoided:
CrowdGuard.ipynb (Jupyter notebook for Workflow API experiment)
...
sys.path.append('./helper')
from validation import CrowdGuardClientValidation
...
  Recommended usage:
from helper.validation import CrowdGuardClientValidation
- User-defined imports should be self-contained
  - User-defined code should not import other user-defined code from different Python scripts. For example, the following should be avoided:
utils.py (contains additional helper functions)
def calculate_accuracy(predictions, labels):
    correct = (predictions == labels).sum()
    return correct / len(labels)
validation.py (contains helper functions)
from utils import calculate_accuracy
class CrowdGuardClientValidation:
    def validate_models(global_model, models, own_client_index, local_data, device):
        ...
        accuracy = calculate_accuracy(predictions, labels)
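The last two guidelines (no sys.path modification, self-contained user-defined scripts) could in principle be checked at export time. The sketch below is one possible way to do that with Python's `ast` module; it is an assumption for illustration, not existing notebook_tools behavior, and the names `modifies_sys_path` and `nested_user_imports` are hypothetical.

```python
import ast
from pathlib import Path


def _is_sys_path(node: ast.AST) -> bool:
    """True if `node` is the attribute expression `sys.path`."""
    return (
        isinstance(node, ast.Attribute)
        and node.attr == "path"
        and isinstance(node.value, ast.Name)
        and node.value.id == "sys"
    )


def modifies_sys_path(source: str) -> bool:
    """Conservatively report whether `source` calls a method on, or assigns to, sys.path."""
    for node in ast.walk(ast.parse(source)):
        # Calls such as sys.path.append(...) or sys.path.insert(...).
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            if _is_sys_path(node.func.value):
                return True
        # Assignments such as sys.path = [...] or sys.path += [...].
        if isinstance(node, ast.Assign) and any(_is_sys_path(t) for t in node.targets):
            return True
        if isinstance(node, ast.AugAssign) and _is_sys_path(node.target):
            return True
    return False


def nested_user_imports(script: Path, notebook_dir: Path) -> list[str]:
    """List modules imported by `script` that resolve to code beside the notebook.

    Relative imports (e.g. `from . import utils`) are ignored in this sketch.
    """
    nested: list[str] = []
    for node in ast.walk(ast.parse(script.read_text())):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            is_local = (notebook_dir / f"{top_level}.py").is_file() or (
                notebook_dir / top_level
            ).is_dir()
            if is_local and top_level not in nested:
                nested.append(top_level)
    return nested
```

For the example above, `nested_user_imports` applied to validation.py would report `utils`, flagging the script as not self-contained, while `modifies_sys_path` would flag a cell containing `sys.path.append('./helper')`.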