Federated Evaluation in Workflow API is not working as expected. #1637

@payalcha

Describe the bug
Federated Evaluation does not work in the Workflow API when using FederatedRuntime.

  1. If the notebook is run with torch 2.7.0, the envoys fail with the error below:
EXCEPTION : <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNKNOWN
	details = "Unexpected <class 'FileNotFoundError'>: [Errno 2] No such file or directory: '/var/github/workspace/openfl/payalcha_openfl/openfl-tutorials/experimental/workflow/FederatedEvaluation/director/db3919cc-9932-469b-8f10-4af39a420042'"
	debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2025-05-20T00:56:15.785507319-07:00", grpc_status:2, grpc_message:"Unexpected <class \'FileNotFoundError\'>: [Errno 2] No such file or directory: \'/var/github/workspace/openfl/payalcha_openfl/openfl-tutorials/experimental/workflow/FederatedEvaluation/director/db3919cc-9932-469b-8f10-4af39a420042\'"}"
>
Traceback (most recent call last):
  File "/var/github/workspace/openfl/venv310/bin/fx", line 8, in <module>
    sys.exit(entry())
  File "/var/github/workspace/openfl/venv310/lib/python3.10/site-packages/openfl/interface/cli.py", line 310, in entry
    error_handler(e)
  File "/var/github/workspace/openfl/venv310/lib/python3.10/site-packages/openfl/interface/cli.py", line 229, in error_handler
    raise error
  File "/var/github/workspace/openfl/venv310/lib/python3.10/site-packages/openfl/interface/cli.py", line 308, in entry
    cli(max_content_width=120)
  File "/var/github/workspace/openfl/venv310/lib/python3.10/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
  File "/var/github/workspace/openfl/venv310/lib/python3.10/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
  File "/var/github/workspace/openfl/venv310/lib/python3.10/site-packages/openfl/interface/cli.py", line 131, in invoke
    super().invoke(ctx)
  File "/var/github/workspace/openfl/venv310/lib/python3.10/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/var/github/workspace/openfl/venv310/lib/python3.10/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/var/github/workspace/openfl/venv310/lib/python3.10/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/var/github/workspace/openfl/venv310/lib/python3.10/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
  File "/var/github/workspace/openfl/venv310/lib/python3.10/site-packages/openfl/experimental/workflow/interface/cli/envoy.py", line 158, in start_
    envoy.start()
  File "/var/github/workspace/openfl/venv310/lib/python3.10/site-packages/openfl/experimental/workflow/component/envoy/envoy.py", line 222, in start
    self._run()
  File "/var/github/workspace/openfl/venv310/lib/python3.10/site-packages/openfl/experimental/workflow/component/envoy/envoy.py", line 145, in _run
    data_file_path = self._save_data_stream_to_file(data_stream)
  File "/var/github/workspace/openfl/venv310/lib/python3.10/site-packages/openfl/experimental/workflow/component/envoy/envoy.py", line 172, in _save_data_stream_to_file
    for response in data_stream:
  File "/var/github/workspace/openfl/venv310/lib/python3.10/site-packages/grpc/_channel.py", line 543, in __next__
    return self._next()
  File "/var/github/workspace/openfl/venv310/lib/python3.10/site-packages/grpc/_channel.py", line 969, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNKNOWN
	details = "Unexpected <class 'FileNotFoundError'>: [Errno 2] No such file or directory: '/var/github/workspace/openfl/payalcha_openfl/openfl-tutorials/experimental/workflow/FederatedEvaluation/director/db3919cc-9932-469b-8f10-4af39a420042'"
	debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2025-05-20T00:56:15.785507319-07:00", grpc_status:2, grpc_message:"Unexpected <class \'FileNotFoundError\'>: [Errno 2] No such file or directory: \'/var/github/workspace/openfl/payalcha_openfl/openfl-tutorials/experimental/workflow/FederatedEvaluation/director/db3919cc-9932-469b-8f10-4af39a420042\'"}"

  2. The requirements.txt file is not generated properly. It must contain all the requirements mentioned in the notebook, except the basic openfl and workflow interface requirements. My understanding is that this happens because the # | export keyword is not present in the notebook cell.
  3. Even after changing torch from 2.7.0 to 2.3.1, the notebook runs successfully, but the model does not seem to be loaded properly with FederatedRuntime: its aggregated values are not even close to those of LocalRuntime.

FederatedRuntime aggregated value -
Average aggregated model accuracy values = 0.11860000342130661
Bengaluru value of 0.11860000342130661
Portland value of 0.11860000342130661

LocalRuntime aggregated value -
Average aggregated model accuracy values = 0.9070000052452087
Bengaluru value of 0.9064000248908997
Portland value of 0.9075999855995178

My understanding is that this huge difference arises because the trained model is not loaded properly in FederatedRuntime.
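To make the gap easier to see, here is a minimal sketch that recomputes the averages from the values reported above. The `summarize` helper is hypothetical, written only for this comparison. It is not part of openfl.

```python
from statistics import mean

# Accuracy values copied verbatim from the report above.
federated = {"Bengaluru": 0.11860000342130661, "Portland": 0.11860000342130661}
local = {"Bengaluru": 0.9064000248908997, "Portland": 0.9075999855995178}


def summarize(per_collaborator):
    """Return (mean accuracy, spread between best and worst collaborator)."""
    values = list(per_collaborator.values())
    return mean(values), max(values) - min(values)


fed_mean, fed_spread = summarize(federated)
loc_mean, loc_spread = summarize(local)

print(f"FederatedRuntime: mean={fed_mean:.4f} spread={fed_spread:.4f}")
print(f"LocalRuntime:     mean={loc_mean:.4f} spread={loc_spread:.4f}")
print(f"Gap between runtimes: {loc_mean - fed_mean:.4f}")
```

Besides the low accuracy, both collaborators report exactly the same value under FederatedRuntime, which may be worth checking when investigating the model loading.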

To Reproduce
Steps to reproduce the behavior:

  1. Clone openfl
  2. pip install openfl and the requirements in openfl-tutorials/experimental/workflow/workflow_interface_requirements.txt
  3. Run fx experimental activate
  4. Start the director and envoys in openfl-tutorials/experimental/workflow/FederatedEvaluation
  5. Start the notebook through jupyter lab or the papermill command

Expected behavior

  1. generated_workspace must contain a proper requirements.txt with all the requirements.
  2. In the generated experiment.py, the runtime must be federated_runtime, not local_runtime. The fflow is initiated there, in the generated experiment.
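To help check the suspicion about the missing # | export keyword, here is a rough diagnostic sketch that lists notebook code cells which run pip install without an export directive. It relies only on the standard notebook JSON layout (`cells`, `cell_type`, `source`). The helper names are hypothetical, and the idea that such cells feed requirements.txt is an assumption taken from the report, not confirmed against the openfl export code.

```python
import json
import re

# Matches "#| export" and "# | export" at the start of a line.
EXPORT_RE = re.compile(r"^\s*#\s*\|\s*export\b")
# Matches "!pip install ..." and "%pip install ..." lines.
PIP_RE = re.compile(r"^\s*[!%]\s*pip\s+install\b")


def load_notebook(path):
    """Load a notebook file as a plain dict (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def cells_missing_export(notebook):
    """Return indices of code cells that pip install without an export directive."""
    missing = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # "source" may be a single string or a list of line strings.
        text = source if isinstance(source, str) else "".join(source)
        lines = text.splitlines()
        has_pip = any(PIP_RE.match(line) for line in lines)
        has_export = any(EXPORT_RE.match(line) for line in lines)
        if has_pip and not has_export:
            missing.append(index)
    return missing


# Small in-memory notebook used only for illustration.
demo = {
    "cells": [
        {"cell_type": "code", "source": ["# | export\n", "!pip install torch\n"]},
        {"cell_type": "code", "source": ["!pip install torchvision\n"]},
        {"cell_type": "markdown", "source": ["pip install notes"]},
    ]
}
print(cells_missing_export(demo))  # [1]
```

For a real notebook, `cells_missing_export(load_notebook(path))` would give the cell indices to inspect, where `path` is whichever notebook is being exported.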
