This repository provides code and instructions for training agents under action disturbances, so that they transfer safely and robustly between environments with different dynamics and avoid safety violations after the transfer.
This code builds on OmniSafe.
Clone omnisafe:
git clone https://github.com/PKU-Alignment/omnisafe
Clone this repository:
git clone https://github.com/ai-fm/safe-and-robust-transfer
Add the source code from this project to omnisafe:
- Copy the files in ./safe-and-robust-transfer/src/algorithms/ into ./omnisafe/omnisafe/algorithms/, and include these algorithms in ./omnisafe/omnisafe/algorithms/__init__.py.
- Copy the files in ./safe-and-robust-transfer/src/configs/ into ./omnisafe/omnisafe/configs/.
- Copy the files in ./safe-and-robust-transfer/src/envs/ into ./omnisafe/omnisafe/envs/.
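The copy steps above can also be scripted. The following is a minimal sketch, assuming both repositories were cloned side by side; the helper name copy_sources is hypothetical and not part of either repository. It does not edit ./omnisafe/omnisafe/algorithms/__init__.py, so the algorithms still have to be included there by hand.

```python
from __future__ import annotations

import shutil
from pathlib import Path

# Subdirectories of src/ that are copied into the omnisafe package.
SUBDIRS = ("algorithms", "configs", "envs")


def copy_sources(project_root: Path, omnisafe_root: Path) -> list[Path]:
    """Copy src/<subdir>/ from this project into omnisafe/omnisafe/<subdir>/.

    Hypothetical helper: files that already exist at the destination
    under the same name are overwritten.
    """
    copied = []
    for name in SUBDIRS:
        src = project_root / "src" / name
        dst = omnisafe_root / "omnisafe" / name
        # dirs_exist_ok=True merges into the existing omnisafe directory
        # instead of failing because it is already there.
        shutil.copytree(src, dst, dirs_exist_ok=True)
        copied.append(dst)
    return copied
```

From the directory that contains both clones, this could be called as copy_sources(Path("safe-and-robust-transfer"), Path("omnisafe")).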
Done! Now the project can be installed with
cd omnisafe
pip install -e .
Please take a look at the OmniSafe installation instructions for more details.
All scripts are located in src/scripts/:
src/scripts/train/train_guides.py trains the guides.
- Use DDPGNoise to train a guide with random noise.
- Use DDPGAdversarial to train a guide with adversarial perturbations.
- Use SACLag to train a guide with entropy maximization.
Environment options are SafetyPointGuide1-v0, SafetyPointGuide2-v0, and
SafetyPointGuide3-v0.
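As an illustration of the options above, here is a minimal sketch that trains a guide through OmniSafe's generic Agent interface. It assumes the algorithms and environments from this project have already been added to omnisafe as described earlier. The helper names check_guide_choice and train_guide are hypothetical, and train_guides.py itself may be organized differently.

```python
# Options listed in this README for guide training.
GUIDE_ALGOS = ("DDPGNoise", "DDPGAdversarial", "SACLag")
GUIDE_ENVS = tuple(f"SafetyPointGuide{i}-v0" for i in (1, 2, 3))


def check_guide_choice(algo: str, env_id: str) -> tuple:
    """Validate an (algorithm, environment) pair against the listed options."""
    if algo not in GUIDE_ALGOS:
        raise ValueError(f"unknown guide algorithm {algo!r}; expected one of {GUIDE_ALGOS}")
    if env_id not in GUIDE_ENVS:
        raise ValueError(f"unknown guide environment {env_id!r}; expected one of {GUIDE_ENVS}")
    return algo, env_id


def train_guide(algo: str, env_id: str) -> None:
    """Train a guide with default settings (hypothetical helper)."""
    algo, env_id = check_guide_choice(algo, env_id)
    # Imported lazily so the validation above works without OmniSafe installed.
    import omnisafe

    agent = omnisafe.Agent(algo, env_id)
    agent.learn()
```

For example, train_guide("SACLag", "SafetyPointGuide1-v0") would start a training run with OmniSafe's default configuration for that algorithm.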
src/scripts/train/train_students.py trains the students.
- Use SaGuiCS if the guide is nondeterministic (SAC).
- Use SaGuiCSDet if the guide is deterministic (DDPG).
Environment options are SafetyPointStudent1-v0, SafetyPointStudent2-v0, and
SafetyPointStudent3-v0.
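The rule above for choosing the student algorithm can be written as a small lookup. This is a sketch only: the helper student_algo_for is hypothetical, and it assumes that both DDPGNoise and DDPGAdversarial count as deterministic DDPG guides while SACLag counts as the nondeterministic SAC guide.

```python
# Guide algorithm -> matching student algorithm (assumed pairing, see above).
STUDENT_FOR_GUIDE = {
    "SACLag": "SaGuiCS",              # nondeterministic guide (SAC)
    "DDPGNoise": "SaGuiCSDet",        # deterministic guide (DDPG)
    "DDPGAdversarial": "SaGuiCSDet",  # deterministic guide (DDPG)
}


def student_algo_for(guide_algo: str) -> str:
    """Return the student algorithm that matches the given guide algorithm."""
    try:
        return STUDENT_FOR_GUIDE[guide_algo]
    except KeyError:
        raise ValueError(f"unknown guide algorithm: {guide_algo!r}") from None
```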
src/scripts/robustness measures the robustness of an agent.
Make sure to provide:
- A config.json file.
- A torch_save/{MODEL_FNAME} file, where MODEL_FNAME usually looks like epoch-XXX.pt.
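Before running the robustness evaluation, it can help to verify that both files are in place. The sketch below assumes that config.json and the torch_save/ folder sit in the same run directory; the helper find_missing_inputs and the run_dir argument are hypothetical and not part of the scripts.

```python
from __future__ import annotations

from pathlib import Path


def find_missing_inputs(run_dir: Path, model_fname: str) -> list[Path]:
    """Return the required input files that are missing from run_dir.

    An empty list means both config.json and torch_save/<model_fname> exist.
    """
    required = [
        run_dir / "config.json",
        run_dir / "torch_save" / model_fname,
    ]
    return [path for path in required if not path.is_file()]
```

An empty result means the run directory contains both files; otherwise the returned paths show what still has to be provided.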
This code is licensed under the terms of the Apache License.