# Model Overview

This repository contains the code for UNETR: Transformers for 3D Medical Image Segmentation [1]. UNETR is the first 3D segmentation network that uses a pure vision transformer as its encoder without relying on CNNs for feature extraction.
The code presents a volumetric (3D) multi-organ segmentation application using the BTCV challenge dataset.

![image](https://lh3.googleusercontent.com/pw/AM-JKLU2eTW17rYtCmiZP3WWC-U1HCPOHwLe6pxOfJXwv2W-00aHfsNy7jeGV1dwUq0PXFOtkqasQ2Vyhcu6xkKsPzy3wx7O6yGOTJ7ZzA01S6LSh8szbjNLfpbuGgMe6ClpiS61KGvqu71xXFnNcyvJNFjN=w1448-h496-no?authuser=0)

### Installing Dependencies

Dependencies can be installed using:

```bash
pip install -r requirements.txt
```

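On a fresh machine you may prefer to do this inside a virtual environment, e.g. (the environment name here is arbitrary):

```bash
python -m venv unetr-env          # create an isolated environment
source unetr-env/bin/activate     # activate it (Linux/macOS)
pip install -r requirements.txt   # install the pinned dependencies
```
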
### Training

A UNETR network with standard hyper-parameters for the task of multi-organ semantic segmentation (BTCV dataset) can be defined as follows:

```python
model = UNETR(
    in_channels=1,
    out_channels=14,
    img_size=(96, 96, 96),
    feature_size=16,
    hidden_size=768,
    mlp_dim=3072,
    num_heads=12,
    pos_embed='perceptron',
    norm_name='instance',
    conv_block=True,
    res_block=True,
    dropout_rate=0.0)
```

The above UNETR model is used for CT images (1-channel input) and 14-class segmentation outputs. The network expects
resampled input images of size `(96, 96, 96)`, which are converted into non-overlapping patches of size `(16, 16, 16)`.
The position embedding is performed using a perceptron layer. The ViT encoder follows the standard hyper-parameters introduced in [2].
The decoder uses convolutional and residual blocks as well as instance normalization. More details can be found in [1].
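
As a quick sanity check, the following sketch (assuming MONAI and PyTorch are installed; the import path is MONAI's) instantiates the model above and verifies the expected input/output shapes:

```python
import torch
from monai.networks.nets import UNETR

# Same task configuration as above: 1-channel CT in, 14-class logits out.
model = UNETR(in_channels=1, out_channels=14, img_size=(96, 96, 96), feature_size=16)

x = torch.randn(1, 1, 96, 96, 96)  # (batch, channel, D, H, W): one resampled CT volume
with torch.no_grad():
    y = model(x)
print(y.shape)  # torch.Size([1, 14, 96, 96, 96]): per-voxel logits for 14 classes
# The ViT encoder internally sees (96/16)^3 = 216 non-overlapping 16x16x16 patches.
```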

Using the default values for hyper-parameters, the following command can be used to initiate training using the PyTorch native AMP package:

```bash
python main.py
--feature_size=32
--batch_size=1
--logdir=unetr_test
--fold=0
--optim_lr=1e-4
--lrschedule=warmup_cosine
--infer_overlap=0.5
--save_checkpoint
--data_dir=/dataset/dataset0/
```

Note that you need to provide the location of your dataset directory by using `--data_dir`.

To initiate distributed multi-GPU training, `--distributed` needs to be added to the training command.

To disable AMP, `--noamp` needs to be added to the training command.

If UNETR is used in distributed multi-GPU training, we recommend increasing the learning rate (i.e. `--optim_lr`)
according to the number of GPUs. For instance, `--optim_lr=4e-4` is recommended for training with 4 GPUs.
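
For example, a 4-GPU run with the learning rate scaled accordingly might look like the following (a sketch; the remaining flags follow the single-GPU command above):

```bash
python main.py
--distributed
--optim_lr=4e-4
--feature_size=32
--batch_size=1
--save_checkpoint
--data_dir=/dataset/dataset0/
```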
### Finetuning

We provide state-of-the-art pre-trained checkpoints and TorchScript models of UNETR trained on the BTCV dataset.

To use the pre-trained checkpoint, please download the weights from the following link:

https://developer.download.nvidia.com/assets/Clara/monai/research/UNETR_model_best_acc.pth

Once downloaded, please place the checkpoint in the following directory, or use `--pretrained_dir` to specify where the model is placed:

`./pretrained_models`
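
If you want to inspect the downloaded checkpoint before training, here is a minimal sketch (the exact contents of the `.pth` file are an assumption; it may be a raw `state_dict` or a dictionary wrapping one):

```python
import torch

# Load on CPU so no GPU is needed just to look inside the file.
ckpt = torch.load('./pretrained_models/UNETR_model_best_acc.pth', map_location='cpu')
keys = ckpt.keys() if isinstance(ckpt, dict) else None
print(type(ckpt), keys)  # e.g. a dict of weight tensors or a 'state_dict' entry
```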

The following command initiates finetuning using the pretrained checkpoint:

```bash
python main.py
--batch_size=1
--logdir=unetr_pretrained
--fold=0
--optim_lr=1e-4
--lrschedule=warmup_cosine
--infer_overlap=0.5
--save_checkpoint
--data_dir=/dataset/dataset0/
--pretrained_dir='./pretrained_models/'
--pretrained_model_name='UNETR_model_best_acc.pth'
--resume_ckpt
```

To use the pre-trained TorchScript model, please download the model from the following link:

https://developer.download.nvidia.com/assets/Clara/monai/research/UNETR_model_best_acc.pt

Once downloaded, please place the TorchScript model in the following directory, or use `--pretrained_dir` to specify where the model is placed:

`./pretrained_models`

The following command initiates finetuning using the TorchScript model:

```bash
python main.py
--batch_size=1
--logdir=unetr_pretrained
--fold=0
--optim_lr=1e-4
--lrschedule=warmup_cosine
--infer_overlap=0.5
--save_checkpoint
--data_dir=/dataset/dataset0/
--pretrained_dir='./pretrained_models/'
--noamp
--pretrained_model_name='UNETR_model_best_acc.pt'
--resume_jit
```

Note that finetuning from the provided TorchScript model does not support AMP.

### Testing

You can use the state-of-the-art pre-trained UNETR checkpoint or TorchScript model to test on your own data.

Once the pretrained weights are downloaded using the links above, please place the TorchScript model in the following directory, or use `--pretrained_dir` to specify where the model is placed:

`./pretrained_models`

The following command runs inference in validation mode (computing segmentation metrics on labeled data) using the provided checkpoint:

```bash
python test.py
--mode='validation'
--infer_overlap=0.5
--data_dir=/dataset/dataset0/
--pretrained_dir='./pretrained_models/'
--saved_checkpoint=ckpt
```

To generate predicted segmentation masks instead, use `--mode='predict'`:

```bash
python test.py
--mode='predict'
--infer_overlap=0.5
--pretrained_dir='./pretrained_models/'
--saved_checkpoint=ckpt
```

Note that `--infer_overlap` determines the overlap between the sliding window patches. A higher value typically results in more accurate segmentation outputs, at the cost of longer inference time.
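
Inference of this kind is typically implemented with MONAI's `sliding_window_inference`; a minimal sketch of how the overlap parameter is used (assuming `model` is a loaded UNETR and `image` is a preprocessed `(1, 1, D, H, W)` tensor):

```python
import torch
from monai.inferers import sliding_window_inference

with torch.no_grad():
    logits = sliding_window_inference(
        inputs=image,            # preprocessed CT volume, shape (1, 1, D, H, W)
        roi_size=(96, 96, 96),   # window size matches the training input size
        sw_batch_size=4,         # number of windows evaluated per forward pass
        predictor=model,
        overlap=0.5,             # fraction of overlap between adjacent windows
    )
pred = torch.argmax(logits, dim=1)  # per-voxel class labels
```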

If you would like to use the pretrained TorchScript model, `--saved_checkpoint=torchscript` should be used.
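
For reference, a TorchScript archive can also be loaded directly with `torch.jit.load`, without the UNETR class definition (a sketch; the path assumes the download step above):

```python
import torch

# Load the serialized TorchScript model; no model source code is required.
model = torch.jit.load('./pretrained_models/UNETR_model_best_acc.pt', map_location='cpu')
model.eval()  # switch to inference mode
```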
### Tutorial

A tutorial for the task of multi-organ segmentation using the BTCV dataset can be found at the following link:

https://github.com/Project-MONAI/tutorials/blob/main/3d_segmentation/unetr_btcv_segmentation_3d.ipynb

Additionally, a tutorial that leverages PyTorch Lightning can be found at the following link:

https://github.com/Project-MONAI/tutorials/blob/main/3d_segmentation/unetr_btcv_segmentation_3d_lightning.ipynb

## Dataset
![image](https://lh3.googleusercontent.com/pw/AM-JKLX0svvlMdcrchGAgiWWNkg40lgXYjSHsAAuRc5Frakmz2pWzSzf87JQCRgYpqFR0qAjJWPzMQLc_mmvzNjfF9QWl_1OHZ8j4c9qrbR6zQaDJWaCLArRFh0uPvk97qAa11HtYbD6HpJ-wwTCUsaPcYvM=w1724-h522-no?authuser=0)
The training data is from the [BTCV challenge dataset](https://www.synapse.org/#!Synapse:syn3193805/wiki/217752).

Under Institutional Review Board (IRB) supervision, 50 abdomen CT scans were randomly selected from a combination of an ongoing colorectal cancer chemotherapy trial and a retrospective ventral hernia study.

- Target: 13 abdominal organs, including the spleen, kidneys, gallbladder, esophagus, liver, stomach, aorta, IVC, portal and splenic veins, pancreas, and adrenal glands
- Task: Segmentation
- Modality: CT
- Size: 30 3D volumes (24 Training + 6 Testing)

We provide the JSON file used to train our models at the following link:

https://developer.download.nvidia.com/assets/Clara/monai/tutorials/swin_unetr_btcv_dataset_0.json

Once the JSON file is downloaded, please place it in the same folder as the dataset.
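
The JSON file is a standard MONAI datalist; here is a sketch of reading it (the file path and key are assumptions based on the link above):

```python
from monai.data import load_decathlon_datalist

# Hypothetical path: the downloaded datalist JSON placed next to the dataset.
datalist = load_decathlon_datalist(
    '/dataset/dataset0/swin_unetr_btcv_dataset_0.json',
    is_segmentation=True,
    data_list_key='training',  # entries used for training
)
print(len(datalist), datalist[0].keys())  # e.g. dict_keys(['image', 'label'])
```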
## Citation

If you find this repository useful, please consider citing the UNETR paper:

```
@article{hatamizadeh2021unetr,
  title={UNETR: Transformers for 3D Medical Image Segmentation},
  author={Hatamizadeh, Ali and Tang, Yucheng and Nath, Vishwesh and Yang, Dong and Myronenko, Andriy and Landman, Bennett and Roth, Holger and Xu, Daguang},
  journal={arXiv preprint arXiv:2103.10504},
  year={2021}
}
```
## References

[1] Hatamizadeh, Ali, et al. "UNETR: Transformers for 3D Medical Image Segmentation", 2021. https://arxiv.org/abs/2103.10504.

[2] Dosovitskiy, Alexey, et al. "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale", 2020. https://arxiv.org/abs/2010.11929.
