This project conducts an in-depth study of how various pretrained image classification models perform when used as encoders for seismic image denoising tasks. Our architecture follows an encoder-decoder structure where:
- Encoder = pretrained model (e.g., EfficientNet, DenseNet, etc.)
- Decoder = UNet++, chosen for its dense skip connections and superior performance in preserving structural information during segmentation tasks.
Seismic images often contain noise that can obscure important geological structures. Our objectives are to:
- Denoise seismic images using a hybrid deep learning pipeline that pairs pretrained encoders with a UNet++ decoder.
- Evaluate the performance of state-of-the-art pretrained models as encoders on seismic image denoising.
- Analyze how different encoder backbones affect the structure preservation capability of UNet++.
- Compare performances to identify top-performing models.
- Task: Image Denoising (Preserving key structures)
- Model Architecture: Encoder–Decoder
- Encoder: Pretrained classification model (from torchvision or timm)
- Decoder: UNet++
- Dataset: Seismic image dataset from Think-Towards Challenge
- Loss Functions: Binary Cross Entropy + Dice Loss
- Evaluation Metrics: PSNR, SSIM, IoU
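
The loss and two of the metrics listed above can be written out as a short NumPy reference. This is a minimal sketch, not the project's training code: the function names, the `eps` smoothing term, the `[0, 1]` value range, and the 0.5 threshold used to binarize images for IoU are all assumptions made for illustration. SSIM is omitted because it is longer to implement; scikit-image's `skimage.metrics.structural_similarity` is one existing option.

```python
import numpy as np


def bce_dice_loss(pred, target, eps=1e-7):
    """Binary cross-entropy plus soft Dice loss for images scaled to [0, 1]."""
    # Clip predictions so log() never sees exactly 0 or 1.
    pred = np.clip(np.asarray(pred, dtype=np.float64), eps, 1.0 - eps)
    target = np.asarray(target, dtype=np.float64)
    bce = -np.mean(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    intersection = np.sum(pred * target)
    dice = (2.0 * intersection + eps) / (np.sum(pred) + np.sum(target) + eps)
    return bce + (1.0 - dice)


def psnr(clean, denoised, data_range=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the clean image."""
    clean = np.asarray(clean, dtype=np.float64)
    denoised = np.asarray(denoised, dtype=np.float64)
    mse = np.mean((clean - denoised) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)


def iou(clean, denoised, threshold=0.5):
    """Intersection over union after binarizing both images (assumed threshold)."""
    a = np.asarray(clean) >= threshold
    b = np.asarray(denoised) >= threshold
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return float(np.logical_and(a, b).sum() / union)


if __name__ == "__main__":
    clean = np.zeros((4, 4))
    clean[:2] = 1.0                # toy "structure": top half bright
    shifted = np.zeros((4, 4))
    shifted[1:3] = 1.0             # same structure moved down one row
    print("loss:", bce_dice_loss(shifted, clean))
    print("PSNR:", psnr(clean, shifted))
    print("IoU :", iou(clean, shifted))
```

In a real training loop the same loss would be expressed in an autograd framework so that gradients can flow back to the encoder; the NumPy version only shows the arithmetic.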
| Model | Year | Type |
|---|---|---|
| AlexNet | 2012 | CNN |
| VGG | 2015 | CNN |
| ResNet | 2015 | CNN |
| DenseNet | 2016 | CNN |
| MobileNet V1/V2 | 2017–2018 | CNN |
| ShuffleNet V2 | 2018 | CNN |
| EfficientNet | 2019 | CNN |
| RegNet | 2020 | CNN |
| RexNet | 2020 | CNN |
| EfficientNet V2 | 2021 | CNN |
| ConvNeXt | 2022 | CNN |
| MaxViT, Swin, ViT | 2020–2022 | Transformer (❌ not tested due to resource constraints) |
✅ Top Performers:
- RexNet
- DenseNet
- EfficientNetV2
❌ Transformers (e.g., Swin, ViT, MaxViT) were excluded due to computational limitations.
We chose UNet++ because its dense skip connections improve gradient flow and help preserve structural detail. This suits segmentation-like tasks such as denoising, where retaining the underlying patterns in seismic data is crucial.
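
To make "dense connections" concrete, the sketch below enumerates the wiring of the nested decoder nodes as described in the original UNet++ formulation: node `(i, j)` concatenates every earlier node on the same resolution level `i` with the upsampled output of node `(i + 1, j - 1)`. The function name and the dictionary layout are hypothetical, and the code only describes the connection pattern; it is not this project's decoder implementation.

```python
def unetpp_node_inputs(depth):
    """Map each nested decoder node (i, j) to the nodes that feed it.

    i is the resolution level (0 = full resolution) and j is the position
    along the skip pathway (j = 0 is the encoder output at that level).
    """
    graph = {}
    for j in range(1, depth):
        for i in range(depth - j):
            graph[(i, j)] = {
                # Dense skips: all earlier nodes at the same resolution.
                "skips": [(i, k) for k in range(j)],
                # One upsampled feature map from the level below.
                "upsampled": (i + 1, j - 1),
            }
    return graph


if __name__ == "__main__":
    # A five-level encoder yields ten nested decoder nodes.
    for node, inputs in sorted(unetpp_node_inputs(5).items()):
        print(node, "<-", inputs["skips"], "+ up", inputs["upsampled"])
```

The final full-resolution node `(0, 4)` therefore sees features from every earlier stage at its own resolution, which is the property the paragraph above credits for structural fidelity.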
Based on our experiments, the following encoder backbones performed best when paired with the UNet++ decoder for seismic image denoising:
- ✅ RexNet
- ✅ DenseNet
- ✅ EfficientNetV2
| Model | Sample Output |
|---|---|
| RexNet | ![RexNet sample output]() |
| DenseNet | ![DenseNet sample output]() |
| EfficientNetV2 | ![EfficientNetV2 sample output]() |


