Commit 6fd6838

Merge pull request #984 from XenaLi96/main
OSRE26 project: Omni-ST
---
title: "Omni-ST: Instruction-Driven Any-to-Any Multimodal Modeling for Spatial Transcriptomics"
date: 2026-01-29
lastmod: 2026-01-29

authors:
- "Xi Li"

tags:
- osre26
- spatial-transcriptomics
- multimodal
- instruction-tuning
- computational-pathology

summary: "A unified instruction-driven multimodal framework that enables any-to-any translation across images, gene expression, spatial graphs, and text in spatial transcriptomics."
---

## Project description

Spatial transcriptomics (ST) integrates spatially resolved gene expression with tissue morphology, enabling the study of cellular organization, tissue architecture, and disease microenvironments. Modern ST datasets are inherently multimodal, combining histology images (H&E / IF), gene expression vectors, spatial graphs, cell annotations, and free-text pathology descriptions.

However, most existing ST methods are task-specific and modality-siloed: separate models are trained for image-to-gene prediction, spatial domain identification, cell type classification, or text-based interpretation. This fragmentation limits cross-task generalization and scalability.

![Omni-ST overview](omni-st-overview.png)

**Omni-ST** proposes a single **instruction-driven any-to-any multimodal backbone** that treats each spatial transcriptomics modality as a “language” and formulates all tasks as:

**Instruction + Input Modality → Output Modality**

Natural language is elevated from auxiliary metadata to a **unifying interface** that specifies task intent, target modality, and biological context. This paradigm enables flexible, interpretable, and extensible spatial reasoning within a single model.
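To make the formulation concrete, here is a minimal sketch of how every task reduces to the same (instruction, input modality) → output modality triple. All names (`STTask`, the task strings) are illustrative assumptions, not the actual Omni-ST API:

```python
from dataclasses import dataclass

# Hypothetical task record: every task is expressed the same way,
# so adding a task means writing an instruction, not a new model head.
@dataclass(frozen=True)
class STTask:
    instruction: str      # natural-language task intent
    input_modality: str   # "image" | "expression" | "graph" | "text"
    output_modality: str  # target modality named by the instruction

# The four core tasks from the description, all in one uniform format.
tasks = [
    STTask("Predict gene expression for this tissue patch.", "image", "expression"),
    STTask("Identify the cell type of this spot.", "expression", "text"),
    STTask("Explain the biology of this region.", "graph", "text"),
    STTask("Retrieve spatial regions matching this description.", "text", "graph"),
]

# A single backbone routes all of these; no per-task heads are needed.
routes = {(t.input_modality, t.output_modality) for t in tasks}
print(sorted(routes))
```

Under this contract, extending the model to a new task is a data change (a new instruction) rather than an architecture change.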
---

### Project Idea: Instruction-Driven Any-to-Any Modeling for Spatial Transcriptomics

**Topics:** spatial transcriptomics, multimodal learning, instruction tuning, computational pathology
**Skills:** PyTorch, deep learning, Transformers, multimodal representation learning
**Difficulty:** Hard
**Size:** 350 hours

**Mentor:**
- **Xi Li** <mailto:xil43@uci.edu>

**Essential information:**
- Design a unified multimodal backbone with lightweight modality adapters for histology images, gene expression vectors, spatial graphs, and text.
- Use natural language instructions to condition model behavior, enabling any-to-any translation without task-specific heads.
- Support core tasks including image → gene expression prediction, gene expression → cell type / spatial domain identification, region → text-based biological explanation, and text-based spatial retrieval.
- Evaluate the model across multiple spatial transcriptomics tasks within a single framework, emphasizing generalization and interpretability.
- Develop visualization and interpretation tools such as spatial maps and language-grounded explanations.
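The adapter idea in the first bullet can be sketched in a few lines. This is a NumPy toy (all widths and names are illustrative assumptions, not the project's actual dimensions): each modality gets a lightweight linear projection into a shared token space, so the backbone sees one uniform token sequence regardless of input modality.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 64  # shared embedding width (illustrative)

# One lightweight linear adapter per modality, projecting raw features
# of different widths into the same D-dimensional token space.
adapter_dims = {"image": 512, "expression": 2000, "graph": 128, "text": 768}
adapters = {m: rng.standard_normal((d_in, D)) / np.sqrt(d_in)
            for m, d_in in adapter_dims.items()}

def embed(modality: str, x: np.ndarray) -> np.ndarray:
    """Project raw modality features (n_tokens, d_in) to (n_tokens, D)."""
    return x @ adapters[modality]

# Heterogeneous inputs become one token sequence for a single backbone.
patch = rng.standard_normal((4, 512))    # 4 image-patch feature vectors
genes = rng.standard_normal((1, 2000))   # 1 spot's expression vector
instr = rng.standard_normal((8, 768))    # 8 instruction-text tokens

tokens = np.concatenate([embed("text", instr),
                         embed("image", patch),
                         embed("expression", genes)], axis=0)
print(tokens.shape)  # → (13, 64)
```

In the actual framework these adapters would be trainable modules (e.g., `nn.Linear` in PyTorch) feeding a shared Transformer; the sketch only shows the shape contract that makes any-to-any routing possible.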
**Expected deliverables:**
- An open-source PyTorch implementation of the Omni-ST framework.
- Unified multitask benchmarks for spatial transcriptomics.
- Visualization and interpretation tools for spatial predictions.
- Documentation and tutorials demonstrating how to add new tasks via instructions.