ucsc-ospo
diff --git a/content/authors/jeanlucabez/_index.md b/content/authors/jeanlucabez/_index.md
Lines changed: 3 additions & 5 deletions
diff --git a/content/authors/jeanlucabez/avatar.jpg b/content/authors/jeanlucabez/avatar.jpg
-266 KB
diff --git a/content/project/osre26/lbl/aidrin/featured.png b/content/project/osre26/lbl/aidrin/featured.png
37 KB
diff --git a/content/project/osre26/lbl/aidrin/index.md b/content/project/osre26/lbl/aidrin/index.md
Lines changed: 22 additions & 0 deletions
diff --git a/content/project/osre26/lbl/drishti/featured.png b/content/project/osre26/lbl/drishti/featured.png
58.2 KB
diff --git a/content/project/osre26/lbl/drishti/index.md b/content/project/osre26/lbl/drishti/index.md
Lines changed: 20 additions & 0 deletions
@@ -15,16 +15,14 @@ role: "Research Scientist, Lawrence Berkeley National Laboratory"
 
 # Organizations/Affiliations
 organizations:
-- name: Scientific Data Management Research
-  url: "https://crd.lbl.gov/divisions/scidata/sdm"
 - name: Computing Sciences Research Division
-  url: "https://crd.lbl.gov"
+  url: "https://cs.lbl.gov"
 - name: Lawrence Berkeley National Laboratory
   url: "https://www.lbl.gov"
 
 
 # Short bio (displayed in user profile at end of posts)
-bio: Jean Luca's research interests are in high-performance computing + I/O + storage. 
+bio: Jean Luca is a Career-Track Research Scientist at Lawrence Berkeley National Laboratory (LBNL), USA. Jean Luca's research interests are in High Performance Computing (HPC), data management, I/O, storage, and AI data readiness.
 
 
 
@@ -35,7 +33,7 @@ bio: Jean Luca's research interests are in high-performance computing + I/O + st
 social:
 - icon: home
   icon_pack: fas
-  link: https://crd.lbl.gov/divisions/scidata/sdm/staff/jean-luca-bez/
+  link: https://profiles.lbl.gov/148621-jean-luca-bez
 - icon: github
   icon_pack: fab
   link: https://github.com/jeanbez
 
@@ -0,0 +1,22 @@
+---
+title: "AI Data Readiness Inspector (AIDRIN)"
+authors: [jeanlucabez, surenbyna]
+author_notes: ["Lawrence Berkeley National Laboratory", "The Ohio State University (OSU)"]
+tags: ["osre26", "uc", "LBNL", "data science", "AI"]
+date: 2026-01-30T10:15:00-07:00
+lastmod: 2026-01-30T10:15:00-07:00
+---
+
+Garbage In, Garbage Out (GIGO) is a widely cited adage in computer science that applies across domains, including Artificial Intelligence (AI). Because data is the fuel for AI, models trained on low-quality or biased data are often ineffective. Computer scientists who use AI invest considerable time and effort in preparing data for it.
+
+[AIDRIN](https://arxiv.org/pdf/2406.19256) (AI Data Readiness INspector) is a framework that provides a quantifiable assessment of data readiness for AI processes, covering a broad range of dimensions from the literature. AIDRIN uses metrics from traditional data quality assessment, such as completeness, outliers, and duplicates, to evaluate data. Furthermore, AIDRIN uses metrics specific to assessing AI data, such as feature importance, feature correlations, class imbalance, fairness, privacy, and compliance with the FAIR (Findability, Accessibility, Interoperability, and Reusability) principles. AIDRIN provides visualizations and reports to assist data scientists in further investigating data readiness.
+
+### AIDRIN Multiple File Formats
+
+The proposed work will improve the AIDRIN framework by (1) adding support for new file formats such as Zarr, ROOT, and HDF5; and (2) allowing users to provide custom data ingestion mechanisms.
+
+- **Topics:** `data readiness`, `AI`, `data analysis`
+- **Skills:** Python, C/C++, data analysis, good communication
+- **Difficulty:** Moderate
+- **Size:** Large (350 hours)
+- **Mentors:** {{% mention jeanlucabez %}} and {{% mention surenbyna %}}
@@ -0,0 +1,20 @@
+---
+title: "Drishti"
+authors: [jeanlucabez, "Suren Byna"]
+author_notes: ["Lawrence Berkeley National Laboratory", "The Ohio State University (OSU)"]
+tags: ["osre26", "uc", "LBNL", "data science", "visualization", "profiling", "tracing"]
+date: 2026-01-30T10:15:00-07:00
+lastmod: 2026-01-30T10:15:00-07:00
+---
+
+[Drishti](https://github.com/hpc-io/drishti) is a novel interactive web-based analysis framework to visualize I/O traces, highlight bottlenecks, and help understand the I/O behavior of scientific applications. Drishti aims to fill the gap between the trace collection, analysis, and tuning phases. The framework contains an interactive I/O trace analysis component that lets end-users visually inspect their applications' I/O behavior, focus on areas of interest, and get a clear picture of common root causes of I/O performance bottlenecks. Building on the automatic detection of I/O performance bottlenecks, the framework maps numerous common and well-known bottlenecks to solution recommendations that users can implement.
+
+### Drishti Comparisons and Heatmaps
+
+The proposed work includes investigating and building a solution to compare two I/O trace files and find the differences between them (similar to a `diff`), covering both the analysis and visualization components. It will also explore additional metrics and counters, such as Darshan heatmaps, in the analysis and visualization components of the framework.
+
+- **Topics:** `I/O`, `HPC`, `data analysis`, `visualization`, `profiling`, `tracing`
+- **Skills:** Python, data analysis, performance profiling
+- **Difficulty:** Moderate
+- **Size:** Large (350 hours)
+- **Mentors:** {{% mention jeanlucabez %}} and [Suren Byna](mailto:sbyna@lbl.gov)