linjc/dataset-viewer

 
 

Dataset Viewer

⚡ Open massive files in seconds · 🔍 Millisecond search · 📦 Direct archive preview

A modern, high-performance dataset viewer built with Tauri, React, and TypeScript. It reads datasets from multiple sources, streams large files (100 GB and up) instead of loading them whole, and provides fast in-file search.

Chinese Documentation · Download · Report Bug · Request Feature

🚀 Key Features

  • ⚡ Instant Large File Opening: Open 100 GB+ files with virtualized rendering, without waiting for a full load
  • 🔍 Millisecond Search: Real-time search with highlighting, fast positioning in large files
  • 📦 Direct Archive Preview: Browse ZIP/TAR files without extraction, streaming file browser
  • 🗂️ Native Multi-Format Support: Optimized rendering for Parquet, Excel, CSV with syntax highlighting for JSON/YAML
  • 🌐 Multi-Source Data Access: WebDAV servers, local files, cloud storage (OSS), HuggingFace datasets
  • 🎨 Modern Interface: Dark/light themes, responsive design, multi-language support
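
To illustrate how search can work on a file that is streamed in chunks, here is a minimal sketch. It is not taken from this project's code: the `ChunkSearcher` class and its `push` method are hypothetical names used only for illustration. The idea is to keep the last `needle.length - 1` characters of the text seen so far, so that a match split across two chunks is still found while only one chunk plus a small tail is held in memory.

```typescript
// Hypothetical helper, not part of dataset-viewer: finds all occurrences of
// `needle` in text that arrives chunk by chunk.
class ChunkSearcher {
  private carry = "";      // tail of the text seen so far (at most needle.length - 1 chars)
  private carryStart = 0;  // absolute offset of carry[0] in the whole text

  constructor(private readonly needle: string) {
    if (needle.length === 0) {
      throw new Error("needle must not be empty");
    }
  }

  // Feed the next chunk; returns absolute offsets (in UTF-16 code units)
  // of every match that ends inside this chunk.
  push(chunk: string): number[] {
    const text = this.carry + chunk;
    const hits: number[] = [];

    let idx = text.indexOf(this.needle);
    while (idx !== -1) {
      hits.push(this.carryStart + idx);
      idx = text.indexOf(this.needle, idx + 1);
    }

    // The carry is shorter than the needle, so it can never contain a
    // complete match on its own and no match is reported twice.
    const keep = Math.min(text.length, this.needle.length - 1);
    this.carryStart += text.length - keep;
    this.carry = text.slice(text.length - keep);
    return hits;
  }
}
```

For example, feeding the chunks `"hello wo"`, `"rld, wor"`, and `"ld"` to `new ChunkSearcher("world")` reports offsets 6 and 13, even though both matches straddle a chunk boundary. Offsets are counted in string positions, so a byte-oriented reader would need its own mapping back to file positions.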

📚 Supported File Types

  • 📄 Text & Code: Plain text, JSON, YAML, XML, JavaScript, Python, Java, C/C++, Rust, Go, PHP, and more

  • 📝 Documents: Markdown (rendered preview), Word Documents (.docx/.rtf, text extraction), PowerPoint (.pptx, slide preview), PDF (viewer with text search)

  • 📦 Archives: ZIP, TAR (streaming preview without extraction)

  • 📊 Data Files: Parquet (optimized), Excel, CSV, ODS with virtual scrolling for millions of rows

  • 📱 Media: Images, Videos, Audio (preview support)

📸 Screenshots

  • 🔗 Connection Setup: Easy connection management with multiple storage types
  • 📊 JSON Viewer: Structured JSON display with syntax highlighting and collapsible nodes
  • 💻 Code Viewer: Multi-language syntax highlighting with large file support
  • 📋 Data Sheets: CSV/Excel visualization with filtering and sorting capabilities
  • 🌐 Point Cloud Viewer: Interactive 3D point cloud data visualization
  • 📦 Archive Browser: Browse ZIP/TAR archives without extraction

✨ Technical Highlights

  • 🤖 100% AI-Generated: Entire codebase created through AI assistance
  • 🚀 Native Performance: Tauri (Rust) backend + React frontend, cross-platform support
  • 🧠 Smart Memory Management: Chunked loading and virtual scrolling, so tables with millions of rows stay responsive
  • 📊 Streaming Processing: Large files are transferred in chunks, and compressed files are read without full extraction
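
As a rough illustration of the virtual scrolling idea mentioned above, the sketch below computes which rows need to be rendered for a given scroll position. It assumes fixed-height rows, and the `visibleRange` function and `VisibleRange` type are hypothetical names, not this project's actual API. Only the rows in the returned range are mounted, which is why the cost of rendering depends on the viewport size and not on the total row count.

```typescript
// Hypothetical helper, not part of dataset-viewer: fixed-row-height windowing.
interface VisibleRange {
  start: number;   // index of the first row to render (inclusive)
  end: number;     // index after the last row to render (exclusive)
  offsetY: number; // pixel offset at which the first rendered row is placed
}

function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  totalRows: number,
  overscan: number = 5, // extra rows above and below to reduce flicker while scrolling
): VisibleRange {
  if (rowHeight <= 0 || totalRows <= 0) {
    return { start: 0, end: 0, offsetY: 0 };
  }
  // First row that intersects the top of the viewport.
  const first = Math.floor(Math.max(0, scrollTop) / rowHeight);
  // Number of rows needed to cover the viewport height.
  const visibleCount = Math.ceil(viewportHeight / rowHeight);

  const start = Math.max(0, Math.min(totalRows, first) - overscan);
  const end = Math.min(totalRows, first + visibleCount + overscan);
  return { start, end, offsetY: start * rowHeight };
}
```

With one million rows of 20 px each in a 600 px viewport, `visibleRange(2000, 600, 20, 1_000_000)` yields rows 95 to 135 at a 1900 px offset, so about 40 rows are rendered at any time. Variable row heights would need a different approach, such as a prefix-sum index of measured heights.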

🎯 Perfect For

  • 📊 Data Scientists: Quickly explore large datasets, Parquet files, and CSV data
  • 🔍 Log Analysis: Search through massive log files without loading everything into memory
  • 📦 Archive Management: Browse ZIP/TAR contents without extraction
  • ☁️ Remote Data: Access files from WebDAV servers, cloud storage, and HuggingFace
  • 🚀 Performance Critical: When you need instant file access and lightning-fast search

🤝 Contributing

We welcome contributions! Here's how you can help:

  • 🐛 Bug Reports: Open an issue with a clear description and steps to reproduce
  • 💡 Feature Requests: Suggest new features and explain why they would be useful
  • 🔧 Code Contributions: Fork → Create feature branch → Make changes → Submit PR
  • 📖 Documentation: Help improve our docs and examples
  • ⭐ Star the project: Show your support by starring the repository

🙏 Acknowledgments

  • 🤖 AI Development: This project showcases the power of AI-assisted development
  • 🛠 Tauri Team: For creating an amazing framework
  • ⚛️ React Community: For the excellent ecosystem
  • 🦀 Rust Community: For the robust language and tools

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


Made with ❤️ and 🤖 AI

About

A sleek dataset viewer built entirely by AI Agent. Supports streaming large files from WebDAV, OSS, local or Hugging Face.

Languages

  • TypeScript 70.9%
  • Rust 28.2%
  • Other 0.9%