Extremely slow image uploads and thumbnail generation on v8.2.1 (GridFS) despite massive hardware scaling and DB indexing #40011
Description:
We are experiencing severe delays when users upload media files, especially large images. Both the upload itself and the subsequent thumbnail generation take an unreasonably long time before the image renders in the chat.
We have completely ruled out infrastructure bottlenecks. We drastically scaled our environment and applied known community workarounds for GridFS and Node.js memory limits, but the application still behaves sluggishly during media uploads, suggesting a bottleneck in UploadFS/GridFS or the internal image processing (Sharp).
Steps to reproduce:
- Go to any channel or direct message.
- Upload a heavy image file (e.g., 5MB - 15MB).
- Observe the extreme delay during the upload phase and the time it takes for the thumbnail to finally render in the chat timeline.
Expected behavior:
Image uploads and OEmbed/Thumbnail generation should process quickly, utilizing the available hardware resources without hanging the user experience.
Actual behavior:
The upload and thumbnail generation take a very long time (often 30+ seconds). During this time, the host OS shows plenty of idle CPU and abundant free RAM, meaning the Rocket.Chat container/Node.js is bottlenecking internally and not utilizing the available hardware to process the image.
Server Setup Information:
- Version of Rocket.Chat Server: 8.2.1
- License Type: Community
- Number of Users: 50
- Operating System: AlmaLinux 10.1
- Deployment Method: Docker Compose
- Number of Running Instances: 1
- DB Replicaset Oplog: Enabled (rs0)
- NodeJS Version: Bundled with 8.2.1 Docker image
- MongoDB Version: 8.0
Client Setup Information
- Desktop App or Browser Version: Affects all clients (Desktop App and latest Chrome/Firefox browsers)
- Operating System: Windows / Linux / macOS
Additional context
Troubleshooting steps we already took (without success):
- Hardware Scale-Up: Increased the VM to 4 vCPUs and 8GB of RAM. The server is completely idle and there is no swap usage (over 5.5GB of RAM completely free).
- Node.js Tuning: Added `NODE_OPTIONS=--max-old-space-size=4096` to the environment variables to prevent Garbage Collector thrashing and allow the V8 engine to use the abundant RAM.
- MongoDB Indexing: We manually created the GridFS chunks index inside MongoDB as suggested in older community threads (`db.rocketchat_uploads.chunks.createIndex( { files_id: 1, n: 1 }, { unique: true } )`).
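To confirm that the manually created index is actually in place, the index list of the chunks collection can be checked for a unique `{ files_id: 1, n: 1 }` entry. The sketch below is only an illustration: `hasGridfsChunksIndex` is a hypothetical helper (not part of Rocket.Chat or the MongoDB driver), and the sample data assumes the usual shape of `getIndexes()` output.

```javascript
// Hypothetical helper (not part of Rocket.Chat or the MongoDB driver):
// returns true when an index list, shaped like the output of
// db.collection.getIndexes(), contains a unique { files_id: 1, n: 1 } index.
function hasGridfsChunksIndex(indexes) {
  return indexes.some((idx) => {
    const entries = Object.entries(idx.key || {});
    return (
      idx.unique === true &&
      entries.length === 2 &&
      entries[0][0] === 'files_id' && Number(entries[0][1]) === 1 &&
      entries[1][0] === 'n' && Number(entries[1][1]) === 1
    );
  });
}

// Sample data assumed to mirror getIndexes() output for a chunks collection.
const sampleIndexes = [
  { v: 2, key: { _id: 1 }, name: '_id_' },
  { v: 2, key: { files_id: 1, n: 1 }, name: 'files_id_1_n_1', unique: true },
];

console.log(hasGridfsChunksIndex(sampleIndexes));             // index present
console.log(hasGridfsChunksIndex(sampleIndexes.slice(0, 1))); // only _id index
```

In mongosh, the same function could be pasted in and called as `hasGridfsChunksIndex(db.rocketchat_uploads.chunks.getIndexes())`, assuming the collection name used above.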
Despite having a highly optimized and idle infrastructure, the image processing inside the container remains extremely slow. This points to an application-level limitation with GridFS chunking or the Sharp image processing library in this specific version.
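For a sense of scale on the GridFS chunking side, here is a rough sketch of how many chunk documents a single upload would be split into. It assumes the MongoDB GridFS default chunk size of 255 KiB, which the store used by Rocket.Chat may override, and it treats the 5MB and 15MB test sizes as MiB. `gridfsChunkCount` is a hypothetical helper written for this illustration.

```javascript
// Assumption: 255 KiB is the MongoDB GridFS default chunk size; the store
// used by Rocket.Chat may be configured with a different value.
const DEFAULT_CHUNK_SIZE_BYTES = 255 * 1024;

// Hypothetical helper: number of chunk documents needed for one file.
function gridfsChunkCount(fileSizeBytes, chunkSizeBytes = DEFAULT_CHUNK_SIZE_BYTES) {
  if (!Number.isInteger(fileSizeBytes) || fileSizeBytes < 0) {
    throw new RangeError('fileSizeBytes must be a non-negative integer');
  }
  if (!Number.isInteger(chunkSizeBytes) || chunkSizeBytes <= 0) {
    throw new RangeError('chunkSizeBytes must be a positive integer');
  }
  return Math.ceil(fileSizeBytes / chunkSizeBytes);
}

const MiB = 1024 * 1024;
for (const sizeMiB of [5, 15]) {
  // Prints 21 chunks for 5 MiB and 61 chunks for 15 MiB.
  console.log(`${sizeMiB} MiB -> ${gridfsChunkCount(sizeMiB * MiB)} chunks`);
}
```

Under these assumptions, even the largest test image maps to a few dozen chunk writes, which may help when comparing the observed 30+ second delay against debug-level timings for the GridFS and Sharp stages.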
Relevant logs:
There are no specific crash logs or stack traces during the upload, just the severe delay in processing. Server logs show normal operation, but the rendering time is exceptionally high.
(I can provide debug-level logs if requested by the engineering team).