Proposal: File Descriptor and Conditional Response Support for RSGI #816

@myers

(revised after thinking about this overnight)

Problem

When a Python RSGI application serves a file, it typically needs to open and stat the file for metadata (Content-Length, ETag, Last-Modified), then pass the path to response_file(). Granian re-opens the same file by path. This double-open is unnecessary and introduces a Time-of-Check-to-Time-of-Use (TOCTOU) window — the file could be replaced between Python's fstat and Granian's open, making the metadata headers inconsistent with the served content.

Current syscall trace:

Python (GIL held between syscalls):
  openat          ← open(path, "rb")
  fstat           ← os.fstat() for ETag/mtime/size
  lseek × N       ← framework seek/tell for Content-Length
  close           ← response cleanup

Rust (no GIL):
  openat          ← File::open(&path) — redundant
  read × N        ← ReaderStream chunks
  close
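The window can be reproduced in a few lines. The sketch below is not Granian code; it only imitates the two halves of the trace, with a temporary file and an `os.replace` standing in for a concurrent deploy. File names and contents are made up for illustration, and it assumes a POSIX system where an open file can be replaced.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "asset.txt")
    with open(path, "wb") as f:
        f.write(b"old content")

    # Python side: open + fstat to build Content-Length.
    fd = os.open(path, os.O_RDONLY)
    advertised_length = os.fstat(fd).st_size

    # Meanwhile, another process atomically replaces the file.
    replacement = os.path.join(tmp, "asset.txt.new")
    with open(replacement, "wb") as f:
        f.write(b"new, longer content")
    os.replace(replacement, path)

    # Server side today: re-open by path, which now resolves to the new file.
    with open(path, "rb") as f:
        served_by_path = f.read()

    # Reading from the original descriptor still yields the file that was stat'ed.
    served_by_fd = os.read(fd, 1024)
    os.close(fd)

# The advertised Content-Length no longer matches the body served by path.
mismatch = advertised_length != len(served_by_path)
```

Streaming from the descriptor that was stat'ed keeps headers and body consistent, which is the property both proposals aim for.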

Two proposals below, at different scope levels. Either would be useful; both are additive and don't change existing methods.

Current Implementation

response_file() accepts a path string, opens the file, and streams it via ReaderStream:

// src/rsgi/types.rs — PyResponseFile::to_response()
match File::open(&self.file_path).await {
    Ok(file) => {
        let stream = ReaderStream::with_capacity(file, 131_072);
        // ... wrap in hyper::Response ...
    }
    Err(_) => response_404()
}

Metadata headers (Content-Length, ETag, Last-Modified) are passed through from Python via the headers parameter. Because the API accepts a path, Python must open the file separately to obtain this metadata.


Proposal 1: response_file_fd() — Accept a File Descriptor

Accept an already-open file descriptor instead of a path string. Python passes ownership of the fd; Granian streams from it directly.

Python API

protocol.response_file_fd(status, headers, fd)
protocol.response_file_fd_range(status, headers, fd, start, end)

Python would typically os.dup() the fd before passing it, so the framework can close its own copy independently.
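A minimal sketch of the hand-off on the Python side, assuming a POSIX system. `fd_for_handoff` is a hypothetical helper, and the final `os.read` only stands in for Granian streaming from the descriptor. One detail worth noting: `os.dup()` shares the file offset with the original descriptor, so if the framework has already seeked (for example to measure Content-Length), the duplicate should be rewound before it is passed on.

```python
import os
import tempfile

def fd_for_handoff(f) -> int:
    """Return a duplicate descriptor, rewound, that the server may own and close."""
    dup_fd = os.dup(f.fileno())
    # The duplicate shares its offset with f, so rewind before handing it over.
    os.lseek(dup_fd, 0, os.SEEK_SET)
    return dup_fd

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "asset.bin")
    with open(path, "wb") as out:
        out.write(b"hello world")

    with open(path, "rb") as f:
        size = f.seek(0, os.SEEK_END)   # framework measures Content-Length
        fd = fd_for_handoff(f)
    # f is closed here; the duplicate stays valid.

    # With the proposed API this would be:
    #   protocol.response_file_fd(200, headers, fd)
    # Stand-in for the server reading from the fd it now owns:
    body = os.read(fd, 1024)
    os.close(fd)
```

Without the `lseek`, the duplicate would start at end-of-file in this example and the server would stream an empty body.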

Rust — pyo3 methods

fn response_file_fd(&self, status: u16, headers: Vec<(PyBackedStr, PyBackedStr)>, fd: i32) {
    if let Some(tx) = self.tx.lock().unwrap().take() {
        _ = tx.send(PyResponse::FileFd(PyResponseFileFd::new(status, headers, fd)));
    }
}

fn response_file_fd_range(
    &self, status: u16, headers: Vec<(PyBackedStr, PyBackedStr)>,
    fd: i32, start: u64, end: u64,
) -> PyResult<()> {
    if start >= end {
        return Err(pyo3::exceptions::PyValueError::new_err("Invalid range"));
    }
    if let Some(tx) = self.tx.lock().unwrap().take() {
        _ = tx.send(PyResponse::FileFdRange(
            PyResponseFileFdRange::new(status, headers, fd, start, end)
        ));
    }
    Ok(())
}

Rust — to_response() (full file)

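// Unix-only: from_raw_fd comes from std::os::fd::FromRawFd.
pub async fn to_response(self) -> hyper::Response<HTTPResponseBody> {
    // SAFETY: self.fd must be an open descriptor whose ownership Python handed
    // over; the File closes it on drop.
    let std_file = unsafe { std::fs::File::from_raw_fd(self.fd) };
    let file = tokio::fs::File::from_std(std_file);
    // Reading starts at the fd's current offset, which a dup()'d fd shares with
    // the original, so the caller must rewind (or this method must seek to 0).
    let stream = ReaderStream::with_capacity(file, 131_072);
    // ... same as current response_file ...
}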

Rust — to_response() (range)

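// Assumes std::io::SeekFrom and tokio::io::{AsyncReadExt, AsyncSeekExt} are in scope.
pub async fn to_response(self) -> hyper::Response<HTTPResponseBody> {
    // SAFETY: self.fd must be an open descriptor whose ownership Python handed over.
    let std_file = unsafe { std::fs::File::from_raw_fd(self.fd) };
    let mut file = tokio::fs::File::from_std(std_file);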

    if file.seek(SeekFrom::Start(self.start)).await.is_err() {
        return response_500();
    }
    let take = file.take(self.end - self.start);
    let stream = ReaderStream::with_capacity(take, 131_072);
    // ... same as current response_file_range ...
}

Scope

  • New PyResponse::FileFd and PyResponse::FileFdRange enum variants
  • New PyResponseFileFd and PyResponseFileFdRange structs (fd: i32 instead of file_path: String)
  • New pyo3 methods: response_file_fd(), response_file_fd_range()
  • ~40 lines of new Rust code
  • Zero changes to existing methods

What changes

Before                                           | After
-------------------------------------------------|-----------------------------------
Rust: openat (redundant re-open)                 | Rust: from_raw_fd (no syscall)
TOCTOU window between Python stat and Rust open  | Same fd throughout
Python closes fd, Rust opens a new one           | Rust owns the fd, closes when done

All HTTP semantics (ETag, conditional checks, range parsing) remain in Python. Rust just streams bytes from the fd it receives.


Proposal 2: response_file_conditional() — Conditional + Range in Rust

Move conditional response logic and range handling into Rust. Python passes the file path plus raw request headers. Granian does open + fstat + ETag generation + conditional checks + range handling + serving, all against a single open file handle and outside the GIL.

Python API

protocol.response_file_conditional(
    status,
    headers,
    file_path,                       # str
    if_none_match=etag_header,       # str | None
    if_modified_since=ims_header,    # str | None
    range_header=range_header,       # str | None
    if_range=if_range_header,        # str | None
)

Decision tree

Five possible outcomes from a single call (200, 206, 304, and 416 in the tree below, plus 404 when the file cannot be opened):

request arrives
  │
  ├─ If-None-Match matches ETag?        → 304 Not Modified
  ├─ If-Modified-Since ≥ mtime?         → 304 Not Modified (checked only when If-None-Match is absent)
  │
  ├─ Range header present?
  │   ├─ If-Range present and mismatches? → 200 (full file, ignore range)
  │   ├─ Range unsatisfiable?             → 416 Range Not Satisfiable
  │   ├─ Range valid (single)?            → 206 Partial Content
  │   └─ Range invalid/multiple?          → 200 (full file, ignore range)
  │
  └─ No Range header                     → 200 (full file)
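The tree can be written down as a small pure function, which makes the precedence rules easy to test in isolation. This is a Python reference model rather than the proposed Rust code; `decide_status` and its parameter names are hypothetical, and the inputs are assumed to be precomputed by the header parsers.

```python
from typing import Optional

def decide_status(
    inm_present: bool,             # If-None-Match header was sent
    inm_matches: bool,             # ...and it matches the current ETag
    ims_not_modified: bool,        # If-Modified-Since >= mtime
    range_kind: Optional[str],     # None, "single", "unsatisfiable", "invalid", "multiple"
    if_range_ok: Optional[bool],   # None when If-Range is absent
) -> int:
    # Conditional checks first. If-Modified-Since is consulted only when
    # If-None-Match is absent.
    if inm_present:
        if inm_matches:
            return 304
    elif ims_not_modified:
        return 304

    if range_kind is None:
        return 200
    if if_range_ok is False:
        return 200          # validator mismatch: ignore Range, send full file
    if range_kind == "unsatisfiable":
        return 416
    if range_kind == "single":
        return 206
    return 200              # invalid or multi-range: ignore Range
```

For example, `decide_status(True, False, True, None, None)` yields 200: a failed If-None-Match is not rescued by a passing If-Modified-Since.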

Rust implementation sketch

pub async fn to_response(self) -> hyper::Response<HTTPResponseBody> {
    // 1. Open + stat
    let file = match File::open(&self.file_path).await {
        Ok(f) => f,
        Err(_) => return response_404(),
    };
    let metadata = match file.metadata().await {
        Ok(m) => m,
        Err(_) => return response_500(),
    };
    let mtime = metadata.modified().unwrap_or(SystemTime::UNIX_EPOCH);
    let size = metadata.len();
    let etag = format!("\"{:x}-{:x}\"", mtime_nanos(mtime), size);

    // 2. Merge server-generated headers (don't override app-set values)
    let mut headers = self.headers.clone();
    headers.entry("etag").or_insert(etag.parse().unwrap());
    headers.entry("last-modified").or_insert(http_date(mtime));
    headers.entry("accept-ranges").or_insert("bytes".parse().unwrap());

    // 3. Conditional checks (RFC 7232 §6)
    if let Some(ref inm) = self.if_none_match {
        if etag_matches(inm, &etag) {
            return response_304(headers);
        }
    } else if let Some(ref ims) = self.if_modified_since {
        if !modified_since(ims, mtime) {
            return response_304(headers);
        }
    }

    // 4. Range handling
    if let Some(ref range_header) = self.range_header {
        let range_applies = match &self.if_range {
            None => true,
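            Some(if_range) => {
                if if_range.starts_with("W/") {
                    // If-Range uses strong comparison (RFC 7233 §3.2):
                    // a weak validator never matches.
                    false
                } else if if_range.starts_with('"') {
                    if_range.trim() == etag
                } else {
                    // A date validator must equal Last-Modified exactly, at
                    // one-second resolution (truncate_to_secs is a helper
                    // not shown here).
                    match parse_http_date(if_range) {
                        Some(d) => d == truncate_to_secs(mtime),
                        None => false,
                    }
                }
            }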
        };

        if range_applies {
            match parse_range(range_header, size) {
                RangeResult::Single(start, end) => {
                    headers.insert("content-range",
                        format!("bytes {}-{}/{}", start, end, size).parse().unwrap());
                    headers.insert("content-length",
                        (end - start + 1).to_string().parse().unwrap());
                    let mut file = file;
                    if file.seek(SeekFrom::Start(start)).await.is_err() {
                        return response_500();
                    }
                    let take = file.take(end - start + 1);
                    let stream = ReaderStream::with_capacity(take, 131_072);
                    return response_with_body(206, headers, stream);
                }
                RangeResult::Unsatisfiable => {
                    headers.insert("content-range",
                        format!("bytes */{}", size).parse().unwrap());
                    return response_empty(416, headers);
                }
                RangeResult::Invalid | RangeResult::Multiple => {
                    // Fall through to full-file path
                }
            }
        }
    }

    // 5. Full file (200)
    headers.insert("content-length", size.to_string().parse().unwrap());
    let stream = ReaderStream::with_capacity(file, 131_072);
    response_with_body(200, headers, stream)
}
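The sketch leaves `parse_range` undefined. As a reference model for its behavior, here is one possible Python version covering single byte ranges only. The tuple return values mirror the `RangeResult` variants above but are otherwise an assumption, as is the choice to report any comma-separated header as "multiple" without validating each part.

```python
import re

_SPEC = re.compile(r"^([0-9]*)-([0-9]*)$")

def parse_range(header: str, size: int):
    """Return ("single", start, end) with an inclusive end, or a 1-tuple
    ("unsatisfiable",), ("invalid",) or ("multiple",)."""
    unit, sep, spec = header.partition("=")
    if not sep or unit.strip().lower() != "bytes":
        return ("invalid",)
    if "," in spec:
        return ("multiple",)
    m = _SPEC.match(spec.strip())
    if m is None:
        return ("invalid",)
    first, last = m.groups()
    if first == "" and last == "":
        return ("invalid",)
    if first == "":
        # Suffix form "-N": the last N bytes of the file.
        n = int(last)
        if n == 0 or size == 0:
            return ("unsatisfiable",)
        return ("single", max(size - n, 0), size - 1)
    start = int(first)
    if last != "" and int(last) < start:
        return ("invalid",)
    if start >= size:
        return ("unsatisfiable",)
    # Open-ended "N-" or an end past EOF is clamped to the last byte.
    end = size - 1 if last == "" else min(int(last), size - 1)
    return ("single", start, end)
```

With an inclusive `end`, the partial response's Content-Length is `end - start + 1`, matching the arithmetic in the Rust sketch.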

Scope

  • New PyResponse::FileConditional enum variant and struct
  • ETag generation from mtime + size (~5 lines)
  • If-None-Match parsing and matching (~15 lines)
  • If-Modified-Since date parsing (~10 lines, or httpdate crate)
  • Range header parsing (~40 lines)
  • If-Range evaluation (~15 lines)
  • Response construction for all five outcomes
  • ~120-150 lines of new Rust code
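To give a sense of how small the first two items are, here is a Python model of ETag generation and If-None-Match matching. `make_etag` follows the format string used in the sketch; `etag_matches` is a hypothetical helper that applies weak comparison, which is what If-None-Match calls for. Splitting the header on commas is a simplification that assumes entity tags contain no commas, which holds for tags in this hex format.

```python
def make_etag(mtime_ns: int, size: int) -> str:
    # Same shape as format!("\"{:x}-{:x}\"", mtime_nanos, size) in the sketch.
    return '"{:x}-{:x}"'.format(mtime_ns, size)

def etag_matches(if_none_match: str, etag: str) -> bool:
    """Weak comparison: a W/ prefix is ignored on either side."""
    def opaque(tag: str) -> str:
        tag = tag.strip()
        return tag[2:] if tag.startswith("W/") else tag

    if if_none_match.strip() == "*":
        return True
    target = opaque(etag)
    # Simplification: assumes no commas inside the quoted tags.
    return any(opaque(candidate) == target for candidate in if_none_match.split(","))
```

If-Range is different: it requires strong comparison, so a `W/`-prefixed validator there must never match.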

What changes

Metric                           | Current                                | Proposal 2
---------------------------------|----------------------------------------|------------------------------------------
Python syscalls for file serving | ~13                                    | ~6 (lstat from path validation only)
Rust syscalls                    | openat + reads                         | openat + fstat + reads (or just 304)
GIL held during file I/O         | openat + fstat + lseek × N             | None
TOCTOU race                      | Between Python stat and Rust open      | None (same fd)
304 responses                    | Full Python handler cycle              | Short-circuit in Rust

Header precedence

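If the application explicitly sets ETag or Last-Modified headers, Granian should use those values rather than generating its own. The headers.entry(...).or_insert(...) pattern in the sketch above handles this for the response headers. The conditional checks in steps 3 and 4 still compare against the generated etag and the file's mtime, so a full implementation would need to evaluate If-None-Match and If-Range against the application-supplied validators when they are present.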
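The precedence rule can be modelled on RSGI-style header lists, that is, lists of `(name, value)` string pairs. The helper names below are hypothetical. The sketch merges generated defaults without overriding application values, and then looks up the effective ETag, which is the value any conditional check would need to use.

```python
def merge_generated(app_headers, generated):
    """Append generated headers only where the application set none (case-insensitive)."""
    present = {name.lower() for name, _ in app_headers}
    return list(app_headers) + [
        (name, value) for name, value in generated if name.lower() not in present
    ]

def header_value(headers, name):
    """Return the first value for a header name, or None."""
    wanted = name.lower()
    for n, v in headers:
        if n.lower() == wanted:
            return v
    return None

app_headers = [("ETag", '"custom-v2"'), ("content-type", "text/plain")]
generated = [("etag", '"ff-10"'), ("accept-ranges", "bytes")]

merged = merge_generated(app_headers, generated)
effective_etag = header_value(merged, "etag")
```

Here the application's `"custom-v2"` tag survives and only `accept-ranges` is added.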


Comparison

Aspect                      | Proposal 1                         | Proposal 2
----------------------------|------------------------------------|------------------------------------------
Rust changes                | ~40 lines                          | ~120-150 lines
New dependencies            | None                               | Possibly httpdate
HTTP semantics in Rust      | None; Python handles everything    | ETag, conditional, range (RFC 7232/7233)
Eliminates double-open      | Yes                                | Yes
Eliminates TOCTOU           | Yes                                | Yes
Moves file I/O out of GIL   | No; Python still opens + stats     | Yes; Python does path validation only
304 without Python I/O      | No                                 | Yes

What Python retains responsibility for in both proposals

  • Path validation and traversal protection
  • Authorization (who can access files)
  • Content-Type detection
  • Content-Encoding
  • Custom headers (Cache-Control, Content-Disposition)

What moves to Rust in Proposal 2

  • ETag generation (from mtime + size)
  • Last-Modified header
  • If-None-Match / If-Modified-Since → 304
  • Range parsing, If-Range evaluation → 206 / 416
  • Accept-Ranges, Content-Range, Content-Length for partial responses
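One detail in the Last-Modified and If-Modified-Since handling is resolution: HTTP dates carry whole seconds, while file mtimes usually have a sub-second part, so the comparison has to truncate the mtime first. A Python model using only the standard library follows; the function names are hypothetical, chosen to echo the helpers in the Rust sketch.

```python
from email.utils import formatdate, parsedate_to_datetime

def http_date(mtime: float) -> str:
    """Format a POSIX timestamp as an HTTP date (whole seconds, GMT)."""
    return formatdate(mtime, usegmt=True)

def not_modified_since(if_modified_since: str, mtime: float) -> bool:
    """True when the file has not changed since the client's date (304 case)."""
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return False   # unparseable date: ignore the header
    # Truncate mtime to whole seconds before comparing.
    return int(mtime) <= since.timestamp()
```

Without the truncation, a client echoing back the exact Last-Modified value it was given would rarely get a 304, because the stored mtime is a fraction of a second later than the formatted date.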

Implementation Offer

I can implement either proposal as a PR. Proposal 1 is straightforward; Proposal 2 would benefit from your input on whether you'd prefer the httpdate crate or a minimal inline parser for HTTP dates.
