Overview
Implement intelligent header detection that matches BigQuery's behavior instead of always treating row 0 as the header.
Current Behavior
The emulator always treats row 0 as the header, regardless of the content.
Location: server/handler.go:975-981
Expected Behavior
BigQuery uses intelligent detection: "If the first line contains only strings, and the other lines contain other data types, BigQuery assumes that the first row is a header row."
If the first row does not meet these criteria, BigQuery treats it as data and generates column names (e.g., col_0, col_1, etc.).
Implementation Requirements
- Check if row 0 contains only string-like values (no obvious numbers, dates, etc.)
- Check if rows 1+ contain non-string types (numbers, dates, timestamps, etc.)
- If both conditions are met → row 0 is header
- Otherwise → row 0 is data, generate column names like col_0, col_1, col_2, etc.
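The requirements above could be sketched roughly as follows. This is a minimal illustration, not the emulator's actual code: the names looksTyped, detectHeader and resolveColumns are hypothetical, and the set of values counted as "non-string" (booleans, numbers, a few date/timestamp layouts) is an assumption rather than BigQuery's real inference rules. An all-string file falls through to "no header" here, which is only one possible way to resolve that case.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// Layouts treated as date/timestamp values. This list is an assumption
// made for the sketch, not an exhaustive match of BigQuery's formats.
var timeLayouts = []string{"2006-01-02", "2006-01-02 15:04:05", time.RFC3339}

// looksTyped reports whether a cell parses as something other than a plain string.
func looksTyped(cell string) bool {
	s := strings.TrimSpace(cell)
	if s == "" {
		return false
	}
	if strings.EqualFold(s, "true") || strings.EqualFold(s, "false") {
		return true
	}
	// Require a digit so words such as "inf" or "nan" are not counted as numbers.
	if strings.ContainsAny(s, "0123456789") {
		if _, err := strconv.ParseFloat(s, 64); err == nil {
			return true
		}
	}
	for _, layout := range timeLayouts {
		if _, err := time.Parse(layout, s); err == nil {
			return true
		}
	}
	return false
}

// detectHeader returns true only when row 0 is entirely string-like
// and at least one later row contains a typed value.
func detectHeader(rows [][]string) bool {
	if len(rows) < 2 {
		return false
	}
	for _, cell := range rows[0] {
		if looksTyped(cell) {
			return false
		}
	}
	for _, row := range rows[1:] {
		for _, cell := range row {
			if looksTyped(cell) {
				return true
			}
		}
	}
	return false
}

// resolveColumns splits rows into column names and data rows,
// generating col_0, col_1, ... when row 0 is not a header.
func resolveColumns(rows [][]string) ([]string, [][]string) {
	if len(rows) == 0 {
		return nil, nil
	}
	if detectHeader(rows) {
		return rows[0], rows[1:]
	}
	names := make([]string, len(rows[0]))
	for i := range names {
		names[i] = fmt.Sprintf("col_%d", i)
	}
	return names, rows
}

func main() {
	names, data := resolveColumns([][]string{{"name", "age", "score"}, {"Alice", "25", "95.5"}})
	fmt.Println(names, len(data)) // [name age score] 1

	names, data = resolveColumns([][]string{{"1", "2", "3"}, {"4", "5", "6"}})
	fmt.Println(names, len(data)) // [col_0 col_1 col_2] 2
}
```

Keeping the per-cell check in its own function means the list of recognized types can be tightened later without touching the header decision itself.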
Test Cases
All-numeric CSV (no header):
→ Row 0 should be data, not header. Columns should be named col_0, col_1, col_2
String header + numeric data:
name,age,score
Alice,25,95.5
→ Row 0 should be header
All-string CSV:
Alice,Bob,Charlie
Dave,Eve,Frank
→ Ambiguous case - may need additional heuristics or user configuration
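One way to handle the ambiguous case is to let the caller override the heuristic. The sketch below assumes a hypothetical three-state setting (HeaderMode); it is not an existing emulator or BigQuery option, and the detected argument stands in for whatever the detection heuristic returns.

```go
package main

import "fmt"

// HeaderMode is a hypothetical user-facing setting, introduced here for illustration only.
type HeaderMode int

const (
	HeaderAuto    HeaderMode = iota // defer to the detection heuristic
	HeaderPresent                   // caller asserts that row 0 is a header
	HeaderAbsent                    // caller asserts that row 0 is data
)

// firstRowIsHeader combines the explicit setting with the heuristic's verdict.
// An explicit setting always wins; HeaderAuto passes the verdict through.
func firstRowIsHeader(mode HeaderMode, detected bool) bool {
	switch mode {
	case HeaderPresent:
		return true
	case HeaderAbsent:
		return false
	default:
		return detected
	}
}

func main() {
	// For an all-string file the heuristic has nothing to go on, so assume it reports false.
	fmt.Println(firstRowIsHeader(HeaderAuto, false))    // false
	fmt.Println(firstRowIsHeader(HeaderPresent, false)) // true
}
```

With this shape, the default behaviour stays automatic and only users with all-string files need to set anything.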
Documentation Reference