Hmm...
What is going on in your setup
You are doing the "right" thing for Parquet in the browser: byte-range reads, so you only fetch the Parquet footer plus the specific row groups you need instead of downloading multi-GB files.
Your failure mode is also typical: Range + redirect + CORS preflight breaks when the final CDN hop does not implement CORS correctly for OPTIONS and 206 Partial Content.
On the Hugging Face Hub, many downloads now redirect from huggingface.co/.../resolve/... to Xet's bridge/CDN hostnames such as cas-bridge.xethub.hf.co/... (this redirect behavior is documented in Hugging Face's Xet migration write-up). (Hugging Face)
So your browser ends up doing cross-origin requests to cas-bridge.xethub.hf.co, not to huggingface.co, and CORS must work on the redirected host too.
Why your Range request triggers a preflight at all
A lot of people assume "Range always preflights." That is not strictly true.
- The Range request header can be CORS-safelisted (no preflight) only in a narrow case: a single byte range like bytes=500-999. (MDN WebDocument)
- If your tooling sends multiple ranges (comma-separated), or adds other non-safelisted headers, you get a preflight. (MDN WebDocument)
- Many browser data stacks also do an initial HEAD to check size and range support. DuckDB-Wasm users routinely hit failures when that HEAD is blocked by CORS. (GitHub)
Even if you personally only set Range, libraries in the chain (parquet readers, fetch wrappers, WASM httpfs layers, etc.) commonly add one or more of:
- a HEAD probe
- Range in a non-safelisted form
- extra headers (caching validators, custom metadata, etc.)
So designing for "no preflight" is fragile. You want the infrastructure to support preflight cleanly.
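To make the "narrow case" concrete, here is a minimal sketch of the safelist rule as I understand it from the Fetch standard: exactly one range, unit bytes, with a start offset. The helper name isSafelistedRange is hypothetical, and the regex is an approximation of the spec's parser (it ignores the separate length limit on safelisted header values). Note that a suffix range such as bytes=-8, which is a natural way to ask for a Parquet file's trailing bytes, does not qualify.

```typescript
// Approximation of the Fetch-standard rule for a CORS-safelisted Range value.
// isSafelistedRange is a hypothetical helper, not a browser API.
function isSafelistedRange(value: string): boolean {
  // One range only, unit "bytes", start offset required, end offset optional.
  const match = /^bytes=(\d+)-(\d*)$/.exec(value);
  if (match === null) return false; // multi-range, suffix range, or other unit
  if (match[2] === "") return true; // open-ended range such as bytes=256-
  return Number(match[1]) <= Number(match[2]);
}

console.log(isSafelistedRange("bytes=500-999"));      // single range
console.log(isSafelistedRange("bytes=0-99,200-299")); // multi-range
console.log(isSafelistedRange("bytes=-8"));           // suffix range
```

Only the first call reports a safelisted value; the other two shapes would cause a preflight.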
Why the error text points to "preflight doesn't have HTTP ok status"
Browsers require the preflight response to have an "ok" HTTP status, meaning any 2xx (typically 200 or 204). If not, the preflight fails and the actual request never runs. (GitHub)
Redirects make this worse: a 3xx answer to the preflight itself is not an "ok" status, so browsers treat it as a failure rather than following it. "Redirect on preflight" has long been a known sharp edge in the Fetch/CORS model. (GitHub)
So if OPTIONS on cas-bridge.xethub.hf.co returns a non-2xx (or redirects, or is blocked at the edge), you get exactly the error you posted.
What "good" looks like for Range over CORS
You need two things to be correct:
1) OPTIONS succeeds for the redirected host
For requests that preflight, cas-bridge.xethub.hf.co must respond to OPTIONS with:
- Status: 204 (or 200)
- CORS allow headers including what the browser asked for
- Ideally Access-Control-Max-Age to reduce repeat preflights
The key is: do not 3xx redirect the preflight and do not return 4xx/5xx.
This is not optional if you want robust compatibility. (GitHub)
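If you capture an OPTIONS response (for example with curl from a terminal), the checks above can be applied mechanically. The following is a simplified sketch for non-credentialed requests using a safelisted method such as GET or HEAD; the PreflightResponse shape and the preflightProblems name are hypothetical, and a real browser applies more rules than this.

```typescript
// Hypothetical shape for a captured preflight response; not a browser API.
interface PreflightResponse {
  status: number;
  headers: Record<string, string>; // header names lowercased
}

// Returns the reasons a preflight would fail; an empty list means it passes
// these simplified checks (status, origin, requested headers).
function preflightProblems(
  res: PreflightResponse,
  origin: string,
  requestedHeaders: string[],
): string[] {
  const problems: string[] = [];
  if (res.status < 200 || res.status > 299) {
    problems.push("status " + res.status + " is not 2xx");
  }
  const allowOrigin = res.headers["access-control-allow-origin"];
  if (allowOrigin !== "*" && allowOrigin !== origin) {
    problems.push("origin not allowed");
  }
  const allowed = (res.headers["access-control-allow-headers"] || "")
    .split(",")
    .map((s) => s.trim().toLowerCase());
  for (const name of requestedHeaders) {
    if (allowed.indexOf("*") === -1 && allowed.indexOf(name.toLowerCase()) === -1) {
      problems.push("header " + name + " not allowed");
    }
  }
  return problems;
}

// A redirecting edge fails before any header is even considered.
console.log(preflightProblems({ status: 302, headers: {} }, "https://example.org", ["Range"]));
```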
2) GET supports 206 Partial Content with CORS headers
When the client does GET with Range, the server usually answers with:
- 206 Partial Content
- Accept-Ranges: bytes (advertises capability) (MDN WebDocument)
- Content-Range: bytes start-end/total (tells the client what it got) (MDN WebDocument)
- Content-Length for the returned chunk
And critically: the 206 response must include Access-Control-Allow-Origin and must expose the headers your JS needs.
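On the client side, a reader typically validates what came back before trusting the bytes. Here is a minimal sketch of parsing Content-Range and checking it against the requested range; parseContentRange and matchesRequest are hypothetical helper names, and the check deliberately ignores edge cases such as a server legitimately truncating a range that runs past the end of the file.

```typescript
interface ContentRange {
  start: number;
  end: number; // inclusive, as in the HTTP header
  total: number | null; // null when the server sends "*"
}

// Parses "bytes start-end/total"; returns null for anything else.
function parseContentRange(value: string): ContentRange | null {
  const m = /^bytes (\d+)-(\d+)\/(\d+|\*)$/.exec(value.trim());
  if (m === null) return null;
  const start = Number(m[1]);
  const end = Number(m[2]);
  const total = m[3] === "*" ? null : Number(m[3]);
  if (start > end) return null;
  if (total !== null && end >= total) return null;
  return { start, end, total };
}

// True only for a 206 whose Content-Range matches the requested offsets.
// A 200 here would mean the server ignored Range and sent the whole file.
function matchesRequest(
  status: number,
  contentRange: string | null,
  start: number,
  end: number,
): boolean {
  if (status !== 206 || contentRange === null) return false;
  const parsed = parseContentRange(contentRange);
  return parsed !== null && parsed.start === start && parsed.end === end;
}

console.log(parseContentRange("bytes 500-999/123456"));
```

In a browser, contentRange would come from response.headers.get("Content-Range"), which returns null when the header is not exposed over CORS.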
Why Access-Control-Expose-Headers matters (people miss this)
Even if the server returns Accept-Ranges and Content-Range, browser JS cannot read them unless they are "safelisted" or explicitly exposed.
MDN's rule is: only a small set of response headers are exposed by default; everything else needs Access-Control-Expose-Headers. (MDN WebDocument)
This is why projects like pdf.js historically logged errors like "Refused to get unsafe header 'Accept-Ranges'" when trying to do progressive range loading. (GitHub)
For Parquet readers, exposing at least:
- Accept-Ranges
- Content-Range
- Content-Length
is the difference between "can stream selectively" and "falls back to full download."
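The exposure rule can be sketched as a small check: given the Access-Control-Expose-Headers value a server sent, which of the headers a reader wants would be invisible to JS? The helper name unreadableHeaders is hypothetical. The safelist below is the default set as I recall it from the Fetch standard; note that it already contains Content-Length, so listing that header explicitly is harmless rather than strictly required.

```typescript
// Response headers readable cross-origin without Access-Control-Expose-Headers.
const SAFELISTED_RESPONSE_HEADERS = [
  "cache-control",
  "content-language",
  "content-length",
  "content-type",
  "expires",
  "last-modified",
  "pragma",
];

// Returns the wanted headers that browser JS could not read.
function unreadableHeaders(exposeHeaders: string | null, wanted: string[]): string[] {
  const exposed = (exposeHeaders || "")
    .split(",")
    .map((s) => s.trim().toLowerCase());
  // "*" exposes everything, but only for requests without credentials.
  if (exposed.indexOf("*") !== -1) return [];
  return wanted.filter((name) => {
    const lower = name.toLowerCase();
    return (
      SAFELISTED_RESPONSE_HEADERS.indexOf(lower) === -1 &&
      exposed.indexOf(lower) === -1
    );
  });
}

const wanted = ["Accept-Ranges", "Content-Range", "Content-Length"];
console.log(unreadableHeaders(null, wanted));
console.log(unreadableHeaders("Content-Range, Accept-Ranges, Content-Length", wanted));
```

With no expose header at all, Accept-Ranges and Content-Range are the two that go missing, which matches the pdf.js symptom described above.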
Your requested headers are basically right, with two practical refinements
You proposed:
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, HEAD, OPTIONS
Access-Control-Allow-Headers: Range, Content-Type
Access-Control-Expose-Headers: Content-Range, Accept-Ranges, Content-Length
That is a solid baseline for public, anonymous dataset files.
Two refinements that will reduce breakage:
Refinement A: don't under-allow request headers
If any client sends If-None-Match, If-Modified-Since, or other common headers, preflight will ask for them and fail if not allowed. Many CDNs solve this by echoing back Access-Control-Request-Headers.
Practical pattern:
- In OPTIONS, read Access-Control-Request-Headers and return it in Access-Control-Allow-Headers.
This avoids whack-a-mole. It is especially useful when you do not control all downstream libraries.
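A minimal sketch of that echo pattern for the public, anonymous tier follows. The function name buildPreflightHeaders is hypothetical, the fallback to Range when the browser sends no request-header list is my assumption, and the Vary line is an addition of mine so that shared caches do not reuse one echoed list for a different request.

```typescript
// Builds response headers for an OPTIONS preflight on public, anonymous files.
// requestedHeaders is the raw Access-Control-Request-Headers value, or null.
function buildPreflightHeaders(requestedHeaders: string | null): Record<string, string> {
  const hasList = requestedHeaders !== null && requestedHeaders.trim() !== "";
  return {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "GET, HEAD, OPTIONS",
    // Echo whatever the browser asked for; fall back to Range (assumption).
    "Access-Control-Allow-Headers": hasList ? (requestedHeaders as string) : "Range",
    "Access-Control-Max-Age": "86400",
    // The echoed value differs per request, so caches must key on it.
    "Vary": "Access-Control-Request-Headers",
  };
}

console.log(buildPreflightHeaders("range, if-none-match"));
```

The response status would be 204 (or 200) with an empty body; only the headers are shown here.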
Refinement B: be explicit about credentials vs wildcard
Access-Control-Allow-Origin: * is only valid for requests without credentials. If credentials are included, wildcard causes a browser error. (MDN WebDocument)
So you likely want two tiers:
- Public blobs (no auth cookies, no auth headers from browser):
  - Access-Control-Allow-Origin: *
  - no Access-Control-Allow-Credentials
- Gated/private blobs (if ever needed from browser):
  - echo the request Origin instead of *
  - add Vary: Origin for caches (MDN WebDocument)
  - only then consider Access-Control-Allow-Credentials: true
If Hugging Face only wants to support browser range reads for public assets, then the "public tier" alone is enough.
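The two tiers can be sketched as one header-selection function. The name originHeaders and the allowedOrigins list are hypothetical; checking the Origin against an allow-list before echoing it is an assumption on my part (the text above only says to echo it), added because a blind echo combined with credentials would grant access to every site.

```typescript
// Chooses CORS origin headers for the public tier or the credentialed tier.
function originHeaders(
  origin: string | null,
  credentialed: boolean,
  allowedOrigins: string[],
): Record<string, string> {
  if (!credentialed) {
    // Public tier: wildcard, and no Access-Control-Allow-Credentials.
    return { "Access-Control-Allow-Origin": "*" };
  }
  if (origin === null || allowedOrigins.indexOf(origin) === -1) {
    // No grant; still vary on Origin so caches keep responses apart.
    return { "Vary": "Origin" };
  }
  return {
    "Access-Control-Allow-Origin": origin, // echo, never "*"
    "Access-Control-Allow-Credentials": "true",
    "Vary": "Origin",
  };
}

console.log(originHeaders("https://app.example", true, ["https://app.example"]));
```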
Why this specifically matters for Parquet-WASM and DuckDB-WASM
DuckDB-Wasm explicitly calls out that browser deployments must obey CORS and remote HTTPFS reads depend on the remote server allowing it. (DuckDB)
Also, real bug reports show DuckDB-Wasm tries HEAD, and if that is blocked by CORS, the engine never reaches the "range GET" stage. (GitHub)
So your request is not hypothetical. It maps to an established failure pattern in browser analytics stacks.
Similar cases online (same class of problem)
These are "same shape" incidents: byte-range/progressive loading plus missing exposed headers or broken CORS:
- pdf.js: "Refused to get unsafe header 'Accept-Ranges'" when trying range-based PDF loading. Root cause is a missing Access-Control-Expose-Headers. (GitHub)
- DuckDB-Wasm: CORS failures on HEAD stop the pipeline before range reads happen. (GitHub)
- OGC Cloud Optimized GeoTIFF (COG) ecosystem: COG relies on HTTP Range requests; OGC docs explicitly call out CORS considerations around advertising range support. Different domain, same mechanism. (OGC Public Document Repository)
- Hugging Face Xet bridge operational threads: multiple HF community threads reference cas-bridge.xethub.hf.co as an infrastructure hop that can break downloads or requires allowlisting. (Hugging Face Forums)
Workarounds you can use today (if HF infra does not change quickly)
Workaround 1: run a tiny proxy that terminates CORS correctly
You fetch from HF server-to-server, then serve to the browser with correct:
- OPTIONS handling
- 206 + Expose-Headers
Downside: you lose "direct" HF edge delivery unless you deploy the proxy at an edge (Cloudflare Workers, Fastly Compute, etc.).
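The GET path of such a proxy boils down to two header transformations, sketched here as pure functions so they can be dropped into whatever runtime you use. All names (HeaderMap, upstreamRequestHeaders, clientResponse, PASS_THROUGH) are hypothetical, and the choice of which upstream headers to pass through is an assumption; wiring this into an actual fetch handler and streaming the body is left out.

```typescript
type HeaderMap = Record<string, string>; // header names lowercased

// Forward only Range upstream; cookies and other client headers are dropped.
function upstreamRequestHeaders(client: HeaderMap): HeaderMap {
  const out: HeaderMap = {};
  if (client["range"] !== undefined) out["range"] = client["range"];
  return out;
}

// Upstream headers worth relaying to the browser (an assumed minimal set).
const PASS_THROUGH = ["content-type", "content-length", "content-range", "accept-ranges", "etag"];

// Keep the upstream status (200 or 206) and add the CORS headers the
// browser needs in order to read the range metadata.
function clientResponse(
  upstreamStatus: number,
  upstream: HeaderMap,
): { status: number; headers: HeaderMap } {
  const headers: HeaderMap = {
    "access-control-allow-origin": "*",
    "access-control-expose-headers": "Content-Range, Accept-Ranges, Content-Length",
  };
  for (const name of PASS_THROUGH) {
    if (upstream[name] !== undefined) headers[name] = upstream[name];
  }
  return { status: upstreamStatus, headers };
}

console.log(upstreamRequestHeaders({ range: "bytes=0-7", cookie: "session=abc" }));
console.log(clientResponse(206, { "content-range": "bytes 0-7/100", "accept-ranges": "bytes" }));
```

Because the proxy follows the Hub redirect server-side, the browser only ever talks to the proxy's origin and never preflights against the redirected host.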
Workaround 2: require user-provided files (local File API)
Parquet-WASM can read from a File handle. No CORS. Obvious UX cost.
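With a local File the same "footer first" access pattern works through Blob.slice instead of HTTP Range. A Parquet file ends with a 4-byte little-endian footer length followed by the 4-byte magic PAR1, so the last 8 bytes tell you where the metadata lives. The sketch below covers only that arithmetic; footerLengthFromTail and footerByteRange are hypothetical helpers, not parquet-wasm APIs.

```typescript
// Reads the footer length from the final 8 bytes of a Parquet file.
function footerLengthFromTail(tail: Uint8Array): number {
  if (tail.length !== 8) throw new Error("expected the last 8 bytes of the file");
  const magic = String.fromCharCode(tail[4], tail[5], tail[6], tail[7]);
  if (magic !== "PAR1") throw new Error("trailing magic is not PAR1");
  // The footer length is a little-endian unsigned 32-bit integer.
  return new DataView(tail.buffer, tail.byteOffset, 8).getUint32(0, true);
}

// Byte offsets of the footer metadata; end is exclusive, as in Blob.slice.
function footerByteRange(fileSize: number, footerLength: number): { start: number; end: number } {
  const end = fileSize - 8;
  const start = end - footerLength;
  // The first 4 bytes of the file hold the leading magic.
  if (start < 4) throw new Error("footer length is inconsistent with file size");
  return { start, end };
}

// In a browser, with `file` taken from an <input type="file">:
//   const tail = new Uint8Array(await file.slice(file.size - 8).arrayBuffer());
//   const { start, end } = footerByteRange(file.size, footerLengthFromTail(tail));
//   const footerBytes = await file.slice(start, end).arrayBuffer();
```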
Workaround 3: attempt the Xet-native APIs (advanced)
Hugging Face documents a Xet protocol where you first get X-Xet-Hash and then call a reconstruction API; it even recommends batched downloads and mentions using Range. (Hugging Face)
In practice, this still needs CORS on those endpoints if you do it directly from the browser, so it is not a guaranteed escape hatch. But it is relevant context when discussing "HF already thinks in ranges."
Copy-paste issue text (clean, reproducible, actionable)
Title
Enable CORS + HTTP Range support for browser partial reads on cas-bridge.xethub.hf.co (Parquet row-group access)
Summary
Browser-based data tools need Range requests to read Parquet efficiently (footer + selected row groups). Downloads from the Hub redirect to cas-bridge.xethub.hf.co (Xet bridge). The redirected host fails CORS preflight for Range/HEAD workflows, blocking partial reads. (Hugging Face)
Current behavior
- Plain GET works via redirect.
- Range workflows fail with: "Response to preflight request doesn't pass access control check: It does not have HTTP ok status."
- This blocks parquet-wasm and DuckDB-Wasm style readers which rely on HEAD + Range or non-safelisted Range patterns. (GitHub)
Expected behavior
- OPTIONS to the final redirected host returns 200/204 (no redirect) with appropriate CORS headers. Preflight responses must be "ok" status. (GitHub)
- GET with Range returns 206 Partial Content and includes CORS headers, plus exposes Content-Range, Accept-Ranges, and Content-Length so browser JS can consume them. (MDN WebDocument)
Proposed CORS headers (public, anonymous files)
For responses from cas-bridge.xethub.hf.co (and any sibling Xet bridge hosts):
Preflight (OPTIONS)
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, HEAD, OPTIONS
Access-Control-Allow-Headers: Range, Content-Type (or echo Access-Control-Request-Headers)
Access-Control-Max-Age: 86400 (optional, reduces preflight spam)
Actual (GET/HEAD, including 206)
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: Content-Range, Accept-Ranges, Content-Length
- Ensure Accept-Ranges: bytes and Content-Range are present for range responses. (MDN WebDocument)
Notes on credentials (optional)
If any endpoint requires credentials, wildcard * cannot be used and the server must echo Origin and add Vary: Origin. (MDN WebDocument)
Impact
This unblocks efficient browser analytics and visualization on HF-hosted datasets using Parquet row groups, DuckDB-Wasm, parquet-wasm, and similar tooling. DuckDB-Wasm documentation explicitly notes that remote data access requires correct CORS on the hosting site. (DuckDB)
High-quality references worth linking in the issue thread
- Hugging Face: redirect to cas-bridge.xethub.hf.co shown in Xet migration blog (Hugging Face)
- Fetch/CORS: preflight must be "ok" status (200/204) (GitHub)
- Fetch/CORS: redirect + preflight is a known sharp edge (GitHub)
- MDN CORS guide: Range safelist caveat (MDN WebDocument)
- MDN Range header: single-range is safelisted, multi-range may preflight (MDN WebDocument)
- MDN Expose-Headers: non-safelisted headers must be exposed (MDN WebDocument)
- DuckDB-Wasm: remote HTTPFS requires correct CORS (DuckDB)
- DuckDB-Wasm issue: HEAD blocked by CORS breaks the pipeline (GitHub)
- pdf.js historical issues about Accept-Ranges/Content-Range exposure (GitHub)
Summary
- Your request is standard: browser Parquet needs byte ranges.
- Redirect to cas-bridge.xethub.hf.co makes CORS enforcement happen on the Xet bridge host. (Hugging Face)
- Fix requires: OPTIONS returns 200/204 with CORS headers, and 206 responses include CORS + exposed headers. (GitHub)
- Similar failures exist across pdf.js and DuckDB-Wasm ecosystems. (GitHub)