Commit f609acf

Fix HTTP 403 when downloading Unicode data files
unicode.org blocks requests without a User-Agent header, causing urlretrieve to fail with HTTP 403 Forbidden. Switch to urlopen with an explicit User-Agent.
1 parent 23fbebe commit f609acf

1 file changed

Lines changed: 5 additions & 1 deletion

makeunicodedata.py

@@ -961,7 +961,11 @@ def open_data(template, version):
         else:
             url = ('https://www.unicode.org/Public/%s/ucd/'+template) % (version, '')
         os.makedirs(DATA_DIR, exist_ok=True)
-        urllib.request.urlretrieve(url, filename=local)
+        # unicode.org blocks requests without a User-Agent header
+        req = urllib.request.Request(url, headers={'User-Agent': 'makeunicodedata.py'})
+        with urllib.request.urlopen(req) as response:
+            with open(local, 'wb') as f:
+                f.write(response.read())
     if local.endswith('.txt'):
         return open(local, encoding='utf-8')
     else:
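
The pattern in this change can be tried in isolation. The following is a minimal sketch, not part of the commit: the helper names `build_request` and `download` are hypothetical, and it streams the response with `shutil.copyfileobj` instead of reading the whole body with `response.read()` as the commit does. The `User-Agent` value is copied from the diff.

```python
import shutil
import urllib.request


def build_request(url, user_agent="makeunicodedata.py"):
    """Return a Request carrying an explicit User-Agent header.

    Hypothetical helper for illustration; the commit builds the
    Request inline inside open_data().
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, dest, user_agent="makeunicodedata.py"):
    """Fetch url and write the body to the file at dest."""
    req = build_request(url, user_agent)
    with urllib.request.urlopen(req) as response, open(dest, "wb") as f:
        # Copy in chunks rather than holding the whole body in memory.
        # This differs from the commit, which calls response.read() once.
        shutil.copyfileobj(response, f)
```

Calling `download(url, local)` would stand in for the removed `urllib.request.urlretrieve(url, filename=local)` call. Note that `urllib.request.Request` normalizes header names, so the stored key reads `User-agent`.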