
fix: catch UnicodeDecodeError on special character usernames #2840

Open
juliosuas wants to merge 1 commit into sherlock-project:master from juliosuas:fix/unicode-crash-special-chars

@juliosuas

Problem

When scanning usernames containing non-ASCII characters (e.g. Émile), Sherlock crashes with a UnicodeDecodeError:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 0: invalid continuation byte

This happens because some sites return HTTP redirect Location headers encoded in Latin-1 (ISO-8859-1) rather than UTF-8. The requests library raises UnicodeDecodeError when trying to decode these headers during redirect resolution.

Fix

Add UnicodeEncodeError and UnicodeDecodeError to the exception handlers in get_response(), alongside the existing requests exception handlers. This allows Sherlock to gracefully skip the problematic site (marking it as an error) and continue scanning the remaining targets instead of crashing.
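
A rough sketch of the handler pattern, for readers who want to see the shape of the change without opening the diff. This is not Sherlock's actual `get_response()`: the function name `get_response_sketch`, the `fetch` callable, the `"Encoding Error"` label, and the three-value return shape are all assumptions made for illustration.

```python
from typing import Any, Callable, Optional, Tuple


def get_response_sketch(
    fetch: Callable[[], Any],
) -> Tuple[Optional[Any], Optional[str], Optional[str]]:
    """Hypothetical stand-in for get_response(); not Sherlock's real code.

    Returns (response, error_context, exception_text).
    """
    try:
        return fetch(), None, None
    except (UnicodeEncodeError, UnicodeDecodeError) as exc:
        # Report the site as errored instead of letting the exception abort the scan.
        return None, "Encoding Error", str(exc)


def bad_site() -> str:
    # Simulates decoding a Latin-1 redirect target as UTF-8.
    return b"/\xe9mile".decode("utf-8")


def good_site() -> str:
    return "ok"


# The scan keeps going after the failing site.
results = [get_response_sketch(site) for site in (bad_site, good_site)]
for response, error_context, exception_text in results:
    print(response, error_context, exception_text)
```

Both exception types derive from `UnicodeError`, so a single `except UnicodeError` clause would also cover them; listing the two explicitly mirrors the wording of this PR.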

Testing

Before (crashes midway through scan):

$ sherlock 'Émile'
[*] Checking username Émile on:
...
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 ...

After (completes full scan, skipping sites with encoding issues):

$ sherlock 'Émile'
[*] Checking username Émile on:
...
[*] Search completed with 75 results

Fixes #2730

When scanning usernames containing non-ASCII characters (e.g. 'Émile'),
some sites return redirect Location headers encoded in Latin-1 instead
of UTF-8. The requests library raises UnicodeDecodeError when processing
these redirects, causing Sherlock to crash.

This fix catches UnicodeEncodeError and UnicodeDecodeError in
get_response() alongside the existing requests exception handlers,
allowing the scan to gracefully skip the affected site and continue
checking the remaining targets.

Fixes sherlock-project#2730

Development

Successfully merging this pull request may close these issues.

Crash: UnicodeDecodeError on usernames with special characters