Skip to content

Update number pattern to catch variables containing numbers#3648

Merged
joshgoebel merged 10 commits intohighlightjs:mainfrom
lachieh:bugfix/3645-update-number-matching-in-js
Nov 4, 2022
Merged

Update number pattern to catch variables containing numbers#3648
joshgoebel merged 10 commits intohighlightjs:mainfrom
lachieh:bugfix/3645-update-number-matching-in-js

Conversation

@lachieh
Copy link
Copy Markdown
Contributor

@lachieh lachieh commented Oct 23, 2022

Currently, numbers are matched based on the word boundary (\b) but that means that a valid variable identifier like $0 will have the 0 highlighted as a number.

Resolves #3645

Changes

  • Updated the regex patterns for numbers to lookbehind for a $ symbol.

Checklist

  • Added markup tests, or they don't apply here because...
  • Updated the changelog at CHANGES.md

@joshgoebel
Copy link
Copy Markdown
Member

joshgoebel commented Oct 25, 2022

Awesome contribution! Great tests.

Sadly, we can't use look-behinds, they aren't yet supported universal by all browsers (ie: Safari).

Typically for things like this you'd instead try to match the offending item... ie, add an earlier rule to match/capture $2323 patterns, and just ignore them (vs highlight)... this will prevent the number rule from firing and should also be another easy fix.

@taufik-nurrohman
Copy link
Copy Markdown
Member

taufik-nurrohman commented Oct 25, 2022

Yes. Adding [$][a-z\d]* as variable pattern before the number patterns should fix the issue. As the variable name that contains number has been captured before reaching the number matcher.

@joshgoebel
Copy link
Copy Markdown
Member

[$][a-z\d]*

We don't need a-z though since all we're trying to solve is the false-positive with $[number], yes?

@taufik-nurrohman
Copy link
Copy Markdown
Member

taufik-nurrohman commented Oct 25, 2022

OK. FYI, this also happen in default JacaScript’s syntax highlighter in Vim.

@lachieh
Copy link
Copy Markdown
Contributor Author

lachieh commented Oct 25, 2022

Awesome contribution! Great tests.

Sadly, we can't use look-behinds, they aren't yet supported universal by all browsers (ie: Safari).

I thought about this, but used it since there are two other tests patterns that also use look behinds. The suggested solution before will work though so I'll look at that.

@joshgoebel
Copy link
Copy Markdown
Member

Two other tests? Where?

@lachieh
Copy link
Copy Markdown
Contributor Author

lachieh commented Oct 27, 2022

Nevermind, did a cursory search and found this, but didn't notice it was a comment.

// TODO: replace with negative look-behind when available
// { begin: /(?<![a-zA-Z0-9._])0[xX][0-9a-fA-F]+\.[0-9a-fA-F]*[pP][+-]?\d+i?/ },
// { begin: /(?<![a-zA-Z0-9._])0[xX][0-9a-fA-F]+([pP][+-]?\d+)?[Li]?/ },
// { begin: /(?<![a-zA-Z0-9._])(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?[Li]?/ }

I will update with the suggested pattern!

@lachieh
Copy link
Copy Markdown
Contributor Author

lachieh commented Oct 31, 2022

Ok, this has been updated. I did have to leave the a-z in because it was failing on the test below. Adding the a-z in makes all tests pass.

 1) typescript
       should markup identifiers_that_include_keywords:
- actual
+ expected

-<span class="hljs-keyword">const</span> result = $<span class="hljs-title function_">class</span>();
+<span class="hljs-keyword">const</span> result = $class();

@joshgoebel
Copy link
Copy Markdown
Member

joshgoebel commented Nov 4, 2022

I did have to leave the a-z in because it was failing on the test below.

Isn't that because you are using * when it should be +?

/[$]\d+/

We don't care about just plain $ unless it's followed by variable like stuff... All you're really trying to skip is $\d (false positives for numbers inside variables) so that should be a sufficient regex IF you require at least a single digit after the $, which + will do for you.

@lachieh
Copy link
Copy Markdown
Contributor Author

lachieh commented Nov 4, 2022

Yes, fair point. Thanks for clarifying. Updated pattern and merged upstream.

Comment thread src/languages/javascript.js Outdated
@joshgoebel
Copy link
Copy Markdown
Member

Oh, you made it a variant..... :-). Interesting. I'll fix.

@lachieh
Copy link
Copy Markdown
Contributor Author

lachieh commented Nov 4, 2022

Yeah, took reference from here:

variants: [
// Exclude params in functions without params
{
className: "",
begin: /\(\s*\)/,
skip: true
},

@joshgoebel
Copy link
Copy Markdown
Member

Yeah, took reference from here:

Yeah, but that example sort of IS still an instance of params, just "empty params"... where as $123 isn't really an instance of a numeric, it's a variable. :-). I'm being pedantic now.

But what you're doing is one easy way to get that rule in all the same spots you'd want the NUMBER rule (which is convenient)... so that's an argument for it... so I might just revert and commit your last revision after all. :-)

@lachieh
Copy link
Copy Markdown
Contributor Author

lachieh commented Nov 4, 2022

Gotcha, I understand. I've moved it to before the both uses of NUMBER and added a few more tests for the same pattern in template strings. I think this is fine since it is only used twice, but if you'd prefer it inside the NUMBER I can make that change (or you can).

Thanks for the time and input!

Copy link
Copy Markdown
Member

@joshgoebel joshgoebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect!

@joshgoebel joshgoebel merged commit 4f5b769 into highlightjs:main Nov 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

(javascript, typescript) Incorrect handling of digits after $ in identifiers

3 participants