Skip to content

fix: add support for multibyte characters#2

Merged
Haroenv merged 5 commits into
algolia:masterfrom
Haroenv:test/complicated-characters
Jun 27, 2017
Merged

fix: add support for multibyte characters#2
Haroenv merged 5 commits into
algolia:masterfrom
Haroenv:test/complicated-characters

Conversation

@Haroenv

@Haroenv Haroenv commented Jun 26, 2017

Copy link
Copy Markdown
Contributor

if the length of a character is more than one (here tested with 2 and 7), it should be counted as that length, but definitely shouldn't cut in the middle of that character.
This does not work as expected yet, so the test is skipped

if the length of a character is more than one (here tested with 2 and 7), it should be counted as that length, but definitely shouldn't cut in the middle of that character.
This does not work as expected yet, so the test is skipped
@rayrutjes

Copy link
Copy Markdown
Contributor

Nice @Haroenv , feel free to also submit the solution so that we don't skip the test ;)

@Haroenv

Haroenv commented Jun 26, 2017

Copy link
Copy Markdown
Contributor Author

Couldn't simply find a rule for how to get the actual characters to not split them. Thought to add this test so we know that it should work

@Haroenv

Haroenv commented Jun 26, 2017

Copy link
Copy Markdown
Contributor Author

The difference now is that the character count is used, and not the string length. This means that 💩 counts as one character, while before it would count as two.

@rayrutjes

rayrutjes commented Jun 26, 2017

Copy link
Copy Markdown
Contributor

The difference now is that the character count is used, and not the string length. This means that 💩 counts as one character, while before it would count as two.

This is not a big issue.

@Haroenv

Haroenv commented Jun 26, 2017

Copy link
Copy Markdown
Contributor Author

I think this can be merged as-is, it will no longer break 💩 or 🧀 , but will still break (possibly) 🙌🏿 or 🏃🏿‍♀️

@Haroenv

Haroenv commented Jun 26, 2017

Copy link
Copy Markdown
Contributor Author

(by the way, ZWJ is also relevant for Arabic, and multibyte strings for some Chinese letters)

Comment thread src/index.js Outdated
const chunks = [];
while (text.length > chunkSize) {
const splitAt = text.lastIndexOf(' ', chunkSize);
let characters = [...text];

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realised that it matters where this is supported: node 6+

If it would be Array.from() it would be node 4+

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine for me.

@rayrutjes

Copy link
Copy Markdown
Contributor

We are using conventional commit format to generate changelog.
Could you rebase to add the correct fix: add support for multibyte characters?

@Haroenv

Haroenv commented Jun 26, 2017

Copy link
Copy Markdown
Contributor Author

I usually squash on merge, but if you prefer to do it this way, it's also fine for me

@Haroenv Haroenv changed the title test(unicode): add tests for complicated characters fix: add support for multibyte characters Jun 26, 2017
@Haroenv

Haroenv commented Jun 27, 2017

Copy link
Copy Markdown
Contributor Author

Runes is amazing, it works exactly how I want it!

https://github.com/dotcypress/runes

@rayrutjes rayrutjes left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Impressive man! Go for it 🎉
Make sure you give it a "conventional" commit name.

@Haroenv

Haroenv commented Jun 27, 2017

Copy link
Copy Markdown
Contributor Author

There are some more advanced bugs, but I issued them to runes itself, will see what I can do to fix it. For now it's already going well

@Haroenv Haroenv merged commit 1398956 into algolia:master Jun 27, 2017
@Haroenv Haroenv deleted the test/complicated-characters branch June 27, 2017 09:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants