Improve performance of string interpolation #1626
This patch adds pre-allocation to string interpolation, which avoids unnecessary capacity resizing.
For small strings, the optimized `rb_str_resurrect` operation is faster, so pre-allocation is done only when the concatenated strings are large.
`MIN_PRE_ALLOC_SIZE` was chosen by experimenting on my local machine (x86_64-apple-darwin 16.5.0, Apple LLVM version 8.1.0 (clang - 802.0.42)).
With this change, string interpolation is about 72% faster when a large string is created (1.276M i/s to 2.201M i/s in the benchmark below); small-string performance is unchanged within the margin of error.
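As a rough illustration of the idea, here is a Ruby-level sketch of the strategy described above. It is not the C implementation in `string.c`: `concat_parts` and `SKETCH_MIN_PRE_ALLOC_SIZE` are hypothetical names, and the threshold value is a placeholder, not the value chosen in the patch. It assumes Ruby 2.4 or later for `String.new(capacity:)` and `Array#sum`.

```ruby
# Placeholder threshold for this sketch only; not the patch's MIN_PRE_ALLOC_SIZE.
SKETCH_MIN_PRE_ALLOC_SIZE = 48

# Hypothetical helper: concatenate already-stringified parts,
# pre-allocating the buffer only when the total size is large.
def concat_parts(parts)
  return String.new if parts.empty?
  return parts.first.dup if parts.size == 1

  total = parts.sum(&:bytesize)
  if total < SKETCH_MIN_PRE_ALLOC_SIZE
    # Small result: start from a copy of the first part and append the rest.
    buf = parts.first.dup
    rest = parts.drop(1)
  else
    # Large result: reserve the whole capacity up front so that appending
    # does not trigger repeated resizing. Carry over the first part's encoding.
    buf = String.new(capacity: total, encoding: parts.first.encoding)
    rest = parts
  end
  rest.each { |part| buf << part }
  buf
end

puts concat_parts(["Hello", ", ", "World", "!"])
```

Both branches produce the same string; only the allocation pattern differs, which is why the benefit appears only for large results.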
* Before
Calculating -------------------------------------
Large string interpolation
1.276M (± 5.9%) i/s - 6.358M in 5.002022s
Small string interpolation
5.156M (± 5.5%) i/s - 25.728M in 5.005731s
* After
Calculating -------------------------------------
Large string interpolation
2.201M (± 5.8%) i/s - 11.063M in 5.043724s
Small string interpolation
5.192M (± 5.7%) i/s - 25.971M in 5.020516s
* Test code
require 'benchmark/ips'
Benchmark.ips do |x|
  x.report "Large string interpolation" do |t|
    a = "Hellooooooooooooooooooooooooooooooooooooooooooooooooooo"
    b = "Wooooooooooooooooooooooooooooooooooooooooooooooooooorld"
    t.times do
      "#{a}, #{b}!"
    end
  end
  x.report "Small string interpolation" do |t|
    a = "Hello"
    b = "World"
    t.times do
      "#{a}, #{b}!"
    end
  end
end
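The 72% figure can be checked from the i/s numbers above. A minimal sketch, where `percent_faster` is a hypothetical helper and the inputs are copied from the benchmark output:

```ruby
# Hypothetical helper: relative speedup in percent, rounded to one decimal.
def percent_faster(before_ips, after_ips)
  ((after_ips / before_ips - 1) * 100).round(1)
end

# Iterations per second taken from the "Before" and "After" results.
large = percent_faster(1.276e6, 2.201e6)
small = percent_faster(5.156e6, 5.192e6)

puts "large: #{large}% faster"
puts "small: #{small}% faster"
```

The small-string difference is under 1%, well inside the roughly ±5% error reported by benchmark-ips, so only the large-string case shows a real change.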
    if (UNLIKELY(!num)) return rb_str_new(0, 0);
    if (UNLIKELY(num == 1)) return rb_str_resurrect(strary[0]);

    long len = 1;
https://travis-ci.org/ruby/ruby/builds/234558585#L1498
string.c:2887:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
Thanks for your comment. 😄
I fixed it! 👍
    if (LIKELY(len < MIN_PRE_ALLOC_SIZE)) {
        str = rb_str_resurrect(strary[0]);
        s = 1;
    } else {
Adjust the indentation and style to match the rest of the file.
Thanks for your comment. 😄
I fixed it! 👍
This patch fixes the coding style.
@nobu I fixed the coding style! 😄
    long len = 1;

    if (UNLIKELY(!num)) return rb_str_new(0, 0);
    if (UNLIKELY(num == 1)) return rb_str_resurrect(strary[0]);
Thanks for your comment! 😄
If rb_str_concat_literals is called from concatstrings (a YARV instruction) in instruction sequences generated by the Ruby compiler, num does not seem to ever be 0 or 1.
However, if rb_str_concat_literals is called directly from Ruby internals, num may be 0 or 1.
I was accounting for that situation.
That said, rb_str_concat_literals is currently used only in concatstrings, so it may be better to reject those cases 💡
Thanks for the explanation. I've just found that `:"#{}"` actually produces a concatstrings call with num==1 (possibly there are more cases that I didn't find). I think the concatstrings call can be eliminated in that case, but that would be out of scope for this PR.
        s = 1;
    }
    else {
        str = rb_str_buf_new(len);
The encoding is not preserved: `s="."*50; p "#{s}x".encoding` would result in ASCII-8BIT.
Sorry, I didn't notice that...
Thanks for pointing it out!
I'll fix it and add a test case! 💡
I fixed it in 1f6ad9b! 💡
I referred to the similar operation in array.c. https://github.com/ruby/ruby/blob/trunk/array.c#L1974
`s="."*50; p "#{s}x".encoding` should be UTF-8, but resulted in
ASCII-8BIT.
This patch fixes it.
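The behaviour discussed in this thread can be checked from Ruby. A minimal sketch, assuming only that a 50-character string is long enough to take the pre-allocation path (as the review comment implies); `interpolated_encoding` is a hypothetical helper:

```ruby
# Hypothetical helper: interpolate s as the first part of a literal
# and report the encoding of the resulting string.
def interpolated_encoding(s)
  "#{s}x".encoding
end

s = "." * 50
p s.encoding                # encoding of the interpolated value
p interpolated_encoding(s)  # should match the line above
```

With the fix applied, both lines print the same encoding; the bug reported in the review made the second one ASCII-8BIT.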
  def test_concat_literals
    s="."*50; "#{s}x".encoding
    assert_equal("#{s}x".encoding, Encoding::UTF_8)
The argument order is "expected", then "actual".
@rhenium ping
  end

  def test_concat_literals
    s="."*50; "#{s}x".encoding
The test repeats `"#{s}x".encoding`, which I believe is unnecessary.
I would also add spaces for readability, so `s="."*50; "#{s}x".encoding` would become `s = "." * 50`.
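Taken together, the review comments on this test suggest something like the following. This is only a sketch: `assert_equal` here is a hypothetical stand-in defined inline so the snippet runs on its own (the real test uses the test framework's assertion), and the expected value is taken from `s.encoding` instead of hard-coding `Encoding::UTF_8` so the snippet does not depend on the source encoding.

```ruby
# Hypothetical stand-in for the framework's assert_equal(expected, actual).
def assert_equal(expected, actual)
  unless expected == actual
    raise "expected #{expected.inspect}, got #{actual.inspect}"
  end
  true
end

def test_concat_literals
  s = "." * 50
  # Expected value first, actual value second.
  assert_equal(s.encoding, "#{s}x".encoding)
end

test_concat_literals
```

Putting the expected value first matters because the failure message labels the two arguments; swapping them produces a misleading report when the test fails.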
* compile.c (iseq_peephole_optimize): optimize away unnecessary concatenation of single string, following tostring which always puts a String instance. #1626 (comment)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59945 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This patch will add pre-allocation in string interpolation.
By this, unnecessary capacity resizing is avoided.
For small strings, optimized `rb_str_resurrect` operation is
faster, so pre-allocation is done only when concatenated strings
are large. `MIN_PRE_ALLOC_SIZE` was decided by experimenting with
local machine (x86_64-apple-darwin 16.5.0, Apple LLVM version
8.1.0 (clang - 802.0.42)).
String interpolation will be faster around 72% when large string is created.
* Before
```
Calculating -------------------------------------
Large string interpolation
1.276M (± 5.9%) i/s - 6.358M in 5.002022s
Small string interpolation
5.156M (± 5.5%) i/s - 25.728M in 5.005731s
```
* After
```
Calculating -------------------------------------
Large string interpolation
2.201M (± 5.8%) i/s - 11.063M in 5.043724s
Small string interpolation
5.192M (± 5.7%) i/s - 25.971M in 5.020516s
```
* Test code
```ruby
require 'benchmark/ips'
Benchmark.ips do |x|
  x.report "Large string interpolation" do |t|
    a = "Hellooooooooooooooooooooooooooooooooooooooooooooooooooo"
    b = "Wooooooooooooooooooooooooooooooooooooooooooooooooooorld"
    t.times do
      "#{a}, #{b}!"
    end
  end
  x.report "Small string interpolation" do |t|
    a = "Hello"
    b = "World"
    t.times do
      "#{a}, #{b}!"
    end
  end
end
```
[Fix rubyGH-1626]
From: Nao Minami <south37777@gmail.com>
git-svn-id: svn+ssh://svn.ruby-lang.org/ruby/trunk@60320 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Issue
https://bugs.ruby-lang.org/issues/13587