Skip to content

Replace unsafe_(get|set)index with inbounds macro#14957

Merged
mbauman merged 6 commits intomasterfrom
mb/awesomebounds
Feb 14, 2016
Merged

Replace unsafe_(get|set)index with inbounds macro#14957
mbauman merged 6 commits intomasterfrom
mb/awesomebounds

Conversation

@mbauman
Copy link
Copy Markdown
Member

@mbauman mbauman commented Feb 6, 2016

This patch does two things:

  • It all but completely removes unsafe_getindex and unsafe_setindex! from the standard library. This merges their method definitions into the safe flavors with a (at)boundscheck checkbounds(...) clause, and it replaces call-sites with (at)inbounds A[I...]. A few uses remain because the inbounds macro doesn't return its value, making the construct a little awkward in some cases.
  • It changes where bounds are checked in SubArrays. Previously, SubArrays would defer bounds checks to their parent arrays; they wouldn't check the indices directly, but rather they'd transform the indices into the proper references into the parent array, and then index into the parent array safely. E.g., previously:
    julia> A = -.5:.1:.5
           S = sub(A, 7:11)
           S[-4], S[0], S[4]
    (-0.4,0.0,0.4)

    julia> S[-6]
    ERROR: BoundsError: attempt to access -0.5:0.1:0.5
      at index [0]

This behavior is neither tested nor depended upon in the Base library, but I believe some packages (cc @timholy?) have historically depended upon this. With this patch, the behavior is much more sane; S[0] now immediately throws a bounds error from the SubArray itself. I believe this is a strong requirement for making views more prominent (regardless of the syntax).

TODO:

  • runbenchmarks("array", vs = "JuliaLang/julia:master")
  • JULIA_TESTFULL=1 make -C test subarray
  • If the change in semantics is at all controversial, I can split this into two PRs — one for the spelling change and one for the change in SubArray functionality. No objections.
  • Restrict @inbounds annotations back to status quo.

@mbauman
Copy link
Copy Markdown
Member Author

mbauman commented Feb 6, 2016

Hrm, there's currently a failure on master with JULIA_TESTFULL=1 make -C test subarray.

@nanosoldier
Copy link
Copy Markdown
Collaborator

Your benchmark job has completed - no performance regressions were detected. A full report can be found here. cc @jrevels

@KristofferC
Copy link
Copy Markdown
Member

Nothing actually got benchmarked: https://github.com/JuliaCI/BaseBenchmarkReports/blob/master/f18e728/f18e728_vs_3256421.json

Edit: The tag is array for benchmarking the arrays.

@JeffBezanson
Copy link
Copy Markdown
Member

Yay! So nice to see this simplification.

@mbauman
Copy link
Copy Markdown
Member Author

mbauman commented Feb 6, 2016

Ah, that makes sense. I actually saw some speed ups in my spot-checks.

runbenchmarks("array", vs = "JuliaLang/julia:master")

@nanosoldier
Copy link
Copy Markdown
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@tkelman
Copy link
Copy Markdown
Contributor

tkelman commented Feb 7, 2016

Should the new behavior be tested for, if we want to go that direction?

@blakejohnson
Copy link
Copy Markdown
Contributor

Thanks for doing this, @mbauman. I'll take a look at this tomorrow.

@tkelman
Copy link
Copy Markdown
Contributor

tkelman commented Feb 7, 2016

there's currently a failure on master with JULIA_TESTFULL=1 make -C test subarray.

There's also been an intermittent subarray failure on CI and the buildbots even without setting JULIA_TESTFULL=1. Related?

base/subarray.jl Outdated
reindex(a::UnitRange, b::StepRange{Int}) = range(oftype(first(a), first(a)+first(b)-1), step(b), length(b))
reindex(a::StepRange, b::Range{Int}) = range(oftype(first(a), first(a)+(first(b)-1)*step(a)), step(a)*step(b), length(b))
reindex(a, b::Int) = unsafe_getindex(a, b)
reindex(a, b::Int) = (@inbounds r = a[b]; r)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably get rid of reindex altogether. The only reason we had it before is that it was impossible to turn off bounds checks on (3:6)[1:2].

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@timholy
Copy link
Copy Markdown
Member

timholy commented Feb 7, 2016

This is so great, thanks for doing it!

@timholy
Copy link
Copy Markdown
Member

timholy commented Feb 7, 2016

Oh, and I don't know of any packages that depend on out-of-bounds SubArray accesses. (I have some internal code that uses this, but that's just something I'll have to fix myself.) Besides, from my perspective out-of-bounds SubArray accesses are more of a get-like operation than a getindex operation.

@mbauman
Copy link
Copy Markdown
Member Author

mbauman commented Feb 7, 2016

Sure, I can add tests for bounds errors. The JULIA_TESTFULL failure in subarrays is entirely deterministic… and I don't think that the abridged tests do any randomization of the test subsets, so I don't think it'd be related.

I'm a little confused about the 12x regression in gemv. I can't reproduce it locally… assuming it's as simple as:

A = randn(256, 256);
v = randn(256);
@benchmark A*v

@KristofferC
Copy link
Copy Markdown
Member

It will take a while until we get experience with how noisy the different benchmarks are. Perhaps BLAS is more noisy than the others?

@mbauman
Copy link
Copy Markdown
Member Author

mbauman commented Feb 7, 2016

Let's try it again: runbenchmarks("array", vs = "JuliaLang/julia:master")

@nanosoldier
Copy link
Copy Markdown
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

Ds = start(D)
s = 0
for b in eachindex(I)
@inbounds for b in eachindex(I)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This @inbounds has wider scope than the original unsafe_getindex calls. It would be safer to individually mark the I[b] and dest[d] = src[s] lines.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's (currently) functionally the same, no? As I understand it, we no longer propagate the inbounds state through the next calls, so those aren't affected by the @inbounds annotation. Now, that may be something we decide to do later, so I can restrict this if you'd prefer… but I think that decision will have to consider blocks like this no matter what.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

next will be affected by the @inbounds if it is inlined, since it is only one level deep. If that's okay, then decorating the entire loop is fine.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but we've not annotated any next method with @boundscheck/@inbounds or @_propagate_inbounds_meta yet. It may be something we want to do — especially since it'd allow us to actually follow the iteration protocol in these performance sensitive cases.

In any case, all five cases of expanded scope (there's also the @inbounds @nloops) here are iteration over AbstractArrays or their eachindex iterators or an infinite repeated iterator, and we're already assuming that AbstractArrays must be implemented correctly for this to work.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oooh, my understanding here is wrong. I'll restrict these @inbounds cases back to the status quo for this PR… and we can work on how this works later.

@blakejohnson
Copy link
Copy Markdown
Contributor

I wonder if we ought to make @inbounds return its value, as it would clean up a number of use cases. Other than the few blocks I called out that should have more narrow callsite decorations, this is 👍 from me.

@mbauman
Copy link
Copy Markdown
Member Author

mbauman commented Feb 8, 2016

Yeah, I initially changed @inbounds to return its value, but very rapidly ran into cases like #11169 (comment).

@blakejohnson
Copy link
Copy Markdown
Contributor

I see. Bummer.

@mbauman
Copy link
Copy Markdown
Member Author

mbauman commented Feb 8, 2016

Benchmarks are looking much better! The BLAS guys do indeed look noisy — we've got a completely different set of successes/failures this time. The SIMD tests also look to be a little noisy. And a few of the tests just look like they got hit with unlucky GC pauses (1.3x perf regression, 1.3x more time spent in GC, but with the same number of allocations/bytes).

@KristofferC
Copy link
Copy Markdown
Member

Might be worth trying out the SIMD ones locally.

mbauman and others added 5 commits February 9, 2016 08:35
This patch does two things:
* It all but completely removes `unsafe_getindex` and `unsafe_setindex!` from the standard library. This merges their method definitions into the safe flavors with a `(at)boundscheck checkbounds(...)` clause, and it replaces call-sites with `(at)inbounds A[I...]`.  A few uses remain because the inbounds macro doesn't return its value, making the construct a little awkward in some cases.
* It changes *where* bounds are checked in SubArrays. Previously, SubArrays would defer bounds checks to their parent arrays; they wouldn't check the indices directly, but rather they'd transform the indices into the proper references into the parent array, and then index into the parent array safely. E.g., previously:

    julia> A = -.5:.1:.5
           S = sub(A, 7:11)
           S[-4], S[0], S[4]
    (-0.4,0.0,0.4)

    julia> S[-6]
    ERROR: BoundsError: attempt to access -0.5:0.1:0.5
      at index [0]

This behavior is neither tested nor depended upon in the Base library, but I believe some packages have historically depended upon this.  With this patch, the behavior is much more sane; `S[0]` now immediately throws a bounds error from the SubArray itself.  I believe this is a strong requirement for making views more prominent (regardless of the syntax).
Fixes the additional memory allocations
@mbauman
Copy link
Copy Markdown
Member Author

mbauman commented Feb 9, 2016

I can't reproduce the SIMD regressions locally, but I have old hardware (Nehalem)… and there was a 10x regression on master for the past few days, too, so it hadn't been working correctly in any case. I've rebased and things now look good again. One last try? runbenchmarks("array", vs = "JuliaLang/julia:master")

@nanosoldier
Copy link
Copy Markdown
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@blakejohnson
Copy link
Copy Markdown
Contributor

Why did the blas 2 (gemv) benchmarks not run these last two times?

@jrevels
Copy link
Copy Markdown
Member

jrevels commented Feb 9, 2016

Just FYI, @andreasnoack and I are currently looking into how we can change the benchmark strategy in Benchmarks.jl to cut down on the large amount of variance we've been seeing with the @nanosoldier since the perf tracking infrastructure has been enabled. There are definitely some anomalies there, some related to GC and some not.

Why did the blas 2 (gemv) benchmarks not run these last two times?

I don't understand, it seems like they did get run?

@blakejohnson
Copy link
Copy Markdown
Contributor

Oh, sorry. I forgot that the table only shows "significant" results. So, the fact that the blas 2 benchmarks aren't in the table just means that they didn't change. Nevermind.

@mbauman
Copy link
Copy Markdown
Member Author

mbauman commented Feb 9, 2016

Alright, then I think this is good to go. Pending the final todo item. Done.

@mbauman
Copy link
Copy Markdown
Member Author

mbauman commented Feb 13, 2016

Bump! Any more comments?

end

@inline unsafe_setindex!(v::BitArray, x::Bool, ind::Int) = (Base.unsafe_bitsetindex!(v.chunks, x, ind); v)
@inline unsafe_setindex!(v::BitArray, x, ind::Real) = (Base.unsafe_bitsetindex!(v.chunks, convert(Bool, x), to_index(ind)); v)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do these need modification in their setindex! equivalents?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch. I'll add that.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er, no, I was wrong — it's fine to delete these without expanding BitArray's specific setindex! method since the abstract fallbacks take care of the to_indexing these days. This is ok.

@timholy
Copy link
Copy Markdown
Member

timholy commented Feb 13, 2016

LGTM. Thanks again!

mbauman added a commit that referenced this pull request Feb 14, 2016
Replace unsafe_(get|set)index with inbounds macro
@mbauman mbauman merged commit 8ffca32 into master Feb 14, 2016
@mbauman mbauman deleted the mb/awesomebounds branch February 14, 2016 05:38
@timholy
Copy link
Copy Markdown
Member

timholy commented Feb 14, 2016

@mbauman and @blakejohnson, thanks for this huge step forward!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants