Faster extrapolation by timholy · Pull Request #154 · JuliaMath/Interpolations.jl

timholy · 2017-04-27T08:24:42Z

I happened to notice that extrapolation is quite a lot slower than interpolation:

julia> using BenchmarkTools, Interpolations

julia> img = rand(3,3,3);

julia> itp = interpolate(img, BSpline(Linear()), OnGrid());

julia> etp = extrapolate(itp, Flat());

julia> @benchmark $itp[2.2,2.2,2.2] seconds=1
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     20.111 ns (0.00% GC)
  median time:      20.616 ns (0.00% GC)
  mean time:        20.820 ns (0.00% GC)
  maximum time:     38.106 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     997

julia> @benchmark $etp[2.2,2.2,2.2] seconds=1
BenchmarkTools.Trial: 
  memory estimate:  16 bytes
  allocs estimate:  1
  --------------
  minimum time:     63.526 ns (0.00% GC)
  median time:      63.929 ns (0.00% GC)
  mean time:        66.641 ns (1.99% GC)
  maximum time:     2.312 μs (96.59% GC)
  --------------
  samples:          10000
  evals/sample:     980

This slowness doesn't seem inevitable, because the amount of "work" (computation) performed by extrapolation is quite minor compared to the work done by interpolation:

julia> Interpolations.getindex_impl(typeof(etp))
quote  # /home/tim/.julia/v0.5/Interpolations/src/extrapolation/extrapolation.jl, line 53:
    @nexprs 3 (d->begin  # /home/tim/.julia/v0.5/Interpolations/src/extrapolation/extrapolation.jl, line 53:
                xs_d = xs[d]
            end) # /home/tim/.julia/v0.5/Interpolations/src/extrapolation/extrapolation.jl, line 54:
    begin 
        xs_1 = clamp(xs_1,lbound(etp,1),ubound(etp,1))
        xs_2 = clamp(xs_2,lbound(etp,2),ubound(etp,2))
        xs_3 = clamp(xs_3,lbound(etp,3),ubound(etp,3))
    end # /home/tim/.julia/v0.5/Interpolations/src/extrapolation/extrapolation.jl, line 55:
    etp.itp[xs_1,xs_2,xs_3]
end

Interpolation also involves a whole second round of clamps and throws in a bunch of floor/round and arithmetic operations to boot (16 multiplies and 16 adds for linear interpolation in 3 dimensions). So one might suspect this could be fixed through a careful look at the generated code.

The first commit here is almost trivial: it adds forced-inlining to circumvent the splatting penalty. This leads to an approximately 20ns reduction in this test case. The second commit is more complicated; size and indices, when called with a dimension argument, involve a branch that looks something like this:

size(A, d) = d <= ndims(A) ? size(A)[d] : 1

Our generated code used these dimension-specific constructs and triggered two branches per dimension (one for lbound and one for ubound), resulting in 6 branches total for this example. Note that such branches are not present if you start from sz = size(A), so I changed our generated code to use this style (with indices rather than size). This gave another approximate 20ns boost, so that the result with this PR is

julia> @benchmark $etp[2.2,2.2,2.2] seconds=1
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     26.879 ns (0.00% GC)
  median time:      27.188 ns (0.00% GC)
  mean time:        27.413 ns (0.00% GC)
  maximum time:     46.544 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     995

which is a much more reasonable overhead compared to interpolation.

sglyon · 2017-04-27T14:48:22Z

This looks fantastic. Great optimizing work.

I don't have time right now, but I will take a closer look at this before the end of the week (probably Friday).

If others look/review this and approve the PR, feel free to merge without me.

sglyon

Looks good!

timholy added 2 commits April 27, 2017 13:56

Speed up extrapolation by eliminating splat penalty

77f96ec

Speed up extrapolation by eliminating branches in indices/lbound/ubound

87964b1

timholy force-pushed the teh/faster_etp branch from 6ea564b to 87964b1 Compare April 27, 2017 18:57

sglyon approved these changes Apr 28, 2017

View reviewed changes

timholy merged commit 70e4128 into master Apr 28, 2017

timholy deleted the teh/faster_etp branch April 28, 2017 13:49

timholy mentioned this pull request May 9, 2017

Fix type instability when using custom indices JuliaLang/julia#21599

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Faster extrapolation#154

Faster extrapolation#154
timholy merged 2 commits intomasterfrom
teh/faster_etp

timholy commented Apr 27, 2017

Uh oh!

sglyon commented Apr 27, 2017

Uh oh!

sglyon left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

timholy commented Apr 27, 2017

Uh oh!

sglyon commented Apr 27, 2017

Uh oh!

sglyon left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants