perf: add Vector::Add, Sub and ScalarMul assembly (and purego) implementations #536

Merged
gbotrel merged 9 commits into master from experiment/vecops on Sep 12, 2024

Collaborator

@gbotrel gbotrel commented Sep 9, 2024

Description

Adds

// Add adds two vectors element-wise and stores the result in self.
// It panics if the vectors don't have the same length.
func (vector *Vector) Add(a, b Vector)

// Sub subtracts two vectors element-wise and stores the result in self.
// It panics if the vectors don't have the same length.
func (vector *Vector) Sub(a, b Vector)

// ScalarMul multiplies a vector by a scalar element-wise and stores the result in self.
// It panics if the vectors don't have the same length.
func (vector *Vector) ScalarMul(a Vector, b *Element)

Assembly (amd64 target only) is generated for modulus < 256 bits.
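
For readers who want a rough picture of the semantics, here is a minimal pure-Go sketch of the three methods. It is not the code from this PR: `Element` is swapped for a toy single-word type over the hypothetical modulus `q = 2^31 - 1`, whereas the library's `Element` is a multi-word value with its own arithmetic. Only the method signatures and the panic-on-length-mismatch behaviour are taken from the description above.

```go
package main

import "fmt"

// q is a toy prime modulus (2^31 - 1), an assumption for illustration only.
// It is small enough that the product of two reduced values fits in a uint64.
const q uint64 = 1<<31 - 1

// Element is a hypothetical single-word stand-in for the real field element.
type Element uint64

// Vector is a slice of field elements.
type Vector []Element

// checkLen panics unless every given length matches the receiver's length.
func (vector *Vector) checkLen(op string, lens ...int) {
	for _, n := range lens {
		if n != len(*vector) {
			panic("vector." + op + ": vectors don't have the same length")
		}
	}
}

// Add sets vector[i] = a[i] + b[i] mod q.
func (vector *Vector) Add(a, b Vector) {
	vector.checkLen("Add", len(a), len(b))
	for i := range a {
		s := uint64(a[i]) + uint64(b[i])
		if s >= q {
			s -= q
		}
		(*vector)[i] = Element(s)
	}
}

// Sub sets vector[i] = a[i] - b[i] mod q.
func (vector *Vector) Sub(a, b Vector) {
	vector.checkLen("Sub", len(a), len(b))
	for i := range a {
		d := uint64(a[i]) + q - uint64(b[i]) // adding q keeps the value non-negative
		if d >= q {
			d -= q
		}
		(*vector)[i] = Element(d)
	}
}

// ScalarMul sets vector[i] = a[i] * b mod q.
func (vector *Vector) ScalarMul(a Vector, b *Element) {
	vector.checkLen("ScalarMul", len(a))
	for i := range a {
		(*vector)[i] = Element(uint64(a[i]) * uint64(*b) % q)
	}
}

func main() {
	a := Vector{1, 2, Element(q - 1)}
	b := Vector{5, 6, 3}
	res := make(Vector, len(a)) // the receiver must already have the right length

	res.Add(a, b)
	fmt.Println("a + b =", res)

	res.Sub(a, b)
	fmt.Println("a - b =", res)

	two := Element(2)
	res.ScalarMul(a, &two)
	fmt.Println("2 * a =", res)
}
```

The receiver is written in place, so the caller allocates the result vector up front. That keeps the loops allocation-free, which is the shape an assembly kernel would want as well.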

Benchmarks on r7i.xlarge, x86 assembly vs. pure Go path (perf gains are less interesting on some AMD chips, like hpc6a instances):

benchmark                              old ns/op     new ns/op     delta
BenchmarkElementVecOps/Add-4           2779          1762          -36.60%
BenchmarkElementVecOps/Add-4           2787          1768          -36.56%
BenchmarkElementVecOps/Sub-4           2767          1706          -38.34%
BenchmarkElementVecOps/Sub-4           2779          1715          -38.29%
BenchmarkElementVecOps/ScalarMul-4     18014         13515         -24.98%
BenchmarkElementVecOps/ScalarMul-4     18030         13533         -24.94%

Good starting point against something using AVX like this.

Collaborator

@AlexandreBelling AlexandreBelling left a comment

LGTM, but do you have the branch on top of v0.10.1-0.20240904184047-9db0eff0e5d3?

@gbotrel gbotrel merged commit df40d22 into master Sep 12, 2024
@gbotrel gbotrel deleted the experiment/vecops branch September 12, 2024 15:10