Skip to content

[WIP] ARM: Add some inline assembly for bigint#31

Closed
redstar wants to merge 1 commit intoldc-developers:ldc-ltsmasterfrom
redstar:biguint
Closed

[WIP] ARM: Add some inline assembly for bigint#31
redstar wants to merge 1 commit intoldc-developers:ldc-ltsmasterfrom
redstar:biguint

Conversation

@redstar
Copy link
Copy Markdown
Member

@redstar redstar commented Mar 31, 2016

Implements the internal interface in ARM assembly.

@redstar
Copy link
Copy Markdown
Member Author

redstar commented Mar 31, 2016

@smolt @joakim-noah Is this useful? For sure I need to add loop unrolling.

@joakim-noah
Copy link
Copy Markdown

Looks like it might be, but its release tests assert at line 90 on Android/ARM and it causes test failures in other dependent modules like std.bigint too (at line 236 in ltsmaster).

@redstar
Copy link
Copy Markdown
Member Author

redstar commented Mar 31, 2016

@joakim-noah Thanks for trying. I try to correct the failures. (Didn't see them yesterday on my arm box but I made some more changes this morning.)

@JohanEngelen
Copy link
Copy Markdown
Member

@redstar Just a suggestion. Isn't it possible to do this in IR? Perhaps then you get the optimization for free. Or can LLVM also optimize the asm?

@redstar
Copy link
Copy Markdown
Member Author

redstar commented Mar 31, 2016

IR is no better than D. With D and IR you have to use the overflow intrinsics which leads to horrible codegen. I tried it with x86_64 and it is only marginable better than the D version.

@smolt
Copy link
Copy Markdown
Member

smolt commented Mar 31, 2016

Any opportunity to write assembly code should be taken advantage of 😄

I am on vacation, would like to spend time on on this when I am worn out.

@JohanEngelen
Copy link
Copy Markdown
Member

Any opportunity to write assembly code should be taken advantage of

Haha, indeed!
Thanks @redstar for increasing my ammunition in discussions about asm :)

Enjoy the snow @smolt .

@redstar redstar force-pushed the biguint branch 9 times, most recently from 9cad644 to fbfa504 Compare April 3, 2016 13:32
Implements the internal interface in ARM assembly.
@redstar
Copy link
Copy Markdown
Member Author

redstar commented Apr 3, 2016

The implemented functions now pass all tests, including -O3 -release. Currently measured speedup: multibyteAddSub!('+'): 2.2, multibyteMul: 1.4 and multibyteMulAdd!('+'): 1.2. There is still room for improvement.
To measure the timings I compiled the binary with: ldc2 -release -O3 -singleobj -unittest -d-version=timings biguintarm.d biguintnoasm.d.

@joakim-noah
Copy link
Copy Markdown

All tests now pass on Android/ARM when compiled in release mode, the new module's tests take 6 seconds to run.

@redstar
Copy link
Copy Markdown
Member Author

redstar commented Apr 21, 2016

@jpf91 Maybe this is interesting for gdc, too?

@jpf91
Copy link
Copy Markdown

jpf91 commented Apr 22, 2016

That's certainly interesting, thanks for the ping :-)

Is the LDC inline ASM string and the constraint string GCC compatible? Then we could probably implement the __asm interface in gdc to provide a common inline ASM syntax for LDC and GDC. ping @ibuclaw

@redstar
Copy link
Copy Markdown
Member Author

redstar commented Apr 22, 2016

@jpf91 Unfortunately the constraint string is not GCC compatible. (The content is but not the syntax.) Maybe we can agree on some standard here?

@ibuclaw
Copy link
Copy Markdown

ibuclaw commented Apr 23, 2016

Does the constraints string match what the LLVM backend expects? Or is it specially handled in LDC?

@dnadlinger
Copy link
Copy Markdown
Member

It is a backend thing.

@ibuclaw
Copy link
Copy Markdown

ibuclaw commented Apr 23, 2016

Then we should continue doing what is most natural for our respective backends. :-)

@redstar
Copy link
Copy Markdown
Member Author

redstar commented Jul 10, 2016

Merged in master now.

@redstar redstar closed this Jul 10, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants