The most significant change in this release is the rewriting of the DFT library in C++. This has made the code simpler, cleaner, and more stable. Next, tester4 has been introduced, which uses TLFloat instead of MPFR for testing. This allows for post-build testing in a wider range of environments than before. Various other changes were made to reduce project maintenance costs. For more information, see the list below.
-
Change templates for pull requests and submitting issues. by @shibatch in shibatch#617
-
Add jenkinsfile by @shibatch in shibatch#619
-
Fix cuda by @shibatch in shibatch#620
-
Add tester4 second attempt by @shibatch in shibatch#622
-
fix xexpfrexp by @longbiao7498 in shibatch#627
-
Port Sleef to AIX by @KamathForAIX in shibatch#629
-
Change CI settings by @shibatch in shibatch#630
-
Add QNX SDP 8.0, 7.1 Support by @eleir9268 in shibatch#631
-
Change hash algorithm from md5 to sha256 by @shibatch in shibatch#634
-
Change tester3printf algo by @shibatch in shibatch#635
-
Code cleanup by @shibatch in shibatch#637
-
Add qtester4 by @shibatch in shibatch#639
-
Update qtester4simd.cpp by @shibatch in shibatch#640
-
Make tester4 default by @shibatch in shibatch#642
-
Robust ieee128 detection by @shibatch in shibatch#643
-
Convert dft to cpp by @shibatch in shibatch#644
-
Update copyright years. by @shibatch in shibatch#646
-
Various updates to DFT by @shibatch in shibatch#652
-
Add DFT planner testing. by @shibatch in shibatch#653
-
Update DFT benchmark tool by @shibatch in shibatch#655
-
Add 2D DFT benchmark tool by @shibatch in shibatch#656
-
Add DFT benchmark results to the web page by @shibatch in shibatch#657
-
Improve dft performance by @shibatch in shibatch#658
-
Add missing exe suffix for host executables by @daschuer in shibatch#660
-
New dft planner by @shibatch in shibatch#662
-
Better 2D DFT planner by @shibatch in shibatch#663
-
Better logic for saving dft plans by @shibatch in shibatch#665
-
Add LTO testing by @shibatch in shibatch#666
-
Add comparison results of DFT performance on macbook pro. by @shibatch in shibatch#668
-
This patch enables testing with tester4 on windows with clang. by @shibatch in shibatch#669
-
Revert the entire web page to the HTML version by @shibatch in shibatch#674
-
@longbiao7498 made their first contribution in shibatch#627
-
@KamathForAIX made their first contribution in shibatch#629
-
@eleir9268 made their first contribution in shibatch#631
-
@daschuer made their first contribution in shibatch#660
The focus of this release has been to facilitate benchmarking in SLEEF. It does so by providing a benchmarking tool and a plotting tool to postprocess the results. AArch64 self-hosted runners have been added to CI. Following this, the Linux and compiler version have been updated. Fix inaccuracy issues in a few functions, failures with cpp checks and a few bugs. Finally, the project has been extended with a blog section and its first blog post.
-
Add benchmark and plotting tool by @joanaxcruz in #589, #597, #608 and #609
-
Use Arm-hosted runners by @blapie in #581
-
Add blog section and first post. by @blapie in #582
The focus of this release has been to meet open-source community standards. It does so by providing Contributing Guidelines, Issues and Pull-Requests templates. Additionally, the documentation has been reworked to improve navigation (via search bar, side menu/panel, eased navigation on GitHub, …) and maintainability (reduced line count, mostly markdown sources, …). The website rendering is now delegated to a template customisable theme. See the new website at sleef.org, and docs/ for the GitHub-rendered documentation. The release also provides various bug fixes on several targets, for CPU detection and in the benchmark infrastructure.
-
Add issue and PR templates. by @blapie in shibatch#565
-
Adjust scheduling of GHA workflows by @blapie in shibatch#553
-
Port documentation from html to markdown by @blapie in shibatch#564
-
Update acosh documentation by @joanaxcruz in shibatch#572
-
S/390: Use getauxval for detecting VXE2 to fix #560 by @Andreas-Krebbel in shibatch#561
-
Revive micro-benchmarks for vector functions by @joanaxcruz in shibatch#571
This patch release provides important bug fixes, including a fix for API compatibility with 3.5 (#534). The support and test for some features is still limited, as documented in README, however significant progress was made in order to test on Linux, macOS and Windows.
-
Add support for RISC-V in DFT, QUAD and inline headers (#503, #522).
-
Add GHA workflow to run CI tests on Windows x86 (#540) and macOS x86/aarch64 (#543). And update test matrix.
-
Add GHA workflows to run examples in CI (#550).
-
Cleanup/Improve support for RISC-V in LIBM (#520, #521).
-
Update supported environment in documentation (#529, #549), including website and test matrix from README.
This release follows a long period of inactivity. The library is now being actively maintained. However, the support and test for some features is currently limited, as documented in README.
-
Add documentation for the quad precision math library
-
Enable generation of inline header file for CUDA (PR #337)
-
Add support for System/390 z15 support (PR #343)
-
Add support for POWER 9 (PR #360)
-
Add quad-precision functions (PR #375, #377, #380, #381, #382, #383, #385, #386, #387)
-
Add preliminary support for iOS and Android (PR #388, #389)
-
Add OpenMP pragmas to the function declarations in sleef.h to enable auto-vectorization by GCC (PR #404, #406)
-
Add new public CI test infrastructure using GitHub Actions (PR #476)
-
Add support for RISC-V in libm (PR #477)
-
Optimise error functions (PR #370)
-
Update CMake package config (PR #412)
-
Update documentation and move doc/website to main repository (PR #504, #513)
-
Add SLEEF_ prefix to user-facing CMake options (PR #509)
-
Disable SVE on Darwin (PR #512)
-
IBM System/390 support is added.
-
The library can be built with Clang on Windows.
-
Static libraries with LTO can be generated.
-
Alternative division and sqrt methods can be chosen with AArch64.
-
Header files for inlining the whole SLEEF functions can be generated.
-
IEEE remainder function is added.
-
GCC-10 can now build SLEEF with SVE support.
-
Fixed accuracy problem with tan_u35, atan_u10, log2f_u35 and exp10f_u10. shibatch#260 shibatch#265 shibatch#267
-
SVE intrinsics that are not supported in newer ACLE are replaced. shibatch#268
-
FMA4 detection problem is fixed. shibatch#262
-
Compilation problem under Windows with MinGW is fixed. shibatch#266
-
Faster and low precision functions are added. shibatch#229
-
Functions that return consistent results across platforms are added shibatch#216 shibatch#224
-
Quad precision math library(libsleefquad) is added shibatch#235 shibatch#237 shibatch#240
-
AArch64 Vector Procedure Call Standard (AAVPCS) support. # Changed
-
Many functions are now faster
-
Testers are now faster
-
FreeBSD support is added # Changed
-
i386 build problem is fixed
-
Trigonometric functions now evaluate correctly with full FP domain. shibatch#210
-
SVE target support is added to libsleef. shibatch#180
-
SVE target support is added to DFT. With this patch, DFT operations can be carried out using 256, 512, 1024 and 2048-bit wide vectors according to runtime availability of vector registers and operators. shibatch#182
-
3.5-ULP versions of sinh, cosh, tanh, sinhf, coshf, tanhf, and the corresponding testing functionalities are added. shibatch#192
-
Power VSX target support is added to libsleef. shibatch#195
-
Payne-Hanek like argument reduction is added to libsleef. shibatch#197
-
The whole build system of the project migrated from makefiles to cmake. In particualr this includes
libsleef,libsleefgnuabi,libdftand all the tests. -
Benchmarks that compare
libsleefvsSVMLon X86 Linux are available in the project tree under src/libm-benchmarks directory. -
Extensive upstream testing via Travis CI and Appveyor, on the following systems:
-
OS: Windows / Linux / OSX.
-
Compilers: gcc / clang / MSVC.
-
Targets: X86 (SSE/AVX/AVX2/AVX512F), AArch64 (Advanced SIMD), ARM (NEON). Emulators like QEMU or SDE can be used to run the tests.
-
-
Added the following new vector functions (with relative testing):
-
log2
-
-
New compatibility tests have been added to check that
libsleefgnuabiexports the GNUABI symbols correctly. -
The library can be compiled to an LLVM bitcode object.
-
Added masked interface to the library to support AVX512F masked vectorization.
-
Use native instructions if available for
sqrt. -
Fixed fmax and fmin behavior on AArch64: shibatch#140
-
Speed improvements for
asin,acos,fmodandlog. Computation speed of other functions are also improved by general optimization. shibatch#97 -
Removed
libmdependency.
-
Added AArch64 support
-
Implemented the remaining C99 math functions : lgamma, tgamma, erf, erfc, fabs, copysign, fmax, fmin, fdim, trunc, floor, ceil, round, rint, modf, ldexp, nextafter, frexp, hypot, and fmod.
-
Added dispatcher for x86 functions
-
Improved reduction of trigonometric functions
-
Added support for 32-bit x86, Cygwin, etc.
-
Improved tester
-
New API is defined
-
Functions for DFT are added
-
sincospi functions are added
-
gencoef now supports single, extended and quad precision in addition to double precision
-
Linux, Windows and Mac OS X are supported
-
GCC, Clang, Intel Compiler, Microsoft Visual C++ are supported
-
The library can be compiled as DLLs
-
Files needed for creating a debian package are now included
-
The valid range of argument is extended for trig functions
-
Specification of each functions regarding to the domain and accuracy is added
-
A coefficient generation tool is added
-
New testing tools are introduced
-
Following functions returned incorrect values when the argument is very large or small : exp, pow, asinh, acosh
-
SIMD xsin and xcos returned values more than 1 when FMA is enabled
-
Pure C cbrt returned incorrect values when the argument is negative
-
tan_u1 returned values with more than 1 ulp of error on rare occasions
-
Removed support for Java language(because no one seems using this)
-
Added ilogbf. All the reported bugs(listed below) are fixed.
-
Log function returned incorrect values when the argument is very small.
-
Signs of returned values were incorrect when the argument is signed zero.
-
Tester incorrectly counted ULP in some cases.
-
ilogb function returned incorrect values in some cases.
-
Added support for ARM NEON. Added higher accuracy single precision functions : sinf_u1, cosf_u1, sincosf_u1, tanf_u1, asinf_u1, acosf_u1, atanf_u1, atan2f_u1, logf_u1, and cbrtf_u1.
-
Added higher accuracy functions : sin_u1, cos_u1, sincos_u1, tan_u1, asin_u1, acos_u1, atan_u1, atan2_u1, log_u1, and cbrt_u1. These functions evaluate the corresponding function with at most 1 ulp of error.
-
Added the remaining single precision functions : powf, sinhf, coshf, tanhf, exp2f, exp10f, log10f, log1pf. Added support for FMA4 (for AMD Bulldozer). Added more test cases. Fixed minor bugs (which degraded accuracy in some rare cases).
-
Fixed incorrect denormal/nonnumber handling in ldexp, ldexpf, sinf and cosf. Removed support for Go language.
-
Added single precision functions : sinf, cosf, tanf, asinf, acosf, atanf, logf, expf, atan2f and cbrtf.
-
asin() and acos() are back.
-
Added ilogb() and ldexp().
-
Added hyperbolic functions.
-
Eliminated dependency on frexp, ldexp, fabs, isnan and isinf.
-
All of the algorithm has been updated.
-
Both accuracy and speed are improved since version 1.10.
-
Denormal number handling is also improved.