sleef/CHANGELOG.adoc at master · Blosc/sleef

3.9 - 2025-03-27

The most significant change in this release is the rewriting of the DFT library in C++. This has made the code simpler, cleaner, and more stable. Next, tester4 has been introduced, which uses TLFloat instead of MPFR for testing. This allows for post-build testing in a wider range of environments than before. Various other changes were made to reduce project maintenance costs. For more information, see the list below.

What’s Changed

Change templates for pull requests and submitting issues. by @shibatch in shibatch#617
Add jenkinsfile by @shibatch in shibatch#619
Fix cuda by @shibatch in shibatch#620
Add tester4 second attempt by @shibatch in shibatch#622
fix xexpfrexp by @longbiao7498 in shibatch#627
Port Sleef to AIX by @KamathForAIX in shibatch#629
Change CI settings by @shibatch in shibatch#630
Add QNX SDP 8.0, 7.1 Support by @eleir9268 in shibatch#631
Change hash algorithm from md5 to sha256 by @shibatch in shibatch#634
Change tester3printf algo by @shibatch in shibatch#635
Code cleanup by @shibatch in shibatch#637
Add qtester4 by @shibatch in shibatch#639
Update qtester4simd.cpp by @shibatch in shibatch#640
Make tester4 default by @shibatch in shibatch#642
Robust ieee128 detection by @shibatch in shibatch#643
Convert dft to cpp by @shibatch in shibatch#644
Update copyright years. by @shibatch in shibatch#646
Various updates to DFT by @shibatch in shibatch#652
Add DFT planner testing. by @shibatch in shibatch#653
Update DFT benchmark tool by @shibatch in shibatch#655
Add 2D DFT benchmark tool by @shibatch in shibatch#656
Add DFT benchmark results to the web page by @shibatch in shibatch#657
Improve dft performance by @shibatch in shibatch#658
Add missing exe suffix for host executables by @daschuer in shibatch#660
New dft planner by @shibatch in shibatch#662
Better 2D DFT planner by @shibatch in shibatch#663
Better logic for saving dft plans by @shibatch in shibatch#665
Add LTO testing by @shibatch in shibatch#666
Add comparison results of DFT performance on macbook pro. by @shibatch in shibatch#668
This patch enables testing with tester4 on windows with clang. by @shibatch in shibatch#669
Revert the entire web page to the HTML version by @shibatch in shibatch#674

New Contributors

@longbiao7498 made their first contribution in shibatch#627
@KamathForAIX made their first contribution in shibatch#629
@eleir9268 made their first contribution in shibatch#631
@daschuer made their first contribution in shibatch#660

Full Changelog: https://github.com/shibatch/sleef/compare/3.8…3.9.0

3.8 - 2025-01-27

The focus of this release has been to facilitate benchmarking in SLEEF. It does so by providing a benchmarking tool and a plotting tool to postprocess the results. AArch64 self-hosted runners have been added to CI. Following this, the Linux and compiler version have been updated. Fix inaccuracy issues in a few functions, failures with cpp checks and a few bugs. Finally, the project has been extended with a blog section and its first blog post.

Added

Add benchmark and plotting tool by @joanaxcruz in #589, #597, #608 and #609
Use Arm-hosted runners by @blapie in #581
Add blog section and first post. by @blapie in #582

Changed

Update GH runners to Ubuntu 24.04 and GCC14 by @blapie in #598, #599 and #601

Fixed

Fix cbrt on AArch32, and atanf(+-0) with gcc-13 by @shibatch in #592
Fix oflow bound in log1p(f), exp and pow by @blapie in #604 and #606
Work around removal of some PowerPC intrinsics in GCC 15 by @musicinmybrain in #612
Fix errors reported by cppcheck by @blapie in #595

3.7 - 2024-09-17

The focus of this release has been to meet open-source community standards. It does so by providing Contributing Guidelines, Issues and Pull-Requests templates. Additionally, the documentation has been reworked to improve navigation (via search bar, side menu/panel, eased navigation on GitHub, …) and maintainability (reduced line count, mostly markdown sources, …). The website rendering is now delegated to a template customisable theme. See the new website at sleef.org, and docs/ for the GitHub-rendered documentation. The release also provides various bug fixes on several targets, for CPU detection and in the benchmark infrastructure.

Added

Add issue and PR templates. by @blapie in shibatch#565

Changed

Adjust scheduling of GHA workflows by @blapie in shibatch#553
Port documentation from html to markdown by @blapie in shibatch#564
Update acosh documentation by @joanaxcruz in shibatch#572

Fixed

S/390: Use getauxval for detecting VXE2 to fix #560 by @Andreas-Krebbel in shibatch#561
Revive micro-benchmarks for vector functions by @joanaxcruz in shibatch#571

3.6.1 - 2024-06-10

This patch release provides important bug fixes, including a fix for API compatibility with 3.5 (#534). The support and test for some features is still limited, as documented in README, however significant progress was made in order to test on Linux, macOS and Windows.

Added

Add support for RISC-V in DFT, QUAD and inline headers (#503, #522).
Add GHA workflow to run CI tests on Windows x86 (#540) and macOS x86/aarch64 (#543). And update test matrix.
Add GHA workflows to run examples in CI (#550).

Changed

Cleanup/Improve support for RISC-V in LIBM (#520, #521).
Update supported environment in documentation (#529, #549), including website and test matrix from README.

Fixed

Major fix and cleanup of CMakeLists.txt (#531).
Fix compatibility issue after removal of quad and long double sincospi (#545). Restores functions that are missing in 3.6.
Various bug fixes (#528, #533, #536, #537).

3.6 - 2024-02-14

This release follows a long period of inactivity. The library is now being actively maintained. However, the support and test for some features is currently limited, as documented in README.

Added

Add documentation for the quad precision math library
Enable generation of inline header file for CUDA (PR #337)
Add support for System/390 z15 support (PR #343)
Add support for POWER 9 (PR #360)
Add quad-precision functions (PR #375, #377, #380, #381, #382, #383, #385, #386, #387)
Add preliminary support for iOS and Android (PR #388, #389)
Add OpenMP pragmas to the function declarations in sleef.h to enable auto-vectorization by GCC (PR #404, #406)
Add new public CI test infrastructure using GitHub Actions (PR #476)
Add support for RISC-V in libm (PR #477)

Removed

Remove old CI scripts based on Travis/Jenkins/Appveyor (PR #502)

Changed

Optimise error functions (PR #370)
Update CMake package config (PR #412)
Update documentation and move doc/website to main repository (PR #504, #513)
Add SLEEF_ prefix to user-facing CMake options (PR #509)
Disable SVE on Darwin (PR #512)

Fixed

Fix parallel builds with GNU make (PR #491)
Various bug fixes (PR #492, #499, #508)

3.5.1 - 2020-09-15

Changed

Fixed a bug in handling compiler options

3.5 - 2020-09-01

IBM System/390 support is added.
The library can be built with Clang on Windows.
Static libraries with LTO can be generated.
Alternative division and sqrt methods can be chosen with AArch64.
Header files for inlining the whole SLEEF functions can be generated.
IEEE remainder function is added.
GCC-10 can now build SLEEF with SVE support.

3.4.1 - 2019-10-01

Changed

Fixed accuracy problem with tan_u35, atan_u10, log2f_u35 and exp10f_u10. shibatch#260 shibatch#265 shibatch#267
SVE intrinsics that are not supported in newer ACLE are replaced. shibatch#268
FMA4 detection problem is fixed. shibatch#262
Compilation problem under Windows with MinGW is fixed. shibatch#266

3.4 - 2019-04-28

Added

Faster and low precision functions are added. shibatch#229
Functions that return consistent results across platforms are added shibatch#216 shibatch#224
Quad precision math library(libsleefquad) is added shibatch#235 shibatch#237 shibatch#240
AArch64 Vector Procedure Call Standard (AAVPCS) support. # Changed
Many functions are now faster
Testers are now faster

3.3.1 - 2018-08-20

Added

FreeBSD support is added # Changed
i386 build problem is fixed
Trigonometric functions now evaluate correctly with full FP domain. shibatch#210

3.3 - 2018-07-06

Added

SVE target support is added to libsleef. shibatch#180
SVE target support is added to DFT. With this patch, DFT operations can be carried out using 256, 512, 1024 and 2048-bit wide vectors according to runtime availability of vector registers and operators. shibatch#182
3.5-ULP versions of sinh, cosh, tanh, sinhf, coshf, tanhf, and the corresponding testing functionalities are added. shibatch#192
Power VSX target support is added to libsleef. shibatch#195
Payne-Hanek like argument reduction is added to libsleef. shibatch#197

3.2 - 2018-02-26

Added

The whole build system of the project migrated from makefiles to cmake. In particualr this includes libsleef, libsleefgnuabi, libdft and all the tests.
Benchmarks that compare libsleef vs SVML on X86 Linux are available in the project tree under src/libm-benchmarks directory.
Extensive upstream testing via Travis CI and Appveyor, on the following systems:
- OS: Windows / Linux / OSX.
- Compilers: gcc / clang / MSVC.
- Targets: X86 (SSE/AVX/AVX2/AVX512F), AArch64 (Advanced SIMD), ARM (NEON). Emulators like QEMU or SDE can be used to run the tests.
Added the following new vector functions (with relative testing):
- log2
New compatibility tests have been added to check that libsleefgnuabi exports the GNUABI symbols correctly.
The library can be compiled to an LLVM bitcode object.
Added masked interface to the library to support AVX512F masked vectorization.

Changed

Use native instructions if available for sqrt.
Fixed fmax and fmin behavior on AArch64: shibatch#140
Speed improvements for asin, acos, fmod and log. Computation speed of other functions are also improved by general optimization. shibatch#97
Removed libm dependency.

Removed

Makefile build system

3.1 - 2017-07-19

Added AArch64 support
Implemented the remaining C99 math functions : lgamma, tgamma, erf, erfc, fabs, copysign, fmax, fmin, fdim, trunc, floor, ceil, round, rint, modf, ldexp, nextafter, frexp, hypot, and fmod.
Added dispatcher for x86 functions
Improved reduction of trigonometric functions
Added support for 32-bit x86, Cygwin, etc.
Improved tester

3.0 - 2017-02-07

New API is defined
Functions for DFT are added
sincospi functions are added
gencoef now supports single, extended and quad precision in addition to double precision
Linux, Windows and Mac OS X are supported
GCC, Clang, Intel Compiler, Microsoft Visual C++ are supported
The library can be compiled as DLLs
Files needed for creating a debian package are now included

2.120 - 2017-01-30

Relicensed to Boost Software License Version 1.0

2.110 - 2016-12-11

The valid range of argument is extended for trig functions
Specification of each functions regarding to the domain and accuracy is added
A coefficient generation tool is added
New testing tools are introduced
Following functions returned incorrect values when the argument is very large or small : exp, pow, asinh, acosh
SIMD xsin and xcos returned values more than 1 when FMA is enabled
Pure C cbrt returned incorrect values when the argument is negative
tan_u1 returned values with more than 1 ulp of error on rare occasions
Removed support for Java language(because no one seems using this)

2.100 - 2016-12-04

Added support for AVX-512F and Clang Extended Vectors.

2.90 - 2016-11-27

Added ilogbf. All the reported bugs(listed below) are fixed.
Log function returned incorrect values when the argument is very small.
Signs of returned values were incorrect when the argument is signed zero.
Tester incorrectly counted ULP in some cases.
ilogb function returned incorrect values in some cases.

2.80 - 2013-05-18

Added support for ARM NEON. Added higher accuracy single precision functions : sinf_u1, cosf_u1, sincosf_u1, tanf_u1, asinf_u1, acosf_u1, atanf_u1, atan2f_u1, logf_u1, and cbrtf_u1.

2.70 - 2013-04-30

Added higher accuracy functions : sin_u1, cos_u1, sincos_u1, tan_u1, asin_u1, acos_u1, atan_u1, atan2_u1, log_u1, and cbrt_u1. These functions evaluate the corresponding function with at most 1 ulp of error.

2.60 - 2013-03-26

Added the remaining single precision functions : powf, sinhf, coshf, tanhf, exp2f, exp10f, log10f, log1pf. Added support for FMA4 (for AMD Bulldozer). Added more test cases. Fixed minor bugs (which degraded accuracy in some rare cases).

2.50 - 2013-03-12

Added support for AVX2. SLEEF now compiles with ICC.

2.40 - 2013-03-07

Fixed incorrect denormal/nonnumber handling in ldexp, ldexpf, sinf and cosf. Removed support for Go language.

2.31 - 2012-07-05

Added sincosf.

2.30 - 2012-01-20

Added single precision functions : sinf, cosf, tanf, asinf, acosf, atanf, logf, expf, atan2f and cbrtf.

2.20 - 2012-01-09

Added exp2, exp10, expm1, log10, log1p, and cbrt.

2.10 - 2012-01-05

asin() and acos() are back.
Added ilogb() and ldexp().
Added hyperbolic functions.
Eliminated dependency on frexp, ldexp, fabs, isnan and isinf.

2.00 - 2011-12-30

All of the algorithm has been updated.
Both accuracy and speed are improved since version 1.10.
Denormal number handling is also improved.

1.10 - 2010-06-22

AVX support is added. Accuracy tester is added.

1.00 - 2010-05-15

Initial release

FilesExpand file tree

CHANGELOG.adoc

Latest commit

History

CHANGELOG.adoc

File metadata and controls

3.9 - 2025-03-27

What’s Changed

New Contributors

3.8 - 2025-01-27

Added

Changed

Fixed

3.7 - 2024-09-17

Added

Changed

Fixed

3.6.1 - 2024-06-10

Added

Changed

Fixed

3.6 - 2024-02-14

Added

Removed

Changed

Fixed

3.5.1 - 2020-09-15

Changed

3.5 - 2020-09-01

3.4.1 - 2019-10-01

Changed

3.4 - 2019-04-28

Added

3.3.1 - 2018-08-20

Added

3.3 - 2018-07-06

Added

3.2 - 2018-02-26

Added

Changed

Removed

3.1 - 2017-07-19

3.0 - 2017-02-07

2.120 - 2017-01-30

2.110 - 2016-12-11

2.100 - 2016-12-04

2.90 - 2016-11-27

2.80 - 2013-05-18

2.70 - 2013-04-30

2.60 - 2013-03-26

2.50 - 2013-03-12

2.40 - 2013-03-07

2.31 - 2012-07-05

2.30 - 2012-01-20

2.20 - 2012-01-09

2.10 - 2012-01-05

2.00 - 2011-12-30

1.10 - 2010-06-22

1.00 - 2010-05-15