[llama.cpp] Update llama.cpp to latest release b2581 (2024-03-30) #3055
Closed
howlger wants to merge 148 commits into deepjavalibrary:master
--------- Co-authored-by: Administrator <Administrator@tech8> Co-authored-by: KexinFeng <fenkexin@amazon.com>
* Implement PtNDArraryEx.multiboxDetection
* MultiboxDetection - code cleanup
* MultiboxDetection - code cleanup
* MultiboxDetection - code cleanup
* MultiboxDetection - code cleanup
* format code
* Fix, add tests, and pass CI
---------
Co-authored-by: Zach Kimberg <kimbergz@amazon.com>
…brary#2796) This reverts commit 3a90d0a.
This fixes the markdown headers to be h1 so they render correctly in docs.
…valibrary#2806)
* [api] Added Early stopping configuration (deepjavalibrary#38)
* [api] Added Builder for Early stopping configuration (deepjavalibrary#38)
* Explicitly set NDManager for dataset in EarlyStoppingListenerTest to make the test run on JDK11 in gradle.
This creates an abstraction for combining devices into a single device. The main use case for now is in DJL Serving TP_parallel. It will allow us to create a WorkerGroup and a PyPredictor for a set of devices and then track the usage of devices properly. It could also be used later for multi-gpu training or other multi-device cases.
* Updates doc versions to 0.24.0
  Also moves android gradle.properties to the new 0.25.0.
* Remove android change

* Updates XGBoost to 2.0.1
* Use devtools 8
* Updates based on new Xgboost JNI API.
---------
Co-authored-by: Frank Liu <frankfliu2000@gmail.com>

* Added element-wise gauss error function (ERF)
* Added element-wise arctan2
* Format java
* Fixed docs
* added * to other_ptr in Atan2

* Added 2D FFT
* Format java
* Add default fft2
* Convert array to vectors
* Add inverse fft2
* Add better assersion in ifft2 test
* Add really better assersion in ifft2 test
* Move cast bellow ifft2 for unsupported exception
* Format java
* changed dims to axes
* changed dims to axes

* only build triton binaries
* install requests library
* remove script
Updates the navigation as a followup to deepjavalibrary/djl-serving#1316.
…brary#3032)
* support includeTokenTypes in TextEmbeddingBatchTranslator
Co-authored-by: Frank Liu <frankfliu2000@gmail.com>

* Increase DJL version to 0.27.0
* Update README
In order to get support for BERT-based sentence embedding models like BAAI/bge-base-en-v1.5, mixedbread-ai/mxbai-embed-large-v1, or others, update llama.cpp from b1696 (2023-12-12):
https://github.com/ggerganov/llama.cpp/releases/tag/b1696
to the current latest release b2581 (2024-03-30):
https://github.com/ggerganov/llama.cpp/releases/tag/b2581

BERT support was added to llama.cpp in February 2024: ggml-org/llama.cpp#5423
frankfliu approved these changes Apr 1, 2024
frankfliu suggested changes Apr 1, 2024
Contributor frankfliu left a comment
The llama.cpp implementation has changed, so just bumping the version won't work. We have to change the JNI code to make it compile.
Author
I see. Thanks for taking the time to try it. The native libraries are built for all platforms with Native S3 llama.cpp, right? I can't find the log of the failed build. Could you please share it?
This change has not yet been tested. Maybe updating the Gradle property `llamacpp_version` is not enough and `ai_djl_llama.cpp` needs to be adapted as well. If so, please do so.