WIP: Integrate ISA-L & Generalised Erasure Coding. #80
Draft
BlamKiwi wants to merge 367 commits into koverstreet:master from
We weren't checking for errors when trying to delete stripes, which meant ec_stripe_delete_work() would spin trying to delete the same stripe over and over. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
If there is only a single entry at 0, the first time we call xas_next(), we return the entry. Unfortunately, all subsequent times we call xas_next(), we also return the entry at 0 instead of noticing that the xa_index is now greater than zero. This broke find_get_pages_contig(). Fixes: 64d3e9a ("xarray: Step through an XArray") Reported-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
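The failure mode described in the commit above can be illustrated with a toy model. This is not the kernel's XArray code: `toy_store`, `toy_cursor` and the `toy_next_*` helpers are hypothetical names invented for this sketch, which only assumes a store holding a single entry at index 0 and a cursor that advances its index on every call after the first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: a store that holds at most one entry, at index 0. */
struct toy_store {
	void *head;
};

struct toy_cursor {
	const struct toy_store *store;
	unsigned long index;
	bool started;	/* the first call loads at the current index */
};

static void toy_advance(struct toy_cursor *c)
{
	if (c->started)
		c->index++;
	else
		c->started = true;
}

/* Buggy variant: returns the entry at 0 without looking at the index. */
static void *toy_next_buggy(struct toy_cursor *c)
{
	toy_advance(c);
	return c->store->head;
}

/* Fixed variant: the entry is only visible while the index is still 0. */
static void *toy_next_fixed(struct toy_cursor *c)
{
	toy_advance(c);
	return c->index == 0 ? c->store->head : NULL;
}

/* Count contiguous entries, bounded by max so the buggy variant terminates. */
static unsigned toy_count_contig(struct toy_cursor *c,
				 void *(*next)(struct toy_cursor *),
				 unsigned max)
{
	unsigned n = 0;

	while (n < max && next(c))
		n++;
	return n;
}
```

With the buggy variant a contiguous scan never sees the end of the run, which is the kind of behaviour that would break a caller such as find_get_pages_contig().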
There was a null pointer dereference when no stripes heap had been allocated. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Change it to not mark keys that will be overwritten by keys in the journal - this fixes a bug where we pop an assertion in bucket_set_stripe() because of a stale pointer - because the stripe that has the stale pointer has been deleted. This code could be factored out and used elsewhere, at some point. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Actual repair code will come later, but this is a start Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
With reflink, we'll no longer be able to calculate the offset of the data we want into the extent we're reading from using the extent pos and the iter pos - we'll have to pass it in separately. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
for_each_btree_key() calls bch2_trans_get_iter() - we have to reset the transaction state before getting the iterator again, in the retry path Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Where unlink_on_commit is used, on unsuccessful commit we're likely retrying the whole update and we're going to be using the same iterators again. The management of multiple iterators needs to be gone over a fair bit more at some point... Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Prep work for reflink - for reflink, we're going to be using bch2_extent_update() with other updates in the same transaction. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Minor cleanup - prep work for new key types for reflink Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
With reflink, various code now has to handle either KEY_TYPE_extent or KEY_TYPE_reflink_v - so, convert it to be generic across all keys with pointers. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
More prep work for reflink: for extents, we're not looking for an exact match on pos, rather that the pos is within the range of the key the iterator points to. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
bch2_btree_node_iter_prev_filter() tried to be smart about iterating backwards when skipping over whiteouts/discards - but unfortunately, doing so can leave the node iterator in an inconsistent state; the sane solution is to just always iterate backwards one key at a time. But we compact btree nodes when more than a quarter of the keys are whiteouts/discards, so the optimization wasn't buying us that much anyways. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
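As a rough illustration of "iterate backwards one key at a time", here is a toy sketch over a flat array. It is not bcachefs code: `toy_key`, `toy_prev_live` and `toy_should_compact` are hypothetical names, and the real node iterator works over several sorted sets rather than one array. The compaction check assumes the "more than a quarter" rule quoted in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy key: a position plus a flag marking it as a whiteout/discard. */
struct toy_key {
	unsigned pos;
	bool whiteout;
};

/*
 * Step backwards one key at a time from idx, skipping whiteouts.
 * Returns the index of the nearest live key before idx, or -1 if none.
 */
static ptrdiff_t toy_prev_live(const struct toy_key *keys, ptrdiff_t idx)
{
	while (--idx >= 0)
		if (!keys[idx].whiteout)
			return idx;
	return -1;
}

/* True when more than a quarter of the keys are whiteouts. */
static bool toy_should_compact(const struct toy_key *keys, size_t nr)
{
	size_t dead = 0;

	for (size_t i = 0; i < nr; i++)
		dead += keys[i].whiteout;
	return dead * 4 > nr;
}
```

Because compaction bounds the fraction of whiteouts, the simple loop skips only a limited number of dead keys per step on average, which is why a smarter skip buys little.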
8f49267 to 0b5e3ee
ad68801 to 6e8f25f
83dd3db to 6a3927a
b0f77a0 to f2700b9
da5ffff to 75a3eb8
4b2d093 to 45665ce
16cbc9a to 8fc58b1
fa97ffc to a4c0a23
Over the weekend I got ISA-L building and integrated CRC64 (a 5-15x speed-up on a Ryzen 2200G) as a quick proof point. I would like some quick feedback before tackling full Erasure Coding.
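When swapping in an accelerated CRC64, a portable reference is useful for checking that the fast path produces identical results. The sketch below is not ISA-L or kernel code: it assumes the ECMA-182 polynomial in MSB-first form with a caller-supplied seed purely for illustration (ISA-L ships several CRC64 variants, and which one is used here is not stated), and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: ECMA-182 polynomial, MSB-first (non-reflected). */
#define CRC64_POLY 0x42F0E1EBA9EA3693ULL

/* Reference implementation: one bit at a time. */
static uint64_t crc64_bitwise(uint64_t crc, const void *buf, size_t len)
{
	const uint8_t *p = buf;

	while (len--) {
		crc ^= (uint64_t)*p++ << 56;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc & (1ULL << 63)) ? (crc << 1) ^ CRC64_POLY
						   : crc << 1;
	}
	return crc;
}

static uint64_t crc64_table[256];

/* Build the byte-at-a-time table from the bitwise reference. */
static void crc64_table_init(void)
{
	for (unsigned i = 0; i < 256; i++) {
		uint8_t byte = (uint8_t)i;

		crc64_table[i] = crc64_bitwise(0, &byte, 1);
	}
}

/* Stand-in for a faster path: one table lookup per byte. */
static uint64_t crc64_bytewise(uint64_t crc, const void *buf, size_t len)
{
	const uint8_t *p = buf;

	while (len--)
		crc = crc64_table[(crc >> 56) ^ *p++] ^ (crc << 8);
	return crc;
}

/* Check that both paths agree for every prefix and for a split update. */
static void crc64_selftest(void)
{
	static const char msg[] = "bcachefs erasure coding";
	size_t len = strlen(msg);

	crc64_table_init();
	for (size_t n = 0; n <= len; n++)
		assert(crc64_bitwise(0, msg, n) == crc64_bytewise(0, msg, n));
	assert(crc64_bytewise(crc64_bytewise(0, msg, 5), msg + 5, len - 5) ==
	       crc64_bitwise(0, msg, len));
}
```

The same comparison could be pointed at a SIMD implementation in place of `crc64_bytewise`, provided both sides use the same polynomial, bit order and seed convention.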
KBuild -
I've added ISA-L and EC as boolean options in KBuild. I assume you don't want EC support as a separate module?
Makefile -
The ISA-L code builds unmodified from Intel's upstream. This results in very verbose KBuild special rules, because of the NASM dependency and the unnecessary CRC implementations. I would appreciate advice on a better approach until I can port ISA-L to GAS and strip out the unused code.
Accel.h/c -
This is a temporary integration point for accessing optimised primitives. I intend to move them to the appropriate kernel lib folders once everything is working.
MD-RAID Compatibility -
The website TODO list mentions Andrea Mazzoleni's technique of combining Vandermonde and Cauchy matrices to implement Erasure Coding compatible with MD-RAID. I won't be implementing this technique to begin with. Once things are stable I will dig into the mathematics.
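To make the compatibility question concrete, the sketch below compares RAID-6 style P/Q parity (P is a plain XOR, Q weights block i by 2^i in GF(2^8) with polynomial 0x11d) against the coefficients of a plain Cauchy parity row. It is a minimal illustration, not ISA-L or MD-RAID code: the helper names are hypothetical, and the Cauchy layout `1 / ((k + r) ^ j)` is an assumption modelled on how ISA-L's Cauchy generator is commonly described.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d). */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
	uint8_t r = 0;

	while (b) {
		if (b & 1)
			r ^= a;
		a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
		b >>= 1;
	}
	return r;
}

/* Inverse via a^254 (valid for a != 0); slow but easy to verify. */
static uint8_t gf_inv(uint8_t a)
{
	uint8_t r = 1;

	for (int i = 0; i < 254; i++)
		r = gf_mul(r, a);
	return r;
}

/* RAID-6 style parity for one byte column: P = sum D_i, Q = sum 2^i * D_i. */
static void raid6_pq(const uint8_t *data, size_t k, uint8_t *p, uint8_t *q)
{
	uint8_t pp = 0, qq = 0;

	/* Horner's rule, starting from the highest-numbered data block. */
	for (size_t i = k; i-- > 0; ) {
		pp ^= data[i];
		qq = gf_mul(qq, 2) ^ data[i];
	}
	*p = pp;
	*q = qq;
}

/* Assumed Cauchy layout: parity row r, data column j, with k data blocks. */
static uint8_t cauchy_coeff(unsigned k, unsigned r, unsigned j)
{
	return gf_inv((uint8_t)((k + r) ^ j));
}
```

Under these assumptions the first Cauchy parity row is not all ones, so it cannot stand in for the XOR parity P. That mismatch is the gap a combined Vandermonde and Cauchy construction would have to close.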