Describe the bug
The Set Difference and Set Intersection operations preserve duplicate items from the first sample, violating mathematical set semantics and producing unexpected results.
src/core/operations/SetDifference.mjs, runSetDifference()
src/core/operations/SetIntersection.mjs, runIntersect()
Both operations perform a simple .filter() on array a:
// SetDifference
runSetDifference(a, b) {
    return a
        .filter((item) => {
            return b.indexOf(item) === -1;
        })
        .join(this.itemDelimiter);
}

// SetIntersection
runIntersect(a, b) {
    return a
        .filter((item) => {
            return b.indexOf(item) > -1;
        })
        .join(this.itemDelimiter);
}
This preserves all occurrences of items that pass the filter. If a = ["x", "x", "y"] and b = ["y"], Set Difference returns "x,x". If a = ["y", "y", "z"] and b = ["y"], Set Intersection returns "y,y". Set operations should return each element at most once.
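The two examples above can be checked outside the application with a minimal standalone sketch. It assumes the filters behave exactly as quoted; the only change is that the delimiter is passed as a parameter (an assumption made for illustration) instead of being read from this.itemDelimiter.

```javascript
// Standalone copies of the two filters quoted above. The itemDelimiter
// parameter is a stand-in for this.itemDelimiter on the operation class.
function runSetDifference(a, b, itemDelimiter = ",") {
    return a.filter((item) => b.indexOf(item) === -1).join(itemDelimiter);
}

function runIntersect(a, b, itemDelimiter = ",") {
    return a.filter((item) => b.indexOf(item) > -1).join(itemDelimiter);
}

// Duplicates in the first array survive the filter.
console.log(runSetDifference(["x", "x", "y"], ["y"])); // x,x
console.log(runIntersect(["y", "y", "z"], ["y"]));     // y,y
```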
To Reproduce
Configure both operations with sample delimiter \n\n and item delimiter ,:
- Set Difference: input red,red,blue\n\nblue; expected red; actual red,red
- Set Intersection: input red,red,blue\n\nred,blue; expected red,blue; actual red,red,blue
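The reproduction steps can also be sketched end to end from the raw input strings. parseSamples, difference, and intersection are hypothetical names introduced here for illustration, not functions from the codebase; the sketch assumes the input is split on the sample delimiter first and on the item delimiter second.

```javascript
// Hypothetical helper: split raw input into two samples, then into items.
function parseSamples(input, sampleDelimiter = "\n\n", itemDelimiter = ",") {
    const samples = input.split(sampleDelimiter);
    if (samples.length !== 2) {
        throw new Error("Expected exactly two samples");
    }
    return samples.map((sample) => sample.split(itemDelimiter));
}

// Current behaviour: filter the first sample by membership in the second.
const difference = (a, b) => a.filter((item) => !b.includes(item)).join(",");
const intersection = (a, b) => a.filter((item) => b.includes(item)).join(",");

const [a1, b1] = parseSamples("red,red,blue\n\nblue");
console.log(difference(a1, b1)); // red,red

const [a2, b2] = parseSamples("red,red,blue\n\nred,blue");
console.log(intersection(a2, b2)); // red,red,blue
```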
Expected behaviour
This is as much a design question as a bug. Users familiar with set theory expect A ∩ B and A - B to produce proper sets, but the current behavior treats these as "filter array A by membership in array B", preserving order and multiplicity. Set Union in the same module already deduplicates its output, which suggests the original intent was mathematical set semantics.
Additional context
Suggested fix:
// SetDifference
runSetDifference(a, b) {
    const excluded = new Set(b);
    const seen = new Set();
    return a
        .filter((item) => {
            if (excluded.has(item) || seen.has(item)) {
                return false;
            }
            seen.add(item);
            return true;
        })
        .join(this.itemDelimiter);
}

// SetIntersection
runIntersect(a, b) {
    const included = new Set(b);
    const seen = new Set();
    return a
        .filter((item) => {
            if (!included.has(item) || seen.has(item)) {
                return false;
            }
            seen.add(item);
            return true;
        })
        .join(this.itemDelimiter);
}
This change would alter behavior for users relying on duplicate-preserving output. If backward compatibility matters, consider adding a "Deduplicate results" boolean argument (default: true), or renaming these to "List Difference" / "List Intersection" and creating separate true set operations.
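One way the "Deduplicate results" option might look is sketched below. This is only an illustration under stated assumptions: the dedupe parameter and the function names setDifference and setIntersection are hypothetical, and wiring the flag into the operation's argument list is not shown.

```javascript
// Hypothetical variants taking a dedupe flag (default: true).
// Both return arrays; the caller joins with the item delimiter.
function setDifference(a, b, dedupe = true) {
    const excluded = new Set(b);
    const kept = a.filter((item) => !excluded.has(item));
    // A Set iterates in insertion order, so first occurrences keep their order.
    return dedupe ? [...new Set(kept)] : kept;
}

function setIntersection(a, b, dedupe = true) {
    const included = new Set(b);
    const kept = a.filter((item) => included.has(item));
    return dedupe ? [...new Set(kept)] : kept;
}

const a = ["red", "red", "blue"];
console.log(setDifference(a, ["blue"]).join(","));               // red
console.log(setDifference(a, ["blue"], false).join(","));        // red,red
console.log(setIntersection(a, ["red", "blue"]).join(","));      // red,blue
console.log(setIntersection(a, ["red", "blue"], false).join(",")); // red,red,blue
```

With the flag set to false the output matches the current duplicate-preserving behaviour, so existing recipes could opt out without a rename.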