[SPARK-1552] Fix type comparison bug in {map,outerJoin}Vertices #967
ankurdave wants to merge 6 commits into apache:master
In GraphImpl, mapVertices and outerJoinVertices use a more efficient implementation when the map function conserves vertex attribute types. This is implemented by comparing the ClassTags of the old and new vertex attribute types. However, ClassTags store erased types, so the comparison will return a false positive for types with different type parameters, such as Option[Int] and Option[Double].
This PR resolves the problem by unconditionally using the general implementation in mapVertices and outerJoinVertices, and introducing "Conserve" variants of these methods that enforce type equality and use the more efficient implementation.
It also adds a test called "mapVertices changing type with same erased type" that failed before the PR and succeeds now.
The "Conserve" naming comes from Scala's `List#mapConserve` method.
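The false positive described above can be reproduced outside GraphX. The following is a minimal sketch, not the GraphImpl code: `ClassTagErasureDemo` and `sameErasedType` are hypothetical names, and the helper only imitates the idea of deciding type equality by comparing ClassTags.

```scala
import scala.reflect.{ClassTag, classTag}

object ClassTagErasureDemo {
  // Hypothetical helper: treats two types as equal when their ClassTags match,
  // which is the kind of check the PR description says is unsafe.
  def sameErasedType[A: ClassTag, B: ClassTag]: Boolean =
    classTag[A] == classTag[B]

  def main(args: Array[String]): Unit = {
    // Different type arguments, but both erase to scala.Option: prints true.
    println(sameErasedType[Option[Int], Option[Double]])
    // Types with different runtime classes are still told apart: prints false.
    println(sameErasedType[Int, Double])
  }
}
```

The first call is the false positive: a ClassTag only records the runtime class, so the type argument of Option never takes part in the comparison.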
@rxin This is a binary-compatible bugfix, so it should also be backported into branch-1.0.
Actually, never mind about the backporting -- apparently introducing new methods breaks compatibility, and this bug is pretty hard to trigger, so I think it's fine to just put the fix into 1.1.0.
@ankurdave In general, new methods should be fine wrt compatibility.
Merged build finished. All automated tests passed.
What if we just use mapVertices, but compare the two classtags and call the right mapVerticesConserve?
Are you saying to detect whether to call mapVerticesConserve by comparing ClassTags? That's the point of the change -- comparing ClassTags can give a false positive when the type is erased (e.g., classTag[Option[Int]] == classTag[Option[String]]), so it's unsafe to rely on it. See https://issues.apache.org/jira/browse/SPARK-1552.
Would TypeTags solve this problem?
…JoinVertices" This reverts commit 16d6af8.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15456/
Merged build finished. All automated tests passed.
Thanks. Merging this in master. |
In GraphImpl, mapVertices and outerJoinVertices use a more efficient implementation when the map function conserves vertex attribute types. This is implemented by comparing the ClassTags of the old and new vertex attribute types. However, ClassTags store erased types, so the comparison will return a false positive for types with different type parameters, such as Option[Int] and Option[Double].

This PR resolves the problem by requesting that the compiler generate evidence of equality between the old and new vertex attribute types, and providing a default value for the evidence parameter if the two types are not equal. The methods can then check the value of the evidence parameter to see whether the types are equal.

It also adds a test called "mapVertices changing type with same erased type" that failed before the PR and succeeds now.

Callers of mapVertices and outerJoinVertices can no longer use a wildcard for a graph's VD type. To avoid "Error occurred in an application involving default arguments," they must bind VD to a type parameter, as this PR does for ShortestPaths and LabelPropagation.

Author: Ankur Dave <ankurdave@gmail.com>

Closes apache#967 from ankurdave/SPARK-1552 and squashes the following commits:

68a4fff [Ankur Dave] Undo conserve naming
7388705 [Ankur Dave] Remove unnecessary ClassTag for VD parameters
a704e5f [Ankur Dave] Use type equality constraint with default argument
29a5ab7 [Ankur Dave] Add failing test
f458c83 [Ankur Dave] Revert "[SPARK-1552] Fix type comparison bug in mapVertices and outerJoinVertices"
16d6af8 [Ankur Dave] [SPARK-1552] Fix type comparison bug in mapVertices and outerJoinVertices
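The "type equality constraint with default argument" technique named in the commit list can be sketched in isolation. This is an illustration under stated assumptions, not the GraphX source: `Attrs`, `mapAttrs`, `relabel`, and the returned path names are all hypothetical, and the string result merely stands in for choosing between the efficient and the general implementation.

```scala
object TypeEqualityEvidenceDemo {
  // Hypothetical stand-in for a graph's vertex attributes of type VD.
  final case class Attrs[VD](values: Vector[VD]) {
    // The compiler fills in `ev` only when it can prove VD =:= VD2;
    // otherwise the default value null is used.
    def mapAttrs[VD2](f: VD => VD2)(implicit ev: VD =:= VD2 = null): (Attrs[VD2], String) = {
      val path = if (ev != null) "type-conserving" else "general"
      (Attrs(values.map(f)), path)
    }
  }

  // A generic caller binds VD to a type parameter, mirroring what the PR
  // describes for ShortestPaths and LabelPropagation. VD =:= Long cannot be
  // proven for an abstract VD, so the general path is chosen.
  def relabel[VD](attrs: Attrs[VD]): (Attrs[Long], String) =
    attrs.mapAttrs[Long](_ => 0L)

  def main(args: Array[String]): Unit = {
    val attrs = Attrs(Vector(Option(1), None)) // Attrs[Option[Int]]
    val (_, samePath) = attrs.mapAttrs[Option[Int]](_.map(_ + 1))
    val (_, changedPath) = attrs.mapAttrs[Option[Double]](_.map(_.toDouble))
    println(samePath)    // type-conserving
    println(changedPath) // general
    println(relabel(attrs)._2) // general
  }
}
```

Unlike a ClassTag comparison, the evidence is resolved at compile time on the full types, so Option[Int] and Option[Double] are distinguished even though they share an erased class.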