
Different results and accuracy down to 10% with PandasParallelLFApplier vs PandasLFApplier in Snorkel 0.9.5 #1587

@durgeshiitj

Issue description

I ran Snorkel (v0.9.5) on a dataset using PandasParallelLFApplier and, to my surprise, got 10% accuracy where I was expecting 90%. I then tried PandasLFApplier to cross-verify and got 90% accuracy. When I compared the label matrices, they were not equal.

I was previously using 0.9.3 and never faced this problem. To cross-verify, I ran the same dataset on a different system with version 0.9.3 using both PandasParallelLFApplier and PandasLFApplier, and found that in 0.9.3 both yield the same label matrix, the same accuracy, and the same LFAnalysis.
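For reference, the label-matrix comparison shown in the screenshots below boils down to an elementwise equality check. Here is a minimal sketch of that check; the matrices are placeholders standing in for the real outputs of `PandasLFApplier.apply(df)` and `PandasParallelLFApplier.apply(df)`, since the dataset and labeling functions aren't included in this report:

```python
import numpy as np

# Placeholder label matrices standing in for the two applier outputs.
# Values follow Snorkel's convention: -1 = abstain, 0/1 = class labels.
L_serial = np.array([[1, -1], [0, 1], [-1, 0]])
L_parallel = np.array([[1, -1], [1, 0], [-1, 0]])  # differs in row 1

# The matrices should match exactly; report which rows disagree.
if not np.array_equal(L_serial, L_parallel):
    bad_rows = np.where((L_serial != L_parallel).any(axis=1))[0]
    print(f"Label matrices differ in rows: {bad_rows.tolist()}")
    # prints: Label matrices differ in rows: [1]
```

On my real data, this kind of check is how I confirmed the two appliers produce different matrices under 0.9.5 but identical ones under 0.9.3.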

Expected behavior

Both LFAppliers should yield the same label matrix and the same accuracy.

Screenshots

I'm attaching screenshots for reference.

V 0.9.5 analysis:

PandasLFApplier: [screenshot: nonp095]

PandasParallelLFApplier: [screenshot: paralle095]

Label-matrix comparison: [screenshot: npequals095]

V 0.9.3 analysis:

PandasLFApplier: [screenshot: pandasLfApp]

PandasParallelLFApplier: [screenshot: parallel]

Label-matrix comparison: [screenshot: noeqals093]

System info

  • How you installed Snorkel (conda, pip, source): pip
  • OS: Windows/Linux
  • Python version: 3.7
  • Snorkel version: 0.9.3 (Windows) / 0.9.5 (Linux)

Additional context

Please look into this as soon as possible.
