Runless events - consume job event#2661
Conversation
5158fc2 to
bdcfec7
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #2661 +/- ##
============================================
+ Coverage 83.76% 84.05% +0.28%
- Complexity 1338 1379 +41
============================================
Files 247 248 +1
Lines 6112 6297 +185
Branches 281 286 +5
============================================
+ Hits 5120 5293 +173
- Misses 843 851 +8
- Partials 149 153 +4 ☔ View full report in Codecov by Sentry. |
32deb1a to
151ed6b
Compare
| ) e | ||
| GROUP BY e.run_uuid | ||
| ) f ON f.run_uuid=jv.latest_run_uuid | ||
| LEFT OUTER JOIN job_versions_facets f ON j.current_version_uuid = f.job_version_uuid |
| * @param jobRow The job. | ||
| * @return A {@link BagOfJobVersionInfo} object. | ||
| */ | ||
| default BagOfJobVersionInfo upsertRunlessJobVersion( |
There was a problem hiding this comment.
Minor: We can consider factoring out common code in JobVersionDao.upsertJobVersionOnRunTransition() and upsertRunlessJobVersion() to avoid duplication. But, not a major concern as we are aware we'll need to revisit some of this code later.
There was a problem hiding this comment.
Yes, we can do later further refactor with this section.
| CREATE INDEX job_facets_job_version_uuid ON job_facets (job_version_uuid); | ||
|
|
||
| ALTER TABLE lineage_events ADD COLUMN spec_event_type VARCHAR(64); | ||
| UPDATE lineage_events SET spec_event_type = 'RunEvent'; No newline at end of file |
There was a problem hiding this comment.
do we want to set all events to RunEvent? Also, I'd define the _event_type enum as RUN_EVENT, DATASET_EVENT , or JOB_EVENT.
There was a problem hiding this comment.
I switched to enum and set a default value instead of running update table.
84f4670 to
de98803
Compare
✅ Deploy Preview for peppy-sprite-186812 canceled.
|
| @@ -0,0 +1,4 @@ | |||
| CREATE TYPE EVENT_TYPE AS ENUM ('RUN_EVENT', 'DATASET_EVENT', 'JOB_EVENT'); | |||
There was a problem hiding this comment.
Minor: We standardized on not defining enums types in the DB layer but rather enforce them in the application layer. This way, it avoids a DB migration every time we (might) add a new event type.
tl;dr, I'd just define _event_type as a string
|
|
||
| @SqlUpdate( | ||
| "INSERT INTO lineage_events (" | ||
| + "event_type, " |
There was a problem hiding this comment.
Minor: We can consider defining a _run_state column and eventually dropping the event_type. That is, we can consider columns prefixed with _ to be "remappings" of OL properties to Marquez.
wslulciuc
left a comment
There was a problem hiding this comment.
Minor comments / suggests, otherwise 💯 💯 🥇
Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
b7e40f8 to
0a3f98a
Compare
* Runless events - consume job event Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com> * Runless events - fix listLineage API Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com> * fix rebase Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com> * modify event type column Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com> * make _event_type varchar(64) instead of enum Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com> --------- Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Problem
As a followup of #2641, this PR introduces support for
JobEvent. PR shall be merged after #2641.Solution
job_version_uuidcolumn is added tojob_facetstable. This requires db migration to backfill existingjob_facetsentries.JobDaoasfindJobByNamemethod should work without join torunstable.spec_event_typetolineage_eventstable to indicate which type of event is stored.Limitations:
listLineageendpoints filtersRunEventonly and does not support introduced event types. This can be implemented within other PR and issue.One-line summary:
Checklist
CHANGELOG.md(Depending on the change, this may not be necessary)..sqldatabase schema migration according to Flyway's naming convention (if relevant)