Problem
A job context is a structure that holds the code location and SQL for a job so that the Marquez UI can show them. The job context upsert uses only the checksum of the context body as its conflict key. As a result, when the context differs between the start and the end of a job, for example, the job_contexts table ends up with two separate entries for that job. That in itself might be fine, but the API exposes only the first captured context: if you don't send SqlJobFacet in the START event, you won't see it even if you send it in the COMPLETE event.
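To make the behavior concrete, here is a minimal in-memory sketch of it. This is not Marquez code: the dictionaries stand in for the tables, the function names are hypothetical, and SHA-256 over a JSON dump is only an assumed checksum.

```python
import hashlib
import json

job_contexts = {}  # checksum -> context body (stand-in for the job_contexts table)
runs = {}          # run id -> checksum of the linked context (stand-in for job_context_uuid)


def checksum(context):
    # Assumed checksum: hash of the serialized context body.
    body = json.dumps(context, sort_keys=True).encode()
    return hashlib.sha256(body).hexdigest()


def upsert_job_context(context):
    # Conflict key is the checksum only: same body reuses the row, new body adds a row.
    key = checksum(context)
    job_contexts.setdefault(key, context)
    return key


def upsert_run(run_id, context):
    # The link to the context is set on first insert and never updated afterwards.
    key = upsert_job_context(context)
    runs.setdefault(run_id, key)
    return runs[run_id]


def api_context(run_id):
    # What the API would expose for the run in this model.
    return job_contexts[runs[run_id]]


upsert_run("run-1", {})                   # START event without SqlJobFacet
upsert_run("run-1", {"sql": "SELECT 1"})  # COMPLETE event with SqlJobFacet
```

In this model the table holds two context rows after both events, while api_context("run-1") still returns the empty context from the START event.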
Solutions
I see a few ways to solve this problem:
- Update job_context_uuid when upserting into the runs table. This would expose only the most recent context, which might be acceptable but probably is not.
- Add some custom logic to merge arrays when the contexts relate to the same run (or job?).
- Merge contexts in the API. This would change the run <--> job_context relation to one-to-many.
- Change the structure of the job_contexts table: replace the context column with the following three columns: code_location_type, code_location_url, and sql, which would be filled on upsert. Some concatenation would probably still be needed.
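As a rough illustration of the merging options, the sketch below combines all contexts captured for one run into a single view. The function name and the merge policy (later non-empty values win) are assumptions made for illustration; concatenating values instead would be another possible policy. The key names follow the columns proposed in the last option, and the URL is a placeholder.

```python
def merge_contexts(contexts):
    """Merge context dicts for one run, ordered oldest first.

    Assumed policy: a later non-empty value overrides an earlier one,
    and empty values never erase what was captured before.
    """
    merged = {}
    for context in contexts:
        for key, value in context.items():
            if value not in (None, ""):
                merged[key] = value
    return merged


# Context sent with START (code location only) and with COMPLETE (SQL only).
start_context = {
    "code_location_type": "git",
    "code_location_url": "https://example.com/repo",  # placeholder URL
    "sql": "",
}
complete_context = {"sql": "SELECT 1"}

merged = merge_contexts([start_context, complete_context])
```

Here merged keeps the code location from the START event and picks up the SQL from the COMPLETE event, so neither event has to carry the full context.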