CHANGELOG.md
1 addition & 0 deletions
@@ -6,6 +6,7 @@
 * Implemented dataset symlink feature which allows providing multiple names for a dataset and adds edges to lineage graph based on symlinks [`#2066`](https://github.com/MarquezProject/marquez/pull/2066) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski)
 * Store column lineage facets in separate table [`#2096`](https://github.com/MarquezProject/marquez/pull/2096) [@mzareba382](https://github.com/mzareba382) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski)
 * Lineage graph endpoint for column lineage [`#2124`](https://github.com/MarquezProject/marquez/pull/2124) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski)
+* Enrich returned dataset resource with column lineage information [`#2113`](https://github.com/MarquezProject/marquez/pull/2113) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski)
 
 ### Fixed
 * Add support for `parentRun` facet as reported by older Airflow OpenLineage versions [@collado-mike](https://github.com/collado-mike)
+        SELECT DISTINCT ON (cl.output_dataset_field_uuid, cl.input_dataset_field_uuid) cl.*
+        FROM column_lineage cl
+        JOIN dataset_fields df ON df.uuid = cl.output_dataset_field_uuid
+        JOIN datasets_view dv ON dv.uuid = df.dataset_uuid
+        WHERE ARRAY[<values>]::DATASET_NAME[] && dv.dataset_symlinks -- the array of string pairs is cast to DATASET_NAME[] and checked for a non-empty intersection with the dataset symlinks
+        ORDER BY output_dataset_field_uuid, input_dataset_field_uuid, updated_at DESC, updated_at
+      ),
+      dataset_fields_view AS (
+        SELECT d.namespace_name as namespace_name, d.name as dataset_name, df.name as field_name, df.type, df.uuid
+        FROM dataset_fields df
+        INNER JOIN datasets_view d ON d.uuid = df.dataset_uuid
+      )
+      SELECT
+        output_fields.namespace_name,
+        output_fields.dataset_name,
+        output_fields.field_name,
+        output_fields.type,
+        ARRAY_AGG(ARRAY[input_fields.namespace_name, input_fields.dataset_name, input_fields.field_name]) AS inputFields,
+        c.transformation_description,
+        c.transformation_type,
+        c.created_at,
+        c.updated_at
+      FROM selected_column_lineage c
+      INNER JOIN dataset_fields_view output_fields ON c.output_dataset_field_uuid = output_fields.uuid
+      LEFT JOIN dataset_fields_view input_fields ON c.input_dataset_field_uuid = input_fields.uuid
+      GROUP BY
+        output_fields.namespace_name,
+        output_fields.dataset_name,
+        output_fields.field_name,
+        output_fields.type,
+        c.transformation_description,
+        c.transformation_type,
+        c.created_at,
+        c.updated_at
+      """)
+  /**
+   * Each dataset is identified by a pair of strings (namespace and name). A query returns column
+   * lineage for multiple datasets, so a list of pairs is expected as an argument. "left"
+   * and "right" properties correspond to Java Pair class properties defined to bind query template
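The Javadoc and the SQL comment in the hunk above describe datasets passed as (namespace, name) pairs and matched against `dataset_symlinks` with the array overlap operator. The following is a minimal sketch of that idea in plain Java, not the DAO's actual binding code. `DatasetPair`, `renderValues`, `overlaps`, and the `:leftN`/`:rightN` placeholder names are hypothetical and chosen only for illustration. How the real query template expands `<values>` is an assumption here.

```java
import java.util.List;
import java.util.Set;
import java.util.StringJoiner;

class SymlinkOverlapSketch {
  // Hypothetical stand-in for the Java Pair class mentioned in the Javadoc:
  // "left" holds the namespace, "right" holds the dataset name.
  record DatasetPair(String left, String right) {}

  // Renders one "(:leftN, :rightN)" tuple per pair. This is an assumed,
  // illustrative expansion of the <values> template variable.
  static String renderValues(List<DatasetPair> pairs) {
    StringJoiner joiner = new StringJoiner(", ");
    for (int i = 0; i < pairs.size(); i++) {
      joiner.add("(:left" + i + ", :right" + i + ")");
    }
    return joiner.toString();
  }

  // Mirrors the intent of "ARRAY[...]::DATASET_NAME[] && dv.dataset_symlinks":
  // true when at least one requested pair is among the dataset's symlinks.
  static boolean overlaps(List<DatasetPair> requested, Set<DatasetPair> symlinks) {
    for (DatasetPair pair : requested) {
      if (symlinks.contains(pair)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<DatasetPair> requested = List.of(
        new DatasetPair("food_delivery", "public.orders"),
        new DatasetPair("food_delivery", "public.drivers"));
    Set<DatasetPair> symlinks = Set.of(
        new DatasetPair("food_delivery", "public.orders"),
        new DatasetPair("legacy", "orders"));

    System.out.println(renderValues(requested)); // prints "(:left0, :right0), (:left1, :right1)"
    System.out.println(overlaps(requested, symlinks)); // prints "true"
  }
}
```

A single shared pair is enough for a match, so a dataset can be found under any of its symlinked names.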