Thursday, December 2nd, 2021 2:18 PM

REVIEWED-INTERNAL ONLY: Can Collibra DQ allow me to JOIN my CSV files?

Q: Can Collibra DQ allow me to JOIN my CSV files?

A: Kirk confirmed that Collibra DQ does NOT support JOINs on files, and almost no one does.
For files stored in Amazon S3 or HDFS, you have two options. You can put Apache Hive, Presto, or Athena on top of the files and JOIN at the database layer, since those engines support it. Alternatively, you can use Collibra DQ's notebook API: JOIN the files yourself in Apache Spark, then pass the resulting Spark DataFrame to Collibra DQ.

There is no button to JOIN files in Collibra DQ.
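
For the Spark route, the JOIN itself might look roughly like the sketch below. This is only an illustration, not Collibra DQ code: the helper name `join_csv_files`, the file paths, and the `customer_id` key column are all hypothetical, and it assumes both CSV files have a header row and share the key column. The step that hands the DataFrame to Collibra DQ is deliberately left out, since it depends on the notebook API of your Collibra DQ version.

```python
def read_csv(spark, path):
    """Read one headered CSV file into a Spark DataFrame."""
    # header / inferSchema are standard Spark CSV reader options.
    return (
        spark.read
        .option("header", True)
        .option("inferSchema", True)
        .csv(path)
    )


def join_csv_files(spark, left_path, right_path, key, how="inner"):
    """Join two CSV files on a shared key column (hypothetical helper)."""
    left = read_csv(spark, left_path)
    right = read_csv(spark, right_path)
    # Joining on the column name keeps a single copy of the key column.
    return left.join(right, on=key, how=how)


def example():
    """Illustrative usage; needs pyspark installed and reachable files."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("csv-join-sketch").getOrCreate()
    joined = join_csv_files(
        spark,
        "s3a://example-bucket/orders.csv",     # hypothetical path
        "s3a://example-bucket/customers.csv",  # hypothetical path
        key="customer_id",                     # hypothetical column
    )
    joined.printSchema()
    # Pass `joined` to Collibra DQ via its notebook API here (call omitted).
    return joined
```

Calling `example()` in a notebook that already has Spark available should yield the joined DataFrame. Pass `how="left"` to `join_csv_files` if rows without a match in the second file need to be kept.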

From the Review Team: "4/11/22 - With the information provided, we do not have enough information to make this customer-facing. We would advise this remain internal only."
