
Thursday, May 26th, 2022 4:31 PM

Collibra to redshift connector

Hi, Data citizens,

Today I had an interesting discussion with my colleague. Currently, alongside other connectors, we use the Redshift connector to ingest metadata from Redshift into Collibra. We were wondering whether it is possible to do the reverse and have a connection from Collibra to Redshift. I searched the Collibra documentation and haven’t found any information about a use case like this.
Does anybody have experience with this, or has anyone tried to build such a connector?

Thank you


2 years ago

@spring-team.collibra.com Is this something you could chime in on or no?


2 years ago

@jakub.janiga

  1. The official solution is Collibra Insights: it’s a star schema optimized for analysis, but users have complained that it is incomplete, and I believe a redesign is underway.
  2. You can create a connector based on the outputmodule, which would let you extract deltas from Collibra and pipe them into Redshift on a regular (e.g. daily) basis.
    • Quite comprehensive, but the documentation is not complete and queries can be difficult to write: you can query all attributes of the underlying Java objects, but not all of them are documented (e.g. users.lastLogin).
    • Make sure to implement pagination to avoid platform instability.
    • Consistency is not guaranteed (every page is a new transaction, and state can change between transactions), so you need to account for that in your pipeline. For example, if you fetch attributes and then assets, you might fetch attributes belonging to assets that were deleted between the calls, leading to an inconsistent state.
  3. You can download the daily repo backup, restore it into a PostgreSQL server, and query the data directly from the replicated Collibra PostgreSQL database.
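
For option 2, here is a rough sketch of the pagination and consistency handling I have in mind. It is not Collibra API code: `fetch_assets` and `fetch_attributes` are hypothetical callables that you would implement on top of whatever query you send, and the `id` / `assetId` field names are assumptions for illustration only. The idea is to page through both result sets, then drop any attribute whose asset is no longer present before loading into Redshift.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Row = Dict[str, str]
# Hypothetical page fetcher: (offset, limit) -> list of rows for that page.
FetchPage = Callable[[int, int], List[Row]]


def paginate(fetch_page: FetchPage, page_size: int = 1000) -> Iterator[Row]:
    """Yield rows page by page so no single request pulls the full result set."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return
        yield from page
        if len(page) < page_size:
            # A short page means we have reached the end.
            return
        offset += page_size


def extract_consistent(
    fetch_assets: FetchPage,
    fetch_attributes: FetchPage,
    page_size: int = 1000,
) -> Tuple[List[Row], List[Row]]:
    """Fetch attributes, then assets, and discard orphaned attributes.

    Each page is its own transaction, so an asset may be deleted after its
    attributes were fetched. Filtering on the asset ids that were actually
    returned keeps the loaded data referentially consistent.
    """
    attributes = list(paginate(fetch_attributes, page_size))
    assets = list(paginate(fetch_assets, page_size))
    asset_ids = {asset["id"] for asset in assets}  # "id" is an assumed field name
    kept = [a for a in attributes if a["assetId"] in asset_ids]  # assumed field name
    return assets, kept
```

Note that offset-based paging can still skip or repeat rows if data changes mid-run, so you would probably also want to deduplicate on id and rely on the next daily run to pick up anything missed.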


2 years ago

Hi @jakub.janiga,

The Redshift connector is fully supported by Collibra. Please feel free to create a support ticket if you’re still blocked.

Best regards,
Paulo
