24 Messages

 • 

4.8K Points

Thursday, May 23rd, 2024 12:37 AM

Lineage Harvester vs. Edge

Hi. I'm new to Collibra. Currently, we're using both Lineage Harvester and Edge for ingesting Technical lineage from various sources.

Can anyone give me an idea of the advantages and disadvantages of using Lineage Harvester vs. Edge?

I would also like to know, based on your experience, which one is better for sources such as Power BI, Tableau, SSRS, SSIS, and Databricks Unity Catalog.

I found this reference in another thread: Supported data sources for technical lineage (collibra.com). For some data sources, like Tableau and Power BI, both Lineage Harvester and Edge can be used. What are the best practices and considerations in choosing between them?

Thanks!

143 Messages

 • 

9.8K Points

7 months ago

Lineage Harvester won't be around forever, so I suggest switching. The two are apparently at functional parity. We are moving our Tableau ingestion across as soon as Edge gets packaged internally for deployment into Prod. We have, though, tested in Dev that the migration works.

Runbook, in Dev:

  • ingest using LiHa from our Tableau Sites
  • switch to Edge, for the same Sites
  • confirm nothing changed by exporting all characteristics from the Global View and reconciling assets and asset characteristics line by line
  • Test passed ;-) 
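For anyone wanting to repeat the reconciliation step, here is a rough sketch of how the line-by-line comparison could be scripted. It is not the script we used, just an illustration: it assumes you have exported the same view to CSV once after the LiHa ingest and once after the Edge ingest, and the column names ("Full Name", "Asset Type", "Description") and sample rows are hypothetical placeholders you would swap for the headers in your own export.

```python
import csv
import io

# Hypothetical key columns; change these to match your export's headers.
KEY_COLUMNS = ("Full Name", "Asset Type")


def load_export(handle, key_columns=KEY_COLUMNS):
    """Index the rows of a CSV export by the values of its key columns."""
    rows = {}
    for row in csv.DictReader(handle):
        key = tuple(row[col] for col in key_columns)
        rows[key] = row
    return rows


def reconcile(before, after):
    """Return (missing, added, changed) between two indexed exports."""
    missing = sorted(k for k in before if k not in after)
    added = sorted(k for k in after if k not in before)
    changed = {}
    for key in sorted(before.keys() & after.keys()):
        old, new = before[key], after[key]
        # Collect every column whose value differs between the two exports.
        diffs = {
            col: (old.get(col), new.get(col))
            for col in sorted(old.keys() | new.keys())
            if old.get(col) != new.get(col)
        }
        if diffs:
            changed[key] = diffs
    return missing, added, changed


# Tiny in-memory stand-ins for the two exports (made-up sample data).
liha_csv = io.StringIO(
    "Full Name,Asset Type,Description\n"
    "Sales>Orders,Tableau Worksheet,Orders view\n"
    "Sales>Returns,Tableau Worksheet,Returns view\n"
)
edge_csv = io.StringIO(
    "Full Name,Asset Type,Description\n"
    "Sales>Orders,Tableau Worksheet,Orders view\n"
    "Sales>Returns,Tableau Worksheet,Returns by region\n"
    "Sales>Margin,Tableau Worksheet,Margin view\n"
)

missing, added, changed = reconcile(load_export(liha_csv), load_export(edge_csv))
print("missing after Edge:", missing)
print("added after Edge:", added)
print("changed:", changed)
```

With real exports you would replace the in-memory samples with files opened via open(path, newline=""). The test passes when all three results come back empty.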

Remember the backstory: Collibra purchased SQLdep in 2019, which had its own ingestion method, Lineage Harvester. It's taken a while, but they have now moved that capability across to Edge, so LiHa can fall away. Goodbye.

Collibra Announces Acquisition of SQLdep | Collibra

Edge gives you substantially more functionality too:

  1. Metadata
  2. Profiling
  3. AI-driven data classification
  4. Sampling
  5. Technical lineage

... whereas Lineage Harvester only gives you item 5 (and item 1, although Edge offers more depth there too).

(edited)

88 Messages

 • 

2.8K Points

7 months ago

Hi,

Collibra University has a free course, which may help you understand more about using Edge for Technical Lineage: Why you should use Lineage on Edge today.

My thoughts

  • Lineage Harvester makes it harder to collaborate with colleagues because you are using files on a local device. Edge can be configured within Collibra's UI where others can freely access the same Edge Capabilities and Connections.
  • Lineage Harvester is more technical than Lineage on Edge because you have to work from a terminal/command line and edit JSON configuration files.
  • Edge has settings you can configure within the Collibra platform, versus having to remember commands that you look up in Collibra's product documentation.
  • Edge is probably easier to synchronize because you can set a schedule through the UI on Collibra versus Lineage Harvester where you would need to do it programmatically.

(edited)

143 Messages

 • 

9.8K Points

Hi @SeanPyle, on this point: you can do this with Edge too, as I understand it.

  • Lineage Harvester can have more flexibility for situations where you need custom technical lineage from unsupported data sources.

see Prepare an SQL directory (collibra.com)

But I'm keen to hear where LiHa is ahead specifically.

I haven't done it, but I was hoping to use this pathway to ingest stored procedures that are otherwise not ingested, e.g. Redshift, where the scope is SQL-based input without stored procedures.

88 Messages

 • 

2.8K Points

You're right, Grant – my mistake. It requires some configuration with JSON files, but it is possible.

Here's the reference.
