Wednesday, November 8th, 2023 11:45 PM

Creating Relationships When Loading Assets With Spreadsheets

For the past 3 years I’ve been loading data dictionary metadata into Collibra using Excel spreadsheets. Until now I’ve been able to create a relationship by providing just the Full Name and Domain of the target asset. Recently I received the error message “Important Conflict: The domain type for the creation of domain ‘xxxxx’ is missing.” I was trying to create a new Schema and relate it to an existing Component with the “is part of” relation. The Domain has existed for years and I was able to create this link in the past. Can someone explain what has gone wrong?

91 Messages • 905 Points

11 months ago

Hi,
When was the last time you imported something? Starting with version 2023.08, Collibra forces the import wizard v2 for everyone. Maybe that broke something?

6 Messages

Stefan,

Thanks for the reply. Your link to the “improved” import wizard may help. I’m writing a longer reply to both you and Richard (his nickname is forbidden here) Bakker.

Rich

16 Messages

11 months ago

Hi Richard,

The v2 import wizard did not make everything better, so it could indeed be that some input is missing that was not required in the previous version of the import wizard. If at all possible, please share a fake example of your file; that helps with finding the root cause of this one.

Regards,
D.B.

6 Messages

D1ck & Stefan,

Thanks for the reply. I haven’t done a major DD import (Schemas, Tables and Columns) for about 6 months and I don’t know which release we are currently using. I plan to do some test loads on Monday and I’ll post the results here when I’m done.

Here are some details of what I’ve been doing that may better identify my problem:

As I said in my first post, I’ve been able to create a relation as part of an import simply by providing the Full Name and Domain of the “to” asset.

Two days ago I updated a single Component asset successfully. It had an “is part of [System]” relationship to an existing System asset and it worked. That spreadsheet had the following columns:

Asset Type
Name
Full Name
Community
Domain
Description
Acronym
is part of [System] > Full Name
is part of [System] > Domain

All had valid data. The parent system was unchanged from the previous version. It is my habit to include child to parent relations when updating an existing asset even if the relation has not changed. In the past this was not a problem and it wasn’t a problem with this Component asset.

Then I tried to update a Schema asset that is a child of the Component I just updated. It had the following columns:

Asset Type
Name
Full Name
Community
Domain
Description
DBMS Product (a custom attribute)
DBMS Version (a custom attribute)
is part of [Component] > Full Name
is part of [Component] > Domain

All had valid data. The Component Full Name matched what I had just imported, as did the Component Domain. However, the import failed with “Important Conflict: The domain type for the creation of domain ‘xxxxx’ is missing.” My interpretation of this message is that Collibra was trying to create a new Domain record even though the domain already exists. I’ve been told that adding the following columns will resolve the problem:

is part of [Component] > Asset Type
is part of [Component] > Community
is part of [Component] > Domain Type

But this doesn’t make sense. Asset Type and Community are attributes of the Component not part of its unique identifier. I thought that Domain Type was simply an attribute of Domain, but perhaps I’m wrong. Is it possible to have two Domains with the same name and different Domain Types???

One important difference between the two imports is that the Component and its parent System are in the same Domain, but the Schema and its parent Component are in different Domains. In the past this was not an issue, but??

In the Differences and guidelines document for import wizard v2 I found the following:

" * For an improved and more accurate relation mapping, use the latest file format that contains more details in the relation header."

This may be an important clue, but I don’t know what “the latest file format” is.

As I said, I plan to do some testing on Monday and I’ll post the results here. I’m the kind of person who really likes to understand why things work and not just blindly follow rules.

Thank you both for trying to help.

Rich

91 Messages • 905 Points

Hi Rich,

One important difference between the two imports is that the Component and its parent System are in the same Domain, but the Schema and its parent Component are in different Domains. In the past this was not an issue, but??

Are these domains in the same community?

6 Messages

Thank you for the follow-up.

Ah! Six months ago they were all in the same community. That is no longer the case. All the DD asset types (Schemas, Tables, and Columns) are in a new community that is subordinate to a larger community that includes the domain containing both the System and Component assets.

I just made an interesting discovery. About 6 months ago the Full Names of both the domain that contains the System and Component assets and the domain that contains the DD asset types were changed. Those Full Names are no longer identical to the Names. The new Full Names now have an added prefix. I suspect this may be part of the problem. My guess is that Collibra searches on the Domain Full Name rather than the Domain Name when trying to find an asset to be the parent in a relation. I plan to test for this on Monday.

6 Messages

11 months ago

All,

I’ve found my answer. The key information was found at https://productresources.collibra.com/docs/collibra/latest/Content/BestPractices/co_bp-import-assets-using-UI.htm

In addition, I did a good bit of experimenting over the last 2 weeks. Here’s what I learned:

  1. To fully specify an asset you need Full Name + Domain + Community.

  2. If you don’t provide the Community, the Community which “owns” the location you are in when importing will be used.

  3. If you don’t provide the Domain & Community, the Domain & Community which “owns” the location you are in when importing will be used.

  4. If your import includes a relation and the “to” end of the relation is in another Community, you must provide that community name or default rule 2 will cause an error.
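For anyone who finds code easier to follow than rules, here is a minimal Python sketch of the four rules above as I understand them. It is only a toy model of the defaulting behavior, not Collibra's actual implementation; the names `AssetRef` and `resolve` and all the asset, domain, and community names are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AssetRef:
    """Rule 1: an asset is fully specified by Full Name + Domain + Community."""
    full_name: str
    domain: str
    community: str


def resolve(full_name: str,
            domain: Optional[str] = None,
            community: Optional[str] = None,
            *,
            here_domain: str,
            here_community: str) -> AssetRef:
    """Fill in missing parts from the location the import is started from.

    Rule 2: a missing Community defaults to the import location's Community.
    Rule 3: a missing Domain also defaults to the import location's Domain.
    (Hypothetical model of the behavior described in this thread.)
    """
    return AssetRef(
        full_name,
        domain if domain is not None else here_domain,
        community if community is not None else here_community,
    )


# Hypothetical existing parent Component, living in another community.
existing = {AssetRef("CRM Core", "Applications", "Enterprise")}

# The Schema import is started from a domain in a different community.
location = {"here_domain": "Data Dictionary", "here_community": "DD Community"}

# Relation target given as Full Name + Domain only (my old spreadsheets).
without_community = resolve("CRM Core", "Applications", **location)

# Relation target given as Full Name + Domain + Community (rule 4).
with_community = resolve("CRM Core", "Applications", "Enterprise", **location)

print(without_community in existing)  # False
print(with_community in existing)     # True
```

In the first case the target resolves to a Domain named “Applications” inside “DD Community”, which does not exist. That would be consistent with the error I saw, where Collibra appeared to be trying to create a new Domain, although that last step is my own interpretation.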

The fact that our new Domain Full Names do not match the Domain Names doesn’t seem to matter, but it strikes me as an undesirable practice.

Thank you for your help.
