How to configure DQ Connector
In this post we are going to review the DQ connector configuration step by step.
Prerequisites
- Edge
- Create an Edge site via Collibra Data Intelligence Cloud Settings.
- Install the Edge site close to the data source you want to access.
- Collibra Data Quality
- Install Collibra Data Quality.
- Connect a data source within Collibra Data Quality.
- Run a Data Quality Test.
- Create some automatic Rules within Collibra Data Quality.
- A Collibra Data Governance instance with administrator privileges.
- Enable Database registration and Data Quality Synchronization through Edge.
- Create the domains for the Data Quality assets: a Rulebook for rules and Business Asset Domains for metrics and dimensions.
Note: Make sure the Edge user is assigned as Technical Steward in the Responsibilities section of every new domain, so that it has enough privileges to bring in all the information from Collibra Data Quality.
- To be able to see the rules in Table and Column assets, the corresponding relations have to be configured in these assets. For example: “governed by Governance Asset” and “is governed by Data Quality Rule”. Configure the opposite relations in Rule assets if you want to see the complete relation.
After configuring all these prerequisites, we can start configuring the DQ Connector.
Connect to a Data Quality source.
- Inside DGC, click on “Settings” -> “Edge” and check that your Edge site is healthy.
If it is not, review the installation steps of the Edge site.
#Connect to your Data Quality source:
Create a connection for each Collibra Data Quality data source you want to synchronize.
- Click on the Edge name.
- Click on “Create connection”.
These are the connection settings that we have to provide:
- Name -> The same name as the Collibra Data Quality connection name.
- Description -> The description of the JDBC connection. This field is also visible when you register content.
- Connection provider -> In this example, we are going to select “Generic JDBC connection”.
- Fill in all the connection parameters needed to connect to the data source.
- Finally, click on Create. The connection will be shown in the Edge site.
- Check that the connection is correctly configured by clicking on “Test Connection”.
Now, we are going to add ingestion capabilities to your Data Quality connection.
- Click on the “Capabilities” tab, click on “Add Capability” and select “Catalog JDBC ingestion”. Complete the required parameters:
- JDBC Connection: Select the connection created previously.
- JDBC Datasource type: Select the type of JDBC connection.
- After creating the JDBC ingestion for our connection, we have to configure the DQ Connector. Click on the “Capabilities” tab, click on “Add Capability” and select “DQ Connector”.
Complete the required parameters:
- Base URL: The URL of the DQ installation. IMPORTANT: Don’t add a “/” at the end of the URL.
- Username/Password to connect with DQ instance.
- DQ Rules domain id: The id of the created Rulebook.
- DQ Metrics: The id of the Business Asset Domains created in the previous steps.
- DQ Dimensions: The id of the Business Asset Domains created in the previous steps.
- Click on “Create” and the DQ Connector capability will be available in the Edge site.
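Before pasting the values into the form, the parameters above can be sanity-checked with a small script. This is only a sketch, not part of the product: the function name `check_dq_connector_params` is made up for illustration, and it assumes that the domain ids are UUID strings, which is how ids usually appear in Collibra. It only inspects the text values and does not contact DQ or DGC.

```python
import uuid
from urllib.parse import urlparse


def check_dq_connector_params(base_url, domain_ids):
    """Return a list of problems found; an empty list means nothing suspicious.

    Hypothetical helper: it checks the format of the values only.
    """
    problems = []

    # The Base URL should be an absolute http(s) URL.
    parsed = urlparse(base_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("Base URL should be an absolute http(s) URL")

    # The post warns explicitly against a trailing "/".
    if base_url.endswith("/"):
        problems.append("Base URL must not end with '/'")

    # Assumption: Rulebook and Business Asset Domain ids are UUIDs.
    for label, value in domain_ids.items():
        try:
            uuid.UUID(value)
        except (ValueError, AttributeError, TypeError):
            problems.append(f"{label} does not look like a UUID: {value!r}")

    return problems


if __name__ == "__main__":
    # Placeholder values, replace them with your own.
    issues = check_dq_connector_params(
        "https://dq.example.com/",
        {"DQ Rules domain id": "123e4567-e89b-12d3-a456-426614174000"},
    )
    for issue in issues:
        print(issue)
```

With the placeholder values above, the script reports the trailing “/” in the Base URL, which is the mistake the IMPORTANT note warns about.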
Register Data Quality Edge connection in Data Catalog:
#Create a Data Catalog System Asset.
- If it does not exist yet, create a Physical Data Dictionary to store the assets that will be created.
- Click on “+” and create a new System.
Note: the name of the system should match the name of the connection configured in DQ.
- Click on “Create” and the System asset will be created.
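Because the System name has to match the DQ connection name, a quick comparison of both strings can help to find the usual mistakes (extra spaces, different letter case). The following is a minimal sketch: `explain_mismatch` is a hypothetical helper, and it assumes that the match has to be exact, which the post does not confirm.

```python
def explain_mismatch(system_name, dq_connection_name):
    """Return None when the names are identical, else a short hint.

    Assumption: the match is exact, so whitespace and letter case
    differences are reported as problems.
    """
    if system_name == dq_connection_name:
        return None
    if system_name.strip() == dq_connection_name.strip():
        return "names differ only by leading/trailing whitespace"
    if system_name.casefold() == dq_connection_name.casefold():
        return "names differ only by letter case"
    return "names are different"


if __name__ == "__main__":
    # Placeholder names, replace them with your own.
    hint = explain_mismatch("SNOWFLAKE_PROD ", "SNOWFLAKE_PROD")
    print(hint or "names match")
```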
Now, register the Data Quality data source in Data Catalog.
- Go to Catalog.
- Click on “+” and select “Register a data source”.
- Click on Add next to the connection configured previously.
- Set the configuration parameters:
- Select the desired Community.
- Select the system configured previously.
- Select the database that you want to synchronize.
- Click on Register
Now, we have to execute the JDBC ingestion and the Data Quality extraction.
- After the registration of the Datasource, the Database screen will be opened. Click on “Configuration”.
- Click on the desired Schema name and click on “Save”.
- Then, click on “Synchronized metadata”.
- Now, go to “Quality extraction”. You will see the schemas that were synchronized before.
- Click on Edit.
- Select “Extract” in the Data Quality column.
- Click on Save
- Click on Synchronize.
Now, we can check if we have all the information of Data Quality in Collibra Data Governance.
- Check your Rulebook and domains.
- Check the Table Asset.
- Check the Column Asset.
Note: To be able to configure the Data Quality graphs, the Data Quality Rules have to be configured in settings.