
Tuesday, November 16th, 2021 2:21 PM

Issue with Hive query (FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask)

We are running a POC for customer RBC and have hit an issue specific to the Hive environment.
Other sources can be profiled and run jobs without issue.
The error occurs when we run Estimate Job, before any profiling starts. The operation appears to be running select counts. The logs are attached (extension changed to allow the attachment).
A Google search turns up many different suggested fixes. We tried a few, but wanted to ask whether anyone has seen this issue before when working with Hive for DQ.

RBC-owl-web.txt (8.3 MB)

3 years ago

Hi Adam, I see a permissions problem in the log, which would stop a DQ check right away:
FAILED: SemanticException Unable to fetch table cdr_data. org.apache.hadoop.security.AccessControlException: Permission denied: user=DSJD0SRVCMHIVE, access=EXECUTE, inode="/qa/06295/app/ZIU0/data/cdr/outbound_timeseries":hdfs:hdfs:drwxr-x---
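
For anyone reading the mode string in that error: the directory is owned by hdfs:hdfs with mode drwxr-x---, so the "other" class has no bits set at all. Below is a minimal Python sketch of that check. The helper name `has_access` is hypothetical, and the assumption that DSJD0SRVCMHIVE is neither the owner nor a member of the hdfs group is mine, not something the log states. It also ignores HDFS ACLs and the superuser bypass.

```python
def has_access(mode, owner, group, user, user_groups, access):
    """Check a POSIX-style mode string such as 'drwxr-x---' for one access bit.

    Simplified sketch: ignores HDFS ACLs, the superuser, and the sticky bit.
    """
    bit = {"READ": 0, "WRITE": 1, "EXECUTE": 2}[access]
    perms = mode[1:]  # drop the leading file-type character ('d' for directory)
    if user == owner:
        triplet = perms[0:3]   # owner class
    elif group in user_groups:
        triplet = perms[3:6]   # group class
    else:
        triplet = perms[6:9]   # other class
    return triplet[bit] != "-"


# Values taken from the error message; the empty group set is an assumption.
denied = not has_access("drwxr-x---", "hdfs", "hdfs",
                        "DSJD0SRVCMHIVE", set(), "EXECUTE")
print(denied)
```

EXECUTE on a directory is the traverse permission, so if the assumption holds, the service account cannot even descend into that path. Whether the fix is group membership or an ACL entry is a decision for the cluster admin.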


We worked with the customer and got this issue resolved (or so it seems) with the following:
Properties: tez.queue.name=D_NO_SLA,tez.runtime.io.sort.mb=3548,hive.support.sql11.reserved.keywords=false
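
In case it helps anyone trying the same settings: the property string above is a comma-separated list of key=value pairs. Here is a small Python sketch that splits it and prints the equivalent Hive `SET` statements, which could be tried by hand in a Beeline session. The function names are hypothetical, and this is not a description of how DQ itself applies the properties.

```python
def parse_properties(raw):
    """Split 'k1=v1,k2=v2' into a dict, skipping empty entries."""
    props = {}
    for pair in raw.split(","):
        key, _, value = pair.strip().partition("=")
        if key:
            props[key] = value
    return props


def to_set_statements(props):
    """Render each property as a Hive session-level SET statement."""
    return [f"SET {key}={value};" for key, value in props.items()]


raw = ("tez.queue.name=D_NO_SLA,"
       "tez.runtime.io.sort.mb=3548,"
       "hive.support.sql11.reserved.keywords=false")

for statement in to_set_statements(parse_properties(raw)):
    print(statement)
```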

But now we are encountering this:

Error:

Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, 10.241.165.29, executor 1):
java.sql.SQLException: [Cloudera]HiveJDBCDriver Error creating login context using ticket cache: Unable to obtain Principal Name for authentication .
at com.cloudera.hiveserver2.hivecommon.api.HiveServer2ClientFactory.createTransport(Unknown Source)
at com.cloudera.hiveserver2.hivecommon.api.ServiceDiscoveryFactory.createClient(Unknown Source)
at com.cloudera.hiveserver2.hivecommon.core.HiveJDBCCommonConnection.establishConnection(Unknown Source)
at com.cloudera.hiveserver2.jdbc.core.LoginTimeoutConnection.connect(Unknown Source)
at com.cloudera.hiveserver2.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
at com.cloudera.hiveserver2.jdbc.common.AbstractDriver.connect(Unknown Source)
at org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper.connect(DriverWrapper.scala:45)
at org.apache.spark.sql.exe

Yes, that is a pure Kerberos issue. We would need to see the client’s Kerberos configuration, which is not being picked up properly: [Collibra DQ Kerberos Config](https://dq-docs.collibra.com/v/2.15.0/connecting-to-dbs-in-owl-web/owl-db-connection/connecting-to-cdh-5.16-hive-ssl-tls-kerberos-setup). Is this a CDH 5 Hive Kerberos platform?
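
The error says the driver tried to build a login context from the ticket cache on the Spark executor, so one quick sanity check is whether a cache file exists where that process would look for it. Below is a rough Python sketch, assuming MIT Kerberos with a FILE-type cache and the common default of `/tmp/krb5cc_<uid>` when `KRB5CCNAME` is unset (a `default_ccache_name` in krb5.conf can change that). The function names are hypothetical.

```python
import os


def ticket_cache_path(env, uid):
    """Return the expected ticket cache file path, or None for non-file caches."""
    name = env.get("KRB5CCNAME")
    if not name:
        # Assumed default location; krb5.conf may override it.
        return f"/tmp/krb5cc_{uid}"
    if name.startswith("FILE:"):
        return name[len("FILE:"):]
    if ":" in name:
        # KEYRING:, DIR:, KCM: and similar types are not plain files.
        return None
    return name  # a bare path is treated as a FILE cache


def has_ticket_cache(env, uid):
    """True only if a file-based cache exists; says nothing about ticket expiry."""
    path = ticket_cache_path(env, uid)
    return path is not None and os.path.isfile(path)


print(ticket_cache_path({}, 1000))
print(ticket_cache_path({"KRB5CCNAME": "FILE:/tmp/krb5cc_spark"}, 1000))
```

This would need to run on each executor host as the user the Spark job runs under, since that is where the failure is reported. A cache file being present does not mean the ticket is still valid; `klist` shows the principal and expiry.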


3 years ago

Provided them this documentation. The latest error is:

Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, 10.241.165.29, executor 0):
java.sql.SQLException: [Cloudera]HiveJDBCDriver Error creating login context using ticket cache: Unable to obtain Principal Name for authentication .
at com.cloudera.hiveserver2.hivecommon.api.HiveServer2ClientFactory.createTransport(Unknown Source)
at com.cloudera.hiveserver2.hivecommon.api.ServiceDiscoveryFactory.createClient(Unknown Source)
at com.cloudera.hiveserver2.hivecommon.core.HiveJDBCCommonConnection.establishConnection(Unknown Source)
at com.cloudera.hiveserver2.jdbc.core.LoginTimeoutConnection.connect(Unknown Source)
at com.cloudera.hiveserver2.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
at com.cloudera.hiveserver2.jdbc.common.AbstractDriver.connect(Unknown Source)
at org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper.connect(DriverWrapper.scala:45)
at org.apache.spark.sql.exe
