[GitHub] [carbondata] ShreelekhyaG opened a new pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

[GitHub] [carbondata] ShreelekhyaG opened a new pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

GitBox

ShreelekhyaG opened a new pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919


    ### Why is this PR needed?
   Load fails with an aborted exception when the bad records action is unspecified.
   
   When a partition column is loaded with a bad record value, the load fails in the cluster with only a 'Job aborted' message. The actual error message is visible only in the complete stack trace (for example, 'Data load failed due to bad record: The value with column name projectjoindate and column data type TIMESTAMP is not a valid TIMESTAMP type').
   
    ### What changes were proposed in this PR?
    Fix the bad record error message for the partition column. The error message is added to the `operationContext` map, and if it is not null, an exception is thrown with that `errorMessage` from `CarbonLoadDataCommand`.
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - No, tested in cluster.
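
A minimal sketch of the approach described above, assuming a simplified context object: the detailed message is stored under a key when the load fails, and the caller rethrows with that message if one is present. `SketchOperationContext`, `runLoad`, and the message text are hypothetical stand-ins for illustration, not CarbonData's actual classes; only the `"Error message"` key and the null check follow the review diff of this PR.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the operation context; not the CarbonData class.
class SketchOperationContext {
  private val properties = mutable.Map.empty[String, Any]

  def setProperty(key: String, value: Any): Unit = properties(key) = value

  // Returns null when the key is absent, so callers can use a plain null check.
  def getProperty(key: String): Any = properties.getOrElse(key, null)
}

object ErrorMessageSketch {
  val ErrorKey = "Error message"
  // Illustrative message only; the real text is produced by the loader.
  val DetailedMessage = "Data load failed due to bad record (illustrative)"

  // Simulates a load that records the detailed reason, then fails generically.
  def runLoad(context: SketchOperationContext): Unit = {
    context.setProperty(ErrorKey, DetailedMessage)
    throw new RuntimeException("Job aborted")
  }

  // Rethrows with the recorded message when present, else rethrows as-is.
  def loadWithDetailedError(context: SketchOperationContext): Unit = {
    try {
      runLoad(context)
    } catch {
      case ex: Exception =>
        val errorMessage = context.getProperty(ErrorKey)
        if (errorMessage != null) {
          throw new RuntimeException(errorMessage.toString, ex.getCause)
        }
        throw ex
    }
  }
}
```

With this shape, a caller sees the recorded bad-record reason in `getMessage` instead of the generic 'Job aborted' text, and the original exception is still rethrown unchanged when no message was recorded.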
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

GitBox

CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-690196778


   Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2298/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-690199098


   Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4037/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3919: [WIP] Load fails with aborted exception when Bad records action is unspecified

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-690905770


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2306/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3919: [WIP] Load fails with aborted exception when Bad records action is unspecified

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-690906823


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4044/
   



[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r487970151



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
##########
@@ -191,7 +191,12 @@ case class CarbonLoadDataCommand(databaseNameOp: Option[String],
         if (isUpdateTableStatusRequired) {
           CarbonLoaderUtil.updateTableStatusForFailure(carbonLoadModel, uuid)
         }
-        throw ex
+        val errorMessage = operationContext.getProperty("Error message")
+        if (errorMessage != null) {
+          throw new Exception(errorMessage.toString, ex.getCause)

Review comment:
       Instead of `Exception`, use a more specific exception class here, such as `RuntimeException`, `IOException`, or another suitable type.
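
A small sketch of why the narrower type can help, under the assumption that callers want to tell failure kinds apart by type. `SpecificExceptionSketch`, `wrap`, and `classify` are hypothetical names used only for illustration; the wrapping mirrors the `ex.getCause` usage in the diff above.

```scala
import java.io.IOException

object SpecificExceptionSketch {
  // Wraps a detailed message in a RuntimeException and keeps the original cause.
  def wrap(message: String, original: Throwable): RuntimeException =
    new RuntimeException(message, original.getCause)

  // Callers can match on the narrower type first; a bare Exception only
  // reaches the generic branch.
  def classify(t: Throwable): String = t match {
    case _: IOException      => "io"
    case _: RuntimeException => "runtime"
    case _: Exception        => "generic"
    case _                   => "other"
  }
}
```

A handler written for `Exception` still catches the `RuntimeException`, so narrowing the thrown type does not break broader handlers.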





[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r487970978



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CommonLoadUtils.scala
##########
@@ -1064,6 +1064,7 @@ object CommonLoadUtils {
         if (loadParams.updateModel.isDefined) {
           CarbonScalaUtil.updateErrorInUpdateModel(loadParams.updateModel.get, executorMessage)
         }

Review comment:
       If possible, please add a small test case that intercepts the exception and checks for the proper error message.





[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r488422763



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CommonLoadUtils.scala
##########
@@ -1064,6 +1064,7 @@ object CommonLoadUtils {
         if (loadParams.updateModel.isDefined) {
           CarbonScalaUtil.updateErrorInUpdateModel(loadParams.updateModel.get, executorMessage)
         }

Review comment:
       Added a test case.





[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r488423451



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
##########
@@ -191,7 +191,12 @@ case class CarbonLoadDataCommand(databaseNameOp: Option[String],
         if (isUpdateTableStatusRequired) {
           CarbonLoaderUtil.updateTableStatusForFailure(carbonLoadModel, uuid)
         }
-        throw ex
+        val errorMessage = operationContext.getProperty("Error message")
+        if (errorMessage != null) {
+          throw new Exception(errorMessage.toString, ex.getCause)

Review comment:
       Done





[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r488449175



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionBadRecordLoggerTest.scala
##########
@@ -219,6 +219,24 @@ class StandardPartitionBadRecordLoggerTest extends QueryTest with BeforeAndAfter
     }
   }
 
+  test("test load with partition column having bad record value") {
+    sql("drop table if exists dataloadOptionTests")
+    sql("CREATE TABLE dataloadOptionTests (empno int, empname String, designation String, " +
+      "workgroupcategory int, workgroupcategoryname String, deptno int, projectjoindate " +
+      "Timestamp, projectenddate Date,attendance int,utilization int,salary int) PARTITIONED BY " +
+      "(deptname String,doj Timestamp,projectcode int) STORED AS carbondata ")
+    val csvFilePath = s"$resourcesPath/data.csv"
+    val ex = intercept[Exception] {
+      sql("LOAD DATA local inpath '" + csvFilePath +
+          "' INTO TABLE dataloadOptionTests OPTIONS ('bad_records_action'='FAIL', 'DELIMITER'= '," +
+          "', 'QUOTECHAR'= '\"', 'dateformat'='DD-MM-YYYY','timestampformat'='DD-MM-YYYY')");
+    }
+    assert(ex.getMessage.contains(
+      "DataLoad failure: Data load failed due to bad record: The value with column name " +
+      "projectjoindate and column data type TIMESTAMP is not a valid TIMESTAMP type.Please " +
+      "enable bad record logger to know the detail reason."))
+  }

Review comment:
       Please drop the table at the end of this test.
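
One way to guarantee the cleanup requested here, sketched with a hypothetical in-memory catalog instead of a real Spark session: the table is dropped in a `finally` block, so it is removed even when the load under test throws. `TableCleanupSketch`, `withTable`, and the catalog are assumptions for illustration and are not part of the CarbonData test utilities.

```scala
import scala.collection.mutable

object TableCleanupSketch {
  // Hypothetical in-memory catalog standing in for the test session's tables.
  val tables = mutable.Set.empty[String]

  def createTable(name: String): Unit = tables += name

  def dropTableIfExists(name: String): Unit = tables -= name

  // Runs a test body against a fresh table and always drops it afterwards,
  // whether the body returns normally or throws.
  def withTable[T](name: String)(body: => T): T = {
    dropTableIfExists(name)
    createTable(name)
    try body finally dropTableIfExists(name)
  }
}
```

In the test above, the same effect could be had by issuing a `drop table if exists` statement after the assertion; the wrapper only makes the cleanup independent of whether the assertion passes.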





[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r488449347



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionBadRecordLoggerTest.scala
##########
@@ -219,6 +219,24 @@ class StandardPartitionBadRecordLoggerTest extends QueryTest with BeforeAndAfter
     }
   }
 
+  test("test load with partition column having bad record value") {
+    sql("drop table if exists dataloadOptionTests")
+    sql("CREATE TABLE dataloadOptionTests (empno int, empname String, designation String, " +
+      "workgroupcategory int, workgroupcategoryname String, deptno int, projectjoindate " +
+      "Timestamp, projectenddate Date,attendance int,utilization int,salary int) PARTITIONED BY " +
+      "(deptname String,doj Timestamp,projectcode int) STORED AS carbondata ")
+    val csvFilePath = s"$resourcesPath/data.csv"
+    val ex = intercept[Exception] {

Review comment:
       Please intercept `RuntimeException` only.
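
To illustrate the difference the narrower type makes in a test, here is a minimal, framework-free analogue of an `intercept`-style helper. `InterceptSketch.interceptOnly` is a hypothetical name and is not ScalaTest's implementation; it only shows that asking for `RuntimeException` makes the check fail when some other exception type is thrown.

```scala
import scala.reflect.ClassTag

object InterceptSketch {
  // Runs the body and returns the thrown exception if it has the expected type.
  // Fails with an AssertionError when nothing is thrown or the type differs.
  def interceptOnly[E <: Throwable](body: => Any)(implicit tag: ClassTag[E]): E = {
    val thrown: Option[Throwable] =
      try { body; None } catch { case t: Throwable => Some(t) }
    val expected = tag.runtimeClass.getName
    thrown match {
      case Some(e) if tag.runtimeClass.isInstance(e) =>
        e.asInstanceOf[E]
      case Some(other) =>
        throw new AssertionError(s"Expected $expected but got ${other.getClass.getName}")
      case None =>
        throw new AssertionError(s"Expected $expected but nothing was thrown")
    }
  }
}
```

Intercepting the broad `Exception` type would accept any failure, including unrelated ones, so the narrower type makes the test assert more about the behavior under review.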





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-692563259







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-692645252


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4080/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-692654382


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2340/
   



[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r488595453



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionBadRecordLoggerTest.scala
##########
@@ -219,6 +219,24 @@ class StandardPartitionBadRecordLoggerTest extends QueryTest with BeforeAndAfter
     }
   }
 
+  test("test load with partition column having bad record value") {
+    sql("drop table if exists dataloadOptionTests")
+    sql("CREATE TABLE dataloadOptionTests (empno int, empname String, designation String, " +
+      "workgroupcategory int, workgroupcategoryname String, deptno int, projectjoindate " +
+      "Timestamp, projectenddate Date,attendance int,utilization int,salary int) PARTITIONED BY " +
+      "(deptname String,doj Timestamp,projectcode int) STORED AS carbondata ")
+    val csvFilePath = s"$resourcesPath/data.csv"
+    val ex = intercept[Exception] {
+      sql("LOAD DATA local inpath '" + csvFilePath +
+          "' INTO TABLE dataloadOptionTests OPTIONS ('bad_records_action'='FAIL', 'DELIMITER'= '," +
+          "', 'QUOTECHAR'= '\"', 'dateformat'='DD-MM-YYYY','timestampformat'='DD-MM-YYYY')");
+    }
+    assert(ex.getMessage.contains(
+      "DataLoad failure: Data load failed due to bad record: The value with column name " +
+      "projectjoindate and column data type TIMESTAMP is not a valid TIMESTAMP type.Please " +
+      "enable bad record logger to know the detail reason."))
+  }

Review comment:
       Ok





[GitHub] [carbondata] ajantha-bhat commented on pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

GitBox
In reply to this post by GitBox

ajantha-bhat commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-692670191


   LGTM



[GitHub] [carbondata] asfgit closed pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

GitBox
In reply to this post by GitBox

asfgit closed pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919


   

