ShreelekhyaG opened a new pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919

### Why is this PR needed?
The load fails with a generic 'Job aborted' exception when the bad records action is unspecified. When the partition column is loaded with a bad record value, the load fails with only a 'Job aborted' message in the cluster, although the complete stack trace shows the actual error message (for example, 'Data load failed due to bad record: The value with column name projectjoindate and column data type TIMESTAMP is not a valid TIMESTAMP type').

### What changes were proposed in this PR?
Fix the bad record error message for the partition column. The error message is added to the `operationContext` map, and if it is not null, `CarbonLoadDataCommand` throws an exception carrying that `errorMessage`.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- No, tested in cluster.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]
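The mechanism described in the PR summary can be illustrated with a small, self-contained sketch. This is not the CarbonData code: `SketchOperationContext` and `BadRecordMessageSketch` are hypothetical stand-ins, and only the `"Error message"` key and the null check are taken from the diff quoted later in this thread. The load step stores the detailed message in a shared context before failing, and the command's catch block prefers that message over the generic one.

```scala
import scala.collection.mutable

// Hypothetical stand-in for an operation context: a plain property map.
final class SketchOperationContext {
  private val props = mutable.Map.empty[String, AnyRef]

  def setProperty(key: String, value: AnyRef): Unit = props.update(key, value)

  // Returns null when the key is absent, so callers can use a null check.
  def getProperty(key: String): AnyRef = props.getOrElse(key, null)
}

object BadRecordMessageSketch {
  // Key string as it appears in the quoted diff.
  val ErrorKey = "Error message"

  // Simulates the load step: records the detailed message (if any),
  // then fails with a generic error, as a cluster job abort would.
  def runLoad(ctx: SketchOperationContext, detail: Option[String]): Unit = {
    detail.foreach(msg => ctx.setProperty(ErrorKey, msg))
    throw new RuntimeException("Job aborted.")
  }

  // Simulates the command's catch block: rethrow with the detailed
  // message when one was recorded, otherwise rethrow the original.
  def load(ctx: SketchOperationContext, detail: Option[String]): Unit = {
    try runLoad(ctx, detail)
    catch {
      case ex: Exception =>
        val errorMessage = ctx.getProperty(ErrorKey)
        if (errorMessage != null) {
          throw new Exception(errorMessage.toString, ex.getCause)
        } else {
          throw ex
        }
    }
  }
}
```

With a recorded detail message the caller sees that message; without one, the original generic exception propagates unchanged.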
CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-690196778

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2298/

----------------------------------------------------------------
CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-690199098

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4037/

----------------------------------------------------------------
CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-690905770

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2306/

----------------------------------------------------------------
CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-690906823

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4044/

----------------------------------------------------------------
ajantha-bhat commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r487970151

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
##########
@@ -191,7 +191,12 @@ case class CarbonLoadDataCommand(databaseNameOp: Option[String],
         if (isUpdateTableStatusRequired) {
           CarbonLoaderUtil.updateTableStatusForFailure(carbonLoadModel, uuid)
         }
-        throw ex
+        val errorMessage = operationContext.getProperty("Error message")
+        if (errorMessage != null) {
+          throw new Exception(errorMessage.toString, ex.getCause)

Review comment:
Instead of Exception, please use a more specific exception class, such as RuntimeException, IOException, or whatever is suitable here.

----------------------------------------------------------------
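To make the reviewer's point concrete, here is a minimal sketch of why the thrown type matters to callers. The names `SpecificExceptionSketch`, `failGeneric`, `failSpecific`, and `describe` are hypothetical, and `RuntimeException` is just one of the classes the reviewer suggested; the thread does not show which class was finally chosen.

```scala
object SpecificExceptionSketch {
  // Hypothetical failure sites: one throws the broad type, one a narrower type.
  def failGeneric(msg: String): Nothing = throw new Exception(msg)
  def failSpecific(msg: String): Nothing = throw new RuntimeException(msg)

  // A caller that treats RuntimeException differently from other exceptions.
  // The narrower case must come first, because RuntimeException is an Exception.
  def describe(action: => Unit): String =
    try {
      action
      "ok"
    } catch {
      case e: RuntimeException => s"runtime: ${e.getMessage}"
      case e: Exception => s"other: ${e.getMessage}"
    }
}
```

A caller (or a test) that handles only `RuntimeException` would see the second failure site but not the first, which is why throwing a bare `Exception` makes precise handling harder.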
ajantha-bhat commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r487970978

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CommonLoadUtils.scala
##########
@@ -1064,6 +1064,7 @@ object CommonLoadUtils {
         if (loadParams.updateModel.isDefined) {
           CarbonScalaUtil.updateErrorInUpdateModel(loadParams.updateModel.get, executorMessage)
         }

Review comment:
If possible, please add a small test case that intercepts the exception and checks for the proper error message.

----------------------------------------------------------------
ShreelekhyaG commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r488422763

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CommonLoadUtils.scala
##########
@@ -1064,6 +1064,7 @@ object CommonLoadUtils {
         if (loadParams.updateModel.isDefined) {
           CarbonScalaUtil.updateErrorInUpdateModel(loadParams.updateModel.get, executorMessage)
         }

Review comment:
Added a test case.

----------------------------------------------------------------
ShreelekhyaG commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r488423451

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
##########
@@ -191,7 +191,12 @@ case class CarbonLoadDataCommand(databaseNameOp: Option[String],
         if (isUpdateTableStatusRequired) {
           CarbonLoaderUtil.updateTableStatusForFailure(carbonLoadModel, uuid)
         }
-        throw ex
+        val errorMessage = operationContext.getProperty("Error message")
+        if (errorMessage != null) {
+          throw new Exception(errorMessage.toString, ex.getCause)

Review comment:
Done

----------------------------------------------------------------
ajantha-bhat commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r488449175

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionBadRecordLoggerTest.scala
##########
@@ -219,6 +219,24 @@ class StandardPartitionBadRecordLoggerTest extends QueryTest with BeforeAndAfter
     }
   }

+  test("test load with partition column having bad record value") {
+    sql("drop table if exists dataloadOptionTests")
+    sql("CREATE TABLE dataloadOptionTests (empno int, empname String, designation String, " +
+      "workgroupcategory int, workgroupcategoryname String, deptno int, projectjoindate " +
+      "Timestamp, projectenddate Date,attendance int,utilization int,salary int) PARTITIONED BY " +
+      "(deptname String,doj Timestamp,projectcode int) STORED AS carbondata ")
+    val csvFilePath = s"$resourcesPath/data.csv"
+    val ex = intercept[Exception] {
+      sql("LOAD DATA local inpath '" + csvFilePath +
+        "' INTO TABLE dataloadOptionTests OPTIONS ('bad_records_action'='FAIL', 'DELIMITER'= '," +
+        "', 'QUOTECHAR'= '\"', 'dateformat'='DD-MM-YYYY','timestampformat'='DD-MM-YYYY')");
+    }
+    assert(ex.getMessage.contains(
+      "DataLoad failure: Data load failed due to bad record: The value with column name " +
+      "projectjoindate and column data type TIMESTAMP is not a valid TIMESTAMP type.Please " +
+      "enable bad record logger to know the detail reason."))
+  }

Review comment:
Please drop the table here.

----------------------------------------------------------------
ajantha-bhat commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r488449347

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionBadRecordLoggerTest.scala
##########
@@ -219,6 +219,24 @@ class StandardPartitionBadRecordLoggerTest extends QueryTest with BeforeAndAfter
     }
   }

+  test("test load with partition column having bad record value") {
+    sql("drop table if exists dataloadOptionTests")
+    sql("CREATE TABLE dataloadOptionTests (empno int, empname String, designation String, " +
+      "workgroupcategory int, workgroupcategoryname String, deptno int, projectjoindate " +
+      "Timestamp, projectenddate Date,attendance int,utilization int,salary int) PARTITIONED BY " +
+      "(deptname String,doj Timestamp,projectcode int) STORED AS carbondata ")
+    val csvFilePath = s"$resourcesPath/data.csv"
+    val ex = intercept[Exception] {

Review comment:
Please intercept RuntimeException only.

----------------------------------------------------------------
CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-692563259

----------------------------------------------------------------
CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-692645252

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4080/

----------------------------------------------------------------
CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-692654382

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2340/

----------------------------------------------------------------
ShreelekhyaG commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r488595453

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionBadRecordLoggerTest.scala
##########
@@ -219,6 +219,24 @@ class StandardPartitionBadRecordLoggerTest extends QueryTest with BeforeAndAfter
     }
   }

+  test("test load with partition column having bad record value") {
+    sql("drop table if exists dataloadOptionTests")
+    sql("CREATE TABLE dataloadOptionTests (empno int, empname String, designation String, " +
+      "workgroupcategory int, workgroupcategoryname String, deptno int, projectjoindate " +
+      "Timestamp, projectenddate Date,attendance int,utilization int,salary int) PARTITIONED BY " +
+      "(deptname String,doj Timestamp,projectcode int) STORED AS carbondata ")
+    val csvFilePath = s"$resourcesPath/data.csv"
+    val ex = intercept[Exception] {
+      sql("LOAD DATA local inpath '" + csvFilePath +
+        "' INTO TABLE dataloadOptionTests OPTIONS ('bad_records_action'='FAIL', 'DELIMITER'= '," +
+        "', 'QUOTECHAR'= '\"', 'dateformat'='DD-MM-YYYY','timestampformat'='DD-MM-YYYY')");
+    }
+    assert(ex.getMessage.contains(
+      "DataLoad failure: Data load failed due to bad record: The value with column name " +
+      "projectjoindate and column data type TIMESTAMP is not a valid TIMESTAMP type.Please " +
+      "enable bad record logger to know the detail reason."))
+  }

Review comment:
Ok

----------------------------------------------------------------
ajantha-bhat commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-692670191

LGTM

----------------------------------------------------------------
asfgit closed pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919