shenh062326 opened a new pull request #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546

Before this change, when a string's length exceeds 32000, the error message is:
```
Previous exception in task: Dataload failed, String length cannot exceed 32000 characters
	org.apache.carbondata.streaming.parser.FieldConverter$.objectToString(FieldConverter.scala:53)
	org.apache.carbondata.spark.util.CarbonScalaUtil$.getString(CarbonScalaUtil.scala:71)
	org.apache.carbondata.spark.rdd.NewRddIterator$$anonfun$next$1.apply$mcVI$sp(NewCarbonDataLoadRDD.scala:360)
	scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
	org.apache.carbondata.spark.rdd.NewRddIterator.next(NewCarbonDataLoadRDD.scala:359)
	org.apache.carbondata.spark.load.DataLoadProcessorStepOnSpark$$anon$1.next(DataLoadProcessorStepOnSpark.scala:66)
	... ...
```
After this change, when a string's length exceeds 32000, the error message is:
```
Previous exception in task: Column idx 49 too long
	org.apache.carbondata.spark.util.CarbonScalaUtil$.getString(CarbonScalaUtil.scala:80)
	org.apache.carbondata.spark.rdd.NewRddIterator$$anonfun$next$1.apply$mcVI$sp(NewCarbonDataLoadRDD.scala:360)
	scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
	org.apache.carbondata.spark.rdd.NewRddIterator.next(NewCarbonDataLoadRDD.scala:359)
	org.apache.carbondata.spark.load.DataLoadProcessorStepOnSpark$$anon$1.next(DataLoadProcessorStepOnSpark.scala:66)
	... ...
Caused by: java.lang.Exception: Dataload failed, String length cannot exceed 32000 characters
	at org.apache.carbondata.streaming.parser.FieldConverter$.objectToString(FieldConverter.scala:54)
	at org.apache.carbondata.spark.util.CarbonScalaUtil$.getString(CarbonScalaUtil.scala:74)
	... 31 more
```

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

 - [ ] Any interfaces changed? No
 - [ ] Any backward compatibility impacted? No
 - [ ] Document update required? No
 - [ ] Testing done? Yes
 - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

With regards,
Apache Git Services

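The two traces above differ because the outer helper now wraps the inner failure and keeps it as the cause, which is what produces the `Caused by:` section. A minimal sketch of that wrapping, under simplifying assumptions: `ColumnErrorSketch`, `MaxChars`, `ExceedErrorMsg`, and the `Seq[Any]` row are hypothetical stand-ins for CarbonData's `FieldConverter`, `CarbonCommonConstants.MAX_CHARS_PER_COLUMN_DEFAULT`, and Spark's `Row`, not the project's actual code.

```scala
// Hypothetical stand-ins for illustration only; not CarbonData's real classes.
object ColumnErrorSketch {
  // Assumed limit, taken from the error message quoted in this thread.
  val MaxChars = 32000
  val ExceedErrorMsg = "Dataload failed, String length cannot exceed "

  // Plays the role of the inner converter: it rejects over-long strings
  // but has no idea which column the value came from.
  def objectToString(value: Any): String = value match {
    case s: String if s.length > MaxChars =>
      throw new Exception(ExceedErrorMsg + MaxChars + " characters")
    case null => ""
    case other => other.toString
  }

  // Plays the role of the outer helper: it knows the column index, so it
  // wraps the inner failure and passes it along as the cause.
  def getString(row: Seq[Any], idx: Int): String = {
    try {
      objectToString(row(idx))
    } catch {
      case e: Exception
          if e.getMessage != null && e.getMessage.startsWith(ExceedErrorMsg) =>
        throw new Exception("Column idx " + idx + " too long", e)
    }
  }
}
```

Because the original exception is passed as the second constructor argument, the length-limit message is still printed under `Caused by:` while the top-level message names the column.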
jackylk commented on a change in pull request #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#discussion_r361907045

##########
File path: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonGlobalDictionaryRDD.scala
##########
@@ -297,7 +297,8 @@ class CarbonBlockDistinctValuesCombineRDD(
       val complexDelimiters = new util.ArrayList[String]
       model.delimiters.foreach(x => complexDelimiters.add(x))
       for (i <- 0 until dimNum) {
-        dimensionParsers(i).parseString(CarbonScalaUtil.getString(row.get(i),
+        dimensionParsers(i).parseString(CarbonScalaUtil.getString(row,

Review comment:
Move `CarbonScalaUtil.getString(row,` to the next line, like `CarbonScalaUtil.getString(row, i),`

jackylk commented on a change in pull request #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#discussion_r361907089

##########
File path: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala
##########
@@ -60,17 +60,27 @@ object CarbonScalaUtil {
   private val LOGGER: Logger = LogServiceFactory.getLogService(this.getClass.getCanonicalName)

-  def getString(value: Any,
+  def getString(row: Row,

Review comment:
Move `row: Row` to the next line.

jackylk commented on a change in pull request #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#discussion_r361907260

##########
File path: streaming/src/main/scala/org/apache/carbondata/streaming/parser/FieldConverter.scala
##########
@@ -50,7 +51,7 @@ object FieldConverter {
     value match {
       case s: String => if (!isVarcharType && !isComplexType &&
                             s.length > CarbonCommonConstants.MAX_CHARS_PER_COLUMN_DEFAULT) {
-        throw new Exception("Dataload failed, String length cannot exceed " +
+        throw new Exception( exceedErrorMsg +

Review comment:
Suggest using `IllegalArgumentException`; it can then be caught in CarbonScalaUtil.scala.

jackylk commented on a change in pull request #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#discussion_r361907260

##########
File path: streaming/src/main/scala/org/apache/carbondata/streaming/parser/FieldConverter.scala
##########
@@ -50,7 +51,7 @@ object FieldConverter {
     value match {
       case s: String => if (!isVarcharType && !isComplexType &&
                             s.length > CarbonCommonConstants.MAX_CHARS_PER_COLUMN_DEFAULT) {
-        throw new Exception("Dataload failed, String length cannot exceed " +
+        throw new Exception( exceedErrorMsg +

Review comment:
Suggest using `IllegalArgumentException`.

jackylk commented on a change in pull request #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#discussion_r361907409

##########
File path: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala
##########
@@ -60,17 +60,27 @@ object CarbonScalaUtil {
   private val LOGGER: Logger = LogServiceFactory.getLogService(this.getClass.getCanonicalName)

-  def getString(value: Any,
+  def getString(row: Row,
+      idx: Int,
       serializationNullFormat: String,
       complexDelimiters: util.ArrayList[String],
       timeStampFormat: SimpleDateFormat,
       dateFormat: SimpleDateFormat,
       isVarcharType: Boolean = false,
       isComplexType: Boolean = false,
       level: Int = 0): String = {
-    FieldConverter.objectToString(value, serializationNullFormat, complexDelimiters,
-      timeStampFormat, dateFormat, isVarcharType = isVarcharType, isComplexType = isComplexType,
-      level)
+    try {
+      FieldConverter.objectToString(row.get(idx), serializationNullFormat, complexDelimiters,
+        timeStampFormat, dateFormat, isVarcharType = isVarcharType, isComplexType = isComplexType,
+        level)
+    } catch {
+      case e: Exception =>
+        if (e.getMessage.startsWith(FieldConverter.exceedErrorMsg)) {
+          throw new Exception("Column idx " + idx + " too long", e)

Review comment:
Why change the content and throw again? Why not throw it directly?

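The catch block in this diff recognizes the failure by message prefix. `e.getMessage` can be null for some exceptions, in which case `startsWith` would itself throw a `NullPointerException`. One possible alternative, in the spirit of the `IllegalArgumentException` suggestion made earlier in this thread, is to match on the exception type instead. This is only a sketch under assumptions: `TypedCatchSketch`, `MaxChars`, and the `Seq[Any]` row are hypothetical stand-ins and do not reflect what the PR finally merged.

```scala
// Hypothetical stand-ins for illustration only; not CarbonData's real classes.
object TypedCatchSketch {
  // Assumed limit, taken from the error message quoted in this thread.
  val MaxChars = 32000

  // Inner converter: signals an over-long string with a specific exception type.
  def objectToString(value: Any): String = value match {
    case s: String if s.length > MaxChars =>
      throw new IllegalArgumentException(
        "Dataload failed, String length cannot exceed " + MaxChars + " characters")
    case null => ""
    case other => other.toString
  }

  // Outer helper: matches on the exception type, so no message inspection
  // (and no null check on getMessage) is needed before adding the column index.
  def getString(row: Seq[Any], idx: Int): String = {
    try {
      objectToString(row(idx))
    } catch {
      case e: IllegalArgumentException =>
        throw new IllegalArgumentException("Column idx " + idx + " too long", e)
    }
  }
}
```

The trade-off is that any other `IllegalArgumentException` raised during conversion would be relabeled as "too long" as well; a dedicated exception subclass would narrow the match further.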
CarbonDataQA1 commented on issue #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#issuecomment-569596183
Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/1349/

CarbonDataQA1 commented on issue #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#issuecomment-569603453
Build Failed with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/1359/

CarbonDataQA1 commented on issue #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#issuecomment-569605699
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1370/

shenh062326 commented on a change in pull request #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#discussion_r362154198

##########
File path: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala
##########
@@ -60,17 +60,27 @@ object CarbonScalaUtil {
   private val LOGGER: Logger = LogServiceFactory.getLogService(this.getClass.getCanonicalName)

-  def getString(value: Any,
+  def getString(row: Row,
+      idx: Int,
       serializationNullFormat: String,
       complexDelimiters: util.ArrayList[String],
       timeStampFormat: SimpleDateFormat,
       dateFormat: SimpleDateFormat,
       isVarcharType: Boolean = false,
       isComplexType: Boolean = false,
       level: Int = 0): String = {
-    FieldConverter.objectToString(value, serializationNullFormat, complexDelimiters,
-      timeStampFormat, dateFormat, isVarcharType = isVarcharType, isComplexType = isComplexType,
-      level)
+    try {
+      FieldConverter.objectToString(row.get(idx), serializationNullFormat, complexDelimiters,
+        timeStampFormat, dateFormat, isVarcharType = isVarcharType, isComplexType = isComplexType,
+        level)
+    } catch {
+      case e: Exception =>
+        if (e.getMessage.startsWith(FieldConverter.exceedErrorMsg)) {
+          throw new Exception("Column idx " + idx + " too long", e)

Review comment:
I want to add the column index to the error message.

shenh062326 commented on a change in pull request #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#discussion_r362154286

##########
File path: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonGlobalDictionaryRDD.scala
##########
@@ -297,7 +297,8 @@ class CarbonBlockDistinctValuesCombineRDD(
       val complexDelimiters = new util.ArrayList[String]
       model.delimiters.foreach(x => complexDelimiters.add(x))
       for (i <- 0 until dimNum) {
-        dimensionParsers(i).parseString(CarbonScalaUtil.getString(row.get(i),
+        dimensionParsers(i).parseString(CarbonScalaUtil.getString(row,

Review comment:
done

shenh062326 commented on a change in pull request #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#discussion_r362154300

##########
File path: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala
##########
@@ -60,17 +60,27 @@ object CarbonScalaUtil {
   private val LOGGER: Logger = LogServiceFactory.getLogService(this.getClass.getCanonicalName)

-  def getString(value: Any,
+  def getString(row: Row,

Review comment:
done

shenh062326 commented on a change in pull request #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#discussion_r362154321

##########
File path: streaming/src/main/scala/org/apache/carbondata/streaming/parser/FieldConverter.scala
##########
@@ -50,7 +51,7 @@ object FieldConverter {
     value match {
       case s: String => if (!isVarcharType && !isComplexType &&
                             s.length > CarbonCommonConstants.MAX_CHARS_PER_COLUMN_DEFAULT) {
-        throw new Exception("Dataload failed, String length cannot exceed " +
+        throw new Exception( exceedErrorMsg +

Review comment:
done

CarbonDataQA1 commented on issue #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#issuecomment-569880578
Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/1372/

CarbonDataQA1 commented on issue #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#issuecomment-569887598
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1392/

CarbonDataQA1 commented on issue #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#issuecomment-569889012
Build Failed with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/1382/

CarbonDataQA1 commented on issue #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#issuecomment-569907489
Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/1377/

CarbonDataQA1 commented on issue #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#issuecomment-569913567
Build Failed with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/1387/

CarbonDataQA1 commented on issue #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#issuecomment-569919488
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1398/

CarbonDataQA1 commented on issue #3546: [CARBONDATA-3642] Add column idx in error msg when string length exceed 32000
URL: https://github.com/apache/carbondata/pull/3546#issuecomment-570111631
Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/1387/