ajantha-bhat commented on issue #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#issuecomment-589498137
@QiangCai, @kunal642, @jackylk: PR is ready. Please review.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: [hidden email]
With regards,
Apache Git Services
CarbonDataQA1 commented on issue #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#issuecomment-589501574
Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/378/
CarbonDataQA1 commented on issue #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#issuecomment-589519466
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2080/
ajantha-bhat commented on issue #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#issuecomment-590150487
retest this please
CarbonDataQA1 commented on issue #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#issuecomment-590153543
Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/414/
CarbonDataQA1 commented on issue #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#issuecomment-590163981
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2115/
jackylk commented on a change in pull request #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#discussion_r383369387

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -176,8 +176,21 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
           convertedStaticPartition)
       scanResultRdd = sparkSession.sessionState.executePlan(newLogicalPlan).toRdd
       if (logicalPartitionRelation != null) {
-        logicalPartitionRelation =
-          getReArrangedSchemaLogicalRelation(reArrangedIndex, logicalPartitionRelation)
+        if (selectedColumnSchema.length != logicalPartitionRelation.output.length) {
+          throw new RuntimeException(" schema length doesn't match partition length")
+        }
+        var isAlreadyReArranged = true
+        var index = 0
+        for (col: ColumnSchema <- selectedColumnSchema) {

Review comment:
Please use a lambda function instead of the `for` loop. Maybe findFirst?
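A minimal sketch of what the reviewer's suggestion could look like, replacing the mutable `isAlreadyReArranged`/`index` pair with a single `zip` and `forall`. The diff does not show the loop body, so the comparison rule (case-insensitive name equality, position by position) is an assumption, and `ColumnSchema` and `Attribute` below are hypothetical stand-ins that model only a column name, not the real CarbonData or Spark classes.

```scala
// Hypothetical stand-ins: only the column name is modelled here.
final case class ColumnSchema(columnName: String)
final case class Attribute(name: String)

object ReArrangeCheck {
  // Returns true when every selected column already sits at the same
  // position as the corresponding output attribute.
  // Assumption: "same column" means equal names, ignoring case.
  def isAlreadyReArranged(selected: Seq[ColumnSchema], output: Seq[Attribute]): Boolean = {
    if (selected.length != output.length) {
      throw new RuntimeException("schema length doesn't match partition length")
    }
    // forall stops at the first mismatch, so no index variable or flag is needed.
    selected.zip(output).forall { case (col, attr) =>
      col.columnName.equalsIgnoreCase(attr.name)
    }
  }
}
```

Because `forall` short-circuits, this keeps the early-exit behaviour that a `findFirst`-style search would give, without mutable state.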
jackylk commented on a change in pull request #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#discussion_r383369776

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CommonLoadUtils.scala
##########
@@ -728,7 +728,11 @@ object CommonLoadUtils {
       }
       val updatedRdd: RDD[InternalRow] = rdd.map { internalRow =>
         for (index <- timeStampIndex) {
-          internalRow.setLong(index, internalRow.getLong(index) / 1000)
+          if (internalRow.getLong(index) == 0) {
+            internalRow.setNullAt(index)
+          } else {
+            internalRow.setLong(index, internalRow.getLong(index) / 1000)

Review comment:
What does 1000 stand for? It is a magic number.
jackylk commented on a change in pull request #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#discussion_r383371226

##########
File path: processing/src/main/java/org/apache/carbondata/processing/store/TablePage.java
##########
@@ -99,9 +99,12 @@
     noDictDimensionPages = new ColumnPage[model.getNoDictionaryCount()];
     int tmpNumDictDimIdx = 0;
     int tmpNumNoDictDimIdx = 0;
-    for (int i = 0; i < dictDimensionPages.length + noDictDimensionPages.length; i++) {
+    for (int i = 0; i < tableSpec.getNumDimensions(); i++) {
       TableSpec.DimensionSpec spec = tableSpec.getDimensionSpec(i);
-      ColumnType columnType = tableSpec.getDimensionSpec(i).getColumnType();
+      if (spec.getSchemaDataType().isComplexType()) {
+        // partition columns are placed at the end. so, might present after complex columns

Review comment:
Do you mean to skip all complex columns and go on to the last dimension?
ajantha-bhat commented on a change in pull request #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#discussion_r383377457

##########
File path: processing/src/main/java/org/apache/carbondata/processing/store/TablePage.java
##########
@@ -99,9 +99,12 @@
     noDictDimensionPages = new ColumnPage[model.getNoDictionaryCount()];
     int tmpNumDictDimIdx = 0;
     int tmpNumNoDictDimIdx = 0;
-    for (int i = 0; i < dictDimensionPages.length + noDictDimensionPages.length; i++) {
+    for (int i = 0; i < tableSpec.getNumDimensions(); i++) {
       TableSpec.DimensionSpec spec = tableSpec.getDimensionSpec(i);
-      ColumnType columnType = tableSpec.getDimensionSpec(i).getColumnType();
+      if (spec.getSchemaDataType().isComplexType()) {
+        // partition columns are placed at the end. so, might present after complex columns

Review comment:
Yes, it was skipping them initially as well. I will make it easier to understand.
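The skipping behaviour discussed in this exchange can be sketched as follows: walk every dimension, ignore complex ones, and keep counting, so that a partition column placed after a complex column is still visited. This is only an illustration under stated assumptions. `ColumnKind`, `DimensionSpec` and `DimensionPageCounter` are hypothetical names, not the `TableSpec` API, and the part of the loop body that the diff cuts off is not reproduced here.

```scala
// Hypothetical model of a dimension: a name plus one of three kinds.
sealed trait ColumnKind
case object DictDimension extends ColumnKind
case object NoDictDimension extends ColumnKind
case object ComplexDimension extends ColumnKind

final case class DimensionSpec(name: String, kind: ColumnKind)

object DimensionPageCounter {
  // Returns (dictionary dimension count, no-dictionary dimension count).
  // Complex dimensions are skipped rather than ending the walk, so any
  // dimension that follows them is still counted.
  def countPages(specs: Seq[DimensionSpec]): (Int, Int) =
    specs.foldLeft((0, 0)) {
      case ((dict, noDict), DimensionSpec(_, DictDimension))   => (dict + 1, noDict)
      case ((dict, noDict), DimensionSpec(_, NoDictDimension)) => (dict, noDict + 1)
      case (acc, DimensionSpec(_, ComplexDimension))           => acc
    }
}
```

For a schema ordered as dictionary column, complex column, partition column, the partition column at the end is still counted because the walk covers all dimensions instead of stopping at a count of primitive pages.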
ajantha-bhat commented on a change in pull request #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#discussion_r383378326

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CommonLoadUtils.scala
##########
@@ -728,7 +728,11 @@ object CommonLoadUtils {
       }
       val updatedRdd: RDD[InternalRow] = rdd.map { internalRow =>
         for (index <- timeStampIndex) {
-          internalRow.setLong(index, internalRow.getLong(index) / 1000)
+          if (internalRow.getLong(index) == 0) {
+            internalRow.setNullAt(index)
+          } else {
+            internalRow.setLong(index, internalRow.getLong(index) / 1000)

Review comment:
It is the local timestamp granularity. Let me define it.
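One way the magic number could be named is sketched below. Spark stores `TimestampType` values internally as microseconds since the epoch. The sketch assumes the division by 1000 converts those microseconds to milliseconds, which the thread does not state explicitly. `TimestampGranularity`, `MicrosPerMilli` and `convert` are hypothetical names, and `Option[Long]` stands in for the `setNullAt`/`setLong` calls on `InternalRow` so that the example is self-contained.

```scala
object TimestampGranularity {
  // Assumption: the incoming value is in microseconds and the stored value
  // is in milliseconds, so the factor is microseconds per millisecond.
  val MicrosPerMilli: Long = 1000L

  // Mirrors the branch in the diff: a stored 0 is treated as null (None),
  // any other value is scaled down by the named granularity.
  def convert(value: Long): Option[Long] =
    if (value == 0L) None else Some(value / MicrosPerMilli)
}
```

With a named constant the review question answers itself at the call site, and the null check and the scaling sit in one place instead of being repeated per column index.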
CarbonDataQA1 commented on issue #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#issuecomment-590424436
Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/442/
CarbonDataQA1 commented on issue #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#issuecomment-590451473
Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/446/
CarbonDataQA1 commented on issue #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#issuecomment-590464854
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2142/
CarbonDataQA1 commented on issue #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#issuecomment-590466281
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2146/
ajantha-bhat commented on issue #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#issuecomment-590636742
retest this please
CarbonDataQA1 commented on issue #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#issuecomment-590642360
Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/448/
QiangCai commented on a change in pull request #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#discussion_r383615563

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CommonLoadUtils.scala
##########
@@ -728,7 +728,13 @@ object CommonLoadUtils {
       }
       val updatedRdd: RDD[InternalRow] = rdd.map { internalRow =>
         for (index <- timeStampIndex) {
-          internalRow.setLong(index, internalRow.getLong(index) / 1000)
+          if (internalRow.getLong(index) == 0) {

Review comment:
Why is it 0 and not DIRECT_DICT_VALUE_NULL?
CarbonDataQA1 commented on issue #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#issuecomment-590661153
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2148/
ajantha-bhat commented on a change in pull request #3615: [CARBONDATA-3637] Use optimized insert flow for MV and insert stage command
URL: https://github.com/apache/carbondata/pull/3615#discussion_r383657139

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CommonLoadUtils.scala
##########
@@ -728,7 +728,13 @@ object CommonLoadUtils {
       }
       val updatedRdd: RDD[InternalRow] = rdd.map { internalRow =>
         for (index <- timeStampIndex) {
-          internalRow.setLong(index, internalRow.getLong(index) / 1000)
+          if (internalRow.getLong(index) == 0) {

Review comment:
Because timestamp is not a direct dictionary column; only date is direct dictionary.