ajantha-bhat opened a new pull request #3634: [WIP] Support altertable scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634

### Why is this PR needed?

### What changes were proposed in this PR?

### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)

### Is any new testcase added?
- No
- Yes

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

With regards,
Apache Git Services
CarbonDataQA1 commented on issue #3634: [WIP] Support altertable scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#issuecomment-590030086
Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/404/
CarbonDataQA1 commented on issue #3634: [WIP] Support altertable scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#issuecomment-590031019
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2105/
CarbonDataQA1 commented on issue #3634: [WIP] Support altertable scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#issuecomment-590047904
Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/408/
CarbonDataQA1 commented on issue #3634: [WIP] Support altertable scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#issuecomment-590049330
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2109/
CarbonDataQA1 commented on issue #3634: [WIP] Support altertable scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#issuecomment-590154207
Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/415/
CarbonDataQA1 commented on issue #3634: [WIP] Support altertable scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#issuecomment-590164266
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2116/
CarbonDataQA1 commented on issue #3634: [CARBONDATA-3720] Support alter table scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#issuecomment-590342755
Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/434/
CarbonDataQA1 commented on issue #3634: [CARBONDATA-3720] Support alter table scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#issuecomment-590393218
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2134/
CarbonDataQA1 commented on issue #3634: [CARBONDATA-3720] Support alter table scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#issuecomment-590429664
Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/443/
jackylk commented on a change in pull request #3634: [CARBONDATA-3720] Support alter table scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#discussion_r383388985

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -449,67 +443,48 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
     }
   }

-  private def isAlteredSchema(tableSchema: TableSchema): Boolean = {
-    if (tableInfo.getFactTable.getSchemaEvolution != null) {
-      tableInfo
-        .getFactTable
-        .getSchemaEvolution
-        .getSchemaEvolutionEntryList.asScala.exists(entry =>
-        (entry.getAdded != null && entry.getAdded.size() > 0) ||
-        (entry.getRemoved != null && entry.getRemoved.size() > 0)
-      )
-    } else {
-      false
-    }
-  }
-
   def getReArrangedIndexAndSelectedSchema(
       tableInfo: TableInfo,
       partitionColumnSchema: mutable.Buffer[ColumnSchema]): (Seq[Int], Seq[ColumnSchema]) = {
     var reArrangedIndex: Seq[Int] = Seq()
     var selectedColumnSchema: Seq[ColumnSchema] = Seq()
-    var complexChildCount: Int = 0
     var partitionIndex: Seq[Int] = Seq()
-    val columnSchema = tableInfo.getFactTable.getListOfColumns.asScala
+    val internalOrderColumns: util.ArrayList[CarbonColumn] = new util.ArrayList[CarbonColumn]()
+    internalOrderColumns.addAll(table.getVisibleDimensions)
+    internalOrderColumns.addAll(table.getVisibleMeasures)
+    val columnSchema = internalOrderColumns.asScala.map(col => col.getColumnSchema)
     val partitionColumnNames = if (partitionColumnSchema != null) {
       partitionColumnSchema.map(x => x.getColumnName).toSet
     } else {
       null
     }
-    // get invisible column indexes, alter table scenarios can have it before or after new column
-    // dummy measure will have ordinal -1 and it is invisible, ignore that column.
-    // alter table old columns are just invisible columns with proper ordinal
-    val invisibleIndex = columnSchema.filter(col => col.isInvisible && col.getSchemaOrdinal != -1)
-      .map(col => col.getSchemaOrdinal)
-    columnSchema.filterNot(col => col.isInvisible).foreach {
+    var createOrderColumns = table.getCreateOrderColumn.asScala
+    val createOrderMap = mutable.Map[String, Int]()
+    var createOrderPosition = 0
+    if (partitionColumnNames != null) {
+      // For alter table drop/add column scenarios, partition column may not be in the end.
+      // Need to keep it in the end.
+      createOrderColumns = createOrderColumns.filterNot(col =>
+        partitionColumnNames.contains(col.getColumnSchema.getColumnName)) ++
+        createOrderColumns.filter(col =>
+          partitionColumnNames.contains(col.getColumnSchema.getColumnName))
+    }
+    for (col: CarbonColumn <- createOrderColumns) {

Review comment:
Use foreach instead.
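
To illustrate the reviewer's request, here is a minimal sketch of the same kind of loop written with `foreach`. The quoted diff does not show the loop body, so the sketch assumes it records each column's position by name in `createOrderMap`; `Col` is a hypothetical stand-in for `CarbonColumn`, and `ForeachSketch` and its method names are invented for this example.

```scala
import scala.collection.mutable

// Hypothetical stand-in for CarbonColumn; only the column name is modelled.
final case class Col(name: String)

object ForeachSketch {
  // Maps each column name to its position in create order (assumed loop body).
  def createOrderPositions(cols: Seq[Col]): Map[String, Int] = {
    val positions = mutable.Map[String, Int]()
    var position = 0
    // Replaces: for (col: Col <- cols) { ... }
    cols.foreach { col =>
      positions.put(col.name, position)
      position += 1
    }
    positions.toMap
  }

  // The same mapping built without mutable state.
  def createOrderPositionsZipped(cols: Seq[Col]): Map[String, Int] =
    cols.map(_.name).zipWithIndex.toMap
}
```

If the loop really does only number the columns, the `zipWithIndex` form may be worth considering, since it avoids the mutable counter entirely.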
jackylk commented on a change in pull request #3634: [CARBONDATA-3720] Support alter table scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#discussion_r383389281

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -449,67 +443,48 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
     }
   }

-  private def isAlteredSchema(tableSchema: TableSchema): Boolean = {
-    if (tableInfo.getFactTable.getSchemaEvolution != null) {
-      tableInfo
-        .getFactTable
-        .getSchemaEvolution
-        .getSchemaEvolutionEntryList.asScala.exists(entry =>
-        (entry.getAdded != null && entry.getAdded.size() > 0) ||
-        (entry.getRemoved != null && entry.getRemoved.size() > 0)
-      )
-    } else {
-      false
-    }
-  }
-
   def getReArrangedIndexAndSelectedSchema(
       tableInfo: TableInfo,
       partitionColumnSchema: mutable.Buffer[ColumnSchema]): (Seq[Int], Seq[ColumnSchema]) = {
     var reArrangedIndex: Seq[Int] = Seq()
     var selectedColumnSchema: Seq[ColumnSchema] = Seq()
-    var complexChildCount: Int = 0
     var partitionIndex: Seq[Int] = Seq()
-    val columnSchema = tableInfo.getFactTable.getListOfColumns.asScala
+    val internalOrderColumns: util.ArrayList[CarbonColumn] = new util.ArrayList[CarbonColumn]()
+    internalOrderColumns.addAll(table.getVisibleDimensions)
+    internalOrderColumns.addAll(table.getVisibleMeasures)
+    val columnSchema = internalOrderColumns.asScala.map(col => col.getColumnSchema)

Review comment:
Can you do it like the following?
```
val columnSchema = (table.getVisibleDimensions.asScala ++
  table.getVisibleMeasures.asScala).map(_.getColumnSchema)
```
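
The suggestion replaces the temporary `util.ArrayList` with a concatenation of two converted Java lists. A minimal, self-contained sketch of that pattern follows, assuming the intent is dimensions followed by measures as in the diff. `Column`, `Schema`, and `ConcatSketch` are hypothetical stand-ins, not CarbonData classes; `scala.collection.JavaConverters` is the converter available in Scala 2.11 and 2.12 (it still works, with a deprecation warning, on 2.13).

```scala
import java.util.{List => JList}
import scala.collection.JavaConverters._

// Hypothetical stand-ins for ColumnSchema and CarbonColumn.
final case class Schema(columnName: String)
final case class Column(schema: Schema)

object ConcatSketch {
  // Dimensions first, then measures, each mapped to its schema.
  def columnSchemas(dimensions: JList[Column], measures: JList[Column]): Seq[Schema] =
    (dimensions.asScala ++ measures.asScala).map(_.schema).toSeq
}
```

The result keeps the order of the two inputs, so it should match what the `addAll` calls in the diff produce.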
CarbonDataQA1 commented on issue #3634: [CARBONDATA-3720] Support alter table scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#issuecomment-590469817
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2143/
ajantha-bhat commented on a change in pull request #3634: [CARBONDATA-3720] Support alter table scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#discussion_r383663627

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -449,67 +443,48 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
     }
   }

-  private def isAlteredSchema(tableSchema: TableSchema): Boolean = {
-    if (tableInfo.getFactTable.getSchemaEvolution != null) {
-      tableInfo
-        .getFactTable
-        .getSchemaEvolution
-        .getSchemaEvolutionEntryList.asScala.exists(entry =>
-        (entry.getAdded != null && entry.getAdded.size() > 0) ||
-        (entry.getRemoved != null && entry.getRemoved.size() > 0)
-      )
-    } else {
-      false
-    }
-  }
-
   def getReArrangedIndexAndSelectedSchema(
       tableInfo: TableInfo,
       partitionColumnSchema: mutable.Buffer[ColumnSchema]): (Seq[Int], Seq[ColumnSchema]) = {
     var reArrangedIndex: Seq[Int] = Seq()
     var selectedColumnSchema: Seq[ColumnSchema] = Seq()
-    var complexChildCount: Int = 0
     var partitionIndex: Seq[Int] = Seq()
-    val columnSchema = tableInfo.getFactTable.getListOfColumns.asScala
+    val internalOrderColumns: util.ArrayList[CarbonColumn] = new util.ArrayList[CarbonColumn]()
+    internalOrderColumns.addAll(table.getVisibleDimensions)
+    internalOrderColumns.addAll(table.getVisibleMeasures)
+    val columnSchema = internalOrderColumns.asScala.map(col => col.getColumnSchema)

Review comment:
Yes, it can be done like this now. Previously it was used for something else.
ajantha-bhat commented on a change in pull request #3634: [CARBONDATA-3720] Support alter table scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#discussion_r383665499

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -449,67 +443,48 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
     }
   }

-  private def isAlteredSchema(tableSchema: TableSchema): Boolean = {
-    if (tableInfo.getFactTable.getSchemaEvolution != null) {
-      tableInfo
-        .getFactTable
-        .getSchemaEvolution
-        .getSchemaEvolutionEntryList.asScala.exists(entry =>
-        (entry.getAdded != null && entry.getAdded.size() > 0) ||
-        (entry.getRemoved != null && entry.getRemoved.size() > 0)
-      )
-    } else {
-      false
-    }
-  }
-
   def getReArrangedIndexAndSelectedSchema(
       tableInfo: TableInfo,
       partitionColumnSchema: mutable.Buffer[ColumnSchema]): (Seq[Int], Seq[ColumnSchema]) = {
     var reArrangedIndex: Seq[Int] = Seq()
     var selectedColumnSchema: Seq[ColumnSchema] = Seq()
-    var complexChildCount: Int = 0
     var partitionIndex: Seq[Int] = Seq()
-    val columnSchema = tableInfo.getFactTable.getListOfColumns.asScala
+    val internalOrderColumns: util.ArrayList[CarbonColumn] = new util.ArrayList[CarbonColumn]()
+    internalOrderColumns.addAll(table.getVisibleDimensions)
+    internalOrderColumns.addAll(table.getVisibleMeasures)
+    val columnSchema = internalOrderColumns.asScala.map(col => col.getColumnSchema)

Review comment:
Done.
ajantha-bhat commented on a change in pull request #3634: [CARBONDATA-3720] Support alter table scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#discussion_r383665561

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -449,67 +443,48 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
     }
   }

-  private def isAlteredSchema(tableSchema: TableSchema): Boolean = {
-    if (tableInfo.getFactTable.getSchemaEvolution != null) {
-      tableInfo
-        .getFactTable
-        .getSchemaEvolution
-        .getSchemaEvolutionEntryList.asScala.exists(entry =>
-        (entry.getAdded != null && entry.getAdded.size() > 0) ||
-        (entry.getRemoved != null && entry.getRemoved.size() > 0)
-      )
-    } else {
-      false
-    }
-  }
-
   def getReArrangedIndexAndSelectedSchema(
       tableInfo: TableInfo,
       partitionColumnSchema: mutable.Buffer[ColumnSchema]): (Seq[Int], Seq[ColumnSchema]) = {
     var reArrangedIndex: Seq[Int] = Seq()
     var selectedColumnSchema: Seq[ColumnSchema] = Seq()
-    var complexChildCount: Int = 0
     var partitionIndex: Seq[Int] = Seq()
-    val columnSchema = tableInfo.getFactTable.getListOfColumns.asScala
+    val internalOrderColumns: util.ArrayList[CarbonColumn] = new util.ArrayList[CarbonColumn]()
+    internalOrderColumns.addAll(table.getVisibleDimensions)
+    internalOrderColumns.addAll(table.getVisibleMeasures)
+    val columnSchema = internalOrderColumns.asScala.map(col => col.getColumnSchema)
     val partitionColumnNames = if (partitionColumnSchema != null) {
       partitionColumnSchema.map(x => x.getColumnName).toSet
     } else {
       null
     }
-    // get invisible column indexes, alter table scenarios can have it before or after new column
-    // dummy measure will have ordinal -1 and it is invisible, ignore that column.
-    // alter table old columns are just invisible columns with proper ordinal
-    val invisibleIndex = columnSchema.filter(col => col.isInvisible && col.getSchemaOrdinal != -1)
-      .map(col => col.getSchemaOrdinal)
-    columnSchema.filterNot(col => col.isInvisible).foreach {
+    var createOrderColumns = table.getCreateOrderColumn.asScala
+    val createOrderMap = mutable.Map[String, Int]()
+    var createOrderPosition = 0
+    if (partitionColumnNames != null) {
+      // For alter table drop/add column scenarios, partition column may not be in the end.
+      // Need to keep it in the end.
+      createOrderColumns = createOrderColumns.filterNot(col =>
+        partitionColumnNames.contains(col.getColumnSchema.getColumnName)) ++
+        createOrderColumns.filter(col =>
+          partitionColumnNames.contains(col.getColumnSchema.getColumnName))
+    }
+    for (col: CarbonColumn <- createOrderColumns) {

Review comment:
Done.
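
The diff quoted above moves partition columns to the end of the create-order list with a `filterNot ... ++ filter` pair. As a rough sketch of that reordering, the same result can be had from a single `partition` call. This is an illustration, not the code merged in the PR: it works on plain column names, and `PartitionLastSketch` and `partitionColumnsLast` are hypothetical names.

```scala
object PartitionLastSketch {
  // Moves partition columns to the end; relative order is kept within both groups.
  def partitionColumnsLast(createOrder: Seq[String], partitionNames: Set[String]): Seq[String] = {
    // partition returns (elements matching the predicate, the rest), in one pass.
    val (partitionCols, otherCols) = createOrder.partition(partitionNames.contains)
    otherCols ++ partitionCols
  }
}
```

For example, with create order `a, p, b, c` and partition column `p`, the sketch yields `a, b, c, p`, which is the "keep it in the end" behaviour the diff comment describes.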
CarbonDataQA1 commented on issue #3634: [CARBONDATA-3720] Support alter table scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#issuecomment-590695653
Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/452/
CarbonDataQA1 commented on issue #3634: [CARBONDATA-3720] Support alter table scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#issuecomment-590716487
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2153/
ajantha-bhat commented on issue #3634: [CARBONDATA-3720] Support alter table scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#issuecomment-590718925
@jackylk: Handled the comments. Build passed. PR is ready.
jackylk commented on issue #3634: [CARBONDATA-3720] Support alter table scenario for new insert into flow
URL: https://github.com/apache/carbondata/pull/3634#issuecomment-590928581
LGTM