Apache CarbonData Dev Mailing List archive

[GitHub] [carbondata] akkio-97 opened a new pull request #4129: [WIP] alter rename complex types

Classic

List

136 messages Options

Options

12345 ... 7

GitBox

[GitHub] [carbondata] akkio-97 commented on a change in pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

akkio-97 commented on a change in pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#discussion_r638737516

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableColRenameDataTypeChangeCommand.scala
##########
@@ -217,20 +310,52 @@ private[sql] case class CarbonAlterTableColRenameDataTypeChangeCommand(
.setPrecision(newColumnPrecision)
columnSchema.setScale(newColumnScale)
}
- if (newColumnComment.isDefined && columnSchema.getColumnProperties != null) {
- columnSchema.getColumnProperties.put(
- CarbonCommonConstants.COLUMN_COMMENT, newColumnComment.get)
- } else if (newColumnComment.isDefined) {
- val newColumnProperties = new util.HashMap[String, String]
- newColumnProperties.put(CarbonCommonConstants.COLUMN_COMMENT, newColumnComment.get)
- columnSchema.setColumnProperties(newColumnProperties)
+ // only table columns are eligible to have comment
+ if (!isComplexChild(columnSchema)) {
+ if (newColumnComment.isDefined && columnSchema.getColumnProperties != null) {
+ columnSchema.getColumnProperties.put(
+ CarbonCommonConstants.COLUMN_COMMENT, newColumnComment.get)
+ } else if (newColumnComment.isDefined) {
+ val newColumnProperties = new util.HashMap[String, String]
+ newColumnProperties.put(CarbonCommonConstants.COLUMN_COMMENT, newColumnComment.get)
+ columnSchema.setColumnProperties(newColumnProperties)
+ }
+ addColumnSchema = columnSchema
+ timeStamp = System.currentTimeMillis()
+ // make a new schema evolution entry after column rename or datatype change
+ schemaEvolutionEntry = AlterTableUtil
+ .addNewSchemaEvolutionEntry(timeStamp, addColumnSchema, deletedColumnSchema)
+ }
+ }
+
+ if (alteredColumnNamesMap.nonEmpty) {
+ // if complex-child or its children has been renamed
+ if (alteredColumnNamesMap.contains(columnSchemaName)) {
+ // matches exactly
+ val newComplexChildName = alteredColumnNamesMap(columnSchemaName)
+ columnSchema.setColumn_name(newComplexChildName)
+ } else {
+ val oldParent = checkIfParentIsAltered(columnSchemaName)

Review comment:
alteredColumnNamesMap only contains those columns whose names have been changed. Its children's names should also be changed according to the parent name change.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] akkio-97 commented on a change in pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

akkio-97 commented on a change in pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#discussion_r638737516

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableColRenameDataTypeChangeCommand.scala
##########
@@ -217,20 +310,52 @@ private[sql] case class CarbonAlterTableColRenameDataTypeChangeCommand(
.setPrecision(newColumnPrecision)
columnSchema.setScale(newColumnScale)
}
- if (newColumnComment.isDefined && columnSchema.getColumnProperties != null) {
- columnSchema.getColumnProperties.put(
- CarbonCommonConstants.COLUMN_COMMENT, newColumnComment.get)
- } else if (newColumnComment.isDefined) {
- val newColumnProperties = new util.HashMap[String, String]
- newColumnProperties.put(CarbonCommonConstants.COLUMN_COMMENT, newColumnComment.get)
- columnSchema.setColumnProperties(newColumnProperties)
+ // only table columns are eligible to have comment
+ if (!isComplexChild(columnSchema)) {
+ if (newColumnComment.isDefined && columnSchema.getColumnProperties != null) {
+ columnSchema.getColumnProperties.put(
+ CarbonCommonConstants.COLUMN_COMMENT, newColumnComment.get)
+ } else if (newColumnComment.isDefined) {
+ val newColumnProperties = new util.HashMap[String, String]
+ newColumnProperties.put(CarbonCommonConstants.COLUMN_COMMENT, newColumnComment.get)
+ columnSchema.setColumnProperties(newColumnProperties)
+ }
+ addColumnSchema = columnSchema
+ timeStamp = System.currentTimeMillis()
+ // make a new schema evolution entry after column rename or datatype change
+ schemaEvolutionEntry = AlterTableUtil
+ .addNewSchemaEvolutionEntry(timeStamp, addColumnSchema, deletedColumnSchema)
+ }
+ }
+
+ if (alteredColumnNamesMap.nonEmpty) {
+ // if complex-child or its children has been renamed
+ if (alteredColumnNamesMap.contains(columnSchemaName)) {
+ // matches exactly
+ val newComplexChildName = alteredColumnNamesMap(columnSchemaName)
+ columnSchema.setColumn_name(newComplexChildName)
+ } else {
+ val oldParent = checkIfParentIsAltered(columnSchemaName)

Review comment:
alteredColumnNamesMap only contains those columns whose names have been changed. so if parent name has changed, children names will also be changed. But since this is an indirect change we are not adding them to the map.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] akkio-97 commented on a change in pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

akkio-97 commented on a change in pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#discussion_r638738643

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableColRenameDataTypeChangeCommand.scala
##########
@@ -279,6 +404,23 @@ private[sql] case class CarbonAlterTableColRenameDataTypeChangeCommand(
Seq.empty
}

+ private def isComplexChild(columnSchema: ColumnSchema): Boolean = {
+ columnSchema.column_name.contains(CarbonCommonConstants.POINT)
+ }
+
+ private def isChildOfOldColumn(columnSchemaName: String, oldColumnName: String): Boolean = {
+ columnSchemaName.startsWith(oldColumnName + CarbonCommonConstants.POINT)
+ }
+
+ private def checkIfParentIsAltered(columnSchemaName: String): String = {

Review comment:
an incoming child name will be altered if its parent has been altered. This method checks this scenario

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] akkio-97 commented on a change in pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

akkio-97 commented on a change in pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#discussion_r638737516

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableColRenameDataTypeChangeCommand.scala
##########
@@ -217,20 +310,52 @@ private[sql] case class CarbonAlterTableColRenameDataTypeChangeCommand(
.setPrecision(newColumnPrecision)
columnSchema.setScale(newColumnScale)
}
- if (newColumnComment.isDefined && columnSchema.getColumnProperties != null) {
- columnSchema.getColumnProperties.put(
- CarbonCommonConstants.COLUMN_COMMENT, newColumnComment.get)
- } else if (newColumnComment.isDefined) {
- val newColumnProperties = new util.HashMap[String, String]
- newColumnProperties.put(CarbonCommonConstants.COLUMN_COMMENT, newColumnComment.get)
- columnSchema.setColumnProperties(newColumnProperties)
+ // only table columns are eligible to have comment
+ if (!isComplexChild(columnSchema)) {
+ if (newColumnComment.isDefined && columnSchema.getColumnProperties != null) {
+ columnSchema.getColumnProperties.put(
+ CarbonCommonConstants.COLUMN_COMMENT, newColumnComment.get)
+ } else if (newColumnComment.isDefined) {
+ val newColumnProperties = new util.HashMap[String, String]
+ newColumnProperties.put(CarbonCommonConstants.COLUMN_COMMENT, newColumnComment.get)
+ columnSchema.setColumnProperties(newColumnProperties)
+ }
+ addColumnSchema = columnSchema
+ timeStamp = System.currentTimeMillis()
+ // make a new schema evolution entry after column rename or datatype change
+ schemaEvolutionEntry = AlterTableUtil
+ .addNewSchemaEvolutionEntry(timeStamp, addColumnSchema, deletedColumnSchema)
+ }
+ }
+
+ if (alteredColumnNamesMap.nonEmpty) {
+ // if complex-child or its children has been renamed
+ if (alteredColumnNamesMap.contains(columnSchemaName)) {
+ // matches exactly
+ val newComplexChildName = alteredColumnNamesMap(columnSchemaName)
+ columnSchema.setColumn_name(newComplexChildName)
+ } else {
+ val oldParent = checkIfParentIsAltered(columnSchemaName)

Review comment:
alteredColumnNamesMap only contains those columns whose names have been changed (could be tableColumn or children). so if parent name has changed, children names will also be changed. But since this is an indirect change we are not adding them to the map.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] akkio-97 commented on a change in pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

akkio-97 commented on a change in pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#discussion_r638743477

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableColRenameDataTypeChangeCommand.scala
##########
@@ -195,18 +275,31 @@ private[sql] case class CarbonAlterTableColRenameDataTypeChangeCommand(
var deletedColumnSchema: ColumnSchema = null
var schemaEvolutionEntry: SchemaEvolutionEntry = null
val columnSchemaList = tableInfo.fact_table.table_columns.asScala.filter(!_.isInvisible)
+ // to validate duplicate children columns
+ var UniqueColumnSet: mutable.Set[String] = mutable.Set()

Review comment:
I feel this this place was okay to validate it. In validColumnsForRenaming() we only have a list of table columns. Also in validateComplexStructure() we traverse only the specific column that is being altered.
Only columnSchemaList contains all columns (not only altered column but also other columns) as a flat structure,

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] akkio-97 commented on a change in pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

akkio-97 commented on a change in pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#discussion_r638743477

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableColRenameDataTypeChangeCommand.scala
##########
@@ -195,18 +275,31 @@ private[sql] case class CarbonAlterTableColRenameDataTypeChangeCommand(
var deletedColumnSchema: ColumnSchema = null
var schemaEvolutionEntry: SchemaEvolutionEntry = null
val columnSchemaList = tableInfo.fact_table.table_columns.asScala.filter(!_.isInvisible)
+ // to validate duplicate children columns
+ var UniqueColumnSet: mutable.Set[String] = mutable.Set()

Review comment:
I feel this this place was okay to validate it. In validColumnsForRenaming() we only have a list of table columns(no children columns). Also in validateComplexStructure() we traverse only the specific column that is being altered.
Only columnSchemaList contains all columns (not only altered column but also other columns) as a flat structure,

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] akkio-97 commented on a change in pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

akkio-97 commented on a change in pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#discussion_r638746382

##########
File path: integration/spark/src/test/scala/org/apache/spark/carbondata/restructure/vectorreader/AlterTableColumnRenameTestCase.scala
##########
@@ -48,6 +52,262 @@ class AlterTableColumnRenameTestCase extends QueryTest with BeforeAndAfterAll {
sql("drop table simple_table")
}

+ test("Rename more than one column at a time in one operation") {
+ sql("drop table if exists test_rename")
+ sql("CREATE TABLE test_rename (str struct<a:struct<b:int, d:int>, c:int>) STORED AS carbondata")
+ sql("insert into test_rename values(named_struct('a11',named_struct('b2',12,'d',12), 'c', 12))")

Review comment:
Insertion is done in order to test the query flow part. Where Table-column is matched with the Query-column. I suggest letting them be.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] akkio-97 commented on a change in pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

akkio-97 commented on a change in pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#discussion_r638746382

##########
File path: integration/spark/src/test/scala/org/apache/spark/carbondata/restructure/vectorreader/AlterTableColumnRenameTestCase.scala
##########
@@ -48,6 +52,262 @@ class AlterTableColumnRenameTestCase extends QueryTest with BeforeAndAfterAll {
sql("drop table simple_table")
}

+ test("Rename more than one column at a time in one operation") {
+ sql("drop table if exists test_rename")
+ sql("CREATE TABLE test_rename (str struct<a:struct<b:int, d:int>, c:int>) STORED AS carbondata")
+ sql("insert into test_rename values(named_struct('a11',named_struct('b2',12,'d',12), 'c', 12))")

Review comment:
Insertion is done in order to test the query flow part. Where Table-column is matched with the Query-column. Also in this test case we have multiple rename operations, which is prone to throw errors. So I suggest letting them be.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] akkio-97 commented on a change in pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

akkio-97 commented on a change in pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#discussion_r638754972

##########
File path: integration/spark/src/test/scala/org/apache/spark/carbondata/restructure/vectorreader/AlterTableColumnRenameTestCase.scala
##########
@@ -48,6 +52,262 @@ class AlterTableColumnRenameTestCase extends QueryTest with BeforeAndAfterAll {
sql("drop table simple_table")
}

+ test("Rename more than one column at a time in one operation") {
+ sql("drop table if exists test_rename")
+ sql("CREATE TABLE test_rename (str struct<a:struct<b:int, d:int>, c:int>) STORED AS carbondata")
+ sql("insert into test_rename values(named_struct('a11',named_struct('b2',12,'d',12), 'c', 12))")
+ sql("alter table test_rename change str str22 struct<a11:struct<b2:int, d:int>, c:int>")
+ sql("insert into test_rename values(named_struct('a11',named_struct('b2',24,'d',24), 'c', 24))")
+
+ val rows = sql("select str22.a11.b2 from test_rename").collect()
+ assert(rows(0).equals(Row(12)) && rows(1).equals(Row(24)))
+ // check if old column names are still present
+ val ex1 = intercept[AnalysisException] {
+ sql("select str from test_rename").show(false)
+ }
+ assert(ex1.getMessage
+ .contains("cannot resolve '`str`' given input columns: [test_rename.str22]"))
+
+ val ex2 = intercept[AnalysisException] {
+ sql("select str.a from test_rename").show(false)
+ }
+ assert(ex2.getMessage
+ .contains("cannot resolve '`str.a`' given input columns: [test_rename.str22]"))
+
+ // check un-altered columns
+ val rows1 = sql("select str22.c from test_rename").collect()
+ val rows2 = sql("select str22.a11.d from test_rename").collect()
+ assert(rows1.sameElements(Array(Row(12), Row(24))))
+ assert(rows2.sameElements(Array(Row(12), Row(24))))
+ }
+
+ test("rename complex columns with invalid structure/duplicate names/Map type") {
+ sql("drop table if exists test_rename")
+ sql(
+ "CREATE TABLE test_rename (str struct<a:int,b:long>, str2 struct<a:int,b:long>, map1 " +
+ "map<string, string>, str3 struct<a:int, b:map<string, string>>) STORED AS carbondata")
+
+ val ex1 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str str struct<a:array<int>,b:long>")
+ }
+ assert(ex1.getMessage
+ .contains(
+ "column rename operation failed: because datatypes of complex children cannot be altered"))
+
+ val ex2 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str str struct<a:int,b:long,c:int>")
+ }
+ assert(ex2.getMessage
+ .contains(
+ "column rename operation failed: because number of children of old and new complex " +
+ "columns are not the same"))
+
+ val ex3 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str str int")
+ }
+ assert(ex3.getMessage
+ .contains(
+ "column rename operation failed: because old and new complex columns are not compatible " +
+ "in structure"))
+
+ val ex4 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str str struct<a:int,a:long>")
+ }
+ assert(ex4.getMessage
+ .contains(
+ "column rename operation failed: because duplicate columns are present"))
+
+ val ex5 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str str2 struct<a:int,b:long>")
+ }
+ assert(ex5.getMessage
+ .contains(
+ "Column Rename Operation failed. New column name str2 already exists in table test_rename"))
+
+ val ex6 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change map1 map2 map<string, struct<a:int>>")
+ }
+ assert(ex6.getMessage
+ .contains("rename operation failed: cannot alter map type column"))
+
+ val ex7 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str3 str33 struct<a:int, bc:map<string, string>>")
+ }
+ assert(ex7.getMessage
+ .contains(
+ "rename operation failed: cannot alter complex structure that includes map type column"))
+ }
+
+ def checkAnswerUtil1(df1: DataFrame, df2: DataFrame, df3: DataFrame) {
+ checkAnswer(df1, Seq(Row(Row(Row(2)))))
+ checkAnswer(df2, Seq(Row(Row(2))))
+ checkAnswer(df3, Seq(Row(2)))
+ }
+
+ def checkAnswerUtil2(df1: DataFrame, df2: DataFrame, df3: DataFrame) {
+ checkAnswer(df1, Seq(Row(Row(Row(2))), Row(Row(Row(3)))))
+ checkAnswer(df2, Seq(Row(Row(2)), Row(Row(3))))
+ checkAnswer(df3, Seq(Row(2), Row(3)))
+ }
+
+ test("test alter rename struct of (primitive/struct/array)") {
+ sql("drop table if exists test_rename")
+ sql("CREATE TABLE test_rename (str1 struct<a:int>, str2 struct<a:struct<b:int>>, str3 " +
+ "struct<a:struct<b:struct<c:int>>>, intfield int) STORED AS carbondata")
+ sql("insert into test_rename values(named_struct('a', 2), " +
+ "named_struct('a', named_struct('b', 2)), named_struct('a', named_struct('b', " +
+ "named_struct('c', 2))), 1)")
+
+ // rename parent column from str2 to str22 and read old rows
+ sql("alter table test_rename change str2 str22 struct<a:struct<b:int>>")
+ var df1 = sql("select str22 from test_rename")
+ var df2 = sql("select str22.a from test_rename")
+ var df3 = sql("select str22.a.b from test_rename")
+ assert(df1.collect().size == 1 && df2.collect().size == 1 && df3.collect().size == 1)
+ checkAnswerUtil1(df1, df2, df3)
+
+ // rename child column from a to a11
+ sql("alter table test_rename change str22 str22 struct<a11:struct<b:int>>")
+ df1 = sql("select str22 from test_rename")
+ df2 = sql("select str22.a11 from test_rename")
+ df3 = sql("select str22.a11.b from test_rename")
+ assert(df1.collect().size == 1 && df2.collect().size == 1 && df3.collect().size == 1)
+ checkAnswerUtil1(df1, df2, df3)
+
+ // rename parent column from str22 to str33
+ sql("alter table test_rename change str22 str33 struct<a11:struct<b:int>>")
+ df1 = sql("select str33 from test_rename")
+ df2 = sql("select str33.a11 from test_rename")
+ df3 = sql("select str33.a11.b from test_rename")
+ assert(df1.collect().size == 1 && df2.collect().size == 1 && df3.collect().size == 1)
+ checkAnswerUtil1(df1, df2, df3)
+
+ // insert new rows
+ sql("insert into test_rename values(named_struct('a', 3), " +
+ "named_struct('a', named_struct('b', 3)), named_struct('a', named_struct('b', " +
+ "named_struct('c', 3))), 2)")
+ df1 = sql("select str33 from test_rename")
+ df2 = sql("select str33.a11 from test_rename")
+ df3 = sql("select str33.a11.b from test_rename")
+ assert(df1.collect().size == 2 && df2.collect().size == 2 && df3.collect().size == 2)
+ checkAnswerUtil2(df1, df2, df3)
+
+ sql("alter table test_rename change str33 str33 struct<a11:struct<b11:int>>")
+ sql("alter table test_rename change str33 str33 struct<a22:struct<b11:int>>")
+ df1 = sql("select str33 from test_rename")
+ df2 = sql("select str33.a22 from test_rename")
+ df3 = sql("select str33.a22.b11 from test_rename")
+ assert(df1.collect().size == 2 && df2.collect().size == 2 && df3.collect().size == 2)
+ checkAnswerUtil2(df1, df2, df3)
+
+ val desc = sql("desc table test_rename").collect()
+ assert(desc(0)(0).equals("str1"))
+ assert(desc(1)(0).equals("str33"))
+ assert(desc(1)(1).equals("struct<a22:struct<b11:int>>"))
+ assert(desc(2)(0).equals("str3"))
+ }
+
+ test("test alter rename array of (primitive/array/struct)") {
+ sql("drop table if exists test_rename")
+ sql(
+ "CREATE TABLE test_rename (arr1 array<int>, arr2 array<array<int>>, arr3 array<string>, " +
+ "arr4 array<struct<a:int>>) STORED AS carbondata")
+ sql(
+ "insert into test_rename values (array(1,2,3), array(array(1,2),array(3,4)), array('hello'," +
+ "'world'), array(named_struct('a',45)))")
+
+ sql("alter table test_rename change arr1 arr11 array<int>")

Review comment:
parent name is changed. We are using it to query right?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#issuecomment-848318679

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3690/

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#issuecomment-848319146

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5435/

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] akkio-97 commented on a change in pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

akkio-97 commented on a change in pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#discussion_r639462982

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchemaCommon.scala
##########
@@ -160,7 +159,18 @@ case class DropPartitionCallableModel(carbonLoadModel: CarbonLoadModel,
carbonTable: CarbonTable,
sqlContext: SQLContext)

-case class DataTypeInfo(dataType: String, precision: Int = 0, scale: Int = 0)
+case class DataTypeInfo(dataType: String,
+ name: String,
+ precision: Int = 0,
+ scale: Int = 0) {
+ private var children: List[DataTypeInfo] = null

Review comment:
done

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] akkio-97 commented on a change in pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

akkio-97 commented on a change in pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#discussion_r639463175

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableColRenameDataTypeChangeCommand.scala
##########
@@ -217,20 +310,52 @@ private[sql] case class CarbonAlterTableColRenameDataTypeChangeCommand(
.setPrecision(newColumnPrecision)
columnSchema.setScale(newColumnScale)
}
- if (newColumnComment.isDefined && columnSchema.getColumnProperties != null) {
- columnSchema.getColumnProperties.put(
- CarbonCommonConstants.COLUMN_COMMENT, newColumnComment.get)
- } else if (newColumnComment.isDefined) {
- val newColumnProperties = new util.HashMap[String, String]
- newColumnProperties.put(CarbonCommonConstants.COLUMN_COMMENT, newColumnComment.get)
- columnSchema.setColumnProperties(newColumnProperties)
+ // only table columns are eligible to have comment
+ if (!isComplexChild(columnSchema)) {
+ if (newColumnComment.isDefined && columnSchema.getColumnProperties != null) {
+ columnSchema.getColumnProperties.put(
+ CarbonCommonConstants.COLUMN_COMMENT, newColumnComment.get)
+ } else if (newColumnComment.isDefined) {
+ val newColumnProperties = new util.HashMap[String, String]
+ newColumnProperties.put(CarbonCommonConstants.COLUMN_COMMENT, newColumnComment.get)
+ columnSchema.setColumnProperties(newColumnProperties)
+ }
+ addColumnSchema = columnSchema
+ timeStamp = System.currentTimeMillis()
+ // make a new schema evolution entry after column rename or datatype change
+ schemaEvolutionEntry = AlterTableUtil

Review comment:
done

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableColRenameDataTypeChangeCommand.scala
##########
@@ -195,18 +275,31 @@ private[sql] case class CarbonAlterTableColRenameDataTypeChangeCommand(
var deletedColumnSchema: ColumnSchema = null
var schemaEvolutionEntry: SchemaEvolutionEntry = null
val columnSchemaList = tableInfo.fact_table.table_columns.asScala.filter(!_.isInvisible)
+ // to validate duplicate children columns
+ var UniqueColumnSet: mutable.Set[String] = mutable.Set()
+

columnSchemaList.foreach { columnSchema =>
- if (columnSchema.column_name.equalsIgnoreCase(oldColumnName)) {
+ val columnSchemaName = columnSchema.column_name
+ // column to be renamed is a parent/table column or complex child column
+ if (columnSchemaName.equalsIgnoreCase(oldColumnName) ||
+ isChildOfOldColumn(columnSchemaName, oldColumnName)) {
deletedColumnSchema = columnSchema.deepCopy()
- if (alterTableColRenameAndDataTypeChangeModel.isColumnRename) {
- // if only column rename, just get the column schema and rename, make a
+ // if the table column name has been altered
+ if (isColumnRename) {
+ // if only table column rename, just get the column schema and rename, make a
// schemaEvolutionEntry
- columnSchema.setColumn_name(newColumnName)
+ if (isChildOfOldColumn(columnSchemaName, oldColumnName)) {

Review comment:
done

##########
File path: integration/spark/src/test/scala/org/apache/spark/carbondata/restructure/vectorreader/AlterTableColumnRenameTestCase.scala
##########
@@ -48,6 +52,262 @@ class AlterTableColumnRenameTestCase extends QueryTest with BeforeAndAfterAll {
sql("drop table simple_table")
}

+ test("Rename more than one column at a time in one operation") {
+ sql("drop table if exists test_rename")
+ sql("CREATE TABLE test_rename (str struct<a:struct<b:int, d:int>, c:int>) STORED AS carbondata")
+ sql("insert into test_rename values(named_struct('a11',named_struct('b2',12,'d',12), 'c', 12))")
+ sql("alter table test_rename change str str22 struct<a11:struct<b2:int, d:int>, c:int>")
+ sql("insert into test_rename values(named_struct('a11',named_struct('b2',24,'d',24), 'c', 24))")
+
+ val rows = sql("select str22.a11.b2 from test_rename").collect()
+ assert(rows(0).equals(Row(12)) && rows(1).equals(Row(24)))
+ // check if old column names are still present
+ val ex1 = intercept[AnalysisException] {
+ sql("select str from test_rename").show(false)
+ }
+ assert(ex1.getMessage
+ .contains("cannot resolve '`str`' given input columns: [test_rename.str22]"))
+
+ val ex2 = intercept[AnalysisException] {
+ sql("select str.a from test_rename").show(false)
+ }
+ assert(ex2.getMessage
+ .contains("cannot resolve '`str.a`' given input columns: [test_rename.str22]"))
+
+ // check un-altered columns
+ val rows1 = sql("select str22.c from test_rename").collect()
+ val rows2 = sql("select str22.a11.d from test_rename").collect()
+ assert(rows1.sameElements(Array(Row(12), Row(24))))
+ assert(rows2.sameElements(Array(Row(12), Row(24))))
+ }
+
+ test("rename complex columns with invalid structure/duplicate names/Map type") {
+ sql("drop table if exists test_rename")
+ sql(
+ "CREATE TABLE test_rename (str struct<a:int,b:long>, str2 struct<a:int,b:long>, map1 " +
+ "map<string, string>, str3 struct<a:int, b:map<string, string>>) STORED AS carbondata")
+
+ val ex1 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str str struct<a:array<int>,b:long>")
+ }
+ assert(ex1.getMessage
+ .contains(
+ "column rename operation failed: because datatypes of complex children cannot be altered"))
+
+ val ex2 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str str struct<a:int,b:long,c:int>")
+ }
+ assert(ex2.getMessage
+ .contains(
+ "column rename operation failed: because number of children of old and new complex " +
+ "columns are not the same"))
+
+ val ex3 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str str int")
+ }
+ assert(ex3.getMessage
+ .contains(
+ "column rename operation failed: because old and new complex columns are not compatible " +
+ "in structure"))
+
+ val ex4 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str str struct<a:int,a:long>")
+ }
+ assert(ex4.getMessage
+ .contains(
+ "column rename operation failed: because duplicate columns are present"))
+
+ val ex5 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str str2 struct<a:int,b:long>")
+ }
+ assert(ex5.getMessage
+ .contains(
+ "Column Rename Operation failed. New column name str2 already exists in table test_rename"))
+
+ val ex6 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change map1 map2 map<string, struct<a:int>>")
+ }
+ assert(ex6.getMessage
+ .contains("rename operation failed: cannot alter map type column"))
+
+ val ex7 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str3 str33 struct<a:int, bc:map<string, string>>")
+ }
+ assert(ex7.getMessage
+ .contains(
+ "rename operation failed: cannot alter complex structure that includes map type column"))
+ }
+
+ def checkAnswerUtil1(df1: DataFrame, df2: DataFrame, df3: DataFrame) {
+ checkAnswer(df1, Seq(Row(Row(Row(2)))))
+ checkAnswer(df2, Seq(Row(Row(2))))
+ checkAnswer(df3, Seq(Row(2)))
+ }
+
+ def checkAnswerUtil2(df1: DataFrame, df2: DataFrame, df3: DataFrame) {
+ checkAnswer(df1, Seq(Row(Row(Row(2))), Row(Row(Row(3)))))
+ checkAnswer(df2, Seq(Row(Row(2)), Row(Row(3))))
+ checkAnswer(df3, Seq(Row(2), Row(3)))
+ }
+
+ test("test alter rename struct of (primitive/struct/array)") {
+ sql("drop table if exists test_rename")
+ sql("CREATE TABLE test_rename (str1 struct<a:int>, str2 struct<a:struct<b:int>>, str3 " +
+ "struct<a:struct<b:struct<c:int>>>, intfield int) STORED AS carbondata")
+ sql("insert into test_rename values(named_struct('a', 2), " +
+ "named_struct('a', named_struct('b', 2)), named_struct('a', named_struct('b', " +
+ "named_struct('c', 2))), 1)")
+
+ // rename parent column from str2 to str22 and read old rows
+ sql("alter table test_rename change str2 str22 struct<a:struct<b:int>>")
+ var df1 = sql("select str22 from test_rename")
+ var df2 = sql("select str22.a from test_rename")
+ var df3 = sql("select str22.a.b from test_rename")
+ assert(df1.collect().size == 1 && df2.collect().size == 1 && df3.collect().size == 1)
+ checkAnswerUtil1(df1, df2, df3)
+
+ // rename child column from a to a11
+ sql("alter table test_rename change str22 str22 struct<a11:struct<b:int>>")
+ df1 = sql("select str22 from test_rename")
+ df2 = sql("select str22.a11 from test_rename")
+ df3 = sql("select str22.a11.b from test_rename")
+ assert(df1.collect().size == 1 && df2.collect().size == 1 && df3.collect().size == 1)
+ checkAnswerUtil1(df1, df2, df3)
+
+ // rename parent column from str22 to str33
+ sql("alter table test_rename change str22 str33 struct<a11:struct<b:int>>")
+ df1 = sql("select str33 from test_rename")
+ df2 = sql("select str33.a11 from test_rename")
+ df3 = sql("select str33.a11.b from test_rename")
+ assert(df1.collect().size == 1 && df2.collect().size == 1 && df3.collect().size == 1)
+ checkAnswerUtil1(df1, df2, df3)
+
+ // insert new rows
+ sql("insert into test_rename values(named_struct('a', 3), " +
+ "named_struct('a', named_struct('b', 3)), named_struct('a', named_struct('b', " +
+ "named_struct('c', 3))), 2)")
+ df1 = sql("select str33 from test_rename")
+ df2 = sql("select str33.a11 from test_rename")
+ df3 = sql("select str33.a11.b from test_rename")
+ assert(df1.collect().size == 2 && df2.collect().size == 2 && df3.collect().size == 2)
+ checkAnswerUtil2(df1, df2, df3)
+
+ sql("alter table test_rename change str33 str33 struct<a11:struct<b11:int>>")
+ sql("alter table test_rename change str33 str33 struct<a22:struct<b11:int>>")
+ df1 = sql("select str33 from test_rename")
+ df2 = sql("select str33.a22 from test_rename")
+ df3 = sql("select str33.a22.b11 from test_rename")
+ assert(df1.collect().size == 2 && df2.collect().size == 2 && df3.collect().size == 2)
+ checkAnswerUtil2(df1, df2, df3)
+
+ val desc = sql("desc table test_rename").collect()
+ assert(desc(0)(0).equals("str1"))
+ assert(desc(1)(0).equals("str33"))
+ assert(desc(1)(1).equals("struct<a22:struct<b11:int>>"))
+ assert(desc(2)(0).equals("str3"))

Review comment:
done

##########
File path: integration/spark/src/test/scala/org/apache/spark/carbondata/restructure/vectorreader/AlterTableColumnRenameTestCase.scala
##########
@@ -48,6 +52,262 @@ class AlterTableColumnRenameTestCase extends QueryTest with BeforeAndAfterAll {
sql("drop table simple_table")
}

+ test("Rename more than one column at a time in one operation") {
+ sql("drop table if exists test_rename")
+ sql("CREATE TABLE test_rename (str struct<a:struct<b:int, d:int>, c:int>) STORED AS carbondata")
+ sql("insert into test_rename values(named_struct('a11',named_struct('b2',12,'d',12), 'c', 12))")
+ sql("alter table test_rename change str str22 struct<a11:struct<b2:int, d:int>, c:int>")
+ sql("insert into test_rename values(named_struct('a11',named_struct('b2',24,'d',24), 'c', 24))")
+
+ val rows = sql("select str22.a11.b2 from test_rename").collect()
+ assert(rows(0).equals(Row(12)) && rows(1).equals(Row(24)))
+ // check if old column names are still present
+ val ex1 = intercept[AnalysisException] {
+ sql("select str from test_rename").show(false)
+ }
+ assert(ex1.getMessage
+ .contains("cannot resolve '`str`' given input columns: [test_rename.str22]"))
+
+ val ex2 = intercept[AnalysisException] {
+ sql("select str.a from test_rename").show(false)
+ }
+ assert(ex2.getMessage
+ .contains("cannot resolve '`str.a`' given input columns: [test_rename.str22]"))
+
+ // check un-altered columns
+ val rows1 = sql("select str22.c from test_rename").collect()
+ val rows2 = sql("select str22.a11.d from test_rename").collect()
+ assert(rows1.sameElements(Array(Row(12), Row(24))))
+ assert(rows2.sameElements(Array(Row(12), Row(24))))
+ }
+
+ test("rename complex columns with invalid structure/duplicate names/Map type") {
+ sql("drop table if exists test_rename")
+ sql(
+ "CREATE TABLE test_rename (str struct<a:int,b:long>, str2 struct<a:int,b:long>, map1 " +
+ "map<string, string>, str3 struct<a:int, b:map<string, string>>) STORED AS carbondata")
+
+ val ex1 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str str struct<a:array<int>,b:long>")
+ }
+ assert(ex1.getMessage
+ .contains(
+ "column rename operation failed: because datatypes of complex children cannot be altered"))
+
+ val ex2 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str str struct<a:int,b:long,c:int>")
+ }
+ assert(ex2.getMessage
+ .contains(
+ "column rename operation failed: because number of children of old and new complex " +
+ "columns are not the same"))
+
+ val ex3 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str str int")
+ }
+ assert(ex3.getMessage
+ .contains(
+ "column rename operation failed: because old and new complex columns are not compatible " +
+ "in structure"))
+
+ val ex4 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str str struct<a:int,a:long>")
+ }
+ assert(ex4.getMessage
+ .contains(
+ "column rename operation failed: because duplicate columns are present"))
+
+ val ex5 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str str2 struct<a:int,b:long>")
+ }
+ assert(ex5.getMessage
+ .contains(
+ "Column Rename Operation failed. New column name str2 already exists in table test_rename"))
+
+ val ex6 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change map1 map2 map<string, struct<a:int>>")
+ }
+ assert(ex6.getMessage
+ .contains("rename operation failed: cannot alter map type column"))
+
+ val ex7 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str3 str33 struct<a:int, bc:map<string, string>>")
+ }
+ assert(ex7.getMessage
+ .contains(
+ "rename operation failed: cannot alter complex structure that includes map type column"))
+ }
+
+ def checkAnswerUtil1(df1: DataFrame, df2: DataFrame, df3: DataFrame) {
+ checkAnswer(df1, Seq(Row(Row(Row(2)))))
+ checkAnswer(df2, Seq(Row(Row(2))))
+ checkAnswer(df3, Seq(Row(2)))
+ }
+
+ def checkAnswerUtil2(df1: DataFrame, df2: DataFrame, df3: DataFrame) {
+ checkAnswer(df1, Seq(Row(Row(Row(2))), Row(Row(Row(3)))))
+ checkAnswer(df2, Seq(Row(Row(2)), Row(Row(3))))
+ checkAnswer(df3, Seq(Row(2), Row(3)))
+ }
+
+ test("test alter rename struct of (primitive/struct/array)") {
+ sql("drop table if exists test_rename")
+ sql("CREATE TABLE test_rename (str1 struct<a:int>, str2 struct<a:struct<b:int>>, str3 " +
+ "struct<a:struct<b:struct<c:int>>>, intfield int) STORED AS carbondata")
+ sql("insert into test_rename values(named_struct('a', 2), " +
+ "named_struct('a', named_struct('b', 2)), named_struct('a', named_struct('b', " +
+ "named_struct('c', 2))), 1)")
+
+ // rename parent column from str2 to str22 and read old rows
+ sql("alter table test_rename change str2 str22 struct<a:struct<b:int>>")
+ var df1 = sql("select str22 from test_rename")
+ var df2 = sql("select str22.a from test_rename")
+ var df3 = sql("select str22.a.b from test_rename")
+ assert(df1.collect().size == 1 && df2.collect().size == 1 && df3.collect().size == 1)
+ checkAnswerUtil1(df1, df2, df3)
+
+ // rename child column from a to a11
+ sql("alter table test_rename change str22 str22 struct<a11:struct<b:int>>")
+ df1 = sql("select str22 from test_rename")
+ df2 = sql("select str22.a11 from test_rename")
+ df3 = sql("select str22.a11.b from test_rename")
+ assert(df1.collect().size == 1 && df2.collect().size == 1 && df3.collect().size == 1)
+ checkAnswerUtil1(df1, df2, df3)
+
+ // rename parent column from str22 to str33
+ sql("alter table test_rename change str22 str33 struct<a11:struct<b:int>>")
+ df1 = sql("select str33 from test_rename")
+ df2 = sql("select str33.a11 from test_rename")
+ df3 = sql("select str33.a11.b from test_rename")
+ assert(df1.collect().size == 1 && df2.collect().size == 1 && df3.collect().size == 1)
+ checkAnswerUtil1(df1, df2, df3)
+
+ // insert new rows
+ sql("insert into test_rename values(named_struct('a', 3), " +
+ "named_struct('a', named_struct('b', 3)), named_struct('a', named_struct('b', " +
+ "named_struct('c', 3))), 2)")
+ df1 = sql("select str33 from test_rename")
+ df2 = sql("select str33.a11 from test_rename")
+ df3 = sql("select str33.a11.b from test_rename")
+ assert(df1.collect().size == 2 && df2.collect().size == 2 && df3.collect().size == 2)
+ checkAnswerUtil2(df1, df2, df3)
+
+ sql("alter table test_rename change str33 str33 struct<a11:struct<b11:int>>")
+ sql("alter table test_rename change str33 str33 struct<a22:struct<b11:int>>")
+ df1 = sql("select str33 from test_rename")
+ df2 = sql("select str33.a22 from test_rename")
+ df3 = sql("select str33.a22.b11 from test_rename")
+ assert(df1.collect().size == 2 && df2.collect().size == 2 && df3.collect().size == 2)
+ checkAnswerUtil2(df1, df2, df3)
+
+ val desc = sql("desc table test_rename").collect()
+ assert(desc(0)(0).equals("str1"))
+ assert(desc(1)(0).equals("str33"))
+ assert(desc(1)(1).equals("struct<a22:struct<b11:int>>"))
+ assert(desc(2)(0).equals("str3"))
+ }
+
+ test("test alter rename array of (primitive/array/struct)") {
+ sql("drop table if exists test_rename")
+ sql(
+ "CREATE TABLE test_rename (arr1 array<int>, arr2 array<array<int>>, arr3 array<string>, " +
+ "arr4 array<struct<a:int>>) STORED AS carbondata")
+ sql(
+ "insert into test_rename values (array(1,2,3), array(array(1,2),array(3,4)), array('hello'," +
+ "'world'), array(named_struct('a',45)))")
+
+ sql("alter table test_rename change arr1 arr11 array<int>")
+ val df1 = sql("select arr11 from test_rename")
+ assert(df1.collect.size == 1)
+ checkAnswer(df1, Seq(Row(make(Array(1, 2, 3)))))
+
+ sql("alter table test_rename change arr2 arr22 array<array<int>>")
+ val df2 = sql("select arr22 from test_rename")
+ assert(df2.collect.size == 1)
+ checkAnswer(df2, Seq(Row(make(Array(make(Array(1, 2)), make(Array(3, 4)))))))
+
+ sql("alter table test_rename change arr3 arr33 array<string>")
+ val df3 = sql("select arr33 from test_rename")
+ assert(df3.collect.size == 1)
+ checkAnswer(sql("select arr33 from test_rename"), Seq(Row(make(Array("hello", "world")))))
+
+ sql("alter table test_rename change arr4 arr44 array<struct<a:int>>")
+ sql("alter table test_rename change arr44 arr44 array<struct<a11:int>>")
+
+ val df4 = sql("select arr44.a11 from test_rename")
+ assert(df4.collect.size == 1)
+ checkAnswer(df4, Seq(Row(make(Array(45)))))
+
+ // test for new inserted row
+ sql(
+ "insert into test_rename values (array(11,22,33), array(array(11,22),array(33,44)), array" +
+ "('hello11', 'world11'), array(named_struct('a',4555)))")
+ val rows = sql("select arr11, arr22, arr33, arr44.a11 from test_rename").collect
+ assert(rows.size == 2)
+ val secondRow = rows(1)
+ assert(secondRow(0).equals(make(Array(11, 22, 33))) &&
+ secondRow(1).equals(make(Array(make(Array(11, 22)), make(Array(33, 44))))) &&
+ secondRow(2).equals(make(Array("hello11", "world11"))))
+ }
+
+ test("validate alter change datatype for complex children columns") {
+ sql("drop table if exists test_rename")
+ sql(
+ "CREATE TABLE test_rename (str struct<a:int,b:long>) STORED AS carbondata")
+
+ val ex1 = intercept[ProcessMetaDataException] {
+ sql("alter table test_rename change str str struct<a:long,b:long>")
+ }
+ assert(ex1.getMessage
+ .contains(
+ "column rename operation failed: because datatypes of complex children cannot be altered"))
+ }
+
+ test("test change comment in case of complex types") {
+ sql("drop table if exists test_rename")
+ sql(
+ "CREATE TABLE test_rename (str struct<a:int> comment 'comment') STORED AS carbondata")
+ sql("alter table test_rename change str str struct<a:int> comment 'new comment'")
+ var describe = sql("desc table test_rename")
+ var count = describe.filter("col_name='str' and comment = 'new comment'").count()
+ assertResult(1)(count)
+
+ sql("alter table test_rename change str str struct<a1:int> comment 'new comment 2'")
+ describe = sql("desc table test_rename")
+ count = describe.filter("col_name='str' and comment = 'new comment 2'").count()
+ assertResult(1)(count)
+ }
+
+ test("check schema evolution after renaming complex column") {
+ sql("drop table if exists test_rename")
+ sql(
+ "CREATE TABLE test_rename (str1 struct<a:int>) STORED AS carbondata")
+ sql("alter table test_rename change str1 str2 struct<abc:int>")
+ val carbonTable = CarbonEnv.getCarbonTable(None, "test_rename")(sqlContext.sparkSession)
+ val schemaEvolutionList = carbonTable.getTableInfo
+ .getFactTable
+ .getSchemaEvolution()
+ .getSchemaEvolutionEntryList()
+ var addedColumns = Seq[ColumnSchema]()
+ var removedColumns = Seq[ColumnSchema]()
+ for (i <- 0 until schemaEvolutionList.size()) {
+ addedColumns ++=
+ JavaConverters
+ .asScalaIteratorConverter(schemaEvolutionList.get(i).getAdded.iterator())
+ .asScala
+ .toSeq
+
+ removedColumns ++=
+ JavaConverters
+ .asScalaIteratorConverter(schemaEvolutionList.get(i).getRemoved.iterator())
+ .asScala
+ .toSeq
+ }
+ assert(addedColumns.size == 1 && removedColumns.size == 1)
+ assert(addedColumns(0).getColumnName.equals("str2"))
+ assert(removedColumns(0).getColumnName.equals("str1"))
+ }
+

Review comment:
have added asserts in testcase that handles invalid structures.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] akkio-97 commented on a change in pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

akkio-97 commented on a change in pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#discussion_r639481683

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableColRenameDataTypeChangeCommand.scala
##########
@@ -83,6 +76,69 @@ private[sql] case class CarbonAlterTableColRenameDataTypeChangeCommand(
childTableColumnRename: Boolean = false)
extends CarbonAlterTableColumnRenameCommand(alterTableColRenameAndDataTypeChangeModel.columnName,
alterTableColRenameAndDataTypeChangeModel.newColumnName) {
+ // stores mapping of altered children column names:
+ // new_column_name -> column_datatype
+ val alteredColumnNamesMap = collection.mutable.HashMap.empty[String, String]
+
+ // stores mapping of: column_name -> new_column_datatype
+ val alteredColumnDatatypesMap = collection.mutable.HashMap.empty[String, String]
+
+ /**
+ * This method checks the structure of the old and new complex columns, and-
+ * 1. throws exception if the number of complex-levels in both columns does not match
+ * 2. throws exception if the number of children of both columns does not match
+ * 3. creates alteredColumnNamesMap: new_column_name -> datatype. Here new_column_name are those
+ * names of the columns that are altered.
+ * 4. creates alteredColumnDatatypesMap: column_name -> new_datatype.
+ * These maps will later be used while altering the table schema
+ */
+ def validateComplexStructure(dimension: List[CarbonDimension],
+ newDataTypeInfo: List[DataTypeInfo]): Unit = {
+ if (dimension == null && newDataTypeInfo == null) {

Review comment:
they are required as the parser does not handle them.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#issuecomment-848616642

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5438/

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#issuecomment-848621030

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3693/

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] akkio-97 commented on pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

akkio-97 commented on pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#issuecomment-851298897

retest this please

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#issuecomment-851368454

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3705/

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#issuecomment-851370671

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5449/

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

In reply to this post by GitBox

Indhumathi27 commented on a change in pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#discussion_r640293211

##########
File path: core/src/main/java/org/apache/carbondata/core/scan/executor/util/RestructureUtil.java
##########
@@ -186,26 +208,37 @@ public static boolean isColumnMatches(boolean isTransactionalTable,
}

/**
- * In case of Multilevel Complex column - Struct/StructOfStruct, traverse all the child dimension
- * to check column Id
+ * In case of Multilevel Complex column - Struct/StructOfStruct, traverse all the child dimensions
+ * of tableColumn to check if any of its column Id has matched with that of queryColumn .
*
- * @param tableColumn
- * @param queryColumn
+ * @param tableColumn - column entity that is present in the table block or in the segment
+ * properties.
+ * @param queryColumn - column entity that is present in the fired query or in the query model.
+ * tableColumn name and queryColumn name may or may not be the same in case schema has evolved.
+ * Hence matching happens based on the column ID
* @return
*/
private static boolean isColumnMatchesStruct(CarbonColumn tableColumn, CarbonColumn queryColumn) {
if (tableColumn instanceof CarbonDimension) {
- List<CarbonDimension> parentDimension =
+ List<CarbonDimension> childrenDimensions =
((CarbonDimension) tableColumn).getListOfChildDimensions();
- CarbonDimension carbonDimension = null;
+ CarbonDimension carbonDimension;
String[] colSplits = queryColumn.getColName().split("\\.");
StringBuffer tempColName = new StringBuffer(colSplits[0]);
for (String colSplit : colSplits) {
if (!tempColName.toString().equalsIgnoreCase(colSplit)) {
- tempColName = tempColName.append(".").append(colSplit);
+ tempColName = tempColName.append(CarbonCommonConstants.POINT).append(colSplit);
}
- carbonDimension = CarbonTable.getCarbonDimension(tempColName.toString(), parentDimension);
- if (carbonDimension != null) {
+ carbonDimension =
+ CarbonTable.getCarbonDimension(tempColName.toString(), childrenDimensions);
+ if (carbonDimension == null) {
+ // Avoid returning true in case of SDK as the column name contains the id.
+ if (existingTableColumnIDMap.containsKey(queryColumn.getColumnId())
+ && !existingTableColumnIDMap.get(queryColumn.getColumnId())
+ .contains(queryColumn.getColumnId())) {

Review comment:
should the validation should be with tablecolumn here?

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/catalyst/CarbonParserUtil.scala
##########
@@ -1132,10 +1135,60 @@ object CarbonParserUtil {
} else if (scale < 0 || scale > 38) {
throw new MalformedCarbonCommandException("Invalid value for scale")
}
- DataTypeInfo("decimal", precision, scale)
+ DataTypeInfo(columnName, "decimal", precision, scale)
+ case _ =>
+ DataTypeInfo(columnName,
+ DataTypeConverterUtil.convertToCarbonType(dataType).getName.toLowerCase)
+ }
+ }
+
+ /**
+ * This method will return the instantiated DataTypeInfo by parsing the column
+ */
+ def parseColumn(columnName: String, dataType: DataType,
+ values: Option[List[(Int, Int)]]): DataTypeInfo = {
+ // creates parent dataTypeInfo first
+ val dataTypeInfo = CarbonParserUtil.parseDataType(
+ columnName,
+ DataTypeConverterUtil
+ .convertToCarbonType(dataType.typeName)
+ .getName
+ .toLowerCase,
+ values)
+ // check which child type is present and create children dataTypeInfo accordingly
+ dataType match {
+ case arrayType: ArrayType =>
+ val childType: DataType = arrayType.elementType
+ val childName = columnName + ".val"
+ val childValues = childType match {
+ case d: DecimalType => Some(List((d.precision, d.scale)))
+ case _ => None
+ }
+ var childTypeInfoList: List[DataTypeInfo] = null
+ val childDatatypeInfo = parseColumn(childName, childType, childValues)
+ childTypeInfoList = List(childDatatypeInfo)
+ dataTypeInfo.setChildren(childTypeInfoList)
+ case childrenTypeList: StructType =>

Review comment:
change to structType

##########
File path: docs/ddl-of-carbondata.md
##########
@@ -841,11 +843,23 @@ Users can specify which columns to include and exclude for local dictionary gene
```
ALTER TABLE test_db.carbon CHANGE a3 a4 STRING
```
- Example3:Change column a3's comment to "col_comment".
+ Example4:Change column a3's comment to "col_comment".

```
ALTER TABLE test_db.carbon CHANGE a3 a3 STRING COMMENT 'col_comment'
```
+
+ Example5:Change child column name in column: structField struct\<age:int> from age to id.

Review comment:
```suggestion
Example5: Change child column name in column: structField struct<age:int> from age to id.
```
same for below

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableColRenameDataTypeChangeCommand.scala
##########
@@ -143,27 +141,49 @@ private[sql] case class CarbonAlterTableColRenameDataTypeChangeCommand(
val newColumnPrecision = alterTableColRenameAndDataTypeChangeModel.dataTypeInfo.precision
val newColumnScale = alterTableColRenameAndDataTypeChangeModel.dataTypeInfo.scale
// set isDataTypeChange flag
- if (oldCarbonColumn.head.getDataType.getName
- .equalsIgnoreCase(alterTableColRenameAndDataTypeChangeModel.dataTypeInfo.dataType)) {
+ val oldDatatype = oldCarbonColumn.head.getDataType
+ val newDatatype = alterTableColRenameAndDataTypeChangeModel.dataTypeInfo.dataType
+ if (isColumnRename && (DataTypes.isMapType(oldDatatype) ||
+ newDatatype.equalsIgnoreCase(CarbonCommonConstants.MAP))) {
+ throw new UnsupportedOperationException(
+ "Alter rename is unsupported for Map datatype column")
+ }
+ if (oldDatatype.getName.equalsIgnoreCase(newDatatype)) {
val newColumnPrecision =
alterTableColRenameAndDataTypeChangeModel.dataTypeInfo.precision
val newColumnScale = alterTableColRenameAndDataTypeChangeModel.dataTypeInfo.scale
// if the source datatype is decimal and there is change in precision and scale, then
// along with rename, datatype change is also required for the command, so set the
// isDataTypeChange flag to true in this case
- if (oldCarbonColumn.head.getDataType.getName.equalsIgnoreCase("decimal") &&
- (oldCarbonColumn.head.getDataType.asInstanceOf[DecimalType].getPrecision !=
+ if (DataTypes.isDecimal(oldDatatype) &&
+ (oldDatatype.asInstanceOf[DecimalType].getPrecision !=
newColumnPrecision ||
- oldCarbonColumn.head.getDataType.asInstanceOf[DecimalType].getScale !=
+ oldDatatype.asInstanceOf[DecimalType].getScale !=
newColumnScale)) {
isDataTypeChange = true
}
+ if (DataTypes.isArrayType(oldDatatype) || DataTypes.isStructType(oldDatatype)) {
+ val oldParent = oldCarbonColumn.head
+ val oldChildren = oldParent.asInstanceOf[CarbonDimension].getListOfChildDimensions.asScala
+ .toList
+ AlterTableUtil.validateComplexStructure(oldChildren,
+ alterTableColRenameAndDataTypeChangeModel.dataTypeInfo.getChildren(),
+ alteredColumnNamesMap)
+ }
} else {
+ if (oldDatatype.isComplexType ||
+ newDatatype.equalsIgnoreCase(CarbonCommonConstants.ARRAY) ||
+ newDatatype.equalsIgnoreCase(CarbonCommonConstants.STRUCT)) {
+ throw new UnsupportedOperationException(
+ "Old and new complex columns are not compatible in structure")
+ }
isDataTypeChange = true
}
+

Review comment:
can revert this change

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableColRenameDataTypeChangeCommand.scala
##########
@@ -83,6 +75,9 @@ private[sql] case class CarbonAlterTableColRenameDataTypeChangeCommand(
childTableColumnRename: Boolean = false)
extends CarbonAlterTableColumnRenameCommand(alterTableColRenameAndDataTypeChangeModel.columnName,
alterTableColRenameAndDataTypeChangeModel.newColumnName) {
+ // stores mapping of altered column names: old-column-name -> new-column-name.
+ // Including both parent/table and children columns

Review comment:
here, you mean parent column?

##########
File path: core/src/main/java/org/apache/carbondata/core/scan/executor/util/RestructureUtil.java
##########
@@ -50,6 +52,10 @@
* Utility class for restructuring
*/
public class RestructureUtil {
+ // if table column is of complex type- this look up stores the column id of the parent and their
+ // children. This helps to determine the existence of incoming query column by matching based
+ // on id.
+ private static Map<String, String> existingTableColumnIDMap = new HashMap<>();

Review comment:
then, can remove initialization from line:58

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

12345 ... 7