[GitHub] [carbondata] akkio-97 commented on a change in pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

Posted by GitBox on May 31, 2021; 1:28pm
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/GitHub-carbondata-akkio-97-opened-a-new-pull-request-4129-WIP-alter-rename-complex-types-tp108015p108468.html


akkio-97 commented on a change in pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#discussion_r642489361



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableColRenameDataTypeChangeCommand.scala
##########
@@ -83,6 +75,9 @@ private[sql] case class CarbonAlterTableColRenameDataTypeChangeCommand(
     childTableColumnRename: Boolean = false)
   extends CarbonAlterTableColumnRenameCommand(alterTableColRenameAndDataTypeChangeModel.columnName,
     alterTableColRenameAndDataTypeChangeModel.newColumnName) {
+  // stores mapping of altered column names: old-column-name -> new-column-name.
+  // Including both parent/table and children columns

Review comment:
       I meant that a rename can be applied to any kind of column:
   1) a parent or a child column, in the case of complex columns
   2) or just a primitive column / table column
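
To illustrate the kind of old-column-name -> new-column-name mapping the code comment describes, here is a minimal sketch. It is not the code from this PR: `RenameMapSketch`, `Col`, `collect` and `renames` are hypothetical names, and the sketch assumes the old and new column trees have the same shape and that child names are qualified with the parent name using a dot.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RenameMapSketch {
  // Hypothetical stand-in for a column node; not a CarbonData class.
  static final class Col {
    final String name;
    final List<Col> children = new ArrayList<>();

    Col(String name, Col... kids) {
      this.name = name;
      for (Col kid : kids) {
        children.add(kid);
      }
    }
  }

  // Walks the old and new column trees in parallel and records the fully
  // qualified old name -> new name for every node whose path changed.
  // Renaming a parent therefore also records an entry for each of its children.
  static void collect(Col oldCol, Col newCol, String oldPrefix, String newPrefix,
      Map<String, String> out) {
    if (oldCol.children.size() != newCol.children.size()) {
      throw new IllegalArgumentException("column trees differ in shape");
    }
    String oldPath = oldPrefix.isEmpty() ? oldCol.name : oldPrefix + "." + oldCol.name;
    String newPath = newPrefix.isEmpty() ? newCol.name : newPrefix + "." + newCol.name;
    if (!oldPath.equalsIgnoreCase(newPath)) {
      out.put(oldPath, newPath);
    }
    for (int i = 0; i < oldCol.children.size(); i++) {
      collect(oldCol.children.get(i), newCol.children.get(i), oldPath, newPath, out);
    }
  }

  // Convenience entry point for a single top-level column.
  static Map<String, String> renames(Col oldCol, Col newCol) {
    Map<String, String> out = new LinkedHashMap<>();
    collect(oldCol, newCol, "", "", out);
    return out;
  }
}
```

With a struct column `structField` whose child `age` is renamed to `id`, `renames` yields the single entry `structField.age -> structField.id`; renaming a primitive column `a3` to `a4` yields `a3 -> a4`.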

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/catalyst/CarbonParserUtil.scala
##########
@@ -1132,10 +1135,60 @@ object CarbonParserUtil {
         } else if (scale < 0 || scale > 38) {
           throw new MalformedCarbonCommandException("Invalid value for scale")
         }
-        DataTypeInfo("decimal", precision, scale)
+        DataTypeInfo(columnName, "decimal", precision, scale)
+      case _ =>
+        DataTypeInfo(columnName,
+          DataTypeConverterUtil.convertToCarbonType(dataType).getName.toLowerCase)
+    }
+  }
+
+  /**
+   * This method will return the instantiated DataTypeInfo by parsing the column
+   */
+  def parseColumn(columnName: String, dataType: DataType,
+      values: Option[List[(Int, Int)]]): DataTypeInfo = {
+    // creates parent dataTypeInfo first
+    val dataTypeInfo = CarbonParserUtil.parseDataType(
+      columnName,
+      DataTypeConverterUtil
+        .convertToCarbonType(dataType.typeName)
+        .getName
+        .toLowerCase,
+      values)
+    // check which child type is present and create children dataTypeInfo accordingly
+    dataType match {
+      case arrayType: ArrayType =>
+        val childType: DataType = arrayType.elementType
+        val childName = columnName + ".val"
+        val childValues = childType match {
+          case d: DecimalType => Some(List((d.precision, d.scale)))
+          case _ => None
+        }
+        var childTypeInfoList: List[DataTypeInfo] = null
+        val childDatatypeInfo = parseColumn(childName, childType, childValues)
+        childTypeInfoList = List(childDatatypeInfo)
+        dataTypeInfo.setChildren(childTypeInfoList)
+      case childrenTypeList: StructType =>

Review comment:
       done
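
For readers following the recursion in `parseColumn`, the array branch of the hunk above can be sketched as below. This is only an illustration in Java: `ParseColumnSketch`, `Type`, `Primitive`, `Array` and `Info` are hypothetical stand-ins for Spark's `DataType` classes and for `DataTypeInfo`, the decimal precision/scale handling is left out, and the struct branch is omitted because it is cut off in the hunk.

```java
import java.util.ArrayList;
import java.util.List;

class ParseColumnSketch {
  // Hypothetical type model; stands in for Spark's DataType hierarchy.
  interface Type {}

  static final class Primitive implements Type {
    final String name;

    Primitive(String name) {
      this.name = name;
    }
  }

  static final class Array implements Type {
    final Type element;

    Array(Type element) {
      this.element = element;
    }
  }

  // Hypothetical stand-in for DataTypeInfo: a named node with children.
  static final class Info {
    final String columnName;
    final String dataType;
    final List<Info> children = new ArrayList<>();

    Info(String columnName, String dataType) {
      this.columnName = columnName;
      this.dataType = dataType;
    }
  }

  // Creates the parent node first, then recurses into the array element,
  // naming the child columnName + ".val" as in the diff.
  static Info parseColumn(String columnName, Type type) {
    if (type instanceof Array) {
      Info parent = new Info(columnName, "array");
      parent.children.add(parseColumn(columnName + ".val", ((Array) type).element));
      return parent;
    }
    return new Info(columnName, ((Primitive) type).name);
  }
}
```

For a column `a` of type array of array of int, the sketch produces the nodes `a`, `a.val` and `a.val.val`, the last one carrying the type `int`.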

##########
File path: docs/ddl-of-carbondata.md
##########
@@ -841,11 +843,23 @@ Users can specify which columns to include and exclude for local dictionary gene
      ```
      ALTER TABLE test_db.carbon CHANGE a3 a4 STRING
      ```
-     Example3:Change column a3's comment to "col_comment".
+     Example4:Change column a3's comment to "col_comment".
     
      ```
      ALTER TABLE test_db.carbon CHANGE a3 a3 STRING COMMENT 'col_comment'
      ```
+    
+     Example5:Change child column name in column: structField struct\<age:int> from age to id.

Review comment:
       done

##########
File path: core/src/main/java/org/apache/carbondata/core/scan/executor/util/RestructureUtil.java
##########
@@ -186,26 +208,37 @@ public static boolean isColumnMatches(boolean isTransactionalTable,
   }
 
   /**
-   * In case of Multilevel Complex column - Struct/StructOfStruct, traverse all the child dimension
-   * to check column Id
+   * In case of Multilevel Complex column - Struct/StructOfStruct, traverse all the child dimensions
+   * of tableColumn to check if any of its column Id has matched with that of queryColumn .
    *
-   * @param tableColumn
-   * @param queryColumn
+   * @param tableColumn - column entity that is present in the table block or in the segment
+   *                      properties.
+   * @param queryColumn - column entity that is present in the fired query or in the query model.
+   * tableColumn name and queryColumn name may or may not be the same in case schema has evolved.
+   * Hence matching happens based on the column ID
    * @return
    */
   private static boolean isColumnMatchesStruct(CarbonColumn tableColumn, CarbonColumn queryColumn) {
     if (tableColumn instanceof CarbonDimension) {
-      List<CarbonDimension> parentDimension =
+      List<CarbonDimension> childrenDimensions =
           ((CarbonDimension) tableColumn).getListOfChildDimensions();
-      CarbonDimension carbonDimension = null;
+      CarbonDimension carbonDimension;
       String[] colSplits = queryColumn.getColName().split("\\.");
       StringBuffer tempColName = new StringBuffer(colSplits[0]);
       for (String colSplit : colSplits) {
         if (!tempColName.toString().equalsIgnoreCase(colSplit)) {
-          tempColName = tempColName.append(".").append(colSplit);
+          tempColName = tempColName.append(CarbonCommonConstants.POINT).append(colSplit);
         }
-        carbonDimension = CarbonTable.getCarbonDimension(tempColName.toString(), parentDimension);
-        if (carbonDimension != null) {
+        carbonDimension =
+            CarbonTable.getCarbonDimension(tempColName.toString(), childrenDimensions);
+        if (carbonDimension == null) {
+          // Avoid returning true in case of SDK as the column name contains the id.
+          if (existingTableColumnIDMap.containsKey(queryColumn.getColumnId())
+              && !existingTableColumnIDMap.get(queryColumn.getColumnId())
+              .contains(queryColumn.getColumnId())) {

Review comment:
       The map holds the table's column details (id -> name). In the SDK case, the column IDs are similar to the column names, hence the second check in the if condition.
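
A minimal sketch of the two-part check, for illustration only. `ColumnIdMatchSketch` and `matchesById` are hypothetical names, the map is modelled as a plain `Map<String, String>` of column id -> column name, and the sketch assumes the branch reports a match, which the hunk does not show (it is inferred from the code comment "Avoid returning true in case of SDK").

```java
import java.util.Map;

class ColumnIdMatchSketch {
  // Returns true when the query column's id is known to the table AND the
  // name stored for that id does not itself contain the id. The second
  // condition filters out the SDK case, where ids resemble column names.
  static boolean matchesById(Map<String, String> existingTableColumnIdMap,
      String queryColumnId) {
    String tableColumnName = existingTableColumnIdMap.get(queryColumnId);
    return tableColumnName != null && !tableColumnName.contains(queryColumnId);
  }
}
```

With an entry such as `"uuid-1" -> "structfield.id"` the lookup by id succeeds even if the query still uses the old child name. With an SDK-style entry such as `"structfield.age" -> "structfield.age"` the second condition fails and no match is reported.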



