[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3301: [CARBONDATA-3446] Support read schema of complex data type from carbon file folder path

GitBox

Indhumathi27 commented on a change in pull request #3301:
URL: https://github.com/apache/carbondata/pull/3301#discussion_r416476342



##########
File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java
##########
@@ -283,6 +280,96 @@ public static Schema readSchemaInIndexFile(String indexFilePath) throws IOExcept
     return readSchema(indexFilePath, false);
   }
 
+  public static StructField getStructChildren(CarbonTable table, String columnName) {
+    List<CarbonDimension> list = table.getChildren(columnName);
+    List<StructField> structFields = new ArrayList<StructField>();
+    for (int i = 0; i < list.size(); i++) {
+      CarbonDimension carbonDimension = list.get(i);
+      if (DataTypes.isStructType(carbonDimension.getDataType())) {
+        return getStructChildren(table, carbonDimension.getColName());
+      } else if (DataTypes.isArrayType(carbonDimension.getDataType())) {
+        return getArrayChildren(table, carbonDimension.getColName());
+      } else if (DataTypes.isMapType(carbonDimension.getDataType())) {
+        //TODO
+      } else {
+        ColumnSchema columnSchema = carbonDimension.getColumnSchema();
+        structFields.add(new StructField(columnSchema.getColumnName(), columnSchema.getDataType()));
+      }
+    }
+    return new StructField(columnName, DataTypes.createStructType(structFields));
+  }
+
+  public static StructField getArrayChildren(CarbonTable table, String columnName) {
+    List<CarbonDimension> list = table.getChildren(columnName);
+    List<StructField> structFields = new ArrayList<StructField>();
+    for (int i = 0; i < list.size(); i++) {
+      CarbonDimension carbonDimension = list.get(i);
+      if (DataTypes.isStructType(carbonDimension.getDataType())) {
+        return getStructChildren(table, carbonDimension.getColName());
+      } else if (DataTypes.isArrayType(carbonDimension.getDataType())) {
+        return getArrayChildren(table, carbonDimension.getColName());
+      } else if (DataTypes.isMapType(carbonDimension.getDataType())) {
+        //TODO
+      } else {
+        ColumnSchema columnSchema = carbonDimension.getColumnSchema();
+        structFields.add(new StructField(columnSchema.getColumnName(), columnSchema.getDataType()));
+      }
+    }
+    return structFields.get(0);
+  }
+
+  /**
+   * Read schema from carbon file folder path
+   *
+   * @param folderPath carbon file folder path
+   * @param conf       hadoop configuration support, can set s3a AK,SK,
+   *                   end point and other conf with this
+   * @return carbon data Schema
+   * @throws IOException
+   */
+  private static Schema readSchemaFromFolder(String folderPath, Configuration conf)
+    throws IOException {
+    String tableName = "UnknownTable" + UUID.randomUUID();
+    CarbonTable table = CarbonTable.buildTable(folderPath, tableName, conf);
+    List<ColumnSchema> columnSchemaList = table.getTableInfo().getFactTable().getListOfColumns();
+    int sum = 0;
+    for (ColumnSchema columnSchema : columnSchemaList) {
+      if (!(columnSchema.getColumnName().contains("."))) {
+        sum++;
+      }
+    }
+    Field[] fields = new Field[sum];
+
+    int indexOfFields = 0;
+    for (ColumnSchema columnSchema : table.getTableInfo().getFactTable().getListOfColumns()) {

Review comment:
       We already have the variable `columnSchemaList`. It can be used here instead of calling `table.getTableInfo().getFactTable().getListOfColumns()` again.
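
       A minimal sketch of the suggested shape, not the actual patch. To stay self-contained it uses a hypothetical stand-in `ColumnSchema` class (only the column name is modelled) and a hypothetical helper `topLevelColumnNames`. The body of the second loop is not visible in the hunk above, so the filtering shown there is an assumption. The point is only that both passes iterate the same local list:

```java
import java.util.Arrays;
import java.util.List;

public class ReuseColumnListSketch {

  // Hypothetical stand-in for CarbonData's ColumnSchema; only the name is modelled.
  static final class ColumnSchema {
    private final String columnName;

    ColumnSchema(String columnName) {
      this.columnName = columnName;
    }

    String getColumnName() {
      return columnName;
    }
  }

  // Hypothetical helper: collects the names of top-level columns, i.e. those
  // whose name contains no '.', mirroring the check in the counting loop of the PR.
  static String[] topLevelColumnNames(List<ColumnSchema> columnSchemaList) {
    // First pass: count top-level columns to size the result array.
    int sum = 0;
    for (ColumnSchema columnSchema : columnSchemaList) {
      if (!columnSchema.getColumnName().contains(".")) {
        sum++;
      }
    }
    String[] fields = new String[sum];

    // Second pass: reuse columnSchemaList instead of fetching the list again.
    // What the real loop does with each column is an assumption here.
    int indexOfFields = 0;
    for (ColumnSchema columnSchema : columnSchemaList) {
      if (!columnSchema.getColumnName().contains(".")) {
        fields[indexOfFields++] = columnSchema.getColumnName();
      }
    }
    return fields;
  }

  public static void main(String[] args) {
    List<ColumnSchema> columns = Arrays.asList(
        new ColumnSchema("id"),
        new ColumnSchema("address"),
        new ColumnSchema("address.city"));
    System.out.println(Arrays.toString(topLevelColumnNames(columns)));
  }
}
```

       Besides being shorter, reusing the local variable makes it obvious that the counting pass and the filling pass see the same list, so `sum` and the number of filled entries cannot drift apart.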




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]