[GitHub] [carbondata] jackylk opened a new pull request #3598: [WIP] Remove MDK and cardinality in write path

classic Classic list List threaded Threaded
64 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3598: [WIP] Remove MDK and cardinality in write path

GitBox
CarbonDataQA1 commented on issue #3598: [WIP] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#issuecomment-583267390
 
 
   Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/175/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3598: [WIP] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3598: [WIP] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#issuecomment-583284833
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1878/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#issuecomment-583757935
 
 
   Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/183/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#issuecomment-583762940
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1886/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#issuecomment-583769450
 
 
   Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/186/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#issuecomment-583774424
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1889/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#issuecomment-583808736
 
 
   Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/189/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#issuecomment-583812575
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1891/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376831843
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java
 ##########
 @@ -135,25 +135,6 @@ DataMapExprWrapper chooseDataMap(DataMapLevel level, FilterResolverIntf resolver
     return null;
   }
 
-  /**
-   * Get all datamaps of the table for clearing purpose
-   */
-  public DataMapExprWrapper getAllDataMapsForClear(CarbonTable carbonTable)
 
 Review comment:
   why remove this method?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376832770
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentProperties.java
 ##########
 @@ -640,15 +377,91 @@ public int getNumberOfSortColumns() {
     return numberOfSortColumns;
   }
 
-  public int getNumberOfNoDictSortColumns() {
-    return numberOfNoDictSortColumns;
+  public int getLastDimensionColOrdinal() {
+    return lastDimensionColOrdinal;
+  }
+
+  public int getNumberOfColumns() {
+    return numberOfColumnsAfterFlatten;
   }
 
-  public int getNumberOfDictSortColumns() {
-    return this.numberOfSortColumns - this.numberOfNoDictSortColumns;
+  public int getNumberOfDictDimensions() {
+    return numberOfDictDimensions;
   }
 
-  public int getLastDimensionColOrdinal() {
-    return lastDimensionColOrdinal;
+  public int getNumberOfSimpleDimensions() {
 
 Review comment:
   primitiveDimension

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376833358
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentProperties.java
 ##########
 @@ -640,15 +377,91 @@ public int getNumberOfSortColumns() {
     return numberOfSortColumns;
   }
 
-  public int getNumberOfNoDictSortColumns() {
-    return numberOfNoDictSortColumns;
+  public int getLastDimensionColOrdinal() {
+    return lastDimensionColOrdinal;
+  }
+
+  public int getNumberOfColumns() {
+    return numberOfColumnsAfterFlatten;
   }
 
-  public int getNumberOfDictSortColumns() {
-    return this.numberOfSortColumns - this.numberOfNoDictSortColumns;
+  public int getNumberOfDictDimensions() {
+    return numberOfDictDimensions;
   }
 
-  public int getLastDimensionColOrdinal() {
-    return lastDimensionColOrdinal;
+  public int getNumberOfSimpleDimensions() {
+    return numberOfDictDimensions + numberOfNoDictionaryDimension;
+  }
+
+  public int getNumberOfComplexDimensions() {
+    return complexDimensions.size();
+  }
+
+  public int getNumberOfMeasures() {
+    return measures.size();
+  }
+
+  /**
+   * Return column value length in byte for all dimension columns in the table
+   * for dimension it is -1 (for DATE it is 4),
+   */
+  public int[] createDimColumnValueLength() {
+    int[] length = new int[dimensions.size()];
+    int index = 0;
+    for (CarbonDimension dimension : dimensions) {
+      DataType dataType = dimension.getDataType();
+      if (dataType == DataTypes.DATE) {
+        length[index] = 4;
+      } else {
+        length[index] = -1;
+      }
+      index++;
+    }
+    return length;
+  }
+
+  /**
+   * Return column value length in byte for all columns in the table
+   * for dimension and complex column it is -1 (for DATE it is 4),
+   * for measure is 8 (for decimal is -1)
+   */
+  public int[] createColumnValueLength() {
+    int[] length = new int[numberOfColumnsAfterFlatten];
+    int index = 0;
+    for (CarbonDimension dimension : dimensions) {
+      DataType dataType = dimension.getDataType();
+      if (dataType == DataTypes.DATE) {
+        length[index] = 4;
+      } else {
+        length[index] = -1;
+      }
+      index++;
+    }
+    for (CarbonDimension complexDimension : complexDimensions) {
+      int depth = getNumColumnsAfterFlatten(complexDimension);
+      for (int i = 0; i < depth; i++) {
+        length[index++] = -1;
+      }
+    }
+    for (CarbonMeasure measure : measures) {
+      DataType dataType = measure.getDataType();
+      if (DataTypes.isDecimal(dataType)) {
+        length[index++] = -1;
+      } else {
+        length[index++] = 8;
 
 Review comment:
   why the length of other measures are 8?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376834155
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datastore/blocklet/EncodedBlocklet.java
 ##########
 @@ -38,11 +37,6 @@
    */
   private int blockletSize;
 
-  /**
-   * list of page metadata
-   */
-  private List<TablePageKey> pageMetadataList;
 
 Review comment:
   it means it will remove start/end key from loading flow.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376835460
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/column/CarbonDimension.java
 ##########
 @@ -93,15 +86,7 @@ public int getKeyOrdinal() {
     return keyOrdinal;
   }
 
-  /**
-   * @return the complexTypeOrdinal
-   */
-  public int getComplexTypeOrdinal() {
-    return complexTypeOrdinal;
-  }
-
   public void setComplexTypeOridnal(int complexTypeOrdinal) {
 
 Review comment:
   why not remove it ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376839201
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/scan/processor/DataBlockIterator.java
 ##########
 @@ -217,7 +217,8 @@ public BlockletScannedResult call() throws Exception {
             nextRead.set(true);
             futureIo = readNextBlockletAsync();
           }
-          return blockletScanner.scanBlocklet(rawBlockletColumnChunks);
+          BlockletScannedResult result = blockletScanner.scanBlocklet(rawBlockletColumnChunks);
 
 Review comment:
   not require the change

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376833731
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datastore/block/TableBlockInfo.java
 ##########
 @@ -74,31 +72,10 @@
    */
   private Segment segment;
 
-  /**
-   * id of the Blocklet.
-   */
-  private String blockletId;
 
 Review comment:
   why remove blocklet info

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376839786
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/util/ByteUtil.java
 ##########
 @@ -756,4 +756,45 @@ public static long toLongLittleEndian(byte[] bytes, int offset) {
         ((long) bytes[offset + 3] & 0xff) << 24) | (((long) bytes[offset + 2] & 0xff) << 16) | (
         ((long) bytes[offset + 1] & 0xff) << 8) | (((long) bytes[offset] & 0xff)));
   }
+
+  public static byte[] convertDateToBytes(int date) {
+    return ByteUtil.toBytes(date);
+  }
+
+  public static byte[] convertDateToBytes(long[] date) {
+    byte[] output = new byte[date.length * 4];
+    for (int i = 0; i < date.length; i++) {
+      System.arraycopy(ByteUtil.toBytes(date[i]), 0, output, i * 4, 4);
+    }
+    return output;
+  }
+
+  public static int convertBytesToDate(byte[] date) {
 
 Review comment:
   How about convertDateBytesToInt

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376839981
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/util/ByteUtil.java
 ##########
 @@ -756,4 +756,45 @@ public static long toLongLittleEndian(byte[] bytes, int offset) {
         ((long) bytes[offset + 3] & 0xff) << 24) | (((long) bytes[offset + 2] & 0xff) << 16) | (
         ((long) bytes[offset + 1] & 0xff) << 8) | (((long) bytes[offset] & 0xff)));
   }
+
+  public static byte[] convertDateToBytes(int date) {
+    return ByteUtil.toBytes(date);
+  }
+
+  public static byte[] convertDateToBytes(long[] date) {
+    byte[] output = new byte[date.length * 4];
+    for (int i = 0; i < date.length; i++) {
+      System.arraycopy(ByteUtil.toBytes(date[i]), 0, output, i * 4, 4);
+    }
+    return output;
+  }
+
+  public static int convertBytesToDate(byte[] date) {
+    return ByteUtil.toInt(date, 0);
+  }
+
+  public static int convertBytesToDate(byte[] date, int offset) {
+    return ByteUtil.toInt(date, offset);
+  }
+
+  public static int dateBytesSize() {
+    return 4;
+  }
+
+  public static int[] convertBytesToDateIntArray(byte[] input) {
 
 Review comment:
   convertDateBytesToInts

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376847677
 
 

 ##########
 File path: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonMergerRDD.scala
 ##########
 @@ -578,15 +562,11 @@ class CarbonMergerRDD[K, V](
       }
     }
     val updatedMaxSegmentColumnList = new util.ArrayList[ColumnSchema]()
-    // update cardinality and column schema list according to master schema
-    val cardinality = CarbonCompactionUtil
 
 Review comment:
   if no need to update cardinality,  also no need to update column schema

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376848072
 
 

 ##########
 File path: integration/spark-common/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchemaCommon.scala
 ##########
 @@ -111,8 +111,6 @@ case class CarbonMergerMapping(
     validSegments: Array[Segment],
     tableId: String,
     campactionType: CompactionType,
-    // maxSegmentColCardinality is Cardinality of last segment of compaction
-    var maxSegmentColCardinality: Array[Int],
     // maxSegmentColumnSchemaList is list of column schema of last segment of compaction
     var maxSegmentColumnSchemaList: List[ColumnSchema],
 
 Review comment:
   use CarbonTable directly

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] jackylk commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

GitBox
In reply to this post by GitBox
jackylk commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r377104775
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java
 ##########
 @@ -135,25 +135,6 @@ DataMapExprWrapper chooseDataMap(DataMapLevel level, FilterResolverIntf resolver
     return null;
   }
 
-  /**
-   * Get all datamaps of the table for clearing purpose
-   */
-  public DataMapExprWrapper getAllDataMapsForClear(CarbonTable carbonTable)
 
 Review comment:
   It is not used anyplace

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
1234