[GitHub] [carbondata] akkio-97 opened a new pull request #3987: [WIP ]local-dictionary

classic Classic list List threaded Threaded
54 messages Options
123
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox

CarbonDataQA1 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-713411001


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2816/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-713427073


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4567/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509150967



##########
File path: integration/presto/src/test/prestodb/org/apache/carbondata/presto/server/PrestoTestUtil.scala
##########
@@ -114,4 +114,60 @@ object PrestoTestUtil {
       }
     }
   }
+
+  // this method depends on prestodb jdbc PrestoArray class
+  def validateArrayOfPrimitiveTypeDataWithLocalDict(actualResult: List[Map[String, Any]],

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509151296



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/compress/DirectCompressCodec.java
##########
@@ -325,6 +327,21 @@ private void fillPrimitiveType(byte[] pageData, CarbonColumnVector vector,
       int intSizeInBytes = DataTypes.INT.getSizeInBytes();
       int shortSizeInBytes = DataTypes.SHORT.getSizeInBytes();
       int lengthStoredInBytes;
+      // check if local dictionary is enabled for complex primitve type and call
+      // fillVector eventually
+      if (!vectorInfo.vectorStack.isEmpty()) {
+        CarbonColumnVectorImpl tempVector =
+            (CarbonColumnVectorImpl) (vectorInfo.vectorStack.peek().getColumnVector());
+        if (tempVector.rawColumnChunk != null
+            && tempVector.rawColumnChunk.getLocalDictionary() != null) {
+          DimensionChunkStoreFactory.DimensionStoreType dimStoreType =
+              DimensionChunkStoreFactory.DimensionStoreType.LOCAL_DICT;
+          new VariableLengthDimensionColumnPage(pageData, new int[0], new int[0], pageSize,

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509152696



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
##########
@@ -64,6 +66,14 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
     int columnValueSize = dimensionDataChunkStore.getColumnValueSize();
     int rowsNum = dataLength / columnValueSize;
     CarbonColumnVector vector = vectorInfo.vector;
+    if (vector.getType().isComplexType()) {
+      vector = vectorInfo.vectorStack.peek();

Review comment:
       We are calling that right below after creating dictionary block.
   on line 69 - vector should be sliceStreamReader before we create dictionaryBlock. Otherwise dictionaryBlock will be null.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509152696



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
##########
@@ -64,6 +66,14 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
     int columnValueSize = dimensionDataChunkStore.getColumnValueSize();
     int rowsNum = dataLength / columnValueSize;
     CarbonColumnVector vector = vectorInfo.vector;
+    if (vector.getType().isComplexType()) {
+      vector = vectorInfo.vectorStack.peek();

Review comment:
       We are calling that right below after creating dictionary block.
   on line 69 - vector is intialized as sliceStreamReader before we create dictionaryBlock. Otherwise dictionaryBlock will be null, which will lead to NPE




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509152696



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
##########
@@ -64,6 +66,14 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
     int columnValueSize = dimensionDataChunkStore.getColumnValueSize();
     int rowsNum = dataLength / columnValueSize;
     CarbonColumnVector vector = vectorInfo.vector;
+    if (vector.getType().isComplexType()) {
+      vector = vectorInfo.vectorStack.peek();

Review comment:
       We are calling that right below after creating dictionary block.
   on line 69 - vector is intialized as sliceStreamReader before we create dictionaryVector. Otherwise dictionaryVector will be null, which will lead to NPE




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509152696



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
##########
@@ -64,6 +66,14 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
     int columnValueSize = dimensionDataChunkStore.getColumnValueSize();
     int rowsNum = dataLength / columnValueSize;
     CarbonColumnVector vector = vectorInfo.vector;
+    if (vector.getType().isComplexType()) {
+      vector = vectorInfo.vectorStack.peek();

Review comment:
       We are calling that right below after creating dictionaryVector.
   on line 69 - vector is intialized as sliceStreamReader before we create dictionaryVector. Otherwise dictionaryVector will be null, which will lead to NPE




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509156660



##########
File path: core/src/main/java/org/apache/carbondata/core/scan/result/vector/impl/CarbonColumnVectorImpl.java
##########
@@ -156,6 +159,14 @@ public void setNumberOfChildElementsForStruct(byte[] parentPageData, int pageSiz
     setNumberOfChildElementsInEachRow(childElementsForEachRow);
   }
 
+  public void setPositionCount(int positionCount) {
+

Review comment:
       Moved this as a default implementation to CarbonColumnVector.java




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509156660



##########
File path: core/src/main/java/org/apache/carbondata/core/scan/result/vector/impl/CarbonColumnVectorImpl.java
##########
@@ -156,6 +159,14 @@ public void setNumberOfChildElementsForStruct(byte[] parentPageData, int pageSiz
     setNumberOfChildElementsInEachRow(childElementsForEachRow);
   }
 
+  public void setPositionCount(int positionCount) {
+

Review comment:
       SliceStreamReader belongs to presto module. I will have to add dependency in order to do that. So I have moved this as a default implementation to CarbonColumnVector.java.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509157828



##########
File path: integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
##########
@@ -36,6 +36,7 @@ import org.apache.carbondata.presto.server.{PrestoServer, PrestoTestUtil}
 import org.apache.carbondata.sdk.file.{CarbonWriter, Schema}
 
 
+

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509158326



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/chunk/reader/dimension/v3/DimensionChunkReaderV3.java
##########
@@ -296,6 +297,12 @@ protected DimensionColumnPage decodeDimension(DimensionRawColumnChunk rawColumnP
         }
       }
       BitSet nullBitSet = QueryUtil.getNullBitSet(pageMetadata.presence, this.compressor);
+      // store rawColumnChunk for local dictionary
+      if (vectorInfo != null && !vectorInfo.vectorStack.isEmpty()) {

Review comment:
       done

##########
File path: core/src/main/java/org/apache/carbondata/core/scan/result/vector/impl/CarbonColumnVectorImpl.java
##########
@@ -81,6 +82,8 @@
 
   private List<Integer> childElementsForEachRow;
 
+  public DimensionRawColumnChunk rawColumnChunk;

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509152696



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
##########
@@ -64,6 +66,14 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
     int columnValueSize = dimensionDataChunkStore.getColumnValueSize();
     int rowsNum = dataLength / columnValueSize;
     CarbonColumnVector vector = vectorInfo.vector;
+    if (vector.getType().isComplexType()) {
+      vector = vectorInfo.vectorStack.peek();

Review comment:
       We are calling that right below after creating dictionaryVector.
   Also on line 69 - vector is intialized as sliceStreamReader before we create dictionaryVector. Otherwise dictionaryVector will be null, which will lead to NPE. So I think it is better if we place it above only.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-713509134


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4584/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] brijoobopanna commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

brijoobopanna commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-714658897


   retest this please
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-714721854


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4644/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-714733234


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2888/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

ajantha-bhat commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-714924009


   please rebase, compile and push


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat edited a comment on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

ajantha-bhat edited a comment on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-714924009


   please rebase, compile and push.
   And I hope you have locally compiled prestodb and prestosql both.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

akkio-97 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-715002753


   > please rebase, compile and push.
   > And I hope you have locally compiled prestodb and prestosql both.
   
   yes


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


123