[GitHub] [carbondata] akkio-97 opened a new pull request #3987: [WIP ]local-dictionary

classic Classic list List threaded Threaded
54 messages Options
123
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 opened a new pull request #3987: [WIP ]local-dictionary

GitBox

akkio-97 opened a new pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987


    ### Why is this PR needed?
   
   
    ### What changes were proposed in this PR?
   
       
    ### Does this PR introduce any user interface change?
    - No
    - Yes. (please explain the change and update document)
   
    ### Is any new testcase added?
    - No
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3987: [WIP ]local-dictionary

GitBox

CarbonDataQA1 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-709975025


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2733/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3987: [WIP ]local-dictionary

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-709975490


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4487/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3987: [WIP ]local-dictionary

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-712851661


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4540/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3987: [WIP ]local-dictionary

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-712853297


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2786/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-713149620


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2788/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-713180217


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4542/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ydvpankaj99 commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

ydvpankaj99 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-713228390


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-713272161


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4554/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-713298067


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2804/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

akkio-97 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-713309055


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 removed a comment on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

akkio-97 removed a comment on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-713309055


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

ajantha-bhat commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-713310868


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509001513



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/chunk/reader/dimension/v3/DimensionChunkReaderV3.java
##########
@@ -296,6 +297,12 @@ protected DimensionColumnPage decodeDimension(DimensionRawColumnChunk rawColumnP
         }
       }
       BitSet nullBitSet = QueryUtil.getNullBitSet(pageMetadata.presence, this.compressor);
+      // store rawColumnChunk for local dictionary
+      if (vectorInfo != null && !vectorInfo.vectorStack.isEmpty()) {

Review comment:
       please check if is local dictionary is present, then only store the rawColumnChunk




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509002285



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
##########
@@ -64,6 +66,14 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
     int columnValueSize = dimensionDataChunkStore.getColumnValueSize();
     int rowsNum = dataLength / columnValueSize;
     CarbonColumnVector vector = vectorInfo.vector;
+    if (vector.getType().isComplexType()) {
+      vector = vectorInfo.vectorStack.peek();

Review comment:
       directly call `getDirectVectorWrapperFactory`, it has vector.getType().isComplexType() inside only, other places like AdaptiveCodec uses the `getDirectVectorWrapperFactory`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509004084



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/compress/DirectCompressCodec.java
##########
@@ -325,6 +327,21 @@ private void fillPrimitiveType(byte[] pageData, CarbonColumnVector vector,
       int intSizeInBytes = DataTypes.INT.getSizeInBytes();
       int shortSizeInBytes = DataTypes.SHORT.getSizeInBytes();
       int lengthStoredInBytes;
+      // check if local dictionary is enabled for complex primitve type and call
+      // fillVector eventually
+      if (!vectorInfo.vectorStack.isEmpty()) {
+        CarbonColumnVectorImpl tempVector =
+            (CarbonColumnVectorImpl) (vectorInfo.vectorStack.peek().getColumnVector());
+        if (tempVector.rawColumnChunk != null
+            && tempVector.rawColumnChunk.getLocalDictionary() != null) {
+          DimensionChunkStoreFactory.DimensionStoreType dimStoreType =
+              DimensionChunkStoreFactory.DimensionStoreType.LOCAL_DICT;
+          new VariableLengthDimensionColumnPage(pageData, new int[0], new int[0], pageSize,

Review comment:
       please move fill vector comment here, because VariableLengthDimensionColumnPage constructor is calling fill vector




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509004845



##########
File path: core/src/main/java/org/apache/carbondata/core/scan/result/vector/impl/CarbonColumnVectorImpl.java
##########
@@ -81,6 +82,8 @@
 
   private List<Integer> childElementsForEachRow;
 
+  public DimensionRawColumnChunk rawColumnChunk;

Review comment:
       please make it private. and add getter and setter




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509008144



##########
File path: integration/presto/src/test/prestodb/org/apache/carbondata/presto/server/PrestoTestUtil.scala
##########
@@ -114,4 +114,60 @@ object PrestoTestUtil {
       }
     }
   }
+
+  // this method depends on prestodb jdbc PrestoArray class
+  def validateArrayOfPrimitiveTypeDataWithLocalDict(actualResult: List[Map[String, Any]],

Review comment:
       Add it for prestosql also and compile for both prestodb and prestosql and run the testcase locally once with both the profile




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509008562



##########
File path: integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
##########
@@ -36,6 +36,7 @@ import org.apache.carbondata.presto.server.{PrestoServer, PrestoTestUtil}
 import org.apache.carbondata.sdk.file.{CarbonWriter, Schema}
 
 
+

Review comment:
       please revert it




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#discussion_r509009526



##########
File path: core/src/main/java/org/apache/carbondata/core/scan/result/vector/impl/CarbonColumnVectorImpl.java
##########
@@ -156,6 +159,14 @@ public void setNumberOfChildElementsForStruct(byte[] parentPageData, int pageSiz
     setNumberOfChildElementsInEachRow(childElementsForEachRow);
   }
 
+  public void setPositionCount(int positionCount) {
+

Review comment:
       revert this and keep an instance of slicestream check where you are setting the value.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


123