[GitHub] [carbondata] akkio-97 opened a new pull request #4055: Handled issue with huge data(exceeding 32K records) after enabling local dictionary


[GitHub] [carbondata] akkio-97 opened a new pull request #4055: Handled issue with huge data(exceeding 32K records) after enabling local dictionary

GitBox

akkio-97 opened a new pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055


    ### Why is this PR needed?
   
   
    ### What changes were proposed in this PR?
   
       
    ### Does this PR introduce any user interface change?
    - No
    - Yes. (please explain the change and update document)
   
    ### Is any new testcase added?
    - No
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4055: Handled issue with huge data(exceeding 32K records) after enabling local dictionary

GitBox

CarbonDataQA2 commented on pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#issuecomment-746267389


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5184/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4055: Handled issue with huge data(exceeding 32K records) after enabling local dictionary

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#issuecomment-746279075


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3422/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#issuecomment-746863648


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5188/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#issuecomment-746874212


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3426/
   



[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#discussion_r544852773



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
##########
@@ -84,6 +84,9 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
     vector = ColumnarVectorWrapperDirectFactory
         .getDirectVectorWrapperFactory(vectorInfo, vector, invertedIndex, nullBitset,
             vectorInfo.deletedRows, false, false);
+    if (dictionaryVector.getIntsSize() < rowsNum) {

Review comment:
       Please add a comment explaining when dictionaryVector.getIntsSize() < rowsNum can occur. I think it happens only for an array of string / varchar with local dictionary.





[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#discussion_r544852993



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
##########
@@ -84,6 +84,9 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
     vector = ColumnarVectorWrapperDirectFactory
         .getDirectVectorWrapperFactory(vectorInfo, vector, invertedIndex, nullBitset,
             vectorInfo.deletedRows, false, false);
+    if (dictionaryVector.getIntsSize() < rowsNum) {
+      dictionaryVector.increaseIntsLength(rowsNum);

Review comment:
       ```suggestion
         dictionaryVector.increaseIntArrayBufferSize(rowsNum);
   ```





[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#discussion_r544852993



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
##########
@@ -84,6 +84,9 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
     vector = ColumnarVectorWrapperDirectFactory
         .getDirectVectorWrapperFactory(vectorInfo, vector, invertedIndex, nullBitset,
             vectorInfo.deletedRows, false, false);
+    if (dictionaryVector.getIntsSize() < rowsNum) {
+      dictionaryVector.increaseIntsLength(rowsNum);

Review comment:
       ```suggestion
         dictionaryVector.increaseIntArraySize(rowsNum);
   ```





[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#discussion_r544853917



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
##########
@@ -84,6 +84,9 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
     vector = ColumnarVectorWrapperDirectFactory
         .getDirectVectorWrapperFactory(vectorInfo, vector, invertedIndex, nullBitset,
             vectorInfo.deletedRows, false, false);
+    if (dictionaryVector.getIntsSize() < rowsNum) {

Review comment:
       ```suggestion
       if (dictionaryVector.getIntArraySize() < rowsNum) {
   ```
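
The guard under review can be illustrated with a minimal, self-contained sketch. It assumes a default int buffer of 32000 entries (the thread only says "32K records") and uses a hypothetical `DictionaryVectorSketch` class with the method names suggested in the review; it is not the CarbonData implementation.

```java
import java.util.Arrays;

public class IntBufferGrowthSketch {
  // Assumption: default rows per column page; the thread only mentions "32K records".
  public static final int DEFAULT_PAGE_ROWS = 32000;

  /** Hypothetical stand-in for the dictionary vector, not the real CarbonData class. */
  public static class DictionaryVectorSketch {
    private int[] ints = new int[DEFAULT_PAGE_ROWS];

    public int getIntArraySize() {
      return ints.length;
    }

    // Grows the backing array to the requested size, keeping existing values.
    public void increaseIntArraySize(int size) {
      if (size > ints.length) {
        ints = Arrays.copyOf(ints, size);
      }
    }

    public void putInt(int rowId, int value) {
      ints[rowId] = value;
    }

    public int getInt(int rowId) {
      return ints[rowId];
    }
  }

  // Mirrors the guard in the diff: grow the buffer before writing rowsNum values.
  public static DictionaryVectorSketch fill(int rowsNum) {
    DictionaryVectorSketch vector = new DictionaryVectorSketch();
    if (vector.getIntArraySize() < rowsNum) {
      vector.increaseIntArraySize(rowsNum);
    }
    for (int i = 0; i < rowsNum; i++) {
      vector.putInt(i, i % 7); // placeholder dictionary surrogate keys
    }
    return vector;
  }

  public static void main(String[] args) {
    DictionaryVectorSketch vector = fill(40000);
    System.out.println(vector.getIntArraySize());
  }
}
```

Without the guard, writing row 32000 or later into the default-sized buffer would throw an ArrayIndexOutOfBoundsException in this sketch.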





[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#discussion_r544854194



##########
File path: core/src/main/java/org/apache/carbondata/core/scan/result/vector/CarbonColumnVector.java
##########
@@ -116,6 +117,13 @@
 
   void setLazyPage(LazyPageLoader lazyPage);
 
+  default int getIntsSize() {
+    return CarbonV3DataFormatConstants.NUMBER_OF_ROWS_PER_BLOCKLET_COLUMN_PAGE_DEFAULT;

Review comment:
       Please remove this from the interface and keep it only in the implementation, because the interface is not aware of the int array.
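
One possible arrangement that follows this comment is sketched below: the interface stays unaware of the int array, and only the int-backed implementation exposes the size accessor. All names here (`ColumnVectorSketch`, `IntBackedVectorSketch`, `NoBufferVectorSketch`, `ensureCapacity`) and the initial capacity of 4 are hypothetical, and the thread does not show how the PR author actually resolved the comment.

```java
import java.util.Arrays;

public class ImplOnlyAccessorSketch {
  /** Hypothetical interface: it exposes no int-array details. */
  public interface ColumnVectorSketch {
    void putInt(int rowId, int value);
  }

  /** Hypothetical implementation that owns the int array and its accessors. */
  public static class IntBackedVectorSketch implements ColumnVectorSketch {
    private int[] ints = new int[4]; // small illustrative capacity

    public int getIntArraySize() {
      return ints.length;
    }

    public void increaseIntArraySize(int size) {
      if (size > ints.length) {
        ints = Arrays.copyOf(ints, size);
      }
    }

    @Override
    public void putInt(int rowId, int value) {
      ints[rowId] = value;
    }
  }

  /** Hypothetical implementation with no int array at all. */
  public static class NoBufferVectorSketch implements ColumnVectorSketch {
    public int lastValue;

    @Override
    public void putInt(int rowId, int value) {
      lastValue = value;
    }
  }

  // Grows the buffer only for vectors that actually have one; returns true if it grew.
  public static boolean ensureCapacity(ColumnVectorSketch vector, int rowsNum) {
    if (vector instanceof IntBackedVectorSketch) {
      IntBackedVectorSketch intVector = (IntBackedVectorSketch) vector;
      if (intVector.getIntArraySize() < rowsNum) {
        intVector.increaseIntArraySize(rowsNum);
        return true;
      }
    }
    return false;
  }
}
```

The trade-off is a type check at the call site in exchange for an interface that does not hard-code a default buffer size for implementations that have no such buffer.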





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#issuecomment-747272545


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5191/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#issuecomment-747277167


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3432/
   



[GitHub] [carbondata] akkio-97 commented on pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

akkio-97 commented on pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#issuecomment-747282145


   retest this please



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#issuecomment-747294298


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3433/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#issuecomment-747295006


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5193/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#issuecomment-747361259


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5195/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#issuecomment-747363888


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3435/
   



[GitHub] [carbondata] akkio-97 commented on a change in pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#discussion_r545002834



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
##########
@@ -84,6 +84,9 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
     vector = ColumnarVectorWrapperDirectFactory
         .getDirectVectorWrapperFactory(vectorInfo, vector, invertedIndex, nullBitset,
             vectorInfo.deletedRows, false, false);
+    if (dictionaryVector.getIntsSize() < rowsNum) {

Review comment:
       done





[GitHub] [carbondata] akkio-97 commented on a change in pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#discussion_r545002955



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
##########
@@ -84,6 +84,9 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
     vector = ColumnarVectorWrapperDirectFactory
         .getDirectVectorWrapperFactory(vectorInfo, vector, invertedIndex, nullBitset,
             vectorInfo.deletedRows, false, false);
+    if (dictionaryVector.getIntsSize() < rowsNum) {
+      dictionaryVector.increaseIntsLength(rowsNum);

Review comment:
       done





[GitHub] [carbondata] akkio-97 commented on a change in pull request #4055: [CARBONDATA-4087] Handled issue with huge data(exceeding 32K records) after enabling local dictionary in Presto

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #4055:
URL: https://github.com/apache/carbondata/pull/4055#discussion_r545003060



##########
File path: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
##########
@@ -84,6 +84,9 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
     vector = ColumnarVectorWrapperDirectFactory
         .getDirectVectorWrapperFactory(vectorInfo, vector, invertedIndex, nullBitset,
             vectorInfo.deletedRows, false, false);
+    if (dictionaryVector.getIntsSize() < rowsNum) {

Review comment:
       done



