[GitHub] [carbondata] ShreelekhyaG opened a new pull request #3951: [WIP] SI with cache level blocklet issue

classic Classic list List threaded Threaded
15 messages Options
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG opened a new pull request #3951: [WIP] SI with cache level blocklet issue

GitBox

ShreelekhyaG opened a new pull request #3951:
URL: https://github.com/apache/carbondata/pull/3951


    ### Why is this PR needed?
    Select query on SI column returns blank resultset after changing the cache level to blocklet
   
    ### What changes were proposed in this PR?
   In case of CACHE_LEVEL = BLOCKLET, blockId path contains both block id and blocklet id. In `getShortBlockId `and `getShortBlockIdForPartitionTable` , added check for blocklet id file seperator and made change to replace only the compressor name.
   Added example in testcase.
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - No
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3951: [WIP] SI with cache level blocklet issue

GitBox

Indhumathi27 commented on a change in pull request #3951:
URL: https://github.com/apache/carbondata/pull/3951#discussion_r492785607



##########
File path: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java
##########
@@ -648,11 +648,20 @@ public static String getShortBlockId(String blockId) {
             .replace(DATA_PART_PREFIX, "").replace(CARBON_DATA_EXT, "");
     // to remove compressor name
     if (!blockId.equalsIgnoreCase(blockIdWithCompressorName)) {
-      int index = blockIdWithCompressorName.lastIndexOf(".");
+      int index = blockIdWithCompressorName.lastIndexOf(POINT);
+      int fileSeperatorIndex = blockIdWithCompressorName.lastIndexOf(File.separator);
       if (index != -1) {
-        String replace =
-            blockIdWithCompressorName.replace(blockIdWithCompressorName.substring(index), "");
-        return replace;
+        String modifiedBlockId;

Review comment:
       Can extract common code from partition table and non-partition table to get block Id to new method




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3951: [WIP] SI with cache level blocklet issue

GitBox
In reply to this post by GitBox

Indhumathi27 commented on a change in pull request #3951:
URL: https://github.com/apache/carbondata/pull/3951#discussion_r492786590



##########
File path: core/src/test/java/org/apache/carbondata/core/util/CarbonUtilTest.java
##########
@@ -928,6 +928,11 @@ public void testTupeIDInUpdateScenarios() {
     Assert.assertEquals(CarbonTablePath.getShortBlockId(blockId), "0/0-0_0-0-0-1597409791503");
     blockId = "c3=aa/part-0-100100000100001_batchno0-0-0-1597411003332.snappy.carbondata";
     Assert.assertEquals(CarbonTablePath.getShortBlockIdForPartitionTable(blockId), "c3=aa/0-100100000100001_0-0-0-1597411003332");
+    // CACHE_LEVEL = BLOCKLET case

Review comment:
       Please add a functional testcase for this scenario and check data after setting cache_level=blocklet




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3951: [WIP] SI with cache level blocklet issue

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3951:
URL: https://github.com/apache/carbondata/pull/3951#issuecomment-696812265


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2435/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3951: [WIP] SI with cache level blocklet issue

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3951:
URL: https://github.com/apache/carbondata/pull/3951#issuecomment-696820715


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4178/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3951: [WIP] SI with cache level blocklet issue

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3951:
URL: https://github.com/apache/carbondata/pull/3951#discussion_r492902403



##########
File path: core/src/test/java/org/apache/carbondata/core/util/CarbonUtilTest.java
##########
@@ -928,6 +928,11 @@ public void testTupeIDInUpdateScenarios() {
     Assert.assertEquals(CarbonTablePath.getShortBlockId(blockId), "0/0-0_0-0-0-1597409791503");
     blockId = "c3=aa/part-0-100100000100001_batchno0-0-0-1597411003332.snappy.carbondata";
     Assert.assertEquals(CarbonTablePath.getShortBlockIdForPartitionTable(blockId), "c3=aa/0-100100000100001_0-0-0-1597411003332");
+    // CACHE_LEVEL = BLOCKLET case

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3951: [WIP] SI with cache level blocklet issue

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3951:
URL: https://github.com/apache/carbondata/pull/3951#issuecomment-696947129


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2445/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3951: [WIP] SI with cache level blocklet issue

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3951:
URL: https://github.com/apache/carbondata/pull/3951#issuecomment-696947234


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4188/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3951: [WIP] SI with cache level blocklet issue

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3951:
URL: https://github.com/apache/carbondata/pull/3951#issuecomment-696812265






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3951: [WIP] SI with cache level blocklet issue

GitBox
In reply to this post by GitBox

Indhumathi27 commented on a change in pull request #3951:
URL: https://github.com/apache/carbondata/pull/3951#discussion_r492785607



##########
File path: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java
##########
@@ -648,11 +648,20 @@ public static String getShortBlockId(String blockId) {
             .replace(DATA_PART_PREFIX, "").replace(CARBON_DATA_EXT, "");
     // to remove compressor name
     if (!blockId.equalsIgnoreCase(blockIdWithCompressorName)) {
-      int index = blockIdWithCompressorName.lastIndexOf(".");
+      int index = blockIdWithCompressorName.lastIndexOf(POINT);
+      int fileSeperatorIndex = blockIdWithCompressorName.lastIndexOf(File.separator);
       if (index != -1) {
-        String replace =
-            blockIdWithCompressorName.replace(blockIdWithCompressorName.substring(index), "");
-        return replace;
+        String modifiedBlockId;

Review comment:
       Can extract common code from partition table and non-partition table to get block Id to new method

##########
File path: core/src/test/java/org/apache/carbondata/core/util/CarbonUtilTest.java
##########
@@ -928,6 +928,11 @@ public void testTupeIDInUpdateScenarios() {
     Assert.assertEquals(CarbonTablePath.getShortBlockId(blockId), "0/0-0_0-0-0-1597409791503");
     blockId = "c3=aa/part-0-100100000100001_batchno0-0-0-1597411003332.snappy.carbondata";
     Assert.assertEquals(CarbonTablePath.getShortBlockIdForPartitionTable(blockId), "c3=aa/0-100100000100001_0-0-0-1597411003332");
+    // CACHE_LEVEL = BLOCKLET case

Review comment:
       Please add a functional testcase for this scenario and check data after setting cache_level=blocklet




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3951: [WIP] SI with cache level blocklet issue

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3951:
URL: https://github.com/apache/carbondata/pull/3951#discussion_r492902403



##########
File path: core/src/test/java/org/apache/carbondata/core/util/CarbonUtilTest.java
##########
@@ -928,6 +928,11 @@ public void testTupeIDInUpdateScenarios() {
     Assert.assertEquals(CarbonTablePath.getShortBlockId(blockId), "0/0-0_0-0-0-1597409791503");
     blockId = "c3=aa/part-0-100100000100001_batchno0-0-0-1597411003332.snappy.carbondata";
     Assert.assertEquals(CarbonTablePath.getShortBlockIdForPartitionTable(blockId), "c3=aa/0-100100000100001_0-0-0-1597411003332");
+    // CACHE_LEVEL = BLOCKLET case

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3951: [CARBONDATA-4005] SI with cache level blocklet issue

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3951:
URL: https://github.com/apache/carbondata/pull/3951#discussion_r493201285



##########
File path: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java
##########
@@ -648,11 +648,20 @@ public static String getShortBlockId(String blockId) {
             .replace(DATA_PART_PREFIX, "").replace(CARBON_DATA_EXT, "");
     // to remove compressor name
     if (!blockId.equalsIgnoreCase(blockIdWithCompressorName)) {
-      int index = blockIdWithCompressorName.lastIndexOf(".");
+      int index = blockIdWithCompressorName.lastIndexOf(POINT);
+      int fileSeperatorIndex = blockIdWithCompressorName.lastIndexOf(File.separator);
       if (index != -1) {
-        String replace =
-            blockIdWithCompressorName.replace(blockIdWithCompressorName.substring(index), "");
-        return replace;
+        String modifiedBlockId;

Review comment:
       We could use the same method for both partition and non-partition case. So removed `getShortBlockIdForPartitionTable `method.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on pull request #3951: [CARBONDATA-4005] SI with cache level blocklet issue

GitBox
In reply to this post by GitBox

Indhumathi27 commented on pull request #3951:
URL: https://github.com/apache/carbondata/pull/3951#issuecomment-697221985


   LGTM


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akashrn5 commented on pull request #3951: [CARBONDATA-4005] SI with cache level blocklet issue

GitBox
In reply to this post by GitBox

akashrn5 commented on pull request #3951:
URL: https://github.com/apache/carbondata/pull/3951#issuecomment-697254002


   LGTM


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] asfgit closed pull request #3951: [CARBONDATA-4005] SI with cache level blocklet issue

GitBox
In reply to this post by GitBox

asfgit closed pull request #3951:
URL: https://github.com/apache/carbondata/pull/3951


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]