ShreelekhyaG opened a new pull request #4124: URL: https://github.com/apache/carbondata/pull/4124 ### Why is this PR needed? 1. After creating an Inverted index on the dimension column, some of the filter queries give incorrect results. 2. handle exception for higher level non-existing children column in desc column. ### What changes were proposed in this PR? 1. While sorting byte arrays with inverted index, we use `compareTo `method of `ByteArrayColumnWithRowId`. Here, it was sorting based on the last byte only. Made changes to sort properly based on the entire byte length. 2. handled exception and added in testcase. ### Does this PR introduce any user interface change? - No ### Is any new testcase added? - Yes -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
CarbonDataQA2 commented on pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#issuecomment-825001904 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5235/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA2 commented on pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#issuecomment-825009182 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3487/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#discussion_r618988679 ########## File path: core/src/main/java/org/apache/carbondata/core/datastore/columnar/ByteArrayColumnWithRowId.java ########## @@ -42,7 +42,7 @@ public short getRowId() { @Override public int compareTo(ByteArrayColumnWithRowId o) { return UnsafeComparer.INSTANCE - .compareTo(column, 2, column.length - 2, o.column, 2, o.column.length - 2); + .compareTo(column, 0, column.length, o.column, 0, o.column.length); Review comment: I can see that they were ignoring first two bytes before as it was LV and L is short length field. So, now you have changed it to compare whole array. that means it is not coming as LV now ? which PR has induced it ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#discussion_r618988679 ########## File path: core/src/main/java/org/apache/carbondata/core/datastore/columnar/ByteArrayColumnWithRowId.java ########## @@ -42,7 +42,7 @@ public short getRowId() { @Override public int compareTo(ByteArrayColumnWithRowId o) { return UnsafeComparer.INSTANCE - .compareTo(column, 2, column.length - 2, o.column, 2, o.column.length - 2); + .compareTo(column, 0, column.length, o.column, 0, o.column.length); Review comment: I can see that they were ignoring first two bytes before as it was LV and L is short length field. So, now you have changed it to compare whole array because it is not coming as LV, only V is present in byte Array? which PR has changed this behavior ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#discussion_r619010376 ########## File path: core/src/main/java/org/apache/carbondata/core/datastore/columnar/ByteArrayColumnWithRowId.java ########## @@ -42,7 +42,7 @@ public short getRowId() { @Override public int compareTo(ByteArrayColumnWithRowId o) { return UnsafeComparer.INSTANCE - .compareTo(column, 2, column.length - 2, o.column, 2, o.column.length - 2); + .compareTo(column, 0, column.length, o.column, 0, o.column.length); Review comment: This ByteArrayColumnWithRowId is common for other flows also. please check the impact. test cases may not fail as it is internal logic. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#discussion_r619026501 ########## File path: core/src/main/java/org/apache/carbondata/core/datastore/columnar/ByteArrayColumnWithRowId.java ########## @@ -42,7 +42,7 @@ public short getRowId() { @Override public int compareTo(ByteArrayColumnWithRowId o) { return UnsafeComparer.INSTANCE - .compareTo(column, 2, column.length - 2, o.column, 2, o.column.length - 2); + .compareTo(column, 0, column.length, o.column, 0, o.column.length); Review comment: As discussed, I think no dictionary string column comparison will go wrong after this change. no dictionary string to be alphabetically sorted we need to ignore L in LV. so please test and handle compare logic based on the flag. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA2 commented on pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#issuecomment-825694242 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5246/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA2 commented on pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#issuecomment-825696505 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3499/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ShreelekhyaG commented on a change in pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#discussion_r619294183 ########## File path: core/src/main/java/org/apache/carbondata/core/datastore/columnar/ByteArrayColumnWithRowId.java ########## @@ -42,7 +42,7 @@ public short getRowId() { @Override public int compareTo(ByteArrayColumnWithRowId o) { return UnsafeComparer.INSTANCE - .compareTo(column, 2, column.length - 2, o.column, 2, o.column.length - 2); + .compareTo(column, 0, column.length, o.column, 0, o.column.length); Review comment: yes, this flow will be used for string types only. I have added a check for the dictionary and handled the compare logic. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#discussion_r619395321 ########## File path: core/src/main/java/org/apache/carbondata/core/datastore/columnar/ByteArrayColumnWithRowId.java ########## @@ -51,11 +59,12 @@ public boolean equals(Object obj) { return false; } ByteArrayColumnWithRowId o = (ByteArrayColumnWithRowId)obj; - return Arrays.equals(column, o.column) && rowId == o.rowId; + return Arrays.equals(column, o.column) && rowId == o.rowId + && isNoDictionary == o.isNoDictionary; Review comment: Do we really need to consider `isNoDictionary` here ? when it will be not equal? I think as it is a comparison of data within a column, this condition might always be true. If not required, remove from equals and hashcode -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#discussion_r619395321 ########## File path: core/src/main/java/org/apache/carbondata/core/datastore/columnar/ByteArrayColumnWithRowId.java ########## @@ -51,11 +59,12 @@ public boolean equals(Object obj) { return false; } ByteArrayColumnWithRowId o = (ByteArrayColumnWithRowId)obj; - return Arrays.equals(column, o.column) && rowId == o.rowId; + return Arrays.equals(column, o.column) && rowId == o.rowId + && isNoDictionary == o.isNoDictionary; Review comment: Do we really need to consider `isNoDictionary` here ? when it will be not equal? I think as it is a comparison of data within a column (that too within a column page), this condition might always be true. If not required, remove from equals and hashcode -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ShreelekhyaG commented on a change in pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#discussion_r619991184 ########## File path: core/src/main/java/org/apache/carbondata/core/datastore/columnar/ByteArrayColumnWithRowId.java ########## @@ -51,11 +59,12 @@ public boolean equals(Object obj) { return false; } ByteArrayColumnWithRowId o = (ByteArrayColumnWithRowId)obj; - return Arrays.equals(column, o.column) && rowId == o.rowId; + return Arrays.equals(column, o.column) && rowId == o.rowId + && isNoDictionary == o.isNoDictionary; Review comment: Done. Removed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ShreelekhyaG commented on a change in pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#discussion_r619991184 ########## File path: core/src/main/java/org/apache/carbondata/core/datastore/columnar/ByteArrayColumnWithRowId.java ########## @@ -51,11 +59,12 @@ public boolean equals(Object obj) { return false; } ByteArrayColumnWithRowId o = (ByteArrayColumnWithRowId)obj; - return Arrays.equals(column, o.column) && rowId == o.rowId; + return Arrays.equals(column, o.column) && rowId == o.rowId + && isNoDictionary == o.isNoDictionary; Review comment: Done. Removed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA2 commented on pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#issuecomment-826577218 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5251/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA2 commented on pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#issuecomment-826583885 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3504/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ShreelekhyaG commented on pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#issuecomment-826606820 retest this please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA2 commented on pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#issuecomment-826681254 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5256/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA2 commented on pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#issuecomment-826691148 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3509/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ajantha-bhat commented on pull request #4124: URL: https://github.com/apache/carbondata/pull/4124#issuecomment-826723024 LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
Free forum by Nabble | Edit this page |