KanakaKumar commented on a change in pull request #3209: [CARBONDATA-3373] Optimize scenes with in numbers in SQL
URL: https://github.com/apache/carbondata/pull/3209#discussion_r281922869

##########
File path: core/src/main/java/org/apache/carbondata/core/scan/filter/executer/IncludeFilterExecuterImpl.java
##########
@@ -272,29 +273,21 @@ private BitSet getFilteredIndexesForMeasures(ColumnPage columnPage,
     // Get the measure values from the chunk. compare sequentially with the
     // the filter values. The one that matches sets it Bitset.
     BitSet bitSet = new BitSet(rowsInPage);
-    Object[] filterValues = msrColumnExecutorInfo.getFilterKeys();
-
-    SerializableComparator comparator = Comparator.getComparatorByDataTypeForMeasure(msrType);
+    Set filterValuesSet = msrColumnExecutorInfo.getFilterKeysSet();
     BitSet nullBitSet = columnPage.getNullBits();
-    for (int i = 0; i < filterValues.length; i++) {
-      if (filterValues[i] == null) {
-        for (int j = nullBitSet.nextSetBit(0); j >= 0; j = nullBitSet.nextSetBit(j + 1)) {
-          bitSet.set(j);
-        }
-        continue;
-      }
-      for (int startIndex = 0; startIndex < rowsInPage; startIndex++) {
-        if (!nullBitSet.get(startIndex)) {
-          // Check if filterValue[i] matches with measure Values.
-          Object msrValue = DataTypeUtil
-              .getMeasureObjectBasedOnDataType(columnPage, startIndex,
-                  msrType, msrColumnEvaluatorInfo.getMeasure());
-
-          if (comparator.compare(msrValue, filterValues[i]) == 0) {
-            // This is a match.
-            bitSet.set(startIndex);
-          }
+    for (int startIndex = 0; startIndex < rowsInPage; startIndex++) {
+      if (!nullBitSet.get(startIndex)) {
+        // Check if filterValue[i] matches with measure Values.
+        Object msrValue = DataTypeUtil
+            .getMeasureObjectBasedOnDataType(columnPage, startIndex,
+                msrType, msrColumnEvaluatorInfo.getMeasure());
+
+        if (filterValuesSet.contains(msrValue)) {

Review comment:
HashSet/Map lookup depends on hashCode. I think this may not match in the case of decimal and float values. For example, 123.000 and 123.0 are equal when compared by value (as the previous comparator-based code did), but their hash codes would be different.
I think the optimal approach would be to use a TreeMap with a comparator by data type [TreeMap(Comparator<? super K> comparator)], which gives a binary-search-style lookup, or to keep the filter values sorted. @ravipesala, @kumarvishal09 please correct me if using an array was intentional for any specific reason.
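
To make the concern concrete, here is a minimal standalone sketch, assuming the decimal measure values are java.math.BigDecimal objects. The class DecimalFilterLookup and its methods are hypothetical illustrations and are not part of CarbonData; a TreeSet with the natural BigDecimal ordering stands in for the data-type comparator, and filterPage is only a simplified stand-in for the page scan in the diff above.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.BitSet;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; not part of CarbonData.
class DecimalFilterLookup {

  // Hash-based lookup: relies on equals()/hashCode(), and for BigDecimal
  // both take the scale into account, so 123.000 and 123.0 are different keys.
  static boolean hashContains(List<BigDecimal> filterValues, BigDecimal value) {
    Set<BigDecimal> set = new HashSet<>(filterValues);
    return set.contains(value);
  }

  // Comparator-based lookup: TreeSet decides membership with compare(),
  // which for BigDecimal ignores the scale, in O(log n) per lookup.
  static boolean sortedContains(List<BigDecimal> filterValues, BigDecimal value) {
    Set<BigDecimal> set = new TreeSet<>(Comparator.<BigDecimal>naturalOrder());
    set.addAll(filterValues);
    return set.contains(value);
  }

  // Simplified stand-in for the page scan: null entries play the role of
  // rows flagged in the null bitset and are skipped.
  static BitSet filterPage(BigDecimal[] page, TreeSet<BigDecimal> filterValues) {
    BitSet bitSet = new BitSet(page.length);
    for (int row = 0; row < page.length; row++) {
      if (page[row] != null && filterValues.contains(page[row])) {
        bitSet.set(row);
      }
    }
    return bitSet;
  }

  public static void main(String[] args) {
    List<BigDecimal> filter = Arrays.asList(new BigDecimal("123.000"));
    BigDecimal rowValue = new BigDecimal("123.0");
    System.out.println("HashSet match: " + hashContains(filter, rowValue));
    System.out.println("TreeSet match: " + sortedContains(filter, rowValue));
  }
}
```

With the filter value 123.000 and the row value 123.0, the hash-based lookup misses the row while the comparator-based lookup finds it. Whether this applies to the PR depends on how getFilterKeysSet() normalizes its values, which is not shown in the diff.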