"between and" filter query is very slow
Posted by simafengyun on
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/between-and-filter-query-is-very-slow-tp8257.html
Hi Dev,
I am using CarbonData version 0.2 on my local machine and found that "between and" filter queries are very slow.
The root cause is the code below in IncludeFilterExecuterImpl.java. The loop takes about 20s in my test.
Its time complexity is O(n*m), where n is the number of rows and m is the number of filter values, because every filter value is compared against every row. I think it needs to be optimized. Please confirm. Thanks.
  private BitSet setFilterdIndexToBitSet(DimensionColumnDataChunk dimensionColumnDataChunk,
      int numerOfRows) {
    BitSet bitSet = new BitSet(numerOfRows);
    if (dimensionColumnDataChunk instanceof FixedLengthDimensionDataChunk) {
      FixedLengthDimensionDataChunk fixedDimensionChunk =
          (FixedLengthDimensionDataChunk) dimensionColumnDataChunk;
      byte[][] filterValues = dimColumnExecuterInfo.getFilterKeys();
      long start = System.currentTimeMillis();
      for (int k = 0; k < filterValues.length; k++) {
        for (int j = 0; j < numerOfRows; j++) {
          if (ByteUtil.UnsafeComparer.INSTANCE
              .compareTo(fixedDimensionChunk.getCompleteDataChunk(), j * filterValues[k].length,
                  filterValues[k].length, filterValues[k], 0, filterValues[k].length) == 0) {
            bitSet.set(j);
          }
        }
      }
      System.out.println("loop time: " + (System.currentTimeMillis() - start));
    }
    return bitSet;
  }
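
One possible direction, as a rough sketch only: if the filter keys are sorted in unsigned byte order and all have the same length (both are assumptions on my side, I have not verified them against the CarbonData code), each row can be looked up with a binary search over the filter keys instead of being compared against every key. That would bring the cost down from O(n*m) to O(n*log m). The class SortedKeyFilter and the methods filterRows and compareUnsigned below are hypothetical names used for illustration; they are not part of CarbonData, and the plain byte[] stands in for the result of getCompleteDataChunk().

```java
import java.util.BitSet;

// Hypothetical stand-alone sketch, not CarbonData code.
final class SortedKeyFilter {

  // Unsigned lexicographic comparison of data[offset .. offset+length) with key.
  static int compareUnsigned(byte[] data, int offset, int length, byte[] key) {
    for (int i = 0; i < length; i++) {
      int a = data[offset + i] & 0xFF;
      int b = key[i] & 0xFF;
      if (a != b) {
        return a - b;
      }
    }
    return 0;
  }

  // Sets bit j when row j equals one of the filter keys.
  // Assumes: sortedKeys is sorted ascending in unsigned byte order,
  // every key has exactly keyLength bytes, and data holds
  // numberOfRows fixed-length values laid out back to back.
  static BitSet filterRows(byte[] data, int keyLength, byte[][] sortedKeys,
      int numberOfRows) {
    BitSet bitSet = new BitSet(numberOfRows);
    for (int row = 0; row < numberOfRows; row++) {
      int offset = row * keyLength;
      int low = 0;
      int high = sortedKeys.length - 1;
      // Binary search for the current row value among the filter keys.
      while (low <= high) {
        int mid = (low + high) >>> 1;
        int cmp = compareUnsigned(data, offset, keyLength, sortedKeys[mid]);
        if (cmp == 0) {
          bitSet.set(row);
          break;
        } else if (cmp < 0) {
          high = mid - 1;
        } else {
          low = mid + 1;
        }
      }
    }
    return bitSet;
  }
}
```

For example, with four two-byte rows {0,1}, {0,5}, {0,3}, {0,9} and sorted filter keys {0,3}, {0,5}, only rows 1 and 2 are set in the returned BitSet. If the filter keys of a "between and" query always form one contiguous range, comparing each row against just the lowest and highest key might be cheaper still, but that also depends on how the keys are generated.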