"between and" filter query is very slow
Posted by simafengyun on Mar 03, 2017; 11:42am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/between-and-filter-query-is-very-slow-tp8257.html
Hi Dev,
I am using CarbonData version 0.2 on my local machine, and I found that "between and" filter queries are very slow.
The root cause is the code below in IncludeFilterExecuterImpl.java. The loop takes about 20s in my test.
Its time complexity is O(n*m), where n is the number of rows and m is the number of filter values. I think it needs to be optimized; please confirm. Thanks.
  private BitSet setFilterdIndexToBitSet(DimensionColumnDataChunk dimensionColumnDataChunk,
      int numerOfRows) {
    BitSet bitSet = new BitSet(numerOfRows);
    if (dimensionColumnDataChunk instanceof FixedLengthDimensionDataChunk) {
      FixedLengthDimensionDataChunk fixedDimensionChunk =
          (FixedLengthDimensionDataChunk) dimensionColumnDataChunk;
      byte[][] filterValues = dimColumnExecuterInfo.getFilterKeys();
      long start = System.currentTimeMillis();
      for (int k = 0; k < filterValues.length; k++) {
        for (int j = 0; j < numerOfRows; j++) {
          if (ByteUtil.UnsafeComparer.INSTANCE
              .compareTo(fixedDimensionChunk.getCompleteDataChunk(), j * filterValues[k].length,
                  filterValues[k].length, filterValues[k], 0, filterValues[k].length) == 0) {
            bitSet.set(j);
          }
        }
      }
      System.out.println("loop time: " + (System.currentTimeMillis() - start));
    }
    return bitSet;
  }
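
One possible direction, as a rough sketch only and not the actual CarbonData code: if the filter keys are sorted in unsigned byte order and all have the same length (both are assumptions here, not something confirmed for getFilterKeys()), each row can be looked up with a binary search over the filter keys instead of being compared against every key. That would bring the cost down from O(n*m) to O(n*log m). The class SortedFilterSketch and the helpers compareUnsigned and filterRows are hypothetical names for illustration, and a plain byte-by-byte comparison stands in for ByteUtil.UnsafeComparer.

```java
import java.util.BitSet;

// Hypothetical standalone sketch; not part of CarbonData.
public class SortedFilterSketch {

  // Unsigned lexicographic comparison of data[offset .. offset+len) with key[0 .. len).
  static int compareUnsigned(byte[] data, int offset, byte[] key, int len) {
    for (int i = 0; i < len; i++) {
      int a = data[offset + i] & 0xFF;
      int b = key[i] & 0xFF;
      if (a != b) {
        return a - b;
      }
    }
    return 0;
  }

  // Sets bit j when row j equals one of the filter keys.
  // Assumption: sortedKeys is sorted ascending in unsigned byte order
  // and every key has the same length as one row in data.
  static BitSet filterRows(byte[] data, int numberOfRows, byte[][] sortedKeys) {
    BitSet bitSet = new BitSet(numberOfRows);
    if (sortedKeys.length == 0) {
      return bitSet;
    }
    int keyLength = sortedKeys[0].length;
    for (int row = 0; row < numberOfRows; row++) {
      int lo = 0;
      int hi = sortedKeys.length - 1;
      // Binary search for the current row among the sorted filter keys.
      while (lo <= hi) {
        int mid = (lo + hi) >>> 1;
        int cmp = compareUnsigned(data, row * keyLength, sortedKeys[mid], keyLength);
        if (cmp == 0) {
          bitSet.set(row);
          break;
        } else if (cmp > 0) {
          lo = mid + 1;
        } else {
          hi = mid - 1;
        }
      }
    }
    return bitSet;
  }

  public static void main(String[] args) {
    // Four rows of two bytes each: (0,1), (0,5), (0,3), (0,9).
    byte[] data = {0, 1, 0, 5, 0, 3, 0, 9};
    // Filter keys for a range such as "between 3 and 5", already sorted.
    byte[][] keys = {{0, 3}, {0, 4}, {0, 5}};
    System.out.println(filterRows(data, 4, keys)); // prints {1, 2}
  }
}
```

If the filter keys really form one contiguous range, as they would for a plain "between and" predicate, comparing each row only against the smallest and largest key would be even cheaper, but that depends on how the filter values are generated.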