
Re:Re: "between and" filter query is very slow

Posted by simafengyun on Mar 07, 2017; 8:19am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/between-and-filter-query-is-very-slow-tp8257p8356.html



Hi Dev,


I created the JIRA issue CARBONDATA-748 a few days ago.
Today I fixed it for version 0.2 and opened a new pull request.
Please help confirm it. Thanks.

At 2017-03-03 20:47:51, "Kumar Vishal" <[hidden email]> wrote:

>Hi,
>
>Currently, in the include and exclude filter cases, when a dimension column
>does not have an inverted index, the code does a linear search. We can add a
>binary search when the data for that column is sorted. To get this
>information, we can check in the carbon table whether the user selected
>"no inverted index" for that column. If the user selected no inverted index
>when creating the column, this code is fine. If not, the data will be sorted,
>so we can add a binary search, which will improve performance.
>
>Please raise a Jira for this improvement.
>
>-Regards
>Kumar Vishal
>
>
>On Fri, Mar 3, 2017 at 7:42 PM, 马云 <[hidden email]> wrote:
>
>> Hi Dev,
>>
>>
>> I used CarbonData version 0.2 on my local machine and found that the
>> "between and" filter query is very slow.
>> The root cause is the code below, from IncludeFilterExecuterImpl.java.
>> It takes about 20s in my test.
>> The code's time complexity is O(n*m) (n rows times m filter values).
>> I think it needs to be optimized. Please confirm. Thanks.
>>
>>   private BitSet setFilterdIndexToBitSet(DimensionColumnDataChunk dimensionColumnDataChunk,
>>       int numerOfRows) {
>>     BitSet bitSet = new BitSet(numerOfRows);
>>     if (dimensionColumnDataChunk instanceof FixedLengthDimensionDataChunk) {
>>       FixedLengthDimensionDataChunk fixedDimensionChunk =
>>           (FixedLengthDimensionDataChunk) dimensionColumnDataChunk;
>>       byte[][] filterValues = dimColumnExecuterInfo.getFilterKeys();
>>
>>       long start = System.currentTimeMillis();
>>       for (int k = 0; k < filterValues.length; k++) {
>>         for (int j = 0; j < numerOfRows; j++) {
>>           if (ByteUtil.UnsafeComparer.INSTANCE
>>               .compareTo(fixedDimensionChunk.getCompleteDataChunk(), j * filterValues[k].length,
>>                   filterValues[k].length, filterValues[k], 0, filterValues[k].length) == 0) {
>>             bitSet.set(j);
>>           }
>>         }
>>       }
>>       System.out.println("loop time: " + (System.currentTimeMillis() - start));
>>     }
>>
>>     return bitSet;
>>   }
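
For reference, the binary search suggested in the reply above could look roughly like the sketch below. It is only an illustration under stated assumptions, not the actual CARBONDATA-748 patch: it assumes the chunk is a flat byte[] of fixed-width rows, already sorted in unsigned byte order, and the class and method names (SortedChunkFilterSketch, compareRow, bound, filterSorted) are hypothetical and do not exist in CarbonData. The bits it sets are positions in the sorted data; mapping them back to original row ids is left out.

```java
import java.util.BitSet;

// Hypothetical sketch: none of these names exist in CarbonData.
final class SortedChunkFilterSketch {

  // Unsigned lexicographic comparison of fixed-width row `row` against `key`.
  static int compareRow(byte[] data, int row, byte[] key) {
    int offset = row * key.length;
    for (int i = 0; i < key.length; i++) {
      int a = data[offset + i] & 0xFF;
      int b = key[i] & 0xFF;
      if (a != b) {
        return a - b;
      }
    }
    return 0;
  }

  // Returns the first row index whose value is >= key (strict == false)
  // or > key (strict == true). Assumes rows are sorted ascending.
  static int bound(byte[] data, int numberOfRows, byte[] key, boolean strict) {
    int low = 0;
    int high = numberOfRows;
    while (low < high) {
      int mid = (low + high) >>> 1;
      int cmp = compareRow(data, mid, key);
      if (cmp < 0 || (strict && cmp == 0)) {
        low = mid + 1;
      } else {
        high = mid;
      }
    }
    return low;
  }

  // For each filter value, find the run of equal rows with two binary
  // searches and set that whole range: O(m * log n) comparisons instead
  // of O(n * m), plus the cost of setting the matching bits.
  static BitSet filterSorted(byte[] data, int numberOfRows, byte[][] filterValues) {
    BitSet bitSet = new BitSet(numberOfRows);
    for (byte[] key : filterValues) {
      int first = bound(data, numberOfRows, key, false);
      int end = bound(data, numberOfRows, key, true);
      if (first < end) {
        bitSet.set(first, end);
      }
    }
    return bitSet;
  }

  public static void main(String[] args) {
    // Five sorted rows of width 2 holding the values 1, 3, 3, 7, 9.
    byte[] data = {0, 1, 0, 3, 0, 3, 0, 7, 0, 9};
    byte[][] filterValues = {{0, 3}, {0, 9}, {0, 4}};
    System.out.println(filterSorted(data, 5, filterValues)); // prints {1, 2, 4}
  }
}
```

With the sample data, the value 3 matches rows 1 and 2, the value 9 matches row 4, and the value 4 matches nothing. For a column created with no inverted index the data is not sorted, so the original linear scan would still be needed there.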