[jira] [Created] (CARBONDATA-748) "between and" filter query is very slow


Akash R Nilugal (Jira)
Jarck created CARBONDATA-748:
--------------------------------

             Summary: "between and" filter query is very slow
                 Key: CARBONDATA-748
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-748
             Project: CarbonData
          Issue Type: Improvement
            Reporter: Jarck


Hi,

Currently, for include and exclude filters, a linear search is performed
when the dimension column does not have an inverted index. We can add a
binary search for the case where the data for that column is sorted. To
find out whether it is sorted, we can check in the carbon table whether the
user selected "no inverted index" for that column. If the user selected no
inverted index when creating the column, the current code is fine; if not,
the data will be sorted, so we can add a binary search, which will improve
performance.
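
A minimal sketch of the binary-search idea, assuming the fixed-length keys
are stored back to back in one byte array in ascending unsigned byte order.
The class and method names below are hypothetical and are not part of
CarbonData. For each filter key, two binary searches locate the start and
end of the run of matching rows, so the cost is roughly O(m log n) instead
of O(n*m).

```java
import java.util.BitSet;

// Hypothetical helper for illustration only; not CarbonData code.
final class SortedColumnFilterSketch {

  // Compares the fixed-length key stored at row `row` with `key`, byte by byte, unsigned.
  static int compareRow(byte[] data, int row, byte[] key) {
    int offset = row * key.length;
    for (int i = 0; i < key.length; i++) {
      int a = data[offset + i] & 0xFF;
      int b = key[i] & 0xFF;
      if (a != b) {
        return a < b ? -1 : 1;
      }
    }
    return 0;
  }

  // Returns the first row whose key is >= `key` (strict == false)
  // or > `key` (strict == true); returns numberOfRows if there is none.
  static int bound(byte[] data, int numberOfRows, byte[] key, boolean strict) {
    int low = 0;
    int high = numberOfRows;
    while (low < high) {
      int mid = (low + high) >>> 1;
      int cmp = compareRow(data, mid, key);
      if (cmp < 0 || (strict && cmp == 0)) {
        low = mid + 1;
      } else {
        high = mid;
      }
    }
    return low;
  }

  // Sets the bit of every row whose key equals one of the filter values.
  // Assumes `data` is sorted and every filter value has the column's key length.
  static BitSet filterSorted(byte[] data, int numberOfRows, byte[][] filterValues) {
    BitSet bitSet = new BitSet(numberOfRows);
    for (byte[] filterValue : filterValues) {
      int first = bound(data, numberOfRows, filterValue, false);
      int end = bound(data, numberOfRows, filterValue, true);
      bitSet.set(first, end); // half-open range; does nothing when first == end
    }
    return bitSet;
  }
}
```

For example, with one-byte keys 1, 2, 2, 2, 5, 7 and filter values 2, 7 and
4, this sets the bits for rows 1, 2, 3 and 5. When the column was created
with no inverted index, the data is not sorted and this sketch does not
apply.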

Please raise a Jira for this improvement.

-Regards
Kumar Vishal


On Fri, Mar 3, 2017 at 7:42 PM, 马云 <[hidden email]> wrote:

Hi Dev,


I used CarbonData version 0.2 on my local machine and found that the
"between and" filter query is very slow.
The root cause is the code below in IncludeFilterExecuterImpl.java.
It takes about 20s in my test.
The code's time complexity is O(n*m), where n is the number of rows and m
is the number of filter values. I think it needs to be optimized; please
confirm. Thanks.





 private BitSet setFilterdIndexToBitSet(DimensionColumnDataChunk dimensionColumnDataChunk,
     int numerOfRows) {
   BitSet bitSet = new BitSet(numerOfRows);
   if (dimensionColumnDataChunk instanceof FixedLengthDimensionDataChunk) {
     FixedLengthDimensionDataChunk fixedDimensionChunk =
         (FixedLengthDimensionDataChunk) dimensionColumnDataChunk;
     byte[][] filterValues = dimColumnExecuterInfo.getFilterKeys();

     long start = System.currentTimeMillis();
     for (int k = 0; k < filterValues.length; k++) {
       for (int j = 0; j < numerOfRows; j++) {
         if (ByteUtil.UnsafeComparer.INSTANCE
             .compareTo(fixedDimensionChunk.getCompleteDataChunk(), j * filterValues[k].length,
                 filterValues[k].length, filterValues[k], 0, filterValues[k].length) == 0) {
           bitSet.set(j);
         }
       }
     }
     System.out.println("loop time: " + (System.currentTimeMillis() - start));
   }
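
One possible way to avoid the nested loop when the column data is not
sorted is sketched below. It puts the filter keys into a hash set and then
makes a single pass over the rows, for an expected cost of about O(n + m).
This is a swapped-in technique for illustration, not the fix adopted by the
project. The class name is hypothetical, and the sketch assumes all filter
keys have the same fixed length as the column's keys.

```java
import java.nio.ByteBuffer;
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not CarbonData code.
final class HashedIncludeFilterSketch {

  // Sets the bit of every row whose fixed-length key equals one of the filter values.
  static BitSet filter(byte[] data, int numberOfRows, byte[][] filterValues) {
    BitSet bitSet = new BitSet(numberOfRows);
    if (filterValues.length == 0) {
      return bitSet;
    }
    // Assumption: every filter key has the column's fixed key length.
    int keyLength = filterValues[0].length;

    // ByteBuffer.equals and hashCode depend only on the remaining bytes,
    // so wrapped slices can serve as hash keys without copying.
    Set<ByteBuffer> keys = new HashSet<>();
    for (byte[] filterValue : filterValues) {
      keys.add(ByteBuffer.wrap(filterValue));
    }

    // Single pass over the rows: one hash lookup per row.
    for (int j = 0; j < numberOfRows; j++) {
      if (keys.contains(ByteBuffer.wrap(data, j * keyLength, keyLength))) {
        bitSet.set(j);
      }
    }
    return bitSet;
  }
}
```

The sketch allocates one small ByteBuffer wrapper per row, which trades
some garbage for simplicity; a production version would likely avoid that.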






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)