Posted by
bill.zhou on
Mar 14, 2017; 3:10pm
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/DISCUSS-add-more-index-for-sort-columns-tp8891.html
hi all
Carbon will add min/max index for sort columns which used for better filter query. So can we add more index for the sort column to make filter faster.
This is one idea which I get from anther database design.
For example this is one student, and the column: score in the student table which will be sorted column. And the score range is from 1 to 100. The table as following:
id | name | score |
---|
1 | bill001 | 83 |
2 | bill002 | 84 |
3 | bill003 | 90 |
4 | bill004 | 89 |
5 | bill005 | 93 |
6 | bill006 | 76 |
7 | bill007 | 87 |
8 | bill008 | 90 |
9 | bill009 | 89 |
10 | bill010 | 96 |
11 | bill011 | 96 |
12 | bill012 | 100 |
13 | bill013 | 84 |
14 | bill014 | 90 |
15 | bill015 | 79 |
16 | bill016 | 1 |
17 | bill017 | 97 |
18 | bill018 | 79 |
19 | bill019 | 88 |
20 | bill068 | 95 |
After load the data into Cabron the score column will sort as following:
1 | 76 | 79 | 79 | 83 | 84 | 84 | 87 | 88 | 89 | 89 | 90 | 90 | 90 | 93 | 95 | 96 | 96 | 97 | 100 |
the min/max index is 1/100.
So for the query as following will take all the block data.
query1:select sum(score) from student when score score > 90
query2:select sum(score) from student when score score > 60 and score < 70.
Following two suggestion to reduce the block scan.
Suggestion 1: according the score range to divide into multiple small range for example 4:

0: meas this block has not the score rang value
1: meas this block has the score rang value
If add this index, for the query1 only need scan 1/4 data of the block and query2 no need scan any data, directly sckip this block
Suggestion 2: record more min/max for the score, for example every 5 rows record one min/max

If add this index for query1 only need scan 1/2 data of the block and query2 only need scan 1/4 data of the block
this is the raw idea, please Jacky, Ravindra and liang correct it whether we can add this feature. thanks
Regards
Bill