
Re: [Issue] Bloomfilter datamap

Posted by xuchuanyin on Sep 25, 2018; 8:49am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Issue-Bloomfilter-datamap-tp63254p63364.html

Hi, arron.
Actually, your query will not use the time series datamap, because the filter
uses the field 'product_id', which is not contained in your preagg datamap.
Even after I removed the preagg datamap, the query with the bloomfilter
datamap still failed with the same error logs as those in your post.

Then I added some logging in `BloomCoarseGrainDataMap.createQueryModel` to
print the input parameter 'expression', and found the root cause.

CarbonData parses the query and, in DataMapChooser, combines the filters into
a tree, which contains an expression 'TRUE'.
This expression is a 'TrueExpression'. It has two children: the left is NULL
and the right is a LiteralExpression.
BloomFilterDataMap then tries to decompose the expression and applies
`createQueryModel` to each child expression recursively.
So eventually it hits an NPE when the function is applied to the NULL child.
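
To make the failure mode concrete, here is a minimal sketch of it. The classes `Expr` and `ExprWalker` are hypothetical stand-ins written for illustration; they are not CarbonData's real expression classes and this is not the actual `createQueryModel` code. It only shows how recursing into every child without a null check fails on a node whose left child is NULL, and how a guard avoids that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a filter expression node (not CarbonData's class).
final class Expr {
    final String name;
    final List<Expr> children;

    Expr(String name, Expr... children) {
        this.name = name;
        // Arrays.asList tolerates null elements, so a NULL child can be modelled.
        this.children = Arrays.asList(children);
    }
}

final class ExprWalker {
    // Mirrors the failing pattern: recurse into every child with no null check.
    static void collectUnsafe(Expr e, List<String> out) {
        out.add(e.name); // throws NullPointerException when e is null
        for (Expr child : e.children) {
            collectUnsafe(child, out);
        }
    }

    // Defensive variant: skip null children instead of dereferencing them.
    static void collectSafe(Expr e, List<String> out) {
        if (e == null) {
            return;
        }
        out.add(e.name);
        for (Expr child : e.children) {
            collectSafe(child, out);
        }
    }
}
```

Building a tree such as `new Expr("TRUE", (Expr) null, new Expr("LITERAL"))` and passing it to `collectUnsafe` raises the NPE, while `collectSafe` simply skips the missing child.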

I'm not sure why the TrueExpression is there, but I'm sure it is the
DataMapChooser that causes this problem.

Actually, about one month ago we wanted to merge another optimization for
datamap pruning. The current DataMapChooser forwards too many expressions to
the datamap, even if the datamap does not support them. We will optimize
this by forwarding only supported expressions to the datamap.
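
As a rough illustration of "forward only supported expressions", the sketch below prunes a filter tree before handing it to an index. Everything in it is an assumption made for illustration: `FilterNode`, `SupportedFilter`, and the rule that only equality on an indexed column counts as supported are hypothetical, and this is not the code in the PR. Dropping one side of an AND still leaves a valid (looser) pruning condition, whereas an OR can only be forwarded if both sides are supported.

```java
import java.util.Set;

// Hypothetical filter node: a leaf predicate on one column, or AND/OR of two children.
final class FilterNode {
    final String op;       // "AND", "OR", or a leaf operator such as "EQUALS"
    final String column;   // set for leaves only, may be null
    final FilterNode left;
    final FilterNode right;

    private FilterNode(String op, String column, FilterNode left, FilterNode right) {
        this.op = op;
        this.column = column;
        this.left = left;
        this.right = right;
    }

    static FilterNode leaf(String op, String column) {
        return new FilterNode(op, column, null, null);
    }

    static FilterNode and(FilterNode l, FilterNode r) {
        return new FilterNode("AND", null, l, r);
    }

    static FilterNode or(FilterNode l, FilterNode r) {
        return new FilterNode("OR", null, l, r);
    }
}

final class SupportedFilter {
    // Returns the part of the filter that may be forwarded, or null if none can be.
    static FilterNode keepSupported(FilterNode n, Set<String> indexedColumns) {
        if (n == null) {
            return null;
        }
        if (n.op.equals("AND")) {
            FilterNode l = keepSupported(n.left, indexedColumns);
            FilterNode r = keepSupported(n.right, indexedColumns);
            if (l == null) return r;   // dropping one AND branch only loosens pruning
            if (r == null) return l;
            return FilterNode.and(l, r);
        }
        if (n.op.equals("OR")) {
            FilterNode l = keepSupported(n.left, indexedColumns);
            FilterNode r = keepSupported(n.right, indexedColumns);
            // An OR is usable only when both branches are usable.
            return (l == null || r == null) ? null : FilterNode.or(l, r);
        }
        // Leaf: assumed supported only for equality on an indexed column.
        boolean supported = n.op.equals("EQUALS")
            && n.column != null
            && indexedColumns.contains(n.column);
        return supported ? n : null;
    }
}
```

With `product_id` as the only indexed column, a filter like `product_id = ? AND TRUE` would be reduced to just the equality leaf, so the index never sees the expression it cannot handle.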

You can apply PR#2665 and test it again. I've verified it, and the query
works now.

Please give your feedback once you have a result.



