Hi Community,
Bloom datamap has been implemented for a while at blocklet level. One problem with the bloom datamap is that pruning is done on the driver side, and caching the bloom index data is expensive. So we are proposing to build a bloom filter inside the carbondata file at page level, so that page skipping can be done on the executor side. This has no conflict with the bloom datamap.

A draft of the implementation could be like this:
1. Setting: create table, specifying in the table properties which columns need a page-level bloom (similar to local dictionary).
2. Building: when loading, build the bloom bitmaps for these columns in the same way MIN_MAX is collected.
3. Writing: write the bitmaps into DataChunk3. Learning from the bloom datamap, we also compress the bitmaps for storage and IO performance.
4. Usage: in IncludeFilterExecuterImpl.prunePages, prune pages with the bloom if not all pages are pruned by MIN_MAX, then intersect the bitset of selected pages (see the sketch below).

A rough test shows this can skip many pages in some query scenarios (e.g. a phone-number filter query, which easily hits a blocklet/page).

Please give your inputs if you have any idea or suggestion.

Thanks & Regards,
Manhua
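A minimal sketch of step 4, assuming a hypothetical PageBloom structure decompressed from DataChunk3; the helper names and layout are illustrative only, not the actual CarbonData API:

```java
import java.util.BitSet;

// Illustrative only: intersect the MIN_MAX page bitset with a per-page bloom
// membership test. PageBloom and its decompress() are assumed structures.
public class PageBloomPruneSketch {

  /** Pages already selected by MIN_MAX; the bloom can only clear bits, never set them. */
  public BitSet prunePagesWithBloom(BitSet pagesSelectedByMinMax,
                                    byte[][] compressedPageBlooms,
                                    byte[] filterKeyBytes) {
    BitSet result = (BitSet) pagesSelectedByMinMax.clone();
    for (int page = result.nextSetBit(0); page >= 0; page = result.nextSetBit(page + 1)) {
      PageBloom bloom = PageBloom.decompress(compressedPageBlooms[page]);
      if (!bloom.mightContain(filterKeyBytes)) {
        result.clear(page);   // definitely not in this page: skip decompress/decode/scan
      }
    }
    return result;
  }

  /** Placeholder for the page-level bloom proposed to be stored in DataChunk3. */
  interface PageBloom {
    boolean mightContain(byte[] key);
    static PageBloom decompress(byte[] compressed) {
      throw new UnsupportedOperationException("illustrative placeholder");
    }
  }
}
```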
Hi Manhua,
+1 for this feature.

One question: since one column chunk in one blocklet is carbon's minimum IO unit, why not create the bloom filter at blocklet level? If it is at page level, we still need to read the page data into memory; the saving is only the decompression.

Regards,
Jacky
Hi Jacky,
If we create the bloom filter at blocklet level, it may be too similar to the bloom datamap and face the same problems the bloom datamap faces, except that the pruning runs on the executor side.

Page level is preferred since the page size is KNOWN, and this lets us avoid working out how many bits the bloom bitmap needs; only the FPP needs to be set.

I checked, and the problem you mentioned does exist. It is also a problem when pruning pages by page min/max: even if min/max decides that a page does not need to be scanned, the current query logic has already loaded both DataChunk3 and the column pages, so the IO for the column pages is wasted. Should we change this first? Is it worth splitting one IO operation into two?

Anyone interested in this part is welcome to share your ideas too.

Thanks,
Manhua
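For illustration only (not part of the proposal text): because a page holds a known maximum number of rows (32,000 in the default count-based configuration), the bloom size follows directly from the chosen FPP via the standard sizing formulas; the FPP value below is an example:

```java
// Standard bloom sizing for a fixed page row count and a chosen FPP.
// 32000 is the default count-based page size; the 1% FPP is illustrative.
public class PageBloomSizing {
  public static void main(String[] args) {
    int n = 32000;      // rows per page (known upper bound on distinct values)
    double p = 0.01;    // false-positive probability chosen by the user

    // m = -n * ln(p) / (ln 2)^2 bits,   k = (m / n) * ln 2 hash functions
    double m = -n * Math.log(p) / (Math.log(2) * Math.log(2));
    double k = (m / n) * Math.log(2);

    System.out.printf("bits: %.0f (~%.1f KB), hash functions: %.1f%n",
        m, m / 8 / 1024, k);
  }
}
```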
In reply to this post by Manhua Jiang
+1 for this feature.
Additionally, your draft describes "specify the bloom columns using table properties". I recommend that, for the first phase, we do not use this information from the table properties while querying. We can store the index information at blocklet (or page) level and use it when we process that blocklet (or page). Later on, we can automatically decide to build the corresponding index for some columns or some pages on the fly, without the user's help.
In reply to this post by Manhua Jiang
On 2019/11/05 02:30:30, Manhua Jiang <[hidden email]> wrote:
> Although min/max may decide that a page does not need to be scanned, the current query logic has already loaded both DataChunk3 and the column pages, so the IO for the column pages is wasted. Should we change this first? Is it worth splitting one IO operation into two?

In my opinion, yes. We should leverage DataChunk3 and check whether the column pages are needed before reading them. This can reduce IO dramatically for some use cases, for example highly selective filter queries.
In reply to this post by xuchuanyin
Hi xuchuanyin,
Thanks for your inputs. The table properties will not be used while querying. We will apply the page-level bloom filter only when both of the following conditions are met (see the sketch below):
1. The query runs into IncludeFilterExecuter for a column.
2. The page bloom of that column is stored in the carbondata file.

Automatically choosing which columns to build the page bloom for is a good idea, but Carbon would first need to be designed to collect that information from usage history.
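A minimal sketch of that check, with placeholder types standing in for the real filter executor and column chunk metadata; the hasPageBloom accessor is an assumption of this proposal, not an existing API:

```java
// Placeholder types; in the real flow these would be the filter executor and
// the column chunk metadata read from DataChunk3.
interface ColumnChunkMeta {
  boolean hasPageBloom();   // assumed accessor for the proposed page bloom
}

class PageBloomGuard {
  /** Apply the page bloom only for include filters on columns that stored one. */
  static boolean canUsePageBloom(boolean isIncludeFilter, ColumnChunkMeta meta) {
    return isIncludeFilter && meta.hasPageBloom();
  }
}
```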
In reply to this post by Jacky Li-3
To my understanding, only the IO of the *filter columns'* column pages is saved if we do this, on the condition that min/max or the page bloom decides we can *skip* scanning those pages.
In reply to this post by Manhua Jiang
Hi Manhua,
The main problem with this approach is that we cannot save any IO, because our IO unit is the blocklet, not the page. Once the data is already in memory, I really don't think we can gain much performance with a bloom at page level. I feel the solution is efficient only if IO is saved somewhere.

Our min/max index is efficient because it can prune files on the driver side and prune blocklets and pages on the executor side. It actually saves a lot of IO.

Supporting bloom at the carbondata file and index level is a better approach than supporting it only at page level. My intention is that it should behave just like the min/max index, so that we can prune the data at multiple levels.

On the driver side, at block level, we can have a bloom with a lower pruning probability and fewer hash functions to control its size, as we load it into memory. At blocklet level we can increase the probability and hashes a little more for better pruning, and at page level we can increase the probability further for a much better pruning ability.

Regards,
Ravindra.
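One possible reading of the multi-level idea (an interpretation, not part of the thread): keep a separate bloom per level with a different target false-positive rate, loose at block level so the driver-side index stays small, and tighter at blocklet and page level where it is read on the executor. The sketch below uses Guava's BloomFilter; the counts and FPP values are illustrative only:

```java
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.nio.charset.StandardCharsets;

public class MultiLevelBloomSketch {
  // Looser FPP => fewer bits/hashes => smaller filter (fits driver memory at block level);
  // tighter FPP => more bits/hashes => better pruning (affordable at blocklet/page level).
  static final double BLOCK_FPP    = 0.10;
  static final double BLOCKLET_FPP = 0.03;
  static final double PAGE_FPP     = 0.01;

  public static void main(String[] args) {
    BloomFilter<CharSequence> blockBloom =
        BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, BLOCK_FPP);
    BloomFilter<CharSequence> blockletBloom =
        BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 120_000, BLOCKLET_FPP);
    BloomFilter<CharSequence> pageBloom =
        BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 32_000, PAGE_FPP);

    blockBloom.put("13800138000");
    blockletBloom.put("13800138000");
    pageBloom.put("13800138000");

    // A query value must pass every level before a page is actually scanned.
    String probe = "13900000000";
    boolean maybeScanPage = blockBloom.mightContain(probe)
        && blockletBloom.mightContain(probe)
        && pageBloom.mightContain(probe);
    System.out.println("scan page? " + maybeScanPage);
  }
}
```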
Hi Ravindra,
Yes, you're right. I did a test this morning with nmon and found that IO did not decrease, but CPU did. The time gap between running with and without the page bloom likely comes from decompressing and decoding the column pages, and from the number of rows Spark needs to process.

For blooms at different levels, we can do an `OR` operation on the page bitmaps to get the blocklet level, and similarly to get the block level (see the sketch below).
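A small sketch of that roll-up, assuming every page bloom uses the same bit length and hash functions (otherwise the OR is not a valid bloom); the resulting blocklet/block bloom has a higher effective FPP:

```java
import java.util.BitSet;
import java.util.List;

// Roll page-level blooms up to coarser levels by bitwise OR. This is only valid
// when all blooms share the same bit length and hash functions.
public class BloomRollupSketch {

  static BitSet union(List<BitSet> blooms) {
    BitSet merged = new BitSet();
    for (BitSet bloom : blooms) {
      merged.or(bloom);   // union of same-shaped blooms == bitwise OR of their bit arrays
    }
    return merged;
  }

  static BitSet blockletBloom(List<BitSet> pageBlooms)   { return union(pageBlooms); }
  static BitSet blockBloom(List<BitSet> blockletBlooms)  { return union(blockletBlooms); }
}
```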
In reply to this post by ravipesala
I agree with Ravindra that having a bloom filter at page level would not save any IO. Having a bloom filter at file level makes sense, as it could help to prune files on the driver side. But I am concerned about the number of false positives that would result if we keep a bloom filter for an entire file. I think we need to experiment to find the ideal parameters (bloom size and number of hash functions) that would work effectively for a file-level bloom filter.

Regards,
Vimal
Hi Vimal,
Regarding your concern: if you have tried the bloom datamap, you may know how difficult it is to configure the bloom parameters. You never know how many (distinct) elements will be added to the bloom filter, because a blocklet is configured by size. The more bytes a row takes, the fewer rows are added to a blocklet. And at block level this is related to the block size configuration too. Also, please mind the size of the bloom filter.
Hi Manhua,
I agree with Ravindra and Vimal that adding a page-level bloom will not improve query performance much, because it will not reduce the amount of data read from disk. It will only reduce some processing time (decompressing the pages and applying the filter on them). Keeping the bloom information in the file footer will help to reduce IO + processing.

I agree with you that finding the distinct count for a column in a blocklet is complex, because a blocklet is based on size, not on rows. Pages also have a size-based configuration, which is false by default now, but we are planning to make it true to support huge binary/varchar/complex data.

Sol1. We can ask the user to pass the cardinality of the columns for which they want to generate the bloom.
Sol2. Once the blocklet cut is done while writing the carbondata file, we can calculate the cardinality of the columns for which the user wants to generate the bloom. If the size is too big, we can drop the bloom for that blocklet (see the sketch below).

I agree with Ravi that keeping different FPPs for the executor and driver will help to reduce the size.

-Regards
Kumar Vishal
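A rough sketch of Sol2 under stated assumptions: count the distinct values of the bloom column while the blocklet is being built, size the bloom from that count and a target FPP, and drop it if it exceeds a budget. The standard sizing formula m = -n*ln(p)/(ln 2)^2 is used; the FPP and byte budget are illustrative:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of Sol2 (assumed flow, not existing CarbonData code): after the blocklet
// cut, decide whether a blocklet-level bloom fits within a size budget.
public class BlockletBloomSizer {
  private static final double TARGET_FPP = 0.01;           // illustrative
  private static final long MAX_BLOOM_BYTES = 64 * 1024;   // illustrative budget

  private final Set<String> distinctValues = new HashSet<>();

  /** Called for every value of the bloom column added to the current blocklet. */
  public void collect(String value) {
    distinctValues.add(value);   // exact count; a HyperLogLog sketch could replace this
  }

  /** Standard bloom sizing: m = -n * ln(p) / (ln 2)^2 bits. */
  public long requiredBits() {
    int n = Math.max(1, distinctValues.size());
    return (long) Math.ceil(-n * Math.log(TARGET_FPP) / (Math.log(2) * Math.log(2)));
  }

  /** Drop the bloom for this blocklet if it would exceed the budget. */
  public boolean shouldWriteBloom() {
    return requiredBits() / 8 <= MAX_BLOOM_BYTES;
  }
}
```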
In reply to this post by Manhua-2
Hi Manhua,
Even at page level, the row count will probably not be available from the next version onwards; it will be decided by size, not by count. The code has already been merged, and we are keeping the count-based page configuration temporarily for backward compatibility. So at no point will we have the cardinality of the column beforehand: either we need to estimate the count from history or take an approximate value from the user.

One more thing: the generation of the bloom should follow the datamap interfaces, not go into the min/max generation flow. But we can change the datamap interfaces to add the generated datamap index to the carbondata file and index file instead of separate files. Otherwise, we will lose the index interface capabilities and make our code complex. This was already discussed earlier with Jacky. @Jacky please comment on it.

Regards,
Ravindra.
In reply to this post by kumarvishal09
Hi Vishal,
I want to ask a question: to support huge binary/varchar/complex data, will the row count in a page be larger or smaller than 32000?

Thanks.
In reply to this post by ravipesala
Hi,
Since the new bloom filter integration is designed to work on the executor side, in terms of keeping it simple I actually prefer to keep it inside the data file itself instead of in a separate index file, so that on the executor side reading one file is enough. And it is always better if we can follow the datamap interface.

Regards,
Jacky
Hi Manhua,
Page size will be smaller than or equal to 32k, but the point here is how much IO benefit we get during the query. Since you are adding the bloom information inside the page metadata, it may reduce processing time at the cost of a little extra IO. So it is better to keep the bloom information at blocklet level in the footer.

-Regards
Kumar Vishal