Hi Community,
Bloom datamap has been implemented for a while at blocklet level. One problem with the bloom datamap is that pruning is done on the driver side, and caching the bloom index data is expensive. So we are proposing to build a bloom filter inside the carbondata file at page level, so that page skipping can be done on the executor side. This has no conflict with the bloom datamap.

A draft of the implementation could be like this:
1. Setting: create table, specifying in the table properties which columns need a page-level bloom (similar to local dictionary).
2. Building: when loading, build the bloom bitmaps for these columns in the same way MIN_MAX is collected.
3. Writing: write the bitmaps into DataChunk3. Learning from the bloom datamap, we also compress the bitmaps for storage and IO performance.
4. Usage: in IncludeFilterExecuterImpl.prunePages, prune pages with the bloom if not all pages are pruned by MIN_MAX, then intersect the bitset of selected pages (see the sketch below).

A rough test shows this can skip many pages in some query scenarios (e.g. a phone-number filter query, which easily hits a blocklet/page).

Please give your inputs if you have any idea or suggestion.

Thanks & Regards,
Manhua
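A minimal sketch of step 4, assuming a hypothetical PageBloom structure decompressed from DataChunk3; the helper names and layout are illustrative only, not the actual CarbonData API:

```java
import java.util.BitSet;

// Illustrative only: intersect the MIN_MAX page bitset with a per-page bloom
// membership test. PageBloom and its decompress() are assumed structures.
public class PageBloomPruneSketch {

  /** Pages already selected by MIN_MAX; the bloom can only clear bits, never set them. */
  public BitSet prunePagesWithBloom(BitSet pagesSelectedByMinMax,
                                    byte[][] compressedPageBlooms,
                                    byte[] filterKeyBytes) {
    BitSet result = (BitSet) pagesSelectedByMinMax.clone();
    for (int page = result.nextSetBit(0); page >= 0; page = result.nextSetBit(page + 1)) {
      PageBloom bloom = PageBloom.decompress(compressedPageBlooms[page]);
      if (!bloom.mightContain(filterKeyBytes)) {
        result.clear(page);   // definitely not in this page: skip decompress/decode/scan
      }
    }
    return result;
  }

  /** Placeholder for the page-level bloom proposed to be stored in DataChunk3. */
  interface PageBloom {
    boolean mightContain(byte[] key);
    static PageBloom decompress(byte[] compressed) {
      throw new UnsupportedOperationException("illustrative placeholder");
    }
  }
}
```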
Hi Manhua,
+1 for this feature.

One question: since one column chunk in one blocklet is carbon's minimum IO unit, why not create the bloom filter at blocklet level? If it is at page level, we still need to read the page data into memory; the saving is only the decompression.

Regards,
Jacky
Hi Jacky,
If we create the bloom filter at blocklet level, it may be too similar to the bloom datamap and face the same problems the bloom datamap faces, except that the pruning runs on the executor side.

Page level is preferred since the page size is KNOWN, and this lets us avoid working out how many bits the bloom bitmap needs; only the FPP needs to be set.

I checked, and the problem you mentioned does exist. It is also a problem when pruning pages by page min/max: even if min/max decides that a page does not need to be scanned, the current query logic has already loaded both DataChunk3 and the column pages, so the IO for the column pages is wasted. Should we change this first? Is it worth splitting one IO operation into two?

Anyone interested in this part is welcome to share your ideas too.

Thanks,
Manhua
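For illustration only (not part of the proposal text): because a page holds a known maximum number of rows (32,000 in the default count-based configuration), the bloom size follows directly from the chosen FPP via the standard sizing formulas; the FPP value below is an example:

```java
// Standard bloom sizing for a fixed page row count and a chosen FPP.
// 32000 is the default count-based page size; the 1% FPP is illustrative.
public class PageBloomSizing {
  public static void main(String[] args) {
    int n = 32000;      // rows per page (known upper bound on distinct values)
    double p = 0.01;    // false-positive probability chosen by the user

    // m = -n * ln(p) / (ln 2)^2 bits,   k = (m / n) * ln 2 hash functions
    double m = -n * Math.log(p) / (Math.log(2) * Math.log(2));
    double k = (m / n) * Math.log(2);

    System.out.printf("bits: %.0f (~%.1f KB), hash functions: %.1f%n",
        m, m / 8 / 1024, k);
  }
}
```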
In reply to this post by Manhua Jiang
+1 for this feature.
Additionally, your draft describes "specify the bloom columns using table properties". I recommend that, for the first phase, we do not use this information from the table properties while querying. We can store the index information at blocklet (or page) level and use it when we process that blocklet (or page). Later on, we can automatically decide to build the corresponding index for some columns or some pages on the fly, without the user's help.
In reply to this post by Manhua Jiang
On 2019/11/05 02:30:30, Manhua Jiang <[hidden email]> wrote:
> Although min/max may decide that a page does not need to be scanned, the current query logic has already loaded both DataChunk3 and the column pages, so the IO for the column pages is wasted. Should we change this first? Is it worth splitting one IO operation into two?

In my opinion, yes. We should leverage DataChunk3 and check whether the column pages are needed before reading them. This can reduce IO dramatically for some use cases, for example highly selective filter queries.
In reply to this post by xuchuanyin
Hi xuchuanyin,
Thanks for your inputs. The table properties will not be used while querying. We will apply the page-level bloom filter only when both of the following conditions are met (see the sketch below):
1. The query runs into IncludeFilterExecuter for a column.
2. The page bloom of that column is stored in the carbondata file.

Automatically choosing which columns to build the page bloom for is a good idea, but Carbon would first need to be designed to collect that information from usage history.
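A minimal sketch of that check, with placeholder types standing in for the real filter executor and column chunk metadata; the hasPageBloom accessor is an assumption of this proposal, not an existing API:

```java
// Placeholder types; in the real flow these would be the filter executor and
// the column chunk metadata read from DataChunk3.
interface ColumnChunkMeta {
  boolean hasPageBloom();   // assumed accessor for the proposed page bloom
}

class PageBloomGuard {
  /** Apply the page bloom only for include filters on columns that stored one. */
  static boolean canUsePageBloom(boolean isIncludeFilter, ColumnChunkMeta meta) {
    return isIncludeFilter && meta.hasPageBloom();
  }
}
```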
In reply to this post by Jacky Li-3
To my understanding, only the IO of the *filter columns'* column pages is saved if we do this, on the condition that min/max or the page bloom decides we can *skip* scanning those pages.
In reply to this post by Manhua Jiang
Hi Manhua,
The main problem with this approach is that we cannot save any IO, because our IO unit is the blocklet, not the page. Once the data is already in memory, I really don't think we can gain much performance with a bloom at page level. I feel the solution is efficient only if IO is saved somewhere.

Our min/max index is efficient because it can prune files on the driver side and prune blocklets and pages on the executor side. It actually saves a lot of IO.

Supporting bloom at the carbondata file and index level is a better approach than supporting it only at page level. My intention is that it should behave just like the min/max index, so that we can prune the data at multiple levels.

On the driver side, at block level, we can have a bloom with a lower pruning probability and fewer hash functions to control its size, as we load it into memory. At blocklet level we can increase the probability and hashes a little more for better pruning, and at page level we can increase the probability further for a much better pruning ability.

Regards,
Ravindra.
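One possible reading of the multi-level idea (an interpretation, not part of the thread): keep a separate bloom per level with a different target false-positive rate, loose at block level so the driver-side index stays small, and tighter at blocklet and page level where it is read on the executor. The sketch below uses Guava's BloomFilter; the counts and FPP values are illustrative only:

```java
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.nio.charset.StandardCharsets;

public class MultiLevelBloomSketch {
  // Looser FPP => fewer bits/hashes => smaller filter (fits driver memory at block level);
  // tighter FPP => more bits/hashes => better pruning (affordable at blocklet/page level).
  static final double BLOCK_FPP    = 0.10;
  static final double BLOCKLET_FPP = 0.03;
  static final double PAGE_FPP     = 0.01;

  public static void main(String[] args) {
    BloomFilter<CharSequence> blockBloom =
        BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, BLOCK_FPP);
    BloomFilter<CharSequence> blockletBloom =
        BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 120_000, BLOCKLET_FPP);
    BloomFilter<CharSequence> pageBloom =
        BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 32_000, PAGE_FPP);

    blockBloom.put("13800138000");
    blockletBloom.put("13800138000");
    pageBloom.put("13800138000");

    // A query value must pass every level before a page is actually scanned.
    String probe = "13900000000";
    boolean maybeScanPage = blockBloom.mightContain(probe)
        && blockletBloom.mightContain(probe)
        && pageBloom.mightContain(probe);
    System.out.println("scan page? " + maybeScanPage);
  }
}
```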
Hi Ravindra,
Yes, you're right. I did a test this morning with nmon and found that IO did not decrease, but CPU did. The time gap between running with and without the page bloom likely comes from decompressing and decoding the column pages, and from the number of rows Spark needs to process.

For blooms at different levels, we can do an `OR` operation on the page bitmaps to get the blocklet level, and similarly to get the block level (see the sketch below).
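A small sketch of that roll-up, assuming every page bloom uses the same bit length and hash functions (otherwise the OR is not a valid bloom); the resulting blocklet/block bloom has a higher effective FPP:

```java
import java.util.BitSet;
import java.util.List;

// Roll page-level blooms up to coarser levels by bitwise OR. This is only valid
// when all blooms share the same bit length and hash functions.
public class BloomRollupSketch {

  static BitSet union(List<BitSet> blooms) {
    BitSet merged = new BitSet();
    for (BitSet bloom : blooms) {
      merged.or(bloom);   // union of same-shaped blooms == bitwise OR of their bit arrays
    }
    return merged;
  }

  static BitSet blockletBloom(List<BitSet> pageBlooms)   { return union(pageBlooms); }
  static BitSet blockBloom(List<BitSet> blockletBlooms)  { return union(blockletBlooms); }
}
```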
In reply to this post by ravipesala
I agree with Ravindra that having a bloom filter at page level would not save any IO. Having a bloom filter at file level makes sense, as it could help to prune files on the driver side. But I am concerned about the number of false positives that would result if we keep a bloom filter for an entire file. I think we need to experiment to find the ideal parameters (bloom size and number of hash functions) that would work effectively for a file-level bloom filter.

Regards,
Vimal
Hi Vimal,
Regarding your concern: if you have tried the bloom datamap, you may know how difficult it is to configure the bloom parameters. You never know how many (distinct) elements will be added to the bloom filter, because a blocklet is configured by size. The more bytes a row takes, the fewer rows are added to a blocklet. And at block level this is related to the block size configuration too. Also, please mind the size of the bloom filter.
Hi Manhua,
I agree with Ravindra and Vimal that adding a page-level bloom will not improve query performance much, because it will not reduce the amount of data read from disk. It will only reduce some processing time (decompressing the pages and applying the filter on them). Keeping the bloom information in the file footer will help to reduce IO + processing.

I agree with you that finding the distinct count for a column in a blocklet is complex, because a blocklet is based on size, not on rows. Pages also have a size-based configuration, which is false by default now, but we are planning to make it true to support huge binary/varchar/complex data.

Sol1. We can ask the user to pass the cardinality of the columns for which they want to generate the bloom.
Sol2. Once the blocklet cut is done while writing the carbondata file, we can calculate the cardinality of the columns for which the user wants to generate the bloom. If the size is too big, we can drop the bloom for that blocklet (see the sketch below).

I agree with Ravi that keeping different FPPs for the executor and driver will help to reduce the size.

-Regards
Kumar Vishal
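A rough sketch of Sol2 under stated assumptions: count the distinct values of the bloom column while the blocklet is being built, size the bloom from that count and a target FPP, and drop it if it exceeds a budget. The standard sizing formula m = -n*ln(p)/(ln 2)^2 is used; the FPP and byte budget are illustrative:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of Sol2 (assumed flow, not existing CarbonData code): after the blocklet
// cut, decide whether a blocklet-level bloom fits within a size budget.
public class BlockletBloomSizer {
  private static final double TARGET_FPP = 0.01;           // illustrative
  private static final long MAX_BLOOM_BYTES = 64 * 1024;   // illustrative budget

  private final Set<String> distinctValues = new HashSet<>();

  /** Called for every value of the bloom column added to the current blocklet. */
  public void collect(String value) {
    distinctValues.add(value);   // exact count; a HyperLogLog sketch could replace this
  }

  /** Standard bloom sizing: m = -n * ln(p) / (ln 2)^2 bits. */
  public long requiredBits() {
    int n = Math.max(1, distinctValues.size());
    return (long) Math.ceil(-n * Math.log(TARGET_FPP) / (Math.log(2) * Math.log(2)));
  }

  /** Drop the bloom for this blocklet if it would exceed the budget. */
  public boolean shouldWriteBloom() {
    return requiredBits() / 8 <= MAX_BLOOM_BYTES;
  }
}
```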
In reply to this post by Manhua-2
Hi Manhua,
Even at page level, the row count will probably not be available from the next version onwards; it will be decided by size, not by count. The code has already been merged, and we are keeping the count-based page configuration temporarily for backward compatibility. So at no point will we have the cardinality of the column beforehand: either we need to estimate the count from history or take an approximate value from the user.

One more thing: the generation of the bloom should follow the datamap interfaces, not go into the min/max generation flow. But we can change the datamap interfaces to add the generated datamap index to the carbondata file and index file instead of separate files. Otherwise, we will lose the index interface capabilities and make our code complex. This was already discussed earlier with Jacky. @Jacky please comment on it.

Regards,
Ravindra.
In reply to this post by kumarvishal09
Hi Vishal,
I want to ask a question: to support huge binary/varchar/complex data, will the row count in a page be larger or smaller than 32000?

Thanks.
In reply to this post by ravipesala
Hi,
Since the new bloom filter integration is designed to work on the executor side, in terms of keeping it simple I actually prefer to keep it inside the data file itself instead of in a separate index file, so that on the executor side reading one file is enough. And it is always better if we can follow the datamap interface.

Regards,
Jacky
Hi Manhua,
Page size will be smaller than or equal to 32k, but the point here is how much IO benefit we get during the query. Since you are adding the bloom information inside the page metadata, it may reduce processing time at the cost of a little extra IO. So it is better to keep the bloom information at blocklet level in the footer.

-Regards
Kumar Vishal