[Discussion] Bloom memory and pruning optimisation using hierarchical pruning.

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

[Discussion] Bloom memory and pruning optimisation using hierarchical pruning.

ravipesala
Hi,

Problem:
 Current bloom filter is calculated at the
blocklet level and if the cardinality of a
column is more and number of blocklets loaded are more then the bloom
size will become bigger. In few of our use
cases, it is grown till 60 GB also, and it might
increase when data grows or add more bloom datamap columns.
 It is not practical to keep this large amounts of bloom data in
driver memory.  So currently we have option to  launch a distributed
job to prune the data using bloom datamap, but it takes more time as
it needs to load bloom data to all executor memories and then prune
it. And also there is no guarantee that subsequent queries will reuse
the same loaded bloom data from executor as spark scheduler does not
guarantee it.

Solution: Create hierarchal bloom index and pruning.

   - We can create a bloom in a
   hierarchal way, it means maintain bloom at the task (carbonindex)
   and at blocklet level.
   - While loading
   the data we can create bloom also at task level along with blocklet level.
   Bloom at task level is very small compare to bloom at a
   blocklet level so we can load it to the driver memory.
   - Maintain a footer in current blocklet level bloom file to get the
   blocklet blooms offset information for a block.
   This information will be used in executor to get the blocklet
blooms for the corresponding block during the
   query. This footer information also we load to driver memory.
   - During pruning,
   first level pruning happens at task level using task level bloom
and get the pruned blocks. Launch the main job along with respective
block bloom information which is already available in the
   footer of the file.
   - In AbstractQueryExecutor first read the blooms for respective
blocks using footer information sent from the
   driver and prune the blocklets.

 In this way, we maintain only less information in memory and also
avoid launching multiple jobs
for pruning. And also reads only necessary blocklet booms during pruning
instead of reading all.

This is a draft discussion proposal and we need to change the pruning flow
design for datamap to implement it. But I feel this types of coarse-grained
datamaps we should avoid launching job for pruning.  I can output the
design document after the initial discussion.

--
Thanks & Regards,
Ravindra.
Reply | Threaded
Open this post in threaded view
|

RE: [Discussion] Bloom memory and pruning optimisation using hierarchical pruning.

xuchuanyin
Hi, Ravindra

Using hierarchical index was in our previous plan too, We wanted to build Block/task level index at the same time,  but we postponed this feature due to the following reasons:
1. It requires different configurations (bloom_size, bloom_fpp) for different index level and it will requires more configurations form the user.
2. The size of high level index file may also be big if the configuration is configured to meet the query performance in some scenarios such as all values are unique.
3. If improper configuration is set, the high level index may not work well as expected.

I‘m wondering your thoughts on these points. Besides, I’ve other thoughts about this BloomFilter issue:

I  think the following methods can be taken to optimize bloomfilter index:
1. Carbon manages the configurations silently which means
a. We can set the configurations on the fly. For example, we can know the unique values for the index column per blocklet, then we can set the corresponding bloomfilter configurations properly.
b. We can introduce other implementation of bloomfilter that supports dynamic growth
2. We can integrate the bloomfilter index as a native index by
a. Keeping the index in carbondata fact files just like ORC/Parquet do.
b. Leaving the index files separately from the carbondata fact files. And before we open a blocklet for filtering in executor side, we can read the corresponding bloomindex files and do some judgement.

Moreover, another potential feature about bloomfilter datamap may be that:
1. Support Array datatype as  BloomFilter index column.
a. Support ‘in‘ operator for Array datatype.      

Sent from laptop

From: Ravindra Pesala
Sent: Wednesday, November 28, 2018 20:45
To: dev
Subject: [Discussion] Bloom memory and pruning optimisation using hierarchical pruning.

Hi,

Problem:
 Current bloom filter is calculated at the
blocklet level and if the cardinality of a
column is more and number of blocklets loaded are more then the bloom
size will become bigger. In few of our use
cases, it is grown till 60 GB also, and it might
increase when data grows or add more bloom datamap columns.
 It is not practical to keep this large amounts of bloom data in
driver memory.  So currently we have option to  launch a distributed
job to prune the data using bloom datamap, but it takes more time as
it needs to load bloom data to all executor memories and then prune
it. And also there is no guarantee that subsequent queries will reuse
the same loaded bloom data from executor as spark scheduler does not
guarantee it.

Solution: Create hierarchal bloom index and pruning.

   - We can create a bloom in a
   hierarchal way, it means maintain bloom at the task (carbonindex)
   and at blocklet level.
   - While loading
   the data we can create bloom also at task level along with blocklet level.
   Bloom at task level is very small compare to bloom at a
   blocklet level so we can load it to the driver memory.
   - Maintain a footer in current blocklet level bloom file to get the
   blocklet blooms offset information for a block.
   This information will be used in executor to get the blocklet
blooms for the corresponding block during the
   query. This footer information also we load to driver memory.
   - During pruning,
   first level pruning happens at task level using task level bloom
and get the pruned blocks. Launch the main job along with respective
block bloom information which is already available in the
   footer of the file.
   - In AbstractQueryExecutor first read the blooms for respective
blocks using footer information sent from the
   driver and prune the blocklets.

 In this way, we maintain only less information in memory and also
avoid launching multiple jobs
for pruning. And also reads only necessary blocklet booms during pruning
instead of reading all.

This is a draft discussion proposal and we need to change the pruning flow
design for datamap to implement it. But I feel this types of coarse-grained
datamaps we should avoid launching job for pruning.  I can output the
design document after the initial discussion.

--
Thanks & Regards,
Ravindra.


Reply | Threaded
Open this post in threaded view
|

RE: [Discussion] Bloom memory and pruning optimisation using hierarchical pruning.

ravipesala
Hi xuchuanyin,

1. There is no need to maintain separate bloom configurations for task level
bloom as we use same configuration (size and fpp) provided by user. We just
create task level bloom with the same configuration along with blocklet
bloom.

2. Size of bloom is much smaller compared to blocklet level bloom, but yes
if data or tasks increases it will also increase over the time. But still,
we can use it in driver lru cache as we may not query all the data all time
so it keeps only most recently used data only.  And also we can skip driver
side bloom pruning and do only at executor side if the bloom is very large.


Yes, we can maintain bloom at carbondata footer level like parquet/orc but
we will lose the datamap framework features like lazy datamap loading or
creating.  Instead, we can maintain bloom in separate files but maintain the
footer to the file as mentioned in my earlier mail.

Regards,
Ravindra.



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
Reply | Threaded
Open this post in threaded view
|

RE: [Discussion] Bloom memory and pruning optimisation using hierarchical pruning.

xuchuanyin
create task level bloom with the same configuration along with blocklet
bloom.
===
Since the number of distinct values in the task level is much bigger than
that in blocklet level, using the same configuration may cause the task
level bloomfilter work inefficiently.
This is just what I’m concerned about.



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/