Hi,
Problem: The current bloom filter is built at the blocklet level, so if the cardinality of a column is high and many blocklets are loaded, the bloom size becomes very large. In a few of our use cases it has grown to 60 GB, and it will increase further as data grows or as more bloom datamap columns are added. It is not practical to keep this amount of bloom data in driver memory. Currently we have the option to launch a distributed job to prune the data using the bloom datamap, but it takes more time as it needs to load the bloom data into all executor memories before pruning. There is also no guarantee that subsequent queries will reuse the bloom data already loaded on the executors, since the Spark scheduler does not guarantee placement.

Solution: Create a hierarchical bloom index and pruning.
- Create the bloom in a hierarchical way, i.e. maintain a bloom at the task (carbonindex) level and at the blocklet level.
- While loading data, create the task level bloom along with the blocklet level bloom. The task level bloom is very small compared to the blocklet level bloom, so we can load it into driver memory.
- Maintain a footer in the current blocklet level bloom file that records the offsets of the blocklet blooms for each block. This information will be used in the executor to read the blocklet blooms of the corresponding block during the query. This footer information is also loaded into driver memory.
- During pruning, the first level of pruning happens at the task level using the task level bloom, producing the set of pruned blocks. Launch the main job along with the respective block bloom information, which is already available from the footer of the file.
- In AbstractQueryExecutor, first read the blooms of the respective blocks using the footer information sent from the driver, then prune the blocklets (see the sketch after this mail).

In this way, we keep only a small amount of information in memory and also avoid launching multiple jobs for pruning. It also reads only the necessary blocklet blooms during pruning instead of reading all of them. This is a draft discussion proposal, and we would need to change the pruning flow design for datamaps to implement it. But I feel that for these types of coarse-grained datamaps we should avoid launching a job for pruning. I can produce the design document after the initial discussion.

--
Thanks & Regards,
Ravindra.
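To make the proposed flow concrete, here is a minimal, self-contained sketch of the two-level pruning described above. All names in it (TaskBloom, BlockletBloomFooter, PrunedBlock, HierarchicalBloomPruner) are hypothetical and only illustrate the idea; they are not existing CarbonData APIs, and the single-hash bloom is deliberately simplified.

```java
// Hypothetical sketch only: class names and structure are illustrative,
// not existing CarbonData APIs; the one-hash bloom is deliberately simplified.
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HierarchicalBloomPruner {

  // Simplified bloom filter; a real implementation would use several hash functions.
  static class TaskBloom {
    final BitSet bits;
    final int numBits;
    TaskBloom(int numBits) { this.bits = new BitSet(numBits); this.numBits = numBits; }
    void add(Object value) { bits.set(Math.floorMod(value.hashCode(), numBits)); }
    boolean mightContain(Object value) {
      return bits.get(Math.floorMod(value.hashCode(), numBits));
    }
  }

  // Footer entry of the blocklet-level bloom index file: where the blocklet blooms of
  // one block start and how long they are, so the executor reads only that range.
  static class BlockletBloomFooter {
    final long offset;
    final long length;
    BlockletBloomFooter(long offset, long length) { this.offset = offset; this.length = length; }
  }

  // Result of driver-side pruning: a selected block plus the offsets the executor
  // needs in order to load its blocklet blooms.
  static class PrunedBlock {
    final String blockId;
    final BlockletBloomFooter footer;
    PrunedBlock(String blockId, BlockletBloomFooter footer) {
      this.blockId = blockId;
      this.footer = footer;
    }
  }

  // Driver side: first-level pruning with the small task-level blooms kept in driver
  // memory; only the hits carry footer info along with the main job to the executors.
  static List<PrunedBlock> pruneBlocks(Map<String, TaskBloom> taskBlooms,
                                       Map<String, BlockletBloomFooter> footers,
                                       Object filterValue) {
    List<PrunedBlock> hits = new ArrayList<>();
    for (Map.Entry<String, TaskBloom> e : taskBlooms.entrySet()) {
      if (e.getValue().mightContain(filterValue)) {
        hits.add(new PrunedBlock(e.getKey(), footers.get(e.getKey())));
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    TaskBloom bloom = new TaskBloom(1 << 16);
    bloom.add("device_42");
    Map<String, TaskBloom> taskBlooms = new HashMap<>();
    Map<String, BlockletBloomFooter> footers = new HashMap<>();
    taskBlooms.put("part-0-task-0", bloom);
    footers.put("part-0-task-0", new BlockletBloomFooter(0L, 4096L));
    System.out.println(pruneBlocks(taskBlooms, footers, "device_42").size()); // prints 1
  }
}
```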
Hi Ravindra,
Using a hierarchical index was in our previous plan too. We wanted to build the block/task level index at the same time, but we postponed this feature for the following reasons:
1. It requires different configurations (bloom_size, bloom_fpp) for different index levels, which means more configuration effort from the user.
2. The size of the high level index file may also be big if the configuration is tuned to meet query performance in some scenarios, such as when all values are unique.
3. If an improper configuration is set, the high level index may not work as well as expected.
I'm wondering about your thoughts on these points.

Besides, I have other thoughts about this BloomFilter issue. I think the following methods can be taken to optimize the bloomfilter index:
1. Carbon manages the configurations silently, which means:
   a. We can set the configurations on the fly. For example, we can measure the number of unique values of the index column per blocklet and then set the corresponding bloomfilter configuration properly (see the sketch after this mail).
   b. We can introduce another implementation of bloomfilter that supports dynamic growth.
2. We can integrate the bloomfilter index as a native index by:
   a. Keeping the index in the carbondata fact files just like ORC/Parquet do, or
   b. Leaving the index files separate from the carbondata fact files. Before we open a blocklet for filtering on the executor side, we read the corresponding bloomindex files and make the skip/scan decision there.

Moreover, another potential feature for the bloomfilter datamap may be:
1. Support the Array datatype as a BloomFilter index column.
   a. Support the 'in' operator for the Array datatype.

Sent from laptop
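As a rough illustration of the on-the-fly configuration idea in point 1.a, the standard bloom filter sizing formulas could be applied to the distinct-value count observed while writing a blocklet. This is a minimal sketch under that assumption; the method names and the concrete numbers are hypothetical, not CarbonData code.

```java
// Hedged sketch of deriving bloomfilter parameters from observed distinct values
// instead of a user-provided bloom_size; names and numbers are illustrative only.
public class OnTheFlyBloomConfig {

  // Bits needed for n expected insertions at false positive rate p: m = -n*ln(p)/(ln 2)^2
  static long bitsFor(long distinctValues, double fpp) {
    return (long) Math.ceil(-distinctValues * Math.log(fpp) / (Math.log(2) * Math.log(2)));
  }

  // Number of hash functions: k = (m / n) * ln 2
  static int hashesFor(long bits, long distinctValues) {
    return Math.max(1, (int) Math.round((double) bits / distinctValues * Math.log(2)));
  }

  public static void main(String[] args) {
    long distinctPerBlocklet = 120_000;   // assumed: measured while writing the blocklet
    double fpp = 0.01;                    // assumed target false positive rate
    long m = bitsFor(distinctPerBlocklet, fpp);
    int k = hashesFor(m, distinctPerBlocklet);
    System.out.println("bits=" + m + " (" + (m / 8 / 1024) + " KB), hashes=" + k);
  }
}
```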
Hi xuchuanyin,
1. There is no need to maintain separate bloom configurations for the task level bloom, as we use the same configuration (size and fpp) provided by the user. We just create the task level bloom with that same configuration along with the blocklet bloom.
2. The size of the task level bloom is much smaller compared to the blocklet level bloom, but yes, if the data or the number of tasks grows, it will also increase over time. Still, we can keep it in the driver LRU cache; since we may not query all of the data all the time, only the most recently used blooms stay in memory (see the sketch after this mail). We can also skip driver side bloom pruning and do it only on the executor side if the bloom is very large.

Yes, we can maintain the bloom at the carbondata footer level like Parquet/ORC, but then we would lose datamap framework features such as lazy datamap loading and creation. Instead, we can keep the bloom in separate files but maintain a footer in the file, as mentioned in my earlier mail.

Regards,
Ravindra.
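A minimal sketch of the driver-side LRU idea in point 2, assuming a bounded cache keyed by task: it uses a plain access-ordered LinkedHashMap purely for illustration and is not the actual CarbonData driver cache.

```java
// Hedged sketch: keep only the most recently used task-level blooms in a bounded
// cache on the driver. Cache keys and sizes below are illustrative assumptions.
import java.util.LinkedHashMap;
import java.util.Map;

public class TaskBloomLruCache<K, V> extends LinkedHashMap<K, V> {
  private final int maxEntries;

  public TaskBloomLruCache(int maxEntries) {
    super(16, 0.75f, true);  // accessOrder = true -> LRU ordering
    this.maxEntries = maxEntries;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    // Evict the least recently used task bloom once the cache is full.
    return size() > maxEntries;
  }

  public static void main(String[] args) {
    TaskBloomLruCache<String, byte[]> cache = new TaskBloomLruCache<>(2);
    cache.put("segment0/task0", new byte[1024]);  // serialized task-level bloom
    cache.put("segment0/task1", new byte[1024]);
    cache.get("segment0/task0");                  // touch: task0 becomes most recent
    cache.put("segment1/task0", new byte[1024]);  // evicts segment0/task1
    System.out.println(cache.keySet());           // [segment0/task0, segment1/task0]
  }
}
```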
create task level bloom with the same configuration along with blocklet bloom.
===
Since the number of distinct values at the task level is much bigger than at the blocklet level, using the same configuration may cause the task level bloomfilter to work inefficiently. This is exactly what I'm concerned about.
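A hedged numerical illustration of this concern, using the standard false positive estimate p = (1 - e^(-k*n/m))^k: a bloom sized for one blocklet's distinct values, reused unchanged at the task level with roughly 100x more distinct values, would accept almost everything. All concrete numbers below are assumptions chosen only for illustration.

```java
// Hedged estimate of how the false positive rate degrades when a blocklet-sized
// bloom configuration is reused at the task level; numbers are illustrative only.
public class TaskLevelFppEstimate {

  // Estimated false positive rate for k hashes, m bits and n actual insertions.
  static double estimatedFpp(int k, double m, long n) {
    return Math.pow(1 - Math.exp(-k * n / m), k);
  }

  public static void main(String[] args) {
    double m = 958_506;   // bits, sized for 100,000 distinct values at 1% fpp
    int k = 7;            // hash functions chosen for that sizing

    System.out.printf("n = 100,000 (blocklet level):    fpp ~ %.4f%n",
        estimatedFpp(k, m, 100_000));      // ~0.01, as designed
    System.out.printf("n = 10,000,000 (task level):     fpp ~ %.4f%n",
        estimatedFpp(k, m, 10_000_000));   // ~1.00, prunes almost nothing
  }
}
```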