Hi all,
Let's start the discussion on partition tables.

What do we need to do to support partition tables?

1. Create table with partition: support Range Partitioning, Hash Partitioning, List Partitioning and Composite Partitioning, and write the partition info to the schema.

2. During data loading: re-partition the input data, start one task to process each partition, and write the partition information to the footer and index file.

3. During query: prune the B+Tree by partition if the filter contains the partition column, or prune data blocks by partition when the only predicates are on the partition column.

4. Optimize the join performance of two partition tables when the partition column is the join column.

Any thoughts, comments or questions?

Thanks!

Best Regards
David Cai
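
To make points 1 to 3 above concrete, here is a rough Python sketch of how a value could be mapped to a partition under range, hash and list partitioning, how input rows could be re-partitioned before loading, and how an equality filter on the partition column could prune partitions. All function names are made up for illustration; this is an assumption about the general technique, not CarbonData code or its API.

```python
import bisect
import zlib


def range_partition(value, bounds):
    """Return the partition id for `value` given sorted boundary values.

    Partition i holds values v with bounds[i-1] <= v < bounds[i]; the last
    partition (id len(bounds)) holds everything >= bounds[-1].
    """
    return bisect.bisect_right(bounds, value)


def hash_partition(value, num_partitions):
    """Return a stable partition id from a CRC32 of the value's string form."""
    return zlib.crc32(str(value).encode("utf-8")) % num_partitions


def list_partition(value, groups):
    """Return the index of the group that lists `value`, or len(groups)
    (a catch-all partition) if no group lists it."""
    for pid, members in enumerate(groups):
        if value in members:
            return pid
    return len(groups)


def repartition(rows, column, assign):
    """Group rows by partition id so that one load task can handle each group."""
    partitions = {}
    for row in rows:
        partitions.setdefault(assign(row[column]), []).append(row)
    return partitions


def prune_for_equals(partitions, assign, literal):
    """Keep only the partition that can contain rows where column == literal."""
    pid = assign(literal)
    return {pid: partitions[pid]} if pid in partitions else {}
```

With boundaries [10, 20], for example, a filter such as id = 12 only needs to look at partition 1; the other partitions can be skipped without reading their blocks.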
Additional suggestions:
1. Support at least two levels of partitioning.
2. Building the B+Tree by partition column should split the segment into smaller pieces, which may speed up data loading in CarbonData.
3. Support deleting data by partition column.

Best regards,
fish

At 2017-03-31 23:42:07, "QiangCai" <[hidden email]> wrote:
>Hi all,
>
>Let's start the discussion on partition tables.
>
>What do we need to do to support partition tables?
>
>1. Create table with partition: support Range Partitioning, Hash
>Partitioning, List Partitioning and Composite Partitioning, and write the
>partition info to the schema.
>
>2. During data loading: re-partition the input data, start one task to
>process each partition, and write the partition information to the footer
>and index file.
>
>3. During query: prune the B+Tree by partition if the filter contains the
>partition column, or prune data blocks by partition when the only
>predicates are on the partition column.
>
>4. Optimize the join performance of two partition tables when the
>partition column is the join column.
>
>Any thoughts, comments or questions?
>
>Thanks!
>
>Best Regards
>David
>
>--
>View this message in context: http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/DISCUSSION-support-new-feature-Partition-Table-tp9935.html
>Sent from the Apache CarbonData Mailing List archive mailing list archive at Nabble.com.
comments inline
> On April 1, 2017, at 5:06 PM, a <[hidden email]> wrote:
>
> Additional suggestions:
> 1. Support at least two levels of partitioning.

I think we can let the user specify the partition columns; multiple columns can together form a partition key. Is this what you mean by two-level partitioning? Generally speaking, partitioning on multiple columns usually leads to small-file issues, which we may want to avoid.

> 2. Building the B+Tree by partition column should split the segment into smaller pieces, which may speed up data loading in CarbonData.

Partitioning will actually slow down the loading process, because it needs a shuffle. The benefit is that queries that filter on the partition key will be faster.

> 3. Support deleting data by partition column.

This could be a future feature on our roadmap, after the partition feature itself is supported.

> Best regards,
> fish
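
On the small-file concern with multi-column partition keys, a small illustration may help. The sketch below counts rows per partition for a made-up data set (the column names and values are hypothetical, and nothing here is CarbonData API). It only shows that each extra key column multiplies the number of partitions and shrinks each of them.

```python
from collections import Counter


def rows_per_partition(rows, key_columns):
    """Count rows per partition when the key is the tuple of `key_columns`."""
    return Counter(tuple(row[col] for col in key_columns) for row in rows)


# Hypothetical sample: 3 countries x 4 days, one row for each combination.
rows = [
    {"country": country, "day": day}
    for country in ("CN", "IN", "US")
    for day in range(4)
]

by_country = rows_per_partition(rows, ["country"])
by_country_and_day = rows_per_partition(rows, ["country", "day"])

# 3 partitions with 4 rows each
print(len(by_country), max(by_country.values()))
# 12 partitions with 1 row each
print(len(by_country_and_day), max(by_country_and_day.values()))
```

The same amount of data ends up spread over four times as many partitions once the second column is added, which is why a wide partition key tends to produce many small files.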