Apache CarbonData Dev Mailing List archive

Re: Support SI at Segment level

Posted by maheshrajus on Mar 23, 2021; 7:32am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Support-SI-at-Segment-level-tp106256p107184.html

Hi,

+1 for the feature.
It will make queries faster while the SI and main table segments are out of
sync.

1) The design discussed for the feature "SI to prune as a data frame"
introduces one property.
If the engine wants to use SI as a datamap, this property needs to be set. If
it is not set, the plan re-write flow is used.

So this feature has to be handled in both cases. Could you please check and
update the design accordingly?
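
To make the point concrete, here is a minimal sketch of the segment split
described in the quoted proposal below. It is only an illustration, not
CarbonData code: the function name, parameter names, and segment ids are
hypothetical, and it assumes segments can be compared by their ids. The same
split would have to be applied in both flows (SI as datamap and plan
re-write).

```python
from typing import Iterable, Set, Tuple


def split_segments_for_pruning(
    main_valid_segments: Iterable[str],
    si_valid_segments: Iterable[str],
) -> Tuple[Set[str], Set[str]]:
    """Decide, per segment, where pruning would happen (hypothetical helper).

    Segments that are valid in both the main table and the SI table are
    pruned through the SI; the remaining valid main-table segments fall
    back to pruning on the main table itself.
    """
    main = set(main_valid_segments)
    si = set(si_valid_segments)
    prune_on_si = main & si      # segments the SI already covers
    prune_on_main = main - si    # segments missing from the SI
    return prune_on_si, prune_on_main


# Example: the main table has four valid segments, the SI has only two.
on_si, on_main = split_segments_for_pruning(["0", "1", "2", "3"], ["0", "1"])
```

With this split, the SI stays usable for the segments it covers instead of
being disabled for the whole table.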

References:
SI to prune as a data frame
https://docs.google.com/document/d/1VZlRYqydjzBXmZcFLQ4Ty-lK8RQlYVDoEfIId7vOaxk/edit?usp=sharing

Thanks & Regards
Mahesh Raju Somalaraju

On Wed, Feb 17, 2021 at 4:05 PM Nihal <[hidden email]> wrote:

> Hi all,
>
> Currently, if the parent (main) table and the SI table don’t have the same
> valid segments, we disable the SI table. From the next query onwards, we
> scan and prune only the parent table until the next load or REINDEX command
> is triggered (as these commands bring the parent and SI table segments back
> in sync). Because of this, queries take more time to return results when SI
> is disabled.
>
> To solve this problem we are planning to support SI at the segment level.
> This means we will not disable SI if the parent and SI tables don’t have the
> same segments. Instead, we will do the pruning on SI for all segments that
> are valid in the SI table, and for the rest of the segments we will do the
> pruning on the main/parent table.
>
> At the time of pruning with the main table in TableIndex.prune, if an SI
> exists for the corresponding filter, then all segments that are not present
> in the SI table will be pruned using the corresponding parent table
> segments.
>
> Please let me know your thoughts and inputs on this.
>
> Regards
> Nihal kumar ojha