Re: Size control of minor compaction

Posted by Ajantha Bhat on
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Size-control-of-minor-compaction-tp103414p103437.html

Hi Zhangshunyu,

For scenario-specific cases like this, the user can use custom compaction
and specify the IDs of the segments that should be compacted.
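
As a rough illustration of the custom compaction route, here is a minimal
Python sketch that picks the segments at or below a size limit and builds
the statement text. The function name, the segment-size map and the table
name "sales" are hypothetical, and how you obtain the sizes is left out.
The statement form follows the custom compaction syntax in the CarbonData
DML documentation as I understand it, so please verify it against your
version.

```python
from typing import Dict


def build_custom_compaction_sql(
    table: str,
    segment_sizes_mb: Dict[str, float],
    max_segment_mb: float,
) -> str:
    """Build a custom compaction statement for segments within a size limit.

    segment_sizes_mb maps a segment ID (for example "0", "1", "0.1") to its
    size in MB. This map is an assumption of the sketch; it is not read
    from CarbonData here.
    """
    # Keep only segments that are not larger than the limit, in ID order.
    selected = sorted(
        (seg_id for seg_id, size in segment_sizes_mb.items()
         if size <= max_segment_mb),
        key=float,
    )
    if len(selected) < 2:
        raise ValueError("need at least two segments to compact")
    id_list = ", ".join(selected)
    return f"ALTER TABLE {table} COMPACT 'CUSTOM' WHERE SEGMENT.ID IN ({id_list})"


# Hypothetical sizes: segment 1 is about 3GB and is skipped at a 2GB limit.
sizes = {"0": 120.0, "1": 3072.0, "2": 80.0, "3": 200.0}
statement = build_custom_compaction_sql("sales", sizes, 2048.0)
print(statement)
```

The resulting text could then be passed to whatever session runs your
CarbonData SQL.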

Also, if you just want size-based compaction, major compaction can be used.

So why do you want to support size-based minor compaction? It would
basically lose the meaning of minor compaction, which is combining files
based on their number.

If you are using minor compaction for this scenario only because it
supports auto compaction, then maybe we can look into supporting an
"auto_compaction_type" = "minor/major"
option, or the user can write a script to trigger major compaction
automatically.
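
To show what such a script might look like, here is a minimal sketch of
the decision step only. The function name, the segment-count trigger and
its default of 4 are hypothetical choices for illustration, and the
"auto_compaction_type" option above does not exist yet. The statement
text is the major compaction form from the CarbonData DML documentation
as I understand it.

```python
from typing import Optional


def major_compaction_sql_if_due(
    table: str,
    segment_count: int,
    min_segments: int = 4,
) -> Optional[str]:
    """Return a major compaction statement once enough segments exist.

    The count-based trigger is an assumption of this sketch. Which segments
    actually get merged is still decided by major compaction itself, based
    on size.
    """
    if segment_count < min_segments:
        return None  # Nothing to do yet.
    return f"ALTER TABLE {table} COMPACT 'MAJOR'"


# Hypothetical usage: call this after each load and run the statement
# (if any) through the session that executes your CarbonData SQL.
due = major_compaction_sql_if_due("sales", segment_count=5)
not_due = major_compaction_sql_if_due("sales", segment_count=2)
print(due, not_due)
```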

Thanks,
Ajantha


On Mon, 23 Nov, 2020, 12:11 pm Zhangshunyu, <[hidden email]> wrote:

> Hi dev,
> Currently, minor compaction only considers the number of segments, and
> major compaction only considers the total size of the segments. But
> consider a scenario where the user wants to use minor compaction based on
> the number of segments but does not want to merge any segment whose data
> size is larger than a threshold, for example 2GB, since merging such a
> large segment is unnecessary and time-consuming.
> So we need to add a parameter that controls the size threshold for
> segments included in minor compaction, so that the user can exclude a
> segment from minor compaction once its data size exceeds the threshold.
> Of course, a default value must be there.
>
> So, what's your opinion about this?
>
>
>
> -----
> My English name is Sunday
> --
> Sent from:
> http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
>