Re: [DISCUSSION] Forceful minor Compaction

Posted by Liang Chen on
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/DISCUSSION-Forceful-minor-Compaction-tp10924p11319.html

Hi Kunal

Thank you for raising this topic for discussion.
First, let us consider why users would want a forceful minor compaction, and in which cases.
Can the current "MAJOR compaction" already cover the "forceful MINOR compaction" scenarios?

As we know, compaction mainly improves index effectiveness by merging the segments produced by multiple data loads, so the current system offers users two options: "soft" compaction (minor) and "strong" compaction (major).

So I could not find a typical case that justifies adding another strong compaction (forceful minor).

Regards
Liang

Kunal Kapoor wrote
Hi all,
I was looking into compaction and had a question about it.
Suppose auto compaction is turned on and the threshold level is 4,3.
Now load data 7 times, which creates 7 segments (0 to 6). Because auto
compaction is on, the first 4 segments (0 to 3) are merged into 0.1.
The visible segments are then 0.1, 4, 5, 6.

When I run the compaction command, nothing happens because the
threshold level has not been reached.
What if I want to merge the 3 remaining segments (4, 5, 6) into a level-1
compacted segment?
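
To make the scenario concrete, here is a toy Python model of it. This is only a sketch and not CarbonData code: the function name minor_compact is hypothetical, it models only the first threshold value (4), and it assumes a merged segment is named after its first input segment with ".1" appended, as in the 0.1 example above.

```python
def minor_compact(segments, threshold=4):
    """Toy model of threshold-based minor compaction (first level only).

    Segments without a dot are treated as uncompacted (level 0).
    Every full group of `threshold` level-0 segments is merged into
    one segment named "<first segment>.1".
    """
    compacted = [s for s in segments if "." in s]
    level0 = [s for s in segments if "." not in s]
    while len(level0) >= threshold:
        batch, level0 = level0[:threshold], level0[threshold:]
        compacted.append(batch[0] + ".1")
    # Leftover level-0 segments stay untouched: the threshold is not reached.
    return compacted + level0


segments = [str(i) for i in range(7)]   # 7 loads -> segments 0..6
after_auto = minor_compact(segments)
print(after_auto)                       # ['0.1', '4', '5', '6']
print(minor_compact(after_auto))        # unchanged: only 3 level-0 segments
```

In this model the second call is a no-op, which matches the behaviour described above: segments 4, 5 and 6 stay separate until a fourth load arrives.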

The proposed solution is to add a new option to the compaction
command that specifies which level of compaction the user wants to perform.

Example: alter table carbon_table compact 'minor' level '1'
This would forcefully combine segments (4, 5, 6) into a level-1
compacted segment called 4.1, giving me 2 level-1 compacted segments (0.1 and
4.1).
A similar operation could be done with level-1 compacted segments.
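
The proposed behaviour could be modelled roughly as follows. This is a hypothetical Python sketch of the idea, not an existing CarbonData API: force_compact and its level parameter are illustrative names, and the naming of the merged segment ("<first segment id>.<level>") is an assumption extrapolated from the 4.1 example.

```python
def force_compact(segments, level=1):
    """Hypothetical forced minor compaction to a given level.

    Merges ALL segments that sit one level below `level`, ignoring
    the configured threshold. The merged segment takes the base id
    of the first merged segment plus ".<level>" (assumed naming).
    """
    def seg_level(name):
        # "4" -> level 0, "0.1" -> level 1, and so on.
        return int(name.split(".")[1]) if "." in name else 0

    candidates = [s for s in segments if seg_level(s) == level - 1]
    if len(candidates) < 2:
        return list(segments)           # nothing to merge

    merged = candidates[0].split(".")[0] + "." + str(level)
    result = []
    for s in segments:
        if s not in candidates:
            result.append(s)
        elif s == candidates[0]:
            result.append(merged)       # merged segment replaces the group
    return result


print(force_compact(["0.1", "4", "5", "6"], level=1))   # ['0.1', '4.1']
print(force_compact(["0.1", "4.1"], level=2))           # ['0.2']
```

The second call illustrates the last point: the same forced merge applied one level up, to the level-1 compacted segments.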