[Discussion] support user specified segments in major compaction


Jin Zhou
Hi community,
CarbonData currently supports two types of compaction: minor and major.
CarbonData performs major compaction according to the user-defined segment
size, but users have no control over which segments are merged.
We plan to extend major compaction to support user-specified segments. This
will be useful in the following cases:
1) We can precisely control which part of a table is merged when the table
is very large.
2) Each table can have its own compaction strategy, controlled by the user
application.

The proposed syntax:
ALTER TABLE [db_name].table_name COMPACT [SEGMENT seg_id1,seg_id2] 'MAJOR'
where [SEGMENT seg_id1,seg_id2] is optional, so the statement remains
compatible with the original syntax.
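
As a rough illustration of how a user application could drive its own
compaction strategy with the proposed statement, here is a minimal Python
sketch that only formats the SQL string. The helper name
build_major_compaction_sql, the demo.sales table, and the segment IDs are
hypothetical, and the SEGMENT clause is the syntax proposed in this thread,
not something CarbonData is known to accept as written.

```python
from typing import Optional, Sequence


def build_major_compaction_sql(
    table_name: str,
    db_name: Optional[str] = None,
    segment_ids: Optional[Sequence[str]] = None,
) -> str:
    """Format an ALTER TABLE ... COMPACT statement in the proposed form.

    Assumption: the optional SEGMENT clause follows the syntax proposed in
    this thread. This helper does not talk to CarbonData; it only builds
    the statement text.
    """
    # Qualify the table with the database name only when one is given.
    qualified = f"{db_name}.{table_name}" if db_name else table_name

    # Omitting the segment list yields the original major compaction syntax.
    if segment_ids:
        seg_clause = " SEGMENT " + ",".join(str(s) for s in segment_ids)
    else:
        seg_clause = ""

    return f"ALTER TABLE {qualified} COMPACT{seg_clause} 'MAJOR'"


# Hypothetical table and segment IDs, for illustration only.
print(build_major_compaction_sql("sales", "demo", ["0", "1"]))
print(build_major_compaction_sql("sales", "demo"))
```

The second call leaves out the segment list and produces the existing
statement form, which is the backward compatibility described above.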



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
Re: [Discussion] support user specified segments in major compaction

xm_zzc
+1, this feature sounds good.




Re: [Discussion] support user specified segments in major compaction

Liang Chen
Administrator
In reply to this post by Jin Zhou
Hi Jin Zhou

Thanks for starting this discussion.
> 1) We can precisely control which part of a table is merged when the table
> is very large.

1. For your first proposal: currently, a segment is a system-internal
concept and is not exposed to users.
Can you describe exactly what problems you are encountering? We can then
look for an alternative solution to them.

> 2) Each table can have its own compaction strategy, controlled by the user
> application.

2. For your second proposal: +1, agreed. Can you please create an Apache
JIRA for this?
We would like to invite you to participate in implementing this feature
together :)

Regards
Liang



Re: [Discussion] support user specified segments in major compaction

Jin Zhou
@Liang Chen, thank you for your reply.


After thinking seriously about your suggestion, I agree that the two
problems should be considered separately.
For problem 2, user-specified compaction segments are indeed not a good
solution. I am glad to do some work on this.

For problem 1, I agree with you that segments should not be exposed to most
users in standard APIs, because a segment is to some extent an internal
concept.
But since we already have segment management commands such as "show segments
for table" and "alter table compact", it does not seem to be a completely
internal concept.
So I think we could support user-specified segments only in management
functions such as compaction, and treat this as a hidden advanced usage that
is not recommended in general cases.

Regards
Jin Zhou






Re: [Discussion] support user specified segments in major compaction

Liang Chen
Administrator
Hi Jin Zhou

OK, thanks for your proposal. Can you raise one PR to support the two
features?

Regards
Liang

