Apache CarbonData Dev Mailing List archive

when plan to implement merge operation

孙而焓

May 26, 2017; 4:47am

when plan to implement merge operation

Hello,

My team is trying to implement a merge operation. The merge scenario is as follows: compare the records in two tables (same structure, different numbers of records) and modify the big one:
1. If small.id = big.id and small.date > big.date, update the record in the big table.
2. If small.id is not in the big table, insert the record into the big table.

Our solution for this scenario is:
1. Append the small table to the big table.
2. Among the records in the big table that share the same id, delete all but the one with the latest date. This deletion runs in the background, concurrently.

My question is whether the Apache CarbonData community plans to implement a similar operation.
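
The merge rule described above can be sketched in a few lines. This is a minimal illustration in plain Python, not CarbonData code. It assumes that the newer record should win (as in the follow-up example later in this thread) and that ids are unique within each table. The function name `merge_tables` and the sample rows are hypothetical.

```python
from datetime import date


def merge_tables(big, small):
    """Return a new table: `big` with the records of `small` merged in.

    Both tables map id -> (date, payload). A record from `small` replaces
    the one in `big` only when it is strictly newer; ids that `big` does
    not contain are inserted.
    """
    merged = dict(big)  # copy, so the caller's table is not modified
    for rec_id, (rec_date, payload) in small.items():
        if rec_id not in merged or rec_date > merged[rec_id][0]:
            merged[rec_id] = (rec_date, payload)
    return merged


# Hypothetical sample data, for illustration only.
big = {
    1: (date(2017, 5, 20), "a"),
    2: (date(2017, 5, 1), "b"),
}
small = {
    1: (date(2017, 5, 10), "a2"),  # older than big: ignored
    2: (date(2017, 5, 23), "b2"),  # newer than big: update
    3: (date(2017, 5, 2), "c"),    # not in big: insert
}
result = merge_tables(big, small)
```

A real implementation on a columnar store would have to express the same rule as batch operations rather than a per-record loop, which is what the append-then-delete solution above is aiming at.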

孙而焓 (FFCS Research Institute)

Liang Chen-2

May 26, 2017; 3:50pm

Re: when plan to implement merge operation

Hi

1. Can you give a specific example, so that we can first understand your
requirement exactly? For instance, provide some sample data like the following:

ID  date        name    age
1   2017-05-1   carbon  21
2   2017-05-23  spark   30
......

2. I would like to invite your team to contribute this feature
if the dev community confirms it.

Regards
Liang

孙而焓

May 27, 2017; 1:45am

Re: when plan to implement merge operation

A merge example looks like this:

small:
id  updatetime
1   9:00
2   8:00
6   9:00

big:
id  updatetime
1   10:00
2   7:00
3   9:00
4   9:00
5   9:00

For each record in small:
id=1: small.updatetime < big.updatetime, so do nothing;
id=2: small.updatetime > big.updatetime, so update big;
id=6: big does not have that record, so insert it into big.

With our solution, we first append all records from small to big:

big:
id  updatetime
1   10:00
2   7:00 (to be deleted)
3   9:00
4   9:00
5   9:00
1   9:00 (to be deleted)
2   8:00
6   9:00

Then, among the records in big that share the same id, only the one with the maximum updatetime stays.
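
The append-then-deduplicate approach can be checked against the example tables with a short sketch. This is plain Python for illustration, not CarbonData code. The helper names `to_minutes` and `append_then_dedupe` are hypothetical, and times are assumed to be in `H:MM` form within a single day.

```python
def to_minutes(hhmm):
    """Convert an 'H:MM' string to minutes, so that 9:00 < 10:00 compares correctly."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def append_then_dedupe(big, small):
    """Append `small` to `big`, then keep only the latest record per id."""
    appended = big + small  # step 1: append
    latest = {}
    for rec_id, updatetime in appended:  # step 2: max updatetime per id stays
        if rec_id not in latest or to_minutes(updatetime) > to_minutes(latest[rec_id]):
            latest[rec_id] = updatetime
    return sorted(latest.items())


big = [(1, "10:00"), (2, "7:00"), (3, "9:00"), (4, "9:00"), (5, "9:00")]
small = [(1, "9:00"), (2, "8:00"), (6, "9:00")]

merged = append_then_dedupe(big, small)
# merged == [(1, "10:00"), (2, "8:00"), (3, "9:00"),
#            (4, "9:00"), (5, "9:00"), (6, "9:00")]
```

The times are parsed before comparison because comparing the raw strings would rank "9:00" above "10:00".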

孙而焓 (FFCS Research Institute)

Liang Chen-2

May 29, 2017; 1:13pm

Re: when plan to implement merge operation

Hi

For this case, would using delete followed by append meet your
requirements?
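
One possible reading of "delete and append", sketched in plain Python on the example tables from the previous message: first delete the records in big that a newer record in small supersedes, then append only the records from small that are new or newer. This is an assumption about what is meant, not an existing CarbonData operation. The helper names are hypothetical, and ids are assumed to be unique within each table.

```python
def to_minutes(hhmm):
    """Convert an 'H:MM' string to minutes for a correct numeric comparison."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def delete_then_append(big, small):
    """Delete superseded records from `big`, then append the winning `small` records."""
    big_times = dict(big)
    # Records from small that should end up in big: new ids or strictly newer times.
    winners = [
        (rec_id, t)
        for rec_id, t in small
        if rec_id not in big_times or to_minutes(t) > to_minutes(big_times[rec_id])
    ]
    winner_ids = {rec_id for rec_id, _ in winners}
    # Step 1 (delete): drop the big records that a winner replaces.
    kept = [(rec_id, t) for rec_id, t in big if rec_id not in winner_ids]
    # Step 2 (append): add the winners after the surviving records.
    return kept + winners


big = [(1, "10:00"), (2, "7:00"), (3, "9:00"), (4, "9:00"), (5, "9:00")]
small = [(1, "9:00"), (2, "8:00"), (6, "9:00")]

merged = delete_then_append(big, small)
```

Compared with appending everything first, this variant never stores two records with the same id, at the cost of comparing the two tables before the delete.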

Obviously, a merge would impact the index, so we should find the best way to
implement this feature.
Other people, please comment as well.

Regards
Liang
