[Discussion] Implement delete and update feature in carbondata SDK.

Karan-c980
This feature will allow the carbondata SDK to delete and update data in
carbondata files.

Details of the solution and implementation are in the document attached to
the JIRA issue:
https://issues.apache.org/jira/browse/CARBONDATA-3865

Thanks
Karanpreet Singh



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: Implement delete and update feature in carbondata SDK.

xubo245
+1.

This is a necessary requirement for users.

Suggestion:

Change CarbonSDKUID to a more common name.




Re: Implement delete and update feature in carbondata SDK.

ravipesala
In reply to this post by Karan-c980
+1
But it should be part of CarbonOutputFormat, not just the SDK. We are
planning to implement this for simpler updates from Spark as well. The SDK
should call the output format to update/delete the records.
@[hidden email] <[hidden email]>, please comment on this; we have already
discussed it.

Regards,
Ravindra.

On Tue, 23 Jun 2020 at 02:15, Karan-c980 <[hidden email]> wrote:

> This feature will allow the carbondata SDK to delete and update data in
> carbondata files.
>
> Details of the solution and implementation are in the document attached to
> the JIRA issue:
> https://issues.apache.org/jira/browse/CARBONDATA-3865
>
> Thanks
> Karanpreet Singh


--
Thanks & Regards,
Ravi

Re: [Discussion] Implement delete and update feature in carbondata SDK.

Indhumathi
In reply to this post by Karan-c980
+1

I have a question.
For each update and delete operation, carbon will create a delta file to
keep the deleted row ids. For sequential update and delete operations using
CarbonSDK, will these delta files be compacted into a single delta file
using horizontal compaction?
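
To make the question concrete, here is a minimal sketch of what merging several delete delta files into one amounts to. It is an illustration under stated assumptions, not CarbonData code: the class DeleteDeltaMergeSketch is hypothetical, each delta file is modeled as a plain list of deleted row ids, and row ids are simplified to long values. Whether the SDK would trigger such a merge is exactly what is being asked and is not answered here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration only; not part of CarbonData.
class DeleteDeltaMergeSketch {

    // Each inner list models the deleted row ids recorded by one delta file.
    // The result models a single merged delta: sorted, with duplicates removed.
    static SortedSet<Long> merge(List<List<Long>> deltas) {
        SortedSet<Long> merged = new TreeSet<>();
        for (List<Long> delta : deltas) {
            merged.addAll(delta);
        }
        return merged;
    }
}
```

With this model, N sequential delete operations leave N small lists behind, and a reader has to consult all of them until they are merged into one.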

Regards,
Indhumathi






Re: Implement delete and update feature in carbondata SDK.

akashrn5
In reply to this post by ravipesala
+1
As Ravindra said, we should do it at the OutputFormat level, so that the
same implementation can be used to improve the performance of simpler
updates in the future.
So the design should be that the SDK calls the update and delete of the
OutputFormat to perform the operations.

Regards,
Akash

On Tue, Jun 23, 2020 at 5:17 PM Ravindra Pesala <[hidden email]>
wrote:

> +1
> But it should be part of CarbonOutputFormat, not just the SDK. We are
> planning to implement this for simpler updates from Spark as well. The SDK
> should call the output format to update/delete the records.
> @[hidden email] <[hidden email]>, please comment on this; we have already
> discussed it.
>
> Regards,
> Ravindra.

Re: Implement delete and update feature in carbondata SDK.

Karan-c980
In reply to this post by ravipesala
Hi Ravi,

Thanks for suggesting this change. We will add an API to
CarbonTableOutputFormat to get a DeleteDeltaRecordWriter, which should be
called from the SDK. Please have a look at the updated document in JIRA.

Thanks,
Karanpreet Singh




Re: [Discussion] Implement delete and update feature in carbondata SDK.

VenuReddy
In reply to this post by Karan-c980
+1

I have a small query regarding this update API:
public CarbonSDKUID update(String path, String column, String value,
String updColumn, String updValue);
I believe the column argument is the column to be matched against the given
value argument, so they are effectively matchColumn and matchValue. Upon a
match, we update updColumn with the given updValue argument. My question is:
why not take a map of updateColumnToValue, similar to the other update API
you have added (public void update(String path, Expression expression,
Map<String, String> columnToValue);)?

Thanks,
Venu






Re: [Discussion] Implement delete and update feature in carbondata SDK.

Karan-c980
Hi Venu,

In the API public CarbonSDKUID update(String path, String column, String
value, String updColumn, String updValue); we will prepare the
filterExpression from the arguments column and value, and the
updateColumnToValue mapping from the arguments updColumn and updValue. After
preparing this information, we will call the API public void update(String
path, Expression expression, Map<String, String> columnToValue) internally.
A user can either call that API directly by providing the filterExpression
and the update mapping, or just pass the column names and values, in which
case we will prepare this information internally.
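
The delegation described above can be sketched as follows. This is a minimal in-memory illustration, not the CarbonData SDK: the class UpdateOverloadSketch is hypothetical, a java.util.function.Predicate stands in for CarbonData's Expression, the path argument is dropped, and both methods return a count of updated rows instead of CarbonSDKUID or void.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in; it updates in-memory rows, not carbondata files.
class UpdateOverloadSketch {
    private final List<Map<String, String>> rows = new ArrayList<>();

    void addRow(Map<String, String> row) {
        rows.add(new HashMap<>(row));
    }

    List<Map<String, String>> rows() {
        return rows;
    }

    // General form: a filter plus a column-to-value mapping.
    // The Predicate is an assumption standing in for a filter Expression.
    int update(Predicate<Map<String, String>> filter, Map<String, String> columnToValue) {
        int updated = 0;
        for (Map<String, String> row : rows) {
            if (filter.test(row)) {
                row.putAll(columnToValue);
                updated++;
            }
        }
        return updated;
    }

    // Convenience form: builds the filter and a single-entry mapping,
    // then delegates to the general form.
    int update(String column, String value, String updColumn, String updValue) {
        Predicate<Map<String, String>> filter = row -> value.equals(row.get(column));
        Map<String, String> columnToValue = new HashMap<>();
        columnToValue.put(updColumn, updValue);
        return update(filter, columnToValue);
    }
}
```

The convenience overload adds no behavior of its own, so any fix or optimization made to the general form applies to both entry points.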

Thanks,
Karan




Re: [Discussion] Implement delete and update feature in carbondata SDK.

David CaiQiang
In reply to this post by Karan-c980
+1

Can we add a commit method to support multiple operations at once?

CarbonSDKUID
  .delete(...)
  .delete(...)
  .update(...)
  .commit
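
One way to read the chained form above is as a builder that queues operations and applies them only on commit. The sketch below makes that concrete under loud assumptions: BatchedUidSketch is a hypothetical class operating on in-memory rows, the argument lists are invented for illustration (the original snippet elides them), and commit() returning the number of applied operations is also an assumption. It is not the proposed CarbonSDKUID implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of deferred, chained operations.
class BatchedUidSketch {
    private final List<Map<String, String>> rows;
    private final List<Runnable> pending = new ArrayList<>();

    // rows must be a mutable list; it is modified in place on commit.
    BatchedUidSketch(List<Map<String, String>> rows) {
        this.rows = rows;
    }

    // Queue a delete of all rows whose column equals value.
    BatchedUidSketch delete(String column, String value) {
        pending.add(() -> rows.removeIf(row -> value.equals(row.get(column))));
        return this;
    }

    // Queue an update of updColumn on all rows whose column equals value.
    BatchedUidSketch update(String column, String value, String updColumn, String updValue) {
        pending.add(() -> {
            for (Map<String, String> row : rows) {
                if (value.equals(row.get(column))) {
                    row.put(updColumn, updValue);
                }
            }
        });
        return this;
    }

    // Apply all queued operations in order; returns how many were applied.
    int commit() {
        int applied = pending.size();
        for (Runnable op : pending) {
            op.run();
        }
        pending.clear();
        return applied;
    }
}
```

Nothing touches the data until commit() runs, which is what would let a real implementation write the results of several operations in one pass.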



-----
Best Regards
David Cai