http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/DISCUSSION-Improve-Simple-updates-and-delete-performance-in-carbondata-tp103294p104644.html
well.
Thank you.
> Hi Community,
>
> CarbonData supports update and delete through Spark. An update is
> essentially a delete plus an insert, and a delete is just a delete.
> However, both are implemented with Spark APIs and actions on collections
> (map, partition, etc.), each of which runs as a Spark job.
> Spark therefore adds overhead: task serialization, job execution on
> remote nodes, shuffle, and so on.
> As a result, even a simple update takes a long time in Carbon, and the
> same is true for delete.
>
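
To make the "update is delete + insert" point concrete, here is a minimal, self-contained Java sketch. It is only an illustration under stated assumptions: the class `DeleteDeltaSketch` and its methods are hypothetical names invented for this example, the table lives in memory, and no CarbonData or Spark API is used. Rows are append-only and a bitmap stands in for whatever delete bookkeeping the file format keeps.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical in-memory model; NOT CarbonData's actual classes or API.
final class DeleteDeltaSketch {
    // Append-only row storage: existing rows are never modified in place.
    private final List<String[]> rows = new ArrayList<>();
    // Marks row positions that have been logically deleted.
    private final BitSet deleted = new BitSet();

    void insert(String... row) {
        rows.add(row.clone());
    }

    // Delete: mark every live row whose column `col` equals `value`.
    // Returns the number of rows marked.
    int delete(int col, String value) {
        int count = 0;
        for (int i = 0; i < rows.size(); i++) {
            if (!deleted.get(i) && rows.get(i)[col].equals(value)) {
                deleted.set(i);
                count++;
            }
        }
        return count;
    }

    // Update = delete the old version of each matching row + insert the new one.
    // Returns the number of rows updated.
    int update(int filterCol, String filterValue, int targetCol, String newValue) {
        int existing = rows.size(); // do not rescan rows appended by this update
        int count = 0;
        for (int i = 0; i < existing; i++) {
            String[] row = rows.get(i);
            if (!deleted.get(i) && row[filterCol].equals(filterValue)) {
                String[] newVersion = row.clone();
                newVersion[targetCol] = newValue;
                deleted.set(i);       // the "delete" half
                rows.add(newVersion); // the "insert" half
                count++;
            }
        }
        return count;
    }

    // Scan: return only rows that are not marked as deleted.
    List<String[]> scan() {
        List<String[]> live = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            if (!deleted.get(i)) {
                live.add(rows.get(i));
            }
        }
        return live;
    }

    public static void main(String[] args) {
        DeleteDeltaSketch table = new DeleteDeltaSketch();
        table.insert("1", "alice");
        table.insert("2", "bob");
        table.update(0, "2", 1, "robert"); // rewrite the name where id == 2
        table.delete(0, "1");              // drop the row where id == 1
        for (String[] row : table.scan()) {
            System.out.println(String.join(",", row));
        }
    }
}
```

Everything here runs in a single JVM with plain loops, so there is no task serialization, remote execution, or shuffle involved, which is the kind of cost the proposal aims to avoid for simple cases.
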
> CarbonData 2.1.0 supports update and delete in the SDK. This is
> implemented at the carbon file format level, so we can reuse it for
> simple updates and deletes, avoid Spark completely, and perform simple
> updates and deletes on transactional tables using plain Java code. This
> avoids all of the Spark overhead and makes updates and deletes faster.
>
> I have added an initial V1 design document; please review it and share
> your comments, inputs, or suggestions.
>
> https://docs.google.com/document/d/1-M6xPKZG8l6yAu0c9qo3jdUKhpXHWgUR-h8HeUUmk8M/edit?usp=sharing
> Thanks,
>
> Regards,
> Akash R Nilugal
>