About bucket feature in carbon

Posted by Jacky Li on Feb 09, 2018; 7:44am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/About-bucket-feature-in-carbon-tp39109.html

Hi,

One year ago, CarbonData 1.0.0 introduced the bucket table feature. It was expected to improve join performance by avoiding shuffling when both tables are bucketed on the same column with the same number of buckets.
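To illustrate why matching buckets can avoid a shuffle, here is a minimal Python sketch. It is not CarbonData's or Spark's actual implementation: the names `bucket_of`, `bucketize` and `bucketed_join`, the CRC32-based hash, and the sample rows are all assumptions made for illustration. The point is only that when both tables are bucketed on the same column with the same bucket count, bucket i of one table only ever needs to be joined with bucket i of the other.

```python
import zlib
from collections import defaultdict

NUM_BUCKETS = 4  # assumed value; both tables must use the same count


def bucket_of(key, num_buckets=NUM_BUCKETS):
    """Deterministic bucket id for a key (illustrative hash, not Spark's)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_buckets


def bucketize(rows, num_buckets=NUM_BUCKETS):
    """Group rows into buckets by hashing the bucket column (position 0)."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[bucket_of(row[0], num_buckets)].append(row)
    return buckets


def bucketed_join(left_buckets, right_buckets):
    """Equi-join on column 0, pairing bucket i only with bucket i.

    Rows with equal keys always hash to the same bucket id, so no rows
    have to be moved between buckets before joining.
    """
    result = []
    for bucket_id, left_rows in left_buckets.items():
        # Index the matching right-hand bucket by join key.
        index = defaultdict(list)
        for right_row in right_buckets.get(bucket_id, []):
            index[right_row[0]].append(right_row)
        for left_row in left_rows:
            for right_row in index.get(left_row[0], []):
                result.append(left_row + right_row[1:])
    return result


# Hypothetical sample data: (user_id, item) and (user_id, name).
orders = [(1, "book"), (2, "pen"), (3, "ink"), (1, "lamp")]
users = [(1, "alice"), (2, "bob"), (4, "dave")]

joined = bucketed_join(bucketize(orders), bucketize(users))
print(sorted(joined))
# → [(1, 'book', 'alice'), (1, 'lamp', 'alice'), (2, 'pen', 'bob')]
```

If the two tables used different bucket counts or different bucket columns, equal keys could land in different bucket ids, and the per-bucket pairing above would no longer be valid without redistributing rows first.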

However, since this feature was introduced, in my personal view it has not been widely used in the community, and it creates maintenance overhead for developers (for every new pull request, all bucket-related test cases need to be fixed).

Also, now that CarbonData is integrated with Spark's standard partitioning, developers can add bucket support using Spark's bucketed table feature in the future if it is required.

So I propose removing the bucket feature after the CarbonData 1.3.0 release.
What do you think?

Regards,
Jacky