Re: [DISCUSSION] Initiating Apache CarbonData-1.1.0 incubating Release

Posted by Liang Chen on
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/DISCUSSION-Initiating-Apache-CarbonData-1-1-0-incubating-Release-tp9623p9648.html

Hi

+1 for starting to prepare the new 1.1 release.
Great progress; the new V3 file format should significantly improve performance.

Regards
Liang

2017-03-26 10:46 GMT+05:30 Ravindra Pesala <[hidden email]>:

> Hi All,
>
> As planned, we are going to release Apache CarbonData-1.1.0. Please discuss
> and vote on initiating the 1.1.0 release; I will start to prepare the
> release after 3 days of discussion. It will have the following features.
>
>  1. Introduced a new data format called V3 (version 3).
>
>   Improves sequential IO by keeping larger blocklets, so more data is
> read into memory at once.
>   Introduced pages with a size of 32000 each for every column inside a
> blocklet. Min/max is maintained for each page to improve filter
> queries.
>   Improved compression/decompression of row pages.
>   Overall performance is improved by 50% compared to the old format, as per
> TPC-H benchmark results.
>
>
> 2. Alter table support in carbondata. (Only for Spark 2.1)
>
>    Support renaming of existing table.
>    Support adding of new column.
>    Support removing of new column.
>    Support upcasting of data types (e.g., from smallint to int).
>
>
> 3. Supported Batch Sort to improve data loading performance.
>
>    It makes the sort step non-blocking: each batch is sorted entirely in
> memory and then converted to a carbondata file.
>
>
> 4. Improved single-pass load by upgrading to the latest Netty framework and
> launching a dictionary client for each load.
>
> 5. Supported range filters, which combine "between" filters into one filter
> to improve filter performance.
>
> 6. Apart from these features, many bug fixes and improvements are included
> in this release.
>
> --
> Thanks & Regards,
> Ravindra
>
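
To illustrate the per-page min/max idea from item 1 of the quoted mail, here is a minimal Python sketch. It is not CarbonData code: the names build_page_stats and pages_to_scan and the in-memory list layout are assumptions made for the example, and the page size of 32000 is taken from the mail and assumed here to mean values per page.

```python
PAGE_SIZE = 32000  # page size from the quoted mail, assumed to be values per page


def build_page_stats(column, page_size=PAGE_SIZE):
    """Split a column into pages and record (start, end, min, max) per page."""
    stats = []
    for start in range(0, len(column), page_size):
        page = column[start:start + page_size]
        stats.append((start, start + len(page), min(page), max(page)))
    return stats


def pages_to_scan(stats, low, high):
    """Return (start, end) ranges of pages that may hold values in [low, high].

    A page is skipped when its min/max range cannot overlap the filter range.
    """
    return [(s, e) for (s, e, pmin, pmax) in stats if pmax >= low and pmin <= high]


# Hypothetical column of 100000 increasing values, split into 4 pages.
column = list(range(100000))
stats = build_page_stats(column)
hits = pages_to_scan(stats, 40000, 41000)
print(hits)
```

With sorted or clustered data, most pages fall outside the filter range and are never read; with randomly distributed values the min/max ranges overlap and fewer pages can be skipped.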

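The range filter in item 5 can be pictured the same way. The sketch below folds several AND-ed comparisons on one column into a single range check. The mail does not describe how CarbonData does this internally, so the function names and the tuple representation are purely hypothetical.

```python
def combine_range_filters(predicates):
    """Fold AND-ed comparisons on one column into (low, low_incl, high, high_incl)."""
    low, low_incl = float("-inf"), False
    high, high_incl = float("inf"), False
    for op, value in predicates:
        if op in (">", ">="):
            incl = op == ">="
            # Keep the tighter lower bound; on a tie, exclusive is tighter.
            if value > low or (value == low and not incl):
                low, low_incl = value, incl
        elif op in ("<", "<="):
            incl = op == "<="
            # Keep the tighter upper bound; on a tie, exclusive is tighter.
            if value < high or (value == high and not incl):
                high, high_incl = value, incl
        else:
            raise ValueError(f"unsupported operator: {op}")
    return low, low_incl, high, high_incl


def in_range(x, rng):
    """Evaluate the single combined range filter against one value."""
    low, low_incl, high, high_incl = rng
    above = x >= low if low_incl else x > low
    below = x <= high if high_incl else x < high
    return above and below


# Equivalent to: col > 10 AND col <= 50 AND col >= 20
rng = combine_range_filters([(">", 10), ("<=", 50), (">=", 20)])
print(rng)
```

Each row is then checked once against the combined range instead of once per original comparison.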