Hi All,
As planned, we are going to release Apache CarbonData 1.1.0. Please discuss and vote to initiate the 1.1.0 release; I will start preparing the release after 3 days of discussion. It will have the following features.

1. Introduced a new data format called V3 (version 3).

   Improves sequential IO by keeping larger blocklets, so more data is read into memory at once.
   Introduced pages with a size of 32000 each for every column inside a blocklet. Min/max is maintained for each page to improve filter queries.
   Improved compression/decompression of row pages.
   Overall performance is improved by 50% compared to the old format, as per TPC-H benchmark results.

2. Alter table support in CarbonData. (Only for Spark 2.1)

   Support renaming an existing table.
   Support adding a new column.
   Support removing a column.
   Support upcasting of a datatype (e.g. from smallint to int).

3. Supported Batch Sort to improve data loading performance.

   It makes the sort step non-blocking and capable of sorting a whole batch in memory and converting it to a carbondata file.

4. Improved single-pass load by upgrading to the latest Netty framework and launching a dictionary client for each load.

5. Supported range filters, which combine the between filters into one filter to improve filter performance.

6. Apart from these features, many bug fixes and improvements are included in this release.

--
Thanks & Regards,
Ravindra
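For readers unfamiliar with per-page min/max, the idea behind items 1 and 5 can be sketched roughly as below. This is only an illustration in Python, not CarbonData's actual code: the names Page, build_pages and range_filter are hypothetical, the page size of 32000 is assumed here to count rows, and a single inclusive [low, high] range stands in for a combined pair of between-style bounds.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: the "size of 32000" in the announcement is a row count per page.
PAGE_SIZE = 32000


@dataclass
class Page:
    """Hypothetical column page that stores its values plus min/max stats."""
    values: List[int]
    min_value: int
    max_value: int


def build_pages(column: List[int], page_size: int = PAGE_SIZE) -> List[Page]:
    """Split one column into fixed-size pages and record min/max per page."""
    pages = []
    for start in range(0, len(column), page_size):
        chunk = column[start:start + page_size]
        pages.append(Page(chunk, min(chunk), max(chunk)))
    return pages


def range_filter(pages: List[Page], low: int, high: int) -> Tuple[List[int], int]:
    """Return the values in [low, high] and how many pages had to be scanned.

    A page whose [min, max] interval does not overlap [low, high] is skipped
    without reading its values.
    """
    matches: List[int] = []
    scanned = 0
    for page in pages:
        if page.max_value < low or page.min_value > high:
            continue  # no value in this page can match, so skip it
        scanned += 1
        matches.extend(v for v in page.values if low <= v <= high)
    return matches, scanned


# Usage: 100000 sorted values give four pages; only one overlaps the range.
pages = build_pages(list(range(100_000)))
matches, scanned = range_filter(pages, 40_000, 40_009)
```

In this sketch the filter touches one page out of four. The benefit depends on how well the data is clustered: on unsorted data most pages would overlap the range and still need scanning.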
Hi
+1 for starting to prepare the new 1.1 release. Great progress; the new V3 file format would significantly improve performance.

Regards
Liang

2017-03-26 10:46 GMT+05:30 Ravindra Pesala <[hidden email]>:

> Hi All,
>
> As planned we are going to release Apache CarbonData-1.1.0. Please discuss
> and vote for it to initiate 1.1.0 release, i will start to prepare the
> release after 3-days of discussion. It will have following features.
>
> 1. Introduced new data format called V3(version 3).
>
> Improves the sequential IO by keeping larger size blocklets.So read
> larger data at once to memory.
> Introduced pages with size of 32000 each for every column inside
> blocklet. And min/max is maintained for each page to improve the filter
> queries.
> Improved compression/decompression of row pages.
> Our all performance is improved by 50% compare to old format as per TPC-H
> benchmark results.
>
>
> 2. Alter table support in carbondata. (Only for Spark 2.1)
>
> Support renaming of existing table.
> Support adding of new column.
> Support removing of new column.
> Support Upcasting(Ex: from smallint to int) of datatype
>
>
> 3. Supported Batch Sort to improve dataloading performance.
>
> It makes sort step as non blocking step and capable of sorting whole
> batch in memory and converts to carbondata file.
>
>
> 4. Improved Single pass load by upgrading to latest netty framework and
> launched dictionary client for each loading
>
> 5. Supported range filters to combine the between filters to one filter to
> improve the filter performance.
>
> 6. Apart from features many bugs and improvements are done in this release.
>
> --
> Thanks & Regards,
> Ravindra
>

--
Regards
Liang