[
https://issues.apache.org/jira/browse/CARBONDATA-436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jihong MA updated CARBONDATA-436:
---------------------------------
Description: Currently, the blocklet size is defined by the number of rows in the blocklet. The default value (120000) is too small for HDFS I/O, but increasing it may cause too many young-generation GCs when scanning many columns. Instead, we can extend the configuration to respect the actual size (in bytes) of the blocklet. (was: Currently, the blocklet size is the row counts in the blocklet. The default value(120000) is small for hdfs io. If we increase the value, which may cause too many Young-GC when we scan many columns. Like parquet, its row group size can be configed, and using hdfs block size as its default value.)
Summary: Make blocklet size configuration respect the actual size (in bytes) of the blocklet (was: change blocklet size related to the raw size of data )
> Make blocklet size configuration respect the actual size (in bytes) of the blocklet
> ------------------------------------------------------------------------------------
>
> Key: CARBONDATA-436
> URL: https://issues.apache.org/jira/browse/CARBONDATA-436
> Project: CarbonData
> Issue Type: Sub-task
> Reporter: suo tong
> Assignee: QiangCai
>
> Currently, the blocklet size is defined by the number of rows in the blocklet. The default value (120000) is too small for HDFS I/O, but increasing it may cause too many young-generation GCs when scanning many columns. Instead, we can extend the configuration to respect the actual size (in bytes) of the blocklet.
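> The proposed behavior could be sketched as follows. This is a minimal illustration only, not the actual CarbonData writer code; the class and member names (BlockletSizeSketch, BLOCKLET_SIZE_IN_BYTES, addRow) and the 64 MB threshold are all hypothetical assumptions chosen for the example:
> {code:java}
> // Hypothetical sketch: close a blocklet once the accumulated raw byte size
> // reaches a configured byte threshold, instead of a fixed row count.
> public class BlockletSizeSketch {
>     // Assumed configuration value; not an actual CarbonData constant.
>     static final long BLOCKLET_SIZE_IN_BYTES = 64L * 1024 * 1024; // e.g. 64 MB
>
>     private long accumulatedBytes = 0;
>     private int rowCount = 0;
>
>     /** Adds one row; returns true when the blocklet should be cut. */
>     public boolean addRow(long rowSizeInBytes) {
>         accumulatedBytes += rowSizeInBytes;
>         rowCount++;
>         return accumulatedBytes >= BLOCKLET_SIZE_IN_BYTES;
>     }
>
>     public int getRowCount() { return rowCount; }
>     public long getAccumulatedBytes() { return accumulatedBytes; }
> }
> {code}
> With such a byte-based cut, wide rows produce blocklets with fewer rows and narrow rows produce blocklets with more rows, so the on-disk blocklet size stays close to the configured target regardless of schema.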
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)