+1
Best regards!
Yuhai Cen
On October 30, 2017 at 13:07, Jacky Li <[hidden email]> wrote:
Hi All,
Currently, in CarbonScanRDD in the carbondata Spark integration module, carbon overrides Spark's task distribution mechanism. This was required in older versions of carbon: in the V1 and V2 formats the blocklets in a file are small, so distributing Spark tasks according to the number of blocklets improves task parallelism.
However, this feature is not required for the V3 format. Blocklets are now much bigger, so the feature brings little benefit while making the code very complex. Furthermore, it is not good for the carbon layer to manipulate even executor allocation.
So I suggest removing this feature.
Regards,
Jacky Li