[ https://issues.apache.org/jira/browse/CARBONDATA-307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jacky Li updated CARBONDATA-307:
--------------------------------
Description:
Currently, there are two read paths in the carbon-spark module:
1. CarbonContext => CarbonDatasourceRelation => CarbonScanRDD => QueryExecutor
In this case, CarbonScanRDD uses CarbonInputFormat to get the splits and uses QueryExecutor for the scan.
2. SqlContext => CarbonDatasourceHadoopRelation => CarbonHadoopFSRDD => CarbonRecordReader => QueryExecutor
In this case, CarbonHadoopFSRDD uses CarbonInputFormat both to get the splits and to perform the scan.
Because of this, there is unnecessary duplicated code; the two paths need to be unified.
was:
Currently, there are two read paths in the carbon-spark module:
1. CarbonContext => CarbonDatasourceRelation => CarbonScanRDD => QueryExecutor
In this case, CarbonScanRDD uses CarbonInputFormat to get the splits and uses QueryExecutor for the scan.
2. SqlContext => CarbonDatasourceHadoopRelation => CarbonHadoopFSRDD => CarbonRecordReader
In this case, CarbonHadoopFSRDD uses CarbonInputFormat both to get the splits and to perform the scan.
Because of this, there is unnecessary duplicated code; the two paths need to be unified.
> Support executor side scan using CarbonInputFormat
> --------------------------------------------------
>
> Key: CARBONDATA-307
> URL: https://issues.apache.org/jira/browse/CARBONDATA-307
> Project: CarbonData
> Issue Type: Improvement
> Components: spark-integration
> Affects Versions: 0.1.0-incubating
> Reporter: Jacky Li
> Fix For: 0.2.0-incubating
>
>
> Currently, there are two read paths in the carbon-spark module:
> 1. CarbonContext => CarbonDatasourceRelation => CarbonScanRDD => QueryExecutor
> In this case, CarbonScanRDD uses CarbonInputFormat to get the splits and uses QueryExecutor for the scan.
> 2. SqlContext => CarbonDatasourceHadoopRelation => CarbonHadoopFSRDD => CarbonRecordReader => QueryExecutor
> In this case, CarbonHadoopFSRDD uses CarbonInputFormat both to get the splits and to perform the scan.
> Because of this, there is unnecessary duplicated code; the two paths need to be unified.