xubo245 commented on a change in pull request #3194: [CARBONDATA-3363] SDK supports read carbon data by given file lists, file or folder
URL:
https://github.com/apache/carbondata/pull/3194#discussion_r285448891
##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java
##########
@@ -208,6 +215,18 @@ public CarbonTable getOrCreateCarbonTable(Configuration configuration) throws IO
return carbonFiles;
}
+ private List<CarbonFile> getAllCarbonDataFiles(List fileLists) {
+ List<CarbonFile> carbonFiles = new LinkedList<CarbonFile>();
+ try {
+ for (int i = 0; i < fileLists.size(); i++) {
+ carbonFiles.add(FileFactory.getCarbonFile(fileLists.get(i).toString()));
Review comment:
1. This works without a hadoopConf, and the old code did not pass a hadoopConf either: getAllCarbonDataFiles(carbonTable.getTablePath())
2. The folder case also has to list the files and call getCarbonFile on them one by one, see org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile#listFiles(boolean, org.apache.carbondata.core.datastore.filesystem.CarbonFileFilter):
```
public List<CarbonFile> listFiles(boolean recursive, CarbonFileFilter fileFilter)
    throws IOException {
  List<CarbonFile> carbonFiles = new ArrayList<>();
  if (null != fileStatus && fileStatus.isDirectory()) {
    RemoteIterator<LocatedFileStatus> listStatus = fs.listFiles(fileStatus.getPath(), recursive);
    while (listStatus.hasNext()) {
      LocatedFileStatus locatedFileStatus = listStatus.next();
      CarbonFile carbonFile = FileFactory.getCarbonFile(locatedFileStatus.getPath().toString());
      if (fileFilter.accept(carbonFile)) {
        carbonFiles.add(carbonFile);
      }
    }
  }
  return carbonFiles;
}
```
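To illustrate point 2 outside of CarbonData, here is a minimal sketch of the same per-file pattern for a mixed list of files and folders. It is not the PR's code: it uses java.nio.file as a stand-in for FileFactory.getCarbonFile and CarbonFileFilter, the class name CarbonPathExpander and the method expand are hypothetical, and filtering on the ".carbondata" suffix is an assumption made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of CarbonData.
final class CarbonPathExpander {

  // Expands a mixed list of file and folder paths into individual data files.
  // Folders are walked recursively and each file is resolved one by one;
  // plain file paths are added as given, without checking that they exist.
  static List<Path> expand(List<String> inputs) throws IOException {
    List<Path> result = new ArrayList<>();
    for (String input : inputs) {
      Path path = Paths.get(input);
      if (Files.isDirectory(path)) {
        // Files.walk holds an open directory handle, so close it when done.
        try (Stream<Path> children = Files.walk(path)) {
          children
              .filter(Files::isRegularFile)
              // Assumed filter: keep only files ending in ".carbondata".
              .filter(p -> p.getFileName().toString().endsWith(".carbondata"))
              .sorted()
              .forEach(result::add);
        }
      } else {
        result.add(path);
      }
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    // Prints every data file found under the paths given on the command line.
    for (Path p : expand(List.of(args))) {
      System.out.println(p);
    }
  }
}
```

Either way the cost is one lookup per file, so a folder argument should not be more expensive than the equivalent explicit file list.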
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[hidden email]
With regards,
Apache Git Services