[GitHub] [carbondata] Zhangshunyu opened a new pull request #3600: insert stages list files using iterator


[GitHub] [carbondata] kunal642 commented on a change in pull request #3600: [CARBONDATA-3678] optimize list files in insert stage

GitBox
URL: https://github.com/apache/carbondata/pull/3600#discussion_r375654504
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datastore/filesystem/LocalCarbonFile.java
 ##########
 @@ -162,6 +163,20 @@ public boolean delete() {
     return carbonFiles;
   }
 
+  @Override
+  public CarbonFile[] listFiles(boolean recursive, int maxCount)
+      throws IOException {
+    List<CarbonFile> carbonFiles = new ArrayList<>();
 
 Review comment:
   Why not just call the existing listFiles() and ignore the maxCount for this?
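
A minimal sketch of what this suggestion could look like, assuming a hypothetical stand-in class (`LocalListingSketch`, which wraps `java.io.File` and is not the CarbonData `LocalCarbonFile`). The bounded overload simply delegates to the unbounded listing, so on a local file system the `maxCount` limit is not enforced. For brevity the sketch ignores `recursive` as well.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for LocalCarbonFile; it wraps java.io.File directly
// and is not the CarbonData class.
class LocalListingSketch {
  private final File file;

  LocalListingSketch(File file) {
    this.file = file;
  }

  // Existing-style listing of direct children; never returns null.
  File[] listFiles() {
    File[] children = file.listFiles();
    return children == null ? new File[0] : children;
  }

  // Bounded overload that only delegates. Both arguments are ignored in this
  // sketch, so the result may hold more than maxCount entries.
  File[] listFiles(boolean recursive, int maxCount) throws IOException {
    return listFiles();
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("listing-sketch");
    for (int i = 0; i < 3; i++) {
      Files.createFile(dir.resolve("part-" + i));
    }
    File[] listed = new LocalListingSketch(dir.toFile()).listFiles(false, 1);
    System.out.println(listed.length + " entries returned for maxCount=1");
  }
}
```

The trade-off is that callers relying on the limit get no bound from this implementation, which may be acceptable if local directories are expected to stay small.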

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services

[GitHub] [carbondata] kunal642 commented on a change in pull request #3600: [CARBONDATA-3678] optimize list files in insert stage

GitBox
URL: https://github.com/apache/carbondata/pull/3600#discussion_r375654577
 
 

 ##########
 File path: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertFromStageCommand.scala
 ##########
 @@ -502,7 +502,20 @@ case class CarbonInsertFromStageCommand(
   ): Array[(CarbonFile, CarbonFile)] = {
     val dir = FileFactory.getCarbonFile(loadDetailsDir, hadoopConf)
     if (dir.exists()) {
-      val allFiles = dir.listFiles()
+      val allFiles = dir match {
 
 Review comment:
   This check can be removed once you fix the LocalCarbonFile comment.
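
The quoted hunk stops at `dir match {`, so the branches of the check are not visible here. As a rough illustration of the general idea only: once every implementation exposes the same bounded listing method, the caller can depend on the interface alone and needs no type check. All names below (`ListableDir`, `InMemoryDir`, `StageListingSketch`) are hypothetical and are not CarbonData types.

```java
import java.util.Arrays;

// Hypothetical minimal interface standing in for the shared file abstraction.
interface ListableDir {
  String[] listFiles(boolean recursive, int maxCount);
}

// Hypothetical implementation backed by an in-memory list of names.
class InMemoryDir implements ListableDir {
  private final String[] names;

  InMemoryDir(String... names) {
    this.names = names;
  }

  @Override
  public String[] listFiles(boolean recursive, int maxCount) {
    // Return at most maxCount names; a negative limit yields an empty array.
    int limit = Math.min(names.length, Math.max(maxCount, 0));
    return Arrays.copyOf(names, limit);
  }
}

class StageListingSketch {
  // The caller sees only the interface, so no per-type branch is needed.
  static String[] firstBatch(ListableDir dir, int maxCount) {
    return dir.listFiles(false, maxCount);
  }

  public static void main(String[] args) {
    ListableDir dir = new InMemoryDir("stage-1", "stage-2", "stage-3");
    System.out.println(Arrays.toString(firstBatch(dir, 2)));
  }
}
```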


[GitHub] [carbondata] niuge01 commented on a change in pull request #3600: [CARBONDATA-3678] optimize list files in insert stage

GitBox
URL: https://github.com/apache/carbondata/pull/3600#discussion_r375689363
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datastore/filesystem/LocalCarbonFile.java
 ##########
 @@ -162,6 +163,20 @@ public boolean delete() {
     return carbonFiles;
   }
 
+  @Override
+  public CarbonFile[] listFiles(boolean recursive, int maxCount)
+      throws IOException {
+    List<CarbonFile> carbonFiles = new ArrayList<>();
+    int counter = 0;
+    Iterator it = FileUtils.iterateFiles(file, null, recursive);
+    while (it.hasNext() && counter < maxCount) {
+      CarbonFile carbonFile = new LocalCarbonFile((File) it.next());
+      carbonFiles.add(carbonFile);
+      counter++;
+    }
+    return carbonFiles.toArray(new CarbonFile[0]);
 
 Review comment:
   Using CarbonFile[0] is better than using CarbonFile[carbonFiles.size()].
   There is no need to create a new empty array every time here.
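
For reference, a small sketch of the three `toArray` styles this comment touches on, using `String` in place of `CarbonFile`. The `EMPTY` constant and the class name are assumptions made for illustration and are not part of the PR. All three return the same contents. For a non-empty list, the zero-length form allocates the small hint array plus the result, the presized form allocates only the result, and the shared-constant form avoids allocating the hint on each call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ToArraySketch {
  // Hypothetical shared constant. A zero-length array has no elements to
  // mutate, so a single instance can be reused safely.
  static final String[] EMPTY = new String[0];

  // Style used in the quoted diff: a fresh zero-length array as the type hint.
  static String[] withFreshEmpty(List<String> items) {
    return items.toArray(new String[0]);
  }

  // Presized alternative: the list copies its elements into the given array.
  static String[] withPresized(List<String> items) {
    return items.toArray(new String[items.size()]);
  }

  // Shared constant as the type hint: no new hint array per call.
  static String[] withSharedEmpty(List<String> items) {
    return items.toArray(EMPTY);
  }

  public static void main(String[] args) {
    List<String> items = new ArrayList<>(Arrays.asList("a", "b", "c"));
    System.out.println(Arrays.toString(withFreshEmpty(items)));
    System.out.println(Arrays.toString(withPresized(items)));
    System.out.println(Arrays.toString(withSharedEmpty(items)));
  }
}
```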


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3600: [CARBONDATA-3678] optimize list files in insert stage

GitBox
URL: https://github.com/apache/carbondata/pull/3600#issuecomment-582823429
 
 
   Build Failed  with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/167/
   


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3600: [CARBONDATA-3678] optimize list files in insert stage

GitBox
URL: https://github.com/apache/carbondata/pull/3600#issuecomment-582824476
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1870/
   


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3600: [CARBONDATA-3678] optimize list files in insert stage

GitBox
URL: https://github.com/apache/carbondata/pull/3600#issuecomment-582848708
 
 
   Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/170/
   


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3600: [CARBONDATA-3678] optimize list files in insert stage

GitBox
URL: https://github.com/apache/carbondata/pull/3600#issuecomment-582870968
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1872/
   


[GitHub] [carbondata] jackylk commented on issue #3600: [CARBONDATA-3678] optimize list files in insert stage

GitBox
URL: https://github.com/apache/carbondata/pull/3600#issuecomment-583206314
 
 
   LGTM


[GitHub] [carbondata] asfgit closed pull request #3600: [CARBONDATA-3678] optimize list files in insert stage

GitBox
URL: https://github.com/apache/carbondata/pull/3600
 
 
   
