[GitHub] [carbondata] nihal0107 commented on a change in pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files

GitBox

nihal0107 commented on a change in pull request #3819:
URL: https://github.com/apache/carbondata/pull/3819#discussion_r482004050



##########
File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CSVCarbonWriter.java
##########
@@ -72,6 +88,75 @@ public void write(Object object) throws IOException {
     }
   }
 
+  private CsvParser buildCsvParser(Configuration conf) {
+    CsvParserSettings settings = CSVInputFormat.extractCsvParserSettings(conf);
+    return new CsvParser(settings);
+  }
+
+  @Override
+  public void validateAndSetDataFiles(CarbonFile[] dataFiles) throws IOException {
+    if (dataFiles == null || dataFiles.length == 0) {
+      throw new RuntimeException("data files can't be empty.");
+    }
+    DataInputStream csvInputStream = null;
+    CsvParser csvParser = this.buildCsvParser(this.configuration);
+    for (CarbonFile dataFile : dataFiles) {
+      try {
+        csvInputStream = FileFactory.getDataInputStream(dataFile.getPath(),
+            -1, this.configuration);
+        csvParser.beginParsing(csvInputStream);
+      } catch (IllegalArgumentException ex) {
+        if (ex.getCause() instanceof FileNotFoundException) {
+          throw new FileNotFoundException("File " + dataFile +
+              " not found to build carbon writer.");
+        }
+        throw ex;
+      } finally {
+        if (csvInputStream != null) {
+          csvInputStream.close();
+        }
+      }
+    }
+    this.dataFiles = dataFiles;
+  }
+
+  /**
+   * Load data of all or selected csv files at given location iteratively.
+   *
+   * @throws IOException
+   */
+  @Override
+  public void write() throws IOException {

Review comment:
       Added to the documentation.

##########
File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java
##########
@@ -594,6 +606,219 @@ public CarbonWriterBuilder withJsonInput(Schema carbonSchema) {
     return this;
   }
 
+  /**
+   * to build a {@link CarbonWriter}, which accepts loading CSV files.
+   *
+   * @param filePath absolute path under which files should be loaded.
+   * @return CarbonWriterBuilder
+   */
+  public CarbonWriterBuilder withCsvPath(String filePath) throws IOException {

Review comment:
       Done



