[GitHub] [carbondata] ravipesala commented on a change in pull request #3193: [CARBONDATA-3365] Integrate apache arrow vector filling to carbon SDK


GitBox
ravipesala commented on a change in pull request #3193: [CARBONDATA-3365] Integrate apache arrow vector filling to carbon SDK
URL: https://github.com/apache/carbondata/pull/3193#discussion_r283760577
 
 

 ##########
 File path: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java
 ##########
 @@ -94,6 +102,62 @@ public T readNextRow() throws IOException, InterruptedException {
     return currentReader.getCurrentValue();
   }
 
+  /**
+   * Carbon reader will fill the arrow vector after reading the carbondata files.
+   * This arrow byte[] can be used to create arrow table and used for in memory analytics
+   *
+   * Note: create a reader at blocklet level, so that arrow byte[] will not exceed INT_MAX
+   *
+   * @param carbonSchema
+   * @return
+   * @throws Exception
+   */
+  public byte[] readArrowBatch(Schema carbonSchema) throws Exception {
 
 Review comment:
   I feel it would be better to add a new class, ArrowCarbonReader, and move these 3 methods into it.
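
   A minimal sketch of how that split could look, assuming the Arrow-specific
   methods move into a subclass and the row-oriented reader stays unchanged.
   SketchReader and SketchArrowReader are hypothetical stand-ins, not the real
   CarbonReader or the proposed ArrowCarbonReader. The byte[] built here is
   plain newline-joined text rather than an Arrow buffer, and only
   readArrowBatch is shown because it is the only one of the 3 methods
   visible in this hunk.

```java
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for CarbonReader: keeps only the row-at-a-time API.
class SketchReader<T> {
    private final Iterator<T> rows;

    SketchReader(List<T> rows) {
        this.rows = rows.iterator();
    }

    boolean hasNext() {
        return rows.hasNext();
    }

    T readNextRow() {
        return rows.next();
    }
}

// Hypothetical stand-in for the suggested ArrowCarbonReader: the batch-oriented
// method lives here, so the base reader carries no Arrow-specific code.
class SketchArrowReader<T> extends SketchReader<T> {
    SketchArrowReader(List<T> rows) {
        super(rows);
    }

    // Placeholder for readArrowBatch: drains the remaining rows into one byte[].
    // Assumption: a real implementation would fill Arrow vectors instead of
    // joining row strings with newlines.
    byte[] readArrowBatch() {
        StringBuilder sb = new StringBuilder();
        while (hasNext()) {
            sb.append(readNextRow()).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }
}
```

   With this shape, callers that only need rows keep using the base reader,
   and callers that want a batch construct the subclass explicitly.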

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services