[GitHub] [carbondata] ajantha-bhat commented on issue #3311: [WIP] arrow vector push down

GitBox
ajantha-bhat commented on issue #3311: [WIP] arrow vector push down
URL: https://github.com/apache/carbondata/pull/3311#issuecomment-545854296
 
 
   @jackylk: Yes, Arrow is already supported in the CarbonData SDK, so CarbonData can integrate with other languages such as Python. **This PR is for performance improvement.**
   
   The current Arrow integration in Carbon supports both complex and primitive types, and the conversion from CarbonInternalRow to Arrow vectors happens at the top layer. If the Arrow vectors are filled while rows are read from the blocklet, one conversion through CarbonInternalRow can be avoided, which will improve performance a bit and make for a proper integration.
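
The difference between the two paths can be sketched roughly as follows. This is a minimal illustration in plain Java, not CarbonData or Arrow code: `IntVector`, `fillViaRows`, and `fillDirect` are hypothetical stand-ins, and the `int[][]` pages stand in for decoded blocklet column pages.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class VectorFillSketch {

    // Hypothetical stand-in for a columnar vector (NOT the Arrow API).
    static final class IntVector {
        final int[] values;
        IntVector(int size) { values = new int[size]; }
        void set(int index, int value) { values[index] = value; }
    }

    private static IntVector[] newVectors(int columnCount, int rowCount) {
        IntVector[] vectors = new IntVector[columnCount];
        for (int c = 0; c < columnCount; c++) {
            vectors[c] = new IntVector(rowCount);
        }
        return vectors;
    }

    // Path 1 (top-layer conversion): materialize an intermediate row per
    // record, then copy every row into the column vectors.
    static IntVector[] fillViaRows(int[][] columnPages, int rowCount) {
        List<Object[]> rows = new ArrayList<>();
        for (int r = 0; r < rowCount; r++) {
            Object[] row = new Object[columnPages.length]; // intermediate row
            for (int c = 0; c < columnPages.length; c++) {
                row[c] = columnPages[c][r];
            }
            rows.add(row);
        }
        IntVector[] vectors = newVectors(columnPages.length, rowCount);
        for (int r = 0; r < rowCount; r++) {
            Object[] row = rows.get(r);
            for (int c = 0; c < row.length; c++) {
                vectors[c].set(r, (Integer) row[c]);
            }
        }
        return vectors;
    }

    // Path 2 (push down): fill the column vectors while reading the pages,
    // skipping the intermediate row objects entirely.
    static IntVector[] fillDirect(int[][] columnPages, int rowCount) {
        IntVector[] vectors = newVectors(columnPages.length, rowCount);
        for (int c = 0; c < columnPages.length; c++) {
            for (int r = 0; r < rowCount; r++) {
                vectors[c].set(r, columnPages[c][r]);
            }
        }
        return vectors;
    }

    public static void main(String[] args) {
        int[][] pages = { {1, 2, 3}, {10, 20, 30} }; // two columns, three rows
        IntVector[] viaRows = fillViaRows(pages, 3);
        IntVector[] direct = fillDirect(pages, 3);
        for (int c = 0; c < pages.length; c++) {
            System.out.println(Arrays.toString(viaRows[c].values)
                + " " + Arrays.toString(direct[c].values));
        }
    }
}
```

Both paths produce the same vectors; the second simply avoids allocating and re-reading one row object per record, which is the saving described above.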
   
   However, the current Spark vector doesn't support complex types, so if Arrow vectors are pushed down, Arrow will also stop supporting complex types. For a small performance improvement, we would lose functionality.
   
   So we need to support complex type filling in columnarBatch (vector) first; then this PR can go in.
   Since reading complex columns from Presto is a requirement for Carbon 2.0, that requirement will handle this problem. Once it is done, my PR can be merged.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services