[jira] [Updated] (CARBONDATA-3056) Implement concurrent reading through CarbonReader

Akash R Nilugal (Jira)

     [ https://issues.apache.org/jira/browse/CARBONDATA-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naman Rastogi updated CARBONDATA-3056:
--------------------------------------
    Summary: Implement concurrent reading through CarbonReader  (was: Implement Concurrent SDK Reader)

> Implement concurrent reading through CarbonReader
> -------------------------------------------------
>
>                 Key: CARBONDATA-3056
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3056
>             Project: CarbonData
>          Issue Type: Sub-task
>            Reporter: Naman Rastogi
>            Priority: Minor
>
> Reading through the SDK is currently slow because CarbonReader reads the carbondata files sequentially, even though it has an individual CarbonRecordReader for each file. We can parallelize this by adding an API to the CarbonReader class:
> *List<CarbonReader> readers = CarbonReader.split(numSplits)*
> This returns a list of CarbonReaders that can be used to read in parallel, since reading each file is independent of the other files.
>  
> This lets the SDK user read the files either as before, through a single reader, or in a multithreaded environment.
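
To illustrate the proposal, here is a minimal sketch of how a caller might drain a list of independent readers, one task per reader. It does not use the real SDK: `RowReader`, `inMemory` and `countRowsInParallel` are hypothetical names introduced only for this example, and the shape of `RowReader` (hasNext / readNextRow / close) is an assumption about what each split reader would expose.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConcurrentReadSketch {

    /** Hypothetical stand-in for one split reader; NOT the real CarbonReader class. */
    interface RowReader {
        boolean hasNext() throws Exception;
        Object readNextRow() throws Exception;
        void close() throws Exception;
    }

    /** Drains every reader as a separate pooled task and returns the total row count. */
    static long countRowsInParallel(List<RowReader> readers, int numThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (RowReader reader : readers) {
                // Each reader is touched by exactly one task, so no locking is needed.
                Callable<Long> task = () -> {
                    long count = 0;
                    try {
                        while (reader.hasNext()) {
                            reader.readNextRow();
                            count++;
                        }
                    } finally {
                        reader.close();
                    }
                    return count;
                };
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // blocks until that reader is fully drained
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while reading", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a reader task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    /** In-memory reader, present only so the sketch can run without any data files. */
    static RowReader inMemory(List<?> rows) {
        Iterator<?> it = rows.iterator();
        return new RowReader() {
            public boolean hasNext() { return it.hasNext(); }
            public Object readNextRow() { return it.next(); }
            public void close() { }
        };
    }
}
```

With the proposed API, the list passed to `countRowsInParallel` would presumably be the result of the split call, adapted to whatever interface the readers actually expose. Because each reader is owned by a single task, the sketch relies on the independence between files that the issue describes.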



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)