Re: [Discussion] CarbonReader performance improvement
Posted by
Naman Rastogi on
Oct 31, 2018; 8:53am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Discussion-CarbonReader-performance-improvement-tp66650p67133.html
> After all, I think it would be better if the changes were in a different PR
> so that we can review them easily.
>
https://github.com/apache/carbondata/pull/2850
>
>
> 1. What is the relationship between the `RecordReaders` and the number of
> `DataFiles`/`Blocklets`/`Pages`/`records`?
> 2. Will the returned CarbonReaders process almost the same number of
> `DataFiles`/`Blocklets`/`Pages`/`records`?
>
The number of CarbonRecordReader / RecordReader instances in a CarbonReader is
the same as the number of files it is reading or is going to read.
Please go through the PR for more details on the implementation.
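
A minimal sketch of what "one RecordReader per file" could mean for a split, in plain Java. This is a toy model and not the CarbonData API: `ReaderSplitSketch`, `splitReaders`, and the use of a `String` to stand in for a per-file RecordReader are all hypothetical, and the round-robin distribution is an assumption of the sketch (see the PR for what is actually done).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, NOT the CarbonData API: each String stands in for one
// per-file RecordReader held by a CarbonReader.
class ReaderSplitSketch {

    // Distributes the per-file readers round-robin over at most maxSplits
    // groups, so group sizes differ by at most one file.
    // (Round-robin is an assumption of this sketch.)
    static List<List<String>> splitReaders(List<String> fileReaders, int maxSplits) {
        if (maxSplits < 1) {
            throw new IllegalArgumentException("maxSplits must be >= 1");
        }
        // Never create more groups than there are files.
        int groups = Math.min(maxSplits, fileReaders.size());
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i < groups; i++) {
            result.add(new ArrayList<>());
        }
        for (int i = 0; i < fileReaders.size(); i++) {
            result.get(i % groups).add(fileReaders.get(i));
        }
        return result;
    }
}
```

With five files and `maxSplits = 2`, the sketch yields one group of three files and one group of two. Under this model the groups are balanced only by file count; the number of blocklets, pages, or records per group still depends on how large each file is.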
>
> 3. After the user gets 2 new CarbonReaders from the old CarbonReader, can
> the user just close the old CarbonReader immediately? What if the user
> doesn't close it and still uses the old CarbonReader alongside the new
> CarbonReaders? -- This question is about potential shared state if you
> directly use the RecordReaders from the old one.
The user does not have to close it explicitly. It should not be closed: if it
were closed, the child CarbonReaders would not be able to iterate over the
files. This is taken care of internally inside split, so that the original
CarbonReader is no longer able to read the files.
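
One way to picture the hand-over described above is the following toy model in plain Java. It is only a sketch under stated assumptions and not the code from the PR: `ToyReader` and its methods are hypothetical, file names stand in for the per-file RecordReaders, and the round-robin distribution is an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model, NOT the CarbonData API: a reader that owns a queue of file
// names, each standing in for one per-file RecordReader.
class ToyReader {
    private final Deque<String> files;

    ToyReader(List<String> files) {
        this.files = new ArrayDeque<>(files);
    }

    boolean hasNext() {
        return !files.isEmpty();
    }

    // Returns the next file; throws NoSuchElementException if none is left.
    String next() {
        return files.remove();
    }

    // Moves every remaining file to the children (round-robin, an
    // assumption of this sketch). Afterwards the parent owns nothing,
    // so parent and children share no state.
    List<ToyReader> split(int maxSplits) {
        if (maxSplits < 1) {
            throw new IllegalArgumentException("maxSplits must be >= 1");
        }
        int n = Math.min(maxSplits, files.size());
        List<List<String>> groups = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            groups.add(new ArrayList<>());
        }
        int i = 0;
        while (!files.isEmpty()) {
            groups.get(i % n).add(files.remove());
            i++;
        }
        List<ToyReader> children = new ArrayList<>();
        for (List<String> group : groups) {
            children.add(new ToyReader(group));
        }
        return children;
    }
}
```

In this model, calling `split(2)` on a reader that holds three files returns two children, and `hasNext()` on the parent returns false from then on, which is the sense in which the original reader can no longer read the files.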
---
Thanks
Naman Rastogi