Posted by xuchuanyin on Oct 31, 2018; 8:35am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Discussion-CarbonReader-performance-improvement-tp66650p67123.html
I have a few questions about this point:
"""
3. Add concurrent reading functionality to Carbon Reader. This can be
enabled by passing the number of splits required by the user. If the user
passes 2 as the split for reader then the user would be returned 2
CarbonReaders with equal number of RecordReaders in each.
The user can then run each CarbonReader instance in a separate thread to
read the data concurrently.
"""
===
1. What is the relationship between `RecordReaders` and the number of
`DataFiles`/`Blocklets`/`Pages`/`records`?
2. Will each of the returned CarbonReaders process roughly the same number of
`DataFiles`/`Blocklets`/`Pages`/`records`?
3. After the user gets 2 new CarbonReaders from the old CarbonReader, can the
user close the old CarbonReader immediately? What if the user doesn't close
it and keeps using the old CarbonReader alongside the new CarbonReaders?
I ask because there may be shared state if the new CarbonReaders directly
reuse the RecordReaders from the old one.
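To make questions 2 and 3 concrete, here is a minimal sketch of the behavior I am worried about. It does not use the real CarbonData classes: `FakeRecordReader`, `SplitSketch`, and the round-robin `split` are hypothetical stand-ins, and I am only assuming that the split hands the existing RecordReader objects out to the new CarbonReaders. Under that assumption, an equal number of RecordReaders per split does not imply an equal number of records, and the old and new readers share position state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a RecordReader: it only tracks how many
// records it holds and how many have been consumed so far.
final class FakeRecordReader {
    private final int totalRecords;
    private int position = 0;

    FakeRecordReader(int totalRecords) {
        this.totalRecords = totalRecords;
    }

    boolean hasNext() {
        return position < totalRecords;
    }

    void next() {
        position++;
    }

    int remaining() {
        return totalRecords - position;
    }
}

final class SplitSketch {
    // Assumed behavior: distribute the existing reader objects round-robin,
    // so each group gets (almost) the same number of readers. The groups
    // hold references to the same objects as the input list, not copies.
    static List<List<FakeRecordReader>> split(List<FakeRecordReader> readers, int numSplits) {
        if (numSplits <= 0) {
            throw new IllegalArgumentException("numSplits must be positive");
        }
        List<List<FakeRecordReader>> groups = new ArrayList<>();
        for (int i = 0; i < numSplits; i++) {
            groups.add(new ArrayList<>());
        }
        for (int i = 0; i < readers.size(); i++) {
            groups.get(i % numSplits).add(readers.get(i));
        }
        return groups;
    }

    // Sum of the records still unread across all readers in one group.
    static int totalRemaining(List<FakeRecordReader> group) {
        int sum = 0;
        for (FakeRecordReader reader : group) {
            sum += reader.remaining();
        }
        return sum;
    }

    public static void main(String[] args) {
        // Two large and two small readers, interleaved.
        List<FakeRecordReader> oldReaders = Arrays.asList(
            new FakeRecordReader(1000), new FakeRecordReader(10),
            new FakeRecordReader(1000), new FakeRecordReader(10));

        List<List<FakeRecordReader>> groups = split(oldReaders, 2);

        // Question 2: same reader count per group, very different record counts.
        System.out.println("readers per group: " + groups.get(0).size()
            + " and " + groups.get(1).size());
        System.out.println("records per group: " + totalRemaining(groups.get(0))
            + " and " + totalRemaining(groups.get(1)));

        // Question 3: advancing a reader through the old list also changes
        // what the new group sees, because the object is shared.
        oldReaders.get(0).next();
        System.out.println("group 0 after old reader advanced: "
            + totalRemaining(groups.get(0)));
    }
}
```

If the actual implementation balances the splits by blocklet or record count, or creates fresh RecordReaders for each new CarbonReader, then neither concern applies.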
Finally, I think it would be better to put these changes in separate PRs so
that we can review them more easily.