Carbon triggers a lot of GC when a query processes a large number of records, and this is hurting Carbon query performance. To solve this GC problem, which occurs when the query output is very large or when many records are processed, I would like to propose the solution below.

Currently, all the data read from the carbon data file during a query is stored on the heap; when the query output is large, this causes more GC. Instead of storing this data on the heap, we can store it off-heap and clear it once scanning has finished for that query.

Please vote and comment on the above proposal.

-Regards
Kumar Vishal
kumar vishal
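A minimal sketch of the idea in the proposal above, not the actual CarbonData change. It assumes a hypothetical holder class, OffHeapLongChunk, that keeps scanned values in a direct ByteBuffer (allocated outside the Java heap) and drops the buffer when the scan is closed. With the standard API the native memory is only reclaimed once the buffer object itself is collected; an implementation that frees memory explicitly at the end of the scan would need something lower level, which is not shown here.

```java
import java.nio.ByteBuffer;

/** Hypothetical holder for scanned values; this is not a CarbonData class. */
final class OffHeapLongChunk implements AutoCloseable {
    // Direct buffer: the contents live outside the garbage-collected heap,
    // so the collector never copies them between generations.
    private ByteBuffer buffer;

    OffHeapLongChunk(int rowCount) {
        this.buffer = ByteBuffer.allocateDirect(Math.multiplyExact(rowCount, Long.BYTES));
    }

    void put(int row, long value) {
        ensureOpen();
        buffer.putLong(row * Long.BYTES, value);
    }

    long get(int row) {
        ensureOpen();
        return buffer.getLong(row * Long.BYTES);
    }

    boolean isClosed() {
        return buffer == null;
    }

    /** Called when scanning is finished: drop the reference so the memory can be released. */
    @Override
    public void close() {
        buffer = null;
    }

    private void ensureOpen() {
        if (buffer == null) {
            throw new IllegalStateException("chunk already released");
        }
    }

    public static void main(String[] args) {
        long sum = 0;
        // try-with-resources ties the lifetime of the off-heap data to the scan.
        try (OffHeapLongChunk chunk = new OffHeapLongChunk(4)) {
            for (int row = 0; row < 4; row++) {
                chunk.put(row, row * 10L);
            }
            for (int row = 0; row < 4; row++) {
                sum += chunk.get(row);
            }
        }
        System.out.println("sum = " + sum);
    }
}
```

Tying the release to close() addresses the concern about clearing memory after use: a caller that scans inside try-with-resources cannot forget to release the chunk.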
|
+1
Good idea to avoid GC overhead. We need to be careful to clear the memory after use.

In reply to the post by Kumar Vishal of Tue, 13 Dec 2016, 2:17 PM
Administrator
|
In reply to this post by kumarvishal09
Hi
+1. Storing the data off-heap to avoid the GC problem will help performance.
|
In reply to this post by kumarvishal09
+1. The heap should not store data; it should be used only for temporary runtime data.
|
+1. I have suffered from this GC problem. As I understand it, the BatchResult is cached and kept in memory for a relatively long time, which causes a lot of data to be moved from the Young generation to the Old generation. It is better to move it off-heap.

In reply to the post by ZhuWilliam of 2016-12-20 11:57 GMT+08:00
Hi All,
Please review PR #450: https://github.com/apache/incubator-carbondata/pull/450/

-Regards
Kumar Vishal

In reply to the post by An Lan of Tue, Dec 20, 2016, 1:13 PM
kumar vishal