Posted by
Anning Luo-2 on
Nov 10, 2016; 3:12pm
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/GC-problem-and-performance-refine-problem-tp2833p2840.html
Hi,
We are using CarbonData to build our tables and running queries through CarbonContext. We have run into some performance problems while tuning the system.
Background:
cluster: 100 executors, 5 tasks/executor, 10 GB memory/executor
data: 60+ GB (per replica) in CarbonData format, 100 files of 600+ MB each, 300+ columns, 300+ million rows
sql example:
select A,
       sum(a),
       sum(b),
       sum(c),
       ... (about 100 more aggregations like sum(column))
from Table1 LATERAL VIEW explode(split(Aarray, ';')) ATable AS A
where A is not null and d > 'ab:c-10' and d < 'h:0f3s' and e != 10 and f = 22 and g = 33 and h = 44 GROUP BY A
target query time: <10s
current query time: 15s ~ 25s
scenario: an OLAP system with fewer than 100 queries per day, so concurrency is low. The CPU is idle most of the time, and this service shares its machines with other programs. The service runs for a long time, so we cannot dedicate a very large amount of memory to each executor.
tuning: I have built indexes and dictionaries on d, e, f, g, h, and built dictionaries on all the other aggregation columns (i.e. a, b, c, ... 100+ columns). I have made sure the whole dataset is in a single segment, and I have enabled speculation (quantile=0.5, interval=250, multiplier=1.2).
Time is mainly spent in the first stage, before the shuffle. Since 95% of the data is filtered out, the shuffle itself takes little time. In the first stage, most tasks complete in under 10s, but there are still nearly 50 tasks that take longer than 10s. The maximum task time for a query can be 12~16s.
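For reference, the speculation settings mentioned above map onto Spark configuration roughly like this. This is a minimal sketch; the exact spark-submit invocation, Spark version, and omitted application arguments are assumptions, not our actual launch command:

```shell
# Hypothetical spark-submit fragment matching the cluster and speculation
# settings described above (quantile=0.5, interval=250ms, multiplier=1.2).
spark-submit \
  --num-executors 100 \
  --executor-memory 10G \
  --conf spark.speculation=true \
  --conf spark.speculation.quantile=0.5 \
  --conf spark.speculation.interval=250ms \
  --conf spark.speculation.multiplier=1.2 \
  ...   # application jar, class, and arguments omitted
```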
Problems:
1. GC problem. After a lot of parameter tuning, some tasks in the first stage still spend 20%~30% of their time in GC. We now use G1 on Java 8; GC time doubles with CMS. Most GC time is spent in young-generation collections, and almost half the young generation gets copied to the old generation each cycle. It seems many objects live longer than one GC period and their space is not reused (the concurrent GC releases it later). With a large Eden (>= 1 GB, for example), a single GC pause takes seconds; with a small Eden (256 MB, for example), each pause takes hundreds of milliseconds, but pauses are more frequent and the total is still seconds. Is there any way to reduce the GC time? (We are not considering the first and second queries in this case.)
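For concreteness, the kind of G1 knobs we have been varying can be passed through the executor Java options. The values below are illustrative placeholders, not our actual configuration, and the Eden-sizing flags are experimental options in JDK 8:

```shell
# Illustrative G1 executor options on Java 8 (example values only):
# cap the target pause time, trigger concurrent marking earlier, bound
# the young generation size, and log GC details for analysis.
--conf "spark.executor.extraJavaOptions=-XX:+UseG1GC \
  -XX:MaxGCPauseMillis=200 \
  -XX:InitiatingHeapOccupancyPercent=35 \
  -XX:+UnlockExperimentalVMOptions \
  -XX:G1NewSizePercent=5 -XX:G1MaxNewSizePercent=20 \
  -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
```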
2. Performance tuning problem. The number of rows remaining after filtering is not uniform, so some nodes are more heavily loaded and spend more time than others. Task times range from 4s to 16s. Is there any way to even this out?
3. The first and second queries take too long. I know the dictionary and some indexes need to be loaded the first time, but even after trying to warm them up with the query below, they still take a long time. How can I preheat the queries correctly?
select Aarray, a, b, c, ... from Table1 where Aarray is not null and d = 'sss' and e != 22 and f = 33 and g = 44 and h = 55
4. Any other suggestions to reduce the query time?
Some suggestions:
The log from QueryStatisticsRecorder gives me a good way to find the bottleneck, but it is not enough. There are still some metrics I think would be very useful:
1. Filter ratio, i.e. not only result_size but also the original size, so we can tell how much data was filtered out.
2. IO time. scan_blocks_time is not enough: if it is high, we know something is wrong, but not what caused it. The real IO time for the data is not provided. Since one partition may span several files, knowing whether the slowness comes from the datanode or from the executor itself would give us intuition for finding the problem.
3. The TableBlockInfo for each task. I log it myself when debugging; it tells me how many blocklets are actually local. The Spark web UI only reports a locality level, but it may be that only one blocklet is local.
-----------------------------------------
Anning Luo
HULU
Email:
[hidden email]
Phone: (+86)15011500384