
GC problem and performance refine problem

Posted by Anning Luo-2 on Nov 10, 2016; 3:25pm
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/GC-problem-and-performance-refine-problem-tp2844.html

Hi,

We are using CarbonData to build our tables and run queries through
CarbonContext. We have hit some performance problems while tuning the system.

*Background*:

*cluster*:                      100 executors, 5 tasks/executor, 10 GB
memory/executor

*data*:                              60+ GB (per replica) in carbon data
format, 600+ MB/file x 100 files, 300+ columns, 300+ million rows

*sql example:*

                                      select A,
                                             sum(a),
                                             sum(b),
                                             sum(c),
                                             ... (about 100 more aggregations like sum(column))
                                      from Table1
                                      LATERAL VIEW explode(split(Aarray, ';')) ATable AS A
                                      where A is not null
                                        and d > "ab:c-10" and d < "h:0f3s"
                                        and e != 10 and f = 22 and g = 33 and h = 44
                                      GROUP BY A

*target query time*:           <10s

*current query time*:         15s ~ 25s

*scene:*                             OLAP system, fewer than 100 queries per
day, so concurrency is low. The CPU is idle most of the time, so this service
runs alongside other programs. The service runs for a long time, and we
cannot dedicate a very large memory to every executor.

*refine*:                              I have built indexes and dictionaries
on d, e, f, g, h, and built dictionaries on all other aggregation columns
(i.e. a, b, c, ... 100+ columns), and made sure all data is in a single
segment. I have enabled speculation (quantile=0.5, interval=250,
multiplier=1.2).

Time is mainly spent in the first stage, before shuffling. Since 95% of the
data is filtered out, the shuffle itself takes little time. In the first
stage most tasks complete in less than 10s, but there are still nearly 50
tasks that take longer than 10s. The maximum task time in one query may be
12~16s.

*Problem:*

1.          GC problem. Even after a lot of parameter tuning, some tasks in
the first stage spend 20%~30% of their time in GC. We now use G1 GC on
Java 8; GC time doubles if we use CMS. The GC time is mainly spent on
young-generation collections, and almost half of the young generation gets
copied to the old generation each time. It seems many objects live longer
than one GC period, and their space is not reused (the concurrent GC only
releases it later). With a large Eden (>=1G, for example), a single GC takes
seconds. With a small Eden (256M, for example), a single GC takes hundreds of
milliseconds but runs more frequently, so the total is still seconds. Is
there any way to reduce the GC time? (We are not considering the first and
second query in this case.)
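For reference, this is roughly the shape of the executor GC configuration we
have been experimenting with. The pause target and young-generation bounds
here are illustrative values, not recommendations; all flags are standard
HotSpot/G1 options passed via Spark's extraJavaOptions:

```shell
spark-submit \
  --conf "spark.executor.memory=10g" \
  --conf "spark.executor.cores=5" \
  --conf "spark.executor.extraJavaOptions=-XX:+UseG1GC \
    -XX:MaxGCPauseMillis=200 \
    -XX:InitiatingHeapOccupancyPercent=35 \
    -XX:+UnlockExperimentalVMOptions \
    -XX:G1NewSizePercent=10 -XX:G1MaxNewSizePercent=25 \
    -XX:+PrintGCDetails -XX:+PrintGCDateStamps" \
  ...
```

G1NewSizePercent/G1MaxNewSizePercent bound Eden as a fraction of the heap,
which is how we trade single-pause length against pause frequency as
described above.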

2.           Performance refinement problem. The row count after filtering is
not uniform: some nodes get a heavy share and spend more time than the
others. Task times within one stage range from 4s to 16s. Is there any method
to even this out?
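One generic mitigation for this kind of key skew (not CarbonData-specific,
and whether it helps depends on where the time actually goes) is two-stage
aggregation with a salted key: a hot group key is split across several
partial aggregates, which are then merged. A minimal sketch in plain Python:

```python
import random
from collections import defaultdict

def salted_sum(rows, num_salts=4):
    """Two-stage sum aggregation over (key, value) pairs.

    Stage 1 salts the group key so one hot key is spread over
    num_salts partial aggregates (in Spark these would land in
    different partitions/tasks); stage 2 strips the salt and
    merges the partial sums into the final result.
    """
    # Stage 1: partial sums keyed by (key, salt).
    partial = defaultdict(int)
    for key, value in rows:
        salt = random.randrange(num_salts)
        partial[(key, salt)] += value

    # Stage 2: drop the salt and combine the partial sums.
    final = defaultdict(int)
    for (key, _salt), subtotal in partial.items():
        final[key] += subtotal
    return dict(final)

# "hot" is 100x more frequent than "cold", mimicking a skewed key.
rows = [("hot", 1)] * 1000 + [("cold", 2)] * 10
print(salted_sum(rows))  # {'hot': 1000, 'cold': 20}
```

The cost is an extra (small) shuffle for the merge stage, so it only pays
off when a few keys dominate the first stage.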

3.           The first and second queries take too long. I know the
dictionary and some indexes need to be loaded the first time, but even after
trying the query below to warm them up, the first queries still take a lot of
time. How can I preheat the caches correctly?

              select Aarray, a, b, c ... from Table1
              where Aarray is not null
                and d = "sss" and e != 22 and f = 33 and g = 44 and h = 55

4.           Any other suggestions to reduce the query time?



Some suggestions:

            The log from class QueryStatisticsRecorder gives me a good way
to find the bottleneck, but it is not enough. There are still some metrics I
think would be very useful:

            1. Filter ratio, i.e. not only result_size but also the original
size, so we can tell how much data was filtered out.

            2. IO time. The scan_blocks_time is not enough. If it is high we
know something is wrong, but not what caused it. The real IO time for the
data is not provided. Since there may be several files per partition,
knowing whether the slowdown comes from the datanode or from the executor
itself would give us intuition about where the problem is.

            3. The TableBlockInfo for each task. I log it myself when
debugging; it tells me how many blocklets are actually local. The Spark web
UI only reports a locality level, but perhaps only one blocklet is local.


---------------------

Anning Luo

*HULU*

Email: [hidden email]

            [hidden email]