http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/GC-problem-and-performance-refine-problem-tp2844p2866.html
Please confirm the things below.
1. Is dynamic executor allocation enabled? If it is enabled, can you disable it and run the query again?
2. Can you please set the property enable.blocklet.distribution=false
and execute the query?
3. What is the cardinality of each column?
4. Any reason why you are setting "NO_INVERTED_INDEX"="a"?
5. Can you keep all the columns present in the filter on the left side of the
schema, so that query performance improves?
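For point 5, a sketch of what a reordered DDL might look like; this is an illustration only (column names and types taken from the DDL later in this thread), not a drop-in statement:

```sql
-- Filter columns (d, e, f, g, h) moved to the left of the schema so they
-- lead the key ordering; the aggregation columns follow.
CREATE TABLE IF NOT EXISTS Table1 (
  d String, e Int, f Int, g Int, h Int,
  a Int, b Int
  -- ... remaining aggregation columns
)
STORED BY 'org.apache.carbondata.format'
TBLPROPERTIES("DICTIONARY_INCLUDE"="a,b")
```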
> Hi Kumar Vishal,
>
> 1. Create table ddl:
>
> CREATE TABLE IF NOT EXISTS Table1
> ( h Int, g Int, d String, f Int, e Int,
> a Int, b Int, …(nearly 300 more columns)
> )
> STORED BY 'org.apache.carbondata.format'
> TBLPROPERTIES(
> "NO_INVERTED_INDEX"="a",
> "NO_INVERTED_INDEX"="b",
> …(nearly 300 more columns)
> "DICTIONARY_INCLUDE"="a",
> "DICTIONARY_INCLUDE"="b",
> …(nearly 300 more columns)
> )
>
> 2 & 3. There are more than a hundred nodes in the cluster, but the
> cluster is shared with other applications. Sometimes, when enough nodes
> are free, we get 100 distinct nodes.
>
> 4. I collected statistics of task time during one query and marked the
> distinct nodes below:
>
> [image: inline image 1]
>
>
>
>
> 2016-11-10 23:52 GMT+08:00 Kumar Vishal <
[hidden email]>:
>
>> Hi Anning Luo,
>>
>> Can you please provide the details below.
>>
>> 1. Create table DDL.
>> 2. Number of nodes in your cluster setup.
>> 3. Number of executors per node.
>> 4. Query statistics.
>>
>> Please find my comments in bold.
>>
>> Problem:
>> 1. GC problem. We suffer 20%~30% GC time for some tasks in the
>> first stage, after a lot of parameter tuning. We now use G1 GC on Java 8;
>> GC time doubles if we use CMS. The main GC time is spent on
>> young-generation GC. Almost half the memory of the young generation is
>> copied to the old generation. It seems many objects live longer than one
>> GC period and their space is not reused (concurrent GC releases it later).
>> When we use a large Eden (>=1G, for example), a single GC takes seconds.
>> If Eden is small (256M, for example), a single GC takes hundreds of
>> milliseconds, but GCs are more frequent and the total is still seconds.
>> Is there any way to lessen the GC time? (We don't consider the first and
>> second queries in this case.)
>>
>> *How many nodes are present in your cluster setup? If nodes are few,
>> please reduce the number of executors per node.*
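If reducing executors per node, the standard spark-submit resource options can be used; the numbers below are placeholders taken from the cluster description in this thread, not recommendations:

```shell
# Placeholder sizing -- tune for your cluster. Fewer executors per node
# means fewer JVM heaps competing for memory on each machine.
spark-submit \
  --num-executors 100 \
  --executor-cores 5 \
  --executor-memory 10g \
  --conf "spark.executor.extraJavaOptions=-XX:+UseG1GC -XX:MaxGCPauseMillis=200" \
  your-app.jar   # placeholder application jar
```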
>>
>> 2. Performance tuning problem. The number of rows left after
>> filtering is not uniform; some nodes are heavier and spend more time than
>> others. The time of one task is 4s ~ 16s. Is there any method to even
>> this out?
>>
>> 3. Too long a time for the first and second query. I know the
>> dictionary and some indexes need to be loaded the first time, but even
>> after trying the query below to warm up, it still spends a lot of time.
>> How can I warm up correctly?
>> select Aarray, a, b, c… from Table1 where Aarray is
>> not null and d = "sss" and e != 22 and f = 33 and g = 44 and h = 55
>>
>> *Currently we are working on first-time query improvement. For now, you
>> can run select count(*) or count(column) so that all the blocks get
>> loaded, and then run the actual query.*
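The warm-up advice above, written out against the Table1 schema from this thread (illustrative only):

```sql
-- First touch: loads block metadata and dictionaries into the executors.
SELECT COUNT(*) FROM Table1;
-- Or touch a specific dictionary-encoded column:
SELECT COUNT(a) FROM Table1;
-- Then run the actual filtered aggregation query.
```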
>>
>>
>> 4. Any other suggestions to lessen the query time?
>>
>>
>> Some suggestions:
>> The log from the class QueryStatisticsRecorder gives me a good
>> means to find the bottleneck, but it is not enough. There are still some
>> metrics I think would be very useful:
>> 1. Filter ratio, i.e. not only result_size but also the original
>> size, so we know how much data was filtered.
>> 2. IO time. The scan_blocks_time is not enough. If it is high,
>> we know something is wrong, but not what caused the problem. The real IO
>> time for data is not provided. As there may be several files for one
>> partition, knowing whether the slowness is caused by the datanode or the
>> executor itself would give us intuition to find the problem.
>> 3. The TableBlockInfo for each task. I log it myself when
>> debugging. It tells me how many blocklets are node-local. The Spark web
>> monitor just gives a locality level, but maybe only one blocklet is local.
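The filter-ratio metric suggested in point 1 could be computed from the two sizes like this (a hypothetical helper for illustration, not existing CarbonData code):

```python
def filter_ratio(origin_size: int, result_size: int) -> float:
    """Fraction of scanned rows removed by the filter (1.0 = all filtered)."""
    if origin_size == 0:
        return 0.0
    return 1.0 - result_size / origin_size

# e.g. 300 million rows scanned, ~95% filtered out as described in this thread
ratio = filter_ratio(300_000_000, 15_000_000)
print(f"{ratio:.0%}")  # → 95%
```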
>>
>>
>> -Regards
>> Kumar Vishal
>>
>> On Thu, Nov 10, 2016 at 8:55 PM, An Lan <
[hidden email]> wrote:
>>
>> > Hi,
>> >
>> > We are using CarbonData to build our table and run queries in
>> > CarbonContext. We have some performance problems while tuning the
>> > system.
>> >
>> > *Background*:
>> >
>> > *cluster*: 100 executors, 5 tasks/executor, 10G
>> > memory/executor
>> >
>> > *data*: 60+GB (per replica) in carbon data
>> > format, 600+MB/file * 100 files, 300+ columns, 300+ million rows
>> >
>> > *sql example:*
>> >
>> > select A,
>> > sum(a),
>> > sum(b),
>> > sum(c),
>> > …(100 more aggregations like sum(column))
>> > from Table1 LATERAL VIEW explode(split(Aarray, ';')) ATable AS A
>> > where A is not null and d > "ab:c-10" and d < "h:0f3s"
>> > and e != 10 and f = 22 and g = 33 and h = 44 GROUP BY A
>> >
>> > *target query time*: <10s
>> >
>> > *current query time*: 15s ~ 25s
>> >
>> > *scene:* OLAP system, <100 queries every
>> > day. Concurrency is not high. Most of the time the CPU is idle, so this
>> > service runs alongside other programs. The service runs for a long
>> > time, so we cannot occupy a very large memory for every executor.
>> >
>> > *refine*: I have built index and dictionary on
>> > d, e, f, g, h, and built dictionaries on all other aggregation columns
>> > (i.e. a, b, c, … 100+ columns), and made sure there is one segment for
>> > the total data. I have enabled speculation (quantile=0.5, interval=250,
>> > multiplier=1.2).
>> >
>> > Time is mainly spent in the first stage, before shuffling. As 95% of
>> > the data is filtered out, the shuffle spends little time. In the first
>> > stage, most tasks complete in less than 10s, but there are still nearly
>> > 50 tasks longer than 10s. The max task time in one query may be 12~16s.
>> >
>> > *Problem:*
>> >
>> > 1. GC problem. We suffer 20%~30% GC time for some tasks in the
>> > first stage, after a lot of parameter tuning. We now use G1 GC on
>> > Java 8; GC time doubles if we use CMS. The main GC time is spent on
>> > young-generation GC. Almost half the memory of the young generation is
>> > copied to the old generation. It seems many objects live longer than
>> > one GC period and their space is not reused (concurrent GC releases it
>> > later). When we use a large Eden (>=1G, for example), a single GC takes
>> > seconds. If Eden is small (256M, for example), a single GC takes
>> > hundreds of milliseconds, but GCs are more frequent and the total is
>> > still seconds. Is there any way to lessen the GC time? (We don't
>> > consider the first and second queries in this case.)
>> >
>> > 2. Performance tuning problem. The number of rows left after
>> > filtering is not uniform; some nodes are heavier and spend more time
>> > than others. The time of one task is 4s ~ 16s. Is there any method to
>> > even this out?
>> >
>> > 3. Too long a time for the first and second query. I know the
>> > dictionary and some indexes need to be loaded the first time, but even
>> > after trying the query below to warm up, it still spends a lot of time.
>> > How can I warm up correctly?
>> >
>> > select Aarray, a, b, c… from Table1 where Aarray is not
>> > null and d = "sss" and e != 22 and f = 33 and g = 44 and h = 55
>> >
>> > 4. Any other suggestions to lessen the query time?
>> >
>> >
>> >
>> > Some suggestions:
>> >
>> > The log from the class QueryStatisticsRecorder gives me a good
>> > means to find the bottleneck, but it is not enough. There are still
>> > some metrics I think would be very useful:
>> >
>> > 1. Filter ratio, i.e. not only result_size but also the original
>> > size, so we know how much data was filtered.
>> >
>> > 2. IO time. The scan_blocks_time is not enough. If it is high,
>> > we know something is wrong, but not what caused the problem. The real
>> > IO time for data is not provided. As there may be several files for one
>> > partition, knowing whether the slowness is caused by the datanode or
>> > the executor itself would give us intuition to find the problem.
>> >
>> > 3. The TableBlockInfo for each task. I log it myself when
>> > debugging. It tells me how many blocklets are node-local. The Spark web
>> > monitor just gives a locality level, but maybe only one blocklet is
>> > local.
>> >
>> >
>> > ---------------------
>> >
>> > Anning Luo
>> >
>> > *HULU*
>> >
>> > Email:
[hidden email]
>> >
>> >
[hidden email]
>> >
>>
>
>