http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/GC-problem-and-performance-refine-problem-tp2844p2867.html
1. I have set --num-executors=100 in all tests. The image shows nearly 100
nodes, with up to three executors on one node. The total executor number is
always 100.
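For reference, we pin the executor count at submit time roughly like this (a
sketch; --num-executors is the point here, and the other values just mirror
the cluster description quoted below):

spark-submit --num-executors 100 --executor-cores 5 --executor-memory 10G \
    --conf spark.dynamicAllocation.enabled=false ...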
2. I do not find the property enable.blocklet.distribution in 0.1.1; I
found it in 0.2.0. I am testing on 0.1.1 and will try it later.
As for the IO problem, statistics of the long tasks look like this:
+--------------------+----------------+--------------------+----------------+---------------+-----------+-------------------+
| task_id|load_blocks_time|load_dictionary_time|scan_blocks_time|scan_blocks_num|result_size|total_executor_time|
+--------------------+----------------+--------------------+----------------+---------------+-----------+-------------------+
+--------------------+----------------+--------------------+----------------+---------------+-----------+-------------------+
Tasks with a long IO will be re-sent (speculation is enabled).
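(When we move to 0.2.0, where the property exists, it would go into
carbon.properties as below; a sketch based only on the property name quoted
above:)

enable.blocklet.distribution=false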
3. Distinct values:
4. In the query, only e, f, g, h, d and A are used in the filter; the others
are not. So I think the columns used only in aggregation do not need to be
added to the index.
5. I have written e, f, g, h, d leftmost, just after "CREATE TABLE...".
> Hi An Lan,
>
> Please confirm below things.
>
> 1. Is dynamic executor allocation enabled? If it is, can you disable it and
> try? (This is to check whether dynamic allocation has any impact.)
> To disable it, set --num-executors=100 in spark-defaults.conf.
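> (In spark-defaults.conf property form, a sketch of the equivalent settings;
> the property names are standard Spark:)
>
> spark.dynamicAllocation.enabled false
> spark.executor.instances 100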
>
> 2. Can you please set the property enable.blocklet.distribution=false
> and execute the query?
>
> 3. Cardinality of each column.
>
> 4. Any reason why you are setting "NO_INVERTED_INDEX"="a"?
>
> 5. Can you keep all the columns present in the filter on the left side?
> Then fewer blocks will be identified during block pruning, which will
> improve the query performance.
>
>
> -Regards
> Kumar Vishal
>
> On Fri, Nov 11, 2016 at 11:59 AM, An Lan <[hidden email]> wrote:
>
> > Hi Kumar Vishal,
> >
> > 1. Create table ddl:
> >
> > CREATE TABLE IF NOT EXISTS Table1
> > (h Int, g Int, d String, f Int, e Int,
> > a Int, b Int, …(extra near 300 columns))
> > STORED BY 'org.apache.carbondata.format'
> > TBLPROPERTIES(
> > "NO_INVERTED_INDEX"="a",
> > "NO_INVERTED_INDEX"="b",
> > …(extra near 300 columns)
> > "DICTIONARY_INCLUDE"="a",
> > "DICTIONARY_INCLUDE"="b",
> > …(extra near 300 columns)
> > )
> >
> > 2. 3. There are more than a hundred nodes in the cluster, but the
> > cluster is shared with other applications. Sometimes, when enough nodes
> > are free, we will get 100 distinct nodes.
> >
> > 4. I give statistics of the task times during one query and mark the
> > distinct nodes below:
> >
> > [image: inline image 1]
> >
> >
> >
> >
> > 2016-11-10 23:52 GMT+08:00 Kumar Vishal <[hidden email]>:
> >
> >> Hi Anning Luo,
> >>
> >> Can you please provide the details below.
> >>
> >> 1. Create table DDL.
> >> 2. Number of nodes in your cluster setup.
> >> 3. Number of executors per node.
> >> 4. Query statistics.
> >>
> >> Please find my comments in bold.
> >>
> >> Problem:
> >> 1. GC problem. We suffer 20%~30% GC time for some tasks in the
> >> first stage, after a lot of parameter tuning. We now use G1 GC on Java 8;
> >> GC time doubles if we use CMS. The main GC time is spent on
> >> young-generation GC. Almost half the memory of the young generation is
> >> copied to the old generation. It seems many objects live longer than a GC
> >> period, and their space is not reused (as concurrent GC will release it
> >> later). When we use a large Eden (>=1G, for example), a single GC takes
> >> seconds. If Eden is set small (256M, for example), a single GC takes
> >> hundreds of milliseconds, but GCs are more frequent and the total is
> >> still seconds. Is there any way to lessen the GC time? (We don't consider
> >> the first query and second query in this case.)
> >>
> >> *How many nodes are present in your cluster setup? If there are few
> >> nodes, please reduce the number of executors per node.*
> >>
> >> 2. Performance tuning problem. The number of rows left after
> >> filtering is not uniform, so some nodes may be heavily loaded and spend
> >> more time than other nodes. The time of one task is 4s ~ 16s. Is there
> >> any method to improve this?
> >>
> >> 3. Too long a time for the first and second query. I know
> >> the dictionary and some indexes need to be loaded the first time. But
> >> even after trying the query below to preheat them, it still spends a lot
> >> of time. How could I preheat the query correctly?
> >> select Aarray, a, b, c… from Table1 where Aarray is
> >> not null and d = "sss" and e != 22 and f = 33 and g = 44 and h = 55
> >>
> >> *Currently we are working on first-time query improvement. For now you
> >> can run select count(*) or count(column), so all the blocks get loaded,
> >> and then you can run the actual query.*
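> >> A minimal warm-up in that spirit (a sketch; Table1 as in the DDL above):
> >>
> >> select count(*) from Table1;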
> >>
> >>
> >> 4. Any other suggestions to lessen the query time?
> >>
> >>
> >> Some suggestions:
> >> The log from class QueryStatisticsRecorder gives me a good means
> >> to find the bottleneck, but it is not enough. There are still some
> >> metrics I think would be very useful:
> >> 1. Filter ratio, i.e. not only result_size but also the origin
> >> size, so we could know how much data is filtered.
> >> 2. IO time. The scan_blocks_time is not enough. If it is high,
> >> we know something is wrong, but not what caused the problem. The real IO
> >> time for the data is not provided. As there may be several files for one
> >> partition, knowing whether the slowness is caused by the datanode or the
> >> executor itself gives us intuition to find the problem.
> >> 3. The TableBlockInfo for a task. I log it by myself when
> >> debugging. It tells me how many blocklets are local. The Spark web
> >> monitor just gives a locality level, but maybe only one blocklet is
> >> local.
> >>
> >>
> >> -Regards
> >> Kumar Vishal
> >>
> >> On Thu, Nov 10, 2016 at 8:55 PM, An Lan <[hidden email]> wrote:
> >>
> >> > Hi,
> >> >
> >> > We are using CarbonData to build our table and running queries in
> >> > CarbonContext. We have some performance problems while tuning the
> >> > system.
> >> >
> >> > *Background*:
> >> >
> >> > *cluster*: 100 executors, 5 tasks/executor, 10G
> >> > memory/executor
> >> >
> >> > *data*: 60+GB (per one replica) in carbon data
> >> > format, 600+MB/file * 100 files, 300+ columns, 300+ million rows
> >> >
> >> > *sql example:*
> >> >
> >> > select A,
> >> > sum(a),
> >> > sum(b),
> >> > sum(c),
> >> > …(extra 100 aggregations like sum(column))
> >> > from Table1 LATERAL VIEW explode(split(Aarray, ';')) ATable AS A
> >> > where A is not null and d > "ab:c-10" and d < "h:0f3s"
> >> > and e != 10 and f = 22 and g = 33 and h = 44 GROUP BY A
> >> >
> >> > *target query time*: <10s
> >> >
> >> > *current query time*: 15s ~ 25s
> >> >
> >> > *scene:* OLAP system, <100 queries every day.
> >> > The concurrency is not high. Most of the time the CPU is idle, so this
> >> > service will run together with other programs. The service will run
> >> > for a long time, so we cannot occupy very large memory for every
> >> > executor.
> >> >
> >> > *refine*: I have built an index and dictionary on
> >> > d, e, f, g, h, and built a dictionary on all the other aggregation
> >> > columns (i.e. a, b, c, … 100+ columns), and made sure there is one
> >> > segment for the total data. I have enabled speculation (quantile=0.5,
> >> > interval=250, multiplier=1.2).
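> >> > (In spark-defaults.conf terms, the speculation settings above
> >> > correspond roughly to the following; the property names are standard
> >> > Spark, the values are the ones quoted above:)
> >> >
> >> > spark.speculation true
> >> > spark.speculation.quantile 0.5
> >> > spark.speculation.interval 250
> >> > spark.speculation.multiplier 1.2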
> >> >
> >> > Time is mainly spent on the first stage, before shuffling. As 95% of
> >> > the data will be filtered out, the shuffle process spends little time.
> >> > In the first stage most tasks complete in less than 10s, but there are
> >> > still nearly 50 tasks longer than 10s. The max task time in one query
> >> > may be 12~16s.
> >> >
> >> > *Problem:*
> >> >
> >> > 1. GC problem. We suffer 20%~30% GC time for some tasks in the
> >> > first stage, after a lot of parameter tuning. We now use G1 GC on
> >> > Java 8; GC time doubles if we use CMS. The main GC time is spent on
> >> > young-generation GC. Almost half the memory of the young generation is
> >> > copied to the old generation. It seems many objects live longer than a
> >> > GC period, and their space is not reused (as concurrent GC will release
> >> > it later). When we use a large Eden (>=1G, for example), a single GC
> >> > takes seconds. If Eden is set small (256M, for example), a single GC
> >> > takes hundreds of milliseconds, but GCs are more frequent and the total
> >> > is still seconds. Is there any way to lessen the GC time? (We don't
> >> > consider the first query and second query in this case.)
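> >> > (For reference, a sketch of how such GC settings can be passed to the
> >> > executors; the flag values here only illustrate the small-Eden
> >> > experiment above, they are not our exact production flags:)
> >> >
> >> > spark.executor.extraJavaOptions -XX:+UseG1GC -XX:MaxGCPauseMillis=250 -XX:+UnlockExperimentalVMOptions -XX:G1MaxNewSizePercent=10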
> >> >
> >> > 2. Performance tuning problem. The number of rows left after
> >> > filtering is not uniform, so some nodes may be heavily loaded and spend
> >> > more time than other nodes. The time of one task is 4s ~ 16s. Is there
> >> > any method to improve this?
> >> >
> >> > 3. Too long a time for the first and second query. I know the
> >> > dictionary and some indexes need to be loaded the first time. But even
> >> > after trying the query below to preheat them, it still spends a lot of
> >> > time. How could I preheat the query correctly?
> >> >
> >> > select Aarray, a, b, c… from Table1 where Aarray is not null
> >> > and d = "sss" and e != 22 and f = 33 and g = 44 and h = 55
> >> >
> >> > 4. Any other suggestions to lessen the query time?
> >> >
> >> >
> >> >
> >> > Some suggestions:
> >> >
> >> > The log from class QueryStatisticsRecorder gives me a good means
> >> > to find the bottleneck, but it is not enough. There are still some
> >> > metrics I think would be very useful:
> >> >
> >> > 1. Filter ratio, i.e. not only result_size but also the origin
> >> > size, so we could know how much data is filtered.
> >> >
> >> > 2. IO time. The scan_blocks_time is not enough. If it is high,
> >> > we know something is wrong, but not what caused the problem. The real
> >> > IO time for the data is not provided. As there may be several files for
> >> > one partition, knowing whether the slowness is caused by the datanode
> >> > or the executor itself gives us intuition to find the problem.
> >> >
> >> > 3. The TableBlockInfo for a task. I log it by myself when
> >> > debugging. It tells me how many blocklets are local. The Spark web
> >> > monitor just gives a locality level, but maybe only one blocklet is
> >> > local.
> >> >
> >> >
> >> > ---------------------
> >> >
> >> > Anning Luo
> >> >
> >> > *HULU*
> >> >
Email: [hidden email]
> >> > [hidden email]
> >> >
> >>
> >
> >
>