
Re: Discussion about getting execution duration for a query when using spark-shell + CarbonData

Posted by Liang Chen on Feb 07, 2017; 5:16am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Discussion-about-getting-excution-duration-about-a-query-when-using-sparkshell-carbondata-tp7379p7388.html

Hi

I used the method below in spark-shell for a demo, for your reference:

import org.apache.spark.sql.catalyst.util._

benchmark {
  carbondf.filter($"name" === "Allen" and $"gender" === "Male"
    and $"province" === "NB" and $"singler" === "false").count
}
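Note that `benchmark` comes from Spark's internal `org.apache.spark.sql.catalyst.util` package object, which may change between Spark versions. A minimal stand-alone timing helper, a sketch rather than CarbonData's own API, could look like this (`carbondf` is the DataFrame from the example above):

```scala
// Minimal sketch of a timing helper, similar in spirit to catalyst's
// benchmark, for Spark versions where that internal utility is unavailable.
def time[T](body: => T): T = {
  val start = System.nanoTime()
  val result = body            // run the block being measured
  val elapsedMs = (System.nanoTime() - start) / 1e6
  println(f"Elapsed: $elapsedMs%.1f ms")
  result
}

// Hypothetical usage with the carbondf DataFrame from the example above:
// time { carbondf.filter($"name" === "Allen" and $"gender" === "Male").count }
```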


Regards

Liang

2017-02-06 22:07 GMT-05:00 Yinwei Li <[hidden email]>:

> Hi all,
>
>
>   When we use spark-shell + CarbonData to run a query, how can we
> get the execution duration? A few issues come up:
>
>
>   1. One query can produce one or more jobs, and some of the jobs may have
> DAG dependencies, so we can't get the execution duration by summing all
> the jobs' durations, nor roughly by taking the maximum job duration.
>
>
>   2. In the spark-shell console or the Spark application web UI we can
> get each job's duration, but we can't see the CarbonData query's duration
> directly; perhaps CarbonData could improve this in the near future.
>
>
>   3. Maybe we can use the following commands to get an approximate result:
>
>
>     scala> import java.util.Date
>     scala> val begin = new Date(); cc.sql("$SQL_COMMAND").show; val end =
> new Date()
>
>
>   Any other opinions?
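
The approximate method in point 3 can also report the duration explicitly by diffing the two timestamps. A sketch (the `cc.sql(...)` call from the mail is left as a commented placeholder, since `cc` and `$SQL_COMMAND` come from the original message):

```scala
import java.util.Date

// Approximate wall-clock timing around a query, as in point 3 above.
val begin = new Date()
// cc.sql("$SQL_COMMAND").show   // the query being timed (placeholder)
val end = new Date()
println(s"Query took ${end.getTime - begin.getTime} ms")
```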




--
Regards
Liang