Hi all,
When we are using spark-shell + CarbonData to send a query, how can we get the execution duration? A few points come to mind:

1. One query can produce one or more jobs, and some of the jobs may have DAG dependencies, so we can't get the execution duration by summing up all the jobs' durations, nor even roughly by taking the maximum job duration.

2. In the spark-shell console or the Spark application web UI, we can get each job's duration, but we can't get the duration of the CarbonData query directly; perhaps CarbonData could improve this in the near future.

3. Maybe we can use the following commands to get an approximate result:

scala> val begin = new Date(); cc.sql("$SQL_COMMAND").show; val end = new Date()

Any other opinions?
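The Date-based idea in point 3 can be wrapped in a small reusable helper (my own sketch, not a CarbonData API). Note that for a DataFrame or SQL query you must force execution inside the block (e.g. with .show or .count), otherwise only plan construction is timed; `cc` below stands for the CarbonContext from this thread.

```scala
// Time any synchronous block and print the elapsed milliseconds.
// Returns the block's result so it can be used inline in spark-shell.
def timed[T](label: String)(body: => T): T = {
  val start = System.nanoTime()
  val result = body
  val elapsedMs = (System.nanoTime() - start) / 1e6
  println(f"$label%s took $elapsedMs%.1f ms")
  result
}

// Example in spark-shell (hypothetical query):
// timed("my query") { cc.sql("$SQL_COMMAND").show() }
```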
Hi
I used the method below in spark-shell for a demo, for your reference:

import org.apache.spark.sql.catalyst.util._

benchmark { carbondf.filter($"name" === "Allen" and $"gender" === "Male" and $"province" === "NB" and $"singler" === "false").count }

Regards
Liang
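As far as I know, catalyst's `benchmark` simply prints the elapsed time of a single run of the block. For short queries a multi-run variant smooths out JVM warm-up and cache effects; the helper below is my own sketch, not part of Spark or CarbonData.

```scala
// Run the block `runs` times, print min/avg elapsed milliseconds,
// and return the result of the last run.
def benchmarkN[T](runs: Int = 5)(body: => T): T = {
  require(runs > 0, "need at least one run")
  var last: T = null.asInstanceOf[T]
  val timesMs = (1 to runs).map { _ =>
    val start = System.nanoTime()
    last = body
    (System.nanoTime() - start) / 1e6
  }
  println(f"runs=$runs min=${timesMs.min}%.1f ms avg=${timesMs.sum / runs}%.1f ms")
  last
}

// Example in spark-shell:
// benchmarkN(3) { carbondf.filter($"name" === "Allen").count }
```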
Hi
Now I can use CarbonData 1.0.0 with spark-shell (Spark 2.1) as:

./bin/spark-shell --jars <carbondata assembly jar path>

but it's inconvenient to get the query time, so I tried to use ./bin/spark-sql --jars <carbondata assembly jar path>, but I got an error when creating a table:

spark-sql> create table if not exists test_table(id string, name string, city string, age int) stored by 'carbondata';
Error in query:
Operation not allowed: STORED BY(line 1, pos 87)

It seems that the carbondata jar is not loaded successfully. How can I use ./bin/spark-sql?

Regards

Libis
Hi Libis,
The spark-sql CLI is not supported by CarbonData. Why don't you use the Carbon thrift server and beeline? It works the same way as the spark-sql CLI, and it reports the execution time for each query.

Start the CarbonData thrift server:

bin/spark-submit --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer <carbondata jar file> <store-location>

Connect with beeline:

bin/beeline -u jdbc:hive2://localhost:10000

Regards,
Ravindra