Apache CarbonData Dev Mailing List archive

[DISCUSSION] the driver time cost with spark sql + carbondata

litao

Aug 24, 2021; 12:56am

[DISCUSSION] the driver time cost with spark sql + carbondata

When using Spark SQL with CarbonData for data loading and querying, we often find that the time spent in the driver is disproportionately high, especially in concurrent query scenarios. Sometimes the total query time is 4s, but the driver accounts for more than 3s of it. This deserves more attention.
Within the driver, communication with the Hive metastore or the NameNode is an obvious performance risk.
Both the Hive metastore and the NameNode can be slow to respond. Can we come up with some ways to solve this problem?

litao

Aug 27, 2021; 8:40am

Re: [DISCUSSION] the driver time cost with spark sql + carbondata

I raised this issue on the Spark dev mailing list and received some replies:
https://lists.apache.org/thread.html/rae7cc9057c3e90f439de5b899d0b0c0fb1e9cbee85349d15ec1236e9%40%3Cdev.spark.apache.org%3E

This is a good sign, although the replies read as rather academic.

Let's talk more about the driver.