[jira] [Commented] (CARBONDATA-2204) Access tablestatus file too many times during query



Akash R Nilugal (Jira)

    [ https://issues.apache.org/jira/browse/CARBONDATA-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16376444#comment-16376444 ]

xuchuanyin commented on CARBONDATA-2204:
----------------------------------------

I removed the unrelated log lines; the remaining ones are shown below:

 


18/02/25 21:35:38 ERROR TestLoadDataGeneral: ScalaTest-run-running-TestLoadDataGeneral XU begin to query data
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/tablestatus
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/tablestatus
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/

18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/tablestatus
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/tablestatus
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/tablestatus
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/tablestatus
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/tablestatus
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/
18/02/25 21:35:38 ERROR ThriftReader: ScalaTest-run-running-TestLoadDataGeneral XU Open thrift reader for file: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Fact/Part0/Segment_0/1519623338082.carbonindexmerge
18/02/25 21:35:38 ERROR ThriftReader: ScalaTest-run-running-TestLoadDataGeneral XU Open thrift reader for file: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Fact/Part0/Segment_0/1519623338082.carbonindexmerge

18/02/25 21:35:38 ERROR TestLoadDataGeneral: ScalaTest-run-running-TestLoadDataGeneral XU begin to query2 data
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/tablestatus
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/tablestatus
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/

18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/tablestatus
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/tablestatus
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/tablestatus
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: ScalaTest-run-running-TestLoadDataGeneral XU Open atomic file for read: /home/xu/ws/carbondata/integration/spark-common/target/warehouse/loadtest/Metadata/
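The log above can be tallied per query with a small script. This is only a sketch: it assumes the custom log wording shown in this thread ("XU begin to query data", "Open atomic file for read: ..."), which comes from the logger added for this investigation and not from stock CarbonData. The function name `count_opens`, the `<truncated>` label, and the shortened sample paths are made up for illustration.

```python
import re
from collections import Counter

# Patterns for the custom log lines shown in this thread (assumed wording).
QUERY_RE = re.compile(r"begin to (query\d*) data")
OPEN_RE = re.compile(r"Open atomic file for read: (\S*)")


def count_opens(lines):
    """Return {query_label: Counter({file_name: number_of_opens})}."""
    counts = {}
    current = None
    for line in lines:
        marker = QUERY_RE.search(line)
        if marker:
            # A new "begin to query..." marker starts a new bucket.
            current = marker.group(1)
            counts[current] = Counter()
            continue
        opened = OPEN_RE.search(line)
        if opened and current is not None:
            # Keep only the last path component; a path ending in "/"
            # has no file name, so label it as truncated.
            name = opened.group(1).rsplit("/", 1)[-1] or "<truncated>"
            counts[current][name] += 1
    return counts


# Shortened, illustrative sample (paths abbreviated, not the real log).
SAMPLE = """\
18/02/25 21:35:38 ERROR TestLoadDataGeneral: XU begin to query data
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: XU Open atomic file for read: /warehouse/loadtest/Metadata/tablestatus
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: XU Open atomic file for read: /warehouse/loadtest/Metadata/
18/02/25 21:35:38 ERROR TestLoadDataGeneral: XU begin to query2 data
18/02/25 21:35:38 ERROR AtomicFileOperationsImpl: XU Open atomic file for read: /warehouse/loadtest/Metadata/tablestatus
""".splitlines()

for label, per_file in count_opens(SAMPLE).items():
    print(label, dict(per_file))
```

Run against the full log above, a tally like this should show 7 opens of tablestatus for the first query and 5 for the second, with the lines whose path is cut off after `Metadata/` counted separately.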

> Access tablestatus file too many times during query
> ---------------------------------------------------
>
>                 Key: CARBONDATA-2204
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2204
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: data-query
>    Affects Versions: 1.3.0
>            Reporter: xuchuanyin
>            Priority: Major
>
> * Problems
> Currently in CarbonData, a single query reads the tablestatus file 7 times. This slows down the query, especially when the file is stored on a remote cluster, because reading it is a purely client-side operation.
>
> * Steps to reproduce
> 1. Add a log statement in `AtomicFileOperationsImpl.openForRead` that prints the name of the file being read.
> 2. Run a query on a CarbonData table. Here I ran `TestLoadDataGeneral.test("test data loading CSV file without extension name")`.
> 3. Inspect the output log and search for the keyword 'tablestatus'.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)