[jira] [Assigned] (CARBONDATA-4079) Queries with Date range are taking time

Akash R Nilugal (Jira)

[ https://issues.apache.org/jira/browse/CARBONDATA-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajantha Bhat reassigned CARBONDATA-4079:
----------------------------------------

Assignee: Ajantha Bhat

> Queries with Date range are taking time
> ---------------------------------------
>
> Key: CARBONDATA-4079
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4079
> Project: CarbonData
> Issue Type: Improvement
> Components: data-query
> Affects Versions: 2.1.0
> Reporter: suyash yadav
> Assignee: Ajantha Bhat
> Priority: Major
>
> Hi Team,
> We are doing a POC to understand how we can improve the performance of queries run against tables created in Apache CarbonData.
> Below is the sample query:
>
> *spark.sql("select ts,resource,metric,value from fact_timestamp_global left join tags_10_Days_test on fact_timestamp_global.tags_id= tags_10_Days_test.id where metric in ('Outbound Utilization (percent)','Inbound Utilization (percent)') and resource='10.212.7.98_if:<0001>' and ts between '2020-09-21 00:00:00' and '2020-09-21 12:55:55' group by ts,resource,metric,value").show(10000,false)*
> As you can see, the above query contains a date range filter. We have noticed that, because of this date range filter, the query takes around 15 seconds. That is too slow for our purposes, as we need to bring the query execution time down to 3 to 4 seconds.
> Could you please review the above query and suggest a better way of framing it, especially the date range filter, that would help us reach the desired query execution time?
>
> If you need more details, please let me know.
>
> Regards
> Suyash Yadav

--
This message was sent by Atlassian Jira
(v8.3.4#803005)