
[jira] [Created] (CARBONDATA-3832) Block Pruning for geospatial polygon expression


Akash R Nilugal (Jira)


Venugopal Reddy K created CARBONDATA-3832:
---------------------------------------------

Summary: Block Pruning for geospatial polygon expression
Key: CARBONDATA-3832
URL: https://issues.apache.org/jira/browse/CARBONDATA-3832
Project: CarbonData
Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Venugopal Reddy K

*[Issue]*

At present, carbon does not do block/blocklet pruning for polygon filter queries. It does row-level filtering at the carbon layer and returns the result. With this approach, all the carbon files are scanned, irrespective of whether there are any matching rows in the block. It also incurs Spark overhead, because many jobs and tasks are launched to process those files. This degrades the overall performance of polygon queries.

*[Solution]*

We can leverage the existing block pruning mechanism in carbon to skip blocks that cannot contain matching rows, which reduces the number of splits. On the executor side, we can also use blocklet pruning to reduce the number of blocklets that are read and scanned.

This improves polygon query performance.
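
The pruning idea can be illustrated with a minimal sketch. It is not CarbonData code: it assumes each block records the min and max of a spatial index column, and that the polygon expression has already been converted into a list of inclusive index ranges. The names `BlockMeta`, `min_id`, `max_id`, and `prune_blocks` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BlockMeta:
    """Hypothetical per-block metadata: min/max of the spatial index column."""
    path: str
    min_id: int
    max_id: int


def overlaps(block: BlockMeta, ranges: List[Tuple[int, int]]) -> bool:
    # A block may contain matching rows only if its [min_id, max_id]
    # interval intersects at least one inclusive polygon range.
    return any(lo <= block.max_id and block.min_id <= hi for lo, hi in ranges)


def prune_blocks(blocks: List[BlockMeta],
                 ranges: List[Tuple[int, int]]) -> List[BlockMeta]:
    # Keep only the blocks that could hold matching rows; the rest are
    # never turned into splits and therefore never scanned.
    return [b for b in blocks if overlaps(b, ranges)]


blocks = [
    BlockMeta("part-0", 0, 99),
    BlockMeta("part-1", 100, 199),
    BlockMeta("part-2", 200, 299),
]
# Assumed output of converting a polygon into index ranges.
polygon_ranges = [(120, 130), (180, 210)]

selected = prune_blocks(blocks, polygon_ranges)
print([b.path for b in selected])  # ['part-1', 'part-2']
```

The same interval check could be applied a second time at blocklet granularity on the executor side. Rows in the surviving blocks would still need row-level filtering, since an overlapping min/max interval only shows that a match is possible.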

--
This message was sent by Atlassian Jira
(v8.3.4#803005)