Apache CarbonData Dev Mailing List archive

Re: [DISCUSSION]Support for Geospatial indexing

Posted by Jacky Li-3 on
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/DISCUSSION-Support-for-Geospatial-indexing-tp85268p85306.html

definitely +1.

Before going through the design doc, I have two questions:
1. In this domain, there are open-source solutions that provide SQL extensions or a DSL for geographical analytics, such as GeoMesa (which also works with Spark). Is integration with GeoMesa being considered as well? Could GeoMesa users benefit from the CarbonData spatial index?

2. Besides the Z-order curve, other curves, such as the Hilbert curve, may be useful in some use cases. To maximize the extensibility of CarbonData, is it possible to have a framework that supports different curve implementations?
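
To make question 2 concrete, here is a minimal sketch of what a pluggable curve framework could look like. It is written in Python purely for illustration and is not taken from the design doc or from CarbonData; the names `SpaceFillingCurve`, `ZOrderCurve`, `HilbertCurve`, `CURVES` and `make_curve` are all hypothetical, and coordinates are assumed to be already quantized to non-negative integer grid cells.

```python
from abc import ABC, abstractmethod


class SpaceFillingCurve(ABC):
    """Hypothetical plug-in point: maps a grid cell (x, y) to a 1-D sort key."""

    def __init__(self, order: int):
        self.order = order        # number of bits per dimension
        self.side = 1 << order    # grid is side x side cells

    @abstractmethod
    def index(self, x: int, y: int) -> int:
        """Return the position of cell (x, y) along the curve."""


class ZOrderCurve(SpaceFillingCurve):
    def index(self, x: int, y: int) -> int:
        # Interleave bits: x goes to even positions, y to odd positions.
        z = 0
        for i in range(self.order):
            z |= ((x >> i) & 1) << (2 * i)
            z |= ((y >> i) & 1) << (2 * i + 1)
        return z


class HilbertCurve(SpaceFillingCurve):
    def index(self, x: int, y: int) -> int:
        # Standard iterative xy-to-distance conversion for the Hilbert curve.
        d = 0
        s = self.side // 2
        while s > 0:
            rx = 1 if x & s else 0
            ry = 1 if y & s else 0
            d += s * s * ((3 * rx) ^ ry)
            # Rotate/flip the quadrant so the next level is oriented correctly.
            if ry == 0:
                if rx == 1:
                    x = self.side - 1 - x
                    y = self.side - 1 - y
                x, y = y, x
            s //= 2
        return d


# Hypothetical registry: a table property could select the curve by name.
CURVES = {"zorder": ZOrderCurve, "hilbert": HilbertCurve}


def make_curve(name: str, order: int) -> SpaceFillingCurve:
    return CURVES[name](order)
```

With something like this, the sort key would be computed through `make_curve(name, order).index(x, y)`, so adding another curve would only mean registering one more class.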

Regards,
Jacky

On 2019/10/16 11:31:35, Venu Reddy <[hidden email]> wrote:

> Hi all,
>
> In general, a database may contain geographical location data. For instance,
> telecom operators need to perform analytics based on a particular
> region or cell tower IDs (within a region), and these may include
> geographical locations for a particular period of time. At present, Carbon
> does not have native support for storing geographical locations/coordinates
> or for running filter queries on them. The longitude and latitude of a
> coordinate can, however, be treated as independent columns, sorted
> hierarchically, and stored.
>
> But when longitude and latitude are treated independently, the 2D
> space is linearized, i.e., points in the two-dimensional domain are ordered
> by sorting first on longitude and then on latitude. Thus, data is not
> ordered by geospatial proximity. Hence, range queries require a lot of IO
> operations and query performance is degraded.
>
> To alleviate this, we can use a Z-order curve to store geospatial data
> points. This tends to place geographically nearby points in the same
> block/blocklet, which reduces the IO operations for range queries and
> improves query performance. It also makes it possible to support polygon
> queries on geodata.
>
> I have raised a JIRA, https://issues.apache.org/jira/browse/CARBONDATA-3548, and
> attached a design document to it. Please have a look. Your opinions and
> suggestions are welcome.
>
> Thanks,
>
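
The effect described in the quoted message can be sketched on a toy grid. This is an illustration only, not the layout from the design document: it assumes coordinates are already quantized to integer cells, treats a "block" as a fixed run of 16 consecutive rows in sort order, and uses hypothetical helper names (`interleave_bits`, `blocks_touched`). It counts how many blocks a square range query has to read under each sort order.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Z-order key: x bits go to even positions, y bits to odd positions."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def blocks_touched(points, sort_key, block_size, in_range):
    """Sort the points, cut them into fixed-size blocks, and return the
    set of block numbers that contain at least one matching point."""
    ordered = sorted(points, key=sort_key)
    return {i // block_size for i, p in enumerate(ordered) if in_range(p)}


GRID = 16          # 16 x 16 cells; x stands in for longitude, y for latitude
BLOCK_SIZE = 16    # rows per block (assumed value for the illustration)
points = [(x, y) for x in range(GRID) for y in range(GRID)]


def in_box(p):
    # Range query: the 4 x 4 square with 4 <= x <= 7 and 4 <= y <= 7.
    return 4 <= p[0] <= 7 and 4 <= p[1] <= 7


# Sorting on x first, then y (independent columns, hierarchical sort).
lexicographic_blocks = blocks_touched(points, lambda p: p, BLOCK_SIZE, in_box)
# Sorting on the Z-order key.
z_order_blocks = blocks_touched(points, lambda p: interleave_bits(*p), BLOCK_SIZE, in_box)

print(len(lexicographic_blocks), len(z_order_blocks))  # prints "4 1"
```

Here the query box is aligned with the curve's quadrants, which is the best case for Z-order; a box that straddles quadrant boundaries touches more blocks, so the gain depends on the query shape and the block size.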