[jira] [Updated] (CARBONDATA-4051) Geo spatial index algorithm improvement and UDFs enhancement

Akash R Nilugal (Jira)

     [ https://issues.apache.org/jira/browse/CARBONDATA-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiayu Shen updated CARBONDATA-4051:
-----------------------------------
    Attachment: CarbonData Spatial Index Design Doc v2.docx

> Geo spatial index algorithm improvement and UDFs enhancement
> ------------------------------------------------------------
>
>                 Key: CARBONDATA-4051
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4051
>             Project: CarbonData
>          Issue Type: New Feature
>            Reporter: Jiayu Shen
>            Priority: Minor
>         Attachments: CarbonData Spatial Index Design Doc v2.docx, Genex Cloud&Discvoery Carbon Spatial Index Specification.docx
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> The requirement comes from SEQ; the related algorithms are provided by the Discovery group.
> 1. Replace the geohash encoding algorithm and reduce the properties required by CREATE TABLE. For example:
> {code:java}
> CREATE TABLE geoTable(
>  timevalue BIGINT,
>  longitude LONG,
>  latitude LONG) COMMENT "This is a GeoTable"
>  STORED AS carbondata
>  TBLPROPERTIES ($customProperties 'SPATIAL_INDEX'='mygeohash',
>  'SPATIAL_INDEX.mygeohash.type'='geohash',
>  'SPATIAL_INDEX.mygeohash.sourcecolumns'='longitude, latitude',
>  'SPATIAL_INDEX.mygeohash.originLatitude'='39.832277',
>  'SPATIAL_INDEX.mygeohash.gridSize'='50',
>  'SPATIAL_INDEX.mygeohash.conversionRatio'='1000000'){code}
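
To illustrate how the integer coordinates in this issue relate to decimal degrees, here is a minimal sketch. It assumes that `conversionRatio` is the factor that scales decimal degrees to the stored long values, which matches the sample values in the description but is not stated in it. The class and method names (`ConversionRatioSketch`, `toStored`, `toDegrees`) are hypothetical and are not part of CarbonData.

```java
final class ConversionRatioSketch {

    // Assumption: stored value = degrees * conversionRatio, rounded to the nearest long.
    static long toStored(double degrees, long conversionRatio) {
        return Math.round(degrees * conversionRatio);
    }

    // Inverse of toStored: recover decimal degrees from the stored long value.
    static double toDegrees(long stored, long conversionRatio) {
        return (double) stored / conversionRatio;
    }

    public static void main(String[] args) {
        long ratio = 1_000_000L; // value of 'SPATIAL_INDEX.mygeohash.conversionRatio' above

        // Sample longitude/latitude values taken from the INSERT example in this issue.
        System.out.println("longitude in degrees: " + toDegrees(116285807L, ratio));
        System.out.println("latitude in degrees:  " + toDegrees(40084087L, ratio));

        // originLatitude from the table properties, expressed in stored units.
        System.out.println("originLatitude stored: " + toStored(39.832277, ratio));
    }
}
```

Under this assumption, the `longitude` and `latitude` columns hold degrees multiplied by 1000000, so they can be stored as longs without losing the sixth decimal place.
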
> 2. Add geo query UDFs.
> Query filter UDFs:
>  * _*InPolygonList (List<String> polygonList, OperationType opType)*_
>  * _*InPolylineList (List<String> polylineList, Float bufferInMeter)*_
>  * _*InPolygonRangeList (List<Long []> RangeList, OperationType opType)*_
> *OperationType supports only:*
>  * *"OR", which computes the union of two polygons*
>  * *"AND", which computes the intersection of two polygons*
> Geo utility UDFs:
>  * _*GeoIdToGridXy(Long geoId) : Pair<Integer, Integer>*_
>  * _*LatLngToGeoId(Long latitude, Long longitude) : Long*_
>  * _*GeoIdToLatLng(Long geoId) : Pair<Double, Double>*_
>  * _*ToUpperLayerGeoId(Long geoId) : Long*_
>  * _*ToRangeList (String polygon) : List<Long []>*_
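
The "OR" and "AND" operations can be pictured on range lists such as those returned by ToRangeList. The sketch below is an illustration only, not the CarbonData implementation. It assumes each range is an inclusive `[startGeoId, endGeoId]` pair of non-negative IDs and uses `long[]` instead of `Long[]` for brevity. The class `RangeListOps` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class RangeListOps {

    /** Returns a sorted copy in which overlapping or adjacent ranges are merged. */
    static List<long[]> normalize(List<long[]> ranges) {
        List<long[]> sorted = new ArrayList<>();
        for (long[] r : ranges) {
            sorted.add(r.clone()); // copy so the caller's arrays are not modified
        }
        sorted.sort(Comparator.comparingLong(r -> r[0]));
        List<long[]> out = new ArrayList<>();
        for (long[] r : sorted) {
            long[] last = out.isEmpty() ? null : out.get(out.size() - 1);
            if (last != null && (r[0] <= last[1] || r[0] == last[1] + 1)) {
                last[1] = Math.max(last[1], r[1]); // extend the previous range
            } else {
                out.add(r);
            }
        }
        return out;
    }

    /** "OR": every ID covered by either list. */
    static List<long[]> union(List<long[]> a, List<long[]> b) {
        List<long[]> all = new ArrayList<>(a);
        all.addAll(b);
        return normalize(all);
    }

    /** "AND": every ID covered by both lists, found with a two-pointer sweep. */
    static List<long[]> intersect(List<long[]> a, List<long[]> b) {
        List<long[]> x = normalize(a);
        List<long[]> y = normalize(b);
        List<long[]> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < x.size() && j < y.size()) {
            long lo = Math.max(x.get(i)[0], y.get(j)[0]);
            long hi = Math.min(x.get(i)[1], y.get(j)[1]);
            if (lo <= hi) {
                out.add(new long[] {lo, hi});
            }
            // Advance whichever range ends first.
            if (x.get(i)[1] < y.get(j)[1]) {
                i++;
            } else {
                j++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<long[]> a = Arrays.asList(new long[] {1, 5}, new long[] {10, 20});
        List<long[]> b = Arrays.asList(new long[] {4, 12}, new long[] {30, 40});
        for (long[] r : union(a, b)) {
            System.out.println("OR  " + Arrays.toString(r));
        }
        for (long[] r : intersect(a, b)) {
            System.out.println("AND " + Arrays.toString(r));
        }
    }
}
```

Normalizing before combining keeps both operations linear after the sort, and the result is again a sorted, non-overlapping range list that could be fed into a further "OR" or "AND" step.
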
> 3. Currently GeoID is a column generated internally for spatial tables. This PR allows the GeoID column to be set explicitly during LOAD/INSERT INTO. For example:
> {code:java}
> INSERT INTO geoTable SELECT 0,1575428400000,116285807,40084087;
> It used to be as below, where '855280799612' is generated internally:
> +------------+-------------+---------+--------+
> |mygeohash   |timevalue    |longitude|latitude|
> +------------+-------------+---------+--------+
> |855280799612|1575428400000|116285807|40084087|
> +------------+-------------+---------+--------+
> but now it is:
> +------------+-------------+---------+--------+
> |mygeohash   |timevalue    |longitude|latitude|
> +------------+-------------+---------+--------+
> |0           |1575428400000|116285807|40084087|
> +------------+-------------+---------+--------+{code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)