[ https://issues.apache.org/jira/browse/CARBONDATA-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiayu Shen updated CARBONDATA-4051:
-----------------------------------
    Description:
The requirement is from SEQ; the related algorithms are provided by the Discovery Team.

1. Replace the geohash encoding algorithm and reduce the required properties of CREATE TABLE. For example,
{code:java}
CREATE TABLE geoTable(
 timevalue BIGINT,
 longitude LONG,
 latitude LONG) COMMENT "This is a GeoTable"
 STORED AS carbondata
 TBLPROPERTIES ($customProperties 'SPATIAL_INDEX'='mygeohash',
 'SPATIAL_INDEX.mygeohash.type'='geohash',
 'SPATIAL_INDEX.mygeohash.sourcecolumns'='longitude, latitude',
 'SPATIAL_INDEX.mygeohash.originLatitude'='39.832277',
 'SPATIAL_INDEX.mygeohash.gridSize'='50',
 'SPATIAL_INDEX.mygeohash.conversionRatio'='1000000'){code}

2. Add geo query UDFs

query filter UDFs :
 * _*InPolygonList (List<String> polygonList, OperationType opType)*_
 * _*InPolylineList (List<String> polylineList, Float bufferInMeter)*_
 * _*InPolygonRangeList (List<Long []> RangeList, OperationType opType)*_

*The operation type supports only:*
 * *"OR": computes the union of two polygons*
 * *"AND": computes the intersection of two polygons*

geo util UDFs :
 * _*GeoIdToGridXy(Long geoId) : Pair<Integer, Integer>*_
 * _*LatLngToGeoId(Long latitude, Long longitude) : Long*_
 * _*GeoIdToLatLng(Long geoId) : Pair<Double, Double>*_
 * _*ToUpperLayerGeoId(Long geoId) : Long*_
 * _*ToRangeList (String polygon) : List<Long []>*_

3. Currently, GeoID is a column created internally for spatial tables. This PR adds support for customizing the GeoID column during LOAD/INSERT INTO. For example,
{code:java}
INSERT INTO geoTable SELECT 0,1575428400000,116285807,40084087;

It used to be as below; '855280799612' was generated internally:
+------------+-------------+---------+--------+
|mygeohash   |timevalue    |longitude|latitude|
+------------+-------------+---------+--------+
|855280799612|1575428400000|116285807|40084087|
+------------+-------------+---------+--------+
but now it is:
+------------+-------------+---------+--------+
|mygeohash   |timevalue    |longitude|latitude|
+------------+-------------+---------+--------+
|0           |1575428400000|116285807|40084087|
+------------+-------------+---------+--------+{code}

  was:
The requirement is from SEQ; the related algorithms are provided by group Discovery.

1. Replace the geohash encoding algorithm and reduce the required properties of CREATE TABLE. For example,
{code:java}
CREATE TABLE geoTable(
 timevalue BIGINT,
 longitude LONG,
 latitude LONG) COMMENT "This is a GeoTable"
 STORED AS carbondata
 TBLPROPERTIES ($customProperties 'SPATIAL_INDEX'='mygeohash',
 'SPATIAL_INDEX.mygeohash.type'='geohash',
 'SPATIAL_INDEX.mygeohash.sourcecolumns'='longitude, latitude',
 'SPATIAL_INDEX.mygeohash.originLatitude'='39.832277',
 'SPATIAL_INDEX.mygeohash.gridSize'='50',
 'SPATIAL_INDEX.mygeohash.conversionRatio'='1000000'){code}

2. Add geo query UDFs

query filter UDFs :
 * _*InPolygonList (List<String> polygonList, OperationType opType)*_
 * _*InPolylineList (List<String> polylineList, Float bufferInMeter)*_
 * _*InPolygonRangeList (List<Long []> RangeList, OperationType opType)*_

*The operation type supports only:*
 * *"OR": computes the union of two polygons*
 * *"AND": computes the intersection of two polygons*

geo util UDFs :
 * _*GeoIdToGridXy(Long geoId) : Pair<Integer, Integer>*_
 * _*LatLngToGeoId(Long latitude, Long longitude) : Long*_
 * _*GeoIdToLatLng(Long geoId) : Pair<Double, Double>*_
 * _*ToUpperLayerGeoId(Long geoId) : Long*_
 * _*ToRangeList (String polygon) : List<Long []>*_

3. Currently, GeoID is a column created internally for spatial tables. This PR adds support for customizing the GeoID column during LOAD/INSERT INTO. For example,
{code:java}
INSERT INTO geoTable SELECT 0,1575428400000,116285807,40084087;

It used to be as below; '855280799612' was generated internally:
+------------+-------------+---------+--------+
|mygeohash   |timevalue    |longitude|latitude|
+------------+-------------+---------+--------+
|855280799612|1575428400000|116285807|40084087|
+------------+-------------+---------+--------+
but now it is:
+------------+-------------+---------+--------+
|mygeohash   |timevalue    |longitude|latitude|
+------------+-------------+---------+--------+
|0           |1575428400000|116285807|40084087|
+------------+-------------+---------+--------+{code}


> Geo spatial index algorithm improvement and UDFs enhancement
> ------------------------------------------------------------
>
>                 Key: CARBONDATA-4051
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4051
>             Project: CarbonData
>          Issue Type: New Feature
>            Reporter: Jiayu Shen
>            Priority: Minor
>         Attachments: CarbonData Spatial Index Design Doc v2.docx
>
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> The requirement is from SEQ; the related algorithms are provided by the Discovery Team.
>
> 1. Replace the geohash encoding algorithm and reduce the required properties of CREATE TABLE. For example,
> {code:java}
> CREATE TABLE geoTable(
>  timevalue BIGINT,
>  longitude LONG,
>  latitude LONG) COMMENT "This is a GeoTable"
>  STORED AS carbondata
>  TBLPROPERTIES ($customProperties 'SPATIAL_INDEX'='mygeohash',
>  'SPATIAL_INDEX.mygeohash.type'='geohash',
>  'SPATIAL_INDEX.mygeohash.sourcecolumns'='longitude, latitude',
>  'SPATIAL_INDEX.mygeohash.originLatitude'='39.832277',
>  'SPATIAL_INDEX.mygeohash.gridSize'='50',
>  'SPATIAL_INDEX.mygeohash.conversionRatio'='1000000'){code}
>
> 2. Add geo query UDFs
>
> query filter UDFs :
>  * _*InPolygonList (List<String> polygonList, OperationType opType)*_
>  * _*InPolylineList (List<String> polylineList, Float bufferInMeter)*_
>  * _*InPolygonRangeList (List<Long []> RangeList, OperationType opType)*_
>
> *The operation type supports only:*
>  * *"OR": computes the union of two polygons*
>  * *"AND": computes the intersection of two polygons*
>
> geo util UDFs :
>  * _*GeoIdToGridXy(Long geoId) : Pair<Integer, Integer>*_
>  * _*LatLngToGeoId(Long latitude, Long longitude) : Long*_
>  * _*GeoIdToLatLng(Long geoId) : Pair<Double, Double>*_
>  * _*ToUpperLayerGeoId(Long geoId) : Long*_
>  * _*ToRangeList (String polygon) : List<Long []>*_
>
> 3. Currently, GeoID is a column created internally for spatial tables. This PR adds support for customizing the GeoID column during LOAD/INSERT INTO. For example,
> {code:java}
> INSERT INTO geoTable SELECT 0,1575428400000,116285807,40084087;
>
> It used to be as below; '855280799612' was generated internally:
> +------------+-------------+---------+--------+
> |mygeohash   |timevalue    |longitude|latitude|
> +------------+-------------+---------+--------+
> |855280799612|1575428400000|116285807|40084087|
> +------------+-------------+---------+--------+
> but now it is:
> +------------+-------------+---------+--------+
> |mygeohash   |timevalue    |longitude|latitude|
> +------------+-------------+---------+--------+
> |0           |1575428400000|116285807|40084087|
> +------------+-------------+---------+--------+{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
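Editorial illustration of the "OR"/"AND" operation types listed for InPolygonRangeList in the description: a minimal Python sketch, assuming each polygon has already been converted into a list of inclusive [start, end] GeoID ranges (as ToRangeList is described to return). The function names (normalize, union, intersection, combine_range_lists) are hypothetical and are not CarbonData APIs; this shows only the set semantics, not the project's implementation.

```python
from typing import List, Tuple

# Assumption: a range is an inclusive (start, end) pair of integer GeoIDs.
Range = Tuple[int, int]


def normalize(ranges: List[Range]) -> List[Range]:
    """Sort ranges and merge those that overlap or touch."""
    merged: List[Range] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            # Overlapping or adjacent: extend the previous range.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def union(a: List[Range], b: List[Range]) -> List[Range]:
    """'OR': every GeoID covered by either range list."""
    return normalize(list(a) + list(b))


def intersection(a: List[Range], b: List[Range]) -> List[Range]:
    """'AND': only GeoIDs covered by both range lists."""
    a, b = normalize(a), normalize(b)
    result: List[Range] = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            result.append((lo, hi))
        # Advance whichever range ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


def combine_range_lists(range_lists: List[List[Range]], op_type: str) -> List[Range]:
    """Fold several range lists together with 'OR' or 'AND'."""
    op = op_type.upper()
    if op not in ("OR", "AND"):
        raise ValueError("operation type must be 'OR' or 'AND'")
    if not range_lists:
        return []
    result = normalize(range_lists[0])
    for ranges in range_lists[1:]:
        result = union(result, ranges) if op == "OR" else intersection(result, ranges)
    return result
```

For two hypothetical range lists [(1, 5), (10, 20)] and [(4, 12), (30, 40)], "OR" yields [(1, 20), (30, 40)] and "AND" yields [(4, 5), (10, 12)].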