Apache CarbonData Dev Mailing List archive

[GitHub] [carbondata-site] ajantha-bhat opened a new pull request #79: [WIP] update carbondata 2.1.1 release

GitBox

[GitHub] [carbondata-site] ajantha-bhat opened a new pull request #79: [WIP] update carbondata 2.1.1 release

ajantha-bhat opened a new pull request #79:
URL: https://github.com/apache/carbondata-site/pull/79

1. Added a dropdown for carbondata 2.1.1 release
2. TODO: update spatial index and clean files document

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata-site] ajantha-bhat commented on pull request #79: Update carbondata 2.1.1 release

ajantha-bhat commented on pull request #79:
URL: https://github.com/apache/carbondata-site/pull/79#issuecomment-816671605

@sraghunandan, @chenliang613: Please review and merge this.

GitBox

[GitHub] [carbondata-site] PurujitChaugule commented on a change in pull request #79: Update carbondata 2.1.1 release

In reply to this post by GitBox

PurujitChaugule commented on a change in pull request #79:
URL: https://github.com/apache/carbondata-site/pull/79#discussion_r611622451

##########
File path: src/site/markdown/spatial-index-guide.md
##########
@@ -32,12 +32,28 @@ Below figure shows the relationship between the grid and the points residing in
![File Directory Structure](../docs/images/spatial-index-1.png?raw=true)

Carbon supports Polygon User Defined Function(UDF) as filter condition in the query to return all the data points lying within it. Polygon UDF takes multiple points(i.e., pair of longitude and latitude) separated by a comma. Longitude and latitude in the pair are separated by a space. The first and last points in the polygon must be same to form a closed loop. CarbonData builds a quad tree using this polygon and spatial region information passed while creating a table. The nodes in the quad tree are composed of indices generated by the row and column information projected in the polygon area. When the grid center point lies within the polygon area, the grid is considered as selected. In the following figure, user selects a quadrilateral shaped polygon. The grid at the center of the region is chosen to build a quad tree. Once tree is build, all the leafs are scanned to get the list of range of indices(with each range consisting of minimum index and maximum index in the range). All the indices starting from minimum to maximum in each range forms the result.
-The main reasons for faster query response are as follows :
-* Data is sorted based on the index values.
-* Polygon UDF filter is pushed down from engine to the carbon layer such that CarbonData scans only matched blocklets avoiding full scan.

![File Directory Structure](../docs/images/spatial-index-2.png?raw=true)

+There are some other UDFs supporting more filter conditions in the query, including Polygon List, Polyline List, and spatial index range list.
+
+Polygon List UDF takes multiple polygons(i.e., a set of points) and operation type for combining polygons. Only `OR` and `AND` are supported at present, operation 'OR' means union of multiple polygons and 'AND' means intersection of that, shown as the following figure. Then CarbonData gets the list of range of indices from the combined region by quad tree, which is the same processing as Polygon UDF.
+
+![File Directory Structure](../docs/images/spatial-index-polygonlist.png?raw=true)
+
+Polyline List UDF takes multiple polylines(i.e., a set of points) and buffer in meter. CarbonData first converts polyline to polygon and then gets the list of range of indices from these polygons. The processing is the same as Polygon UDF and return all the data points lying within the buffer region of polylines.
+

Review comment:
Could mention the unit of measurement (i.e., meters) used for the buffer size in the Polyline UDF.

GitBox

[GitHub] [carbondata-site] ajantha-bhat commented on a change in pull request #79: Update carbondata 2.1.1 release

In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #79:
URL: https://github.com/apache/carbondata-site/pull/79#discussion_r611626814

##########
File path: src/site/markdown/spatial-index-guide.md
##########
@@ -32,12 +32,28 @@ Below figure shows the relationship between the grid and the points residing in
![File Directory Structure](../docs/images/spatial-index-1.png?raw=true)

Carbon supports Polygon User Defined Function(UDF) as filter condition in the query to return all the data points lying within it. Polygon UDF takes multiple points(i.e., pair of longitude and latitude) separated by a comma. Longitude and latitude in the pair are separated by a space. The first and last points in the polygon must be same to form a closed loop. CarbonData builds a quad tree using this polygon and spatial region information passed while creating a table. The nodes in the quad tree are composed of indices generated by the row and column information projected in the polygon area. When the grid center point lies within the polygon area, the grid is considered as selected. In the following figure, user selects a quadrilateral shaped polygon. The grid at the center of the region is chosen to build a quad tree. Once tree is build, all the leafs are scanned to get the list of range of indices(with each range consisting of minimum index and maximum index in the range). All the indices starting from minimum to maximum in each range forms the result.
-The main reasons for faster query response are as follows :
-* Data is sorted based on the index values.
-* Polygon UDF filter is pushed down from engine to the carbon layer such that CarbonData scans only matched blocklets avoiding full scan.

![File Directory Structure](../docs/images/spatial-index-2.png?raw=true)

+There are some other UDFs supporting more filter conditions in the query, including Polygon List, Polyline List, and spatial index range list.
+
+Polygon List UDF takes multiple polygons(i.e., a set of points) and operation type for combining polygons. Only `OR` and `AND` are supported at present, operation 'OR' means union of multiple polygons and 'AND' means intersection of that, shown as the following figure. Then CarbonData gets the list of range of indices from the combined region by quad tree, which is the same processing as Polygon UDF.
+
+![File Directory Structure](../docs/images/spatial-index-polygonlist.png?raw=true)
+
+Polyline List UDF takes multiple polylines(i.e., a set of points) and buffer in meter. CarbonData first converts polyline to polygon and then gets the list of range of indices from these polygons. The processing is the same as Polygon UDF and return all the data points lying within the buffer region of polylines.
+

Review comment:
Whatever is in the GitHub documentation, I am updating the same on the website.
https://github.com/apache/carbondata/tree/master/docs

You cannot request this kind of change here; please comment on the original PR or raise a new PR on GitHub.

GitBox

[GitHub] [carbondata-site] ajantha-bhat closed pull request #79: Update carbondata 2.1.1 release

In reply to this post by GitBox

ajantha-bhat closed pull request #79:
URL: https://github.com/apache/carbondata-site/pull/79
