[GitHub] [carbondata] ShreelekhyaG opened a new pull request #3774: [WIP] Make geoID visible

classic Classic list List threaded Threaded
75 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG opened a new pull request #3774: [WIP] Make geoID visible

GitBox

ShreelekhyaG opened a new pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774


    ### Why is this PR needed?
    To make geohash column visible to the user
   
    ### What changes were proposed in this PR?
   Generated geohash column is included in the schema. Validation added to avoid alter and drop geohash column.
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - Yes
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [WIP] Make geoID visible

GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-634249313






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ydvpankaj99 commented on pull request #3774: [WIP] Make geoID visible

GitBox
In reply to this post by GitBox

ydvpankaj99 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-634578301


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [WIP] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-634635351


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3076/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [WIP] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-634636348


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1355/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [WIP] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-635214295


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1362/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [WIP] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-635215816


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3083/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on pull request #3774: [WIP] Make geoID visible

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-635226998


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA - 3833] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-635396909


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1367/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA - 3833] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-635397490


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3088/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on pull request #3774: [CARBONDATA - 3833] Make geoID visible

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-635465726


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA - 3833] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-635531765


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1368/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA - 3833] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-635531975


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3089/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-636062526


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3090/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-636064979


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1369/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-636458681


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3104/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-636458697


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1381/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-636504454


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3111/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-636504686


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1387/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r437199601



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/geo/GeoTest.scala
##########
@@ -122,7 +161,8 @@ class GeoTest extends QueryTest with BeforeAndAfterAll with BeforeAndAfterEach {
     createTable(sourceTable)
     loadData(sourceTable)
     createTable(targetTable, "'SORT_SCOPE'='GLOBAL_SORT',")
-    sql(s"insert into  $targetTable select * from $sourceTable")
+    sql(s"insert into  $targetTable select timevalue, longitude," +

Review comment:
       why select * is changed to individual columns ? Probably because you get geospatial column as well when you give select *.  But, we need to handle this insert into select *.

##########
File path: integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##########
@@ -653,6 +653,21 @@ object CommonUtil {
     storeLocation
   }
 
+  def validateForSpatialTypeColumn(properties: Map[String, String],

Review comment:
       `columnPropert` string is just to use that in exception message. we alreday have a similar method `validateSpatialIndexColumn()` for it. Suggest to reuse.

##########
File path: integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##########
@@ -653,6 +653,21 @@ object CommonUtil {
     storeLocation
   }
 
+  def validateForSpatialTypeColumn(properties: Map[String, String],

Review comment:
       `columnPropert` string is just to use that in exception message. we alreday have a similar method `validateSpatialIndexColumn()` for it. Suggest to modify existing and reuse instead of complete new method for the same validation.

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonSource.scala
##########
@@ -379,6 +380,17 @@ object CarbonSource {
     if (isCreatedByCarbonExtension(properties)) {
       // Table is created by SparkSession with CarbonExtension,
       // There is no TableInfo yet, so create it from CatalogTable
+      val columnSchema = updateTable.schema
+      val spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX)
+      // validate for spatial index column
+      columnSchema.foreach(x => {
+        if (spatialProperty.isDefined &&

Review comment:
       instead of `spatialProperty.isDefined`check inside the loop it should be outside. We can avoid unnecessary loop traversal if the property itself is not defined.

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonSource.scala
##########
@@ -379,6 +380,17 @@ object CarbonSource {
     if (isCreatedByCarbonExtension(properties)) {
       // Table is created by SparkSession with CarbonExtension,
       // There is no TableInfo yet, so create it from CatalogTable
+      val columnSchema = updateTable.schema

Review comment:
       why moved this exception case alone here from `processSpatialIndexProperty` ? Probably we have missed something here. I think, we have many other invalid property exceptions in `processSpatialIndexProperty` and even for other properties in `prepareTableModel()`.

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/catalyst/CarbonParserUtil.scala
##########
@@ -213,8 +207,8 @@ object CarbonParserUtil {
 
     // Process spatial index property
     val indexFields = processSpatialIndexProperty(tableProperties, fields)
-    val allFields = fields ++ indexFields
 
+    val allFields = (fields ++ indexFields).distinct

Review comment:
       why distinct ? have already checked that indexFields are not in actual fields. right ? If required please add the comments here.

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/catalyst/CarbonParserUtil.scala
##########
@@ -213,8 +207,8 @@ object CarbonParserUtil {
 
     // Process spatial index property
     val indexFields = processSpatialIndexProperty(tableProperties, fields)
-    val allFields = fields ++ indexFields
 
+    val allFields = (fields ++ indexFields).distinct

Review comment:
       why distinct ? have already checked that indexFields are not in actual fields. right ? If required please add the comments here. Same applies for another instance below.

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/catalyst/CarbonParserUtil.scala
##########
@@ -337,7 +331,7 @@ object CarbonParserUtil {
 
     if (tableProperties.get(CarbonCommonConstants.COLUMN_META_CACHE).isDefined) {
       // validate the column_meta_cache option
-      val tableColumns = dims.view.filterNot(_.spatialIndex).map(x => x.name.get) ++
+      val tableColumns = dims.view.map(x => x.name.get) ++

Review comment:
       if we had not removed this filter, then you wouldn't require modification in `validateColumnMetaCacheFields` or `validateColumnMetaCacheOption` i guess.

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/catalyst/CarbonParserUtil.scala
##########
@@ -337,7 +331,7 @@ object CarbonParserUtil {
 
     if (tableProperties.get(CarbonCommonConstants.COLUMN_META_CACHE).isDefined) {
       // validate the column_meta_cache option
-      val tableColumns = dims.view.filterNot(_.spatialIndex).map(x => x.name.get) ++
+      val tableColumns = dims.view.map(x => x.name.get) ++

Review comment:
       if we had not removed this filter, then you wouldn't require modification in `validateColumnMetaCacheFields` `validateColumnMetaCacheOption`, `validateColumnMetaCacheAndCacheLevel` i guess.

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -170,11 +171,18 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
       convertedStaticPartition)
     scanResultRdd = sparkSession.sessionState.executePlan(newLogicalPlan).toRdd
     if (logicalPartitionRelation != null) {
-      if (selectedColumnSchema.length != logicalPartitionRelation.output.length) {
+      val properties = table.getTableInfo.getFactTable.getTableProperties.asScala

Review comment:
       In `getReArrangedIndexAndSelectedSchema()` above, we make `selectedColumnSchema` and return. In that method, we are excluding SpatialColumn column. That is the reason why the `selectedColumnSchema` & `logicalPartitionRelation.output` do not match here. I think, you need to fix there. then this change wouldn't be required anymore.

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -170,11 +171,18 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
       convertedStaticPartition)
     scanResultRdd = sparkSession.sessionState.executePlan(newLogicalPlan).toRdd
     if (logicalPartitionRelation != null) {
-      if (selectedColumnSchema.length != logicalPartitionRelation.output.length) {
+      val properties = table.getTableInfo.getFactTable.getTableProperties.asScala

Review comment:
       In `getReArrangedIndexAndSelectedSchema()` above, we make `selectedColumnSchema` and return. In that method, we are excluding SpatialColumn column. That is the reason why the `selectedColumnSchema` & `logicalPartitionRelation.output` do not match here. Probably you need to fix there. then this change wouldn't be required anymore.

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -170,11 +171,18 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
       convertedStaticPartition)
     scanResultRdd = sparkSession.sessionState.executePlan(newLogicalPlan).toRdd
     if (logicalPartitionRelation != null) {
-      if (selectedColumnSchema.length != logicalPartitionRelation.output.length) {
+      val properties = table.getTableInfo.getFactTable.getTableProperties.asScala

Review comment:
       In `getReArrangedIndexAndSelectedSchema()` above, we make `selectedColumnSchema` and return. In that method, we are excluding SpatialColumn column. That is the reason why the `selectedColumnSchema` & `logicalPartitionRelation.output` do not match here. Probably you need to fix there. then this change wouldn't be required anymore ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


1234