[GitHub] [carbondata] ShreelekhyaG opened a new pull request #3774: [WIP] Make geoID visible

classic Classic list List threaded Threaded
75 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox

ajantha-bhat commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r458694598



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonAnalysisRules.scala
##########
@@ -266,16 +266,24 @@ case class CarbonPreInsertionCasts(sparkSession: SparkSession) extends Rule[Logi
       relation: LogicalRelation,
       child: LogicalPlan): LogicalPlan = {
     val carbonDSRelation = relation.relation.asInstanceOf[CarbonDatasourceHadoopRelation]
-    if (carbonDSRelation.carbonRelation.output.size > CarbonCommonConstants
+    val carbonTable = carbonDSRelation.carbonRelation.carbonTable
+    val properties = carbonTable.getTableInfo.getFactTable.getTableProperties.asScala
+    val spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX)
+    var expectedOutput = carbonDSRelation.carbonRelation.output
+    // have to remove geo column to support insert with original schema

Review comment:
       I think no need to remove this column from original schema as it is a visible column , same reason as I mentioned for `insertIntoCommand`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r458695151



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala
##########
@@ -228,8 +230,15 @@ class CarbonFileMetastore extends CarbonMetaStore {
             c.getClass.getName.equals("org.apache.spark.sql.catalyst.catalog.HiveTableRelation") ||
             c.getClass.getName.equals(
               "org.apache.spark.sql.catalyst.catalog.UnresolvedCatalogRelation")) =>
-        val catalogTable =
+        var catalogTable =
           CarbonReflectionUtils.getFieldOfCatalogTable("tableMeta", c).asInstanceOf[CatalogTable]
+        // remove spatial column from schema

Review comment:
       same as above, we can register this column as it is visible column




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r458696449



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/geo/GeoTest.scala
##########
@@ -82,6 +81,33 @@ class GeoTest extends QueryTest with BeforeAndAfterAll with BeforeAndAfterEach {
       case None => assert(false)
     }
   }
+  test("test geo table drop spatial index column") {
+    createTable()
+    loadData()
+    val exception = intercept[MalformedCarbonCommandException](sql(s"alter table $table1 drop columns(mygeohash)"))
+    assert(exception.getMessage.contains(

Review comment:
       some more test case can be added with mixed case column names for other table property validation added, example range column, bucket column etc




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r458697336



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/command/SICreationCommand.scala
##########
@@ -209,6 +209,11 @@ private[sql] case class CarbonCreateSecondaryIndexCommand(
             .get
         }")
       }
+      val isSpatialColPresent = dims.find(x => x.getColumnSchema.isSpatialColumn)

Review comment:
       Please check , I think we need to remove isSpatialColumn from columnSchema.
   we used that mainly for making  it invisible. Now as it is visible. It is just another plan column. Instead we can check if column name is in the table property or not.
   
   @VenuReddy2103 @ShreelekhyaG




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r460033679



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -170,11 +171,18 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
       convertedStaticPartition)
     scanResultRdd = sparkSession.sessionState.executePlan(newLogicalPlan).toRdd
     if (logicalPartitionRelation != null) {
-      if (selectedColumnSchema.length != logicalPartitionRelation.output.length) {
+      val properties = table.getTableInfo.getFactTable.getTableProperties.asScala
+      val spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX)
+      var expectedOutput = logicalPartitionRelation.output
+      if (spatialProperty.isDefined && selectedColumnSchema.size + 1 == expectedOutput.length) {

Review comment:
       Yes, select *from table includes all columns including geoSpatial. I have added a testcase for that now. This change is when a user tries to insert with original schema. Like `sql(s"insert into $table1 select 1575428400000,116285807,40084087")`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r460034270



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/geo/GeoTest.scala
##########
@@ -82,6 +81,33 @@ class GeoTest extends QueryTest with BeforeAndAfterAll with BeforeAndAfterEach {
       case None => assert(false)
     }
   }
+  test("test geo table drop spatial index column") {
+    createTable()
+    loadData()
+    val exception = intercept[MalformedCarbonCommandException](sql(s"alter table $table1 drop columns(mygeohash)"))
+    assert(exception.getMessage.contains(

Review comment:
       ok done.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r460034871



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala
##########
@@ -70,12 +71,20 @@ private[sql] case class CarbonProjectForUpdateCommand(
     val carbonTable = CarbonEnv.getCarbonTable(databaseNameOp, tableName)(sparkSession)
     setAuditTable(carbonTable)
     setAuditInfo(Map("plan" -> plan.simpleString))
+    val properties = carbonTable.getTableInfo.getFactTable.getTableProperties.asScala
+    val spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX)
     columns.foreach { col =>
       val dataType = carbonTable.getColumnByName(col).getColumnSchema.getDataType
       if (dataType.isComplexType) {
         throw new UnsupportedOperationException("Unsupported operation on Complex data type")
       }
-
+      if (spatialProperty.isDefined) {
+        if (col.contains(spatialProperty.get.trim)) {

Review comment:
       changed now.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r460035762



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/catalyst/CarbonParserUtil.scala
##########
@@ -677,6 +677,8 @@ object CarbonParserUtil {
         val errorMsg = "range_column not support multiple columns"
         throw new MalformedCarbonCommandException(errorMsg)
       }
+      CommonUtil.validateForSpatialTypeColumn(tableProperties, rangeColumn,

Review comment:
       modified the method to validate all table properties in one place as suggested.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r460033679



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -170,11 +171,18 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
       convertedStaticPartition)
     scanResultRdd = sparkSession.sessionState.executePlan(newLogicalPlan).toRdd
     if (logicalPartitionRelation != null) {
-      if (selectedColumnSchema.length != logicalPartitionRelation.output.length) {
+      val properties = table.getTableInfo.getFactTable.getTableProperties.asScala
+      val spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX)
+      var expectedOutput = logicalPartitionRelation.output
+      if (spatialProperty.isDefined && selectedColumnSchema.size + 1 == expectedOutput.length) {

Review comment:
       Yes, select *from table includes all columns including geoSpatial. I have added a testcase for that now. This change is when a user tries to insert with original schema. Like `sql(s"insert into $table1 select 1575428400000,116285807,40084087")` . As Spatial column is not given by the user and is internally generated.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-663581874


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3497/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-663586046


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1755/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r460688922



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/geo/GeoTest.scala
##########
@@ -112,6 +238,23 @@ class GeoTest extends QueryTest with BeforeAndAfterAll with BeforeAndAfterEach {
       result)
   }
 
+  test("test insert into non-geo table select from geo table") {

Review comment:
       please add a test case of insert into geo table, where insert rows will not have geo data. but select *  shows geo data




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r461343500



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/geo/GeoTest.scala
##########
@@ -112,6 +238,23 @@ class GeoTest extends QueryTest with BeforeAndAfterAll with BeforeAndAfterEach {
       result)
   }
 
+  test("test insert into non-geo table select from geo table") {

Review comment:
       modified existing test case and added validation for the geo column.

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/command/SICreationCommand.scala
##########
@@ -209,6 +209,11 @@ private[sql] case class CarbonCreateSecondaryIndexCommand(
             .get
         }")
       }
+      val isSpatialColPresent = dims.find(x => x.getColumnSchema.isSpatialColumn)

Review comment:
       removed isSpatialColumn from schema.

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala
##########
@@ -228,8 +230,15 @@ class CarbonFileMetastore extends CarbonMetaStore {
             c.getClass.getName.equals("org.apache.spark.sql.catalyst.catalog.HiveTableRelation") ||
             c.getClass.getName.equals(
               "org.apache.spark.sql.catalyst.catalog.UnresolvedCatalogRelation")) =>
-        val catalogTable =
+        var catalogTable =
           CarbonReflectionUtils.getFieldOfCatalogTable("tableMeta", c).asInstanceOf[CatalogTable]
+        // remove spatial column from schema

Review comment:
       Here, catalogTable will have spatial column in schema which is used to build carbon table.  As spatial column is not supposed to be present in user-defined columns, removing it here. Later from tableproperties the column will be added in carbonTable.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r461692563



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/command/SICreationCommand.scala
##########
@@ -210,6 +210,14 @@ private[sql] case class CarbonCreateSecondaryIndexCommand(
             .get
         }")
       }
+      val properties = carbonTable.getTableInfo.getFactTable.getTableProperties.asScala
+      val spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX)
+      if (spatialProperty.isDefined) {
+        if (dims.find(x => x.getColName.equalsIgnoreCase(spatialProperty.get.trim)).isDefined) {

Review comment:
       you would want to check for index column names.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-664592745






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r462128423



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/command/SICreationCommand.scala
##########
@@ -210,6 +210,14 @@ private[sql] case class CarbonCreateSecondaryIndexCommand(
             .get
         }")
       }
+      val properties = carbonTable.getTableInfo.getFactTable.getTableProperties.asScala
+      val spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX)
+      if (spatialProperty.isDefined) {
+        if (dims.find(x => x.getColName.equalsIgnoreCase(spatialProperty.get.trim)).isDefined) {

Review comment:
       changed now

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonAnalysisRules.scala
##########
@@ -266,16 +266,24 @@ case class CarbonPreInsertionCasts(sparkSession: SparkSession) extends Rule[Logi
       relation: LogicalRelation,
       child: LogicalPlan): LogicalPlan = {
     val carbonDSRelation = relation.relation.asInstanceOf[CarbonDatasourceHadoopRelation]
-    if (carbonDSRelation.carbonRelation.output.size > CarbonCommonConstants
+    val carbonTable = carbonDSRelation.carbonRelation.carbonTable
+    val properties = carbonTable.getTableInfo.getFactTable.getTableProperties.asScala
+    val spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX)
+    var expectedOutput = carbonDSRelation.carbonRelation.output
+    // have to remove geo column to support insert with original schema

Review comment:
       same reason as above. To support inset without geo column.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-665593991






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-665813658






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-666112316


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r462751220



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala
##########
@@ -228,8 +230,15 @@ class CarbonFileMetastore extends CarbonMetaStore {
             c.getClass.getName.equals("org.apache.spark.sql.catalyst.catalog.HiveTableRelation") ||
             c.getClass.getName.equals(
               "org.apache.spark.sql.catalyst.catalog.UnresolvedCatalogRelation")) =>
-        val catalogTable =
+        var catalogTable =
           CarbonReflectionUtils.getFieldOfCatalogTable("tableMeta", c).asInstanceOf[CatalogTable]
+        // remove spatial column from schema

Review comment:
       would have added this as comment !




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


1234