Apache CarbonData Dev Mailing List archive › Apache CarbonData JIRA issues

[GitHub] [carbondata] ShreelekhyaG opened a new pull request #3774: [WIP] Make geoID visible

Classic

List

75 messages Options

Options

1234

GitBox

[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

VenuReddy2103 commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r437595830

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CommonLoadUtils.scala
##########
@@ -568,8 +567,7 @@ object CommonLoadUtils {
.getColumnSchemaList.size()))
.asInstanceOf[Array[ColumnSchema]]
} else {
- colSchema = colSchema.filterNot(x => x.isInvisible || x.isComplexColumn ||
- x.getSchemaOrdinal == -1)
+ colSchema = colSchema.filterNot(x => x.isInvisible || x.isComplexColumn)

Review comment:
why remove getSchemaOrdinal check? we do not judge spatial column by schema ordinal as -1. And x.getSchemaOrdinal was not introduced with spatial support PR. Same applies to if case above.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r437599439

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/geo/GeoTest.scala
##########
@@ -82,6 +81,45 @@ class GeoTest extends QueryTest with BeforeAndAfterAll with BeforeAndAfterEach {
case None => assert(false)
}
}
+ test("test geo table drop spatial index column") {
+ createTable()
+ loadData()
+ val exception = intercept[MalformedCarbonCommandException](sql(s"alter table $table1 drop columns(mygeohash)"))
+ assert(exception.getMessage.contains(
+ s"Columns present in ${CarbonCommonConstants.SPATIAL_INDEX} table property cannot be altered."))
+ }
+
+ test("test geo table filter by geo spatial index column") {
+ createTable()
+ loadData()
+ checkAnswer(sql(s"select *from $table1 where mygeohash = '2196036'"),
+ Seq(Row(2196036,1575428400000L,116337069,39951887)))
+ }
+
+ test("test geo table create index on geohandler column") {
+ createTable()
+ loadData()
+ val exception = intercept[MalformedIndexCommandException](sql(
+ s"""
+ | CREATE INDEX bloom_index ON TABLE $table1 (mygeohash)
+ | AS 'bloomfilter'
+ | PROPERTIES('BLOOM_SIZE'='640000', 'BLOOM_FPP'='0.00001')
+ """.stripMargin))
+ assert(exception.getMessage.contains(
+ s"Spatial Index column is not supported, column 'mygeohash' is spatial column"))
+ }
+
+ test("test geo table custom properties on geohandler column") {
+ try {
+ createTable(table1,"'COLUMN_META_CACHE' = 'mygeohash',")

Review comment:
Test framework supports interction of exceptions. Instead of this try catch and assert, please use intercept like above testcases.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r437700014

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -482,6 +490,7 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
null
}
var createOrderColumns = table.getCreateOrderColumn.asScala
+ .filterNot(_.getColumnSchema.isSpatialColumn)

Review comment:
Please refer above comment. Also you would want to revisit `convertToNoDictionaryToBytes()` and `convertToNoDictionaryToBytesWithoutReArrange()` method in `InputProcessorStepWithNoConverterImpl` class. Without this PR, length of `data` and `dataFields` args of those method were not matching as we not including spatial column. so we had some change in them. With your PR it can be simplified. Please refer PR #3760.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-648385173

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3197/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-648386101

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1471/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r444645924

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/geo/GeoTest.scala
##########
@@ -122,7 +161,8 @@ class GeoTest extends QueryTest with BeforeAndAfterAll with BeforeAndAfterEach {
createTable(sourceTable)
loadData(sourceTable)
createTable(targetTable, "'SORT_SCOPE'='GLOBAL_SORT',")
- sql(s"insert into $targetTable select * from $sourceTable")
+ sql(s"insert into $targetTable select timevalue, longitude," +

Review comment:
Done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r444648223

##########
File path: integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##########
@@ -653,6 +653,21 @@ object CommonUtil {
storeLocation
}

+ def validateForSpatialTypeColumn(properties: Map[String, String],

Review comment:
removed `validateSpatialIndexColumn()` from AlterTableUtil as validation to add column is no longer required, as spatial column is now part of schema.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-654179380

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-654383362

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3308/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-654385185

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1570/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r456271810

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonSource.scala
##########
@@ -379,6 +380,17 @@ object CarbonSource {
if (isCreatedByCarbonExtension(properties)) {
// Table is created by SparkSession with CarbonExtension,
// There is no TableInfo yet, so create it from CatalogTable
+ val columnSchema = updateTable.schema
+ val spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX)
+ // validate for spatial index column
+ columnSchema.foreach(x => {
+ if (spatialProperty.isDefined &&

Review comment:
removed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r456272331

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/catalyst/CarbonParserUtil.scala
##########
@@ -337,7 +331,7 @@ object CarbonParserUtil {

if (tableProperties.get(CarbonCommonConstants.COLUMN_META_CACHE).isDefined) {
// validate the column_meta_cache option
- val tableColumns = dims.view.filterNot(_.spatialIndex).map(x => x.name.get) ++
+ val tableColumns = dims.view.map(x => x.name.get) ++

Review comment:
yes, made changes.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r456275841

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -170,11 +171,18 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
convertedStaticPartition)
scanResultRdd = sparkSession.sessionState.executePlan(newLogicalPlan).toRdd
if (logicalPartitionRelation != null) {
- if (selectedColumnSchema.length != logicalPartitionRelation.output.length) {
+ val properties = table.getTableInfo.getFactTable.getTableProperties.asScala

Review comment:
In case of an insert with original schema had to remove to match with the input projection list.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r458007728

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -482,6 +490,7 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
null
}
var createOrderColumns = table.getCreateOrderColumn.asScala
+ .filterNot(_.getColumnSchema.isSpatialColumn)

Review comment:
even now `data` and `dataFields` args of the methods don't match when insert with original schema.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] brijoobopanna commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

brijoobopanna commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-662238464

retest this please

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-662284603

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1717/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-662284841

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3459/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r458686578

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/catalyst/CarbonParserUtil.scala
##########
@@ -677,6 +677,8 @@ object CarbonParserUtil {
val errorMsg = "range_column not support multiple columns"
throw new MalformedCarbonCommandException(errorMsg)
}
+ CommonUtil.validateForSpatialTypeColumn(tableProperties, rangeColumn,

Review comment:
Instead of checking property by property, once all the properties are filled, better to validate at one place ?

Also I see that for sortcolumns, column_metacache,no_inverted_index it is not handled

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r458689351

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -170,11 +171,18 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
convertedStaticPartition)
scanResultRdd = sparkSession.sessionState.executePlan(newLogicalPlan).toRdd
if (logicalPartitionRelation != null) {
- if (selectedColumnSchema.length != logicalPartitionRelation.output.length) {
+ val properties = table.getTableInfo.getFactTable.getTableProperties.asScala
+ val spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX)
+ var expectedOutput = logicalPartitionRelation.output
+ if (spatialProperty.isDefined && selectedColumnSchema.size + 1 == expectedOutput.length) {

Review comment:
why the changes in this function ?

As user wanted to created geoSpatial column, we created an extra column, select * from table should include all the columns.

If the target table don't want geo column, user can specify projections.

**we should not skip spatial column while doing insert into**

@VenuReddy2103 , @ShreelekhyaG

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r458690774

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala
##########
@@ -70,12 +71,20 @@ private[sql] case class CarbonProjectForUpdateCommand(
val carbonTable = CarbonEnv.getCarbonTable(databaseNameOp, tableName)(sparkSession)
setAuditTable(carbonTable)
setAuditInfo(Map("plan" -> plan.simpleString))
+ val properties = carbonTable.getTableInfo.getFactTable.getTableProperties.asScala
+ val spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX)
columns.foreach { col =>
val dataType = carbonTable.getColumnByName(col).getColumnSchema.getDataType
if (dataType.isComplexType) {
throw new UnsupportedOperationException("Unsupported operation on Complex data type")
}
-
+ if (spatialProperty.isDefined) {
+ if (col.contains(spatialProperty.get.trim)) {

Review comment:
why `contains` here ? it's a column name. suppose to be equalsIgnoreCase ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

1234