Indhumathi27 opened a new pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846

### Why is this PR needed?

### What changes were proposed in this PR?

### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)

### Is any new testcase added?
- No
- Yes

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: [hidden email]
CarbonDataQA1 commented on pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846#issuecomment-658779577

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1656/
CarbonDataQA1 commented on pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846#issuecomment-658780719

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3398/
CarbonDataQA1 commented on pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846#issuecomment-658879364

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3400/
CarbonDataQA1 commented on pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846#issuecomment-658880484

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1658/
akashrn5 commented on a change in pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846#discussion_r457857926

File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/merge/CarbonMergeDataSetCommand.scala

@@ -175,6 +178,21 @@ case class CarbonMergeDataSetCommand(
       LOGGER.error("writing of update status file failed")
       throw new CarbonMergeDataSetException("writing of update status file failed")
     }
+    if (carbonTable.isHivePartitionTable) {
+      // If load count is 0 and if merge action contains delete operation, update
+      // tableUpdateStatus file name in loadMeta entry
+      if (count == 0 && hasDelAction && !tuple._1.isEmpty) {
+        val loadMetaDataDetails = SegmentStatusManager.readTableStatusFile(CarbonTablePath
+          .getTableStatusFilePath(carbonTable.getTablePath))
+        CarbonUpdateUtil.updateTableMetadataStatus(loadMetaDataDetails.map(l =>
+          new Segment(l.getMergedLoadName,

Review comment: rename variable `l` to `loadMetadataDetails`

File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/merge/MergeTestCase.scala

@@ -725,6 +725,45 @@ class MergeTestCase extends QueryTest with BeforeAndAfterAll {
     checkAnswer(sql("select * from target order by key"),
       Seq(Row("c", "200"), Row("d", "3"), Row("e", "100")))
   }
+  test("check the ccd delete with partition") {
+    sql("drop table if exists target")
+
+    val initframe = sqlContext.sparkSession.createDataFrame(Seq(
+      Row("a", "0"),
+      Row("b", "1"),
+      Row("c", "2"),
+      Row("d", "3")
+    ).asJava, StructType(Seq(StructField("key", StringType), StructField("value", StringType))))
+
+    initframe.write
+      .format("carbondata")
+      .option("tableName", "target")
+      .option("partitionColumns", "value")
+      .mode(SaveMode.Overwrite)
+      .save()
+    val target = sqlContext.read.format("carbondata").option("tableName", "target").load()
+    var ccd =
+      sqlContext.sparkSession.createDataFrame(Seq(
+        Row("a", null, true, 1),
+        Row("b", null, true, 2),
+        Row("c", null, true, 3),
+        Row("e", "100", false, 6)
+      ).asJava,
+      StructType(Seq(StructField("key", StringType),
+        StructField("newValue", StringType),
+        StructField("deleted", BooleanType), StructField("time", IntegerType))))
+    ccd.createOrReplaceTempView("changes")
+
+    ccd = sql("SELECT key, latest.newValue as newValue, latest.deleted as deleted FROM ( SELECT key, max(struct(time, newValue, deleted)) as latest FROM changes GROUP BY key)")
+
+    target.as("A").merge(ccd.as("B"), "A.key=B.key").
+      whenMatched("B.deleted=true").delete().execute()
+
+    assert(getDeleteDeltaFileCount("target", "0") == 0)

Review comment: if delete operation is successful, then it should contain delete delta file, please check and add proper assert for delete delta file count
CarbonDataQA1 commented on pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846#issuecomment-661748809

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3449/
CarbonDataQA1 commented on pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846#issuecomment-661749375

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1707/
akashrn5 commented on a change in pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846#discussion_r458017848

File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/merge/CarbonMergeDataSetCommand.scala

@@ -175,6 +178,21 @@ case class CarbonMergeDataSetCommand(
       LOGGER.error("writing of update status file failed")
       throw new CarbonMergeDataSetException("writing of update status file failed")
     }
+    if (carbonTable.isHivePartitionTable) {
+      // If load count is 0 and if merge action contains delete operation, update
+      // tableUpdateStatus file name in loadMeta entry
+      if (count == 0 && hasDelAction && !tuple._1.isEmpty) {
+        val loadMetaDataDetails = SegmentStatusManager.readTableStatusFile(CarbonTablePath
+          .getTableStatusFilePath(carbonTable.getTablePath))
+        CarbonUpdateUtil.updateTableMetadataStatus(loadMetaDataDetails.map(loadMetadataDetails =>

Review comment:
```suggestion
        CarbonUpdateUtil.updateTableMetadataStatus(loadMetaDataDetails.map(loadMetadataDetail =>
```
akashrn5 commented on a change in pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846#discussion_r458018601

File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/merge/MergeTestCase.scala

@@ -725,6 +725,47 @@ class MergeTestCase extends QueryTest with BeforeAndAfterAll {
     checkAnswer(sql("select * from target order by key"),
       Seq(Row("c", "200"), Row("d", "3"), Row("e", "100")))
   }
+  test("check the ccd delete with partition") {

Review comment:
```suggestion
  test("check the cdc delete with partition") {
```
akashrn5 commented on a change in pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846#discussion_r458018787

File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/merge/MergeTestCase.scala

@@ -725,6 +725,47 @@ class MergeTestCase extends QueryTest with BeforeAndAfterAll {
     checkAnswer(sql("select * from target order by key"),
       Seq(Row("c", "200"), Row("d", "3"), Row("e", "100")))
   }
+  test("check the ccd delete with partition") {
+    sql("drop table if exists target")
+
+    val initframe = sqlContext.sparkSession.createDataFrame(Seq(
+      Row("a", "0"),
+      Row("a1", "0"),
+      Row("b", "1"),
+      Row("c", "2"),
+      Row("d", "3")
+    ).asJava, StructType(Seq(StructField("key", StringType), StructField("value", StringType))))
+
+    initframe.write
+      .format("carbondata")
+      .option("tableName", "target")
+      .option("partitionColumns", "value")
+      .mode(SaveMode.Overwrite)
+      .save()
+    val target = sqlContext.read.format("carbondata").option("tableName", "target").load()
+    var ccd =

Review comment:
```suggestion
    var cdc =
```
akashrn5 commented on a change in pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846#discussion_r458018898

File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/merge/MergeTestCase.scala

@@ -725,6 +725,47 @@ class MergeTestCase extends QueryTest with BeforeAndAfterAll {
     checkAnswer(sql("select * from target order by key"),
       Seq(Row("c", "200"), Row("d", "3"), Row("e", "100")))
   }
+  test("check the ccd delete with partition") {
+    sql("drop table if exists target")
+
+    val initframe = sqlContext.sparkSession.createDataFrame(Seq(
+      Row("a", "0"),
+      Row("a1", "0"),
+      Row("b", "1"),
+      Row("c", "2"),
+      Row("d", "3")
+    ).asJava, StructType(Seq(StructField("key", StringType), StructField("value", StringType))))
+
+    initframe.write
+      .format("carbondata")
+      .option("tableName", "target")
+      .option("partitionColumns", "value")
+      .mode(SaveMode.Overwrite)
+      .save()
+    val target = sqlContext.read.format("carbondata").option("tableName", "target").load()
+    var ccd =
+      sqlContext.sparkSession.createDataFrame(Seq(
+        Row("a", null, true, 1),
+        Row("a1", null, false, 1),
+        Row("b", null, true, 2),
+        Row("c", null, true, 3),
+        Row("e", "100", false, 6)
+      ).asJava,
+      StructType(Seq(StructField("key", StringType),
+        StructField("newValue", StringType),
+        StructField("deleted", BooleanType), StructField("time", IntegerType))))
+    ccd.createOrReplaceTempView("changes")
+
+    ccd = sql("SELECT key, latest.newValue as newValue, latest.deleted as deleted FROM ( SELECT key, max(struct(time, newValue, deleted)) as latest FROM changes GROUP BY key)")

Review comment: same as above
akashrn5 commented on a change in pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846#discussion_r458019493

File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/merge/MergeTestCase.scala

@@ -869,10 +910,15 @@ class MergeTestCase extends QueryTest with BeforeAndAfterAll {
     CarbonProperties.getInstance().addProperty("carbon.enable.auto.load.merge", "false")
   }

-  private def getDeleteDeltaFileCount(tableName: String, segment: String): Int = {
+  private def getDeleteDeltaFileCount(tableName: String,
+      segment: String,

Review comment: no need to change the method signature, you are already having the carbonTable object, check from it whether its partition table or not.
CarbonDataQA1 commented on pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846#issuecomment-662011240

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3458/
CarbonDataQA1 commented on pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846#issuecomment-662012100

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1716/
akashrn5 commented on pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846#issuecomment-662235965

LGTM
asfgit closed pull request #3846:
URL: https://github.com/apache/carbondata/pull/3846