[GitHub] [carbondata] QiangCai opened a new pull request #4044: [CARBONDATA-4062] Refactor clean files feature


[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4044: [CARBONDATA-4062] Refactor clean files feature

CarbonDataQA2 commented on pull request #4044:
URL: https://github.com/apache/carbondata/pull/4044#issuecomment-739821005


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3327/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]



[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4044: [CARBONDATA-4062] Refactor clean files feature

ajantha-bhat commented on a change in pull request #4044:
URL: https://github.com/apache/carbondata/pull/4044#discussion_r537420813



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala
##########
@@ -577,38 +556,46 @@ object CarbonDataRDDFactory {
           LOGGER.info("Data load is successful for " +
                       s"${ carbonLoadModel.getDatabaseName }.${ carbonLoadModel.getTableName }")
         }
-
-        // code to handle Pre-Priming cache for loading
-
-        if (!StringUtils.isEmpty(carbonLoadModel.getSegmentId)) {
-          DistributedRDDUtils.triggerPrepriming(sqlContext.sparkSession, carbonTable, Seq(),
-            operationContext, hadoopConf, List(carbonLoadModel.getSegmentId))
-        }
-        try {
-          // compaction handling
-          if (carbonTable.isHivePartitionTable) {
-            carbonLoadModel.setFactTimeStamp(System.currentTimeMillis())
-          }
-          val compactedSegments = new util.ArrayList[String]()
-          handleSegmentMerging(sqlContext,
-            carbonLoadModel
-              .getCopyWithPartition(carbonLoadModel.getCsvHeader, carbonLoadModel.getCsvDelimiter),
-            carbonTable,
-            compactedSegments,
-            operationContext)
-          carbonLoadModel.setMergedSegmentIds(compactedSegments)
-          writtenSegment
-        } catch {
-          case e: Exception =>
-            LOGGER.error(
-              "Auto-Compaction has failed. Ignoring this exception because the" +
-              " load is passed.", e)
-            writtenSegment
-        }
+        isLoadingCommitted = true
+        writtenSegment
       }
     } finally {
       // Release the segment lock, once table status is finally updated
       segmentLock.unlock()
+      if (isLoadingCommitted) {
+        triggerEventsAfterLoading(sqlContext, carbonLoadModel, hadoopConf, operationContext)
+      }
+    }
+  }
+
+  private def triggerEventsAfterLoading(
+      sqlContext: SQLContext,
+      carbonLoadModel: CarbonLoadModel,
+      hadoopConf: Configuration,
+      operationContext: OperationContext): Unit = {
+    val carbonTable = carbonLoadModel.getCarbonDataLoadSchema.getCarbonTable
+    // code to handle Pre-Priming cache for loading
+    if (!StringUtils.isEmpty(carbonLoadModel.getSegmentId)) {
+      DistributedRDDUtils.triggerPrepriming(sqlContext.sparkSession, carbonTable, Seq(),

Review comment:
       ok
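
The hunk above appears to move the post-load work (pre-priming and compaction) out of the locked region: a flag records that the load was committed, the segment lock is released in `finally`, and only then are the follow-up events triggered. A minimal, self-contained sketch of that pattern follows. It is not CarbonData code: `PostCommitSketch`, the `ReentrantLock`, the `events` buffer, and the `fail` parameter are hypothetical stand-ins used only for illustration.

```scala
import java.util.concurrent.locks.ReentrantLock
import scala.collection.mutable.ArrayBuffer

object PostCommitSketch {
  // Hypothetical stand-ins; none of these names come from CarbonData.
  val segmentLock = new ReentrantLock()
  val events = ArrayBuffer.empty[String]

  // Follow-up work (for example cache pre-priming or auto-compaction).
  // It is called only after the lock has been released.
  def triggerEventsAfterLoading(segmentId: String): Unit = {
    events += s"post-load:$segmentId"
  }

  def load(segmentId: String, fail: Boolean): String = {
    var isLoadingCommitted = false
    segmentLock.lock()
    try {
      if (fail) throw new IllegalStateException("load failed")
      events += s"commit:$segmentId"
      // Set the flag only once the commit step has succeeded.
      isLoadingCommitted = true
      segmentId
    } finally {
      // Release the lock first, whether or not the load succeeded.
      segmentLock.unlock()
      // Run follow-up work only for a committed load.
      if (isLoadingCommitted) {
        triggerEventsAfterLoading(segmentId)
      }
    }
  }
}
```

One thing to keep in mind with this shape: an exception thrown by the follow-up call inside `finally` would propagate to the caller even though the load itself was committed, so the follow-up work may need its own error handling.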





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4044: [CARBONDATA-4062] Refactor clean files feature

CarbonDataQA2 commented on pull request #4044:
URL: https://github.com/apache/carbondata/pull/4044#issuecomment-739869023


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/5088/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4044: [CARBONDATA-4062] Refactor clean files feature

CarbonDataQA2 commented on pull request #4044:
URL: https://github.com/apache/carbondata/pull/4044#issuecomment-739883277


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3331/
   



[GitHub] [carbondata] QiangCai commented on pull request #4044: [CARBONDATA-4062] Refactor clean files feature

QiangCai commented on pull request #4044:
URL: https://github.com/apache/carbondata/pull/4044#issuecomment-739889707


   retest this please



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4044: [CARBONDATA-4062] Refactor clean files feature

CarbonDataQA2 commented on pull request #4044:
URL: https://github.com/apache/carbondata/pull/4044#issuecomment-739890535


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3332/
   



[GitHub] [carbondata] QiangCai commented on pull request #4044: [CARBONDATA-4062] Refactor clean files feature

QiangCai commented on pull request #4044:
URL: https://github.com/apache/carbondata/pull/4044#issuecomment-739891819


   retest this please



[GitHub] [carbondata] akashrn5 commented on a change in pull request #4044: [CARBONDATA-4062] Refactor clean files feature

akashrn5 commented on a change in pull request #4044:
URL: https://github.com/apache/carbondata/pull/4044#discussion_r537484782



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/events/CleanFilesPostEventListener.scala
##########
@@ -48,30 +50,59 @@ class CleanFilesPostEventListener extends OperationEventListener with Logging {
     event match {
       case cleanFilesPostEvent: CleanFilesPostEvent =>
         LOGGER.info("Clean files post event listener called")
-        val carbonTable = cleanFilesPostEvent.carbonTable
-        val indexTables = CarbonIndexUtil
-          .getIndexCarbonTables(carbonTable, cleanFilesPostEvent.sparkSession)
-        val isForceDelete = cleanFilesPostEvent.ifForceDelete
-        val inProgressSegmentsClean = cleanFilesPostEvent.cleanStaleInProgress
-        indexTables.foreach { indexTable =>
-          val partitions: Option[Seq[PartitionSpec]] = CarbonFilters.getPartitions(
-            Seq.empty[Expression],
-            cleanFilesPostEvent.sparkSession,
-            indexTable)
-          SegmentStatusManager.deleteLoadsAndUpdateMetadata(
-              indexTable, isForceDelete, partitions.map(_.asJava).orNull, inProgressSegmentsClean,
-            true)
-          CarbonUpdateUtil.cleanUpDeltaFiles(indexTable, true)
-          cleanUpUnwantedSegmentsOfSIAndUpdateMetadata(indexTable, carbonTable)
-        }
+        cleanFilesForIndex(

Review comment:
       Please see if you can handle this along with [CARBONDATA-4074](https://issues.apache.org/jira/browse/CARBONDATA-4074).





[GitHub] [carbondata] QiangCai commented on pull request #4044: [CARBONDATA-4062] Refactor clean files feature

QiangCai commented on pull request #4044:
URL: https://github.com/apache/carbondata/pull/4044#issuecomment-739903966


   retest this please



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4044: [CARBONDATA-4062] Refactor clean files feature

CarbonDataQA2 commented on pull request #4044:
URL: https://github.com/apache/carbondata/pull/4044#issuecomment-739958168


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/5095/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4044: [CARBONDATA-4062] Refactor clean files feature

CarbonDataQA2 commented on pull request #4044:
URL: https://github.com/apache/carbondata/pull/4044#issuecomment-739959057


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3334/
   



[GitHub] [carbondata] akashrn5 commented on pull request #4044: [CARBONDATA-4062] Refactor clean files feature

akashrn5 commented on pull request #4044:
URL: https://github.com/apache/carbondata/pull/4044#issuecomment-739984283


   LGTM



[GitHub] [carbondata] ajantha-bhat commented on pull request #4044: [CARBONDATA-4062] Refactor clean files feature

ajantha-bhat commented on pull request #4044:
URL: https://github.com/apache/carbondata/pull/4044#issuecomment-739984807


   LGTM



[GitHub] [carbondata] ajantha-bhat commented on pull request #4044: [CARBONDATA-4062] Refactor clean files feature

ajantha-bhat commented on pull request #4044:
URL: https://github.com/apache/carbondata/pull/4044#issuecomment-739986550


   Merging this. Please handle the open issue raised (for example, CARBONDATA-4074) in another PR soon.



[GitHub] [carbondata] asfgit closed pull request #4044: [CARBONDATA-4062] Refactor clean files feature

asfgit closed pull request #4044:
URL: https://github.com/apache/carbondata/pull/4044


   

