[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3345: [CARBONDATA-3487] wrong Input metrics (size/record) displayed in spark UI during insert into


GitBox
ajantha-bhat commented on a change in pull request #3345: [CARBONDATA-3487] wrong Input metrics (size/record) displayed in spark UI during insert into
URL: https://github.com/apache/carbondata/pull/3345#discussion_r311389139
 
 

 ##########
 File path: integration/spark-common/src/main/scala/org/apache/spark/CarbonInputMetrics.scala
 ##########
 @@ -35,20 +35,28 @@ class CarbonInputMetrics extends InitInputMetrics{
     var inputMetrics: InputMetrics = _
     // bytes read before compute by other map rdds in lineage
     var existingBytesRead: Long = _
+    var recordCount: Long = _
+    var inputMetricsInterval: Long = _
     var carbonMultiBlockSplit: CarbonMultiBlockSplit = _
 
   def initBytesReadCallback(context: TaskContext,
-      carbonMultiBlockSplit: CarbonMultiBlockSplit) {
+      carbonMultiBlockSplit: CarbonMultiBlockSplit, inputMetricsInterval: Long) {
     inputMetrics = context.taskMetrics().inputMetrics
     existingBytesRead = inputMetrics.bytesRead
-    this.carbonMultiBlockSplit = carbonMultiBlockSplit;
+    recordCount = 0L
+    this.inputMetricsInterval = inputMetricsInterval
+    this.carbonMultiBlockSplit = carbonMultiBlockSplit
   }
 
   def incrementRecordRead(recordRead: Long) {
-    val value : scala.Long = recordRead
-    inputMetrics.incRecordsRead(value)
-    if (inputMetrics.recordsRead % SparkHadoopUtil.UPDATE_INPUT_METRICS_INTERVAL_RECORDS == 0) {
-      updateBytesRead()
+    val value: scala.Long = recordRead
+    recordCount = recordCount + value
+    if (recordCount > inputMetricsInterval) {
+      inputMetrics.synchronized {
+        inputMetrics.incRecordsRead(recordCount)
+        updateBytesRead()
+      }
+      recordCount = 0L
 
 Review comment:
   CarbonInputMetrics doesn't need synchronization; only the Spark `inputMetrics` object inside it needs to be synchronized.
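The point of the comment can be sketched with a toy model: the per-task accumulator batches record counts locally and takes a lock only when flushing into the shared Spark-side metrics object. `SharedMetrics` and `MetricsAccumulator` below are hypothetical stand-ins for Spark's `InputMetrics` and the patched `CarbonInputMetrics`, not the actual classes.

```scala
// Hedged sketch, assuming a simplified shared-metrics object in place of
// Spark's InputMetrics (which is task-side and updated by a heartbeat thread).
class SharedMetrics {
  private var records: Long = 0L
  def incRecordsRead(n: Long): Unit = records += n
  def recordsRead: Long = records
}

// Per-task accumulator mirroring the patch: records are counted in a local
// field and flushed to the shared object only once the count exceeds the
// configured interval, so the lock is taken on the shared metrics object,
// not on the accumulator (CarbonInputMetrics) itself.
class MetricsAccumulator(shared: SharedMetrics, interval: Long) {
  private var pending: Long = 0L

  def incrementRecordRead(n: Long): Unit = {
    pending += n
    if (pending > interval) {
      shared.synchronized { // synchronize only the shared Spark-side object
        shared.incRecordsRead(pending)
      }
      pending = 0L
    }
  }

  // Flush any remainder at task end so no records are lost.
  def flush(): Unit = {
    shared.synchronized { shared.incRecordsRead(pending) }
    pending = 0L
  }
}
```

With an interval of 10, three increments of 5 stay local until the third pushes the pending count past the threshold, at which point all 15 records are published under the lock in a single call.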

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services