[GitHub] [carbondata] marchpure opened a new pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage


[GitHub] [carbondata] marchpure opened a new pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

marchpure opened a new pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965


    ### Why is this PR needed?
   
   
    ### What changes were proposed in this PR?
   
       
    ### Does this PR introduce any user interface change?
    - No
    - Yes. (please explain the change and update document)
   
    ### Is any new testcase added?
    - No
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

CarbonDataQA1 commented on pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#issuecomment-700843530


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4274/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

CarbonDataQA1 commented on pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#issuecomment-700844396


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2529/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

CarbonDataQA1 commented on pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#issuecomment-700899917


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2531/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

CarbonDataQA1 commented on pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#issuecomment-700906032


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4276/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

CarbonDataQA1 commented on pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#issuecomment-701128059


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4278/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

CarbonDataQA1 commented on pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#issuecomment-701129169


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2533/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

CarbonDataQA1 commented on pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#issuecomment-701160434


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2535/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

CarbonDataQA1 commented on pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#issuecomment-701160608


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4280/
   



[GitHub] [carbondata] marchpure commented on pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

marchpure commented on pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#issuecomment-701334249


   retest this please



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

CarbonDataQA1 commented on pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#issuecomment-701400075


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4286/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

CarbonDataQA1 commented on pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#issuecomment-701401558


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2539/
   



[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

Indhumathi27 commented on a change in pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#discussion_r498084784



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
##########
@@ -96,20 +100,31 @@ object CarbonStore {
    * Read stage files and return input files
    */
   def readStageInput(
+      tableStagePath: String,
       stageFiles: Seq[CarbonFile],
       status: StageInput.StageStatus): Seq[StageInput] = {
     val gson = new Gson()
     val output = Collections.synchronizedList(new util.ArrayList[StageInput]())
     stageFiles.map { stage =>

Review comment:
       Can use foreach instead of map here, since the result of map is discarded.
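
A minimal sketch of the point behind this comment, assuming a simplified stand-in for the loop body: `ForeachVsMap` and `collectNames` are hypothetical names, and plain strings replace the `CarbonFile` and `StageInput` types from the diff. It only illustrates why `foreach` fits a loop that exists for its side effect on `output`.

```scala
import java.util.{ArrayList, Collections}

object ForeachVsMap {
  // Hypothetical stand-in for the per-file work done in readStageInput.
  def collectNames(files: Seq[String]): java.util.List[String] = {
    val output = Collections.synchronizedList(new ArrayList[String]())
    // foreach returns Unit, which signals that only the side effect matters.
    // map would instead build a Seq of the Boolean results of add() and
    // then throw it away.
    files.foreach { name => output.add(name.toUpperCase) }
    output
  }
}
```

Both forms populate `output` identically; the difference is the discarded intermediate collection and the intent conveyed to the reader.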

##########
File path: integration/spark/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
##########
@@ -96,20 +100,31 @@ object CarbonStore {
    * Read stage files and return input files
    */
   def readStageInput(
+      tableStagePath: String,
       stageFiles: Seq[CarbonFile],
       status: StageInput.StageStatus): Seq[StageInput] = {
     val gson = new Gson()
     val output = Collections.synchronizedList(new util.ArrayList[StageInput]())
     stageFiles.map { stage =>
-      val filePath = stage.getAbsolutePath
-      val stream = FileFactory.getDataInputStream(filePath)
+      val filePath = tableStagePath + CarbonCommonConstants.FILE_SEPARATOR + stage.getName
+      var stream: DataInputStream = null
       try {
-        val stageInput = gson.fromJson(new InputStreamReader(stream), classOf[StageInput])
-        stageInput.setCreateTime(stage.getLastModifiedTime)
-        stageInput.setStatus(status)
-        output.add(stageInput)
+        stream = FileFactory.getDataInputStream(filePath)
+        var retry = READ_FILE_RETRY_TIMES
+        breakable { while (retry > 0) { try {
+          val stageInput = gson.fromJson(new InputStreamReader(stream), classOf[StageInput])
+          stageInput.setCreateTime(stage.getLastModifiedTime)
+          stageInput.setStatus(status)
+          output.add(stageInput)
+        } catch {
+          case _ : FileNotFoundException => breakable()

Review comment:
       Should we add a log message if the file is not found?

##########
File path: integration/spark/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
##########
@@ -96,20 +100,31 @@ object CarbonStore {
    * Read stage files and return input files
    */
   def readStageInput(
+      tableStagePath: String,
       stageFiles: Seq[CarbonFile],
       status: StageInput.StageStatus): Seq[StageInput] = {
     val gson = new Gson()
     val output = Collections.synchronizedList(new util.ArrayList[StageInput]())
     stageFiles.map { stage =>
-      val filePath = stage.getAbsolutePath
-      val stream = FileFactory.getDataInputStream(filePath)
+      val filePath = tableStagePath + CarbonCommonConstants.FILE_SEPARATOR + stage.getName
+      var stream: DataInputStream = null
       try {
-        val stageInput = gson.fromJson(new InputStreamReader(stream), classOf[StageInput])
-        stageInput.setCreateTime(stage.getLastModifiedTime)
-        stageInput.setStatus(status)
-        output.add(stageInput)
+        stream = FileFactory.getDataInputStream(filePath)
+        var retry = READ_FILE_RETRY_TIMES
+        breakable { while (retry > 0) { try {
+          val stageInput = gson.fromJson(new InputStreamReader(stream), classOf[StageInput])
+          stageInput.setCreateTime(stage.getLastModifiedTime)
+          stageInput.setStatus(status)
+          output.add(stageInput)

Review comment:
       Should break from the loop once the stage file has been read successfully.
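
One way the requested control flow could look, as a sketch under stated assumptions rather than the code that was merged: `RetryRead` and `readWithRetry` are hypothetical, `read` stands in for opening and parsing one stage file, and decrementing the counter on other `IOException`s is an assumption, since the quoted hunk does not show how `retry` is updated.

```scala
import java.io.{FileNotFoundException, IOException}
import scala.util.control.Breaks.{break, breakable}

object RetryRead {
  // Hypothetical helper: `read` stands in for opening and parsing a stage file.
  def readWithRetry[T](maxRetries: Int)(read: () => T): Option[T] = {
    var result: Option[T] = None
    var retry = maxRetries
    breakable {
      while (retry > 0) {
        try {
          result = Some(read())
          break() // success: leave the loop instead of reading again
        } catch {
          case _: FileNotFoundException =>
            // The file is gone; retrying cannot bring it back, so stop.
            break()
          case _: IOException =>
            retry -= 1 // assumed transient failure: try again
        }
      }
    }
    result
  }
}
```

Without the `break()` after the successful read, the loop would run again while `retry > 0` and could add the same stage input more than once.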

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertFromStageCommand.scala
##########
@@ -477,13 +479,23 @@ case class CarbonInsertFromStageCommand(
     stageFiles.map { stage =>
       executorService.submit(new Runnable {
         override def run(): Unit = {
-          val filePath = stage._1.getAbsolutePath
-          val stream = FileFactory.getDataInputStream(filePath)
+          val filePath = tableStagePath + CarbonCommonConstants.FILE_SEPARATOR + stage._1.getName
+          var stream: DataInputStream = null
           try {
-            val stageInput = gson.fromJson(new InputStreamReader(stream), classOf[StageInput])
-            output.add(stageInput)
+            stream = FileFactory.getDataInputStream(filePath)
+            var retry = CarbonInsertFromStageCommand.DELETE_FILES_RETRY_TIMES
+            breakable (while (retry > 0) try {

Review comment:
       Please format the code.

##########
File path: integration/spark/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
##########
@@ -96,20 +100,31 @@ object CarbonStore {
    * Read stage files and return input files
    */
   def readStageInput(
+      tableStagePath: String,
       stageFiles: Seq[CarbonFile],
       status: StageInput.StageStatus): Seq[StageInput] = {
     val gson = new Gson()
     val output = Collections.synchronizedList(new util.ArrayList[StageInput]())
     stageFiles.map { stage =>
-      val filePath = stage.getAbsolutePath
-      val stream = FileFactory.getDataInputStream(filePath)
+      val filePath = tableStagePath + CarbonCommonConstants.FILE_SEPARATOR + stage.getName
+      var stream: DataInputStream = null
       try {
-        val stageInput = gson.fromJson(new InputStreamReader(stream), classOf[StageInput])
-        stageInput.setCreateTime(stage.getLastModifiedTime)
-        stageInput.setStatus(status)
-        output.add(stageInput)
+        stream = FileFactory.getDataInputStream(filePath)
+        var retry = READ_FILE_RETRY_TIMES
+        breakable { while (retry > 0) { try {

Review comment:
       Please format the code.





[GitHub] [carbondata] marchpure commented on a change in pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

marchpure commented on a change in pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#discussion_r498155406



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertFromStageCommand.scala
##########
@@ -477,13 +479,23 @@ case class CarbonInsertFromStageCommand(
     stageFiles.map { stage =>
       executorService.submit(new Runnable {
         override def run(): Unit = {
-          val filePath = stage._1.getAbsolutePath
-          val stream = FileFactory.getDataInputStream(filePath)
+          val filePath = tableStagePath + CarbonCommonConstants.FILE_SEPARATOR + stage._1.getName
+          var stream: DataInputStream = null
           try {
-            val stageInput = gson.fromJson(new InputStreamReader(stream), classOf[StageInput])
-            output.add(stageInput)
+            stream = FileFactory.getDataInputStream(filePath)
+            var retry = CarbonInsertFromStageCommand.DELETE_FILES_RETRY_TIMES
+            breakable (while (retry > 0) try {

Review comment:
       I have modified the code according to your suggestion.

##########
File path: integration/spark/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
##########
@@ -96,20 +100,31 @@ object CarbonStore {
    * Read stage files and return input files
    */
   def readStageInput(
+      tableStagePath: String,
       stageFiles: Seq[CarbonFile],
       status: StageInput.StageStatus): Seq[StageInput] = {
     val gson = new Gson()
     val output = Collections.synchronizedList(new util.ArrayList[StageInput]())
     stageFiles.map { stage =>
-      val filePath = stage.getAbsolutePath
-      val stream = FileFactory.getDataInputStream(filePath)
+      val filePath = tableStagePath + CarbonCommonConstants.FILE_SEPARATOR + stage.getName
+      var stream: DataInputStream = null
       try {
-        val stageInput = gson.fromJson(new InputStreamReader(stream), classOf[StageInput])
-        stageInput.setCreateTime(stage.getLastModifiedTime)
-        stageInput.setStatus(status)
-        output.add(stageInput)
+        stream = FileFactory.getDataInputStream(filePath)
+        var retry = READ_FILE_RETRY_TIMES
+        breakable { while (retry > 0) { try {
+          val stageInput = gson.fromJson(new InputStreamReader(stream), classOf[StageInput])
+          stageInput.setCreateTime(stage.getLastModifiedTime)
+          stageInput.setStatus(status)
+          output.add(stageInput)

Review comment:
       I have modified the code according to your suggestion.

##########
File path: integration/spark/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
##########
@@ -96,20 +100,31 @@ object CarbonStore {
    * Read stage files and return input files
    */
   def readStageInput(
+      tableStagePath: String,
       stageFiles: Seq[CarbonFile],
       status: StageInput.StageStatus): Seq[StageInput] = {
     val gson = new Gson()
     val output = Collections.synchronizedList(new util.ArrayList[StageInput]())
     stageFiles.map { stage =>
-      val filePath = stage.getAbsolutePath
-      val stream = FileFactory.getDataInputStream(filePath)
+      val filePath = tableStagePath + CarbonCommonConstants.FILE_SEPARATOR + stage.getName
+      var stream: DataInputStream = null
       try {
-        val stageInput = gson.fromJson(new InputStreamReader(stream), classOf[StageInput])
-        stageInput.setCreateTime(stage.getLastModifiedTime)
-        stageInput.setStatus(status)
-        output.add(stageInput)
+        stream = FileFactory.getDataInputStream(filePath)
+        var retry = READ_FILE_RETRY_TIMES
+        breakable { while (retry > 0) { try {
+          val stageInput = gson.fromJson(new InputStreamReader(stream), classOf[StageInput])
+          stageInput.setCreateTime(stage.getLastModifiedTime)
+          stageInput.setStatus(status)
+          output.add(stageInput)
+        } catch {
+          case _ : FileNotFoundException => breakable()

Review comment:
       I have modified the code according to your suggestion.

##########
File path: integration/spark/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
##########
@@ -96,20 +100,31 @@ object CarbonStore {
    * Read stage files and return input files
    */
   def readStageInput(
+      tableStagePath: String,
       stageFiles: Seq[CarbonFile],
       status: StageInput.StageStatus): Seq[StageInput] = {
     val gson = new Gson()
     val output = Collections.synchronizedList(new util.ArrayList[StageInput]())
     stageFiles.map { stage =>
-      val filePath = stage.getAbsolutePath
-      val stream = FileFactory.getDataInputStream(filePath)
+      val filePath = tableStagePath + CarbonCommonConstants.FILE_SEPARATOR + stage.getName
+      var stream: DataInputStream = null
       try {
-        val stageInput = gson.fromJson(new InputStreamReader(stream), classOf[StageInput])
-        stageInput.setCreateTime(stage.getLastModifiedTime)
-        stageInput.setStatus(status)
-        output.add(stageInput)
+        stream = FileFactory.getDataInputStream(filePath)
+        var retry = READ_FILE_RETRY_TIMES
+        breakable { while (retry > 0) { try {

Review comment:
       I have modified the code according to your suggestion.

##########
File path: integration/spark/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
##########
@@ -96,20 +100,31 @@ object CarbonStore {
    * Read stage files and return input files
    */
   def readStageInput(
+      tableStagePath: String,
       stageFiles: Seq[CarbonFile],
       status: StageInput.StageStatus): Seq[StageInput] = {
     val gson = new Gson()
     val output = Collections.synchronizedList(new util.ArrayList[StageInput]())
     stageFiles.map { stage =>

Review comment:
       I have modified the code according to your suggestion.





[GitHub] [carbondata] marchpure commented on pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

marchpure commented on pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#issuecomment-702062616


   retest this please



[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

Indhumathi27 commented on a change in pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#discussion_r498182999



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
##########
@@ -96,20 +100,37 @@ object CarbonStore {
    * Read stage files and return input files
    */
   def readStageInput(
+      tableStagePath: String,
       stageFiles: Seq[CarbonFile],
       status: StageInput.StageStatus): Seq[StageInput] = {
     val gson = new Gson()
     val output = Collections.synchronizedList(new util.ArrayList[StageInput]())
-    stageFiles.map { stage =>
-      val filePath = stage.getAbsolutePath
-      val stream = FileFactory.getDataInputStream(filePath)
+    stageFiles.foreach { stage =>
+      val filePath = tableStagePath + CarbonCommonConstants.FILE_SEPARATOR + stage.getName
+      var stream: DataInputStream = null
       try {
-        val stageInput = gson.fromJson(new InputStreamReader(stream), classOf[StageInput])
-        stageInput.setCreateTime(stage.getLastModifiedTime)
-        stageInput.setStatus(status)
-        output.add(stageInput)
+        stream = FileFactory.getDataInputStream(filePath)
+        var retry = READ_FILE_RETRY_TIMES
+        breakable {
+          while (retry > 0) {
+            try {
+              val stageInput = gson.fromJson(new InputStreamReader(stream), classOf[StageInput])
+              stageInput.setCreateTime(stage.getLastModifiedTime)
+              stageInput.setStatus(status)
+              output.add(stageInput)
+              break()
+            } catch {
+              case _ : FileNotFoundException => break()
+                LOGGER.warn("The stage file: " + filePath + " does not exist");

Review comment:
       Move this log statement before break(); placed after break(), it is never executed.
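
A small illustration of why the ordering matters, assuming nothing about the PR beyond the quoted hunk: `break()` from `scala.util.control.Breaks` exits by throwing a control exception, so any statement that follows it in the same branch is skipped. `LogBeforeBreak` and its in-memory `log` buffer are hypothetical stand-ins for `LOGGER.warn`.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.control.Breaks.{break, breakable}

object LogBeforeBreak {
  // Hypothetical in-memory "logger" so the effect can be observed.
  def run(logFirst: Boolean): List[String] = {
    val log = ListBuffer[String]()
    breakable {
      if (logFirst) {
        log += "The stage file does not exist" // recorded, then we exit
        break()
      } else {
        break()
        log += "The stage file does not exist" // never reached
      }
    }
    log.toList
  }
}
```

With `logFirst = true` the message is recorded; with `logFirst = false` the returned list is empty, which mirrors the ordering flagged in the diff.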





[GitHub] [carbondata] marchpure commented on pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

marchpure commented on pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#issuecomment-702095077


   retest this please



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

CarbonDataQA1 commented on pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#issuecomment-702125570


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2545/
   



[GitHub] [carbondata] marchpure commented on pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

marchpure commented on pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#issuecomment-702129123


   retest this please



[GitHub] [carbondata] marchpure commented on a change in pull request #3965: [CARBONDATA-4016] NPE and FileNotFound in Show Segments and Insert Stage

GitBox

marchpure commented on a change in pull request #3965:
URL: https://github.com/apache/carbondata/pull/3965#discussion_r498335797



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
##########
@@ -96,20 +100,37 @@ object CarbonStore {
    * Read stage files and return input files
    */
   def readStageInput(
+      tableStagePath: String,
       stageFiles: Seq[CarbonFile],
       status: StageInput.StageStatus): Seq[StageInput] = {
     val gson = new Gson()
     val output = Collections.synchronizedList(new util.ArrayList[StageInput]())
-    stageFiles.map { stage =>
-      val filePath = stage.getAbsolutePath
-      val stream = FileFactory.getDataInputStream(filePath)
+    stageFiles.foreach { stage =>
+      val filePath = tableStagePath + CarbonCommonConstants.FILE_SEPARATOR + stage.getName
+      var stream: DataInputStream = null
       try {
-        val stageInput = gson.fromJson(new InputStreamReader(stream), classOf[StageInput])
-        stageInput.setCreateTime(stage.getLastModifiedTime)
-        stageInput.setStatus(status)
-        output.add(stageInput)
+        stream = FileFactory.getDataInputStream(filePath)
+        var retry = READ_FILE_RETRY_TIMES
+        breakable {
+          while (retry > 0) {
+            try {
+              val stageInput = gson.fromJson(new InputStreamReader(stream), classOf[StageInput])
+              stageInput.setCreateTime(stage.getLastModifiedTime)
+              stageInput.setStatus(status)
+              output.add(stageInput)
+              break()
+            } catch {
+              case _ : FileNotFoundException => break()
+                LOGGER.warn("The stage file: " + filePath + " does not exist");

Review comment:
       I have modified the code according to your suggestion



