[GitHub] carbondata pull request #2128: [WIP] partition table clean files fixed

classic Classic list List threaded Threaded
59 messages Options
123
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4347/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2128: [CARBONDATA-2303] If dataload is failed for p...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user rahulforallp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2128#discussion_r180156365
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala ---
    @@ -151,13 +153,82 @@ object CarbonStore {
             }
           }
         } finally {
    +      if (currentTablePartitions.equals(None)) {
    +        cleanUpPartitionFoldersRecurssively(carbonTable, List.empty[PartitionSpec])
    +      } else {
    +        cleanUpPartitionFoldersRecurssively(carbonTable, currentTablePartitions.get.toList)
    +      }
    +
           if (carbonCleanFilesLock != null) {
             CarbonLockUtil.fileUnlock(carbonCleanFilesLock, LockUsage.CLEAN_FILES_LOCK)
           }
         }
         LOGGER.audit(s"Clean files operation is success for $dbName.$tableName.")
       }
     
    +  /**
    +   * delete partition folders recurssively
    +   *
    +   * @param carbonTable
    +   * @param partitionSpecList
    +   */
    +  def cleanUpPartitionFoldersRecurssively(carbonTable: CarbonTable,
    +      partitionSpecList: List[PartitionSpec]): Unit = {
    +    if (carbonTable != null) {
    +      val loadMetadataDetails = SegmentStatusManager
    +        .readLoadMetadata(carbonTable.getMetadataPath)
    +
    +      val fileType = FileFactory.getFileType(carbonTable.getTablePath)
    +      val carbonFile = FileFactory.getCarbonFile(carbonTable.getTablePath, fileType)
    +
    +      // list all files from table path
    +      val listOfDefaultPartFilesIterator = carbonFile.listFiles(true)
    +      loadMetadataDetails.foreach { metadataDetail =>
    +        if (metadataDetail.getSegmentStatus.equals(SegmentStatus.MARKED_FOR_DELETE) &&
    +            metadataDetail.getSegmentFile == null) {
    +          val loadStartTime: Long = metadataDetail.getLoadStartTime
    +          // delete all files of @loadStartTime from tablepath
    +          cleanPartitionFolder(listOfDefaultPartFilesIterator, loadStartTime)
    +          partitionSpecList.foreach {
    +            partitionSpec =>
    +              val partitionLocation = partitionSpec.getLocation
    +              // For partition folder outside the tablePath
    +              if (!partitionLocation.toString.startsWith(carbonTable.getTablePath)) {
    +                val fileType = FileFactory.getFileType(partitionLocation.toString)
    +                val partitionCarbonFile = FileFactory
    +                  .getCarbonFile(partitionLocation.toString, fileType)
    +                // list all files from partitionLoacation
    +                val listOfExternalPartFilesIterator = partitionCarbonFile.listFiles(true)
    +                // delete all files of @loadStartTime from externalPath
    +                cleanPartitionFolder(listOfExternalPartFilesIterator, loadStartTime)
    +              }
    +          }
    +        }
    +      }
    +    }
    +  }
    +
    +  /**
    +   *
    +   * @param carbonFiles
    +   * @param timestamp
    +   */
    +  private def cleanPartitionFolder(carbonFiles: Array[CarbonFile],
    +      timestamp: Long): Unit = {
    +    carbonFiles.foreach {
    +      carbonFile =>
    +        val filePath = carbonFile.getPath
    +        val fileName = carbonFile.getName
    +        if (fileName.lastIndexOf("-") > 0 && fileName.lastIndexOf(".") > 0) {
    +          if (fileName.substring(fileName.lastIndexOf("-") + 1, fileName.lastIndexOf("."))
    --- End diff --
   
    done


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2128: [CARBONDATA-2303] If dataload is failed for p...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user rahulforallp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2128#discussion_r180156407
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/filesystem/LocalCarbonFile.java ---
    @@ -156,6 +158,25 @@ public boolean delete() {
     
       }
     
    +  @Override
    +  public CarbonFile[] listFiles(Boolean recurssive) {
    +    if (!file.isDirectory()) {
    +      return new CarbonFile[0];
    +    }
    +    String[] filter = null;
    +    Collection<File> fileCollection = FileUtils.listFiles(file, null, true);
    +    File[] files = fileCollection.toArray(new File[fileCollection.size()]);
    +    if (files == null) {
    +      return new CarbonFile[0];
    +    }
    +    CarbonFile[] carbonFiles = new CarbonFile[files.length];
    --- End diff --
   
    done


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3672/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4895/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4359/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4364/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3727/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4946/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4947/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user rahulforallp commented on the issue:

    https://github.com/apache/carbondata/pull/2128
 
    retest this please


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3728/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2128: [CARBONDATA-2303] If dataload is failed for p...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2128


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3730/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4949/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4401/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user gvramana commented on the issue:

    https://github.com/apache/carbondata/pull/2128
 
    LGTM


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4403/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4404/



---
123