akashrn5 opened a new pull request #3927:
URL: https://github.com/apache/carbondata/pull/3927

### Why is this PR needed?

### What changes were proposed in this PR?

### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)

### Is any new testcase added?
- No
- Yes

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email]
CarbonDataQA1 commented on pull request #3927:
URL: https://github.com/apache/carbondata/pull/3927#issuecomment-691982389

Build Success with Spark 2.3.4, please check CI: http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4065/
CarbonDataQA1 commented on pull request #3927:
URL: https://github.com/apache/carbondata/pull/3927#issuecomment-691991425

Build Success with Spark 2.4.5, please check CI: http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2326/
ajantha-bhat commented on a change in pull request #3927:
URL: https://github.com/apache/carbondata/pull/3927#discussion_r489299596

File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala

```diff
@@ -725,43 +725,53 @@ object CarbonDataRDDFactory {
     val metadataDetails = SegmentStatusManager.readTableStatusFile(
       CarbonTablePath.getTableStatusFilePath(carbonTable.getTablePath))
+    val updateTableStatusFile = CarbonUpdateUtil.getUpdateStatusFileName(updateModel
+      .updatedTimeStamp.toString)
+    val updatedSegments = SegmentUpdateStatusManager.readLoadMetadata(updateTableStatusFile,
+      carbonTable.getTablePath).map(_.getSegmentName).toSet
     val segmentFiles = segmentDetails.asScala.map { seg =>
-      val load =
-        metadataDetails.find(_.getLoadName.equals(seg.getSegmentNo)).get
-      val segmentFile = load.getSegmentFile
-      var segmentFiles: Seq[CarbonFile] = Seq.empty[CarbonFile]
+      // create new segment files and merge for only updated segments
```

Review comment: I thought you would strictly clean up the base code in the area you modify, to make it more readable and maintainable! Here `seg`, `file`, and `carbonFile` are not good variable names; please check and refactor if you want.
ajantha-bhat commented on a change in pull request #3927:
URL: https://github.com/apache/carbondata/pull/3927#discussion_r489300387

File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala

```diff
@@ -725,43 +725,53 @@ object CarbonDataRDDFactory {
     val metadataDetails = SegmentStatusManager.readTableStatusFile(
       CarbonTablePath.getTableStatusFilePath(carbonTable.getTablePath))
+    val updateTableStatusFile = CarbonUpdateUtil.getUpdateStatusFileName(updateModel
+      .updatedTimeStamp.toString)
+    val updatedSegments = SegmentUpdateStatusManager.readLoadMetadata(updateTableStatusFile,
+      carbonTable.getTablePath).map(_.getSegmentName).toSet
     val segmentFiles = segmentDetails.asScala.map { seg =>
-      val load =
-        metadataDetails.find(_.getLoadName.equals(seg.getSegmentNo)).get
-      val segmentFile = load.getSegmentFile
-      var segmentFiles: Seq[CarbonFile] = Seq.empty[CarbonFile]
+      // create new segment files and merge for only updated segments
+      if (updatedSegments.contains(seg.getSegmentNo)) {
+        val load =
+          metadataDetails.find(_.getLoadName.equals(seg.getSegmentNo)).get
+        val segmentFile = load.getSegmentFile
+        var segmentFiles: Seq[CarbonFile] = Seq.empty[CarbonFile]

-      val segmentMetaDataInfo = segmentMetaDataInfoMap.get(seg.getSegmentNo)
-      val file = SegmentFileStore.writeSegmentFile(
-        carbonTable,
-        seg.getSegmentNo,
-        String.valueOf(System.currentTimeMillis()),
-        load.getPath,
-        segmentMetaDataInfo)
+        val segmentMetaDataInfo = segmentMetaDataInfoMap.get(seg.getSegmentNo)
+        val file = SegmentFileStore.writeSegmentFile(
```

Review comment: Is the segment file written again here just to update the current timestamp? Please add a comment if possible.
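The guard added in this diff boils down to a simple pattern: read the updated segment names once into a `Set`, then rewrite segment files only for segments in that set. A minimal standalone sketch of that pattern — the types and names below are hypothetical simplifications for illustration, not the real CarbonData API:

```scala
// Sketch of the "filter to updated segments" pattern from the diff.
// Segment and the string-based status reader are illustrative stand-ins.
object UpdatedSegmentFilter {
  final case class Segment(segmentNo: String)

  // Stand-in for SegmentUpdateStatusManager.readLoadMetadata(...)
  //   .map(_.getSegmentName).toSet
  def updatedSegmentNames(updateStatus: Seq[String]): Set[String] =
    updateStatus.toSet

  // Rewrite segment files only for updated segments, mirroring
  // `if (updatedSegments.contains(seg.getSegmentNo))` in the diff.
  def segmentsToRewrite(all: Seq[Segment], updated: Set[String]): Seq[Segment] =
    all.filter(seg => updated.contains(seg.segmentNo))
}
```

Building the `Set` once before the `map` keeps the membership check O(1) per segment instead of re-reading the update status file inside the loop.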
akashrn5 commented on a change in pull request #3927:
URL: https://github.com/apache/carbondata/pull/3927#discussion_r489323476

File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala

```diff
@@ -725,43 +725,53 @@ object CarbonDataRDDFactory {
     val metadataDetails = SegmentStatusManager.readTableStatusFile(
       CarbonTablePath.getTableStatusFilePath(carbonTable.getTablePath))
+    val updateTableStatusFile = CarbonUpdateUtil.getUpdateStatusFileName(updateModel
+      .updatedTimeStamp.toString)
+    val updatedSegments = SegmentUpdateStatusManager.readLoadMetadata(updateTableStatusFile,
+      carbonTable.getTablePath).map(_.getSegmentName).toSet
     val segmentFiles = segmentDetails.asScala.map { seg =>
-      val load =
-        metadataDetails.find(_.getLoadName.equals(seg.getSegmentNo)).get
-      val segmentFile = load.getSegmentFile
-      var segmentFiles: Seq[CarbonFile] = Seq.empty[CarbonFile]
+      // create new segment files and merge for only updated segments
```

Review comment: Changed.
akashrn5 commented on a change in pull request #3927:
URL: https://github.com/apache/carbondata/pull/3927#discussion_r489324392

File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala

```diff
@@ -725,43 +725,53 @@ object CarbonDataRDDFactory {
     val metadataDetails = SegmentStatusManager.readTableStatusFile(
       CarbonTablePath.getTableStatusFilePath(carbonTable.getTablePath))
+    val updateTableStatusFile = CarbonUpdateUtil.getUpdateStatusFileName(updateModel
+      .updatedTimeStamp.toString)
+    val updatedSegments = SegmentUpdateStatusManager.readLoadMetadata(updateTableStatusFile,
+      carbonTable.getTablePath).map(_.getSegmentName).toSet
     val segmentFiles = segmentDetails.asScala.map { seg =>
-      val load =
-        metadataDetails.find(_.getLoadName.equals(seg.getSegmentNo)).get
-      val segmentFile = load.getSegmentFile
-      var segmentFiles: Seq[CarbonFile] = Seq.empty[CarbonFile]
+      // create new segment files and merge for only updated segments
+      if (updatedSegments.contains(seg.getSegmentNo)) {
+        val load =
+          metadataDetails.find(_.getLoadName.equals(seg.getSegmentNo)).get
+        val segmentFile = load.getSegmentFile
+        var segmentFiles: Seq[CarbonFile] = Seq.empty[CarbonFile]

-      val segmentMetaDataInfo = segmentMetaDataInfoMap.get(seg.getSegmentNo)
-      val file = SegmentFileStore.writeSegmentFile(
-        carbonTable,
-        seg.getSegmentNo,
-        String.valueOf(System.currentTimeMillis()),
-        load.getPath,
-        segmentMetaDataInfo)
+        val segmentMetaDataInfo = segmentMetaDataInfoMap.get(seg.getSegmentNo)
+        val file = SegmentFileStore.writeSegmentFile(
```

Review comment: The method-level comment says that for new updates on the same segment we need to write a new segment file with the new segmentMetaDataInfo and merge it with the old segment file. I think that comment is pretty clear, so it should be OK. Please check once.
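The behaviour described above — write a new segment file carrying the fresh segmentMetaDataInfo and a fresh timestamp, then merge it with the old segment file — can be sketched in isolation. The types and merge semantics below are hypothetical simplifications, not the real `SegmentFileStore` API:

```scala
// Sketch of "write new segment file, then merge with the old one".
// SegmentFile is an illustrative stand-in for the on-disk segment file.
object SegmentFileMergeSketch {
  final case class SegmentFile(timestamp: Long, metaInfo: Map[String, String])

  // Stand-in for SegmentFileStore.writeSegmentFile(..., currentTimeMillis, ...)
  def writeNew(timestamp: Long, metaInfo: Map[String, String]): SegmentFile =
    SegmentFile(timestamp, metaInfo)

  // Merged file keeps the newer timestamp; entries from the new file
  // override matching entries from the old one.
  def merge(oldFile: SegmentFile, newFile: SegmentFile): SegmentFile =
    SegmentFile(
      math.max(oldFile.timestamp, newFile.timestamp),
      oldFile.metaInfo ++ newFile.metaInfo)
}
```

The point of the merge step is that repeated updates on the same segment accumulate into one segment file rather than leaving a trail of per-update files.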
ajantha-bhat commented on pull request #3927:
URL: https://github.com/apache/carbondata/pull/3927#issuecomment-693310888

LGTM. Will merge once the build passes.
CarbonDataQA1 commented on pull request #3927:
URL: https://github.com/apache/carbondata/pull/3927#issuecomment-693362694

Build Success with Spark 2.3.4, please check CI: http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4097/
CarbonDataQA1 commented on pull request #3927:
URL: https://github.com/apache/carbondata/pull/3927#issuecomment-693363425

Build Success with Spark 2.4.5, please check CI: http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2356/
asfgit closed pull request #3927:
URL: https://github.com/apache/carbondata/pull/3927