GitHub user dhatchayani opened a pull request:
https://github.com/apache/carbondata/pull/2462 [CARBONDATA-2704] Index file size in describe formatted command is not updated correctly with the segment file **Problem:** Describe formatted command is not showing correct index files size after index files merge. **Solution:** Segment file should be updated with the actual index files size of that segment after index files merge. - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [x] Testing done UT added - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhatchayani/carbondata CARBONDATA-2704 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2462.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2462 ---- commit 11f52bd21d6496764d0841c6f732cb66da583c16 Author: dhatchayani <dhatcha.official@...> Date: 2018-07-09T05:49:51Z [CARBONDATA-2704] Index file size in describe formatted command is not updated correctly with the segment file ---- --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2462 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6939/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2462 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5722/ --- |
In reply to this post by qiuchenjian-2
Github user dhatchayani commented on the issue:
https://github.com/apache/carbondata/pull/2462 retest this please --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2462 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6944/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2462 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5727/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2462 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5712/ --- |
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2462#discussion_r202229032 --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java --- @@ -304,10 +304,15 @@ public static boolean updateSegmentFile(String tablePath, String segmentId, Stri LoadMetadataDetails[] listOfLoadFolderDetailsArray = SegmentStatusManager.readLoadMetadata(metadataPath); + if (null == segmentFileStore) { + segmentFileStore = new SegmentFileStore(tablePath, segmentFile); + } --- End diff -- Do we need to calculate the size even if this object is null? --- |
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2462#discussion_r202229089 --- Diff: core/src/main/java/org/apache/carbondata/core/writer/CarbonIndexFileMergeWriter.java --- @@ -155,7 +155,7 @@ private String writeMergeIndexFileBasedOnSegmentFile( + CarbonCommonConstants.FILE_SEPARATOR + newSegmentFileName; SegmentFileStore.writeSegmentFile(sfs.getSegmentFile(), path); SegmentFileStore.updateSegmentFile(table.getTablePath(), segmentId, newSegmentFileName, - table.getCarbonTableIdentifier().getTableId()); + table.getCarbonTableIdentifier().getTableId(), sfs); --- End diff -- change the variable name to segmentFileStore --- |
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2462#discussion_r202229386 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CarbonIndexFileMergeTestCase.scala --- @@ -193,6 +196,30 @@ class CarbonIndexFileMergeTestCase sql("select * from mitable").show() } + // CARBONDATA-2704, test the index file size after merge + test("Verify the size of the index file after merge") { + sql("DROP TABLE IF EXISTS fileSize") + sql( + """ + | CREATE TABLE fileSize(id INT, name STRING, city STRING, age INT) + | STORED BY 'org.apache.carbondata.format' + | TBLPROPERTIES('SORT_COLUMNS'='city,name') + """.stripMargin) + sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE fileSize OPTIONS('header'='false')") + val table = CarbonMetadata.getInstance().getCarbonTable("default", "fileSize") + var loadMetadataDetails = SegmentStatusManager + .readTableStatusFile(CarbonTablePath.getTableStatusFilePath(table.getTablePath)) + var segment0 = loadMetadataDetails.filter(x=> x.getLoadName.equalsIgnoreCase("0")) + Assert.assertEquals(692, segment0.head.getIndexSize.toLong) --- End diff -- Do not assert for hard coded values. Instead create the file path and take its size using CarbonFile getSize interface method --- |
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2462#discussion_r202229400 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CarbonIndexFileMergeTestCase.scala --- @@ -193,6 +196,30 @@ class CarbonIndexFileMergeTestCase sql("select * from mitable").show() } + // CARBONDATA-2704, test the index file size after merge + test("Verify the size of the index file after merge") { + sql("DROP TABLE IF EXISTS fileSize") + sql( + """ + | CREATE TABLE fileSize(id INT, name STRING, city STRING, age INT) + | STORED BY 'org.apache.carbondata.format' + | TBLPROPERTIES('SORT_COLUMNS'='city,name') + """.stripMargin) + sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE fileSize OPTIONS('header'='false')") + val table = CarbonMetadata.getInstance().getCarbonTable("default", "fileSize") + var loadMetadataDetails = SegmentStatusManager + .readTableStatusFile(CarbonTablePath.getTableStatusFilePath(table.getTablePath)) + var segment0 = loadMetadataDetails.filter(x=> x.getLoadName.equalsIgnoreCase("0")) + Assert.assertEquals(692, segment0.head.getIndexSize.toLong) + new CarbonIndexFileMergeWriter(table) + .mergeCarbonIndexFilesOfSegment("0", table.getTablePath, false) + loadMetadataDetails = SegmentStatusManager + .readTableStatusFile(CarbonTablePath.getTableStatusFilePath(table.getTablePath)) + segment0 = loadMetadataDetails.filter(x=> x.getLoadName.equalsIgnoreCase("0")) + Assert.assertEquals(741, segment0.head.getIndexSize.toLong) --- End diff -- same comment as above --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2462 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5816/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2462 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7096/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2462 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5872/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2462 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7178/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2462 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5954/ --- |
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on the issue:
https://github.com/apache/carbondata/pull/2462 LGTM --- |
In reply to this post by qiuchenjian-2
|
Free forum by Nabble | Edit this page |