GitHub user kumarvishal09 opened a pull request:
https://github.com/apache/carbondata/pull/2529

[WIP] Reduce Memory footprint and store size for local dictionary encoded columns

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
      Please provide details on
      - Whether new unit test cases have been added or why no new tests are required?
      - How it is tested? Please attach test report.
      - Is it a performance related change? Please attach the performance test report.
      - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kumarvishal09/incubator-carbondata localdictperformance1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2529.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2529

----

commit 1d4891551d66a6cf11783da01687642f3ee10d89
Author: kumarvishal09 <kumarvishal1802@...>
Date:   2018-07-19T10:22:34Z

    fixed performance issue

commit 97673c7f6b0f8ead1d4725f704c3b861e8e45af8
Author: kumarvishal09 <kumarvishal1802@...>
Date:   2018-07-19T13:44:16Z

    fixed performance issue

---
Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2529#discussion_r203750816

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/SafeFixLengthColumnPage.java ---
@@ -431,6 +439,15 @@ private void ensureArraySize(int requestSize, DataType dataType) {
         System.arraycopy(doubleData, 0, newArray, 0, arrayElementCount);
         doubleData = newArray;
       }
+    } else if (dataType == DataTypes.BYTE_ARRAY) {
--- End diff --

Increasing the array size by 16 is too low; it can be doubled instead, as in the ArrayList case.

---
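A minimal, self-contained sketch of the doubling strategy suggested above, assuming the safe page keeps its fixed-length rows in a plain byte[][]; the class and field names here are illustrative and are not the actual SafeFixLengthColumnPage members:

    // Illustrative only: grow the backing array by doubling (ArrayList-style)
    // instead of by a fixed increment of 16. All names below are hypothetical.
    public final class GrowableBytePage {

      private byte[][] rows = new byte[16][];
      private int rowCount = 0;

      // Grow the backing array so that index 'requestSize' fits, doubling each time.
      private void ensureArraySize(int requestSize) {
        if (requestSize >= rows.length) {
          int newSize = Math.max(rows.length * 2, requestSize + 1);
          byte[][] newArray = new byte[newSize][];
          System.arraycopy(rows, 0, newArray, 0, rowCount);
          rows = newArray;
        }
      }

      public void putBytes(int rowId, byte[] value) {
        ensureArraySize(rowId);
        rows[rowId] = value;
        rowCount = Math.max(rowCount, rowId + 1);
      }
    }

With doubling, appending N rows costs an amortized O(N) element copies, whereas a fixed increment of 16 leads to roughly N/16 reallocations and O(N^2) copying for large pages.

---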
Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2529#discussion_r203753043

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java ---
@@ -201,12 +201,18 @@ public void putDouble(int rowId, double value) {
   @Override
   public void putBytes(int rowId, byte[] bytes) {
+    try {
+      ensureMemory(eachRowSize);
+    } catch (MemoryException e) {
--- End diff --

MemoryException can be a runtime exception.

---
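A sketch of what that suggestion would look like, assuming MemoryException were made unchecked; this is illustrative only and does not reflect the actual CarbonData MemoryException class or its constructors:

    // Illustrative only: if MemoryException extended RuntimeException, call sites
    // such as putBytes would not need to wrap ensureMemory in try/catch.
    public class MemoryException extends RuntimeException {
      public MemoryException(String message) {
        super(message);
      }
    }

putBytes would then reduce to a plain ensureMemory(eachRowSize) call before writing the row.

---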
Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2529#discussion_r203754026

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java ---
@@ -355,16 +361,33 @@ public BigDecimal getDecimal(int rowId) {

   @Override
   public byte[][] getByteArrayPage() {
-    throw new UnsupportedOperationException("invalid data type: " + dataType);
+    byte[][] data = new byte[getEndLoop()][eachRowSize];
+    long offset = baseOffset;
+    for (int i = 0; i < data.length; i++) {
+      // copy the row from memory block based on offset
+      // offset position will be index * each column value length
+      CarbonUnsafe.getUnsafe().copyMemory(memoryBlock.getBaseObject(), offset, data[i],
+          CarbonUnsafe.BYTE_ARRAY_OFFSET, eachRowSize);
+      offset += eachRowSize;
+    }
+    return data;
   }

   @Override
   public byte[] getLVFlattenedBytePage() {
     throw new UnsupportedOperationException("invalid data type: " + dataType);
   }

-  @Override
-  public byte[] getComplexChildrenLVFlattenedBytePage() throws IOException {
-    throw new UnsupportedOperationException("invalid data type: " + dataType);
+
+  @Override public byte[] getComplexChildrenLVFlattenedBytePage() {
+    byte[] data = new byte[totalLength];
+    int numberOfRows = getEndLoop();
+    int destOffset = 0;
+    for (int i = 0; i < numberOfRows; i++) {
--- End diff --

Directly get a single byte array.

---
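One way to read that suggestion: since this is a fixed-length page, the rows sit back to back in the memory block, so the flattened array can be produced with a single bulk copy rather than a per-row loop. The fragment below would live inside the page class and reuses the names from the diff above (memoryBlock, baseOffset, eachRowSize, getEndLoop()); it is a sketch of the idea, not the change that was actually merged:

    // Illustrative only: copy the whole contiguous fixed-length row region in one shot.
    @Override
    public byte[] getComplexChildrenLVFlattenedBytePage() {
      int totalLength = getEndLoop() * eachRowSize;   // rows are stored back to back
      byte[] data = new byte[totalLength];
      CarbonUnsafe.getUnsafe().copyMemory(memoryBlock.getBaseObject(), baseOffset,
          data, CarbonUnsafe.BYTE_ARRAY_OFFSET, totalLength);
      return data;
    }

A single copyMemory call avoids repeated per-row native copies and the destination-offset bookkeeping of the loop.

---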
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2529

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6088/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2529

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7327/

---
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2529#discussion_r203783172

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java ---
@@ -201,12 +201,18 @@ public void putDouble(int rowId, double value) {
   @Override
   public void putBytes(int rowId, byte[] bytes) {
+    try {
+      ensureMemory(eachRowSize);
+    } catch (MemoryException e) {
--- End diff --

OK, will handle this in a different PR.

---
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2529#discussion_r203783213

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java ---
@@ -355,16 +361,33 @@ public BigDecimal getDecimal(int rowId) {

   @Override
   public byte[][] getByteArrayPage() {
-    throw new UnsupportedOperationException("invalid data type: " + dataType);
+    byte[][] data = new byte[getEndLoop()][eachRowSize];
+    long offset = baseOffset;
+    for (int i = 0; i < data.length; i++) {
+      // copy the row from memory block based on offset
+      // offset position will be index * each column value length
+      CarbonUnsafe.getUnsafe().copyMemory(memoryBlock.getBaseObject(), offset, data[i],
+          CarbonUnsafe.BYTE_ARRAY_OFFSET, eachRowSize);
+      offset += eachRowSize;
+    }
+    return data;
   }

   @Override
   public byte[] getLVFlattenedBytePage() {
     throw new UnsupportedOperationException("invalid data type: " + dataType);
   }

-  @Override
-  public byte[] getComplexChildrenLVFlattenedBytePage() throws IOException {
-    throw new UnsupportedOperationException("invalid data type: " + dataType);
+
+  @Override public byte[] getComplexChildrenLVFlattenedBytePage() {
+    byte[] data = new byte[totalLength];
+    int numberOfRows = getEndLoop();
+    int destOffset = 0;
+    for (int i = 0; i < numberOfRows; i++) {
--- End diff --

Fixed.

---
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2529#discussion_r203783265

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/SafeFixLengthColumnPage.java ---
@@ -431,6 +439,15 @@ private void ensureArraySize(int requestSize, DataType dataType) {
         System.arraycopy(doubleData, 0, newArray, 0, arrayElementCount);
         doubleData = newArray;
       }
+    } else if (dataType == DataTypes.BYTE_ARRAY) {
--- End diff --

Fixed.

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2529

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7321/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2529

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7332/

---
Github user brijoobopanna commented on the issue:
https://github.com/apache/carbondata/pull/2529

retest this please

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2529

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6097/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2529

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7338/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2529

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6102/

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2529

SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5929/

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2529

SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5932/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2529

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7366/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2529

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6127/

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2529

SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5940/

---