[GitHub] carbondata pull request #2529: [WIP] Reduce Memory footprint and store size ...

qiuchenjian-2
GitHub user kumarvishal09 opened a pull request:

    https://github.com/apache/carbondata/pull/2529

    [WIP] Reduce Memory footprint and store size for local dictionary encoded columns

    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kumarvishal09/incubator-carbondata localdictperformance1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2529.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2529
   
----
commit 1d4891551d66a6cf11783da01687642f3ee10d89
Author: kumarvishal09 <kumarvishal1802@...>
Date:   2018-07-19T10:22:34Z

    fixed performance issue

commit 97673c7f6b0f8ead1d4725f704c3b861e8e45af8
Author: kumarvishal09 <kumarvishal1802@...>
Date:   2018-07-19T13:44:16Z

    fixed performance issue

----


---

[GitHub] carbondata pull request #2529: [WIP] Reduce Memory footprint and store size ...

qiuchenjian-2
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2529#discussion_r203750816
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/SafeFixLengthColumnPage.java ---
    @@ -431,6 +439,15 @@ private void ensureArraySize(int requestSize, DataType dataType) {
             System.arraycopy(doubleData, 0, newArray, 0, arrayElementCount);
             doubleData = newArray;
           }
    +    } else if (dataType == DataTypes.BYTE_ARRAY) {
    --- End diff --
   
    Increasing by 16 is too low; it can be doubled, as in the array list case.
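
For illustration only, the growth policy suggested in this comment might look like the sketch below. It is not the code from the PR: the class `ByteArrayGrowthSketch` and its `ensureCapacity` helper are hypothetical names, and a plain `byte[][]` stands in for the page's backing array. Capacity doubles until the request fits, so repeated appends trigger far fewer reallocations than a fixed step of 16.

```java
import java.util.Arrays;

// Hypothetical sketch, not CarbonData code: grow a byte[][] backing array
// geometrically instead of by a fixed increment.
class ByteArrayGrowthSketch {

  // Returns an array with room for at least requestSize entries.
  // The input array is returned unchanged when it is already large enough.
  static byte[][] ensureCapacity(byte[][] data, int requestSize) {
    if (requestSize <= data.length) {
      return data;
    }
    int newCapacity = Math.max(data.length, 1);
    while (newCapacity < requestSize) {
      if (newCapacity > Integer.MAX_VALUE / 2) {
        // Doubling would overflow int, so fall back to the exact request.
        newCapacity = requestSize;
        break;
      }
      newCapacity *= 2;
    }
    // Existing row references are copied; new slots are null.
    return Arrays.copyOf(data, newCapacity);
  }

  public static void main(String[] args) {
    byte[][] page = new byte[16][];
    page = ensureCapacity(page, 17);
    System.out.println(page.length); // prints 32
  }
}
```

With doubling, appending n rows copies O(n) references in total, whereas growing by a constant step copies O(n^2).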


---

[GitHub] carbondata pull request #2529: [WIP] Reduce Memory footprint and store size ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2529#discussion_r203753043
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java ---
    @@ -201,12 +201,18 @@ public void putDouble(int rowId, double value) {
     
       @Override
       public void putBytes(int rowId, byte[] bytes) {
    +    try {
    +      ensureMemory(eachRowSize);
    +    } catch (MemoryException e) {
    --- End diff --
   
    MemoryException can be a runtime exception.
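
As a rough sketch of what this comment proposes (assumed shapes only, not the project's actual classes): if the exception extends `RuntimeException`, callers such as `putBytes` need neither a `try`/`catch` nor a `throws` clause. The nested `MemoryException` and `Page` types below are hypothetical stand-ins written for this example.

```java
// Hypothetical sketch, not CarbonData code.
class UncheckedMemoryExceptionSketch {

  // Stand-in for MemoryException, declared unchecked for illustration.
  static class MemoryException extends RuntimeException {
    MemoryException(String message) {
      super(message);
    }
  }

  // Minimal page that tracks how many bytes of a fixed budget are used.
  static class Page {
    private final int capacityInBytes;
    private int usedBytes;

    Page(int capacityInBytes) {
      this.capacityInBytes = capacityInBytes;
    }

    // Throws the unchecked exception when the budget would be exceeded.
    private void ensureMemory(int requiredBytes) {
      if (usedBytes + requiredBytes > capacityInBytes) {
        throw new MemoryException("cannot reserve " + requiredBytes + " bytes");
      }
    }

    // No try/catch wrapper is required around ensureMemory here.
    void putBytes(byte[] bytes) {
      ensureMemory(bytes.length);
      usedBytes += bytes.length;
    }

    int usedBytes() {
      return usedBytes;
    }
  }

  public static void main(String[] args) {
    Page page = new Page(4);
    page.putBytes(new byte[4]);
    try {
      page.putBytes(new byte[1]);
    } catch (MemoryException e) {
      System.out.println("caught: " + e.getMessage());
    }
  }
}
```

The trade-off is that the compiler no longer forces callers to handle the failure, so it has to be documented and caught at a suitable outer level.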


---

[GitHub] carbondata pull request #2529: [WIP] Reduce Memory footprint and store size ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2529#discussion_r203754026
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java ---
    @@ -355,16 +361,33 @@ public BigDecimal getDecimal(int rowId) {
     
       @Override
       public byte[][] getByteArrayPage() {
    -    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +    byte[][] data = new byte[getEndLoop()][eachRowSize];
    +    long offset = baseOffset;
    +    for (int i = 0; i < data.length; i++) {
    +      //copy the row from memory block based on offset
    +      // offset position will be index * each column value length
    +      CarbonUnsafe.getUnsafe().copyMemory(memoryBlock.getBaseObject(), offset, data[i],
    +          CarbonUnsafe.BYTE_ARRAY_OFFSET, eachRowSize);
    +      offset += eachRowSize;
    +    }
    +    return data;
       }
     
       @Override
       public byte[] getLVFlattenedBytePage() {
         throw new UnsupportedOperationException("invalid data type: " + dataType);
       }
    -  @Override
    -  public byte[] getComplexChildrenLVFlattenedBytePage() throws IOException {
    -    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +
    +  @Override public byte[] getComplexChildrenLVFlattenedBytePage() {
    +    byte[] data = new byte[totalLength];
    +    int numberOfRows = getEndLoop();
    +    int destOffset = 0;
    +    for (int i = 0; i < numberOfRows; i++) {
    --- End diff --
   
    Get a single byte array directly.
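
One possible reading of this comment, sketched under assumptions: when fixed-length rows sit back to back in one memory region, the flattened page can be produced with a single bulk copy instead of a per-row loop. The example uses an on-heap `byte[]` as a stand-in for the unsafe memory block, and the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch, not CarbonData code: an on-heap byte[] stands in
// for the memory block that holds fixed-length rows contiguously.
class FlattenedPageSketch {

  // Per-row variant: one copy call for every row.
  static byte[] copyRowByRow(byte[] block, int baseOffset, int rowCount, int eachRowSize) {
    byte[] data = new byte[rowCount * eachRowSize];
    int srcOffset = baseOffset;
    int destOffset = 0;
    for (int i = 0; i < rowCount; i++) {
      System.arraycopy(block, srcOffset, data, destOffset, eachRowSize);
      srcOffset += eachRowSize;
      destOffset += eachRowSize;
    }
    return data;
  }

  // Bulk variant: the rows are contiguous, so one ranged copy is enough.
  static byte[] copyInOneCall(byte[] block, int baseOffset, int rowCount, int eachRowSize) {
    return Arrays.copyOfRange(block, baseOffset, baseOffset + rowCount * eachRowSize);
  }

  public static void main(String[] args) {
    byte[] block = {9, 9, 1, 2, 3, 4, 5, 6}; // two header bytes, then 3 rows of 2 bytes
    byte[] looped = copyRowByRow(block, 2, 3, 2);
    byte[] bulk = copyInOneCall(block, 2, 3, 2);
    System.out.println(Arrays.equals(looped, bulk)); // prints true
  }
}
```

Both variants return the same bytes; the bulk form only applies when every row has the same size and no padding separates rows.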


---

[GitHub] carbondata issue #2529: [WIP] Reduce Memory footprint and store size for loc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2529
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6088/



---

[GitHub] carbondata issue #2529: [WIP] Reduce Memory footprint and store size for loc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2529
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7327/



---

[GitHub] carbondata pull request #2529: [WIP] Reduce Memory footprint and store size ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2529#discussion_r203783172
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java ---
    @@ -201,12 +201,18 @@ public void putDouble(int rowId, double value) {
     
       @Override
       public void putBytes(int rowId, byte[] bytes) {
    +    try {
    +      ensureMemory(eachRowSize);
    +    } catch (MemoryException e) {
    --- End diff --
   
    OK, will handle this in a different PR.


---

[GitHub] carbondata pull request #2529: [WIP] Reduce Memory footprint and store size ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2529#discussion_r203783213
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java ---
    @@ -355,16 +361,33 @@ public BigDecimal getDecimal(int rowId) {
     
       @Override
       public byte[][] getByteArrayPage() {
    -    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +    byte[][] data = new byte[getEndLoop()][eachRowSize];
    +    long offset = baseOffset;
    +    for (int i = 0; i < data.length; i++) {
    +      //copy the row from memory block based on offset
    +      // offset position will be index * each column value length
    +      CarbonUnsafe.getUnsafe().copyMemory(memoryBlock.getBaseObject(), offset, data[i],
    +          CarbonUnsafe.BYTE_ARRAY_OFFSET, eachRowSize);
    +      offset += eachRowSize;
    +    }
    +    return data;
       }
     
       @Override
       public byte[] getLVFlattenedBytePage() {
         throw new UnsupportedOperationException("invalid data type: " + dataType);
       }
    -  @Override
    -  public byte[] getComplexChildrenLVFlattenedBytePage() throws IOException {
    -    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +
    +  @Override public byte[] getComplexChildrenLVFlattenedBytePage() {
    +    byte[] data = new byte[totalLength];
    +    int numberOfRows = getEndLoop();
    +    int destOffset = 0;
    +    for (int i = 0; i < numberOfRows; i++) {
    --- End diff --
   
    Fixed


---

[GitHub] carbondata pull request #2529: [WIP] Reduce Memory footprint and store size ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2529#discussion_r203783265
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/SafeFixLengthColumnPage.java ---
    @@ -431,6 +439,15 @@ private void ensureArraySize(int requestSize, DataType dataType) {
             System.arraycopy(doubleData, 0, newArray, 0, arrayElementCount);
             doubleData = newArray;
           }
    +    } else if (dataType == DataTypes.BYTE_ARRAY) {
    --- End diff --
   
    Fixed


---

[GitHub] carbondata issue #2529: [CARBONDATA-2760] Reduce Memory footprint and store ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2529
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7321/



---

[GitHub] carbondata issue #2529: [CARBONDATA-2760] Reduce Memory footprint and store ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2529
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7332/



---

[GitHub] carbondata issue #2529: [CARBONDATA-2760] Reduce Memory footprint and store ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user brijoobopanna commented on the issue:

    https://github.com/apache/carbondata/pull/2529
 
    retest this please


---

[GitHub] carbondata issue #2529: [CARBONDATA-2760] Reduce Memory footprint and store ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2529
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6097/



---

[GitHub] carbondata issue #2529: [CARBONDATA-2760] Reduce Memory footprint and store ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2529
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7338/



---

[GitHub] carbondata issue #2529: [CARBONDATA-2760] Reduce Memory footprint and store ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2529
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6102/



---

[GitHub] carbondata issue #2529: [CARBONDATA-2760] Reduce Memory footprint and store ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2529
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5929/



---

[GitHub] carbondata issue #2529: [CARBONDATA-2760] Reduce Memory footprint and store ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2529
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5932/



---

[GitHub] carbondata issue #2529: [CARBONDATA-2760] Reduce Memory footprint and store ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2529
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7366/



---

[GitHub] carbondata issue #2529: [CARBONDATA-2760] Reduce Memory footprint and store ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2529
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6127/



---

[GitHub] carbondata issue #2529: [CARBONDATA-2760] Reduce Memory footprint and store ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2529
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5940/



---