[GitHub] carbondata pull request #2607: Presto Upgrade to 0.206

classic Classic list List threaded Threaded
24 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2607: Presto Upgrade to 0.206

qiuchenjian-2
GitHub user bhavya411 opened a pull request:

    https://github.com/apache/carbondata/pull/2607

    Presto Upgrade to 0.206

    This PR upgraded the Presto version from 0.187 to 0.206, the CarbondataSplitManager has been changed to accommodate the SplitSchedulingStrategy, also the SliceArrayBlock has now been removed from this version so we have to add the VariableWidthBlock for Dictionary. BlockBuilderStatus access has also been changed to protected instead of public so we have to pass null. Also, now the implementation of Stream Readers uses Block.
   
     - [ Y] Any interfaces changed?
     
     - [ Y] Any backward compatibility impacted?
     
     - [ N] Document update required?
   
     - [ Y] Testing was done
    All Unit test cases are passing and have run queries manually to test them.
           
   
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bhavya411/incubator-carbondata CARBONDATA-2818

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2607.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2607
   
----
commit a1b698aa58a714ebb21dfcc01b5bd2d610378649
Author: Bhavya <bhavya@...>
Date:   2018-08-03T08:10:14Z

    Presto Upgrade to 0.206

----


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7774/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6498/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2607
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6157/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2607#discussion_r207771253
 
    --- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/CarbonVectorBatch.java ---
    @@ -73,7 +73,7 @@ public static CarbonVectorBatch allocate(StructField[] schema,
       }
     
       private CarbonColumnVectorImpl createDirectStreamReader(int batchSize, DataType dataType,
    -      StructField field, Dictionary dictionary, SliceArrayBlock dictionarySliceArrayBlock) {
    +      StructField field, Dictionary dictionary, Block dictionarySliceArrayBlock) {
    --- End diff --
   
    better to modify the names also for readability and maintainability purpose.  
    dictionarySliceArrayBlock --> dictionaryBlock


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2607#discussion_r207774205
 
    --- Diff: integration/presto/src/main/scala/org/apache/carbondata/presto/CarbonDictionaryDecodeReadSupport.scala ---
    @@ -84,25 +85,31 @@ class CarbonDictionaryDecodeReadSupport[T] extends CarbonReadSupport[T] {
        * @param dictionaryData
        * @return
        */
    -  private def createSliceArrayBlock(dictionaryData: Dictionary): SliceArrayBlock = {
    +  private def createSliceArrayBlock(dictionaryData: Dictionary): Block = {
         val chunks: DictionaryChunksWrapper = dictionaryData.getDictionaryChunks
    -    val sliceArray = new Array[Slice](chunks.getSize + 1)
    -    // Initialize Slice Array with Empty Slice as per Presto's code
    -    sliceArray(0) = Slices.EMPTY_SLICE
    -    var count = 1
    +    val positionCount = chunks.getSize;
    +    val offsetVector : Array[Int] = new Array[Int](positionCount + 2 )
    +    val isNullVector: Array[Boolean] = new Array[Boolean](positionCount + 1)
    +    isNullVector(0) = true
    +    isNullVector(1) = true
    --- End diff --
   
    Can you please add a comment about this logic. It is hard to understand.
    Why **isNullVector** is not filled inside while() if content is null.
    Also why first 3 elements of offset vector is set to 0 ?


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user bhavya411 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2607#discussion_r207804289
 
    --- Diff: integration/presto/src/main/scala/org/apache/carbondata/presto/CarbonDictionaryDecodeReadSupport.scala ---
    @@ -84,25 +85,31 @@ class CarbonDictionaryDecodeReadSupport[T] extends CarbonReadSupport[T] {
        * @param dictionaryData
        * @return
        */
    -  private def createSliceArrayBlock(dictionaryData: Dictionary): SliceArrayBlock = {
    +  private def createSliceArrayBlock(dictionaryData: Dictionary): Block = {
         val chunks: DictionaryChunksWrapper = dictionaryData.getDictionaryChunks
    -    val sliceArray = new Array[Slice](chunks.getSize + 1)
    -    // Initialize Slice Array with Empty Slice as per Presto's code
    -    sliceArray(0) = Slices.EMPTY_SLICE
    -    var count = 1
    +    val positionCount = chunks.getSize;
    +    val offsetVector : Array[Int] = new Array[Int](positionCount + 2 )
    +    val isNullVector: Array[Boolean] = new Array[Boolean](positionCount + 1)
    +    isNullVector(0) = true
    +    isNullVector(1) = true
    --- End diff --
   
    We are talking about dictionary here , so In dictionary there will be only one null and the key value will be 1 by default in CarbonData, hence the isNullVector will be populated only once with null value it has no bearing on actual data. The Carbondata key starts from 1 so we need a filler at 0th position and 1 index is actually Null to map to carbondata null values . The offset index will be like 0th Position  ->  0  (As it is filler)
    1st Position -> 0 (For actual Null)
    2nd Postion -> 0 as the byte[] is still null so starting point will be 0 only


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2607#discussion_r207812426
 
    --- Diff: integration/presto/src/main/scala/org/apache/carbondata/presto/CarbonDictionaryDecodeReadSupport.scala ---
    @@ -84,25 +85,31 @@ class CarbonDictionaryDecodeReadSupport[T] extends CarbonReadSupport[T] {
        * @param dictionaryData
        * @return
        */
    -  private def createSliceArrayBlock(dictionaryData: Dictionary): SliceArrayBlock = {
    +  private def createSliceArrayBlock(dictionaryData: Dictionary): Block = {
         val chunks: DictionaryChunksWrapper = dictionaryData.getDictionaryChunks
    -    val sliceArray = new Array[Slice](chunks.getSize + 1)
    -    // Initialize Slice Array with Empty Slice as per Presto's code
    -    sliceArray(0) = Slices.EMPTY_SLICE
    -    var count = 1
    +    val positionCount = chunks.getSize;
    +    val offsetVector : Array[Int] = new Array[Int](positionCount + 2 )
    +    val isNullVector: Array[Boolean] = new Array[Boolean](positionCount + 1)
    +    isNullVector(0) = true
    +    isNullVector(1) = true
    --- End diff --
   
    ok.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2607
 
    retest this please


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7891/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6615/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2607
 
    @bhavya411  I tested this pr,  the performance(simple aggregation) not getting any improvement(0.206 compare to 0.187)
   
    Just i checked 0.207 and 0.208, there are fixed many memory issues, so propose to upgrade to 0.208 for CarbonData integration.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2607
 
    @bhavya411 any new progress ?


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/32/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/224/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
 
    Build Failed  with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8463/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/393/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2607
 
    retest this please


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/333/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/510/



---
12