[GitHub] carbondata pull request #2978: [WIP] Added lazy load and direct vector fill ...

qiuchenjian-2
GitHub user ravipesala opened a pull request:

    https://github.com/apache/carbondata/pull/2978

    [WIP] Added lazy load and direct vector fill support to Presto

   
    This PR is on top of https://github.com/apache/carbondata/pull/2972
   
    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravipesala/incubator-carbondata presto-lazy-load

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2978.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2978
   
----
commit 801b7bf266f6c53cd7764de708327d1ba587d6c5
Author: ravipesala <ravi.pesala@...>
Date:   2018-12-03T12:57:33Z

    Fixed local dictionary in presto

commit 61fb9c7f103d6e3e59270f2ceae47e98dca8f356
Author: ravipesala <ravi.pesala@...>
Date:   2018-12-05T13:04:13Z

    Added integration for lazy column page and direct fill vector processing.

----
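
For readers unfamiliar with the term, "lazy load" in the PR title generally means that a column page is decoded only when a consumer first reads it. The thread does not show how the PR implements this, so the following is only a minimal sketch of the general idea. `LazyIntPage`, its `Supplier`-based decoder, and the load counter are hypothetical names for illustration, not CarbonData or Presto APIs.

```java
import java.util.function.Supplier;

/** Hypothetical lazily decoded column page; illustrative only, not a CarbonData class. */
final class LazyIntPage {
  private final Supplier<int[]> decoder; // assumed to perform the expensive decode
  private int[] data;                    // stays null until the first read
  private int loadCount;                 // counts decodes, kept only for the demo

  LazyIntPage(Supplier<int[]> decoder) {
    this.decoder = decoder;
  }

  /** Decodes the page on first access and serves later reads from memory. */
  int get(int rowId) {
    if (data == null) {
      data = decoder.get();
      loadCount++;
    }
    return data[rowId];
  }

  int loadCount() {
    return loadCount;
  }
}

class LazyIntPageDemo {
  public static void main(String[] args) {
    LazyIntPage page = new LazyIntPage(() -> new int[] {10, 20, 30});
    System.out.println("decodes before read: " + page.loadCount());
    System.out.println("row 1: " + page.get(1));
    System.out.println("row 2: " + page.get(2));
    System.out.println("decodes after reads: " + page.loadCount());
  }
}
```

A page that a query never touches is never decoded under this scheme, which is the usual motivation for the technique.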


---

[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1647/



---

[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9907/



---

[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1858/



---

[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    retest this please


---

[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1648/



---

[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9908/



---

[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1859/



---

[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    retest this please


---

[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1653/



---

[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9913/



---

[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1865/



---

[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1658/



---

[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1870/



---

[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9918/



---

[GitHub] carbondata pull request #2978: [CARBONDATA-3157] Added lazy load and direct ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2978#discussion_r240159318
 
    --- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/ColumnarVectorWrapperDirect.java ---
    @@ -0,0 +1,321 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.presto;
    +
    +import java.math.BigDecimal;
    +import java.util.BitSet;
    +
    +import org.apache.carbondata.core.metadata.datatype.DataType;
    +import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
    +import org.apache.carbondata.core.scan.result.vector.CarbonDictionary;
    +import org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
    +import org.apache.carbondata.core.scan.result.vector.impl.directread.SequentialFill;
    +import org.apache.carbondata.core.scan.scanner.LazyPageLoader;
    +
    +/**
    + * Fills the vector directly without considering any deleted rows.
    + */
    +class ColumnarVectorWrapperDirect implements CarbonColumnVector,SequentialFill {
    +
    +
    +  /**
    +   * It is adapter class of complete ColumnarBatch.
    +   */
    +  protected CarbonColumnVectorImpl columnVector;
    +
    +  private DataType blockDataType;
    +
    +  private CarbonColumnVector dictionaryVector;
    +
    +  private BitSet nullBitset;
    +
    +  ColumnarVectorWrapperDirect(CarbonColumnVectorImpl columnVector) {
    +    this.columnVector = columnVector;
    +    this.dictionaryVector = columnVector.getDictionaryVector();
    +    this.nullBitset = new BitSet();
    +  }
    +
    +  @Override public void setNullBits(BitSet nullBits) {
    +    this.nullBitset = nullBits;
    +  }
    +
    +  @Override public void putBoolean(int rowId, boolean value) {
    +    if (nullBitset.get(rowId)) {
    +      columnVector.putNull(rowId);
    +    } else {
    +      columnVector.putBoolean(rowId, value);
    +    }
    +  }
    +
    +  @Override public void putFloat(int rowId, float value) {
    +    if (nullBitset.get(rowId)) {
    +      columnVector.putNull(rowId);
    +    } else {
    +      columnVector.putFloat(rowId, value);
    +    }
    +  }
    +
    +  @Override public void putShort(int rowId, short value) {
    +    if (nullBitset.get(rowId)) {
    +      columnVector.putNull(rowId);
    +    } else {
    +      columnVector.putShort(rowId, value);
    +    }
    +  }
    +
    +  @Override public void putShorts(int rowId, int count, short value) {
    +    for (int i = 0; i < count; i++) {
    +      if (nullBitset.get(rowId)) {
    +        columnVector.putNull(rowId);
    +      } else {
    +        columnVector.putShort(rowId, value);
    +      }
    +      rowId++;
    +    }
    +
    +  }
    +
    +  @Override public void putInt(int rowId, int value) {
    +    if (nullBitset.get(rowId)) {
    +      columnVector.putNull(rowId);
    +    } else {
    +      columnVector.putInt(rowId, value);
    +    }
    +  }
    +
    +  @Override public void putInts(int rowId, int count, int value) {
    +    columnVector.putInts(rowId, count, value);
    +  }
    +
    +  @Override public void putLong(int rowId, long value) {
    +    if (nullBitset.get(rowId)) {
    +      columnVector.putNull(rowId);
    +    } else {
    +      columnVector.putLong(rowId, value);
    +    }
    +  }
    +
    +  @Override public void putLongs(int rowId, int count, long value) {
    +    columnVector.putLongs(rowId, count, value);
    +  }
    +
    +  @Override public void putDecimal(int rowId, BigDecimal value, int precision) {
    +    if (nullBitset.get(rowId)) {
    +      columnVector.putNull(rowId);
    +    } else {
    +      columnVector.putDecimal(rowId, value, precision);
    +    }
    +  }
    +
    +  @Override public void putDecimals(int rowId, int count, BigDecimal value, int precision) {
    +    for (int i = 0; i < count; i++) {
    +      if (nullBitset.get(rowId)) {
    +        columnVector.putNull(rowId);
    +      } else {
    +        columnVector.putDecimal(rowId, value, precision);
    +      }
    +      rowId++;
    +    }
    +  }
    +
    +  @Override public void putDouble(int rowId, double value) {
    +    if (nullBitset.get(rowId)) {
    +      columnVector.putNull(rowId);
    +    } else {
    +      columnVector.putDouble(rowId, value);
    +    }
    +  }
    +
    +  @Override public void putDoubles(int rowId, int count, double value) {
    +    columnVector.putDoubles(rowId, count, value);
    +  }
    +
    +  @Override public void putByteArray(int rowId, byte[] value) {
    +    if (nullBitset.get(rowId)) {
    +      columnVector.putNull(rowId);
    +    } else {
    +      columnVector.putByteArray(rowId, value);
    +    }
    +  }
    +
    +  @Override
    +  public void putBytes(int rowId, int count, byte[] value) {
    +    for (int i = 0; i < count; i++) {
    +      if (nullBitset.get(rowId)) {
    +        columnVector.putNull(rowId);
    +      } else {
    +        columnVector.putByteArray(rowId, value);
    +      }
    +      rowId++;
    +    }
    +  }
    +
    +  @Override public void putByteArray(int rowId, int offset, int length, byte[] value) {
    +    if (nullBitset.get(rowId)) {
    +      columnVector.putNull(rowId);
    +    } else {
    +      columnVector.putByteArray(rowId, offset, length, value);
    +    }
    +  }
    +
    +  @Override public void putNull(int rowId) {
    +    columnVector.putNull(rowId);
    +  }
    +
    +  @Override public void putNulls(int rowId, int count) {
    +    columnVector.putNulls(rowId, count);
    +  }
    +
    +  @Override public void putNotNull(int rowId) {
    +    columnVector.putNotNull(rowId);
    +  }
    +
    +  @Override public void putNotNull(int rowId, int count) {
    +//    columnVector.putNotNulls(rowId, count);
    --- End diff --
   
    remove this commented line
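
The quoted diff repeats one pattern in most of its `put*` methods: check a null bitset for the row, then write either a null or the value to the wrapped vector. A minimal, self-contained sketch of that pattern is given here for reference. `SimpleIntVector` and `NullAwareIntWriter` are hypothetical stand-ins, not the PR's classes, and the bulk method follows the per-row check used by `putShorts` in the diff rather than the direct delegation used by `putInts`.

```java
import java.util.BitSet;

/** Hypothetical target vector; stands in for the wrapped column vector in the diff. */
final class SimpleIntVector {
  private final int[] values;
  private final boolean[] nulls;

  SimpleIntVector(int capacity) {
    this.values = new int[capacity];
    this.nulls = new boolean[capacity];
  }

  void putInt(int rowId, int value) {
    values[rowId] = value;
    nulls[rowId] = false;
  }

  void putNull(int rowId) {
    nulls[rowId] = true;
  }

  boolean isNull(int rowId) {
    return nulls[rowId];
  }

  int getInt(int rowId) {
    return values[rowId];
  }
}

/** Hypothetical wrapper that consults a null bitset before every write. */
final class NullAwareIntWriter {
  private final SimpleIntVector target;
  private BitSet nullBits = new BitSet();

  NullAwareIntWriter(SimpleIntVector target) {
    this.target = target;
  }

  void setNullBits(BitSet nullBits) {
    this.nullBits = nullBits;
  }

  /** Writes a null if the row is flagged in the bitset, otherwise the value. */
  void putInt(int rowId, int value) {
    if (nullBits.get(rowId)) {
      target.putNull(rowId);
    } else {
      target.putInt(rowId, value);
    }
  }

  /** Repeats one value over count rows, checking the bitset for each row. */
  void putInts(int rowId, int count, int value) {
    for (int i = 0; i < count; i++) {
      putInt(rowId + i, value);
    }
  }
}

class NullAwareWriterDemo {
  public static void main(String[] args) {
    SimpleIntVector vector = new SimpleIntVector(4);
    NullAwareIntWriter writer = new NullAwareIntWriter(vector);
    BitSet nulls = new BitSet();
    nulls.set(2); // mark row 2 as null
    writer.setNullBits(nulls);
    writer.putInts(0, 4, 7);
    for (int row = 0; row < 4; row++) {
      String shown = vector.isNull(row) ? "null" : String.valueOf(vector.getInt(row));
      System.out.println(row + " -> " + shown);
    }
  }
}
```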


---

[GitHub] carbondata pull request #2978: [CARBONDATA-3157] Added lazy load and direct ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2978#discussion_r240204662
 
    ok


---

[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1682/



---

[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1684/



---

[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2978
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1688/



---