[GitHub] carbondata pull request #2819: [CARBONDATA-3012] Added support for full scan...

Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2819#discussion_r226863656
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/chunk/reader/dimension/v3/CompressedDimensionChunkFileBasedReaderV3.java ---
    @@ -221,49 +229,66 @@ protected DimensionRawColumnChunk getDimensionRawColumnChunk(FileReader fileRead
         int offset = (int) rawColumnPage.getOffSet() + dimensionChunksLength
             .get(rawColumnPage.getColumnIndex()) + dataChunk3.getPage_offset().get(pageNumber);
         // first read the data and uncompressed it
    -    return decodeDimension(rawColumnPage, rawData, pageMetadata, offset);
    +    return decodeDimension(rawColumnPage, rawData, pageMetadata, offset, vectorInfo);
    +  }
    +
    +  @Override
    +  public void decodeColumnPageAndFillVector(DimensionRawColumnChunk dimensionRawColumnChunk,
    +      int pageNumber, ColumnVectorInfo vectorInfo) throws IOException, MemoryException {
    +    DimensionColumnPage columnPage =
    +        decodeColumnPage(dimensionRawColumnChunk, pageNumber, vectorInfo);
    +    columnPage.freeMemory();
       }
     
    -  private ColumnPage decodeDimensionByMeta(DataChunk2 pageMetadata,
    -      ByteBuffer pageData, int offset, boolean isLocalDictEncodedPage)
    +  private ColumnPage decodeDimensionByMeta(DataChunk2 pageMetadata, ByteBuffer pageData, int offset,
    +      boolean isLocalDictEncodedPage, ColumnVectorInfo vectorInfo, BitSet nullBitSet)
           throws IOException, MemoryException {
         List<Encoding> encodings = pageMetadata.getEncoders();
         List<ByteBuffer> encoderMetas = pageMetadata.getEncoder_meta();
         String compressorName = CarbonMetadataUtil.getCompressorNameFromChunkMeta(
             pageMetadata.getChunk_meta());
         ColumnPageDecoder decoder = encodingFactory.createDecoder(encodings, encoderMetas,
    -        compressorName);
    -    return decoder
    -        .decode(pageData.array(), offset, pageMetadata.data_page_length, isLocalDictEncodedPage);
    +        compressorName, vectorInfo != null);
    +    if (vectorInfo != null) {
    +      return decoder
    +          .decodeAndFillVector(pageData.array(), offset, pageMetadata.data_page_length, vectorInfo,
    +              nullBitSet, isLocalDictEncodedPage);
    +    } else {
    +      return decoder
    +          .decode(pageData.array(), offset, pageMetadata.data_page_length, isLocalDictEncodedPage);
    +    }
       }
     
       protected DimensionColumnPage decodeDimension(DimensionRawColumnChunk rawColumnPage,
    -      ByteBuffer pageData, DataChunk2 pageMetadata, int offset)
    +      ByteBuffer pageData, DataChunk2 pageMetadata, int offset, ColumnVectorInfo vectorInfo)
           throws IOException, MemoryException {
         List<Encoding> encodings = pageMetadata.getEncoders();
         if (CarbonUtil.isEncodedWithMeta(encodings)) {
    -      ColumnPage decodedPage = decodeDimensionByMeta(pageMetadata, pageData, offset,
    -          null != rawColumnPage.getLocalDictionary());
    -      decodedPage.setNullBits(QueryUtil.getNullBitSet(pageMetadata.presence, this.compressor));
           int[] invertedIndexes = new int[0];
           int[] invertedIndexesReverse = new int[0];
           // in case of no dictionary measure data types, if it is included in sort columns
           // then inverted index to be uncompressed
    +      boolean isExplicitSorted =
    +          CarbonUtil.hasEncoding(pageMetadata.encoders, Encoding.INVERTED_INDEX);
    +      int dataOffset = offset;
           if (encodings.contains(Encoding.INVERTED_INDEX)) {
    --- End diff --
   
    ok


---
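The hunk quoted above dispatches on `vectorInfo`: when a vector is supplied, the decoder fills it directly via `decodeAndFillVector`; otherwise it materializes a `ColumnPage` through the row-wise `decode` path. A minimal, hypothetical sketch of that dispatch pattern (the `Decoder` and `PlainDecoder` names below are illustrative, not the CarbonData API):

```java
public class DecodeDispatchDemo {
    interface Decoder {
        int[] decode(byte[] data);                      // materialize a page
        void decodeAndFill(byte[] data, int[] vector);  // write into a vector
    }

    // Trivial decoder: each encoded byte becomes one int value.
    static class PlainDecoder implements Decoder {
        public int[] decode(byte[] data) {
            int[] out = new int[data.length];
            for (int i = 0; i < data.length; i++) out[i] = data[i];
            return out;
        }
        public void decodeAndFill(byte[] data, int[] vector) {
            for (int i = 0; i < data.length; i++) vector[i] = data[i];
        }
    }

    // Mirrors the `vectorInfo != null` branch in the hunk above.
    public static int[] decodePage(Decoder d, byte[] data, int[] vectorOrNull) {
        if (vectorOrNull != null) {
            d.decodeAndFill(data, vectorOrNull);
            return null; // nothing to materialize; data is already in the vector
        }
        return d.decode(data);
    }

    public static void main(String[] args) {
        byte[] page = {1, 2, 3};
        int[] vector = new int[3];
        decodePage(new PlainDecoder(), page, vector);             // vectorized path
        int[] rows = decodePage(new PlainDecoder(), page, null);  // row-wise path
        System.out.println(vector[2] + " " + rows[2]);
    }
}
```

Skipping the intermediate page on the vectorized path is the point of the change: decoded values land straight in the consumer's vector instead of being copied through a `ColumnPage` first.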
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2819#discussion_r226863659
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/chunk/reader/dimension/v3/CompressedDimensionChunkFileBasedReaderV3.java ---
    @@ -221,49 +229,66 @@ protected DimensionRawColumnChunk getDimensionRawColumnChunk(FileReader fileRead
         int offset = (int) rawColumnPage.getOffSet() + dimensionChunksLength
             .get(rawColumnPage.getColumnIndex()) + dataChunk3.getPage_offset().get(pageNumber);
         // first read the data and uncompressed it
    -    return decodeDimension(rawColumnPage, rawData, pageMetadata, offset);
    +    return decodeDimension(rawColumnPage, rawData, pageMetadata, offset, vectorInfo);
    +  }
    +
    +  @Override
    +  public void decodeColumnPageAndFillVector(DimensionRawColumnChunk dimensionRawColumnChunk,
    +      int pageNumber, ColumnVectorInfo vectorInfo) throws IOException, MemoryException {
    +    DimensionColumnPage columnPage =
    +        decodeColumnPage(dimensionRawColumnChunk, pageNumber, vectorInfo);
    +    columnPage.freeMemory();
       }
     
    -  private ColumnPage decodeDimensionByMeta(DataChunk2 pageMetadata,
    -      ByteBuffer pageData, int offset, boolean isLocalDictEncodedPage)
    +  private ColumnPage decodeDimensionByMeta(DataChunk2 pageMetadata, ByteBuffer pageData, int offset,
    +      boolean isLocalDictEncodedPage, ColumnVectorInfo vectorInfo, BitSet nullBitSet)
           throws IOException, MemoryException {
         List<Encoding> encodings = pageMetadata.getEncoders();
         List<ByteBuffer> encoderMetas = pageMetadata.getEncoder_meta();
         String compressorName = CarbonMetadataUtil.getCompressorNameFromChunkMeta(
             pageMetadata.getChunk_meta());
         ColumnPageDecoder decoder = encodingFactory.createDecoder(encodings, encoderMetas,
    -        compressorName);
    -    return decoder
    -        .decode(pageData.array(), offset, pageMetadata.data_page_length, isLocalDictEncodedPage);
    +        compressorName, vectorInfo != null);
    +    if (vectorInfo != null) {
    +      return decoder
    +          .decodeAndFillVector(pageData.array(), offset, pageMetadata.data_page_length, vectorInfo,
    +              nullBitSet, isLocalDictEncodedPage);
    +    } else {
    +      return decoder
    +          .decode(pageData.array(), offset, pageMetadata.data_page_length, isLocalDictEncodedPage);
    +    }
       }
     
       protected DimensionColumnPage decodeDimension(DimensionRawColumnChunk rawColumnPage,
    -      ByteBuffer pageData, DataChunk2 pageMetadata, int offset)
    +      ByteBuffer pageData, DataChunk2 pageMetadata, int offset, ColumnVectorInfo vectorInfo)
           throws IOException, MemoryException {
         List<Encoding> encodings = pageMetadata.getEncoders();
         if (CarbonUtil.isEncodedWithMeta(encodings)) {
    -      ColumnPage decodedPage = decodeDimensionByMeta(pageMetadata, pageData, offset,
    -          null != rawColumnPage.getLocalDictionary());
    -      decodedPage.setNullBits(QueryUtil.getNullBitSet(pageMetadata.presence, this.compressor));
           int[] invertedIndexes = new int[0];
           int[] invertedIndexesReverse = new int[0];
           // in case of no dictionary measure data types, if it is included in sort columns
           // then inverted index to be uncompressed
    +      boolean isExplicitSorted =
    +          CarbonUtil.hasEncoding(pageMetadata.encoders, Encoding.INVERTED_INDEX);
    +      int dataOffset = offset;
           if (encodings.contains(Encoding.INVERTED_INDEX)) {
             offset += pageMetadata.data_page_length;
    -        if (CarbonUtil.hasEncoding(pageMetadata.encoders, Encoding.INVERTED_INDEX)) {
    +        if (isExplicitSorted) {
    --- End diff --
   
    ok


---
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2819#discussion_r226863664
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java ---
    @@ -633,6 +622,56 @@ public boolean getBoolean(int rowId) {
        */
       public abstract double getDouble(int rowId);
     
    +
    +
    +
    +
    +  /**
    +   * Get byte value at rowId
    +   */
    +  public abstract byte[] getByteData();
    --- End diff --
   
    removed


---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2819
 
    Build Failed with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/885/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2819
 
    Build Failed with Spark 2.3.1. Please check CI: http://136.243.101.176:8080/job/carbondataprbuilder2.3/9150/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2819
 
    Build Failed with Spark 2.2.1. Please check CI: http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1083/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2819
 
    Build Failed with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/887/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2819
 
    Build Failed with Spark 2.3.1. Please check CI: http://136.243.101.176:8080/job/carbondataprbuilder2.3/9152/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2819
 
    Build Failed with Spark 2.2.1. Please check CI: http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1085/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2819
 
    Build Success with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/900/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2819
 
    Build Success with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/904/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2819
 
    Build Success with Spark 2.2.1. Please check CI: http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1098/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2819
 
    Build Success with Spark 2.3.1. Please check CI: http://136.243.101.176:8080/job/carbondataprbuilder2.3/9169/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2819
 
    Build Success with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/911/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2819
 
    Build Success with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/918/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2819
 
    Build Success with Spark 2.2.1. Please check CI: http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1118/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2819
 
    Build Success with Spark 2.3.1. Please check CI: http://136.243.101.176:8080/job/carbondataprbuilder2.3/9178/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2819
 
    Build Success with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/930/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2819
 
    Build Success with Spark 2.2.1. Please check CI: http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1128/



---
Github user kunal642 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2819#discussion_r227027472
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/adaptive/AdaptiveIntegralCodec.java ---
    @@ -248,6 +266,136 @@ public double decodeDouble(float value) {
         public double decodeDouble(double value) {
           throw new RuntimeException("internal error: " + debugInfo());
         }
    +
    +    @Override
    +    public void decodeAndFillVector(ColumnPage columnPage, ColumnVectorInfo vectorInfo) {
    +      CarbonColumnVector vector = vectorInfo.vector;
    +      BitSet nullBits = columnPage.getNullBits();
    +      DataType dataType = vector.getType();
    +      DataType type = columnPage.getDataType();
    +      int pageSize = columnPage.getPageSize();
    +      BitSet deletedRows = vectorInfo.deletedRows;
    +      fillVector(columnPage, vector, dataType, type, pageSize, vectorInfo);
    +      if (deletedRows == null || deletedRows.isEmpty()) {
    +        for (int i = nullBits.nextSetBit(0); i >= 0; i = nullBits.nextSetBit(i + 1)) {
    +          vector.putNull(i);
    +        }
    +      }
    +    }
    +
    +    private void fillVector(ColumnPage columnPage, CarbonColumnVector vector, DataType dataType,
    --- End diff --
   
    For the Timestamp type,
    `vector.putLong(i, byteData[i] * 1000);` should be changed to `vector.putLong(i, (long) byteData[i] * 1000L);`; otherwise the multiplication overflows the int range and gives wrong results.
   
    Please handle the same for AdaptiveDeltaIntegralCodec.


---
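kunal642's point above is a classic Java numeric-promotion bug: `intValue * 1000` is evaluated in 32-bit arithmetic and only then widened to `long`, so any epoch-seconds value later than early 1970 wraps around when multiplied by 1000 as an `int`. A small standalone demonstration (the class and method names here are hypothetical, not CarbonData code):

```java
// Demonstrates the overflow flagged in the review: widen to long BEFORE multiplying.
public class TimestampOverflowDemo {
    // Buggy form: the multiply happens in int, overflows, then sign-extends.
    public static long decodeWrong(int epochSeconds) {
        return epochSeconds * 1000;
    }

    // Fixed form from the review: widen first so the multiply happens in 64-bit.
    public static long decodeRight(int epochSeconds) {
        return (long) epochSeconds * 1000L;
    }

    public static void main(String[] args) {
        int seconds = 1540000000; // roughly 2018-10-20 in epoch seconds
        System.out.println(decodeWrong(seconds));  // -1893259264 (wrapped int)
        System.out.println(decodeRight(seconds));  // 1540000000000
    }
}
```

Either operand being `long` forces the whole multiplication into 64-bit arithmetic, which is why the suggested fix casts before multiplying rather than after.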