[GitHub] carbondata pull request #3001: [Presto][Streaming] support presto read strea...

classic Classic list List threaded Threaded
67 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #3001: [Presto][Streaming] support presto read strea...

qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3001#discussion_r244093200
 
    --- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/CarbonVectorBatch.java ---
    @@ -95,6 +95,9 @@ private CarbonColumnVectorImpl createDirectStreamReader(int batchSize, DataType
         } else if (dataType == DataTypes.STRING) {
           return new SliceStreamReader(batchSize, field.getDataType(), dictionaryBlock);
         } else if (DataTypes.isDecimal(dataType)) {
    +      if (dictionary != null && dataType instanceof DecimalType) {
    +        return new DecimalSliceStreamReader(batchSize, (DecimalType) dataType, dictionary);
    +      }
    --- End diff --
   
    I am not sure why this change is required? Can you explain?


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #3001: [Presto][Streaming] support presto read strea...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3001#discussion_r244095118
 
    --- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataPageSourceProvider.java ---
    @@ -79,13 +80,31 @@
       @Override
       public ConnectorPageSource createPageSource(ConnectorTransactionHandle transactionHandle,
           ConnectorSession session, ConnectorSplit split, List<ColumnHandle> columns) {
    -    this.queryId = ((CarbondataSplit)split).getQueryId();
    +    CarbondataSplit carbondataSplit = (CarbondataSplit) split;
    --- End diff --
   
    Not expecting any changes in presto implementations, Changes should be only present in `PrestoCarbonVectorizedRecordReader` or `org.apache.carbondata.presto.CarbonVectorBatch`. Because these are reader classes why changes needed here.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #3001: [Presto][Streaming] support presto read strea...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3001#discussion_r244095288
 
    --- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/readers/BooleanStreamReader.java ---
    @@ -86,4 +86,15 @@ public BooleanStreamReader(int batchSize, DataType dataType, Dictionary dictiona
         builder = type.createBlockBuilder(null, batchSize);
       }
     
    +  @Override public void putObject(int rowId, Object value) {
    --- End diff --
   
    Already it has methods to handle put why added methods again in all readers. I don't think these changes required in all readers


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #3001: [Presto][Streaming] support presto read strea...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3001#discussion_r244095320
 
    --- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/stream/CarbondataStreamPageSource.java ---
    @@ -0,0 +1,243 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.presto.stream;
    --- End diff --
   
    Not expecting any changes in presto implementations, Changes should be only present in `PrestoCarbonVectorizedRecordReader` or `org.apache.carbondata.presto.CarbonVectorBatch`. Because these are reader classes why changes needed here.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #3001: [Presto][Streaming] support presto read strea...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3001#discussion_r244095423
 
    --- Diff: integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoAllDataTypeLocalDictTest.scala ---
    @@ -38,48 +38,16 @@ class PrestoAllDataTypeLocalDictTest extends FunSuiteLike with BeforeAndAfterAll
     
       private val rootPath = new File(this.getClass.getResource("/").getPath
                                       + "../../../..").getCanonicalPath
    -  private val storePath = s"$rootPath/integration/presto/target/store"
    -  private val systemPath = s"$rootPath/integration/presto/target/system"
    +  private val storePath = rootPath + "/examples/spark2/target/store"
    --- End diff --
   
    Why removed all existing testcases?


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #3001: [Presto][Streaming] support presto read streaming ta...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3001
 
    Build Failed  with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10214/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #3001: [Presto][Streaming] support presto read streaming ta...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3001
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2235/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3001
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2118/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3001
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2324/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3001
 
    Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10373/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3001
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2170/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3001
 
    Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10425/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3001
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2385/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #3001: [CARBONDATA-3220] Support presto to read stre...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user QiangCai commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3001#discussion_r245903277
 
    --- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/CarbonVectorBatch.java ---
    @@ -95,6 +95,9 @@ private CarbonColumnVectorImpl createDirectStreamReader(int batchSize, DataType
         } else if (dataType == DataTypes.STRING) {
           return new SliceStreamReader(batchSize, field.getDataType(), dictionaryBlock);
         } else if (DataTypes.isDecimal(dataType)) {
    +      if (dictionary != null && dataType instanceof DecimalType) {
    --- End diff --
   
    when the decimal column uses dictionary encoding,  it can't work fine.
    the method field.getDataType() will return IntegerType, It is not Decimal type.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user QiangCai commented on the issue:

    https://github.com/apache/carbondata/pull/3001
 
    > @QiangCai Can you explain the dependency of readers after this PR? Ideally, presto module should depends on streaming reader in core or hadoop modules.
   
    yes, I removed the reader into hadoop module.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #3001: [CARBONDATA-3220] Support presto to read stre...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user QiangCai commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3001#discussion_r245912249
 
    --- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/CarbonVectorBatch.java ---
    @@ -95,6 +95,9 @@ private CarbonColumnVectorImpl createDirectStreamReader(int batchSize, DataType
         } else if (dataType == DataTypes.STRING) {
           return new SliceStreamReader(batchSize, field.getDataType(), dictionaryBlock);
         } else if (DataTypes.isDecimal(dataType)) {
    +      if (dictionary != null && dataType instanceof DecimalType) {
    +        return new DecimalSliceStreamReader(batchSize, (DecimalType) dataType, dictionary);
    +      }
    --- End diff --
   
    when the decimal column uses dictionary encoding, it can't work fine.
    the method field.getDataType() will return IntegerType, It is not Decimal type.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #3001: [CARBONDATA-3220] Support presto to read stre...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user QiangCai commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3001#discussion_r245915288
 
    --- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataPageSourceProvider.java ---
    @@ -79,13 +80,31 @@
       @Override
       public ConnectorPageSource createPageSource(ConnectorTransactionHandle transactionHandle,
           ConnectorSession session, ConnectorSplit split, List<ColumnHandle> columns) {
    -    this.queryId = ((CarbondataSplit)split).getQueryId();
    +    CarbondataSplit carbondataSplit = (CarbondataSplit) split;
    --- End diff --
   
    already reverted the modification. but it is difficult to reuse PrestoCarbonVectorizedRecordReader for streaming file.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #3001: [CARBONDATA-3220] Support presto to read stre...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user QiangCai commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3001#discussion_r245916004
 
    --- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/readers/BooleanStreamReader.java ---
    @@ -86,4 +86,15 @@ public BooleanStreamReader(int batchSize, DataType dataType, Dictionary dictiona
         builder = type.createBlockBuilder(null, batchSize);
       }
     
    +  @Override public void putObject(int rowId, Object value) {
    --- End diff --
   
    It is only required by stream reader. But I don't want to create many new adaptive objects, it will become complex.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #3001: [CARBONDATA-3220] Support presto to read stre...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user QiangCai commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3001#discussion_r245916085
 
    --- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/stream/CarbondataStreamPageSource.java ---
    @@ -0,0 +1,243 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.presto.stream;
    --- End diff --
   
    Already removed it


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #3001: [CARBONDATA-3220] Support presto to read stre...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user QiangCai commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3001#discussion_r245916195
 
    --- Diff: integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoAllDataTypeLocalDictTest.scala ---
    @@ -38,48 +38,16 @@ class PrestoAllDataTypeLocalDictTest extends FunSuiteLike with BeforeAndAfterAll
     
       private val rootPath = new File(this.getClass.getResource("/").getPath
                                       + "../../../..").getCanonicalPath
    -  private val storePath = s"$rootPath/integration/presto/target/store"
    -  private val systemPath = s"$rootPath/integration/presto/target/system"
    +  private val storePath = rootPath + "/examples/spark2/target/store"
    --- End diff --
   
    already reverted it


---
1234