[GitHub] carbondata pull request #2779: [WIP] Upgrade spark integration version to 2....


qiuchenjian-2
GitHub user zzcclp opened a pull request:

    https://github.com/apache/carbondata/pull/2779

    [WIP] Upgrade spark integration version to 2.3.2

    Upgrade spark integration version to 2.3.2
   
    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zzcclp/carbondata wip_upgrade_to_spark2.3.2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2779.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2779
   
----
commit 586cf7b6a23fa5b110f3490f8123d1b15b30e4bc
Author: Zhang Zhichao <441586683@...>
Date:   2018-09-27T17:30:34Z

    [WIP] Upgrade spark integration version to 2.3.2
   
    Upgrade spark integration version to 2.3.2

----


---

[GitHub] carbondata pull request #2779: [WIP] Upgrade spark integration version to 2....

qiuchenjian-2
Github user zzcclp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2779#discussion_r221013145
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/spark/util/CarbonReflectionUtils.scala ---
    @@ -296,7 +296,7 @@ object CarbonReflectionUtils {
               classOf[LogicalPlan],
               classOf[Seq[Attribute]],
               classOf[SparkPlan])
    -      method.invoke(dataSourceObj, mode, query, query.output, physicalPlan)
    +      method.invoke(dataSourceObj, mode, query, query.output.map(_.name), physicalPlan)
    --- End diff --
   
    The parameters of the 'writeAndRead' method have changed; please see: [SPARK-PR#22346](https://github.com/apache/spark/pull/22346)
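
A minimal sketch of why only the argument passed to `invoke` changes in the diff above, while the `classOf[Seq[Attribute]]` lookup stays as it is: generic type arguments are erased at runtime, so the reflective lookup matches a `Seq[String]` parameter just as well. The names `Attr` and `FakeDataSource`, and the shortened two-parameter `writeAndRead`, are hypothetical stand-ins for illustration and are not Spark's or CarbonData's actual classes.

```scala
object WriteAndReadReflectionSketch {
  // Hypothetical stand-in for Spark's Attribute; only the name matters here.
  final case class Attr(name: String)

  // Hypothetical stand-in for the data source object. The real method takes
  // more parameters; this one keeps only the mode and the column names.
  class FakeDataSource {
    def writeAndRead(mode: String, outputColumnNames: Seq[String]): String =
      mode + ":" + outputColumnNames.mkString(",")
  }

  def invokeWriteAndRead(ds: FakeDataSource, mode: String, output: Seq[Attr]): String = {
    // Type arguments are erased, so classOf[Seq[Attr]] and classOf[Seq[String]]
    // denote the same runtime class and the lookup still succeeds.
    val method = ds.getClass.getMethod("writeAndRead", classOf[String], classOf[Seq[Attr]])
    // The argument itself must match what the method body expects: names, not attributes.
    method.invoke(ds, mode, output.map(_.name)).asInstanceOf[String]
  }

  def main(args: Array[String]): Unit = {
    val result = invokeWriteAndRead(new FakeDataSource, "Overwrite", Seq(Attr("id"), Attr("price")))
    println(result) // prints Overwrite:id,price
  }
}
```

Passing `output` directly would still get through the reflective call, because the check there is only against the erased `Seq` type, and would fail later inside the method when the elements are used as strings.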


---

[GitHub] carbondata pull request #2779: [WIP] Upgrade spark integration version to 2....

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user zzcclp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2779#discussion_r221014706
 
    --- Diff: integration/spark2/src/main/spark2.3/org/apache/spark/sql/execution/strategy/CarbonDataSourceScan.scala ---
    @@ -0,0 +1,55 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.sql.execution.strategy
    +
    +import org.apache.spark.rdd.RDD
    +import org.apache.spark.sql.catalyst.{InternalRow, TableIdentifier}
    +import org.apache.spark.sql.catalyst.expressions.{Attribute, SortOrder}
    +import org.apache.spark.sql.catalyst.plans.physical.Partitioning
    +import org.apache.spark.sql.execution.FileSourceScanExec
    +import org.apache.spark.sql.execution.datasources.{HadoopFsRelation, LogicalRelation}
    +
    +/**
    + *  Physical plan node for scanning data. It is applied for both tables
    + *  USING carbondata and STORED AS CARBONDATA.
    + */
    +class CarbonDataSourceScan(
    +    override val output: Seq[Attribute],
    +    val rdd: RDD[InternalRow],
    +    @transient override val relation: HadoopFsRelation,
    +    val partitioning: Partitioning,
    +    val md: Map[String, String],
    +    identifier: Option[TableIdentifier],
    +    @transient private val logicalRelation: LogicalRelation)
    +  extends FileSourceScanExec(
    +    relation,
    +    output,
    +    relation.dataSchema,
    +    Seq.empty,
    +    Seq.empty,
    +    identifier) {
    +
    +  override lazy val supportsBatch: Boolean = true
    +
    +  override lazy val (outputPartitioning, outputOrdering): (Partitioning, Seq[SortOrder]) =
    +    (partitioning, Nil)
    +
    +  override lazy val metadata: Map[String, String] = md
    --- End diff --
   
    The members (supportsBatch, outputPartitioning, outputOrdering, metadata) are now declared with the 'lazy' keyword in Spark, so the overrides here must be lazy as well; please see: [SPARK-PR#21815](https://github.com/apache/spark/pull/21815)
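
A minimal sketch of the Scala rule behind this change, assuming a base class whose members are `lazy val`s: a subclass can override them only with `lazy val`s, since the compiler rejects a strict `val` overriding a lazy one. `BaseScan` and `CustomScan` are hypothetical stand-ins, not the real `FileSourceScanExec` or `CarbonDataSourceScan`.

```scala
object LazyOverrideSketch {
  // Hypothetical stand-in for a base scan node whose members are lazy.
  class BaseScan {
    lazy val supportsBatch: Boolean = false
    lazy val metadata: Map[String, String] = Map.empty
  }

  class CustomScan(md: Map[String, String]) extends BaseScan {
    // These overrides must be lazy: dropping the keyword is a compile error,
    // because a strict val cannot override a lazy val.
    override lazy val supportsBatch: Boolean = true
    override lazy val metadata: Map[String, String] = md
  }

  def main(args: Array[String]): Unit = {
    val scan: BaseScan = new CustomScan(Map("Table" -> "t1"))
    println(scan.supportsBatch) // prints true
    println(scan.metadata)      // prints Map(Table -> t1)
  }
}
```

The rule also applies in the other direction (a lazy val cannot override a strict val), so a single source file cannot compile against both a base class with strict members and one with lazy members.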


---

[GitHub] carbondata issue #2779: [WIP] Upgrade spark integration version to 2.3.2

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/2779
 
    @jackylk @chenliang613 @sujith71955 please review.


---

[GitHub] carbondata issue #2779: [WIP] Upgrade spark integration version to 2.3.2

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2779
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/615/



---

[GitHub] carbondata issue #2779: [WIP] Upgrade spark integration version to 2.3.2

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sujith71955 commented on the issue:

    https://github.com/apache/carbondata/pull/2779
 
    Thanks for raising the PR. It would be better if you could add a description of the changes in this PR.


---

[GitHub] carbondata issue #2779: [WIP] Upgrade spark integration version to 2.3.2

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2779
 
    Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8876/



---

[GitHub] carbondata issue #2779: [WIP] Upgrade spark integration version to 2.3.2

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2779
 
    retest this please


---

[GitHub] carbondata pull request #2779: [WIP] Upgrade spark integration version to 2....

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2779#discussion_r221131678
 
    --- Diff: integration/spark2/src/main/spark2.3/org/apache/spark/sql/execution/strategy/CarbonDataSourceScan.scala ---
    @@ -0,0 +1,55 @@
    +/*
    --- End diff --
   
    Why do we need to move CarbonDataSourceScan.scala?


---

[GitHub] carbondata issue #2779: [WIP] Upgrade spark integration version to 2.3.2

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2779
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/619/



---

[GitHub] carbondata issue #2779: [WIP] Upgrade spark integration version to 2.3.2

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2779
 
    Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8880/



---

[GitHub] carbondata issue #2779: [WIP] Upgrade spark integration version to 2.3.2

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2779
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/811/



---

[GitHub] carbondata pull request #2779: [WIP] Upgrade spark integration version to 2....

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user zzcclp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2779#discussion_r221141998
 
    --- Diff: integration/spark2/src/main/spark2.3/org/apache/spark/sql/execution/strategy/CarbonDataSourceScan.scala ---
    @@ -0,0 +1,55 @@
    +/*
    --- End diff --
   
    I moved the original class 'CarbonDataSourceScan' to the source path 'commonTo2.1And2.2' and added a new class 'CarbonDataSourceScan' under the source path 'spark2.3', which declares some of its overridden members as lazy.


---

[GitHub] carbondata issue #2779: [WIP] Upgrade spark integration version to 2.3.2

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2779
 
    @zzcclp Please check and fix the tests


---

[GitHub] carbondata issue #2779: [WIP] Upgrade spark integration version to 2.3.2

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/2779
 
    @ravipesala could you help me check why these three test cases fail? The failures are related to decimal precision.


---

[GitHub] carbondata issue #2779: [WIP] Upgrade spark integration version to 2.3.2

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/2779
 
    @ravipesala I know how to fix this and will fix the tests ASAP.


---

[GitHub] carbondata issue #2779: [CARBONDATA-2989] Upgrade spark integration version ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2779
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/837/



---

[GitHub] carbondata issue #2779: [CARBONDATA-2989] Upgrade spark integration version ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2779
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/643/



---

[GitHub] carbondata issue #2779: [CARBONDATA-2989] Upgrade spark integration version ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2779
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8905/



---

[GitHub] carbondata issue #2779: [CARBONDATA-2989] Upgrade spark integration version ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/2779
 
    @sujith71955 @chenliang613 @ravipesala @jackylk this PR is ready; please review. Thanks.


---