GitHub user zzcclp opened a pull request:
    https://github.com/apache/carbondata/pull/2779

    [WIP] Upgrade spark integration version to 2.3.2

    Upgrade spark integration version to 2.3.2

    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:

     - [ ] Any interfaces changed?
     - [ ] Any backward compatibility impacted?
     - [ ] Document update required?
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zzcclp/carbondata wip_upgrade_to_spark2.3.2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2779.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2779

----
commit 586cf7b6a23fa5b110f3490f8123d1b15b30e4bc
Author: Zhang Zhichao <441586683@...>
Date:   2018-09-27T17:30:34Z

    [WIP] Upgrade spark integration version to 2.3.2

    Upgrade spark integration version to 2.3.2

----
---
Github user zzcclp commented on a diff in the pull request:
    https://github.com/apache/carbondata/pull/2779#discussion_r221013145

    --- Diff: integration/spark-common/src/main/scala/org/apache/spark/util/CarbonReflectionUtils.scala ---
    @@ -296,7 +296,7 @@ object CarbonReflectionUtils {
               classOf[LogicalPlan],
               classOf[Seq[Attribute]],
               classOf[SparkPlan])
    -      method.invoke(dataSourceObj, mode, query, query.output, physicalPlan)
    +      method.invoke(dataSourceObj, mode, query, query.output.map(_.name), physicalPlan)
    --- End diff --

    The parameters of the 'writeAndRead' method have changed; please see [SPARK-PR#22346](https://github.com/apache/spark/pull/22346)
---
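The diff above changes only the argument passed to `invoke`, not the `classOf[Seq[Attribute]]` lookup key. One plausible reason is JVM type erasure: `Seq[Attribute]` and `Seq[String]` erase to the same runtime class, so the reflective lookup is unchanged while the argument must match what the target method expects. A minimal sketch, assuming hypothetical stand-in classes (`Attr`, `OldWriter`, `NewWriter` and `invokeWriteAndRead` are illustrative names, not Spark or CarbonData APIs):

```scala
// Hypothetical stand-ins for illustration only; not Spark or CarbonData classes.
final case class Attr(name: String)

// Older shape: the method receives the attributes themselves.
class OldWriter {
  def writeAndRead(mode: String, output: Seq[Attr]): String =
    s"$mode:" + output.map(_.name).mkString(",")
}

// Newer shape: the method receives only the column names.
class NewWriter {
  def writeAndRead(mode: String, outputColumnNames: Seq[String]): String =
    s"$mode:" + outputColumnNames.mkString(",")
}

object ReflectiveWriteSketch {
  // The lookup key is identical for both writers because the element type of
  // Seq is erased at runtime; only the argument value has to differ.
  def invokeWriteAndRead(
      writer: AnyRef,
      mode: String,
      output: Seq[Attr],
      passNames: Boolean): String = {
    val method =
      writer.getClass.getMethod("writeAndRead", classOf[String], classOf[Seq[Attr]])
    val arg: AnyRef = if (passNames) output.map(_.name) else output
    method.invoke(writer, mode, arg).asInstanceOf[String]
  }
}
```

Because the compiler cannot check reflective arguments, a mismatch between the element type and the method's expectation would only surface at runtime, which is why the caller has to choose the argument per target version.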
In reply to this post by qiuchenjian-2
Github user zzcclp commented on a diff in the pull request:
    https://github.com/apache/carbondata/pull/2779#discussion_r221014706

    --- Diff: integration/spark2/src/main/spark2.3/org/apache/spark/sql/execution/strategy/CarbonDataSourceScan.scala ---
    @@ -0,0 +1,55 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.sql.execution.strategy
    +
    +import org.apache.spark.rdd.RDD
    +import org.apache.spark.sql.catalyst.{InternalRow, TableIdentifier}
    +import org.apache.spark.sql.catalyst.expressions.{Attribute, SortOrder}
    +import org.apache.spark.sql.catalyst.plans.physical.Partitioning
    +import org.apache.spark.sql.execution.FileSourceScanExec
    +import org.apache.spark.sql.execution.datasources.{HadoopFsRelation, LogicalRelation}
    +
    +/**
    + *  Physical plan node for scanning data. It is applied for both tables
    + *  USING carbondata and STORED AS CARBONDATA.
    + */
    +class CarbonDataSourceScan(
    +    override val output: Seq[Attribute],
    +    val rdd: RDD[InternalRow],
    +    @transient override val relation: HadoopFsRelation,
    +    val partitioning: Partitioning,
    +    val md: Map[String, String],
    +    identifier: Option[TableIdentifier],
    +    @transient private val logicalRelation: LogicalRelation)
    +  extends FileSourceScanExec(
    +    relation,
    +    output,
    +    relation.dataSchema,
    +    Seq.empty,
    +    Seq.empty,
    +    identifier) {
    +
    +  override lazy val supportsBatch: Boolean = true
    +
    +  override lazy val (outputPartitioning, outputOrdering): (Partitioning, Seq[SortOrder]) =
    +    (partitioning, Nil)
    +
    +  override lazy val metadata: Map[String, String] = md
    --- End diff --

    The members (supportsBatch, outputPartitioning, outputOrdering, metadata) have had the keyword 'lazy' added; please see [SPARK-PR#21815](https://github.com/apache/spark/pull/21815)
---
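For context on the 'lazy' change: in Scala, a concrete member declared as `lazy val` in a parent class can only be overridden by another `lazy val`, so a subclass has to follow the parent's declaration. A minimal sketch, assuming hypothetical classes `BaseScan` and `CustomScan` (these are not the Spark or CarbonData classes); the `metadataEvaluations` counter exists only to make the deferred evaluation visible:

```scala
// Hypothetical base class standing in for a scan node whose members are lazy.
abstract class BaseScan {
  lazy val supportsBatch: Boolean = false
  lazy val metadata: Map[String, String] = Map.empty
}

// The overrides must also be lazy; a plain `override val` here would be
// rejected by the compiler because the parent members are lazy.
class CustomScan(md: Map[String, String]) extends BaseScan {
  // Counts how many times the metadata initializer actually runs.
  var metadataEvaluations = 0

  override lazy val supportsBatch: Boolean = true

  override lazy val metadata: Map[String, String] = {
    metadataEvaluations += 1
    md
  }
}
```

The initializer of a lazy member runs on first access rather than at construction time, and its result is cached for later reads.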
Github user zzcclp commented on the issue:
https://github.com/apache/carbondata/pull/2779 @jackylk @chenliang613 @sujith71955 please review. --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2779 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/615/ --- |
Github user sujith71955 commented on the issue:
https://github.com/apache/carbondata/pull/2779 Thanks for raising the PR. It would be better if you could add a description of the changes in this PR. --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2779 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8876/ --- |
Github user chenliang613 commented on the issue:
https://github.com/apache/carbondata/pull/2779 retest this please --- |
Github user chenliang613 commented on a diff in the pull request:
    https://github.com/apache/carbondata/pull/2779#discussion_r221131678

    --- Diff: integration/spark2/src/main/spark2.3/org/apache/spark/sql/execution/strategy/CarbonDataSourceScan.scala ---
    @@ -0,0 +1,55 @@
    +/*
    --- End diff --

    Why does CarbonDataSourceScan.scala need to be moved?
---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2779 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/619/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2779 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8880/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2779 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/811/ --- |
Github user zzcclp commented on a diff in the pull request:
    https://github.com/apache/carbondata/pull/2779#discussion_r221141998

    --- Diff: integration/spark2/src/main/spark2.3/org/apache/spark/sql/execution/strategy/CarbonDataSourceScan.scala ---
    @@ -0,0 +1,55 @@
    +/*
    --- End diff --

    The original class 'CarbonDataSourceScan' was moved to the source path 'commonTo2.1And2.2', and a new class 'CarbonDataSourceScan', which declares some members as lazy, was added in the source path 'spark2.3'.
---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2779 @zzcclp Please check and fix the tests --- |
Github user zzcclp commented on the issue:
https://github.com/apache/carbondata/pull/2779 @ravipesala can you help me to check why these three test cases fail? It's about the decimal precision. --- |
Github user zzcclp commented on the issue:
https://github.com/apache/carbondata/pull/2779 @ravipesala I know how to fix and will fix the tests ASAP. --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2779 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/837/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2779 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/643/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2779 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8905/ --- |
Github user zzcclp commented on the issue:
https://github.com/apache/carbondata/pull/2779 @sujith71955 @chenliang613 @ravipesala @jackylk This PR is ready; please review. Thanks. --- |