Github user chenliang613 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2779#discussion_r221414692 --- Diff: integration/spark2/src/main/spark2.3/org/apache/spark/sql/execution/strategy/CarbonDataSourceScan.scala --- @@ -0,0 +1,55 @@ +/* --- End diff -- My comment : only for 4 parameters , copy the whole file(CarbonDataSourceScan.scala) for spark 2.3 integration, may not require. see if can add the judgement for different spark version with different code/parameters. --- |
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on the issue:
https://github.com/apache/carbondata/pull/2779 My comment : only for 4 parameters , copy the whole file(CarbonDataSourceScan.scala) for spark 2.3 integration, may not require. see if can add the judgement for different spark version with different code/parameters. --- |
In reply to this post by qiuchenjian-2
Github user zzcclp commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2779#discussion_r221419564 --- Diff: integration/spark2/src/main/spark2.3/org/apache/spark/sql/execution/strategy/CarbonDataSourceScan.scala --- @@ -0,0 +1,55 @@ +/* --- End diff -- @chenliang613 , it's not about the different code/parameters, it changes common val parameters to lazy val parameters. I think it's difficult to add lazy keyword on val parameters according to spark version. Or does anyone know how to do and give me some help, thanks. --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2779#discussion_r221420261 --- Diff: integration/spark2/src/main/spark2.3/org/apache/spark/sql/execution/strategy/CarbonDataSourceScan.scala --- @@ -0,0 +1,55 @@ +/* --- End diff -- I think it is not possible to add lazy variable without breaking spark 2.2 integration. Since this class is small, I think it is ok to have a separate one for spark 2.3 integration. --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/2779 @chenliang613 I think it is not possible to add lazy variable without breaking spark 2.2 integration. Since this class is small, I think it is ok to have a separate one for spark 2.3 integration --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/2779 I think it is not possible to add lazy variable without breaking spark 2.2 integration. Since this class is small, I think it is ok to have a separate one for spark 2.3 integration --- |
In reply to this post by qiuchenjian-2
Github user zzcclp commented on the issue:
https://github.com/apache/carbondata/pull/2779 @sujith71955 any suggestion for this? --- |
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2779#discussion_r221444503 --- Diff: integration/spark2/src/main/spark2.3/org/apache/spark/sql/execution/strategy/CarbonDataSourceScan.scala --- @@ -0,0 +1,55 @@ +/* --- End diff -- ok. --- |
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on the issue:
https://github.com/apache/carbondata/pull/2779 LGTM --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2779 LGTM --- |
In reply to this post by qiuchenjian-2
Github user sujith71955 commented on the issue:
https://github.com/apache/carbondata/pull/2779 LGTM Even though SPARK-PR#21815 changes is a workaround solution and in future its subjected to change. Need a caution here. Can you please add this PR reference in the modified code of the carbon for future reference. Thanks for your effort. --- |
In reply to this post by qiuchenjian-2
Github user sujith71955 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2779#discussion_r221445524 --- Diff: integration/spark2/src/main/spark2.3/org/apache/spark/sql/execution/strategy/CarbonDataSourceScan.scala --- @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.sql.execution.strategy + +import org.apache.spark.rdd.RDD +import org.apache.spark.sql.catalyst.{InternalRow, TableIdentifier} +import org.apache.spark.sql.catalyst.expressions.{Attribute, SortOrder} +import org.apache.spark.sql.catalyst.plans.physical.Partitioning +import org.apache.spark.sql.execution.FileSourceScanExec +import org.apache.spark.sql.execution.datasources.{HadoopFsRelation, LogicalRelation} + +/** + * Physical plan node for scanning data. It is applied for both tables + * USING carbondata and STORED AS CARBONDATA. + */ +class CarbonDataSourceScan( + override val output: Seq[Attribute], + val rdd: RDD[InternalRow], + @transient override val relation: HadoopFsRelation, + val partitioning: Partitioning, + val md: Map[String, String], + identifier: Option[TableIdentifier], + @transient private val logicalRelation: LogicalRelation) + extends FileSourceScanExec( + relation, + output, + relation.dataSchema, + Seq.empty, + Seq.empty, + identifier) { + + override lazy val supportsBatch: Boolean = true + + override lazy val (outputPartitioning, outputOrdering): (Partitioning, Seq[SortOrder]) = + (partitioning, Nil) + + override lazy val metadata: Map[String, String] = md --- End diff -- nit: made lazy since spark 2.3.2 version (SPARK-PR#21815) --- |
In reply to this post by qiuchenjian-2
Github user zzcclp commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2779#discussion_r221446372 --- Diff: integration/spark2/src/main/spark2.3/org/apache/spark/sql/execution/strategy/CarbonDataSourceScan.scala --- @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.sql.execution.strategy + +import org.apache.spark.rdd.RDD +import org.apache.spark.sql.catalyst.{InternalRow, TableIdentifier} +import org.apache.spark.sql.catalyst.expressions.{Attribute, SortOrder} +import org.apache.spark.sql.catalyst.plans.physical.Partitioning +import org.apache.spark.sql.execution.FileSourceScanExec +import org.apache.spark.sql.execution.datasources.{HadoopFsRelation, LogicalRelation} + +/** + * Physical plan node for scanning data. It is applied for both tables + * USING carbondata and STORED AS CARBONDATA. + */ +class CarbonDataSourceScan( + override val output: Seq[Attribute], + val rdd: RDD[InternalRow], + @transient override val relation: HadoopFsRelation, + val partitioning: Partitioning, + val md: Map[String, String], + identifier: Option[TableIdentifier], + @transient private val logicalRelation: LogicalRelation) + extends FileSourceScanExec( + relation, + output, + relation.dataSchema, + Seq.empty, + Seq.empty, + identifier) { + + override lazy val supportsBatch: Boolean = true + + override lazy val (outputPartitioning, outputOrdering): (Partitioning, Seq[SortOrder]) = + (partitioning, Nil) + + override lazy val metadata: Map[String, String] = md --- End diff -- fixed --- |
In reply to this post by qiuchenjian-2
Github user zzcclp commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2779#discussion_r221446374 --- Diff: integration/spark-common/src/main/scala/org/apache/spark/util/CarbonReflectionUtils.scala --- @@ -296,7 +296,7 @@ object CarbonReflectionUtils { classOf[LogicalPlan], classOf[Seq[Attribute]], classOf[SparkPlan]) - method.invoke(dataSourceObj, mode, query, query.output, physicalPlan) + method.invoke(dataSourceObj, mode, query, query.output.map(_.name), physicalPlan) --- End diff -- added some comments too --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2779 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/656/ --- |
In reply to this post by qiuchenjian-2
Github user sujith71955 commented on the issue:
https://github.com/apache/carbondata/pull/2779 LGTM --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2779 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8920/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2779 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/852/ --- |
In reply to this post by qiuchenjian-2
Github user sujith71955 commented on the issue:
https://github.com/apache/carbondata/pull/2779 LGTM --- |
In reply to this post by qiuchenjian-2
Github user zzcclp commented on the issue:
https://github.com/apache/carbondata/pull/2779 the failed test case is not related to this pr, right? --- |
Free forum by Nabble | Edit this page |