[GitHub] carbondata pull request #1595: [WIP][Spark-2.2]Carbon-Spark2.2 Add | Modify ...


[GitHub] carbondata issue #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 Alter Ad...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1595
 
    Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/502/



---

[GitHub] carbondata issue #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 Integrat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1595
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/503/



---

[GitHub] carbondata issue #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 Integrat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1595
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/504/



---

[GitHub] carbondata issue #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 Integrat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1595
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/507/



---

[GitHub] carbondata issue #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 Integrat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1595
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1765/



---

[GitHub] carbondata pull request #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 I...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1595#discussion_r155251303
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonLateDecodeStrategy.scala ---
    @@ -413,19 +414,35 @@ private[sql] class CarbonLateDecodeStrategy extends SparkStrategy {
           relation: BaseRelation,
           predicates: Seq[Expression]): (Seq[Expression], Seq[Filter]) = {
     
    +    def isComplexAttribute(attribute: Attribute) = attribute.dataType match {
    +      case ArrayType(dataType, _) => true
    +      case StructType(_) => true
    +      case _ => false
    +    }
    +
    +    // In case of ComplexType dataTypes no filters should be pushed down. IsNotNull is being
    +    // explicitly added by spark and pushed. That also has to be handled and pushed back to
    +    // Spark for handling.
    +    val predicatesWithoutComplex = predicates.filter(predicate => predicate.collect {
    +      case a: Attribute if (isComplexAttribute(a)) => a
    +    }.size == 0 )
    +
         // For conciseness, all Catalyst filter expressions of type `expressions.Expression` below are
         // called `predicate`s, while all data source filters of type `sources.Filter` are simply called
         // `filter`s.
     
         val translated: Seq[(Expression, Filter)] =
           for {
    -        predicate <- predicates
    +        predicate <- predicatesWithoutComplex
             filter <- translateFilter(predicate)
           } yield predicate -> filter
     
    +
    +    // val newTranslated = translated.toMap.filter()
    --- End diff --
   
     Remove the commented-out code.


---

[GitHub] carbondata pull request #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 I...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1595#discussion_r155251715
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonLateDecodeStrategy.scala ---
    @@ -413,19 +414,35 @@ private[sql] class CarbonLateDecodeStrategy extends SparkStrategy {
           relation: BaseRelation,
           predicates: Seq[Expression]): (Seq[Expression], Seq[Filter]) = {
     
    +    def isComplexAttribute(attribute: Attribute) = attribute.dataType match {
    +      case ArrayType(dataType, _) => true
    +      case StructType(_) => true
    +      case _ => false
    +    }
    +
    +    // In case of ComplexType dataTypes no filters should be pushed down. IsNotNull is being
    +    // explicitly added by spark and pushed. That also has to be handled and pushed back to
    +    // Spark for handling.
    +    val predicatesWithoutComplex = predicates.filter(predicate => predicate.collect {
    --- End diff --
   
     Format properly; use:
     ```
    predicates.filter{predicate =>
      predicate.collect {
    ```
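
Applied to the diff quoted above, the suggested layout would read roughly as follows. This is only a sketch of the reviewer's formatting suggestion; the logic is unchanged (`.isEmpty` is the idiomatic equivalent of `.size == 0`):

```scala
// Sketch: the predicate filter from the quoted diff, reformatted in the
// style the reviewer suggests. Behaviour is identical.
val predicatesWithoutComplex = predicates.filter { predicate =>
  predicate.collect {
    case a: Attribute if isComplexAttribute(a) => a
  }.isEmpty
}
```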


---

[GitHub] carbondata pull request #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 I...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1595#discussion_r155254371
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonAnalysisRules.scala ---
    @@ -49,10 +49,22 @@ case class CarbonIUDAnalysisRule(sparkSession: SparkSession) extends Rule[Logica
     
           val projList = Seq(UnresolvedAlias(UnresolvedStar(alias.map(Seq(_)))), tupleId)
     
    +      val tableRelation = if (sparkSession.version.startsWith("2.1")) {
    +        relation
    +      } else if (sparkSession.version.startsWith("2.2")) {
    +        alias match {
    +          case Some(a) => CarbonReflectionUtils.getSubqueryAlias(sparkSession, alias, relation,
    --- End diff --
   
     Move the line down after `=>`.


---

[GitHub] carbondata pull request #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 I...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1595#discussion_r155254744
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonPreAggregateRules.scala ---
    @@ -1223,5 +1224,63 @@ case class CarbonPreInsertionCasts(sparkSession: SparkSession) extends Rule[Logi
         }
       }
     
    +  /**
    +   * Transform the logical plan with average(col1) aggregation type to sum(col1) and count(col1).
    +   *
    +   * @param logicalPlan
    +   * @return
    +   */
    +  private def transformAggregatePlan(logicalPlan: LogicalPlan): LogicalPlan = {
    --- End diff --
   
     Why is this method added?
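
The quoted diff only shows the method's scaladoc, so for context here is a rough sketch of what an avg-to-sum/count rewrite looks like against the Spark 2.2 Catalyst API. The object and method names (`AverageRewriteSketch`, `rewriteAverage`) are illustrative and this is not the PR's actual implementation:

```scala
import org.apache.spark.sql.catalyst.expressions.{Cast, Divide, NamedExpression}
import org.apache.spark.sql.catalyst.expressions.aggregate.{AggregateExpression, Average, Count, Sum}
import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan}
import org.apache.spark.sql.types.DoubleType

object AverageRewriteSketch {
  // Illustrative sketch: replace avg(col) with sum(col) / count(col) inside
  // Aggregate nodes, as described by the scaladoc of transformAggregatePlan.
  def rewriteAverage(plan: LogicalPlan): LogicalPlan = plan transform {
    case agg @ Aggregate(_, aggExprs, _) =>
      val rewritten = aggExprs.map { expr =>
        expr.transform {
          case AggregateExpression(Average(child), mode, isDistinct, _) =>
            Divide(
              Cast(AggregateExpression(Sum(child), mode, isDistinct), DoubleType),
              Cast(AggregateExpression(Count(child), mode, isDistinct), DoubleType))
        }.asInstanceOf[NamedExpression]
      }
      agg.copy(aggregateExpressions = rewritten)
  }
}
```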


---

[GitHub] carbondata pull request #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 I...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1595#discussion_r155255158
 
    --- Diff: integration/spark2/src/main/spark2.2/CarbonSessionState.scala ---
    @@ -125,6 +132,18 @@ class CarbonSessionCatalog(
       }
     }
     
    +
    +class CarbonAnalyzer(catalog: SessionCatalog,
    --- End diff --
   
     Better to move this to some common place; don't duplicate it.


---

[GitHub] carbondata pull request #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 I...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1595#discussion_r155255769
 
    --- Diff: integration/spark2/src/main/spark2.2/CarbonSessionState.scala ---
    @@ -256,6 +292,57 @@ class CarbonSqlAstBuilder(conf: SQLConf, parser: CarbonSpark2SqlParser) extends
         }
       }
     
    +  override def visitChangeColumn(ctx: ChangeColumnContext): LogicalPlan = {
    --- End diff --
   
     Better to catch the command at the strategy level rather than changing the parser.


---

[GitHub] carbondata pull request #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 I...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1595#discussion_r155255796
 
    --- Diff: integration/spark2/src/main/spark2.2/CarbonSessionState.scala ---
    @@ -256,6 +292,57 @@ class CarbonSqlAstBuilder(conf: SQLConf, parser: CarbonSpark2SqlParser) extends
         }
       }
     
    +  override def visitChangeColumn(ctx: ChangeColumnContext): LogicalPlan = {
    +
    +    val newColumn = visitColType(ctx.colType)
    +    if (!ctx.identifier.getText.equalsIgnoreCase(newColumn.name)) {
    +      throw new MalformedCarbonCommandException(
    +        "Column names provided are different. Both the column names should be same")
    +    }
    +
    +    val (typeString, values) : (String, Option[List[(Int, Int)]]) = newColumn.dataType match {
    +      case d:DecimalType => ("decimal", Some(List((d.precision, d.scale))))
    +      case _ => (newColumn.dataType.typeName.toLowerCase, None)
    +    }
    +
    +    val alterTableChangeDataTypeModel =
    +      AlterTableDataTypeChangeModel(new CarbonSpark2SqlParser().parseDataType(typeString, values),
    +        new CarbonSpark2SqlParser()
    +          .convertDbNameToLowerCase(Option(ctx.tableIdentifier().db).map(_.getText)),
    +        ctx.tableIdentifier().table.getText.toLowerCase,
    +        ctx.identifier.getText.toLowerCase,
    +        newColumn.name.toLowerCase)
    +
    +    CarbonAlterTableDataTypeChangeCommand(alterTableChangeDataTypeModel)
    +  }
    +
    +
    +  override def visitAddTableColumns(ctx: AddTableColumnsContext): LogicalPlan = {
    --- End diff --
   
     Better to catch the command at the strategy level rather than changing the parser.
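
To illustrate the suggestion: Spark 2.2's own parser already turns these statements into its DDL commands (AlterTableAddColumnsCommand and AlterTableChangeColumnCommand), so a Carbon strategy could intercept those commands instead of overriding the AST builder. This is only a hedged sketch under that assumption; the `toCarbon*` converter helpers are hypothetical placeholders, not code from this PR:

```scala
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.execution.{SparkPlan, SparkStrategy}
import org.apache.spark.sql.execution.command.{AlterTableAddColumnsCommand,
  AlterTableChangeColumnCommand, ExecutedCommandExec, RunnableCommand}

// Hypothetical sketch of handling ALTER TABLE ADD/CHANGE COLUMN at the
// strategy level instead of in the parser.
object CarbonAlterDDLStrategy extends SparkStrategy {
  override def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match {
    case addCols: AlterTableAddColumnsCommand =>
      ExecutedCommandExec(toCarbonAddColumnCommand(addCols)) :: Nil
    case changeCol: AlterTableChangeColumnCommand =>
      ExecutedCommandExec(toCarbonChangeDataTypeCommand(changeCol)) :: Nil
    case _ => Nil
  }

  // Placeholder converters: build the corresponding Carbon commands from
  // Spark's parsed DDL commands (details omitted in this sketch).
  private def toCarbonAddColumnCommand(cmd: AlterTableAddColumnsCommand): RunnableCommand = ???
  private def toCarbonChangeDataTypeCommand(cmd: AlterTableChangeColumnCommand): RunnableCommand = ???
}
```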


---

[GitHub] carbondata issue #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 Integrat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1595
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1768/



---

[GitHub] carbondata pull request #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 I...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1595#discussion_r155284080
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonLateDecodeStrategy.scala ---
    @@ -413,19 +414,35 @@ private[sql] class CarbonLateDecodeStrategy extends SparkStrategy {
           relation: BaseRelation,
           predicates: Seq[Expression]): (Seq[Expression], Seq[Filter]) = {
     
    +    def isComplexAttribute(attribute: Attribute) = attribute.dataType match {
    +      case ArrayType(dataType, _) => true
    +      case StructType(_) => true
    +      case _ => false
    +    }
    +
    +    // In case of ComplexType dataTypes no filters should be pushed down. IsNotNull is being
    +    // explicitly added by spark and pushed. That also has to be handled and pushed back to
    +    // Spark for handling.
    +    val predicatesWithoutComplex = predicates.filter(predicate => predicate.collect {
    +      case a: Attribute if (isComplexAttribute(a)) => a
    +    }.size == 0 )
    +
         // For conciseness, all Catalyst filter expressions of type `expressions.Expression` below are
         // called `predicate`s, while all data source filters of type `sources.Filter` are simply called
         // `filter`s.
     
         val translated: Seq[(Expression, Filter)] =
           for {
    -        predicate <- predicates
    +        predicate <- predicatesWithoutComplex
             filter <- translateFilter(predicate)
           } yield predicate -> filter
     
    +
    +    // val newTranslated = translated.toMap.filter()
    --- End diff --
   
    Done


---

[GitHub] carbondata pull request #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 I...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1595#discussion_r155284294
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonLateDecodeStrategy.scala ---
    @@ -413,19 +414,35 @@ private[sql] class CarbonLateDecodeStrategy extends SparkStrategy {
           relation: BaseRelation,
           predicates: Seq[Expression]): (Seq[Expression], Seq[Filter]) = {
     
    +    def isComplexAttribute(attribute: Attribute) = attribute.dataType match {
    +      case ArrayType(dataType, _) => true
    +      case StructType(_) => true
    +      case _ => false
    +    }
    +
    +    // In case of ComplexType dataTypes no filters should be pushed down. IsNotNull is being
    +    // explicitly added by spark and pushed. That also has to be handled and pushed back to
    +    // Spark for handling.
    +    val predicatesWithoutComplex = predicates.filter(predicate => predicate.collect {
    --- End diff --
   
    Done


---

[GitHub] carbondata pull request #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 I...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1595#discussion_r155284758
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonAnalysisRules.scala ---
    @@ -49,10 +49,22 @@ case class CarbonIUDAnalysisRule(sparkSession: SparkSession) extends Rule[Logica
     
           val projList = Seq(UnresolvedAlias(UnresolvedStar(alias.map(Seq(_)))), tupleId)
     
    +      val tableRelation = if (sparkSession.version.startsWith("2.1")) {
    +        relation
    +      } else if (sparkSession.version.startsWith("2.2")) {
    +        alias match {
    +          case Some(a) => CarbonReflectionUtils.getSubqueryAlias(sparkSession, alias, relation,
    --- End diff --
   
    Done


---

[GitHub] carbondata pull request #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 I...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1595#discussion_r155289969
 
    --- Diff: integration/spark2/src/main/spark2.2/CarbonSessionState.scala ---
    @@ -125,6 +132,18 @@ class CarbonSessionCatalog(
       }
     }
     
    +
    +class CarbonAnalyzer(catalog: SessionCatalog,
    --- End diff --
   
     Moving it to a common place is difficult, because CarbonAnalyzer takes a different parameter type in Spark 2.1 and Spark 2.2: in 2.1 it takes a CatalystConf, while in 2.2 it takes a SQLConf.
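
For reference, the constructor difference being described, as I understand the respective Spark releases (Spark 2.2 folded CatalystConf into the catalyst-side SQLConf):

```scala
// Spark 2.1.x:
//   class Analyzer(catalog: SessionCatalog, conf: CatalystConf, maxIterations: Int)
// Spark 2.2.x:
//   class Analyzer(catalog: SessionCatalog, conf: SQLConf, maxIterations: Int)
```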


---

[GitHub] carbondata issue #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 Integrat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1595
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2146/



---

[GitHub] carbondata issue #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 Integrat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1595
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1775/



---

[GitHub] carbondata issue #1595: [CARBONDATA-1868][Spark-2.2]Carbon-Spark2.2 Integrat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1595
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1783/



---