[GitHub] [carbondata] kevinjmh edited a comment on issue #3357: [CARBONDATA-3491] Return updated/deleted rows count when execute update/delete sql

GitBox

kevinjmh edited a comment on issue #3357: [CARBONDATA-3491] Return updated/deleted rows count when execute update/delete sql
URL: https://github.com/apache/carbondata/pull/3357#issuecomment-520737311

The modification I made in the mentioned class:
```
override def output: Seq[Attribute] =
  Seq(AttributeReference("Total Rows Updated", LongType, nullable = false)())
```

The plan for a successful run is similar to yours, but has one more line, `Total Rows Updated: bigint`:
```
== Parsed Logical Plan ==
UpdateTable 'UnresolvedRelation `test_return_row_count`, [b], select 'ddd' from test_return_row_count, test_return_row_count, where a = 'ccc'

== Analyzed Logical Plan ==
Total Rows Updated: bigint
ProjectForUpdate 'UnresolvedRelation `test_return_row_count`, [b]
+- Project [a#103, b#104, c#105, tupleId#282, b-updatedColumn#283]
   +- Filter (a#103 = ccc)
      +- SubqueryAlias test_return_row_count
         +- SubqueryAlias test_return_row_count
            +- Project [a#103, b#104, c#105, tupleId#282, ddd AS b-updatedColumn#283]
               +- SubqueryAlias test_return_row_count
                  +- Project [a#103, b#104, c#105, UDF:getTupleId() AS tupleId#282]
                     +- SubqueryAlias test_return_row_count
                        +- SubqueryAlias test_return_row_count
                           +- Relation[a#103,b#104,c#105] CarbonDatasourceHadoopRelation

== Optimized Logical Plan ==
CarbonProjectForUpdateCommand Project [a#103, c#105, UDF:getTupleId() AS tupleId#282, ddd AS b-updatedColumn#283], test_return_row_count, [b]

== Physical Plan ==
Execute CarbonProjectForUpdateCommand
+- CarbonProjectForUpdateCommand Project [a#103, c#105, UDF:getTupleId() AS tupleId#282, ddd AS b-updatedColumn#283], test_return_row_count, [b]
```

I ran into your problem after multiple runs. By debugging, I found that Spark 2.3 converts almost all columns to string by adding one more select operation in `Dataset.scala` (see https://github.com/apache/spark/pull/20214/) when the code calls `df.show`, and `CheckAnalysis` checks whether any attribute is missing from its input or shares a name with another attribute.

So if this PR adds an output attribute that has no source and `df.show` is called, it fails with an error. It passes if you don't call `df.show`.
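
To picture the kind of check involved, here is a minimal toy model in plain Scala. It is only a sketch under assumptions: `Attr`, `Command`, `Select` and `missingInput` are hypothetical names invented for illustration, not Spark classes, and comparing attributes by id rather than by name is my assumption about how an output attribute can count as missing even though one with the same name is present.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's classes.
object ToyPlanCheck {
  // An attribute is identified by its id; the name is only a label.
  final case class Attr(name: String, id: Long)

  sealed trait Plan { def output: Seq[Attr] }

  // Leaf command that declares its own output attributes.
  final case class Command(output: Seq[Attr]) extends Plan

  // Extra projection layered on top of a plan, reading `references` from `child`.
  final case class Select(references: Seq[Attr], child: Plan) extends Plan {
    def output: Seq[Attr] = references
  }

  // References that the child does not produce, compared by id rather than by name.
  def missingInput(plan: Plan): Seq[Attr] = plan match {
    case Select(refs, child) =>
      refs.filterNot(r => child.output.exists(_.id == r.id))
    case _: Command => Seq.empty
  }

  def main(args: Array[String]): Unit = {
    val declared = Attr("Total Rows Updated", 1L)
    val command  = Command(Seq(declared))

    // The select refers to the attribute the command declares: nothing is missing.
    println(missingInput(Select(Seq(declared), command)))

    // Same name but a different id: the reference is reported as missing.
    val stale = Attr("Total Rows Updated", 2L)
    println(missingInput(Select(Seq(stale), command)))
  }
}
```

In this model the extra select is harmless as long as it refers to exactly the attributes its child produces; a reference that matches only by name is flagged, which is why the problem would surface only on the code path that adds the select.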
