[GitHub] [carbondata] kevinjmh edited a comment on issue #3357: [CARBONDATA-3491] Return updated/deleted rows count when execute update/delete sql

GitBox
kevinjmh edited a comment on issue #3357: [CARBONDATA-3491] Return updated/deleted rows count when execute update/delete sql
URL: https://github.com/apache/carbondata/pull/3357#issuecomment-520737311
 
 
   The modification I made in the mentioned class:
   ```
     override def output: Seq[Attribute] =
       Seq(AttributeReference("Total Rows Updated", LongType, nullable = false)())
   ```
   
   The plan for a successful run is similar to yours, but has one more line, `Total Rows Updated: bigint`:
   ```
   == Parsed Logical Plan ==
   UpdateTable 'UnresolvedRelation `test_return_row_count`, [b], select 'ddd' from test_return_row_count, test_return_row_count, where a = 'ccc'
   
   == Analyzed Logical Plan ==
   Total Rows Updated: bigint
   ProjectForUpdate 'UnresolvedRelation `test_return_row_count`, [b]
   +- Project [a#103, b#104, c#105, tupleId#282, b-updatedColumn#283]
      +- Filter (a#103 = ccc)
         +- SubqueryAlias test_return_row_count
            +- SubqueryAlias test_return_row_count
               +- Project [a#103, b#104, c#105, tupleId#282, ddd AS b-updatedColumn#283]
                  +- SubqueryAlias test_return_row_count
                     +- Project [a#103, b#104, c#105, UDF:getTupleId() AS tupleId#282]
                        +- SubqueryAlias test_return_row_count
                           +- SubqueryAlias test_return_row_count
                              +- Relation[a#103,b#104,c#105] CarbonDatasourceHadoopRelation
   
   == Optimized Logical Plan ==
   CarbonProjectForUpdateCommand Project [a#103, c#105, UDF:getTupleId() AS tupleId#282, ddd AS b-updatedColumn#283], test_return_row_count, [b]
   
   == Physical Plan ==
   Execute CarbonProjectForUpdateCommand
      +- CarbonProjectForUpdateCommand Project [a#103, c#105, UDF:getTupleId() AS tupleId#282, ddd AS b-updatedColumn#283], test_return_row_count, [b]
   ```
   
   I ran into your problem after several runs. By debugging, I found that Spark 2.3 converts almost all columns to string by adding one more select operation in `Dataset.scala` (see https://github.com/apache/spark/pull/20214/) when the code calls `df.show`, and `CheckAnalysis` checks whether any attribute is missing from the input or shares a name with another attribute.
   
   So if this PR adds an output attribute that has no source and the code calls `df.show`, it fails with an error. It passes if you don't call `df.show`.
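
The check described above can be illustrated with a toy model in plain Scala. This is only a sketch under assumptions: `Attr`, `Plan`, `Command`, `Project` and `missingInput` are hypothetical stand-ins written for this example, not Spark or CarbonData classes, and the real analyzer does considerably more. It shows one way a projection placed on top of a command can end up referencing an attribute that the command's output does not provide, even though the names match.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark classes.
object MissingInputSketch {
  // An attribute is identified by its id, not by its name.
  final case class Attr(name: String, id: Long)

  sealed trait Plan { def output: Seq[Attr] }

  // A leaf command that declares an output column.
  final case class Command(output: Seq[Attr]) extends Plan

  // A projection that reads `refs` from `child`, similar in spirit to the
  // extra select that `df.show` is described as adding.
  final case class Project(refs: Seq[Attr], child: Plan) extends Plan {
    def output: Seq[Attr] = refs
  }

  // Attributes the projection reads that its child does not produce.
  def missingInput(p: Project): Seq[Attr] =
    p.refs.filterNot(r => p.child.output.exists(_.id == r.id))

  def main(args: Array[String]): Unit = {
    val cmd = Command(Seq(Attr("Total Rows Updated", 1L)))
    // References taken from the child's own output: nothing is missing.
    println(missingInput(Project(cmd.output, cmd)))
    // Same name but a different id: the reference has no source in the child.
    println(missingInput(Project(Seq(Attr("Total Rows Updated", 2L)), cmd)))
  }
}
```

In this model the first projection passes and the second is reported as missing its input, which corresponds to the "missing input / same name" situation mentioned above. Whether the actual failure in this PR follows exactly this path is not confirmed here.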
