[jira] [Commented] (CARBONDATA-2534) MV Dataset - MV creation is not working with the substring()


Akash R Nilugal (Jira)

    [ https://issues.apache.org/jira/browse/CARBONDATA-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16565016#comment-16565016 ]

Prasanna Ravichandran commented on CARBONDATA-2534:
---------------------------------------------------

MV creation with the substring function now succeeds without error, but when the user runs the same query the MV was defined on, the plan does not read data from the MV datamap.

*Terminal:*

> create datamap mv_substr using 'mv' as select sum(salary),substring(empname,2,5),designation from originTable group by substring(empname,2,5),designation;
+---------+--+
| Result |
+---------+--+
+---------+--+
No rows selected (0.661 seconds)

> explain select sum(salary),substring(empname,2,5),designation from originTable group by substring(empname,2,5),designation;
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| plan |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| == CarbonData Profiler ==
Table Scan on origintable
 - total blocklets: 2
 - filter: none
 - pruned by Main DataMap
 - skipped blocklets: 0
 |
| == Physical Plan ==
*HashAggregate(keys=[substring(empname#18267, 2, 5)#18352, designation#18268], functions=[sum(cast(salary#18279 as bigint))])
+- Exchange hashpartitioning(substring(empname#18267, 2, 5)#18352, designation#18268, 200)
   +- *HashAggregate(keys=[substring(empname#18267, 2, 5) AS substring(empname#18267, 2, 5)#18352, designation#18268], functions=[partial_sum(cast(salary#18279 as bigint))])
      +- *FileScan carbondata *b011.origintable*[empname#18267,designation#18268,salary#18279] |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
2 rows selected (0.432 seconds)
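
The plan above scans origintable directly, which is how the problem shows up. As a rough sketch, the same check can be scripted by searching the plan text for the MV's backing table. The object and method names here are hypothetical, and `mv_substr_table` is assumed to be the backing table name because that is the name shown in the original exception for this datamap.

```scala
import java.util.Locale

object MvPlanCheck {
  // Returns true when the plan text mentions the given table name.
  // The comparison is case-insensitive because the plan prints table
  // names in lower case (for example "origintable").
  def planUsesTable(planText: String, tableName: String): Boolean =
    planText.toLowerCase(Locale.ROOT).contains(tableName.toLowerCase(Locale.ROOT))
}

// Possible use in spark-shell, assuming `carbon` is the CarbonSession:
//   val plan = carbon.sql("explain select ...").collect().map(_.getString(0)).mkString("\n")
//   MvPlanCheck.planUsesTable(plan, "mv_substr_table")
```

This only inspects the printed plan, so it indicates whether the query was rewritten to the MV table, not why the rewrite did or did not happen.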

> MV Dataset - MV creation is not working with the substring()
> -------------------------------------------------------------
>
>                 Key: CARBONDATA-2534
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2534
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-query
>         Environment: 3 node opensource ANT cluster
>            Reporter: Prasanna Ravichandran
>            Priority: Minor
>              Labels: CarbonData, MV, Materialistic_Views
>             Fix For: 1.5.0, 1.4.1
>
>         Attachments: MV_substring.docx, data.csv
>
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> MV creation is not working with the substring function. We get an org.apache.spark.sql.AnalysisException when trying to create an MV with substring() and an aggregate function.
> *Spark-shell test queries:*
>  scala> carbon.sql("create datamap mv_substr using 'mv' as select sum(salary),substring(empname,2,5),designation from originTable group by substring(empname,2,5),designation").show(200,false)
> *org.apache.spark.sql.AnalysisException: Cannot create a table having a column whose name contains commas in Hive metastore. Table: `default`.`mv_substr_table`; Column: substring_empname,_2,_5;*
>  at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema$2.apply(HiveExternalCatalog.scala:150)
>  at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema$2.apply(HiveExternalCatalog.scala:148)
>  at scala.collection.immutable.List.foreach(List.scala:381)
>  at org.apache.spark.sql.hive.HiveExternalCatalog.org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema(HiveExternalCatalog.scala:148)
>  at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply$mcV$sp(HiveExternalCatalog.scala:222)
>  at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
>  at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
>  at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
>  at org.apache.spark.sql.hive.HiveExternalCatalog.doCreateTable(HiveExternalCatalog.scala:216)
>  at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.createTable(ExternalCatalog.scala:110)
>  at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:316)
>  at org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:119)
>  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
>  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
>  at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
>  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
>  at org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:108)
>  at org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:97)
>  at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:155)
>  at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:95)
>  at org.apache.spark.sql.execution.command.table.CarbonCreateTableCommand.processMetadata(CarbonCreateTableCommand.scala:126)
>  at org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:68)
>  at org.apache.carbondata.mv.datamap.MVHelper$.createMVDataMap(MVHelper.scala:103)
>  at org.apache.carbondata.mv.datamap.MVDataMapProvider.initMeta(MVDataMapProvider.scala:53)
>  at org.apache.spark.sql.execution.command.datamap.CarbonCreateDataMapCommand.processMetadata(CarbonCreateDataMapCommand.scala:118)
>  at org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:90)
>  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
>  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
>  at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
>  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
>  at org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:108)
>  at org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:97)
>  at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:155)
>  at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:95)
>  ... 48 elided
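
The exception above comes from the comma check in the Hive metastore path: the generated column name `substring_empname,_2,_5` keeps the commas from the expression `substring(empname, 2, 5)`. The following is a minimal sketch of that check and of one way a name could be made comma-free. Both helpers are hypothetical illustrations and are not taken from the CarbonData or Spark source; the actual fix shipped for this issue may work differently.

```scala
object MvColumnNames {
  // Mirrors the rule stated in the exception message: a column name
  // stored in the Hive metastore must not contain a comma.
  def isHiveSafe(columnName: String): Boolean = !columnName.contains(",")

  // Hypothetical sanitizer: replace each run of characters outside
  // [A-Za-z0-9_] with a single underscore, then trim underscores
  // from both ends.
  def sanitize(expression: String): String =
    expression
      .replaceAll("[^A-Za-z0-9_]+", "_")
      .replaceAll("^_+|_+$", "")
}

// Example:
//   MvColumnNames.isHiveSafe("substring_empname,_2,_5")   // false
//   MvColumnNames.sanitize("substring(empname, 2, 5)")    // "substring_empname_2_5"
```

A sanitizer like this can map distinct expressions to the same name, so a real implementation would also need to handle collisions.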



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)