Apache CarbonData Dev Mailing List archive › Apache CarbonData JIRA issues

[GitHub] [carbondata] ShreelekhyaG opened a new pull request #3841: [WIP] : drop materialized view when executed concurrently from 4 concurrent c…

Classic

List

28 messages Options

Options

12

GitBox

[GitHub] [carbondata] ShreelekhyaG opened a new pull request #3841: [WIP] : drop materialized view when executed concurrently from 4 concurrent c…

ShreelekhyaG opened a new pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841

### Why is this PR needed?
drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

### What changes were proposed in this PR?
made view manager as a single instance for all sessions, so that changes executed from one session gets reflected in others.

### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)

### Is any new testcase added?
- No

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [WIP] : drop materialized view when executed concurrently from 4 concurrent c…

CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-657739220

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1641/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [WIP] : drop materialized view when executed concurrently from 4 concurrent c…

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-657739506

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3382/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [WIP] : drop materialized view when executed concurrently from 4 concurrent c…

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-658045205

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1644/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [WIP] : drop materialized view when executed concurrently from 4 concurrent c…

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-658045347

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3385/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] Indhumathi27 commented on pull request #3841: [WIP] : drop materialized view when executed concurrently from 4 concurrent c…

In reply to this post by GitBox

Indhumathi27 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-658118242

@ShreelekhyaG Please create a jira and update the PR

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent c…

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-660054214

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1680/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent c…

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-660061656

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3423/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

In reply to this post by GitBox

akashrn5 commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r457843327

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/view/MVTest.scala
##########
@@ -178,6 +180,13 @@ class MVTest extends QueryTest with BeforeAndAfterAll {
}
}

+ test("test drop mv must fail if not exists") {
+ val ex = intercept[MalformedMVCommandException] {
+ sql("drop materialized view MV11")

Review comment:
please give meaningful MV name based on test scenario

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/MVCreateTestCase.scala
##########
@@ -955,9 +955,9 @@ class MVCreateTestCase extends QueryTest with BeforeAndAfterAll {
sql(" insert into mvtable1 select 'n4',12,12")
sql("update mvtable1 set(name) = ('updatedName')").show()
checkAnswer(sql("select count(*) from mvtable1 where name = 'updatedName'"),Seq(Row(4)))
+ sql(s"drop materialized view MV11")

Review comment:
why this file change required, please revert if not required, i cant see any valid changes.

##########
File path: core/src/main/java/org/apache/carbondata/core/view/MVManager.java
##########
@@ -143,9 +143,9 @@ public void createSchema(String databaseName, MVSchema viewSchema)
/**
* Drops the mv schema from storage
*
- * @param viewName index name
+ * @param viewName mv name
*/
- public void deleteSchema(String databaseName, String viewName) throws IOException {
+ public synchronized void deleteSchema(String databaseName, String viewName) throws IOException {

Review comment:
why we need `synchronized` here? As i can see, if two people are dropping same, then one will be successful and other will get exception as MV does not exists. Any special case or scenario you observed?

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/view/CarbonDropMVCommand.scala
##########
@@ -91,6 +92,14 @@ case class CarbonDropMVCommand(

this.dropTableCommand = dropTableCommand
}
+ else {

Review comment:
move else block to above line and correct the formatting

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458011089

##########
File path: integration/spark/src/main/scala/org/apache/carbondata/view/MVManagerInSpark.scala
##########
@@ -48,17 +48,14 @@ class MVManagerInSpark(session: SparkSession) extends MVManager {

object MVManagerInSpark {

- private val MANAGER_MAP_BY_SESSION =
- new util.HashMap[SparkSession, MVManagerInSpark]()
+ private var viewManager: MVManagerInSpark = null

+ // returns single MVManager instance for all the current sessions.
def get(session: SparkSession): MVManagerInSpark = {
- var viewManager = MANAGER_MAP_BY_SESSION.get(session)
if (viewManager == null) {
- MANAGER_MAP_BY_SESSION.synchronized {
- viewManager = MANAGER_MAP_BY_SESSION.get(session)
+ this.synchronized {
if (viewManager == null) {
viewManager = new MVManagerInSpark(session)
- MANAGER_MAP_BY_SESSION.put(session, viewManager)
session.sparkContext.addSparkListener(new SparkListener {

Review comment:
If we have multiple sessions, only one session(i.e., the one which created MVManagerInSpark instance) is added to listeners. Others are not added.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458023893

##########
File path: integration/spark/src/main/scala/org/apache/carbondata/view/MVManagerInSpark.scala
##########
@@ -48,17 +48,14 @@ class MVManagerInSpark(session: SparkSession) extends MVManager {

object MVManagerInSpark {

- private val MANAGER_MAP_BY_SESSION =
- new util.HashMap[SparkSession, MVManagerInSpark]()
+ private var viewManager: MVManagerInSpark = null

+ // returns single MVManager instance for all the current sessions.

Review comment:
Probably MVManagerInSpark instance is maintained per SparkSession to support multi-tenency. Each user may have an independent and isolated set of databases/tables/conf etc sharing same sparkcontext. I think, we still need this isolation. right ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458023893

##########
File path: integration/spark/src/main/scala/org/apache/carbondata/view/MVManagerInSpark.scala
##########
@@ -48,17 +48,14 @@ class MVManagerInSpark(session: SparkSession) extends MVManager {

object MVManagerInSpark {

- private val MANAGER_MAP_BY_SESSION =
- new util.HashMap[SparkSession, MVManagerInSpark]()
+ private var viewManager: MVManagerInSpark = null

+ // returns single MVManager instance for all the current sessions.

Review comment:
Probably a MVManagerInSpark instance was maintained per SparkSession to support multi-tenency. Each user may have an independent and isolated set of databases/tables/conf etc sharing same sparkcontext. I believe, we still need this isolation.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458068490

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/MVCreateTestCase.scala
##########
@@ -955,9 +955,9 @@ class MVCreateTestCase extends QueryTest with BeforeAndAfterAll {
sql(" insert into mvtable1 select 'n4',12,12")
sql("update mvtable1 set(name) = ('updatedName')").show()
checkAnswer(sql("select count(*) from mvtable1 where name = 'updatedName'"),Seq(Row(4)))
+ sql(s"drop materialized view MV11")

Review comment:
Here, `drop materialized view` command is given after table drop and without `ifexists` check it fails.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458068693

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/view/MVTest.scala
##########
@@ -178,6 +180,13 @@ class MVTest extends QueryTest with BeforeAndAfterAll {
}
}

+ test("test drop mv must fail if not exists") {
+ val ex = intercept[MalformedMVCommandException] {
+ sql("drop materialized view MV11")

Review comment:
Done.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458094901

##########
File path: core/src/main/java/org/apache/carbondata/core/view/MVManager.java
##########
@@ -143,9 +143,9 @@ public void createSchema(String databaseName, MVSchema viewSchema)
/**
* Drops the mv schema from storage
*
- * @param viewName index name
+ * @param viewName mv name
*/
- public void deleteSchema(String databaseName, String viewName) throws IOException {
+ public synchronized void deleteSchema(String databaseName, String viewName) throws IOException {

Review comment:
added so that another thread could skip some steps. as this occurs only few times when multiple threads run, removing it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-661915225

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3457/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-661917823

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1715/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ShreelekhyaG commented on pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

In reply to this post by GitBox

ShreelekhyaG commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-662451331

retest this please

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458854322

##########
File path: integration/spark/src/main/scala/org/apache/carbondata/view/MVManagerInSpark.scala
##########
@@ -48,17 +48,14 @@ class MVManagerInSpark(session: SparkSession) extends MVManager {

object MVManagerInSpark {

- private val MANAGER_MAP_BY_SESSION =
- new util.HashMap[SparkSession, MVManagerInSpark]()
+ private var viewManager: MVManagerInSpark = null

+ // returns single MVManager instance for all the current sessions.

Review comment:
There should be only one MVManagerInSpark instance sharing the same sparkcontext. If not, in different spark sessions, the query will not hit MV.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458871503

##########
File path: integration/spark/src/main/scala/org/apache/carbondata/view/MVManagerInSpark.scala
##########
@@ -48,17 +48,14 @@ class MVManagerInSpark(session: SparkSession) extends MVManager {

object MVManagerInSpark {

- private val MANAGER_MAP_BY_SESSION =
- new util.HashMap[SparkSession, MVManagerInSpark]()
+ private var viewManager: MVManagerInSpark = null

+ // returns single MVManager instance for all the current sessions.
def get(session: SparkSession): MVManagerInSpark = {
- var viewManager = MANAGER_MAP_BY_SESSION.get(session)
if (viewManager == null) {
- MANAGER_MAP_BY_SESSION.synchronized {
- viewManager = MANAGER_MAP_BY_SESSION.get(session)
+ this.synchronized {
if (viewManager == null) {
viewManager = new MVManagerInSpark(session)
- MANAGER_MAP_BY_SESSION.put(session, viewManager)
session.sparkContext.addSparkListener(new SparkListener {

Review comment:
This is not needed for MV, removing it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

12