[GitHub] [carbondata] ShreelekhyaG opened a new pull request #3841: [WIP] : drop materialized view when executed concurrently from 4 concurrent c…

classic Classic list List threaded Threaded
28 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG opened a new pull request #3841: [WIP] : drop materialized view when executed concurrently from 4 concurrent c…

GitBox

ShreelekhyaG opened a new pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841


    ### Why is this PR needed?
   drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.
   
    ### What changes were proposed in this PR?
   made view manager as a single instance for all sessions, so that changes executed from one session gets reflected in others.
       
    ### Does this PR introduce any user interface change?
    - No
    - Yes. (please explain the change and update document)
   
    ### Is any new testcase added?
    - No
   
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [WIP] : drop materialized view when executed concurrently from 4 concurrent c…

GitBox

CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-657739220


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1641/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [WIP] : drop materialized view when executed concurrently from 4 concurrent c…

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-657739506


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3382/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [WIP] : drop materialized view when executed concurrently from 4 concurrent c…

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-658045205


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1644/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [WIP] : drop materialized view when executed concurrently from 4 concurrent c…

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-658045347


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3385/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on pull request #3841: [WIP] : drop materialized view when executed concurrently from 4 concurrent c…

GitBox
In reply to this post by GitBox

Indhumathi27 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-658118242


   @ShreelekhyaG Please create a jira and update the PR


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent c…

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-660054214


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1680/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent c…

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-660061656


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3423/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

GitBox
In reply to this post by GitBox

akashrn5 commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r457843327



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/view/MVTest.scala
##########
@@ -178,6 +180,13 @@ class MVTest extends QueryTest with BeforeAndAfterAll {
     }
   }
 
+  test("test drop mv must fail if not exists") {
+    val ex = intercept[MalformedMVCommandException] {
+      sql("drop materialized view MV11")

Review comment:
       please give meaningful MV name based on test scenario

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/MVCreateTestCase.scala
##########
@@ -955,9 +955,9 @@ class MVCreateTestCase extends QueryTest with BeforeAndAfterAll {
     sql(" insert into mvtable1 select 'n4',12,12")
     sql("update mvtable1 set(name) = ('updatedName')").show()
     checkAnswer(sql("select count(*) from mvtable1 where name = 'updatedName'"),Seq(Row(4)))
+    sql(s"drop materialized view MV11")

Review comment:
       why this file change required, please revert if not required, i cant see any valid changes.

##########
File path: core/src/main/java/org/apache/carbondata/core/view/MVManager.java
##########
@@ -143,9 +143,9 @@ public void createSchema(String databaseName, MVSchema viewSchema)
   /**
    * Drops the mv schema from storage
    *
-   * @param viewName index name
+   * @param viewName mv name
    */
-  public void deleteSchema(String databaseName, String viewName) throws IOException {
+  public synchronized void deleteSchema(String databaseName, String viewName) throws IOException {

Review comment:
       why we need `synchronized` here? As i can see, if two people are dropping same, then one will be successful and other will get exception as MV does not exists. Any special case or scenario you observed?

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/view/CarbonDropMVCommand.scala
##########
@@ -91,6 +92,14 @@ case class CarbonDropMVCommand(
 
         this.dropTableCommand = dropTableCommand
       }
+      else {

Review comment:
       move else block to above line and correct the formatting




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

GitBox
In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458011089



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/view/MVManagerInSpark.scala
##########
@@ -48,17 +48,14 @@ class MVManagerInSpark(session: SparkSession) extends MVManager {
 
 object MVManagerInSpark {
 
-  private val MANAGER_MAP_BY_SESSION =
-    new util.HashMap[SparkSession, MVManagerInSpark]()
+  private var viewManager: MVManagerInSpark = null
 
+  // returns single MVManager instance for all the current sessions.
   def get(session: SparkSession): MVManagerInSpark = {
-    var viewManager = MANAGER_MAP_BY_SESSION.get(session)
     if (viewManager == null) {
-      MANAGER_MAP_BY_SESSION.synchronized {
-        viewManager = MANAGER_MAP_BY_SESSION.get(session)
+      this.synchronized {
         if (viewManager == null) {
           viewManager = new MVManagerInSpark(session)
-          MANAGER_MAP_BY_SESSION.put(session, viewManager)
           session.sparkContext.addSparkListener(new SparkListener {

Review comment:
       If we have multiple sessions, only one session(i.e., the one which created MVManagerInSpark instance) is added to listeners. Others are not added.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

GitBox
In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458023893



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/view/MVManagerInSpark.scala
##########
@@ -48,17 +48,14 @@ class MVManagerInSpark(session: SparkSession) extends MVManager {
 
 object MVManagerInSpark {
 
-  private val MANAGER_MAP_BY_SESSION =
-    new util.HashMap[SparkSession, MVManagerInSpark]()
+  private var viewManager: MVManagerInSpark = null
 
+  // returns single MVManager instance for all the current sessions.

Review comment:
       Probably MVManagerInSpark instance is maintained per SparkSession to support multi-tenency.  Each user may have an independent and isolated set of databases/tables/conf etc sharing same sparkcontext. I think, we still need this isolation. right ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

GitBox
In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458023893



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/view/MVManagerInSpark.scala
##########
@@ -48,17 +48,14 @@ class MVManagerInSpark(session: SparkSession) extends MVManager {
 
 object MVManagerInSpark {
 
-  private val MANAGER_MAP_BY_SESSION =
-    new util.HashMap[SparkSession, MVManagerInSpark]()
+  private var viewManager: MVManagerInSpark = null
 
+  // returns single MVManager instance for all the current sessions.

Review comment:
       Probably a MVManagerInSpark instance was maintained per SparkSession to support multi-tenency.  Each user may have an independent and isolated set of databases/tables/conf etc sharing same sparkcontext. I believe, we still need this isolation.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458068490



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/MVCreateTestCase.scala
##########
@@ -955,9 +955,9 @@ class MVCreateTestCase extends QueryTest with BeforeAndAfterAll {
     sql(" insert into mvtable1 select 'n4',12,12")
     sql("update mvtable1 set(name) = ('updatedName')").show()
     checkAnswer(sql("select count(*) from mvtable1 where name = 'updatedName'"),Seq(Row(4)))
+    sql(s"drop materialized view MV11")

Review comment:
       Here, `drop materialized view` command is given after table drop and without `ifexists` check it fails.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458068693



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/view/MVTest.scala
##########
@@ -178,6 +180,13 @@ class MVTest extends QueryTest with BeforeAndAfterAll {
     }
   }
 
+  test("test drop mv must fail if not exists") {
+    val ex = intercept[MalformedMVCommandException] {
+      sql("drop materialized view MV11")

Review comment:
       Done.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458094901



##########
File path: core/src/main/java/org/apache/carbondata/core/view/MVManager.java
##########
@@ -143,9 +143,9 @@ public void createSchema(String databaseName, MVSchema viewSchema)
   /**
    * Drops the mv schema from storage
    *
-   * @param viewName index name
+   * @param viewName mv name
    */
-  public void deleteSchema(String databaseName, String viewName) throws IOException {
+  public synchronized void deleteSchema(String databaseName, String viewName) throws IOException {

Review comment:
       added so that another thread could skip some steps. as this occurs only few times when multiple threads run, removing it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-661915225


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3457/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-661917823


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1715/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-662451331


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458854322



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/view/MVManagerInSpark.scala
##########
@@ -48,17 +48,14 @@ class MVManagerInSpark(session: SparkSession) extends MVManager {
 
 object MVManagerInSpark {
 
-  private val MANAGER_MAP_BY_SESSION =
-    new util.HashMap[SparkSession, MVManagerInSpark]()
+  private var viewManager: MVManagerInSpark = null
 
+  // returns single MVManager instance for all the current sessions.

Review comment:
       There should be only one MVManagerInSpark instance sharing the same sparkcontext. If not, in different spark sessions, the query will not hit MV.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458871503



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/view/MVManagerInSpark.scala
##########
@@ -48,17 +48,14 @@ class MVManagerInSpark(session: SparkSession) extends MVManager {
 
 object MVManagerInSpark {
 
-  private val MANAGER_MAP_BY_SESSION =
-    new util.HashMap[SparkSession, MVManagerInSpark]()
+  private var viewManager: MVManagerInSpark = null
 
+  // returns single MVManager instance for all the current sessions.
   def get(session: SparkSession): MVManagerInSpark = {
-    var viewManager = MANAGER_MAP_BY_SESSION.get(session)
     if (viewManager == null) {
-      MANAGER_MAP_BY_SESSION.synchronized {
-        viewManager = MANAGER_MAP_BY_SESSION.get(session)
+      this.synchronized {
         if (viewManager == null) {
           viewManager = new MVManagerInSpark(session)
-          MANAGER_MAP_BY_SESSION.put(session, viewManager)
           session.sparkContext.addSparkListener(new SparkListener {

Review comment:
       This is not needed for MV, removing it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


12