kunal642 opened a new pull request #3581: WIP: Reduced an HDFS call and listing of tables in refresh command
URL: https://github.com/apache/carbondata/pull/3581

### Why is this PR needed?

### What changes were proposed in this PR?

### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)

### Is any new testcase added?
- No
- Yes

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

With regards,
Apache Git Services
CarbonDataQA1 commented on issue #3581: WIP: Reduced an HDFS call and listing of tables in refresh command
URL: https://github.com/apache/carbondata/pull/3581#issuecomment-575053854 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1660/
kumarvishal09 commented on issue #3581: WIP: Reduced an HDFS call and listing of tables in refresh command
URL: https://github.com/apache/carbondata/pull/3581#issuecomment-575460318 @kunal642 pls fix the ci failure
kumarvishal09 commented on issue #3581: WIP: Reduced an HDFS call and listing of tables in refresh command
URL: https://github.com/apache/carbondata/pull/3581#issuecomment-575462633 @kunal642 Please add PR detail description
kunal642 commented on issue #3581: WIP: Reduced an HDFS call and listing of tables in refresh command
URL: https://github.com/apache/carbondata/pull/3581#issuecomment-575467554 retest this please
CarbonDataQA1 commented on issue #3581: WIP: Reduced an HDFS call and listing of tables in refresh command
URL: https://github.com/apache/carbondata/pull/3581#issuecomment-575472992 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1674/
CarbonDataQA1 commented on issue #3581: [CARBONDATA-3666] Reduced an HDFS call and listing of tables in refresh command
URL: https://github.com/apache/carbondata/pull/3581#issuecomment-575510796 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1675/
kunal642 commented on issue #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#issuecomment-575530349 @kumarvishal09 CI Passed
jackylk commented on a change in pull request #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#discussion_r368263868

##########
File path: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/RefreshCarbonTableCommand.scala
##########
@@ -63,12 +63,20 @@ case class RefreshCarbonTableCommand(
     // then do the below steps
     // 2.2.1 validate that all the aggregate tables are copied at the store location.
     // 2.2.2 Register the aggregate tables
-    val tablePath = CarbonEnv.getTablePath(databaseNameOp, tableName.toLowerCase)(sparkSession)
-    val identifier = AbsoluteTableIdentifier.from(tablePath, databaseName, tableName.toLowerCase)
     // 2.1 check if the table already register with hive then ignore and continue with the next
     // schema
-    if (!sparkSession.sessionState.catalog.listTables(databaseName)
-      .exists(_.table.equalsIgnoreCase(tableName))) {
+    val provider = try {
+      sparkSession.sessionState.catalog
+        .getTableMetadata(TableIdentifier(tableName, databaseNameOp)).provider
+    } catch {
+      case _: NoSuchTableException =>
+        None
+    }
+    if (provider.isEmpty ||
+      provider.get.equalsIgnoreCase("org.apache.spark.sql.CarbonSource") ||

Review comment:
This check is repeated in many places, which is not clean. Can you make a util function and use it in all of those places?
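As an illustration of the kind of util function requested in the review comment, here is a minimal sketch. It is not the code that was merged: the object name `ProviderCheckSketch` and both method names are hypothetical, and the provider is modelled as a plain `Option[String]` so the snippet runs without Spark. Only the provider string and the visible part of the condition come from the diff; the rest of the `if` condition is cut off in the quoted hunk and is deliberately not reproduced.

```scala
// Hypothetical helper object; names are illustrative, not from CarbonData.
object ProviderCheckSketch {

  // Provider class name taken from the diff under review.
  val CarbonSourceProvider: String = "org.apache.spark.sql.CarbonSource"

  // True when the provider is defined and names the Carbon source,
  // compared case-insensitively as in the diff.
  def isCarbonSource(provider: Option[String]): Boolean =
    provider.exists(_.equalsIgnoreCase(CarbonSourceProvider))

  // Mirrors only the visible part of the condition in the diff:
  // the table is unknown to the catalog (None) or uses the Carbon provider.
  def isMissingOrCarbonSource(provider: Option[String]): Boolean =
    provider.isEmpty || isCarbonSource(provider)
}
```

A caller would pass in whatever the catalog lookup returned, with `None` standing for the `NoSuchTableException` case shown in the diff, so the string comparison lives in one place instead of being repeated at each call site.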
kunal642 commented on a change in pull request #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#discussion_r368274021

##########
File path: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/RefreshCarbonTableCommand.scala
##########
@@ -63,12 +63,20 @@ case class RefreshCarbonTableCommand(
     // then do the below steps
     // 2.2.1 validate that all the aggregate tables are copied at the store location.
     // 2.2.2 Register the aggregate tables
-    val tablePath = CarbonEnv.getTablePath(databaseNameOp, tableName.toLowerCase)(sparkSession)
-    val identifier = AbsoluteTableIdentifier.from(tablePath, databaseName, tableName.toLowerCase)
     // 2.1 check if the table already register with hive then ignore and continue with the next
     // schema
-    if (!sparkSession.sessionState.catalog.listTables(databaseName)
-      .exists(_.table.equalsIgnoreCase(tableName))) {
+    val provider = try {
+      sparkSession.sessionState.catalog
+        .getTableMetadata(TableIdentifier(tableName, databaseNameOp)).provider
+    } catch {
+      case _: NoSuchTableException =>
+        None
+    }
+    if (provider.isEmpty ||
+      provider.get.equalsIgnoreCase("org.apache.spark.sql.CarbonSource") ||

Review comment:
ok
xuchuanyin commented on a change in pull request #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#discussion_r368288557

##########
File path: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/RefreshCarbonTableCommand.scala
##########
@@ -63,12 +63,20 @@ case class RefreshCarbonTableCommand(
     // then do the below steps
     // 2.2.1 validate that all the aggregate tables are copied at the store location.
     // 2.2.2 Register the aggregate tables
-    val tablePath = CarbonEnv.getTablePath(databaseNameOp, tableName.toLowerCase)(sparkSession)

Review comment:
The above comments are outdated and should be updated to match your modification.
kunal642 commented on a change in pull request #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#discussion_r368375832

##########
File path: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/RefreshCarbonTableCommand.scala
##########
@@ -63,12 +63,20 @@ case class RefreshCarbonTableCommand(
     // then do the below steps
     // 2.2.1 validate that all the aggregate tables are copied at the store location.
     // 2.2.2 Register the aggregate tables
-    val tablePath = CarbonEnv.getTablePath(databaseNameOp, tableName.toLowerCase)(sparkSession)

Review comment:
done
kunal642 commented on a change in pull request #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#discussion_r368375838

##########
File path: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/RefreshCarbonTableCommand.scala
##########
@@ -63,12 +63,20 @@ case class RefreshCarbonTableCommand(
     // then do the below steps
     // 2.2.1 validate that all the aggregate tables are copied at the store location.
     // 2.2.2 Register the aggregate tables
-    val tablePath = CarbonEnv.getTablePath(databaseNameOp, tableName.toLowerCase)(sparkSession)
-    val identifier = AbsoluteTableIdentifier.from(tablePath, databaseName, tableName.toLowerCase)
     // 2.1 check if the table already register with hive then ignore and continue with the next
     // schema
-    if (!sparkSession.sessionState.catalog.listTables(databaseName)
-      .exists(_.table.equalsIgnoreCase(tableName))) {
+    val provider = try {
+      sparkSession.sessionState.catalog
+        .getTableMetadata(TableIdentifier(tableName, databaseNameOp)).provider
+    } catch {
+      case _: NoSuchTableException =>
+        None
+    }
+    if (provider.isEmpty ||
+      provider.get.equalsIgnoreCase("org.apache.spark.sql.CarbonSource") ||

Review comment:
done
CarbonDataQA1 commented on issue #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#issuecomment-576114924 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1695/
CarbonDataQA1 commented on issue #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#issuecomment-576118520 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1696/
CarbonDataQA1 commented on issue #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#issuecomment-576233640 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1703/
jackylk commented on issue #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#issuecomment-576991766 LGTM
asfgit closed pull request #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581