MarvinLitt opened a new pull request #3855: URL: https://github.com/apache/carbondata/pull/3855 ### Why is this PR needed? ### What changes were proposed in this PR? ### Does this PR introduce any user interface change? - No - Yes. (please explain the change and update document) ### Is any new testcase added? - No - Yes ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
CarbonDataQA1 commented on pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#issuecomment-661143216 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1699/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#issuecomment-661143881 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3441/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
vikramahuja1001 commented on pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#issuecomment-662251240 @MarvinLitt , please add proper description as to why this PR is needed and what changes are proposed ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
MarvinLitt commented on pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#issuecomment-662774014 > @MarvinLitt , please add proper description as to why this PR is needed and what changes are proposed done ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
kunal642 commented on a change in pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#discussion_r459579617 ########## File path: integration/spark/src/main/scala/org/apache/carbondata/indexserver/IndexServer.scala ########## @@ -316,4 +324,17 @@ object IndexServer extends ServerInterface { Array(new Service("security.indexserver.protocol.acl", classOf[ServerInterface])) } } + + def startAgingFolders(): Unit = { + val runnable = new Runnable() { + def run() { + val age = System.currentTimeMillis() - agePeriod.toLong + CarbonUtil.agingTempFolderForIndexServer(age) + LOGGER.info(s"Complete age temp folder ${CarbonUtil.getIndexServerTempPath}") + } + } + val ags: ScheduledExecutorService = Executors.newSingleThreadScheduledExecutor + ags.scheduleAtFixedRate(runnable, 1000, 3600000, TimeUnit.MICROSECONDS) + LOGGER.info("index server temp folders aging thread start") Review comment: Please add this log inside the thread so that we can know each time the deletion thread was executed ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
kunal642 commented on a change in pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#discussion_r459584046 ########## File path: integration/spark/src/main/scala/org/apache/carbondata/indexserver/IndexServer.scala ########## @@ -316,4 +324,17 @@ object IndexServer extends ServerInterface { Array(new Service("security.indexserver.protocol.acl", classOf[ServerInterface])) } } + + def startAgingFolders(): Unit = { + val runnable = new Runnable() { + def run() { + val age = System.currentTimeMillis() - agePeriod.toLong + CarbonUtil.agingTempFolderForIndexServer(age) + LOGGER.info(s"Complete age temp folder ${CarbonUtil.getIndexServerTempPath}") + } + } + val ags: ScheduledExecutorService = Executors.newSingleThreadScheduledExecutor + ags.scheduleAtFixedRate(runnable, 1000, 3600000, TimeUnit.MICROSECONDS) Review comment: Why are you scheduling this after 3s.. I think this scheduling rate is too fast. Better to execute this thread at a fixed delay of 1 hour. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
kunal642 commented on a change in pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#discussion_r459584942 ########## File path: integration/spark/src/main/scala/org/apache/carbondata/indexserver/IndexServer.scala ########## @@ -316,4 +324,17 @@ object IndexServer extends ServerInterface { Array(new Service("security.indexserver.protocol.acl", classOf[ServerInterface])) } } + + def startAgingFolders(): Unit = { Review comment: Please rename this method to something like "scheduleCleanUpThread" ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
kunal642 commented on a change in pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#discussion_r459588452 ########## File path: integration/spark/src/main/scala/org/apache/carbondata/indexserver/IndexJobs.scala ########## @@ -69,6 +69,12 @@ class DistributedIndexJob extends AbstractIndexJob { if (null != splitFolderPath && !splitFolderPath.deleteFile()) { LOGGER.error("Problem while deleting the temp directory:" + splitFolderPath.getAbsolutePath) + } else { + // if the path build with getQueryId already exists, + // the splitFolderPath should be null, need delete + if (null == splitFolderPath) { + CarbonUtil.deleteTempFolderForIndexServer(indexFormat.getQueryId) Review comment: We are already deleting the query id folders in this PR, why is deleting the "/tmp/indexservertmp" required then?? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
kunal642 commented on a change in pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#discussion_r459589936 ########## File path: core/src/main/java/org/apache/carbondata/core/datastore/filesystem/LocalCarbonFile.java ########## @@ -485,4 +486,30 @@ public boolean equals(Object o) { public int hashCode() { return Objects.hash(file.getAbsolutePath()); } + + @Override + public List<CarbonFile> listDirs() throws IOException { + if (!file.isDirectory()) { + return new ArrayList<CarbonFile>(); + } + Collection<File> fileCollection = FileUtils.listFilesAndDirs(file, + DirectoryFileFilter.DIRECTORY, null); + if (fileCollection.isEmpty()) { + return new ArrayList<CarbonFile>(); + } + List<CarbonFile> carbonFiles = new ArrayList<CarbonFile>(); + for (File file : fileCollection) { + if (file.isDirectory()) { + File[] files = file.listFiles(); Review comment: Why list files is needed again? Are you trying to list recursive? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
MarvinLitt commented on a change in pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#discussion_r460346175 ########## File path: integration/spark/src/main/scala/org/apache/carbondata/indexserver/IndexJobs.scala ########## @@ -69,6 +69,12 @@ class DistributedIndexJob extends AbstractIndexJob { if (null != splitFolderPath && !splitFolderPath.deleteFile()) { LOGGER.error("Problem while deleting the temp directory:" + splitFolderPath.getAbsolutePath) + } else { + // if the path build with getQueryId already exists, + // the splitFolderPath should be null, need delete + if (null == splitFolderPath) { + CarbonUtil.deleteTempFolderForIndexServer(indexFormat.getQueryId) Review comment: if the /tmp/indexservertmp/gropid has exists then the splitFolderPath will return null. before our code do not deal with that scene。 ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
MarvinLitt commented on a change in pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#discussion_r460346175 ########## File path: integration/spark/src/main/scala/org/apache/carbondata/indexserver/IndexJobs.scala ########## @@ -69,6 +69,12 @@ class DistributedIndexJob extends AbstractIndexJob { if (null != splitFolderPath && !splitFolderPath.deleteFile()) { LOGGER.error("Problem while deleting the temp directory:" + splitFolderPath.getAbsolutePath) + } else { + // if the path build with getQueryId already exists, + // the splitFolderPath should be null, need delete + if (null == splitFolderPath) { + CarbonUtil.deleteTempFolderForIndexServer(indexFormat.getQueryId) Review comment: if the /tmp/indexservertmp/gropid has exists then the splitFolderPath will return null. before our code do not deal with that scene。 found this issue by test case. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
MarvinLitt commented on a change in pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#discussion_r460346573 ########## File path: integration/spark/src/main/scala/org/apache/carbondata/indexserver/IndexServer.scala ########## @@ -316,4 +324,17 @@ object IndexServer extends ServerInterface { Array(new Service("security.indexserver.protocol.acl", classOf[ServerInterface])) } } + + def startAgingFolders(): Unit = { + val runnable = new Runnable() { + def run() { + val age = System.currentTimeMillis() - agePeriod.toLong + CarbonUtil.agingTempFolderForIndexServer(age) + LOGGER.info(s"Complete age temp folder ${CarbonUtil.getIndexServerTempPath}") + } + } + val ags: ScheduledExecutorService = Executors.newSingleThreadScheduledExecutor + ags.scheduleAtFixedRate(runnable, 1000, 3600000, TimeUnit.MICROSECONDS) Review comment: delay 1 hour may be to late, if something wrong happens, 1hour is enough to cause a relatively large impcact。 may be we can delay for 5 min, what do you think? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
MarvinLitt commented on a change in pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#discussion_r460346871 ########## File path: core/src/main/java/org/apache/carbondata/core/datastore/filesystem/LocalCarbonFile.java ########## @@ -485,4 +486,30 @@ public boolean equals(Object o) { public int hashCode() { return Objects.hash(file.getAbsolutePath()); } + + @Override + public List<CarbonFile> listDirs() throws IOException { + if (!file.isDirectory()) { + return new ArrayList<CarbonFile>(); + } + Collection<File> fileCollection = FileUtils.listFilesAndDirs(file, + DirectoryFileFilter.DIRECTORY, null); + if (fileCollection.isEmpty()) { + return new ArrayList<CarbonFile>(); + } + List<CarbonFile> carbonFiles = new ArrayList<CarbonFile>(); + for (File file : fileCollection) { + if (file.isDirectory()) { + File[] files = file.listFiles(); Review comment: try to find all folders from /tmp/indexservertmp. no list recursive, just find the direct directory under /tmp/indexservertmp. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
MarvinLitt commented on a change in pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#discussion_r460666952 ########## File path: integration/spark/src/main/scala/org/apache/carbondata/indexserver/IndexServer.scala ########## @@ -316,4 +324,17 @@ object IndexServer extends ServerInterface { Array(new Service("security.indexserver.protocol.acl", classOf[ServerInterface])) } } + + def startAgingFolders(): Unit = { + val runnable = new Runnable() { + def run() { + val age = System.currentTimeMillis() - agePeriod.toLong + CarbonUtil.agingTempFolderForIndexServer(age) + LOGGER.info(s"Complete age temp folder ${CarbonUtil.getIndexServerTempPath}") + } + } + val ags: ScheduledExecutorService = Executors.newSingleThreadScheduledExecutor + ags.scheduleAtFixedRate(runnable, 1000, 3600000, TimeUnit.MICROSECONDS) + LOGGER.info("index server temp folders aging thread start") Review comment: under run func there already has logs. def run() { val age = System.currentTimeMillis() - agePeriod.toLong CarbonUtil.agingTempFolderForIndexServer(age) LOGGER.info(s"Complete age temp folder ${CarbonUtil.getIndexServerTempPath}") } ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
MarvinLitt commented on a change in pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#discussion_r460667653 ########## File path: integration/spark/src/main/scala/org/apache/carbondata/indexserver/IndexServer.scala ########## @@ -316,4 +324,17 @@ object IndexServer extends ServerInterface { Array(new Service("security.indexserver.protocol.acl", classOf[ServerInterface])) } } + + def startAgingFolders(): Unit = { Review comment: done ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
MarvinLitt commented on a change in pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#discussion_r460675486 ########## File path: integration/spark/src/main/scala/org/apache/carbondata/indexserver/IndexServer.scala ########## @@ -316,4 +324,17 @@ object IndexServer extends ServerInterface { Array(new Service("security.indexserver.protocol.acl", classOf[ServerInterface])) } } + + def startAgingFolders(): Unit = { + val runnable = new Runnable() { + def run() { + val age = System.currentTimeMillis() - agePeriod.toLong + CarbonUtil.agingTempFolderForIndexServer(age) + LOGGER.info(s"Complete age temp folder ${CarbonUtil.getIndexServerTempPath}") + } + } + val ags: ScheduledExecutorService = Executors.newSingleThreadScheduledExecutor + ags.scheduleAtFixedRate(runnable, 1000, 3600000, TimeUnit.MICROSECONDS) Review comment: the rate is 3 hours, about the delay time, i thinks it is ok, delay 1s or delay 5min or delay 1hour the effect is almost the same. The test cases are covered here. If there is too much delay, the execution of test cases will be affected. so kunal is there no need to modify the delay here? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#issuecomment-664237649 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1757/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#issuecomment-664239278 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3499/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
MarvinLitt commented on pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#issuecomment-666257057 @kunal642 @vikramahuja1001 please check this pr ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
Free forum by Nabble | Edit this page |