GitHub user manishgupta88 opened a pull request:
https://github.com/apache/carbondata/pull/1147

[WIP][CARBONDATA-1277] Dictionary generation failure if there is failure in closing output stream in HDFS

Analysis: If closing the output stream of the dictionary file in HDFS fails, dictionary generation fails on the next data load, update, or insert into operation. This is because the dictionary file is opened in append mode, and when we try to get the output stream for that file, HDFS throws an exception saying that the lease is already acquired by some other client.

Fix: Recover the lease through CarbonData code if the exception is caused by a lease failure.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/carbondata hdfs_lease_recovery_exception

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1147.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1147

----
commit 6ffcf18068b914a86bb83241300c7d8e7ba44c07
Author: manishgupta88 <[hidden email]>
Date: 2017-07-08T10:16:25Z

    Problem: Dictionary generation failure if there is failure in closing output stream in HDFS

    Analysis: If closing the output stream of the dictionary file in HDFS fails, dictionary generation fails on the next data load, update, or insert into operation. This is because the dictionary file is opened in append mode, and when we try to get the output stream for that file, HDFS throws an exception saying that the lease is already acquired by some other client.

    Fix: Recover the lease through CarbonData code if the exception is caused by a lease failure.
----

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA.
---
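The flow described in the Fix line can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the pull request: the AppendOpener and LeaseRecoverer interfaces and the openForAppend method are hypothetical stand-ins for the real CarbonData and HDFS calls, and the "Failed to APPEND_FILE" substring is taken from the message check in the diff quoted later in this thread.

```java
import java.io.IOException;

/** Illustrative sketch only; none of these types are CarbonData or Hadoop APIs. */
final class AppendWithLeaseRecoverySketch {

  /** Hypothetical stand-in for opening a dictionary file in append mode. */
  interface AppendOpener {
    String open(String filePath) throws IOException;
  }

  /** Hypothetical stand-in for a lease recovery call on the file system. */
  interface LeaseRecoverer {
    boolean recover(String filePath) throws IOException;
  }

  /** Returns true when the message looks like an append failure caused by a held lease. */
  static boolean isLeaseFailure(String message) {
    return message != null && message.contains("Failed to APPEND_FILE");
  }

  /**
   * Tries to open the file for append. If that fails with a lease-related
   * message, recovers the lease once and retries the open; any other
   * failure is rethrown unchanged.
   */
  static String openForAppend(String filePath, AppendOpener opener, LeaseRecoverer recoverer)
      throws IOException {
    try {
      return opener.open(filePath);
    } catch (IOException e) {
      if (!isLeaseFailure(e.getMessage())) {
        throw e;
      }
      if (!recoverer.recover(filePath)) {
        throw new IOException("Could not recover lease on " + filePath, e);
      }
      return opener.open(filePath);
    }
  }

  public static void main(String[] args) throws IOException {
    int[] calls = {0};
    // Simulated opener: the first call fails as if a stale lease were held.
    AppendOpener flaky = path -> {
      if (calls[0]++ == 0) {
        throw new IOException("Failed to APPEND_FILE " + path);
      }
      return "stream:" + path;
    };
    System.out.println(openForAppend("/tmp/dict", flaky, path -> true));
  }
}
```

In this sketch the open is retried only once after a successful recovery; whether the actual change retries more often is not stated in the description above.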
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1147
Can one of the admins verify this patch?

Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1147
Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/370/

Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1147
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2958/

Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1147
Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/371/

Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1147
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2959/

Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1147#discussion_r126415422

--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -1287,6 +1287,12 @@
   public static final String CARBON_BAD_RECORDS_ACTION_DEFAULT = "FORCE";
+  @CarbonProperty
+  public static final String CARBON_LEASE_RECOVERY_RETRY_COUNT =
+      "carbon.lease.recovery.retry.count";
+  public static final String CARBON_LEASE_RECOVERY_RETRY_INTERVAL =
--- End diff --

Add the @CarbonProperty attribute for this one also.

Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1147#discussion_r126418899

--- Diff: core/src/main/java/org/apache/carbondata/core/util/path/HDFSUtils.java ---
@@ -0,0 +1,188 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util.path;
+
+import java.io.FileNotFoundException;
+import java.io.IOException;
+
+import org.apache.carbondata.common.logging.LogService;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.util.CarbonProperties;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hdfs.DistributedFileSystem;
+import org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException;
+
+/**
+ * Implementation for HDFS utility methods
+ */
+public class HDFSUtils {
--- End diff --

Make it hdfsLeaseUtils

Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1147#discussion_r126419166

--- Diff: core/src/main/java/org/apache/carbondata/core/util/path/HDFSUtils.java ---
@@ -0,0 +1,188 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util.path;
+
+import java.io.FileNotFoundException;
+import java.io.IOException;
+
+import org.apache.carbondata.common.logging.LogService;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.util.CarbonProperties;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hdfs.DistributedFileSystem;
+import org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException;
+
+/**
+ * Implementation for HDFS utility methods
+ */
+public class HDFSUtils {
+
+  private static final int CARBON_LEASE_RECOVERY_RETRY_COUNT_MIN = 1;
+  private static final int CARBON_LEASE_RECOVERY_RETRY_COUNT_MAX = 5;
--- End diff --

Make max 50
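The MIN/MAX/DEFAULT constants in the diff above suggest that the configured retry count is validated against a range. The validation code itself is not shown in this thread, so the following is only a sketch of one way such a bounded lookup could work. The boundedInt helper is hypothetical, java.util.Properties stands in for CarbonProperties, and falling back to the default on an out-of-range or non-numeric value is an assumption. The bounds used in main follow the reviewer's suggested maximum of 50.

```java
import java.util.Properties;

/** Illustrative sketch only; not the validation code from the pull request. */
final class LeaseRetryConfigSketch {

  /**
   * Reads an integer property and returns it only if it lies within
   * [min, max]. A missing, non-numeric, or out-of-range value yields
   * the default (an assumption made for this sketch).
   */
  static int boundedInt(Properties props, String key, int min, int max, int defaultValue) {
    String raw = props.getProperty(key);
    if (raw == null) {
      return defaultValue;
    }
    try {
      int value = Integer.parseInt(raw.trim());
      return (value >= min && value <= max) ? value : defaultValue;
    } catch (NumberFormatException e) {
      return defaultValue;
    }
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    // Property key taken from the CarbonCommonConstants diff in this thread.
    props.setProperty("carbon.lease.recovery.retry.count", "10");
    int retryCount = boundedInt(props, "carbon.lease.recovery.retry.count", 1, 50, 3);
    System.out.println("retry count = " + retryCount);
  }
}
```

Returning the default instead of clamping to the nearest bound keeps a misconfigured value from silently turning into the maximum; either choice would be defensible.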

Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1147#discussion_r126420489

--- Diff: core/src/main/java/org/apache/carbondata/core/util/path/HDFSUtils.java ---
@@ -0,0 +1,188 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util.path;
+
+import java.io.FileNotFoundException;
+import java.io.IOException;
+
+import org.apache.carbondata.common.logging.LogService;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.util.CarbonProperties;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hdfs.DistributedFileSystem;
+import org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException;
+
+/**
+ * Implementation for HDFS utility methods
+ */
+public class HDFSUtils {
+
+  private static final int CARBON_LEASE_RECOVERY_RETRY_COUNT_MIN = 1;
+  private static final int CARBON_LEASE_RECOVERY_RETRY_COUNT_MAX = 5;
+  private static final String CARBON_LEASE_RECOVERY_RETRY_COUNT_DEFAULT = "3";
+  private static final int CARBON_LEASE_RECOVERY_RETRY_INTERVAL_MIN = 100;
+  private static final int CARBON_LEASE_RECOVERY_RETRY_INTERVAL_MAX = 10000;
+  private static final String CARBON_LEASE_RECOVERY_RETRY_INTERVAL_DEFAULT = "1000";
+
+  /**
+   * LOGGER
+   */
+  private static final LogService LOGGER =
+      LogServiceFactory.getLogService(HDFSUtils.class.getName());
+
+  /**
+   * This method will validate whether the exception thrown if for lease recovery from HDFS
+   *
+   * @param message
+   * @return
+   */
+  public static boolean checkExceptionMessageForLeaseRecovery(String message) {
+    // depending on the scenario few more cases can be added for validating lease recovery exception
+    if (null != message && message.contains("Failed to APPEND_FILE")) {
+      return true;
+    }
+    return false;
+  }
+
+  /**
+   * This method will make attempts to recover lease on a file using the
+   * distributed file system utility.
+   *
+   * @param filePath
+   * @return
+   * @throws IOException
+   */
+  public static boolean recoverFileLease(String filePath) throws IOException {
+    LOGGER.info("Trying to recover lease on file: " + filePath);
+    FileFactory.FileType fileType = FileFactory.getFileType(filePath);
+    switch (fileType) {
+      case ALLUXIO:
+      case HDFS:
+      case VIEWFS:
+        DistributedFileSystem dfs = null;
+        Path path = FileFactory.getPath(filePath);
+        FileSystem fs = FileFactory.getFileSystem(path);
+        dfs = (DistributedFileSystem) fs;
+        int maxAttempts = getLeaseRecoveryRetryCount();
+        int retryInterval = getLeaseRecoveryRetryInterval();
+        boolean leaseRecovered = false;
+        IOException ioException = null;
+        for (int retryCount = 1; retryCount <= maxAttempts; retryCount++) {
+          try {
+            leaseRecovered = dfs.recoverLease(path);
--- End diff --

Check the viewfs lease recovery mechanism.
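The quoted diff stops at the recoverLease call, so the rest of the retry loop is not visible in this thread. As a rough sketch of how a bounded retry with a fixed interval might proceed, under stated assumptions: the LeaseRecoverer interface and recoverWithRetry method below are hypothetical, the recoverer stands in for dfs.recoverLease(path), and rethrowing the failure from the final attempt is a design choice made for this sketch rather than something the pull request confirms.

```java
import java.io.IOException;

/** Illustrative sketch only; not the loop from the pull request. */
final class LeaseRecoveryRetrySketch {

  /** Hypothetical stand-in for a call such as dfs.recoverLease(path). */
  interface LeaseRecoverer {
    boolean recover(String filePath) throws IOException;
  }

  /**
   * Attempts lease recovery up to maxAttempts times, sleeping
   * retryIntervalMillis between attempts. Returns true as soon as one
   * attempt succeeds. If the most recent attempt threw, that exception
   * is rethrown once the loop ends; otherwise false is returned.
   */
  static boolean recoverWithRetry(String filePath, LeaseRecoverer recoverer,
      int maxAttempts, long retryIntervalMillis) throws IOException {
    IOException lastFailure = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        if (recoverer.recover(filePath)) {
          return true;
        }
        lastFailure = null; // this attempt ended without an exception
      } catch (IOException e) {
        lastFailure = e;
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(retryIntervalMillis);
        } catch (InterruptedException ie) {
          // Preserve the interrupt status and stop retrying.
          Thread.currentThread().interrupt();
          break;
        }
      }
    }
    if (lastFailure != null) {
      throw lastFailure;
    }
    return false;
  }

  public static void main(String[] args) throws IOException {
    int[] calls = {0};
    // Simulated recoverer: reports success on the third attempt.
    boolean recovered = recoverWithRetry("/tmp/dict", path -> ++calls[0] >= 3, 5, 10L);
    System.out.println("recovered=" + recovered + " after " + calls[0] + " attempts");
  }
}
```

Because the cast to DistributedFileSystem in the diff applies to every listed file type, the reviewer's question about viewfs remains relevant to any real implementation; this sketch does not address it.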

Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1147
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2999/

Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1147
Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/410/

Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1147
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3001/

Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1147
Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/412/

Github user gvramana commented on the issue:
https://github.com/apache/carbondata/pull/1147
LGTM

Github user asfgit closed the pull request at:
https://github.com/apache/carbondata/pull/1147