[ https://issues.apache.org/jira/browse/CARBONDATA-3037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xubo245 updated CARBONDATA-3037:
--------------------------------
    Description:

## Introduce

When reading data from S3 with the CarbonData SDK, an exception is thrown.

## Problem

{code:java}
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" com.amazonaws.AmazonClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
	at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:454)
	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
	at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:976)
	at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:956)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:892)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:77)
	at org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.<init>(AbstractDFSCarbonFile.java:75)
	at org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.<init>(AbstractDFSCarbonFile.java:66)
	at org.apache.carbondata.core.datastore.filesystem.HDFSCarbonFile.<init>(HDFSCarbonFile.java:41)
	at org.apache.carbondata.core.datastore.filesystem.S3CarbonFile.<init>(S3CarbonFile.java:41)
	at org.apache.carbondata.core.datastore.impl.DefaultFileTypeProvider.getCarbonFile(DefaultFileTypeProvider.java:53)
	at org.apache.carbondata.core.datastore.impl.FileFactory.getCarbonFile(FileFactory.java:99)
	at org.apache.carbondata.core.util.path.CarbonTablePath.getActualSchemaFilePath(CarbonTablePath.java:183)
	at org.apache.carbondata.core.util.path.CarbonTablePath.getSchemaFilePath(CarbonTablePath.java:178)
	at org.apache.carbondata.core.metadata.schema.SchemaReader.readCarbonTableFromStore(SchemaReader.java:41)
	at org.apache.carbondata.core.metadata.schema.table.CarbonTable.buildFromTablePath(CarbonTable.java:288)
	at org.apache.carbondata.core.datamap.DataMapStoreManager.getCarbonTable(DataMapStoreManager.java:496)
	at org.apache.carbondata.core.datamap.DataMapStoreManager.clearDataMaps(DataMapStoreManager.java:460)
	at org.apache.carbondata.sdk.file.CarbonReaderBuilder.build(CarbonReaderBuilder.java:180)
	at org.apache.carbondata.examples.sdk.SDKS3ReadExample.main(SDKS3ReadExample.java:67)
Caused by: org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
	at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:232)
	at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:199)
	at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70)
	at com.amazonaws.http.conn.$Proxy7.getConnection(Unknown Source)
	at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:456)
	at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
	at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
	at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:384)
	... 20 more

Process finished with exit code 1
{code}

## Analysis

The default value of fs.s3a.connection.maximum is 15. When the 16th file is read, a ConnectionPoolTimeoutException is thrown because no free connection is left in the pool: leased connections are never released.
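The pool-exhaustion behaviour can be illustrated with a self-contained sketch (no S3 involved). The names `BoundedPool` and `lease` are illustrative, not Hadoop or AWS SDK API; the pool is modelled by a `java.util.concurrent.Semaphore`, so the lease that exceeds the limit times out exactly the way the 16th file read does when nothing is released:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Minimal model of a bounded HTTP connection pool: leasing a connection
// acquires a permit. If permits are never released, the pool starves,
// just like fs.s3a.connection.maximum = 15 starves on the 16th file.
public class BoundedPoolDemo {

    static final class BoundedPool {
        private final Semaphore permits;

        BoundedPool(int maxConnections) {
            this.permits = new Semaphore(maxConnections);
        }

        // Returns true if a connection was leased within timeoutMillis,
        // false on timeout (the analogue of ConnectionPoolTimeoutException).
        boolean lease(long timeoutMillis) throws InterruptedException {
            return permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        }

        void release() {
            permits.release();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BoundedPool pool = new BoundedPool(2); // think: fs.s3a.connection.maximum = 2

        System.out.println(pool.lease(100)); // true  - 1st lease succeeds
        System.out.println(pool.lease(100)); // true  - 2nd lease succeeds
        System.out.println(pool.lease(100)); // false - 3rd lease times out: nothing was released

        pool.release();                      // return one connection to the pool
        System.out.println(pool.lease(100)); // true  - a permit is free again
    }
}
```

This is why simply raising the limit only postpones the failure: with any finite maximum, enough unreleased leases will eventually exhaust the pool.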
The limit is applied in org.apache.hadoop.fs.s3a.S3AFileSystem#initialize (decompiled, truncated):

{code:java}
AWSCredentialsProviderChain credentials = new AWSCredentialsProviderChain(
    new AWSCredentialsProvider[]{
        new BasicAWSCredentialsProvider(accessKey, secretKey),
        new InstanceProfileCredentialsProvider(),
        new AnonymousAWSCredentialsProvider()});
this.bucket = name.getHost();
ClientConfiguration awsConf = new ClientConfiguration();
awsConf.setMaxConnections(conf.getInt("fs.s3a.connection.maximum", 15));
boolean secureConnections = conf.getBoolean("fs.s3a.connection.ssl.enabled", true);
awsConf.setProtocol(secureConnections ? Protocol.HTTPS : Protocol.HTTP);
awsConf.setMaxErrorRetry(conf.getInt("fs.s3a.attempts.maximum", 10));
awsConf.setConnectionTimeout(conf.getInt("fs.s3a.connection.establish.timeout", 50000));
awsConf.setSock
{code}

## Solution

1. Temporary solution

Increase fs.s3a.connection.maximum in the Hadoop configuration passed to the reader, for example:

{code:java}
Configuration configuration = new Configuration();
configuration.set(ACCESS_KEY, args[0]);
configuration.set(SECRET_KEY, args[1]);
configuration.set(ENDPOINT, args[2]);
configuration.set("fs.s3a.connection.maximum", "166");

CarbonReader reader = CarbonReader
    .builder(path, "_temp")
    .withHadoopConf(configuration)
    .build();
{code}

2. Final solution

Release the connections back to the pool after use.
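The "final solution" (release the connection) amounts to making every lease explicitly scoped. A hedged sketch, not CarbonData's actual implementation: the names `ScopedPool` and `Lease` are invented for illustration, and the point is that an `AutoCloseable` lease used with try-with-resources guarantees the permit returns to the pool even when a read fails.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Sketch of the "release the connection" fix: each lease is AutoCloseable,
// so try-with-resources always returns the permit to the pool.
// ScopedPool/Lease are illustrative names, not CarbonData or Hadoop API.
public class ScopedPoolDemo {

    static final class ScopedPool {
        private final Semaphore permits;

        ScopedPool(int maxConnections) {
            this.permits = new Semaphore(maxConnections);
        }

        Lease lease(long timeoutMillis) throws InterruptedException {
            if (!permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("Timeout waiting for connection from pool");
            }
            return new Lease();
        }

        int available() {
            return permits.availablePermits();
        }

        final class Lease implements AutoCloseable {
            @Override
            public void close() {
                permits.release(); // always runs, even if reading the file failed
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ScopedPool pool = new ScopedPool(15); // default fs.s3a.connection.maximum

        // With scoped leases, reading more than 15 files no longer starves the pool:
        for (int file = 0; file < 20; file++) {
            try (ScopedPool.Lease lease = pool.lease(100)) {
                // ... read one file over the leased connection ...
            }
        }
        System.out.println(pool.available()); // prints 15: every lease was returned
    }
}
```

With this pattern the temporary workaround of raising fs.s3a.connection.maximum becomes unnecessary, because the number of concurrently held connections is bounded by the number of in-flight reads rather than by the total number of files.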
> Throw ConnectionPoolTimeoutException when carbondata SDK read data from S3
> --------------------------------------------------------------------------
>
>                 Key: CARBONDATA-3037
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3037
>             Project: CarbonData
>          Issue Type: Improvement
>    Affects Versions: 1.5.0
>            Reporter: xubo245
>            Assignee: xubo245
>            Priority: Major

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)