GitHub user QiangCai opened a pull request:
https://github.com/apache/carbondata/pull/2544

[WIP][CarbonStore] Support ingesting data from DIS

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

 - [ ] Any interfaces changed?
 - [ ] Any backward compatibility impacted?
 - [ ] Document update required?
 - [ ] Testing done
       Please provide details on
       - Whether new unit test cases have been added or why no new tests are required?
       - How it is tested? Please attach test report.
       - Is it a performance related change? Please attach the performance test report.
       - Any additional information to help reviewers in testing this change.
 - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/QiangCai/carbondata support_dis

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2544.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2544

----
commit 0adf03ba5f666a79c73a73ef1b9bb1e34dfee814
Author: QiangCai <qiangcai@...>
Date:   2018-07-19T06:50:38Z

    fix task locality issue dependency

commit 0f0f6be778a10cd8d8443bb13455bb4883d7823b
Author: QiangCai <qiangcai@...>
Date:   2018-07-24T03:18:59Z

    support ingesting data from DIS

----

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2544

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7424/

---
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2544#discussion_r204966885

--- Diff: pom.xml ---
@@ -110,7 +110,7 @@
 <properties>
   <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
   <snappy.version>1.1.2.6</snappy.version>
-  <hadoop.version>2.7.2</hadoop.version>
+  <hadoop.version>2.8.3</hadoop.version>
--- End diff --

can we directly upgrade the hadoop version to 2.8.3 without any other changes?

---
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2544#discussion_r204966779

--- Diff: store/sql/pom.xml ---
@@ -35,6 +36,33 @@
       </exclusion>
     </exclusions>
   </dependency>
+  <dependency>
+    <groupId>org.apache.hadoop</groupId>
+    <artifactId>hadoop-aws</artifactId>
+    <version>${hadoop.version}</version>
+    <exclusions>
+      <exclusion>
+        <groupId>com.fasterxml.jackson.core</groupId>
+        <artifactId>*</artifactId>
+      </exclusion>
+    </exclusions>
+  </dependency>
+  <dependency>
+    <groupId>com.amazonaws</groupId>
+    <artifactId>aws-java-sdk-s3</artifactId>
+    <version> 1.10.6</version>
--- End diff --

trim the useless spaces

---
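(Presumably the fix being requested is just dropping the leading space inside the tag, i.e. <version>1.10.6</version>.)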
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2544#discussion_r204967361

--- Diff: store/sql/src/main/java/org/apache/carbondata/dis/DisProducer.java ---
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.dis;
+
+import java.nio.ByteBuffer;
+import java.nio.charset.Charset;
+import java.text.SimpleDateFormat;
+import java.util.ArrayList;
+import java.util.Date;
+import java.util.List;
+import java.util.Random;
+import java.util.Timer;
+import java.util.TimerTask;
+import java.util.concurrent.ThreadLocalRandom;
+import java.util.concurrent.atomic.AtomicLong;
+
+import org.apache.carbondata.common.logging.LogService;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+
+import com.huaweicloud.dis.DIS;
+import com.huaweicloud.dis.DISClientBuilder;
+import com.huaweicloud.dis.exception.DISClientException;
+import com.huaweicloud.dis.http.exception.ResourceAccessException;
+import com.huaweicloud.dis.iface.data.request.PutRecordsRequest;
+import com.huaweicloud.dis.iface.data.request.PutRecordsRequestEntry;
+import com.huaweicloud.dis.iface.data.response.PutRecordsResult;
+import com.huaweicloud.dis.iface.data.response.PutRecordsResultEntry;
+
+public class DisProducer {
+
+  private static AtomicLong eventId = new AtomicLong(0);
+
+  private static final LogService LOGGER =
+      LogServiceFactory.getLogService(DisProducer.class.getName());
+
+  public static void main(String[] args) {
+    if (args.length < 6) {
+      System.err.println(
+          "Usage: DisProducer <stream name> <endpoint> <region> <ak> <sk> <project id> ");
+      return;
+    }
+
+    DIS dic = DISClientBuilder.standard().withEndpoint(args[1]).withAk(args[3]).withSk(args[4])
+        .withProjectId(args[5]).withRegion(args[2]).build();
+
+    Sensor sensor = new Sensor(dic, args[0]);
+    Timer timer = new Timer();
+    timer.schedule(sensor, 0, 5000);
+
+  }
+
+  static class Sensor extends TimerTask {
+    private DIS dic;
+
+    private String streamName;
+
+    private SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
+
+    private Random random = new Random();
+
+    private int i = 0;
+    private int flag = 1;
+
+    Sensor(DIS dic, String streamName) {
+      this.dic = dic;
+      this.streamName = streamName;
+    }
+
+    @Override public void run() {
+      uploadData();
+      //recordSensor();
+    }
+
+    private void uploadData() {
+      PutRecordsRequest putRecordsRequest = new PutRecordsRequest();
+      putRecordsRequest.setStreamName(streamName);
+      List<PutRecordsRequestEntry> putRecordsRequestEntryList = new ArrayList<>();
+      PutRecordsRequestEntry putRecordsRequestEntry = new PutRecordsRequestEntry();
+      putRecordsRequestEntry.setData(ByteBuffer.wrap(recordSensor()));
+      putRecordsRequestEntry
+          .setPartitionKey(String.valueOf(ThreadLocalRandom.current().nextInt(1000000)));
+      putRecordsRequestEntryList.add(putRecordsRequestEntry);
+      putRecordsRequest.setRecords(putRecordsRequestEntryList);
+
+      LOGGER.info("========== BEGIN PUT ============");
+
+      PutRecordsResult putRecordsResult = null;
+      try {
+        putRecordsResult = dic.putRecords(putRecordsRequest);
+      } catch (DISClientException e) {
+        LOGGER.error(e,
+            "Failed to get a normal response, please check params and retry." + e.getMessage());
+      } catch (ResourceAccessException e) {
+        LOGGER.error(e, "Failed to access endpoint. " + e.getMessage());
+      } catch (Exception e) {
+        LOGGER.error(e, e.getMessage());
+      }
+
+      if (putRecordsResult != null) {
+        LOGGER.info("Put " + putRecordsResult.getRecords().size() + " records[" + (
+            putRecordsResult.getRecords().size() - putRecordsResult.getFailedRecordCount().get())
+            + " successful / " + putRecordsResult.getFailedRecordCount() + " failed].");
+
+        for (int j = 0; j < putRecordsResult.getRecords().size(); j++) {
+          PutRecordsResultEntry putRecordsRequestEntry1 = putRecordsResult.getRecords().get(j);
+          if (putRecordsRequestEntry1.getErrorCode() != null) {
+            LOGGER.error("[" + new String(
+                putRecordsRequestEntryList.get(j).getData().array(), Charset.defaultCharset())
+                + "] put failed, errorCode [" + putRecordsRequestEntry1.getErrorCode()
+                + "], errorMessage [" + putRecordsRequestEntry1.getErrorMessage() + "]");
+          } else {
+            LOGGER.info("[" + new String(
--- End diff --

why not use stringbuilder or stringformat

---
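(For illustration only: a minimal sketch of the rewrite the reviewer is suggesting, using String.format in place of the chained '+' concatenation. It is a fragment meant to drop into the result loop above, and variable names follow the quoted diff; the success-branch message is truncated in the quote, so its text below is a placeholder.)

    // Hypothetical String.format version of the logging inside the for-loop above.
    PutRecordsResultEntry entry = putRecordsResult.getRecords().get(j);
    String payload = new String(
        putRecordsRequestEntryList.get(j).getData().array(), Charset.defaultCharset());
    if (entry.getErrorCode() != null) {
      // Same message as in the diff, built via String.format instead of concatenation.
      LOGGER.error(String.format("[%s] put failed, errorCode [%s], errorMessage [%s]",
          payload, entry.getErrorCode(), entry.getErrorMessage()));
    } else {
      // The original success message is cut off in the quoted diff; placeholder text here.
      LOGGER.info(String.format("[%s] put succeeded", payload));
    }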
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/2544

please rebase

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2544

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7539/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2544

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6293/

---
Github user QiangCai commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2544#discussion_r205995493

--- Diff: store/sql/src/main/java/org/apache/carbondata/dis/DisProducer.java ---
(quotes the same DisProducer.java hunk shown in the review comment above)
--- End diff --

ok

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2544

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7594/

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2544

SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6057/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2544

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7595/

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2544

SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6058/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2544

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6368/

---
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/2544

merging into carbonstore branch

---