GitHub user manishnalla1994 opened a pull request:
https://github.com/apache/carbondata/pull/2980 [CARBONDATA-3017] Map DDL Support Support Create DDL for Map type. This PR is dependent on PR#2979 for the change of delimiters. Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/manishnalla1994/carbondata MapDDL5Dec Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2980.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2980 ---- commit 322b52a64c840317a6905664e8de16327e3635e0 Author: Manish Nalla <manishnalla1994@...> Date: 2018-10-16T09:48:08Z MapDDLSupport commit 3d119888a80e7d8f9cab59e477984b56af1309f6 Author: manishnalla1994 <manish.nalla1994@...> Date: 2018-12-07T08:18:31Z Added Testcases and Local Dict Support commit 5fe06801360fc04bab9c1239ea8d007f37bc69d4 Author: manishnalla1994 <manish.nalla1994@...> Date: 2018-12-07T13:28:54Z Test Files for Map commit 4cc8ba13b234a13b9a3cef541e37f492153e7d1b Author: manishnalla1994 <manish.nalla1994@...> Date: 2018-12-07T14:44:12Z Changed TestCases and Supported 2 new delimiters ---- --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2980 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1657/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9917/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1869/ --- |
Github user qiuchenjian commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2980#discussion_r239991055 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/parser/impl/RowParserImpl.java --- @@ -34,8 +37,12 @@ private int numberOfColumns; public RowParserImpl(DataField[] output, CarbonDataLoadConfiguration configuration) { - String[] complexDelimiters = + String[] tempComplexDelimiters = (String[]) configuration.getDataLoadProperty(DataLoadProcessorConstants.COMPLEX_DELIMITERS); + Queue<String> complexDelimiters = new LinkedList<>(); + for (int i = 0; i < 4; i++) { --- End diff -- The "4" in "i < 4" is not clear; it's better to replace it with a named constant --- |
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2980#discussion_r240147303 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateDDLForComplexMapType.scala --- @@ -0,0 +1,452 @@ +/* + + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + * + http://www.apache.org/licenses/LICENSE-2.0 + * + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+ */ +package org.apache.carbondata.spark.testsuite.createTable.TestCreateDDLForComplexMapType + +import java.io.File +import java.util + +import org.apache.hadoop.conf.Configuration +import org.apache.spark.sql.{AnalysisException, Row} +import org.apache.spark.sql.test.util.QueryTest +import org.scalatest.BeforeAndAfterAll + +import org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk + +class TestCreateDDLForComplexMapType extends QueryTest with BeforeAndAfterAll { + private val conf: Configuration = new Configuration(false) + + val rootPath = new File(this.getClass.getResource("/").getPath + + "../../../..").getCanonicalPath + + val path = s"$rootPath/examples/spark2/src/main/resources/maptest2.csv" + + private def checkForLocalDictionary(dimensionRawColumnChunks: util + .List[DimensionRawColumnChunk]): Boolean = { + var isLocalDictionaryGenerated = false + import scala.collection.JavaConversions._ + for (dimensionRawColumnChunk <- dimensionRawColumnChunks) { + if (dimensionRawColumnChunk.getDataChunkV3 + .isSetLocal_dictionary) { + isLocalDictionaryGenerated = true + } --- End diff -- You can directly use scala filter operation to check for local dictionary --- |
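The existence check the reviewer points at can indeed be collapsed into a single filter/exists-style call instead of a mutable flag and a loop. Since the diffs in this thread are mostly Java, here is a hedged Java-streams sketch of the same idea; `DataChunk` and `isSetLocalDictionary` are stand-ins for the real `DimensionRawColumnChunk.getDataChunkV3().isSetLocal_dictionary()` chain, not CarbonData's actual API:

```java
import java.util.List;

// Sketch only: DataChunk stands in for DimensionRawColumnChunk, and
// isSetLocalDictionary() for getDataChunkV3().isSetLocal_dictionary().
public class LocalDictCheckSketch {
  static class DataChunk {
    private final boolean localDictionary;
    DataChunk(boolean localDictionary) { this.localDictionary = localDictionary; }
    boolean isSetLocalDictionary() { return localDictionary; }
  }

  // Equivalent of Scala's chunks.exists(_.isSetLocalDictionary):
  // short-circuits as soon as one chunk has a local dictionary.
  static boolean hasLocalDictionary(List<DataChunk> chunks) {
    return chunks.stream().anyMatch(DataChunk::isSetLocalDictionary);
  }
}
```

In the Scala test file itself this would be `dimensionRawColumnChunks.exists(...)` (or `filter(...).nonEmpty`), which removes both the `var` and the explicit loop.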
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2980#discussion_r240151744 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/model/LoadOption.java --- @@ -119,6 +119,10 @@ "complex_delimiter_level_2", Maps.getOrDefault(options, "complex_delimiter_level_2", ":")); + optionsFinal.put( + "complex_delimiter_level_3", + Maps.getOrDefault(options, "complex_delimiter_level_3", "\\\\003")); + --- End diff -- Remove hardcoding --- |
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2980#discussion_r240150644 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala --- @@ -188,11 +188,13 @@ case class CarbonLoadDataCommand( val carbonLoadModel = new CarbonLoadModel() val tableProperties = table.getTableInfo.getFactTable.getTableProperties val optionsFinal = LoadOption.fillOptionWithDefaultValue(options.asJava) + // These two delimiters are non configurable and hardcoded for map type + optionsFinal.put("complex_delimiter_level_3", "\003") + optionsFinal.put("complex_delimiter_level_4", "\004") --- End diff -- Remove hardcoded values. Create one `ComplexDelimiterEnum` and use the values from there --- |
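The suggested `ComplexDelimiterEnum` might look like the following sketch. The enum name comes from the comment above; the constant names are assumptions, and the values ("$", ":", "\u0003", "\u0004") are taken from the delimiter defaults visible in the diffs elsewhere in this thread:

```java
// Hypothetical sketch of the suggested enum; constant names and values
// are assumptions based on the delimiter defaults seen in this PR.
public enum ComplexDelimiterEnum {
  COMPLEX_DELIMITER_LEVEL_1("$"),
  COMPLEX_DELIMITER_LEVEL_2(":"),
  COMPLEX_DELIMITER_LEVEL_3("\u0003"),
  COMPLEX_DELIMITER_LEVEL_4("\u0004");

  private final String value;

  ComplexDelimiterEnum(String value) {
    this.value = value;
  }

  public String value() {
    return value;
  }
}
```

Call sites such as `CarbonLoadDataCommand` could then write `optionsFinal.put("complex_delimiter_level_3", ComplexDelimiterEnum.COMPLEX_DELIMITER_LEVEL_3.value())` instead of repeating the literal.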
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2980#discussion_r240152242 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/parser/CarbonParserFactory.java --- @@ -51,23 +54,37 @@ public static GenericParser createParser(CarbonColumn carbonColumn, String[] com * delimiters * @return GenericParser */ - private static GenericParser createParser(CarbonColumn carbonColumn, String[] complexDelimiters, + private static GenericParser createParser(CarbonColumn carbonColumn, + Queue<String> complexDelimiters, String nullFormat, int depth) { + if (depth > 2) { + return null; --- End diff -- I think we should throw an exception with a proper error message if depth is more than 2 --- |
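The point of the comment is to fail fast with a descriptive message rather than return `null` and surface a `NullPointerException` later. A minimal sketch of that validation, with a made-up method name and a generic exception type standing in for whatever loading exception CarbonData actually uses:

```java
// Sketch only: validateDepth and the exception choice are assumptions,
// not CarbonData's real API.
public class ComplexTypeDepthCheck {
  static final int MAX_COMPLEX_NESTING_DEPTH = 2;

  // Called before recursing into a child parser; rejects over-deep nesting
  // with an actionable message instead of returning null.
  static void validateDepth(int depth, String columnName) {
    if (depth > MAX_COMPLEX_NESTING_DEPTH) {
      throw new IllegalArgumentException(
          "Complex type nesting deeper than " + MAX_COMPLEX_NESTING_DEPTH
              + " levels is not supported for column: " + columnName);
    }
  }
}
```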
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2980#discussion_r240151175 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/DataLoadProcessBuilder.java --- @@ -222,8 +222,8 @@ public static CarbonDataLoadConfiguration createConfiguration(CarbonLoadModel lo configuration.setSegmentId(loadModel.getSegmentId()); configuration.setTaskNo(loadModel.getTaskNo()); configuration.setDataLoadProperty(DataLoadProcessorConstants.COMPLEX_DELIMITERS, - new String[] { loadModel.getComplexDelimiterLevel1(), - loadModel.getComplexDelimiterLevel2() }); + new String[] { loadModel.getComplexDelimiterLevel1(), loadModel.getComplexDelimiterLevel2(), + loadModel.getComplexDelimiterLevel3(), loadModel.getComplexDelimiterLevel4() }); --- End diff -- Instead of adding so many delimiter methods in the load model, create a list or array of complex delimiters in the load model and then use it here --- |
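The shape the reviewer asks for, one growable collection instead of four `getComplexDelimiterLevelN` accessors, could look roughly like this; the class and method names are illustrative, not the real `CarbonLoadModel` API:

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of keeping complex delimiters in one list on the load model
// instead of four separate level-N fields. Names are assumptions.
public class LoadModelDelimitersSketch {
  private final List<String> complexDelimiters = new ArrayList<>();

  public void addComplexDelimiter(String delimiter) {
    complexDelimiters.add(delimiter);
  }

  // Callers such as DataLoadProcessBuilder can consume the whole array
  // without knowing how many delimiter levels were configured.
  public String[] getComplexDelimiters() {
    return complexDelimiters.toArray(new String[0]);
  }
}
```

`DataLoadProcessBuilder` would then pass `loadModel.getComplexDelimiters()` straight through, and adding a fifth level later would not change any signatures.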
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2980#discussion_r240145684 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java --- @@ -338,12 +338,15 @@ public static CarbonLoadModel getLoadModel(Configuration conf) throws IOExceptio SKIP_EMPTY_LINE, carbonProperty.getProperty(CarbonLoadOptionConstants.CARBON_OPTIONS_SKIP_EMPTY_LINE))); - String complexDelim = conf.get(COMPLEX_DELIMITERS, "$" + "," + ":"); + String complexDelim = conf.get(COMPLEX_DELIMITERS, "$" + "," + ":" + "," + "\\\\003"); String[] split = complexDelim.split(","); model.setComplexDelimiterLevel1(split[0]); if (split.length > 1) { model.setComplexDelimiterLevel2(split[1]); } + if (split.length > 2) { + model.setComplexDelimiterLevel3(split[2]); + } --- End diff -- Modify the above code as below `if (split.length > 2) { model.setComplexDelimiterLevel2(split[1]); model.setComplexDelimiterLevel3(split[2]); } else if (split.length > 1) { model.setComplexDelimiterLevel2(split[1]); }` --- |
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2980#discussion_r240126484 --- Diff: examples/spark2/src/main/resources/maptest2.csv --- @@ -0,0 +1,2 @@ +1\002Nalla\0012\002Singh\0011\002Gupta\0014\002Kumar +10\002Nallaa\00120\002Sissngh\001100\002Gusspta\00140\002Kumar --- End diff -- Move the above newly added files to spark-common-test --- |
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2980#discussion_r240153125 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/parser/impl/RowParserImpl.java --- @@ -34,8 +37,12 @@ private int numberOfColumns; public RowParserImpl(DataField[] output, CarbonDataLoadConfiguration configuration) { - String[] complexDelimiters = + String[] tempComplexDelimiters = (String[]) configuration.getDataLoadProperty(DataLoadProcessorConstants.COMPLEX_DELIMITERS); + Queue<String> complexDelimiters = new LinkedList<>(); + for (int i = 0; i < 4; i++) { --- End diff -- Avoid using hardcoded values like `4` in the loops while coding....instead iterate over tempComplexDelimiters length --- |
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2980#discussion_r240126217 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java --- @@ -338,12 +338,15 @@ public static CarbonLoadModel getLoadModel(Configuration conf) throws IOExceptio SKIP_EMPTY_LINE, carbonProperty.getProperty(CarbonLoadOptionConstants.CARBON_OPTIONS_SKIP_EMPTY_LINE))); - String complexDelim = conf.get(COMPLEX_DELIMITERS, "$" + "," + ":"); + String complexDelim = conf.get(COMPLEX_DELIMITERS, "$" + "," + ":" + "," + "\\\\003"); --- End diff -- Move all the default delimiters to constants and use them everywhere --- |
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2980#discussion_r240151368 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/model/CarbonLoadModel.java --- @@ -631,7 +651,7 @@ public void setFactTimeStamp(long factTimeStamp) { } public String[] getDelimiters() { - return new String[] { complexDelimiterLevel1, complexDelimiterLevel2 }; + return new String[] { complexDelimiterLevel1, complexDelimiterLevel2, complexDelimiterLevel3 }; --- End diff -- Modify this class and related changes in other classes as per the above comment --- |
Github user manishnalla1994 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2980#discussion_r240591596 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/parser/impl/RowParserImpl.java --- @@ -34,8 +37,12 @@ private int numberOfColumns; public RowParserImpl(DataField[] output, CarbonDataLoadConfiguration configuration) { - String[] complexDelimiters = + String[] tempComplexDelimiters = (String[]) configuration.getDataLoadProperty(DataLoadProcessorConstants.COMPLEX_DELIMITERS); + Queue<String> complexDelimiters = new LinkedList<>(); + for (int i = 0; i < 4; i++) { --- End diff -- Done. --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1705/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9965/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1915/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1916/ --- |