GitHub user akashrn5 opened a pull request:
https://github.com/apache/carbondata/pull/2401 [CARBONDATA-2585][CARBONDATA-2586][Local Dictionary]Local dictionary support for alter table, preaggregate and varchar datatype What changes were proposed in this pull request? In this PR, local dictionary support is added for alter table, preaggregate and varChar datatype All the validations related to above features are taken care in this PR How was this patch tested? UTs and SDV test cases are added in the same PR You can merge this pull request into a Git repository by running: $ git pull https://github.com/akashrn5/incubator-carbondata local_dict_for_alter_preagg Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2401.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2401 ---- ---- --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2401 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6481/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2401 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5312/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2401 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6482/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2401 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5313/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2401 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5403/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2401 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5404/ --- |
In reply to this post by qiuchenjian-2
Github user akashrn5 commented on the issue:
https://github.com/apache/carbondata/pull/2401 retest this please --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2401 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6512/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2401 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5344/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2401 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6539/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2401 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5368/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2401 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5443/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2401 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6570/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2401 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5398/ --- |
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2401#discussion_r198387868 --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java --- @@ -1110,9 +1110,18 @@ private static void setLocalDictInfo(CarbonTable table, TableInfo tableInfo) { String localDictionaryThreshold = tableInfo.getFactTable().getTableProperties() .get(CarbonCommonConstants.LOCAL_DICTIONARY_THRESHOLD); if (null != isLocalDictionaryEnabled) { - table.setLocalDictionaryEnabled(Boolean.parseBoolean(isLocalDictionaryEnabled)); + // if table is altered then, for same key in table properties, multiple values will be --- End diff -- remove this code if unused --- |
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2401#discussion_r198389740 --- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java --- @@ -3067,44 +3067,56 @@ public static void setLocalDictColumnsToWrapperSchema(List<ColumnSchema> columns } // if complex column is defined in local dictionary include column, then get the child // columns and set the string datatype child type as local dictionary column - if (column.getNumberOfChild() > 0 && null != localDictIncludeColumnsOfMainTable) { - listOfDictionaryIncludeColumns = localDictIncludeColumnsOfMainTable.split(","); + if (column.getNumberOfChild() > 0 && null != localDictIncludeColumns) { + listOfDictionaryIncludeColumns = localDictIncludeColumns.trim().split("\\s*,\\s*"); --- End diff -- If possible move trim and split outside of loop as it output will be same and it is happening in loop which will impact the performance. --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2401#discussion_r198436867 --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java --- @@ -1110,9 +1110,18 @@ private static void setLocalDictInfo(CarbonTable table, TableInfo tableInfo) { String localDictionaryThreshold = tableInfo.getFactTable().getTableProperties() .get(CarbonCommonConstants.LOCAL_DICTIONARY_THRESHOLD); if (null != isLocalDictionaryEnabled) { - table.setLocalDictionaryEnabled(Boolean.parseBoolean(isLocalDictionaryEnabled)); + // if table is altered then, for same key in table properties, multiple values will be + // appended, so after splitting, get the first value for enable and threshold, and that will + // be latest configurations for the table for further local dictionary generation, because for + // alter table add operation, local dictionary enable and threshold configuration is not + // allowed + String[] isLocalDictEnabled = isLocalDictionaryEnabled.split(","); --- End diff -- Please don't handle the default property here. Better handle in parser like don't add default values if the flow through alter --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2401#discussion_r198440267 --- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java --- @@ -3041,24 +3041,30 @@ public static String getBlockId(AbsoluteTableIdentifier identifier, String fileP * @param mainTableProperties */ public static void setLocalDictColumnsToWrapperSchema(List<ColumnSchema> columns, - Map<String, String> mainTableProperties) { - String isLocalDictEnabledForMainTable = - mainTableProperties.get(CarbonCommonConstants.LOCAL_DICTIONARY_ENABLE); - String localDictIncludeColumnsOfMainTable = - mainTableProperties.get(CarbonCommonConstants.LOCAL_DICTIONARY_INCLUDE); - String localDictExcludeColumnsOfMainTable = - mainTableProperties.get(CarbonCommonConstants.LOCAL_DICTIONARY_EXCLUDE); + Map<String, String> mainTableProperties, String isLocalDictEnabledForMainTable) { String[] listOfDictionaryIncludeColumns = null; String[] listOfDictionaryExcludeColumns = null; + String localDictIncludeColumns = + mainTableProperties.get(CarbonCommonConstants.LOCAL_DICTIONARY_INCLUDE); + String localDictExcludeColumns = + mainTableProperties.get(CarbonCommonConstants.LOCAL_DICTIONARY_EXCLUDE); + if (null != localDictIncludeColumns) { + listOfDictionaryIncludeColumns = localDictIncludeColumns.trim().split("\\s*,\\s*"); --- End diff -- what is this split required for? --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2401#discussion_r198441426 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala --- @@ -615,4 +618,114 @@ object CarbonScalaUtil { sparkSession, schema)._1.asInstanceOf[Object] } + + /** + * this method validates the local dictionary columns configurations + * + * @param tableProperties + * @param localDictColumns + */ + def validateLocalDictionaryColumns(tableProperties: mutable.Map[String, String], + localDictColumns: Seq[String]): Unit = { + var dictIncludeColumns: Seq[String] = Seq[String]() + + // check if the duplicate columns are specified in table schema + if (localDictColumns.distinct.lengthCompare(localDictColumns.size) != 0) { + val duplicateColumns = (dictIncludeColumns ++ localDictColumns) --- End diff -- why adding empty Seq here? --- |
Free forum by Nabble | Edit this page |