Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/2252 @ravipesala @kumarvishal09 @jackylk Based on @ravipesala 's advice, I changed the name of internal datatype from `text` to `varchar` Later I'll raise another PR to make it possible for user to directly use `varchar(size)` in DDL statement. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6259/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5097/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2252 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5229/ --- |
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2252#discussion_r194350793 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/chunk/impl/VariableLengthDimensionColumnPage.java --- @@ -30,21 +30,19 @@ /** * Constructor for this class - * @param dataChunks - * @param invertedIndex - * @param invertedIndexReverse - * @param numberOfRows */ public VariableLengthDimensionColumnPage(byte[] dataChunks, int[] invertedIndex, - int[] invertedIndexReverse, int numberOfRows) { + int[] invertedIndexReverse, int numberOfRows, boolean isVarcharEncoded) { --- End diff -- Its better to pass store type it self from the caller instead for boolean --- |
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2252#discussion_r194479546 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/safe/SafeVariableLengthDimensionDataChunkStore.java --- @@ -56,7 +57,8 @@ public SafeVariableLengthDimensionDataChunkStore(boolean isInvertedIndex, int nu * @param invertedIndexReverse inverted index reverse to be stored * @param data data to be stored */ - @Override public void putArray(final int[] invertedIndex, final int[] invertedIndexReverse, + @Override + public void putArray(final int[] invertedIndex, final int[] invertedIndexReverse, --- End diff -- @xuchuanyin In Case of varchar column I think we will not be able to store complete data in single byte array as it may overflow, in case of short it is fine as maximum it will be of 32000(max number of bytes in each value) * 32000(number of records in page) + 64000(length for each value in a page), so it will be 1024064000 value and as maximum u can create an array of integer max value so it is fine, but in varchar case length of each value can be maximum of Integer max it self so we will not be able to store in single byte array. This handling is required in both data loading and query. --- |
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2252#discussion_r194597446 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/chunk/impl/VariableLengthDimensionColumnPage.java --- @@ -30,21 +30,19 @@ /** * Constructor for this class - * @param dataChunks - * @param invertedIndex - * @param invertedIndexReverse - * @param numberOfRows */ public VariableLengthDimensionColumnPage(byte[] dataChunks, int[] invertedIndex, - int[] invertedIndexReverse, int numberOfRows) { + int[] invertedIndexReverse, int numberOfRows, boolean isVarcharEncoded) { --- End diff -- OK~ --- |
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2252#discussion_r194603907 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/safe/SafeVariableLengthDimensionDataChunkStore.java --- @@ -56,7 +57,8 @@ public SafeVariableLengthDimensionDataChunkStore(boolean isInvertedIndex, int nu * @param invertedIndexReverse inverted index reverse to be stored * @param data data to be stored */ - @Override public void putArray(final int[] invertedIndex, final int[] invertedIndexReverse, + @Override + public void putArray(final int[] invertedIndex, final int[] invertedIndexReverse, --- End diff -- @kumarvishal09 yeah, you are right. Considering that scenario will make the code much more complex. As we all know, a maximum of string/bytearray is about 2GB. I think it is enough for current scenarios. 2GB can support average 67104 bytes per column (2GB/32000) in one column page. Having discussed with @jackylk , we have the following conclusions: 1. Add an error message to indicate that the maximum size of one column page is 2GB. 2. We can reduce the row size in the page to support longer characters, for example if the content is 67104*2, user can reduce the row size from 32000 to 16000. 1 will be included in this PR and 2 will be implemented in another PR. Is that OK? @kumarvishal09 --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2252 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5265/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6306/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5144/ --- |
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/2252 retest it please --- |
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/2252 retest this please --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6325/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5163/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6338/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5176/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2252 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5288/ --- |
In reply to this post by qiuchenjian-2
Github user xuchuanyin closed the pull request at:
https://github.com/apache/carbondata/pull/2252 --- |
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/2252 I deleted the origin branch unexpectedly,so I raised another PR #2379 --- |
Free forum by Nabble | Edit this page |