Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2654
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/482/
---
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2654
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/330/
---
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2654
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/507/
---
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2654
Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8577/
---
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on the issue:
https://github.com/apache/carbondata/pull/2654
LGTM
---
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/2654
Seriously? Have you checked this PR against a legacy store? @kevinjmh tested it locally days ago and raised this problem but didn't get any feedback.
---
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on the issue:
https://github.com/apache/carbondata/pull/2654
I ran a test on a table with a bloom datamap created before applying this PR, then queried it after this PR was merged, and the answer is not correct. Can you check it?

Procedure to reproduce:
- switch to the master code from before this PR was merged
- create a table with a no-dict measure column (set the measure column as a sort column)
- create a bloom datamap on the measure column
- load some data into the table
- query on the measure column and record the result
- switch to the code from after this PR was merged
- run the same query and compare the results
---
In reply to this post by qiuchenjian-2
Github user dhatchayani commented on the issue:
https://github.com/apache/carbondata/pull/2654
> @dhatchayani What about the legacy store?
> For example, for the non-dict-primitive column, in the old store the BloomFilter datamap stores the bytes and during query we convert the value to bytes, but in the new store during query we convert it to a primitive object, which will cause a mismatch.

In the legacy store it is stored as bytes; in the new store it is stored as a primitive object. When it is retrieved for the query, however, the query result is unified to bytes only.
---
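The mismatch discussed above can be illustrated with a minimal sketch in plain Java. Everything here is an assumption made for illustration: a `HashSet` stands in for the bloom filter, and `legacyKey` / `newKey` are hypothetical encodings, not CarbonData's actual storage formats. The only point is that an index built from one byte form cannot be probed with a different byte form of the same value.

```java
import java.nio.ByteBuffer;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration only; none of these names come from CarbonData. */
class EncodingMismatchSketch {

    // Assumed "legacy" key form: the int written as 4 big-endian bytes.
    static ByteBuffer legacyKey(int value) {
        return ByteBuffer.wrap(ByteBuffer.allocate(4).putInt(value).array());
    }

    // Assumed "new" key form: the same int widened and written as 8 bytes.
    static ByteBuffer newKey(int value) {
        return ByteBuffer.wrap(ByteBuffer.allocate(8).putLong(value).array());
    }

    // Stand-in for an index that was built by the legacy writer.
    static Set<ByteBuffer> buildLegacyIndex(int... values) {
        Set<ByteBuffer> index = new HashSet<>();
        for (int v : values) {
            index.add(legacyKey(v));
        }
        return index;
    }

    // Probe using the same encoding the index was built with.
    static boolean probeWithLegacyKey(Set<ByteBuffer> index, int value) {
        return index.contains(legacyKey(value));
    }

    // Probe using a different encoding of the same logical value.
    static boolean probeWithNewKey(Set<ByteBuffer> index, int value) {
        return index.contains(newKey(value));
    }
}
```

With an index built from `1, 2, 3`, probing for `2` succeeds with the legacy key form and misses with the new one. In an index used for pruning, a miss of this kind would skip data that actually contains the value, which is one way a query could return an incorrect answer.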
In reply to this post by qiuchenjian-2
Github user dhatchayani commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2654#discussion_r218669311

--- Diff: integration/spark2/src/main/scala/org/apache/carbondata/datamap/IndexDataMapRebuildRDD.scala ---
    @@ -264,8 +264,17 @@ class RawBytesReadSupport(segmentProperties: SegmentProperties, indexColumns: Ar
           rtn(i) = if (indexCol2IdxInDictArray.contains(col.getColName)) {
             surrogatKeys(indexCol2IdxInDictArray(col.getColName)).toInt.asInstanceOf[Integer]
           } else if (indexCol2IdxInNoDictArray.contains(col.getColName)) {
    -        data(0).asInstanceOf[ByteArrayWrapper].getNoDictionaryKeyByIndex(
    +        val bytes = data(0).asInstanceOf[ByteArrayWrapper].getNoDictionaryKeyByIndex(
               indexCol2IdxInNoDictArray(col.getColName))
    +        // no dictionary primitive columns are expected to be in original data while loading,
    +        // so convert it to original data
    +        if (DataTypeUtil.isPrimitiveColumn(col.getDataType)) {
    +          val dataFromBytes = DataTypeUtil
    +            .getDataBasedOnDataTypeForNoDictionaryColumn(bytes, col.getDataType)
    +          dataFromBytes
--- End diff --

I think measure null and no-dictionary null values are different. Can you please give me a scenario that falls into the no-dictionary null case?
---
In reply to this post by qiuchenjian-2
Github user dhatchayani commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2654#discussion_r218669857

--- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java ---
    @@ -331,8 +332,18 @@ private BloomQueryModel buildQueryModelInternal(CarbonColumn carbonColumn,
           // for dictionary/date columns, convert the surrogate key to bytes
           internalFilterValue = CarbonUtil.getValueAsBytes(DataTypes.INT, convertedValue);
         } else {
    -      // for non dictionary dimensions, is already bytes,
    -      internalFilterValue = (byte[]) convertedValue;
    +      // for non dictionary dimensions, numeric columns will be of original data,
    +      // so convert the data to bytes
    +      if (DataTypeUtil.isPrimitiveColumn(carbonColumn.getDataType())) {
    +        if (convertedValue == null) {
    +          convertedValue = DataConvertUtil.getNullValueForMeasure(carbonColumn.getDataType(),
    +              carbonColumn.getColumnSchema().getScale());
    +        }
    +        internalFilterValue =
    +            CarbonUtil.getValueAsBytes(carbonColumn.getDataType(), convertedValue);
--- End diff --

> I ran a test on table with bloom datamap created before applying this PR, and query it after this PR merged, but the answer is not correct. Can you check it?
>
> Procedure to reproduce:
>
> * switch master code before this PR merged
> * create table with no-dict measure column (set the measure column as sort column)
> * create bloom datamap on the measure column
> * load some data into table
> * query on the measure column, get a result
> * switch to code after this PR merged
> * do the same query and compare the result

I will check this issue and update asap
---
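The shape of the change in the diff above (substitute a stand-in for a null filter value, then turn the typed value into bytes) can be sketched with the JDK alone. This is only a rough illustration under stated assumptions: `FilterValueBytesSketch`, `toBytes`, and the `NULL_INT` / `NULL_LONG` stand-ins are hypothetical, and the byte layout used here is not claimed to match what `CarbonUtil.getValueAsBytes` or `DataConvertUtil.getNullValueForMeasure` actually produce.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration only; not CarbonData's implementation. */
class FilterValueBytesSketch {

    // Assumed null stand-ins, chosen only for this example.
    static final int NULL_INT = Integer.MIN_VALUE;
    static final long NULL_LONG = Long.MIN_VALUE;

    // Convert a filter value to bytes according to the column type.
    static byte[] toBytes(Object value, Class<?> type) {
        if (type == Integer.class) {
            // Null is replaced by the stand-in before conversion.
            int v = (value == null) ? NULL_INT : (Integer) value;
            return ByteBuffer.allocate(Integer.BYTES).putInt(v).array();
        }
        if (type == Long.class) {
            long v = (value == null) ? NULL_LONG : (Long) value;
            return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
        }
        // Non-primitive columns are assumed to arrive as bytes already.
        return (byte[]) value;
    }
}
```

For example, `toBytes(1, Integer.class)` yields the four bytes `0, 0, 0, 1`, and `toBytes(null, Integer.class)` yields the same bytes as the stand-in value itself. One consequence of a stand-in approach like this sketch is that a real value equal to the stand-in is indistinguishable from null at the byte level.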
In reply to this post by qiuchenjian-2
Github user dhatchayani commented on the issue:
https://github.com/apache/carbondata/pull/2654
> I ran a test on table with bloom datamap created before applying this PR, and query it after this PR merged, but the answer is not correct. Can you check it?
>
> Procedure to reproduce:
>
> * switch master code before this PR merged
> * create table with no-dict measure column (set the measure column as sort column)
> * create bloom datamap on the measure column
> * load some data into table
> * query on the measure column, get a result
> * switch to code after this PR merged
> * do the same query and compare the result

@kevinjmh The issue is reproduced. It is a compatibility issue, because the data written in the new store is in a different format. I will correct it in the next PR.
---