[GitHub] carbondata pull request #2654: [WIP] Adaptive Encoding for Primitive data ty...

[GitHub] carbondata issue #2654: [CARBONDATA-2896] Adaptive Encoding for Primitive da...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2654
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/482/



---

[GitHub] carbondata issue #2654: [CARBONDATA-2896] Adaptive Encoding for Primitive da...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2654
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/330/



---

[GitHub] carbondata issue #2654: [CARBONDATA-2896] Adaptive Encoding for Primitive da...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2654
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/507/



---

[GitHub] carbondata issue #2654: [CARBONDATA-2896] Adaptive Encoding for Primitive da...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2654
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8577/



---

[GitHub] carbondata issue #2654: [CARBONDATA-2896] Adaptive Encoding for Primitive da...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/2654
 
    LGTM


---

[GitHub] carbondata pull request #2654: [CARBONDATA-2896] Adaptive Encoding for Primi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2654


---

[GitHub] carbondata issue #2654: [CARBONDATA-2896] Adaptive Encoding for Primitive da...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2654
 
    Seriously? Have you checked this PR against a legacy store? @kevinjmh tested it locally days ago and raised this problem but didn't get any feedback.


---

[GitHub] carbondata issue #2654: [CARBONDATA-2896] Adaptive Encoding for Primitive da...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on the issue:

    https://github.com/apache/carbondata/pull/2654
 
    I ran a test on a table with a bloom datamap created before this PR was applied, then queried it after this PR was merged, and the result is not correct. Can you check it?
   
    Procedure to reproduce:
   
    - switch to the master code from before this PR was merged
    - create a table with a no-dict measure column (set the measure column as a sort column)
    - create a bloom datamap on the measure column
    - load some data into the table
    - query on the measure column and record the result
    - switch to the code after this PR was merged
    - run the same query and compare the results
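
The steps above can be sketched as SQL statements, wrapped here in a small Java program that only prints them. This is a minimal sketch, not the exact test that was run: the table name `legacy_bloom_test`, the columns `id` and `score`, the datamap name `dm_score`, the filter value, and the CSV path are all hypothetical, and the DDL assumes the CarbonData 1.x syntax (`STORED BY 'carbondata'`, `CREATE DATAMAP ... USING 'bloomfilter'`).

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the reproduction steps as CarbonData SQL statements.
// Table, column and datamap names and the CSV path are hypothetical.
final class BloomLegacyRepro {

    // Statements to run on the build from BEFORE the PR was merged.
    static List<String> beforeMerge() {
        return Arrays.asList(
            // numeric column placed in SORT_COLUMNS (the "no-dict measure column" case)
            "CREATE TABLE legacy_bloom_test (id INT, score INT) "
                + "STORED BY 'carbondata' TBLPROPERTIES ('SORT_COLUMNS'='score')",
            // bloom datamap on that column
            "CREATE DATAMAP dm_score ON TABLE legacy_bloom_test "
                + "USING 'bloomfilter' DMPROPERTIES ('INDEX_COLUMNS'='score')",
            "LOAD DATA INPATH '/tmp/legacy_bloom_test.csv' INTO TABLE legacy_bloom_test",
            query());
    }

    // The filter query whose result is compared across the two builds.
    static String query() {
        return "SELECT * FROM legacy_bloom_test WHERE score = 42";
    }

    public static void main(String[] args) {
        System.out.println("-- run on the pre-merge build:");
        for (String sql : beforeMerge()) {
            System.out.println(sql + ";");
        }
        System.out.println("-- switch to the post-merge build, keep the same store, then rerun:");
        System.out.println(query() + ";");
    }
}
```

The store and datamap files must be left in place between the two builds; the comparison is only meaningful when the post-merge code reads the datamap written by the pre-merge code.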


---

[GitHub] carbondata issue #2654: [CARBONDATA-2896] Adaptive Encoding for Primitive da...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user dhatchayani commented on the issue:

    https://github.com/apache/carbondata/pull/2654
 
    > @dhatchayani What about the legacy store?
    > For example, for a non-dict primitive column, the old store's BloomFilter datamap stores bytes, and during query we convert the filter value to bytes. In the new store, during query we convert it to a primitive object, which will cause a mismatch.
   
   
    In the legacy store it is stored as bytes and in the new store it is stored as a primitive object, but when it is retrieved by a query, the result is unified to bytes only.


---

[GitHub] carbondata pull request #2654: [CARBONDATA-2896] Adaptive Encoding for Primi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user dhatchayani commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2654#discussion_r218669311
 
    --- Diff: integration/spark2/src/main/scala/org/apache/carbondata/datamap/IndexDataMapRebuildRDD.scala ---
    @@ -264,8 +264,17 @@ class RawBytesReadSupport(segmentProperties: SegmentProperties, indexColumns: Ar
           rtn(i) = if (indexCol2IdxInDictArray.contains(col.getColName)) {
             surrogatKeys(indexCol2IdxInDictArray(col.getColName)).toInt.asInstanceOf[Integer]
           } else if (indexCol2IdxInNoDictArray.contains(col.getColName)) {
    -        data(0).asInstanceOf[ByteArrayWrapper].getNoDictionaryKeyByIndex(
    +        val bytes = data(0).asInstanceOf[ByteArrayWrapper].getNoDictionaryKeyByIndex(
               indexCol2IdxInNoDictArray(col.getColName))
    +        // no dictionary primitive columns are expected to be in original data while loading,
    +        // so convert it to original data
    +        if (DataTypeUtil.isPrimitiveColumn(col.getDataType)) {
    +          val dataFromBytes = DataTypeUtil
    +            .getDataBasedOnDataTypeForNoDictionaryColumn(bytes, col.getDataType)
    +          dataFromBytes
    --- End diff --
   
    I think measure null and no-dictionary null values are different. Can you please give me a scenario that falls into the no-dictionary null case?


---

[GitHub] carbondata pull request #2654: [CARBONDATA-2896] Adaptive Encoding for Primi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user dhatchayani commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2654#discussion_r218669857
 
    --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java ---
    @@ -331,8 +332,18 @@ private BloomQueryModel buildQueryModelInternal(CarbonColumn carbonColumn,
           // for dictionary/date columns, convert the surrogate key to bytes
           internalFilterValue = CarbonUtil.getValueAsBytes(DataTypes.INT, convertedValue);
         } else {
    -      // for non dictionary dimensions, is already bytes,
    -      internalFilterValue = (byte[]) convertedValue;
    +      // for non dictionary dimensions, numeric columns will be of original data,
    +      // so convert the data to bytes
    +      if (DataTypeUtil.isPrimitiveColumn(carbonColumn.getDataType())) {
    +        if (convertedValue == null) {
    +          convertedValue = DataConvertUtil.getNullValueForMeasure(carbonColumn.getDataType(),
    +              carbonColumn.getColumnSchema().getScale());
    +        }
    +        internalFilterValue =
    +            CarbonUtil.getValueAsBytes(carbonColumn.getDataType(), convertedValue);
    --- End diff --
   
    > I ran a test on a table with a bloom datamap created before this PR was applied, then queried it after this PR was merged, and the result is not correct. Can you check it?
    >
    > Procedure to reproduce:
    >
    > * switch to the master code from before this PR was merged
    > * create a table with a no-dict measure column (set the measure column as a sort column)
    > * create a bloom datamap on the measure column
    > * load some data into the table
    > * query on the measure column and record the result
    > * switch to the code after this PR was merged
    > * run the same query and compare the results
   
    I will check this issue and update ASAP.


---

[GitHub] carbondata issue #2654: [CARBONDATA-2896] Adaptive Encoding for Primitive da...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user dhatchayani commented on the issue:

    https://github.com/apache/carbondata/pull/2654
 
    > I ran a test on a table with a bloom datamap created before this PR was applied, then queried it after this PR was merged, and the result is not correct. Can you check it?
    >
    > Procedure to reproduce:
    >
    > * switch to the master code from before this PR was merged
    > * create a table with a no-dict measure column (set the measure column as a sort column)
    > * create a bloom datamap on the measure column
    > * load some data into the table
    > * query on the measure column and record the result
    > * switch to the code after this PR was merged
    > * run the same query and compare the results
   
    @kevinjmh The issue is reproduced. It is a compatibility issue: the data written in the new store is in a different format. I will correct it in the next PR.
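
To illustrate why a format difference breaks lookups against an index built earlier, here is a minimal sketch with a toy Bloom filter. It is not CarbonData's implementation. The two encodings (decimal text as UTF-8 versus a 4-byte big-endian integer) are assumptions chosen only to show the effect; the actual byte layouts used by the legacy and new stores may differ.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

// Toy Bloom filter for illustration only; not CarbonData's implementation.
final class EncodingMismatchDemo {
    private static final int SIZE = 1024;
    private final BitSet bits = new BitSet(SIZE);

    // Two bit positions derived from a simple polynomial hash of the key bytes.
    private static int[] positions(byte[] key) {
        int h1 = Arrays.hashCode(key);
        int h2 = 31 * h1 + 17;
        return new int[] { Math.floorMod(h1, SIZE), Math.floorMod(h1 + h2, SIZE) };
    }

    void add(byte[] key) {
        for (int p : positions(key)) {
            bits.set(p);
        }
    }

    boolean mightContain(byte[] key) {
        for (int p : positions(key)) {
            if (!bits.get(p)) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical "old" encoding: the value's decimal text as UTF-8 bytes.
    static byte[] textBytes(int v) {
        return Integer.toString(v).getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical "new" encoding: fixed-width 4-byte big-endian integer.
    static byte[] binaryBytes(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    public static void main(String[] args) {
        EncodingMismatchDemo index = new EncodingMismatchDemo();
        index.add(textBytes(42));                                 // index built with one encoding
        System.out.println(index.mightContain(textBytes(42)));    // same encoding: prints true
        System.out.println(index.mightContain(binaryBytes(42)));  // other encoding: prints false
    }
}
```

A Bloom filter never misses a key that was added with identical bytes, so the first lookup always succeeds. The second lookup probes different bit positions for the same logical value, so the filter reports the value as absent and the matching data is pruned, which would show up as an incorrect query result.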


---

[GitHub] carbondata issue #2654: [CARBONDATA-2896] Adaptive Encoding for Primitive da...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on the issue:

    https://github.com/apache/carbondata/pull/2654
 
    OK


---