[GitHub] carbondata pull request #2654: [WIP] Adaptive Encoding for Primitive data ty...

qiuchenjian-2

[GitHub] carbondata issue #2654: [CARBONDATA-2896] Adaptive Encoding for Primitive da...

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2654

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8223/

---

qiuchenjian-2

[GitHub] carbondata issue #2654: [CARBONDATA-2896] Adaptive Encoding for Primitive da...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2654

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/152/

---

qiuchenjian-2

[GitHub] carbondata pull request #2654: [CARBONDATA-2896] Adaptive Encoding for Primi...

In reply to this post by qiuchenjian-2

Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2654#discussion_r214809720

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/ColumnPageWrapper.java ---
@@ -58,26 +70,64 @@ public int fillSurrogateKey(int rowId, int chunkIndex, int[] outputSurrogateKey)

@Override
public int fillVector(ColumnVectorInfo[] vectorInfo, int chunkIndex) {
- throw new UnsupportedOperationException("internal error");
+ ColumnVectorInfo columnVectorInfo = vectorInfo[chunkIndex];
+ CarbonColumnVector vector = columnVectorInfo.vector;
+ int offset = columnVectorInfo.offset;
+ int vectorOffset = columnVectorInfo.vectorOffset;
+ int len = offset + columnVectorInfo.size;
+ for (int i = offset; i < len; i++) {
+ fillRow(i, vector, vectorOffset++);
+ }
+ return chunkIndex + 1;
+ }
+
+ /**
+ * Fill the data to the vector
+ *
+ * @param rowId
+ * @param vector
+ * @param vectorRow
+ */
+ private void fillRow(int rowId, CarbonColumnVector vector, int vectorRow) {
+ byte[] value = getChunkData(rowId);
+ int length = value.length;
+ DimensionDataVectorProcessor.putDataToVector(vector, value, vectorRow, length);
}

@Override
public int fillVector(int[] filteredRowId, ColumnVectorInfo[] vectorInfo, int chunkIndex) {
- throw new UnsupportedOperationException("internal error");
+ ColumnVectorInfo columnVectorInfo = vectorInfo[chunkIndex];
+ CarbonColumnVector vector = columnVectorInfo.vector;
+ int offset = columnVectorInfo.offset;
+ int vectorOffset = columnVectorInfo.vectorOffset;
+ int len = offset + columnVectorInfo.size;
+ for (int i = offset; i < len; i++) {
+ fillRow(filteredRowId[i], vector, vectorOffset++);
+ }
+ return chunkIndex + 1;
}

@Override public byte[] getChunkData(int rowId) {
+ int rowIdCopy = rowId;
--- End diff --

Can you add a comment explaining why we need to store the actual rowId, and please change the variable name?
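
For readers following the diff above: both fillVector overloads run the same loop and differ only in how the source row is chosen (the loop index itself, or filteredRowId[i]). The following is a minimal sketch of that pattern, not CarbonData code. FillVectorSketch, fill, and rowAt are hypothetical names, and plain long arrays stand in for the column page and the CarbonColumnVector.

```java
import java.util.Arrays;
import java.util.function.IntUnaryOperator;

class FillVectorSketch {
  // rowAt maps the loop index to a page row (hypothetical helper):
  // identity for a full scan, filteredRowId[i] for a filtered scan.
  static void fill(long[] page, long[] vector, int offset, int size,
      int vectorOffset, IntUnaryOperator rowAt) {
    int len = offset + size;
    for (int i = offset; i < len; i++) {
      vector[vectorOffset++] = page[rowAt.applyAsInt(i)];
    }
  }

  public static void main(String[] args) {
    long[] page = {10, 20, 30, 40, 50};

    long[] full = new long[3];
    fill(page, full, 1, 3, 0, i -> i);                     // rows 1..3

    int[] filteredRowId = {4, 0, 2};
    long[] filtered = new long[3];
    fill(page, filtered, 0, 3, 0, i -> filteredRowId[i]);  // rows 4, 0, 2

    System.out.println(Arrays.toString(full) + " " + Arrays.toString(filtered));
  }
}
```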

---

qiuchenjian-2

[GitHub] carbondata issue #2654: [CARBONDATA-2896] Adaptive Encoding for Primitive da...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2654

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/229/

---

qiuchenjian-2

[GitHub] carbondata issue #2654: [CARBONDATA-2896] Adaptive Encoding for Primitive da...

In reply to this post by qiuchenjian-2

Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/2654

@dhatchayani What about the legacy store?
For example, for a non-dictionary primitive column, the BloomFilter datamap in the old store holds the bytes, and during query we convert the value to bytes; but in the new store we convert it to a primitive object during query, which will cause a mismatch.
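
The mismatch described above can be illustrated with a small sketch. This is not the BloomFilter datamap code: a HashSet stands in for the index, and LegacyKeySketch, keyOf, and the 8-byte big-endian key layout are all assumptions made for illustration. The point is only that a key indexed as bytes is not found when probed with a primitive object, unless both sides are normalized to one form.

```java
import java.nio.ByteBuffer;
import java.util.HashSet;
import java.util.Set;

class LegacyKeySketch {
  // Assumed canonical key form (hypothetical): 8-byte big-endian long.
  static ByteBuffer keyOf(long value) {
    return ByteBuffer.allocate(Long.BYTES).putLong(0, value);
  }

  // Wrap a copy of the raw bytes so the key cannot be mutated by the caller.
  static ByteBuffer keyOf(byte[] raw) {
    return ByteBuffer.wrap(raw.clone());
  }

  public static void main(String[] args) {
    // "Old store": the index was built from the byte form of the value.
    Set<ByteBuffer> index = new HashSet<>();
    byte[] storedBytes = ByteBuffer.allocate(Long.BYTES).putLong(0, 42L).array();
    index.add(keyOf(storedBytes));

    // "New store" style probe with a primitive object: no match.
    Object primitive = Long.valueOf(42L);
    System.out.println(index.contains(primitive));

    // Probe normalized back to the byte form: match.
    System.out.println(index.contains(keyOf(42L)));
  }
}
```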

---

qiuchenjian-2

[GitHub] carbondata pull request #2654: [CARBONDATA-2896] Adaptive Encoding for Primi...

In reply to this post by qiuchenjian-2

Github user kevinjmh commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2654#discussion_r215153728

--- Diff: integration/spark2/src/main/scala/org/apache/carbondata/datamap/IndexDataMapRebuildRDD.scala ---
@@ -264,8 +264,17 @@ class RawBytesReadSupport(segmentProperties: SegmentProperties, indexColumns: Ar
rtn(i) = if (indexCol2IdxInDictArray.contains(col.getColName)) {
surrogatKeys(indexCol2IdxInDictArray(col.getColName)).toInt.asInstanceOf[Integer]
} else if (indexCol2IdxInNoDictArray.contains(col.getColName)) {
- data(0).asInstanceOf[ByteArrayWrapper].getNoDictionaryKeyByIndex(
+ val bytes = data(0).asInstanceOf[ByteArrayWrapper].getNoDictionaryKeyByIndex(
indexCol2IdxInNoDictArray(col.getColName))
+ // no dictionary primitive columns are expected to be in original data while loading,
+ // so convert it to original data
+ if (DataTypeUtil.isPrimitiveColumn(col.getDataType)) {
+ val dataFromBytes = DataTypeUtil
+ .getDataBasedOnDataTypeForNoDictionaryColumn(bytes, col.getDataType)
+ dataFromBytes
--- End diff --

If isPrimitiveColumn is true, this needs a null check and should get the null value for a measure.
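
A minimal sketch of the kind of null check being requested, under stated assumptions: it treats a null or empty byte array as the null marker and decodes an 8-byte big-endian long otherwise. Both conventions, and the names NullSafeDecodeSketch and toLongOrNull, are hypothetical; the actual null representation and the conversion done by DataTypeUtil in CarbonData may differ.

```java
import java.nio.ByteBuffer;

class NullSafeDecodeSketch {
  // Returns null for the assumed null marker (null or empty array),
  // otherwise decodes an 8-byte big-endian long.
  static Long toLongOrNull(byte[] bytes) {
    if (bytes == null || bytes.length == 0) {
      return null;
    }
    if (bytes.length != Long.BYTES) {
      throw new IllegalArgumentException("expected 8 bytes, got " + bytes.length);
    }
    return ByteBuffer.wrap(bytes).getLong();
  }

  public static void main(String[] args) {
    System.out.println(toLongOrNull(new byte[0]));
    System.out.println(toLongOrNull(new byte[] {0, 0, 0, 0, 0, 0, 0, 7}));
  }
}
```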

---

qiuchenjian-2

[GitHub] carbondata issue #2654: [CARBONDATA-2896] Adaptive Encoding for Primitive da...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2654

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8337/

---
