[GitHub] carbondata pull request #2635: [CARBONDATA-2856] Fix bug in bloom index on m...

qiuchenjian-2

[GitHub] carbondata issue #2635: [CARBONDATA-2856][BloomDataMap] Fix bug in bloom ind...

Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/2635

retest this please

---

qiuchenjian-2

[GitHub] carbondata issue #2635: [CARBONDATA-2856][BloomDataMap] Fix bug in bloom ind...

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2635

Build succeeded with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8084/

---

qiuchenjian-2

[GitHub] carbondata issue #2635: [CARBONDATA-2856][BloomDataMap] Fix bug in bloom ind...

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2635

Build succeeded with Spark 2.2.1. Please check CI: http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/24/

---

qiuchenjian-2

[GitHub] carbondata issue #2635: [CARBONDATA-2856][BloomDataMap] Fix bug in bloom ind...

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2635

LGTM

---

qiuchenjian-2

[GitHub] carbondata pull request #2635: [CARBONDATA-2856][BloomDataMap] Fix bug in bl...

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2635#discussion_r213563811

--- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomDataMapWriter.java ---
@@ -91,30 +91,28 @@
@Override
protected byte[] convertDictionaryValue(int indexColIdx, Object value) {
// input value from onPageAdded in load process is byte[]
- byte[] fakeMdkBytes;
- // this means that we need to pad some fake bytes
- // to get the whole MDK in corresponding position
- if (columnarSplitter.getBlockKeySize().length > indexCol2MdkIdx.size()) {
- int totalSize = 0;
- for (int size : columnarSplitter.getBlockKeySize()) {
- totalSize += size;
- }
- fakeMdkBytes = new byte[totalSize];

- // put this bytes to corresponding position
- int thisKeyIdx = indexCol2MdkIdx.get(indexColumns.get(indexColIdx).getColName());
- int destPos = 0;
- for (int keyIdx = 0; keyIdx < columnarSplitter.getBlockKeySize().length; keyIdx++) {
- if (thisKeyIdx == keyIdx) {
- System.arraycopy(value, 0,
- fakeMdkBytes, destPos, columnarSplitter.getBlockKeySize()[thisKeyIdx]);
- break;
- }
- destPos += columnarSplitter.getBlockKeySize()[keyIdx];
+ // This is used to deal with the multiple global dictionary column as index columns.
+ // The KeyGenerator works with the whole MDK while the value here only represent part of it,
+ // so we need to pad fake bytes to it in corresponding position.
+ int totalSize = 0;
+ for (int size : columnarSplitter.getBlockKeySize()) {
+ totalSize += size;
+ }
+ byte[] fakeMdkBytes = new byte[totalSize];
+
+ // put this bytes to corresponding position
+ int thisKeyIdx = indexCol2MdkIdx.get(indexColumns.get(indexColIdx).getColName());
+ int destPos = 0;
+ for (int keyIdx = 0; keyIdx < columnarSplitter.getBlockKeySize().length; keyIdx++) {
+ if (thisKeyIdx == keyIdx) {
+ System.arraycopy(value, 0, fakeMdkBytes, destPos,
--- End diff --

I am not quite sure about this. @ravipesala, please have a look.
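
For readers following the thread, the padding step under review can be sketched on its own. This is a minimal illustration built only from the logic visible in the quoted diff; the class name `MdkPaddingSketch`, the method `padToFullMdk`, and the plain `int[]` standing in for `columnarSplitter.getBlockKeySize()` are assumptions made for the example, not CarbonData code.

```java
import java.util.Arrays;

// Hypothetical standalone sketch; not part of CarbonData.
final class MdkPaddingSketch {

  // Places one column's key bytes at that column's slot inside a zero-filled
  // array as wide as the whole MDK. blockKeySize[i] is the byte width of column i.
  static byte[] padToFullMdk(byte[] value, int[] blockKeySize, int thisKeyIdx) {
    int totalSize = 0;
    for (int size : blockKeySize) {
      totalSize += size;
    }
    byte[] fakeMdkBytes = new byte[totalSize];

    // The destination offset is the summed width of all preceding columns.
    int destPos = 0;
    for (int keyIdx = 0; keyIdx < thisKeyIdx; keyIdx++) {
      destPos += blockKeySize[keyIdx];
    }
    System.arraycopy(value, 0, fakeMdkBytes, destPos, blockKeySize[thisKeyIdx]);
    return fakeMdkBytes;
  }

  public static void main(String[] args) {
    // Three columns of widths 1, 2 and 1 bytes; the value belongs to column 1.
    byte[] padded = padToFullMdk(new byte[] {7, 9}, new int[] {1, 2, 1}, 1);
    System.out.println(Arrays.toString(padded));
  }
}
```

The sketch allocates a full-width array for every value, which is the cost the later review comment questions.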

---

qiuchenjian-2

[GitHub] carbondata pull request #2635: [CARBONDATA-2856][BloomDataMap] Fix bug in bl...

Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2635#discussion_r213640742

--- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomDataMapWriter.java ---
@@ -91,30 +91,28 @@
@Override
protected byte[] convertDictionaryValue(int indexColIdx, Object value) {
// input value from onPageAdded in load process is byte[]
- byte[] fakeMdkBytes;
- // this means that we need to pad some fake bytes
- // to get the whole MDK in corresponding position
- if (columnarSplitter.getBlockKeySize().length > indexCol2MdkIdx.size()) {
- int totalSize = 0;
- for (int size : columnarSplitter.getBlockKeySize()) {
- totalSize += size;
- }
- fakeMdkBytes = new byte[totalSize];

- // put this bytes to corresponding position
- int thisKeyIdx = indexCol2MdkIdx.get(indexColumns.get(indexColIdx).getColName());
- int destPos = 0;
- for (int keyIdx = 0; keyIdx < columnarSplitter.getBlockKeySize().length; keyIdx++) {
- if (thisKeyIdx == keyIdx) {
- System.arraycopy(value, 0,
- fakeMdkBytes, destPos, columnarSplitter.getBlockKeySize()[thisKeyIdx]);
- break;
- }
- destPos += columnarSplitter.getBlockKeySize()[keyIdx];
+ // This is used to deal with the multiple global dictionary column as index columns.
+ // The KeyGenerator works with the whole MDK while the value here only represent part of it,
+ // so we need to pad fake bytes to it in corresponding position.
+ int totalSize = 0;
+ for (int size : columnarSplitter.getBlockKeySize()) {
+ totalSize += size;
+ }
+ byte[] fakeMdkBytes = new byte[totalSize];
+
+ // put this bytes to corresponding position
+ int thisKeyIdx = indexCol2MdkIdx.get(indexColumns.get(indexColIdx).getColName());
+ int destPos = 0;
+ for (int keyIdx = 0; keyIdx < columnarSplitter.getBlockKeySize().length; keyIdx++) {
+ if (thisKeyIdx == keyIdx) {
+ System.arraycopy(value, 0, fakeMdkBytes, destPos,
--- End diff --

Please don't copy the bytes; convert them directly to an int using `CarbonUtil.getSurrogateInternal(data, startOffsetOfData, columnValueSize)`.
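
To illustrate the suggestion, here is a minimal sketch of decoding a fixed-width key slice into an int in place, with no intermediate array. It is a hypothetical stand-in, not the body of `CarbonUtil.getSurrogateInternal`; the class name `SurrogateSketch`, the method `toSurrogate`, and the big-endian byte order are assumptions made for the example.

```java
// Hypothetical stand-in; not CarbonData's implementation.
final class SurrogateSketch {

  // Reads `size` bytes starting at `offset` and folds them into an int,
  // most significant byte first (byte order is an assumption of this sketch).
  static int toSurrogate(byte[] data, int offset, int size) {
    if (size < 1 || size > 4 || offset < 0 || offset + size > data.length) {
      throw new IllegalArgumentException("slice does not fit in an int or in the array");
    }
    int surrogate = 0;
    for (int i = 0; i < size; i++) {
      // Mask with 0xFF so that negative bytes do not sign-extend.
      surrogate = (surrogate << 8) | (data[offset + i] & 0xFF);
    }
    return surrogate;
  }

  public static void main(String[] args) {
    byte[] page = {0x00, 0x01, 0x02, 0x7F};
    // Decodes the two bytes 0x01 0x02 at offset 1.
    System.out.println(toSurrogate(page, 1, 2)); // prints 258
  }
}
```

Because the value is read straight from its offset, no padded full-MDK array has to be allocated per value.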

---

qiuchenjian-2

[GitHub] carbondata pull request #2635: [CARBONDATA-2856][BloomDataMap] Fix bug in bl...

Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2635#discussion_r213897347

--- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomDataMapWriter.java ---
@@ -91,30 +91,28 @@
@Override
protected byte[] convertDictionaryValue(int indexColIdx, Object value) {
// input value from onPageAdded in load process is byte[]
- byte[] fakeMdkBytes;
- // this means that we need to pad some fake bytes
- // to get the whole MDK in corresponding position
- if (columnarSplitter.getBlockKeySize().length > indexCol2MdkIdx.size()) {
- int totalSize = 0;
- for (int size : columnarSplitter.getBlockKeySize()) {
- totalSize += size;
- }
- fakeMdkBytes = new byte[totalSize];

- // put this bytes to corresponding position
- int thisKeyIdx = indexCol2MdkIdx.get(indexColumns.get(indexColIdx).getColName());
- int destPos = 0;
- for (int keyIdx = 0; keyIdx < columnarSplitter.getBlockKeySize().length; keyIdx++) {
- if (thisKeyIdx == keyIdx) {
- System.arraycopy(value, 0,
- fakeMdkBytes, destPos, columnarSplitter.getBlockKeySize()[thisKeyIdx]);
- break;
- }
- destPos += columnarSplitter.getBlockKeySize()[keyIdx];
+ // This is used to deal with the multiple global dictionary column as index columns.
+ // The KeyGenerator works with the whole MDK while the value here only represent part of it,
+ // so we need to pad fake bytes to it in corresponding position.
+ int totalSize = 0;
+ for (int size : columnarSplitter.getBlockKeySize()) {
+ totalSize += size;
+ }
+ byte[] fakeMdkBytes = new byte[totalSize];
+
+ // put this bytes to corresponding position
+ int thisKeyIdx = indexCol2MdkIdx.get(indexColumns.get(indexColIdx).getColName());
+ int destPos = 0;
+ for (int keyIdx = 0; keyIdx < columnarSplitter.getBlockKeySize().length; keyIdx++) {
+ if (thisKeyIdx == keyIdx) {
+ System.arraycopy(value, 0, fakeMdkBytes, destPos,
--- End diff --

Nice, it works. :+1:

---

qiuchenjian-2

[GitHub] carbondata issue #2635: [CARBONDATA-2856][BloomDataMap] Fix bug in bloom ind...

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2635

SDV build failed. Please check CI: http://144.76.159.231:8080/job/ApacheSDVTests/6474/

---

qiuchenjian-2

[GitHub] carbondata issue #2635: [CARBONDATA-2856][BloomDataMap] Fix bug in bloom ind...

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2635

LGTM

---

qiuchenjian-2

[GitHub] carbondata pull request #2635: [CARBONDATA-2856][BloomDataMap] Fix bug in bl...

Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2635

---