[GitHub] carbondata pull request #2895: [HOTFIX] Fix NPE in spark, when same vector r...

classic Classic list List threaded Threaded
43 messages Options
123
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2895: [HOTFIX] Fix NPE in spark, when same vector r...

qiuchenjian-2
GitHub user ajantha-bhat opened a pull request:

    https://github.com/apache/carbondata/pull/2895

    [HOTFIX] Fix NPE in spark, when same vector reads files with local dictionary and without local dictionary

    problem: NPE in spark, when same vector reads files with local dictionary and without local dictionary
   
    cause: when two carbondata files are present, one with local dictionary and one without local dictionary. If same vector is used to read this files [can happen if task is launched to group of files]. If  local dictionary files are found first, dictionary is set for that vector. But it was never reset for another file reading.
   
    solution: reset dictionary once batch is processed,set only for local dictionary batch processing.
   
   
    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed? NA
     
     - [ ] Any backward compatibility impacted? NA
     
     - [ ] Document update required? NA
   
     - [ ] Testing done
    yes, cluster testing done.      
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.  NA
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ajantha-bhat/carbondata master_new

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2895.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2895
   
----
commit 99c7621336e3cf180bfa0c3a326a2f1fafe51631
Author: ajantha-bhat <ajanthabhat@...>
Date:   2018-11-05T10:00:27Z

    Fix vectcor reading with local dictionary and without local dictionary

----


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2895: [HOTFIX] Fix NPE in spark, when same vector reads fi...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2895
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1270/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2895: [HOTFIX] Fix NPE in spark, when same vector reads fi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2895
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9534/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2895: [HOTFIX] Fix NPE in spark, when same vector reads fi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2895
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1485/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2895: [HOTFIX] Fix NPE in spark, when same vector r...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2895#discussion_r230814030
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java ---
    @@ -61,10 +61,7 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
         int columnValueSize = dimensionDataChunkStore.getColumnValueSize();
         int rowsNum = data.length / columnValueSize;
         CarbonColumnVector vector = vectorInfo.vector;
    -    if (!dictionary.isDictionaryUsed()) {
    -      vector.setDictionary(dictionary);
    -      dictionary.setDictionaryUsed();
    -    }
    +    vector.setDictionary(dictionary);
    --- End diff --
   
    Is the same handling required in `fillRow` method also in the same class?..If required then `isDictionaryUsed` and `setDictionaryUsed` API's will not be required and those can also be removed from the interface


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2895: [HOTFIX] Fix NPE in spark, when same vector reads fi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/2895
 
    LGTM apart from a minor comment


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2895: [HOTFIX] Fix NPE in spark, when same vector reads fi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2895
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1503/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2895: [HOTFIX] Fix NPE in spark, when same vector reads fi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2895
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1291/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2895: [HOTFIX] Fix NPE in spark, when same vector reads fi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2895
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9552/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2895: [HOTFIX] Fix NPE in spark, when same vector r...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2895#discussion_r230997947
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java ---
    @@ -61,10 +61,7 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
         int columnValueSize = dimensionDataChunkStore.getColumnValueSize();
         int rowsNum = data.length / columnValueSize;
         CarbonColumnVector vector = vectorInfo.vector;
    -    if (!dictionary.isDictionaryUsed()) {
    -      vector.setDictionary(dictionary);
    -      dictionary.setDictionaryUsed();
    -    }
    +    vector.setDictionary(dictionary);
    --- End diff --
   
    I have checked this while coding, fill row is our method. So, no issues.
    Only this vector is spark vector and dictionary needs to clear for it.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2895: [HOTFIX] Fix NPE in spark, when same vector r...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2895#discussion_r230999455
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java ---
    @@ -61,10 +61,7 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
         int columnValueSize = dimensionDataChunkStore.getColumnValueSize();
         int rowsNum = data.length / columnValueSize;
         CarbonColumnVector vector = vectorInfo.vector;
    -    if (!dictionary.isDictionaryUsed()) {
    -      vector.setDictionary(dictionary);
    -      dictionary.setDictionaryUsed();
    -    }
    +    vector.setDictionary(dictionary);
    --- End diff --
   
    Both the method are called in carbon flow for vector filling. One is direct fill case and the other one is old vector fill flow. Please cross check once


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2895: [HOTFIX] Fix NPE in spark, when same vector r...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2895#discussion_r231001948
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java ---
    @@ -61,10 +61,7 @@ public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, byte[] d
         int columnValueSize = dimensionDataChunkStore.getColumnValueSize();
         int rowsNum = data.length / columnValueSize;
         CarbonColumnVector vector = vectorInfo.vector;
    -    if (!dictionary.isDictionaryUsed()) {
    -      vector.setDictionary(dictionary);
    -      dictionary.setDictionaryUsed();
    -    }
    +    vector.setDictionary(dictionary);
    --- End diff --
   
    done


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2895: [HOTFIX] Fix NPE in spark, when same vector reads fi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2895
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1514/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2895: [HOTFIX] Fix NPE in spark, when same vector reads fi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2895
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1301/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2895: [HOTFIX] Fix NPE in spark, when same vector reads fi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2895
 
    Build Failed  with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9562/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2895: [HOTFIX] Fix NPE in spark, when same vector reads fi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2895
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1302/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2895: [HOTFIX] Fix NPE in spark, when same vector reads fi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/2895
 
    LGTM..can be merged once build is passed


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2895: [HOTFIX] Fix NPE in spark, when same vector reads fi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2895
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1515/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2895: [HOTFIX] Fix NPE in spark, when same vector reads fi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2895
 
    Build Failed  with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9563/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2895: [HOTFIX] Fix NPE in spark, when same vector reads fi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on the issue:

    https://github.com/apache/carbondata/pull/2895
 
    retest this please


---
123