[GitHub] incubator-carbondata pull request #603: Fix the bug of inverted index that s...

classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata pull request #603: Fix the bug of inverted index that s...

qiuchenjian-2
GitHub user Zhangshunyu opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/603

    Fix the bug of inverted index that store inverted index in metadata by using Encoding.INVERTED_INDEX.

    ## Why raise this pr?
    This pr is the same to #222 , because pr 222 can not reopen, so open a new pr.
   
    Problem: In current code, inverted index in ddl info is not stored into store, and when we restart the cluster, query might mismatch.
    To fix problem 1, current code set always true to use inverted index, and we can not configure inverted index now. We should fix this problem from its root cause.
    ## How to solve?
   
    Using the Encoding as the indentifier to check whether using inverted index, this Encoding is in thrift format now, so we no need to modify the thrift format.
   
    Here it is the same to the query logic in CompressedDimensionChunkFileBasedReader:
   
        if (CarbonUtil.hasEncoding(dimensionColumnChunk.get(blockIndex).getEncodingList(),
            Encoding.INVERTED_INDEX)) {
          invertedIndexes = CarbonUtil
              .getUnCompressColumnIndex(dimensionColumnChunk.get(blockIndex).getRowIdPageLength(),
                  fileReader.readByteArray(filePath,
                      dimensionColumnChunk.get(blockIndex).getRowIdPageOffset(),
                      dimensionColumnChunk.get(blockIndex).getRowIdPageLength()), numberComressor);
          // get the reverse index
          invertedIndexesReverse = getInvertedReverseIndex(invertedIndexes);
        }
    it also use Encoding.INVERTED_INDEX to check whether one column is use inverted index.
   
    ## How to test?
   
    Pass all the test cases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Zhangshunyu/incubator-carbondata fix_index

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/603.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #603
   
----
commit 6c6202b10c9e3bf60830d2fb995b55e5109f59bd
Author: Zhangshunyu <[hidden email]>
Date:   2016-09-08T07:48:03Z

    Save useInvertedIndex info into thrift store

commit 4d50a6a38901af619a56f4352ae0208be9775e50
Author: Zhangshunyu <[hidden email]>
Date:   2016-09-08T07:48:15Z

    Save useInvertedIndex info into thrift store

commit a36bf38ffa7e766e09aa4513d4a4b96ef1218639
Author: Zhangshunyu <[hidden email]>
Date:   2016-09-08T09:46:12Z

    Fix the judge of no_dic_col

commit e3ee06abd032affd098c1d31b6bbab41ce27b747
Author: Zhangshunyu <[hidden email]>
Date:   2016-09-08T10:04:20Z

    add commont

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #603: Fix the bug of inverted index that store in...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/603
 
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/916/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #603: Fix the bug of inverted index that store in...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Zhangshunyu commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/603
 
    retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #603: Fix the bug of inverted index that store in...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/603
 
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/919/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #603: Fix the bug of inverted index that store in...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Zhangshunyu commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/603
 
    @ravipesala Could you pls help to check what's the meaning of the CI report taht can not build project?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #603: Fix the bug of inverted index that store in...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/603
 
    @Zhangshunyu checkstyle is failing.
    CarbonCSVBasedSeqGenStep.java:48: error: Wrong order for 'org.apache.carbondata.core.constants.CarbonCommonConstants' import


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #603: Fix the bug of inverted index that store in...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/603
 
    @Zhangshunyu
    As you mentioned "Problem: In current code, inverted index in ddl info is not stored into store, and when we restart the cluster, query might mismatch."   this issue can be reproducible regularly or occasionally ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #603: Fix the bug of inverted index that store in...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Zhangshunyu commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/603
 
    @chenliang613 this will not happen, becasue it is set to true in current code to avoid this problem, but inverted index is not configurable, so i raise this pr to make is can be configurable.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #603: Fix the bug of inverted index that store in...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/603
 
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/956/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata pull request #603: Fix the bug of inverted index that s...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/603#discussion_r103073261
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
    @@ -804,6 +804,7 @@
       public static final String COLUMN_PROPERTIES = "columnproperties";
       // table block size in MB
       public static final String TABLE_BLOCKSIZE = "table_blocksize";
    +  public static final String NO_INVERTED_INDEX = "no_inverted_index";
    --- End diff --
   
    please add a comment to describe it


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #603: Fix the bug of inverted index that store in...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/603
 
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/971/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata pull request #603: Fix the bug of inverted index that s...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-carbondata/pull/603


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---