[GitHub] carbondata pull request #2886: [WIP]make inverted index false by defaut


[GitHub] carbondata pull request #2886: [WIP]make inverted index false by defaut

qiuchenjian-2
GitHub user akashrn5 opened a pull request:

    https://github.com/apache/carbondata/pull/2886

    [WIP]make inverted index false by defaut

    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/akashrn5/incubator-carbondata disable_inverted_index

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2886.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2886
   
----
commit c8b434294ee52ccd1a5445ce694f99f738908b0c
Author: akashrn5 <akashnilugal@...>
Date:   2018-10-31T08:43:48Z

    make inverted index false by defaut

----


---

[GitHub] carbondata issue #2886: [WIP]make inverted index false by defaut

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1180/



---

[GitHub] carbondata issue #2886: [WIP]make inverted index false by defaut

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1393/



---

[GitHub] carbondata issue #2886: [WIP]make inverted index false by defaut

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9444/



---

[GitHub] carbondata issue #2886: [WIP]make inverted index false by defaut

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1403/



---

[GitHub] carbondata issue #2886: [WIP]make inverted index false by defaut

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9449/



---

[GitHub] carbondata issue #2886: [WIP]make inverted index false by defaut

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1189/



---

[GitHub] carbondata issue #2886: [CARBONDATA-3065]make inverted index false by defaut

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1407/



---

[GitHub] carbondata issue #2886: [CARBONDATA-3065]make inverted index false by defaut

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1195/



---

[GitHub] carbondata issue #2886: [CARBONDATA-3065]make inverted index false by defaut

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9456/



---

[GitHub] carbondata issue #2886: [CARBONDATA-3065]make inverted index false by defaut

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    @akashrn5 Please expose these properties from SDK and fileformat as well.


---

[GitHub] carbondata pull request #2886: [CARBONDATA-3065]make inverted index false by...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2886#discussion_r230260481
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
    @@ -359,8 +359,13 @@ private CarbonCommonConstants() {
       public static final String TABLE_BLOCKSIZE = "table_blocksize";
       // table blocklet size in MB
       public static final String TABLE_BLOCKLET_SIZE = "table_blocklet_size";
    -  // set in column level to disable inverted index
    +  /**
    +   * set in column level to disable inverted index
    +   * @Deprecated :This property is deprecated, it is kep just for compatibility
    --- End diff --
   
    spelling: kept


---

[GitHub] carbondata issue #2886: [CARBONDATA-3065]make inverted index false by defaut

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    The InvertedIndex/NoInvertedIndex setting is confusing.
    1. The value `isInvertedIndex` assigned to the different IndexCodecs in `createEncoderForDimensionLegacy` requires the column to be set in both SORT_COLUMNS and INVERTED_INDEX. What if I set it in INVERTED_INDEX but not in SORT_COLUMNS?
    2. Is the purpose of the boolean `isInvertedIndex` in IndexCodec to control whether RLE is applied to the data page?
   
    Because of this, the setting is not a direct switch for controlling how the data is processed.


---

[GitHub] carbondata issue #2886: [CARBONDATA-3065]make inverted index false by defaut

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user akashrn5 commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    @kevinjmh
    1. When a column is set in INVERTED_INDEX but not in SORT_COLUMNS, the data will not be sorted; only RLE will be applied to the data.
    2. The boolean `isInvertedIndex` does not by itself decide whether RLE is done. RLE is applied in both the inverted and no-inverted cases; after checking `isInvertedIndex`, the code decides whether to sort based on the isSort check.
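
To illustrate the RLE point in the reply above, here is a minimal, generic run-length encoding sketch in Java. It is not CarbonData's `rleEncodeOnData`; the class `RleSketch` and the method `encode` are hypothetical names used only for illustration, and the sketch works on plain `int` values rather than on a data page.

```java
import java.util.ArrayList;
import java.util.List;

public class RleSketch {
  // Hypothetical helper: encodes consecutive equal values as {value, runLength} pairs.
  static List<int[]> encode(int[] values) {
    List<int[]> runs = new ArrayList<>();
    int i = 0;
    while (i < values.length) {
      int j = i;
      // Advance j to the end of the current run of equal values.
      while (j < values.length && values[j] == values[i]) {
        j++;
      }
      runs.add(new int[] {values[i], j - i});
      i = j;
    }
    return runs;
  }

  public static void main(String[] args) {
    int[] sorted = {1, 1, 1, 2, 2, 3};
    int[] unsorted = {1, 2, 1, 3, 1, 2};
    System.out.println("runs for sorted input: " + encode(sorted).size());
    System.out.println("runs for unsorted input: " + encode(unsorted).size());
  }
}
```

In this sketch the sorted input collapses into fewer runs than the unsorted one, which is the general reason RLE tends to pay off more on sorted data. Whether that holds for a given CarbonData column depends on its actual values.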



---

[GitHub] carbondata issue #2886: [CARBONDATA-3065]make inverted index false by defaut

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    @akashrn5 thanks for the reply.
   
    1. Let's take a concrete case; please check whether it is right.
     
     In `DictDimensionIndexCodec#createEncoder`, with the setting I described above:
     `isSort`=false
     `isDoInvertedIndex` = true
     `isInvertedIndex`=`isSort`&&`isDoInvertedIndex` = false
    so it will go to `indexStorage = new BlockIndexerStorageForNoInvertedIndexForShort(data, false);`.
    In the constructor, we can see that it only assigns the dataPage value. No RLE.
    ```
      public BlockIndexerStorageForNoInvertedIndexForShort(byte[][] dataPage, boolean applyRLE) {
        this.dataPage = dataPage;
        if (applyRLE) {
          List<byte[]> actualDataList = new ArrayList<>();
          for (int i = 0; i < dataPage.length; i++) {
            actualDataList.add(dataPage[i]);
          }
          rleEncodeOnData(actualDataList);
        }
      }
    ```
   
    2. If isInvertedIndex is TRUE, then the isSort check must be TRUE
   
    ```
    isInvertedIndex    =      isSort    &&        isDoInvertedIndex;
           ^                     ^                       ^
           |                     |                       |
    internalUsed          SORT_COLUMNS           INVERTED_INDEX
   
    ```
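
The flag combination in the diagram above can be enumerated as a small truth table. This is only an illustrative sketch: the class `InvertedIndexFlags` and the method `effectiveInvertedIndex` are hypothetical names, not CarbonData code, and the only thing taken from the thread is the expression `isSort && isDoInvertedIndex`.

```java
public class InvertedIndexFlags {
  // Hypothetical helper mirroring the expression quoted in the thread:
  // isInvertedIndex = isSort && isDoInvertedIndex
  static boolean effectiveInvertedIndex(boolean isSort, boolean isDoInvertedIndex) {
    return isSort && isDoInvertedIndex;
  }

  public static void main(String[] args) {
    boolean[] values = {false, true};
    // Print all four combinations of the two user-facing settings.
    for (boolean isSort : values) {
      for (boolean isDoInvertedIndex : values) {
        System.out.println("SORT_COLUMNS=" + isSort
            + ", INVERTED_INDEX=" + isDoInvertedIndex
            + " -> isInvertedIndex=" + effectiveInvertedIndex(isSort, isDoInvertedIndex));
      }
    }
  }
}
```

Only the combination where both settings are true yields `isInvertedIndex=true`. The case raised in this comment (in INVERTED_INDEX but not in SORT_COLUMNS) evaluates to false.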
   
   



---

[GitHub] carbondata issue #2886: [CARBONDATA-3065]make inverted index false by defaut

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user akashrn5 commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    > @akashrn5 Please expose these properties from SDK and fileformat as well.
   
    @ravipesala handled for SDK and fileformat, please review.


---

[GitHub] carbondata issue #2886: [CARBONDATA-3065]make inverted index false by defaut

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user akashrn5 commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    @kevinjmh basically, with my changes the column-level sorting will be skipped and only row-level sorting will be done in the sort step.


---

[GitHub] carbondata pull request #2886: [CARBONDATA-3065]make inverted index false by...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user akashrn5 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2886#discussion_r230309345
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
    @@ -359,8 +359,13 @@ private CarbonCommonConstants() {
       public static final String TABLE_BLOCKSIZE = "table_blocksize";
       // table blocklet size in MB
       public static final String TABLE_BLOCKLET_SIZE = "table_blocklet_size";
    -  // set in column level to disable inverted index
    +  /**
    +   * set in column level to disable inverted index
    +   * @Deprecated :This property is deprecated, it is kep just for compatibility
    --- End diff --
   
    done


---

[GitHub] carbondata issue #2886: [CARBONDATA-3065]make inverted index false by defaut

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1238/



---

[GitHub] carbondata issue #2886: [CARBONDATA-3065]make inverted index false by defaut

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2886
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1240/



---