[GitHub] carbondata pull request #1097: [WIP] [CARBONDATA-1229] Skip dictionary and d...

classic Classic list List threaded Threaded
65 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1097: [WIP] [CARBONDATA-1229] Skip dictionary and d...

qiuchenjian-2
GitHub user kunal642 opened a pull request:

    https://github.com/apache/carbondata/pull/1097

    [WIP] [CARBONDATA-1229] Skip dictionary and data writing if table is dropped

   

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kunal642/carbondata single_pass_fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1097.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1097
   
----
commit 34d79d721fe167ebd11a32c56d2d9b8848c6ca36
Author: kunal642 <[hidden email]>
Date:   2017-06-06T13:02:38Z

    updated map for dictionary generator

commit c77da5b724e0df3146fb7c38b55e2ffc0724355d
Author: kunal642 <[hidden email]>
Date:   2017-06-13T10:50:56Z

    dictionary server update

commit db18c8d286886aab786022811a6aa2f13a73ecb1
Author: kunal642 <[hidden email]>
Date:   2017-06-26T11:40:20Z

    added commit point validation

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1097: [WIP] [CARBONDATA-1229] Skip dictionary and data wri...

qiuchenjian-2
Github user asfgit commented on the issue:

    https://github.com/apache/carbondata/pull/1097
 
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1097: [WIP] [CARBONDATA-1229] Skip dictionary and data wri...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1097
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2713/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1097: [WIP] [CARBONDATA-1229] Skip dictionary and data wri...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit commented on the issue:

    https://github.com/apache/carbondata/pull/1097
 
   
    Refer to this link for build results (access rights to CI server needed):
    https://builds.apache.org/job/carbondata-pr-spark-1.6/638/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1097: [WIP] [CARBONDATA-1229] Skip dictionary and data wri...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1097
 
    Build Failed with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/138/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1097: [WIP] [CARBONDATA-1229] Skip dictionary and data wri...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1097
 
    Build Failed with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/143/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1097: [WIP] [CARBONDATA-1229] Skip dictionary and data wri...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit commented on the issue:

    https://github.com/apache/carbondata/pull/1097
 
   
    Refer to this link for build results (access rights to CI server needed):
    https://builds.apache.org/job/carbondata-pr-spark-1.6/643/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1097: [WIP] [CARBONDATA-1229] Skip dictionary and data wri...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1097
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2718/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1097: [WIP] [CARBONDATA-1229] Skip dictionary and data wri...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit commented on the issue:

    https://github.com/apache/carbondata/pull/1097
 
   
    Refer to this link for build results (access rights to CI server needed):
    https://builds.apache.org/job/carbondata-pr-spark-1.6/644/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1097: [WIP] [CARBONDATA-1229] Skip dictionary and data wri...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1097
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2719/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1097: [WIP] [CARBONDATA-1229] Skip dictionary and data wri...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1097
 
    Build Failed with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/144/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1097: [WIP] [CARBONDATA-1229] Skip dictionary and d...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1097#discussion_r124188452
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/dictionary/generator/ServerDictionaryGenerator.java ---
    @@ -40,21 +42,21 @@
       @Override
       public Integer generateKey(DictionaryMessage value)
           throws DictionaryGenerationException {
    -    TableDictionaryGenerator generator = tableMap.get(value.getTableUniqueName());
    -    assert generator != null : "Table initialization for generator is not done";
    +    TableDictionaryGenerator generator = tableMap.get(value.getTableUniqueId());
    +    if (generator == null) {
    +      throw new DictionaryGenerationException("Table Not Found");
    --- End diff --
   
    Change message in the assert statement....do not throw explicit exception


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1097: [WIP] [CARBONDATA-1229] Skip dictionary and d...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1097#discussion_r124188599
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/dictionary/generator/ServerDictionaryGenerator.java ---
    @@ -40,21 +42,21 @@
       @Override
       public Integer generateKey(DictionaryMessage value)
           throws DictionaryGenerationException {
    -    TableDictionaryGenerator generator = tableMap.get(value.getTableUniqueName());
    -    assert generator != null : "Table initialization for generator is not done";
    +    TableDictionaryGenerator generator = tableMap.get(value.getTableUniqueId());
    +    if (generator == null) {
    +      throw new DictionaryGenerationException("Table Not Found");
    +    }
         return generator.generateKey(value);
       }
     
    -  public void initializeGeneratorForTable(DictionaryMessage key) {
    -    CarbonMetadata metadata = CarbonMetadata.getInstance();
    -    CarbonTable carbonTable = metadata.getCarbonTable(key.getTableUniqueName());
    +  public void initializeGeneratorForTable(DictionaryMessage key, CarbonTable carbonTable) {
         CarbonDimension dimension = carbonTable.getPrimitiveDimensionByName(
                 key.getTableUniqueName(), key.getColumnName());
         // initialize TableDictionaryGenerator first
         if (tableMap.get(key.getTableUniqueName()) == null) {
           synchronized (tableMap) {
             if (tableMap.get(key.getTableUniqueName()) == null) {
    -          tableMap.put(key.getTableUniqueName(), new TableDictionaryGenerator(dimension));
    +          tableMap.put(key.getTableUniqueName(), new TableDictionaryGenerator(dimension, carbonTable));
    --- End diff --
   
    make unique name as dbName_tableID


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1097: [WIP] [CARBONDATA-1229] Skip dictionary and d...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1097#discussion_r124188774
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/dictionary/generator/ServerDictionaryGenerator.java ---
    @@ -64,24 +66,34 @@ public void initializeGeneratorForTable(DictionaryMessage key) {
         }
       }
     
    +  public void initializeGeneratorForColumn(DictionaryMessage key) {
    +    tableMap.get(key.getTableUniqueId()).updateGenerator(key);
    --- End diff --
   
    make sure get Method is not returning null from the map else NullPointerException will be thrown


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1097: [WIP] [CARBONDATA-1229] Skip dictionary and d...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1097#discussion_r124188911
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/dictionary/generator/ServerDictionaryGenerator.java ---
    @@ -64,24 +66,34 @@ public void initializeGeneratorForTable(DictionaryMessage key) {
         }
       }
     
    +  public void initializeGeneratorForColumn(DictionaryMessage key) {
    +    tableMap.get(key.getTableUniqueId()).updateGenerator(key);
    +  }
    +
       public Integer size(DictionaryMessage key) {
    -    TableDictionaryGenerator generator = tableMap.get(key.getTableUniqueName());
    +    TableDictionaryGenerator generator = tableMap.get(key.getTableUniqueId());
         assert generator != null : "Table intialization for generator is not done";
    --- End diff --
   
    Change the message to "Table not found" as above assert statement


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1097: [WIP] [CARBONDATA-1229] Skip dictionary and d...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1097#discussion_r124189008
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/dictionary/generator/ServerDictionaryGenerator.java ---
    @@ -64,24 +66,34 @@ public void initializeGeneratorForTable(DictionaryMessage key) {
         }
       }
     
    +  public void initializeGeneratorForColumn(DictionaryMessage key) {
    +    tableMap.get(key.getTableUniqueId()).updateGenerator(key);
    +  }
    +
       public Integer size(DictionaryMessage key) {
    -    TableDictionaryGenerator generator = tableMap.get(key.getTableUniqueName());
    +    TableDictionaryGenerator generator = tableMap.get(key.getTableUniqueId());
         assert generator != null : "Table intialization for generator is not done";
         return generator.size(key);
       }
     
       public void writeDictionaryData() throws Exception {
         for (String tableUniqueName: tableMap.keySet()) {
           TableDictionaryGenerator generator = tableMap.get(tableUniqueName);
    -      generator.writeDictionaryData(tableUniqueName);
    +      generator.writeDictionaryData();
         }
       }
     
    -  public void writeTableDictionaryData(String tableUniqueName) throws Exception {
    -    TableDictionaryGenerator generator = tableMap.get(tableUniqueName);
    -    if (generator != null) {
    -      generator.writeDictionaryData(tableUniqueName);
    -    }
    +  public void writeTableDictionaryData(String tableUniqueId, CarbonTable carbonTable) throws Exception {
    +      TableDictionaryGenerator generator = tableMap.get(tableUniqueId);
    +    boolean tableExists = FileFactory.isFileExist(carbonTable.getMetaDataFilepath() + CarbonCommonConstants.FILE_SEPARATOR + tableUniqueId,
    --- End diff --
   
    correct the indentation


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1097: [WIP] [CARBONDATA-1229] Skip dictionary and d...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1097#discussion_r124192252
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/dictionary/generator/key/DictionaryMessage.java ---
    @@ -55,6 +60,10 @@ public void readData(ByteBuf byteBuf) {
         byteBuf.readBytes(tableBytes);
         tableUniqueName = new String(tableBytes);
     
    +    byte[] tableIdBytes = new byte[byteBuf.readInt()];
    +    byteBuf.readBytes(tableIdBytes);
    +    tableUniqueId = new String(tableIdBytes);
    --- End diff --
   
    creating a string object for every message is a costly operation. Read bytes and return the byte array and maintain a map with byte array as key object


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1097: [WIP] [CARBONDATA-1229] Skip dictionary and d...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1097#discussion_r124194437
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/dictionary/server/DictionaryServer.java ---
    @@ -146,21 +147,31 @@ public void shutdown() throws Exception {
        * Write dictionary to the store.
        * @throws Exception
        */
    -  public void writeDictionary() throws Exception {
    +  public void writeDictionary(CarbonTable carbonTable) throws Exception {
         DictionaryMessage key = new DictionaryMessage();
         key.setType(DictionaryMessageType.WRITE_DICTIONARY);
    -    dictionaryServerHandler.processMessage(key);
    +    dictionaryServerHandler.processMessage(key, carbonTable);
    +  }
    +
    +  public void initializeDictionaryGenerator(CarbonTable carbonTable) throws Exception {
    +    DictionaryMessage key = new DictionaryMessage();
    +    key.setType(DictionaryMessageType.TABLE_INTIALIZATION);
    +    key.setColumnName("dummy");
    +    key.setTableUniqueId(carbonTable.getCarbonTableIdentifier().getTableId());
    +    key.setTableUniqueName(carbonTable.getTableUniqueName());
    --- End diff --
   
    check if tableUniqueName is required...I think we should be able to work using tableUniqueId


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1097: [WIP] [CARBONDATA-1229] Skip dictionary and d...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1097#discussion_r124194537
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/dictionary/server/DictionaryServer.java ---
    @@ -146,21 +147,31 @@ public void shutdown() throws Exception {
        * Write dictionary to the store.
        * @throws Exception
        */
    -  public void writeDictionary() throws Exception {
    +  public void writeDictionary(CarbonTable carbonTable) throws Exception {
         DictionaryMessage key = new DictionaryMessage();
         key.setType(DictionaryMessageType.WRITE_DICTIONARY);
    -    dictionaryServerHandler.processMessage(key);
    +    dictionaryServerHandler.processMessage(key, carbonTable);
    +  }
    +
    +  public void initializeDictionaryGenerator(CarbonTable carbonTable) throws Exception {
    +    DictionaryMessage key = new DictionaryMessage();
    +    key.setType(DictionaryMessageType.TABLE_INTIALIZATION);
    +    key.setColumnName("dummy");
    +    key.setTableUniqueId(carbonTable.getCarbonTableIdentifier().getTableId());
    +    key.setTableUniqueName(carbonTable.getTableUniqueName());
    +    dictionaryServerHandler.processMessage(key, carbonTable);
       }
     
       /**
        *  Write Dictionary for one table.
        * @throws Exception
        */
     
    -  public void writeTableDictionary(String uniqueTableName) throws Exception {
    +  public void writeTableDictionary(CarbonTable carbonTable) throws Exception {
         DictionaryMessage key = new DictionaryMessage();
    -    key.setTableUniqueName(uniqueTableName);
    +    key.setTableUniqueName(carbonTable.getTableUniqueName());
    --- End diff --
   
    same comment as above...check if we can remove tableUniqueName


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1097: [WIP] [CARBONDATA-1229] Skip dictionary and d...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1097#discussion_r124194927
 
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSessionExample.scala ---
    @@ -51,7 +51,7 @@ object CarbonSessionExample {
     
         spark.sql("DROP TABLE IF EXISTS carbon_table")
     
    -    // Create table
    +//     Create table
    --- End diff --
   
    This class need not be modified...remove the class


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
1234