GitHub user QiangCai opened a pull request:
https://github.com/apache/carbondata/pull/1008 [CARBONDATA-1145] Fix single-pass issue for multi-task loading **Issue** The single-pass loading of partition table lost incremental dictionary. The query result of dictionary column is null. **Analyze** During the single-pass loading, the executor will send many initial message to server to initialize the dictionary generator. for each initial message, it will update the generator, will lead to lost some incremental dictionary. **Solution** Reuse the dictionary generator, no need to update it if exists. You can merge this pull request into a Git repository by running: $ git pull https://github.com/QiangCai/carbondata fixsinglepassforpatition Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1008.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1008 ---- commit 8b3f897878207fef56bc7445f925ebcc5876c037 Author: QiangCai <[hidden email]> Date: 2017-06-08T09:47:07Z fix single-pass issue for partition table ---- --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1008 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2291/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1008 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/carbondata-pr-spark-1.6/164/<h2>Failed Tests: <span class='status-failure'>1</span></h2><h3><a name='carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-core' /><a href='https://builds.apache.org/job/carbondata-pr-spark-1.6/164/org.apache.carbondata$carbondata-core/testReport'>carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-core</a>: <span class='status-failure'>1</span></h3><ul><li><a href='https://builds.apache.org/job/carbondata-pr-spark-1.6/164/org.apache.carbondata$carbondata-core/testReport/org.apache.carbondata.core.dictionary.generator/TableDictionaryGeneratorTest/updateGenerator/'><strong>org.apache.carbondata.core.dictionary.generator.TableDictionaryGeneratorTest.updateGenerator</strong></a></li></ul> --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1008 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/carbondata-pr-spark-1.6/358/<h2>Failed Tests: <span class='status-failure'>0</span></h2> --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1008 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2473/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1008 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2483/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1008 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/carbondata-pr-spark-1.6/368/<h2>Failed Tests: <span class='status-failure'>1</span></h2><h3><a name='carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test' /><a href='https://builds.apache.org/job/carbondata-pr-spark-1.6/368/org.apache.carbondata$carbondata-spark-common-test/testReport'>carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test</a>: <span class='status-failure'>1</span></h3><ul><li><a href='https://builds.apache.org/job/carbondata-pr-spark-1.6/368/org.apache.carbondata$carbondata-spark-common-test/testReport/org.apache.carbondata.spark.testsuite.bigdecimal/TestDimensionWithDecimalDataType/test_unsafe_with_bigdecimal/'><strong>org.apache.carbondata.spark.testsuite.bigdecimal.TestDimensionWithDecimalDataType.test unsafe with bigdecimal</strong></a></li></ul> --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user QiangCai commented on the issue:
https://github.com/apache/carbondata/pull/1008 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1008 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2484/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1008 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/carbondata-pr-spark-1.6/369/<h2>Failed Tests: <span class='status-failure'>1</span></h2><h3><a name='carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test' /><a href='https://builds.apache.org/job/carbondata-pr-spark-1.6/369/org.apache.carbondata$carbondata-spark-common-test/testReport'>carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test</a>: <span class='status-failure'>1</span></h3><ul><li><a href='https://builds.apache.org/job/carbondata-pr-spark-1.6/369/org.apache.carbondata$carbondata-spark-common-test/testReport/org.apache.carbondata.spark.testsuite.bigdecimal/TestDimensionWithDecimalDataType/test_unsafe_with_bigdecimal/'><strong>org.apache.carbondata.spark.testsuite.bigdecimal.TestDimensionWithDecimalDataType.test unsafe with bigdecimal</strong></a></li></ul> --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user QiangCai commented on the issue:
https://github.com/apache/carbondata/pull/1008 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1008 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2485/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1008 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/carbondata-pr-spark-1.6/370/<h2>Failed Tests: <span class='status-failure'>1</span></h2><h3><a name='carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test' /><a href='https://builds.apache.org/job/carbondata-pr-spark-1.6/370/org.apache.carbondata$carbondata-spark-common-test/testReport'>carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test</a>: <span class='status-failure'>1</span></h3><ul><li><a href='https://builds.apache.org/job/carbondata-pr-spark-1.6/370/org.apache.carbondata$carbondata-spark-common-test/testReport/org.apache.carbondata.spark.testsuite.bigdecimal/TestDimensionWithDecimalDataType/test_unsafe_with_bigdecimal/'><strong>org.apache.carbondata.spark.testsuite.bigdecimal.TestDimensionWithDecimalDataType.test unsafe with bigdecimal</strong></a></li></ul> --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1008#discussion_r121982426 --- Diff: core/src/main/java/org/apache/carbondata/core/dictionary/generator/IncrementalColumnDictionaryGenerator.java --- @@ -169,10 +172,22 @@ public IncrementalColumnDictionaryGenerator(CarbonDimension dimension, int maxVa } // write value to dictionary file if (reverseIncrementalCache.size() > 0) { - for (int index = 2; index < reverseIncrementalCache.size() + 2; index++) { - String value = reverseIncrementalCache.get(index); + String[] values = null; + synchronized (lock) { + // collect incremental dictionary + values = new String[currentDictionarySize - maxValue]; --- End diff -- Why it is collected to temporary list, why not directly write inside lock? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1008#discussion_r121982890 --- Diff: core/src/main/java/org/apache/carbondata/core/dictionary/generator/TableDictionaryGenerator.java --- @@ -115,7 +115,10 @@ public Integer size(DictionaryMessage key) { } public void updateGenerator(CarbonDimension dimension) { - columnMap.put(dimension.getColumnId(), - new IncrementalColumnDictionaryGenerator(dimension, 1)); + // reuse dictionary generator + if (null == columnMap.get(dimension.getColumnId())) { + columnMap.put(dimension.getColumnId(), --- End diff -- lock is required to be taken on columnMap. with double checking, otherwise still parallel intialization is possible. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1008#discussion_r121983478 --- Diff: core/src/main/java/org/apache/carbondata/core/dictionary/generator/TableDictionaryGenerator.java --- @@ -115,7 +115,10 @@ public Integer size(DictionaryMessage key) { } public void updateGenerator(CarbonDimension dimension) { - columnMap.put(dimension.getColumnId(), - new IncrementalColumnDictionaryGenerator(dimension, 1)); + // reuse dictionary generator + if (null == columnMap.get(dimension.getColumnId())) { + columnMap.put(dimension.getColumnId(), --- End diff -- ServerDictionaryGenerator.java initializeGeneratorForTable() -> if (tableMap.get(key.getTableUniqueName()) == null) also required lock on tableMap during double checking. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1008 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2488/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1008 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/carbondata-pr-spark-1.6/373/<h2>Failed Tests: <span class='status-failure'>2</span></h2><h3><a name='carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test' /><a href='https://builds.apache.org/job/carbondata-pr-spark-1.6/373/org.apache.carbondata$carbondata-spark-common-test/testReport'>carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test</a>: <span class='status-failure'>2</span></h3><ul><li><a href='https://builds.apache.org/job/carbondata-pr-spark-1.6/373/org.apache.carbondata$carbondata-spark-common-test/testReport/org.apache.carbondata.spark.testsuite.dataretention/DataRetentionConcurrencyTestCase/DataRetention_Concurrency_load_date/'><strong>org.apache.carbondata.spark.testsuite.dataretention.DataRetentionConcurrencyTestCase.DataRetention_Concurrency_load_date</strong></a></li><li><a href='https://builds.apache.org/job/carbondata-pr-spark-1.6/373/org.apache.carbondata$carbondata-spark-common-test/testR eport/org.apache.carbondata.spark.testsuite.bigdecimal/TestDimensionWithDecimalDataType/test_unsafe_with_bigdecimal/'><strong>org.apache.carbondata.spark.testsuite.bigdecimal.TestDimensionWithDecimalDataType.test unsafe with bigdecimal</strong></a></li></ul> --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1008 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2489/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1008 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/carbondata-pr-spark-1.6/374/<h2>Failed Tests: <span class='status-failure'>2</span></h2><h3><a name='carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test' /><a href='https://builds.apache.org/job/carbondata-pr-spark-1.6/374/org.apache.carbondata$carbondata-spark-common-test/testReport'>carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test</a>: <span class='status-failure'>2</span></h3><ul><li><a href='https://builds.apache.org/job/carbondata-pr-spark-1.6/374/org.apache.carbondata$carbondata-spark-common-test/testReport/org.apache.carbondata.spark.testsuite.dataload/TestGlobalSortDataLoad/LOAD_with_DELETE/'><strong>org.apache.carbondata.spark.testsuite.dataload.TestGlobalSortDataLoad.LOAD with DELETE</strong></a></li><li><a href='https://builds.apache.org/job/carbondata-pr-spark-1.6/374/org.apache.carbondata$carbondata-spark-common-test/testReport/org.apache.carbondata.spark.testsuite.dataload/TestGlobalSortD ataLoad/Test_with_different_date_types/'><strong>org.apache.carbondata.spark.testsuite.dataload.TestGlobalSortDataLoad.Test with different date types</strong></a></li></ul> --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
Free forum by Nabble | Edit this page |