GitHub user jackylk opened a pull request:
https://github.com/apache/carbondata/pull/1059

[CARBONDATA-1124] Use raw compression while encoding measures

Use zero-copy raw compression from Snappy to encode measures (UnsafeColumnPage).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jackylk/incubator-carbondata rawcomp

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1059.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1059

----
commit ca8fa5148f0629f7b376eef3873d32b9f6c19806
Author: jackylk <[hidden email]>
Date:   2017-06-19T06:52:03Z

    use raw compression

----

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1059 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/carbondata-pr-spark-1.6/474/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1059 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2578/
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1059 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/carbondata-pr-spark-1.6/475/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1059 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2579/
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1059 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/carbondata-pr-spark-1.6/477/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1059 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2581/
Github user QiangCai commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1059#discussion_r123138707

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java ---
@@ -326,9 +327,17 @@ public void encode(PrimitiveCodec codec) {
   }

   @Override
-  public byte[] compress(Compressor compressor) {
-    // TODO: use zero-copy raw compression
-    return super.compress(compressor);
+  public byte[] compress(Compressor compressor) throws MemoryException, IOException {
+    // use raw compression and copy to byte[]
+    int inputSize = pageSize << dataType.getSizeBits();
+    int compressedMaxSize = compressor.maxCompressedLength(inputSize);
+    MemoryBlock compressed = UnsafeMemoryManager.allocateMemoryWithRetry(compressedMaxSize);
--- End diff --

We need to use UnsafeMemoryAllocator directly, because UnsafeMemoryManager may use HeapMemoryAllocator.
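
For readers following the hunk above, here is a minimal sketch of the size arithmetic in `pageSize << dataType.getSizeBits()`. It assumes that `getSizeBits()` returns log2 of the value width in bytes, so the shift multiplies the row count by the width. The `FixedType` enum and the row counts are hypothetical stand-ins for illustration and are not taken from CarbonData.

```java
final class PageByteSizeSketch {
  /** Hypothetical stand-in for the data type; sizeBits is log2 of the value width in bytes. */
  enum FixedType {
    BYTE(0), SHORT(1), INT(2), LONG(3), DOUBLE(3);

    private final int sizeBits;

    FixedType(int sizeBits) {
      this.sizeBits = sizeBits;
    }

    int getSizeBits() {
      return sizeBits;
    }
  }

  /** Bytes occupied by pageSize fixed-width values: pageSize * 2^sizeBits. */
  static int inputSize(int pageSize, FixedType type) {
    // multiplyExact throws ArithmeticException on int overflow instead of wrapping around
    return Math.multiplyExact(pageSize, 1 << type.getSizeBits());
  }

  public static void main(String[] args) {
    System.out.println(inputSize(32000, FixedType.LONG));  // prints 256000
    System.out.println(inputSize(32000, FixedType.SHORT)); // prints 64000
  }
}
```

The sketch uses `Math.multiplyExact` rather than a bare shift only so that an oversized page fails loudly; for in-range values the result is the same as the shift in the diff.
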
Github user QiangCai commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1059#discussion_r123139222

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java ---
@@ -326,9 +327,17 @@ public void encode(PrimitiveCodec codec) {
   }

   @Override
-  public byte[] compress(Compressor compressor) {
-    // TODO: use zero-copy raw compression
-    return super.compress(compressor);
+  public byte[] compress(Compressor compressor) throws MemoryException, IOException {
+    // use raw compression and copy to byte[]
+    int inputSize = pageSize << dataType.getSizeBits();
+    int compressedMaxSize = compressor.maxCompressedLength(inputSize);
+    MemoryBlock compressed = UnsafeMemoryManager.allocateMemoryWithRetry(compressedMaxSize);
+    long outSize = compressor.rawCompress(baseOffset, inputSize, compressed.getBaseOffset());
+    assert outSize < Integer.MAX_VALUE;
+    byte[] output = new byte[(int) outSize];
+    CarbonUnsafe.unsafe.copyMemory(compressed.getBaseObject(), compressed.getBaseOffset(),
--- End diff --

If we use UnsafeMemoryAllocator, we need to call copyMemory(long, long, long).
Github user QiangCai commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1059#discussion_r123139699

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java ---
@@ -326,9 +327,17 @@ public void encode(PrimitiveCodec codec) {
   }

   @Override
-  public byte[] compress(Compressor compressor) {
-    // TODO: use zero-copy raw compression
-    return super.compress(compressor);
+  public byte[] compress(Compressor compressor) throws MemoryException, IOException {
+    // use raw compression and copy to byte[]
+    int inputSize = pageSize << dataType.getSizeBits();
+    int compressedMaxSize = compressor.maxCompressedLength(inputSize);
+    MemoryBlock compressed = UnsafeMemoryManager.allocateMemoryWithRetry(compressedMaxSize);
--- End diff --

We need to check which allocator was used.
Github user QiangCai commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1059#discussion_r123139760

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java ---
@@ -326,9 +327,17 @@ public void encode(PrimitiveCodec codec) {
   }

   @Override
-  public byte[] compress(Compressor compressor) {
-    // TODO: use zero-copy raw compression
-    return super.compress(compressor);
+  public byte[] compress(Compressor compressor) throws MemoryException, IOException {
+    // use raw compression and copy to byte[]
+    int inputSize = pageSize << dataType.getSizeBits();
+    int compressedMaxSize = compressor.maxCompressedLength(inputSize);
+    MemoryBlock compressed = UnsafeMemoryManager.allocateMemoryWithRetry(compressedMaxSize);
+    long outSize = compressor.rawCompress(baseOffset, inputSize, compressed.getBaseOffset());
+    assert outSize < Integer.MAX_VALUE;
+    byte[] output = new byte[(int) outSize];
+    CarbonUnsafe.unsafe.copyMemory(compressed.getBaseObject(), compressed.getBaseOffset(),
--- End diff --

We also need to check which allocator was used here.
Github user QiangCai commented on the issue:
https://github.com/apache/carbondata/pull/1059 @jackylk Both UnsafeMemoryAllocator and HeapMemoryAllocator should be supported.
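
One way to read the request above is that the compress path should branch on where the memory actually lives. The following is a rough sketch of that decision only, under the assumption that a block is addressed as a (base object, offset) pair and that a null base object means off-heap memory. `Block`, `CompressPath`, and `choose` are hypothetical names for illustration and do not correspond to CarbonData classes.

```java
final class AllocatorPathSketch {
  /** Hypothetical stand-in for a memory block using (base object, offset) addressing. */
  static final class Block {
    final Object baseObject; // assumed: null means the block is off-heap
    final long baseOffset;   // absolute address if off-heap, else offset into baseObject

    Block(Object baseObject, long baseOffset) {
      this.baseObject = baseObject;
      this.baseOffset = baseOffset;
    }

    boolean isOffHeap() {
      return baseObject == null;
    }
  }

  enum CompressPath { RAW_ADDRESS, BYTE_ARRAY_FALLBACK }

  /** An address-only raw call is usable only when both input and output are off-heap. */
  static CompressPath choose(Block input, Block output) {
    return input.isOffHeap() && output.isOffHeap()
        ? CompressPath.RAW_ADDRESS
        : CompressPath.BYTE_ARRAY_FALLBACK;
  }

  public static void main(String[] args) {
    Block offHeap = new Block(null, 0x1000L);
    Block onHeap = new Block(new long[16], 16L);
    System.out.println(choose(offHeap, offHeap)); // prints RAW_ADDRESS
    System.out.println(choose(offHeap, onHeap));  // prints BYTE_ARRAY_FALLBACK
  }
}
```

The sketch performs no real memory access; it only shows the check that a caller would make before passing bare addresses to a raw compression call.
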
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1059#discussion_r123155018

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java ---
@@ -326,9 +327,17 @@ public void encode(PrimitiveCodec codec) {
   }

   @Override
-  public byte[] compress(Compressor compressor) {
-    // TODO: use zero-copy raw compression
-    return super.compress(compressor);
+  public byte[] compress(Compressor compressor) throws MemoryException, IOException {
+    // use raw compression and copy to byte[]
+    int inputSize = pageSize << dataType.getSizeBits();
+    int compressedMaxSize = compressor.maxCompressedLength(inputSize);
+    MemoryBlock compressed = UnsafeMemoryManager.allocateMemoryWithRetry(compressedMaxSize);
--- End diff --

I think it is better to change the logic in UnsafeMemoryManager to simplify the calling code. UnsafeMemoryManager currently decides whether to use unsafe memory based on:
```
boolean offHeap = Boolean.parseBoolean(CarbonProperties.getInstance()
    .getProperty(CarbonCommonConstants.ENABLE_OFFHEAP_SORT,
        CarbonCommonConstants.ENABLE_OFFHEAP_SORT_DEFAULT));
```
So I suggest changing this part so that a single configuration controls unsafe memory. I will then remove the other unsafe configurations, such as ENABLE_UNSAFE_COLUMN_PAGE_LOADING.

@ravipesala please check.
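
As a rough illustration of the "one configuration only" idea, the sketch below reads a single boolean flag with a default and derives the allocator kind from it. It uses plain `java.util.Properties` instead of CarbonProperties, and the property key, its default, and the `AllocatorKind` enum are hypothetical names chosen for the example, not CarbonData constants.

```java
import java.util.Properties;

final class UnsafeConfigSketch {
  // Hypothetical key and default, for illustration only.
  static final String ENABLE_UNSAFE = "example.enable.unsafe";
  static final String ENABLE_UNSAFE_DEFAULT = "false";

  enum AllocatorKind { UNSAFE, HEAP }

  /** Reads the single flag; a missing or unparsable value falls back to false. */
  static boolean useOffHeap(Properties props) {
    return Boolean.parseBoolean(props.getProperty(ENABLE_UNSAFE, ENABLE_UNSAFE_DEFAULT));
  }

  /** Every caller derives the allocator from the same flag. */
  static AllocatorKind allocatorFor(Properties props) {
    return useOffHeap(props) ? AllocatorKind.UNSAFE : AllocatorKind.HEAP;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    System.out.println(allocatorFor(props)); // prints HEAP
    props.setProperty(ENABLE_UNSAFE, "true");
    System.out.println(allocatorFor(props)); // prints UNSAFE
  }
}
```

With a single flag, code that allocates a block and code that later reads it cannot disagree about whether the block is on-heap or off-heap.
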
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1059#discussion_r123155037

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java ---
@@ -326,9 +327,17 @@ public void encode(PrimitiveCodec codec) {
   }

   @Override
-  public byte[] compress(Compressor compressor) {
-    // TODO: use zero-copy raw compression
-    return super.compress(compressor);
+  public byte[] compress(Compressor compressor) throws MemoryException, IOException {
+    // use raw compression and copy to byte[]
+    int inputSize = pageSize << dataType.getSizeBits();
+    int compressedMaxSize = compressor.maxCompressedLength(inputSize);
+    MemoryBlock compressed = UnsafeMemoryManager.allocateMemoryWithRetry(compressedMaxSize);
+    long outSize = compressor.rawCompress(baseOffset, inputSize, compressed.getBaseOffset());
+    assert outSize < Integer.MAX_VALUE;
+    byte[] output = new byte[(int) outSize];
+    CarbonUnsafe.unsafe.copyMemory(compressed.getBaseObject(), compressed.getBaseOffset(),
--- End diff --

Same as the comment above.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1059 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2661/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1059 Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/89/
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1059 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/carbondata-pr-spark-1.6/580/

Failed Tests: 1

carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test: 1
    org.apache.carbondata.spark.testsuite.dataretention.DataRetentionConcurrencyTestCase.DataRetention_Concurrency_load_date
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1059 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2678/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1059 Build Failed with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/105/