GitHub user manishgupta88 opened a pull request:
https://github.com/apache/carbondata/pull/2454

[WIP] [CARBONDATA-2701] Refactor code to store minimal required info in Block and Blocklet Cache

Things done as part of this PR:
1. Refactored code to keep only minimal information in the block and blocklet cache.
2. Introduced a segment properties holder at JVM level to hold the segment properties. As it is a heavy object, a new segment properties object is created only when the schema or cardinality of a table changes.

This PR depends on PR #2437

 - [ ] Any interfaces changed? No
 - [ ] Any backward compatibility impacted? NA
 - [ ] Document update required? No
 - [ ] Testing done: Yes
 - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/carbondata refactor_segmentproperties

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2454.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2454

----
commit c06de06046da4efe6dc606f410686dcea256d46f
Author: manishgupta88 <tomanishgupta18@...>
Date:   2018-06-25T06:43:00Z

    segregate block and blocklet cache

commit a5017751f45a43ce75a98610214049e1c894e1e7
Author: manishgupta88 <tomanishgupta18@...>
Date:   2018-07-04T15:30:54Z

    Refactor Block and Blocklet DataMap to store only segmentProeprties Index instead of segmentProperties

----

---
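Editor's sketch, not code from the PR: item 2 of the description and the second commit suggest that each cached entry keeps only an integer index into a JVM-wide holder, and that a new heavy object is built only when the schema or cardinality differs. The minimal Java sketch below illustrates that idea under stated assumptions. Every class and method name in it (HeavySegmentProperties, SegmentPropertiesKey, SegmentPropertiesHolder, register, get) is hypothetical, and a list of column names stands in for the real column schema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the heavy segment properties object. */
final class HeavySegmentProperties {
  final List<String> columnNames;
  final int[] cardinality;

  HeavySegmentProperties(List<String> columnNames, int[] cardinality) {
    this.columnNames = Collections.unmodifiableList(new ArrayList<>(columnNames));
    this.cardinality = cardinality.clone();
  }
}

/** Hypothetical lookup key: table name plus schema plus cardinality. */
final class SegmentPropertiesKey {
  private final String tableName;
  private final List<String> columnNames;
  private final int[] cardinality;

  SegmentPropertiesKey(String tableName, List<String> columnNames, int[] cardinality) {
    this.tableName = tableName;
    this.columnNames = new ArrayList<>(columnNames);
    this.cardinality = cardinality.clone();
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof SegmentPropertiesKey)) {
      return false;
    }
    SegmentPropertiesKey other = (SegmentPropertiesKey) o;
    return tableName.equals(other.tableName)
        && columnNames.equals(other.columnNames)
        && Arrays.equals(cardinality, other.cardinality);
  }

  @Override
  public int hashCode() {
    return 31 * Objects.hash(tableName, columnNames) + Arrays.hashCode(cardinality);
  }
}

/** Hypothetical JVM-level holder: callers keep only the returned int index. */
final class SegmentPropertiesHolder {
  private static final SegmentPropertiesHolder INSTANCE = new SegmentPropertiesHolder();

  private final Map<SegmentPropertiesKey, Integer> keyToIndex = new ConcurrentHashMap<>();
  private final Map<Integer, HeavySegmentProperties> indexToProperties =
      new ConcurrentHashMap<>();
  private final AtomicInteger nextIndex = new AtomicInteger();

  private SegmentPropertiesHolder() {
  }

  static SegmentPropertiesHolder getInstance() {
    return INSTANCE;
  }

  /** Returns the existing index for an identical schema/cardinality, else creates one. */
  int register(String tableName, List<String> columnNames, int[] cardinality) {
    SegmentPropertiesKey key = new SegmentPropertiesKey(tableName, columnNames, cardinality);
    return keyToIndex.computeIfAbsent(key, k -> {
      int index = nextIndex.getAndIncrement();
      // The heavy object is built once per distinct key, before the index is published.
      indexToProperties.put(index, new HeavySegmentProperties(columnNames, cardinality));
      return index;
    });
  }

  HeavySegmentProperties get(int index) {
    return indexToProperties.get(index);
  }
}
```

In this sketch, registering the same table with an unchanged schema and cardinality returns the same index, while a changed cardinality yields a new index and a new heavy object. Eviction of unused entries is deliberately left out.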
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2454

SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5637/

---
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2454

SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5638/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2454

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6843/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2454

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6845/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2454

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5636/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2454

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6890/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2454

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5670/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2454

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6927/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2454

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5712/

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2454

SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5700/

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2454

SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5701/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2454

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6928/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2454

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5713/

---
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2454#discussion_r200870385

--- Diff: core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java ---
@@ -321,4 +328,43 @@ private static boolean isSameColumnSchemaList(List<ColumnSchema> indexFileColumn
     }
     return updatedValues;
   }
+
+  /**
+   * Convert schema to binary
+   */
+  public static byte[] convertSchemaToBinary(List<ColumnSchema> columnSchemas) throws IOException {
+    ByteArrayOutputStream stream = new ByteArrayOutputStream();
+    DataOutput dataOutput = new DataOutputStream(stream);
+    dataOutput.writeShort(columnSchemas.size());
+    for (ColumnSchema columnSchema : columnSchemas) {
+      if (columnSchema.getColumnReferenceId() == null) {
+        columnSchema.setColumnReferenceId(columnSchema.getColumnUniqueId());
+      }
+      columnSchema.write(dataOutput);
+    }
+    byte[] byteArray = stream.toByteArray();
+    // Compress with snappy to reduce the size of schema
+    return Snappy.rawCompress(byteArray, byteArray.length);
--- End diff --

Use compressor factory.

---
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2454#discussion_r200870442

--- Diff: core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java ---
@@ -321,4 +328,43 @@ private static boolean isSameColumnSchemaList(List<ColumnSchema> indexFileColumn
     }
     return updatedValues;
   }
+
+  /**
+   * Convert schema to binary
+   */
+  public static byte[] convertSchemaToBinary(List<ColumnSchema> columnSchemas) throws IOException {
+    ByteArrayOutputStream stream = new ByteArrayOutputStream();
+    DataOutput dataOutput = new DataOutputStream(stream);
+    dataOutput.writeShort(columnSchemas.size());
+    for (ColumnSchema columnSchema : columnSchemas) {
+      if (columnSchema.getColumnReferenceId() == null) {
+        columnSchema.setColumnReferenceId(columnSchema.getColumnUniqueId());
+      }
+      columnSchema.write(dataOutput);
+    }
+    byte[] byteArray = stream.toByteArray();
+    // Compress with snappy to reduce the size of schema
+    return Snappy.rawCompress(byteArray, byteArray.length);
+  }
+
+  /**
+   * Read column schema from binary
+   *
+   * @param schemaArray
+   * @throws IOException
+   */
+  public static List<ColumnSchema> readColumnSchema(byte[] schemaArray) throws IOException {
+    // uncompress it.
+    schemaArray = Snappy.uncompress(schemaArray);
--- End diff --

Same as above

---
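Editor's sketch, not code from the PR: both review comments on BlockletDataMapUtil ask that compression go through a compressor factory rather than calling Snappy directly. The project's actual factory API is not shown in this thread, so the Java sketch below only illustrates the shape of that change. SchemaCompressor, DeflateSchemaCompressor, SchemaCompressorFactory and SchemaCodec are hypothetical names, java.util.zip Deflater/Inflater stands in for Snappy so the example needs no external library, and a list of column names stands in for List<ColumnSchema>.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical abstraction so callers never name a concrete codec. */
interface SchemaCompressor {
  byte[] compress(byte[] input);

  byte[] uncompress(byte[] input) throws IOException;
}

/** Stand-in codec built on java.util.zip (assumption: replaces Snappy here). */
final class DeflateSchemaCompressor implements SchemaCompressor {
  @Override
  public byte[] compress(byte[] input) {
    Deflater deflater = new Deflater();
    deflater.setInput(input);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[256];
    while (!deflater.finished()) {
      out.write(buffer, 0, deflater.deflate(buffer));
    }
    deflater.end();
    return out.toByteArray();
  }

  @Override
  public byte[] uncompress(byte[] input) throws IOException {
    Inflater inflater = new Inflater();
    inflater.setInput(input);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[256];
    try {
      while (!inflater.finished()) {
        int n = inflater.inflate(buffer);
        if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
          throw new IOException("Truncated or corrupt compressed schema");
        }
        out.write(buffer, 0, n);
      }
    } catch (DataFormatException e) {
      throw new IOException(e);
    } finally {
      inflater.end();
    }
    return out.toByteArray();
  }
}

/** Hypothetical factory: the single place where the codec is chosen. */
final class SchemaCompressorFactory {
  private static final SchemaCompressor COMPRESSOR = new DeflateSchemaCompressor();

  private SchemaCompressorFactory() {
  }

  static SchemaCompressor getCompressor() {
    return COMPRESSOR;
  }
}

/** Mirrors the write/read pair in the diff, using column names as the schema. */
final class SchemaCodec {
  static byte[] toBinary(List<String> columnNames) throws IOException {
    ByteArrayOutputStream stream = new ByteArrayOutputStream();
    DataOutputStream dataOutput = new DataOutputStream(stream);
    dataOutput.writeShort(columnNames.size());
    for (String name : columnNames) {
      dataOutput.writeUTF(name);
    }
    dataOutput.flush();
    return SchemaCompressorFactory.getCompressor().compress(stream.toByteArray());
  }

  static List<String> fromBinary(byte[] schemaArray) throws IOException {
    byte[] raw = SchemaCompressorFactory.getCompressor().uncompress(schemaArray);
    DataInputStream dataInput = new DataInputStream(new ByteArrayInputStream(raw));
    int count = dataInput.readShort();
    List<String> columnNames = new ArrayList<>(count);
    for (int i = 0; i < count; i++) {
      columnNames.add(dataInput.readUTF());
    }
    return columnNames;
  }
}
```

With this arrangement the write and read paths always use the same codec, and swapping it means changing only the factory.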
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2454#discussion_r200870697

--- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java ---
@@ -17,7 +17,11 @@
 package org.apache.carbondata.core.indexstore.blockletindex;

 import java.io.IOException;
-import java.util.*;
+import java.util.ArrayList;
--- End diff --

Remove unnecessary changes

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2454

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5718/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2454

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6934/

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2454

SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5706/

---