GitHub user kumarvishal09 opened a pull request:
https://github.com/apache/incubator-carbondata/pull/631 [WIP]Adding Header And Making Footer Optional Currently carbon does not support appendable format, so below changes is to support appendable format in V3 data file format by making footer option and added header in V3 carbon data file . You can merge this pull request into a Git repository by running: $ git pull https://github.com/kumarvishal09/incubator-carbondata AddingHeaderAndMakingFooterOptional Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/631.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #631 ---- commit 7b5cad74f6e050001994c2b4426f54b903051518 Author: kumarvishal <[hidden email]> Date: 2017-03-07T12:54:13Z Adding Header And Making Footer Optional ---- --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/631 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1029/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/631#discussion_r104660524 --- Diff: pom.xml --- @@ -93,6 +93,7 @@ <modules> <module>common</module> + <module>format</module> --- End diff -- This is not required since we are using format jar from maven repo --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/631#discussion_r104660895 --- Diff: pom.xml --- @@ -93,6 +93,7 @@ <modules> <module>common</module> + <module>format</module> --- End diff -- I have change the format to run all the test i am adding this otherwise build will fail, after passing all the testcases i will remove this --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/631 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1030/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/631#discussion_r104709184 --- Diff: format/src/main/thrift/carbondata.thrift --- @@ -183,11 +183,23 @@ struct FileFooter{ } /** +* Footer for indexed carbon file +*/ +struct FileFooterVersion3{ + 1: required i64 num_rows; // Total number of rows in this file + 2: required SegmentInfo segment_info; // Segment info (will be same/repeated for all files in this segment) + 3: required list<BlockletIndex> blocklet_index_list; // blocklet index of all blocklets in this file + 4: optional list<BlockletInfo3> blocklet_info_list3; // Information about blocklets of all columns in this file + 5: optional dictionary.ColumnDictionaryChunk dictionary; // blocklet local dictionary +} + +/** * Header for appendable carbon file */ struct FileHeader{ 1: required i32 version; // version used for data compatibility 2: required list<schema.ColumnSchema> table_columns; // Description of columns in this file + 3: required bool is_Footer_Present; // to check whether footer is present or not --- End diff -- Making the field as `required` will loose backward compatability. Better to make as `optional` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/631#discussion_r104888546 --- Diff: format/src/main/thrift/carbondata.thrift --- @@ -183,11 +183,23 @@ struct FileFooter{ } /** +* Footer for indexed carbon file +*/ +struct FileFooterVersion3{ + 1: required i64 num_rows; // Total number of rows in this file + 2: required SegmentInfo segment_info; // Segment info (will be same/repeated for all files in this segment) + 3: required list<BlockletIndex> blocklet_index_list; // blocklet index of all blocklets in this file + 4: optional list<BlockletInfo3> blocklet_info_list3; // Information about blocklets of all columns in this file + 5: optional dictionary.ColumnDictionaryChunk dictionary; // blocklet local dictionary +} + +/** * Header for appendable carbon file */ struct FileHeader{ 1: required i32 version; // version used for data compatibility 2: required list<schema.ColumnSchema> table_columns; // Description of columns in this file + 3: required bool is_Footer_Present; // to check whether footer is present or not --- End diff -- ok --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/631 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1035/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/631 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1036/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/631#discussion_r104905047 --- Diff: core/src/main/java/org/apache/carbondata/core/reader/CarbonFooterReaderV3.java --- @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.core.reader; + +import java.io.IOException; + +import org.apache.carbondata.format.FileFooterVersion3; + +import org.apache.thrift.TBase; + +/** + * Below class to read file footer of version3 + * carbon data file + */ +public class CarbonFooterReaderV3 { + + //Fact file path + private String filePath; + + //From which offset of file this metadata should be read + private long offset; + + public CarbonFooterReaderV3(String filePath, long offset) { + --- End diff -- remove empty line --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/631#discussion_r104905629 --- Diff: core/src/main/java/org/apache/carbondata/core/reader/CarbonFooterReaderV3.java --- @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.core.reader; + +import java.io.IOException; + +import org.apache.carbondata.format.FileFooterVersion3; + +import org.apache.thrift.TBase; + +/** + * Below class to read file footer of version3 + * carbon data file + */ +public class CarbonFooterReaderV3 { + + //Fact file path + private String filePath; + + //From which offset of file this metadata should be read --- End diff -- Is this the offset of the footer? please modify comment --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/631#discussion_r104907778 --- Diff: core/src/main/java/org/apache/carbondata/core/reader/CarbonHeaderReader.java --- @@ -0,0 +1,75 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.core.reader; + +import java.io.IOException; + +import org.apache.carbondata.format.FileHeader; + +import org.apache.thrift.TBase; + +/** + * Below class to read file header of version3 + * carbon data file + */ +public class CarbonHeaderReader { + + //Fact file path + private String filePath; + + //From which offset of file this metadata should be read + private long offset; --- End diff -- Isn't this fix offset? why is it a variable? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/631#discussion_r104907859 --- Diff: core/src/main/java/org/apache/carbondata/core/reader/CarbonHeaderReader.java --- @@ -0,0 +1,75 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.core.reader; + +import java.io.IOException; + +import org.apache.carbondata.format.FileHeader; + +import org.apache.thrift.TBase; + +/** + * Below class to read file header of version3 + * carbon data file + */ +public class CarbonHeaderReader { + + //Fact file path + private String filePath; + + //From which offset of file this metadata should be read + private long offset; + + public CarbonHeaderReader(String filePath, long offset) { + --- End diff -- remove empty line --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/631#discussion_r104907969 --- Diff: core/src/main/java/org/apache/carbondata/core/reader/CarbonHeaderReader.java --- @@ -0,0 +1,75 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.core.reader; + +import java.io.IOException; + +import org.apache.carbondata.format.FileHeader; + +import org.apache.thrift.TBase; + +/** + * Below class to read file header of version3 + * carbon data file + */ +public class CarbonHeaderReader { + + //Fact file path + private String filePath; + + //From which offset of file this metadata should be read + private long offset; + + public CarbonHeaderReader(String filePath, long offset) { + + this.filePath = filePath; + this.offset = offset; + } + + /** + * It reads the metadata in FileFooter thrift object format. --- End diff -- description is incorrect, it reads the FileHeader --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/631#discussion_r104908630 --- Diff: format/src/main/thrift/carbondata.thrift --- @@ -183,11 +183,23 @@ struct FileFooter{ } /** +* Footer for indexed carbon file +*/ +struct FileFooterVersion3{ --- End diff -- name it as `FileFooter3` like others, `BlockletInfo3` indicate it is BlockletInfo version 3. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/631#discussion_r104909522 --- Diff: format/src/main/thrift/carbondata.thrift --- @@ -183,11 +183,23 @@ struct FileFooter{ } /** +* Footer for indexed carbon file +*/ +struct FileFooterVersion3{ + 1: required i64 num_rows; // Total number of rows in this file + 2: required SegmentInfo segment_info; // Segment info (will be same/repeated for all files in this segment) + 3: required list<BlockletIndex> blocklet_index_list; // blocklet index of all blocklets in this file + 4: optional list<BlockletInfo3> blocklet_info_list3; // Information about blocklets of all columns in this file + 5: optional dictionary.ColumnDictionaryChunk dictionary; // blocklet local dictionary +} + +/** * Header for appendable carbon file --- End diff -- Now this is not just for appendable file, it is for columnar format also --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/631#discussion_r104910022 --- Diff: format/src/main/thrift/carbondata_index.thrift --- @@ -41,6 +41,6 @@ struct IndexHeader{ struct BlockIndex{ 1: required i64 num_rows; // Total number of rows in this file 2: required string file_name; // Block file name - 3: required i64 offset; // Offset of block + 3: required i64 offset; // Offset of the footer --- End diff -- suggest to name it header_offset --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/631#discussion_r104910137 --- Diff: format/src/main/thrift/carbondata.thrift --- @@ -183,11 +183,23 @@ struct FileFooter{ } /** +* Footer for indexed carbon file +*/ +struct FileFooterVersion3{ + 1: required i64 num_rows; // Total number of rows in this file + 2: required SegmentInfo segment_info; // Segment info (will be same/repeated for all files in this segment) + 3: required list<BlockletIndex> blocklet_index_list; // blocklet index of all blocklets in this file + 4: optional list<BlockletInfo3> blocklet_info_list3; // Information about blocklets of all columns in this file + 5: optional dictionary.ColumnDictionaryChunk dictionary; // blocklet local dictionary +} + +/** * Header for appendable carbon file */ struct FileHeader{ 1: required i32 version; // version used for data compatibility 2: required list<schema.ColumnSchema> table_columns; // Description of columns in this file + 3: optional bool is_Footer_Present; // to check whether footer is present or not --- End diff -- make it non capitol --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/631#discussion_r104910765 --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/v3/CarbonFactDataWriterImplV3.java --- @@ -319,11 +322,12 @@ private void writeDataToFile(FileChannel channel, byte[][] dataChunkBytes) { // write the header try { if (fileChannel.size() == 0) { - ColumnarFormatVersion version = CarbonProperties.getInstance().getFormatVersion(); - byte[] header = (CarbonCommonConstants.CARBON_DATA_VERSION_HEADER + version).getBytes(); - ByteBuffer buffer = ByteBuffer.allocate(header.length); - buffer.put(header); - buffer.rewind(); + // below code is to write the file header + byte[] byteArray = --- End diff -- modify variable to fileHeader --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/631 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1038/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. --- |
Free forum by Nabble | Edit this page |