GitHub user xubo245 opened a pull request:
https://github.com/apache/carbondata/pull/2353

[CARBONDATA-2558] Optimize carbon schema reader interface of SDK

Optimize carbon schema reader interface of SDK
```
1. Create CarbonSchemaReader and move the schema read interface from CarbonReader to CarbonSchemaReader
2. Change the return type from List to SDK Schema, and remove the tableInfo return type
3. Optimize the schema
```

- [ ] Any interfaces changed? YES
- [ ] Any backward compatibility impacted? NA
- [ ] Document update required? YES
- [ ] Testing done: change the test case
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NO

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xubo245/carbondata CARBONDATA-2558-CarbonSchemaReader

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2353.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2353

----
commit 5b72529323fd135ffe8cb1291d89f52e3e856bf8
Author: xubo245 <xubo29@...>
Date:   2018-05-29T09:07:10Z

    [CARBONDATA-2558] Optimize carbon schema reader interface of SDK

----
---
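As a rough illustration of item 2 in the description above (returning an SDK Schema instead of a bare List), the change in shape might look like the sketch below. It is not the CarbonData code: `Column`, `Schema`, `readAsList` and `readAsSchema` are hypothetical, simplified stand-ins invented for this example, and the real `Schema` class in the SDK may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative stand-ins only: none of these types are the real CarbonData classes.
public class SchemaReaderSketch {

  /** Hypothetical, simplified column description (name and data type only). */
  public static final class Column {
    public final String name;
    public final String dataType;

    public Column(String name, String dataType) {
      this.name = name;
      this.dataType = dataType;
    }
  }

  /** Hypothetical, simplified counterpart of the SDK Schema: an immutable column list. */
  public static final class Schema {
    private final List<Column> columns;

    public Schema(List<Column> columns) {
      // Defensive copy so later changes to the caller's list do not leak in.
      this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
    }

    public List<Column> getColumns() {
      return columns;
    }
  }

  /** Old shape: the reader hands the raw column list to the caller. */
  public static List<Column> readAsList(List<Column> parsedColumns) {
    return parsedColumns;
  }

  /** New shape: the reader wraps the same columns in a Schema object. */
  public static Schema readAsSchema(List<Column> parsedColumns) {
    return new Schema(parsedColumns);
  }

  public static void main(String[] args) {
    // Pretend these columns were parsed from a schema, carbondata, or carbonindex file.
    List<Column> parsed = Arrays.asList(new Column("id", "INT"), new Column("name", "STRING"));
    Schema schema = readAsSchema(parsed);
    for (Column column : schema.getColumns()) {
      System.out.println(column.name + " : " + column.dataType);
    }
  }
}
```

One reason such a wrapper can be attractive is that all three read paths (schema file, carbondata file, carbonindex file) can then share a single return type, so callers do not need to know which file the columns came from.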
Github user sounakr commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2353#discussion_r191356357

--- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java ---
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import java.io.DataInputStream;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.metadata.converter.SchemaConverter;
+import org.apache.carbondata.core.metadata.converter.ThriftWrapperSchemaConverterImpl;
+import org.apache.carbondata.core.metadata.schema.table.column.ColumnSchema;
+import org.apache.carbondata.core.reader.CarbonHeaderReader;
+import org.apache.carbondata.core.reader.CarbonIndexFileReader;
+import org.apache.carbondata.core.util.CarbonUtil;
+import org.apache.carbondata.core.util.path.CarbonTablePath;
+
+import static org.apache.carbondata.core.util.CarbonUtil.thriftColumnSchemaToWrapperColumnSchema;
+
+/**
+ * Schema reader for carbon files, including carbondata file, carbonindex file, and schema file
+ */
+public class CarbonSchemaReader {
+
+  /**
+   * Read schema file and return the schema
+   *
+   * @param schemaFilePath complete path including schema file name
+   * @return schema object
+   * @throws IOException
+   */
+  public static Schema readSchemaInSchemaFile(String schemaFilePath) throws IOException {
+    org.apache.carbondata.format.TableInfo tableInfo = CarbonUtil.readSchemaFile(schemaFilePath);
+    SchemaConverter schemaConverter = new ThriftWrapperSchemaConverterImpl();
+    List<ColumnSchema> schemaList = schemaConverter
+        .fromExternalToWrapperTableInfo(tableInfo, "", "", "")
+        .getFactTable()
+        .getListOfColumns();
+    return new Schema(schemaList);
+  }
+
+  /**
+   * Read carbondata file and return the schema
+   *
+   * @param dataFilePath complete path including carbondata file name
+   * @return Schema object
+   * @throws IOException
+   */
+  public static Schema readSchemaInDataFile(String dataFilePath) throws IOException {
+    CarbonHeaderReader reader = new CarbonHeaderReader(dataFilePath);
+    return new Schema(reader.readSchema());
+  }
+
+  /**
+   * Read carbonindex file and return the schema
+   *
+   * @param indexFilePath complete path including index file name
+   * @return schema object
+   * @throws IOException
+   */
+  public static Schema readSchemaInIndexFile(String indexFilePath) throws IOException {
+    CarbonFile indexFile =
+        FileFactory.getCarbonFile(indexFilePath, FileFactory.getFileType(indexFilePath));
--- End diff --

Please attach a performance comparison between carbondata file and index file in S3.
---
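The comparison requested above could be measured with a small timing harness along the following lines. This is only a sketch under stated assumptions: `ReadTimingSketch` and `meanNanos` are hypothetical names, and the two tasks in `main` are placeholders that would have to be replaced by real calls such as `CarbonSchemaReader.readSchemaInDataFile(path)` and `CarbonSchemaReader.readSchemaInIndexFile(path)` against files stored in S3. It produces no measurements by itself.

```java
import java.util.concurrent.Callable;

// Hypothetical helper for comparing two read paths; not part of CarbonData.
public class ReadTimingSketch {

  /**
   * Calls the task `warmups` times without timing, then `runs` times with timing,
   * and returns the mean elapsed time per timed call in nanoseconds.
   */
  public static long meanNanos(Callable<?> task, int warmups, int runs) throws Exception {
    if (runs <= 0) {
      throw new IllegalArgumentException("runs must be positive");
    }
    for (int i = 0; i < warmups; i++) {
      task.call();
    }
    long total = 0L;
    for (int i = 0; i < runs; i++) {
      long start = System.nanoTime();
      task.call();
      total += System.nanoTime() - start;
    }
    return total / runs;
  }

  public static void main(String[] args) throws Exception {
    // Placeholders (assumptions): swap in the real schema reads on S3 paths here.
    Callable<String> dataFileRead = () -> "schema read from carbondata file";
    Callable<String> indexFileRead = () -> "schema read from carbonindex file";

    System.out.println("carbondata file, mean ns: " + meanNanos(dataFileRead, 3, 10));
    System.out.println("carbonindex file, mean ns: " + meanNanos(indexFileRead, 3, 10));
  }
}
```

For remote storage, the first call often includes connection setup, which is why the sketch separates warm-up calls from timed calls; whether that separation is appropriate depends on what the comparison is meant to show.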
Github user xubo245 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2353#discussion_r191360475

--- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java ---
+  /**
+   * Read carbonindex file and return the schema
+   *
+   * @param indexFilePath complete path including index file name
+   * @return schema object
+   * @throws IOException
+   */
+  public static Schema readSchemaInIndexFile(String indexFilePath) throws IOException {
+    CarbonFile indexFile =
+        FileFactory.getCarbonFile(indexFilePath, FileFactory.getFileType(indexFilePath));
--- End diff --

This method was developed by ajantha in https://github.com/apache/carbondata/pull/2345. After his PR is merged, I will change this PR to invoke his method.
---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2353

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4996/
---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2353

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6160/
---
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/2353

retest this please
---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2353

SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5138/
---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2353

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5002/
---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2353

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6164/
---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2353

SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5141/
---
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/2353

@jackylk @sounakr CI has passed, please review it.
---