GitHub user jackylk opened a pull request:
https://github.com/apache/carbondata/pull/1750

[CARBONDATA-1969] Support Java API for create table and writer data

It is nice to have a Java API to create a carbon table and write CSV data into it. Applications can use this API to write data and then query it with SparkSQL.

In this PR, a new module called store-sdk is added; there are no changes to existing modules. This PR is on top of #1749.

 - [X] Any interfaces changed? New API added
 - [X] Any backward compatibility impacted? No
 - [X] Document update required? Yes
 - [X] Testing done? Test case added
 - [X] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jackylk/incubator-carbondata writer_api_latest

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1750.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1750

----

commit 5d4376bdf7bfd99ff72da355cea0103b568f78c0
Author: Jacky Li <jacky.likun@...>
Date:   2018-01-02T15:46:14Z

    add external table support

commit e599cdb1c15b9c47af108e2f54786b11ad1488b5
Author: Jacky Li <jacky.likun@...>
Date:   2018-01-02T16:01:45Z

    add testcase

commit a7d634283906331ff3ddb7ec2f2dd178d44d42ff
Author: Jacky Li <jacky.likun@...>
Date:   2018-01-02T16:03:35Z

    recover

commit c8d82b0d1bac16ede62b93bb8ca1f80a84ad5dcf
Author: Jacky Li <jacky.likun@...>
Date:   2018-01-02T17:52:23Z

    add sdk

----

---
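Editor's note: the sketch below pieces together how the new store-sdk API is apparently meant to be used, based only on the calls exercised in the TestCarbonFileWriter test quoted later in this thread (CarbonStore.build, SchemaBuilder, createTable, newBatchSegment, newWriter, writeRow). It is a rough sketch under those assumptions, not the documented API: Java 10 var is used because the segment, writer and schema return types are not shown in this thread, and exact signatures may differ.

// Rough usage sketch of the store-sdk Java API; method names are taken from
// the test case quoted below, return types are not shown there and are left
// to var on purpose.
import org.apache.carbondata.core.metadata.datatype.DataTypes;
import org.apache.carbondata.core.metadata.datatype.StructField;
import org.apache.carbondata.store.api.CarbonStore;
import org.apache.carbondata.store.api.SchemaBuilder;

public class SdkUsageSketch {
  public static void main(String[] args) throws Exception {
    // Connect to a carbon store, as the test does.
    var carbon = CarbonStore.build();

    // Describe the table schema; the boolean flag mirrors the test's usage
    // (true for the first column, false for the others).
    var schema = SchemaBuilder.newInstance()
        .addColumn(new StructField("name", DataTypes.STRING), true)
        .addColumn(new StructField("age", DataTypes.INT), false)
        .addColumn(new StructField("height", DataTypes.DOUBLE), false)
        .create();

    // Create the table at a target path and write rows through a batch segment.
    var table = carbon.createTable("t1", schema, "./db1/tc1");
    var segment = table.newBatchSegment();
    segment.open();
    var writer = segment.newWriter();
    writer.writeRow(new String[]{"amy", "1", "2.3"});
    writer.close();

    // The written data can then be queried from SparkSQL, e.g.:
    //   CREATE EXTERNAL TABLE source STORED BY 'carbondata' LOCATION './db1/tc1'
  }
}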
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1750 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2491/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1750 Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1266/ ---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1750 SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2659/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1750 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2504/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1750 Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1280/ ---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1750 SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2668/ ---
Github user mohammadshahidkhan commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1750#discussion_r161988359

--- Diff: store/sdk/src/main/java/org/apache/carbondata/store/TableBuilder.java ---
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.store;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.metadata.CarbonMetadata;
+import org.apache.carbondata.core.metadata.converter.SchemaConverter;
+import org.apache.carbondata.core.metadata.converter.ThriftWrapperSchemaConverterImpl;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.metadata.schema.table.DataMapSchema;
+import org.apache.carbondata.core.metadata.schema.table.TableInfo;
+import org.apache.carbondata.core.metadata.schema.table.TableSchema;
+import org.apache.carbondata.core.metadata.schema.table.column.ColumnSchema;
+import org.apache.carbondata.core.util.path.CarbonStorePath;
+import org.apache.carbondata.core.util.path.CarbonTablePath;
+import org.apache.carbondata.core.writer.ThriftWriter;
+import org.apache.carbondata.format.SchemaEvolutionEntry;
+import org.apache.carbondata.store.api.Table;
+
+public class TableBuilder {
+
+  private String databaseName;
+  private String tableName;
+  private String tablePath;
+  private TableSchema tableSchema;
+
+  private TableBuilder() { }
+
+  public static TableBuilder newInstance() {
+    return new TableBuilder();
+  }
+
+  public Table create() throws IOException {
+    if (tableName == null || tablePath == null || tableSchema == null) {
+      throw new IllegalArgumentException("must provide table name and table path");
+    }
+
+    if (databaseName == null) {
+      databaseName = "default";
+    }
+
+    TableInfo tableInfo = new TableInfo();
+    tableInfo.setDatabaseName(databaseName);
+    tableInfo.setTableUniqueName(databaseName + "_" + tableName);
+    tableInfo.setFactTable(tableSchema);
+    tableInfo.setTablePath(tablePath);
+    tableInfo.setLastUpdatedTime(System.currentTimeMillis());
+    tableInfo.setDataMapSchemaList(new ArrayList<DataMapSchema>(0));
+    AbsoluteTableIdentifier identifier = tableInfo.getOrCreateAbsoluteTableIdentifier();
+
+    CarbonTablePath carbonTablePath = CarbonStorePath.getCarbonTablePath(
+        identifier.getTablePath(),
+        identifier.getCarbonTableIdentifier());
+    String schemaFilePath = carbonTablePath.getSchemaFilePath();
+    String schemaMetadataPath = CarbonTablePath.getFolderContainingFile(schemaFilePath);
+    CarbonMetadata.getInstance().loadTableMetadata(tableInfo);
+    SchemaConverter schemaConverter = new ThriftWrapperSchemaConverterImpl();
+    org.apache.carbondata.format.TableInfo thriftTableInfo =
+        schemaConverter.fromWrapperToExternalTableInfo(
+            tableInfo,
+            tableInfo.getDatabaseName(),
+            tableInfo.getFactTable().getTableName());
+    org.apache.carbondata.format.SchemaEvolutionEntry schemaEvolutionEntry =
+        new SchemaEvolutionEntry(
+            tableInfo.getLastUpdatedTime());
+    thriftTableInfo.getFact_table().getSchema_evolution().getSchema_evolution_history()
+        .add(schemaEvolutionEntry);
+    FileFactory.FileType fileType = FileFactory.getFileType(schemaMetadataPath);
+    if (!FileFactory.isFileExist(schemaMetadataPath, fileType)) {
+      FileFactory.mkdirs(schemaMetadataPath, fileType);
+    }
+    ThriftWriter thriftWriter = new ThriftWriter(schemaFilePath, false);
+    thriftWriter.open();
+    thriftWriter.write(thriftTableInfo);
+    thriftWriter.close();
--- End diff --

It is not safe to call close() here without a finally block: the writer close cannot be ensured if an IOException occurs at thriftWriter.write(thriftTableInfo).
---
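A minimal sketch of what the reviewer is asking for, reusing only the ThriftWriter calls already shown in the diff above (open, write, close). It illustrates the try/finally shape, not the final fix adopted by the PR:

// Guard the write with try/finally so the schema file writer is closed
// even if write() throws an IOException.
ThriftWriter thriftWriter = new ThriftWriter(schemaFilePath, false);
thriftWriter.open();
try {
  thriftWriter.write(thriftTableInfo);
} finally {
  thriftWriter.close();
}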
Github user mohammadshahidkhan commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1750#discussion_r161989164

--- Diff: store/sdk/src/test/scala/org/apache/carbondata/store/TestCarbonFileWriter.scala ---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.store
+
+import java.io.File
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.metadata.datatype.{DataTypes, StructField}
+import org.apache.carbondata.store.api.{CarbonStore, SchemaBuilder}
+
+class TestCarbonFileWriter extends QueryTest with BeforeAndAfterAll {
+
+  test("test write carbon table and read as external table") {
+    sql("DROP TABLE IF EXISTS source")
+
+    val tablePath = "./db1/tc1"
+    cleanTestTable(tablePath)
+    createTestTable(tablePath)
+
+    sql(s"CREATE EXTERNAL TABLE source STORED BY 'carbondata' LOCATION '$tablePath'")
+    checkAnswer(sql("SELECT count(*) from source"), Row(1000))
+
+    sql("DROP TABLE IF EXISTS source")
+  }
+
+  test("test write carbon table and read by refresh table") {
+    sql("DROP DATABASE IF EXISTS db1 CASCADE")
+
+    val tablePath = "./db1/tc1"
+    cleanTestTable(tablePath)
+    createTestTable(tablePath)
+
+    sql("CREATE DATABASE db1 LOCATION './db1'")
+    sql("REFRESH TABLE db1.tc1")
+    checkAnswer(sql("SELECT count(*) from db1.tc1"), Row(1000))
+
+    sql("DROP DATABASE IF EXISTS db1 CASCADE")
+  }
+
+  private def cleanTestTable(tablePath: String) = {
+    if (new File(tablePath).exists()) {
+      new File(tablePath).delete()
+    }
+  }
+
+  private def createTestTable(tablePath: String): Unit = {
+    val carbon = CarbonStore.build()
+
+    val schema = SchemaBuilder.newInstance
+      .addColumn(new StructField("name", DataTypes.STRING), true)
+      .addColumn(new StructField("age", DataTypes.INT), false)
+      .addColumn(new StructField("height", DataTypes.DOUBLE), false)
+      .create
+
+    val table = carbon.createTable("t1", schema, tablePath)
+    val segment = table.newBatchSegment()
+
+    segment.open()
+    val writer = segment.newWriter()
+    (1 to 1000).foreach { _ => writer.writeRow(Array[String]("amy", "1", "2.3")) }
+    writer.close()
--- End diff --

The stream close cannot be ensured here without a finally block.
---
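The same try/finally shape applies to the SDK writer the test drives. A rough sketch, written in Java since the writer comes from the Java store-sdk API, with var standing in for the writer type that is not named in this thread and the loop count and row values taken from the test:

// Sketch only: make sure the segment writer is closed even if writeRow() throws.
var writer = segment.newWriter();
try {
  for (int i = 1; i <= 1000; i++) {
    writer.writeRow(new String[]{"amy", "1", "2.3"});
  }
} finally {
  writer.close();
}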
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1750 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3292/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1750 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3297/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1750 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2060/ ---
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/1750 merged with #1798 ---