[GitHub] carbondata pull request #2226: [CARBONDATA-2384] SDK support write data into...


qiuchenjian-2
GitHub user xubo245 opened a pull request:

    https://github.com/apache/carbondata/pull/2226

    [CARBONDATA-2384] SDK support write data into S3

    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xubo245/carbondata CARBONDATA-2384-SDKS3

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2226.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2226
   
----
commit 5dc8d9312c1266ab643f8f6834c404a77c41451a
Author: xubo245 <601450868@...>
Date:   2018-04-25T07:09:44Z

    [CARBONDATA-2384] SDK support write data into S3

----


---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write data into S3

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4215/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write data into S3

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5382/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write data into S3

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5383/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5385/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4218/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2226
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4521/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2226
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4522/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4219/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5386/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2226
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4523/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2226
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4525/



---

[GitHub] carbondata pull request #2226: [CARBONDATA-2384] SDK support write/read data...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2226#discussion_r184033067
 
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/sdk/SDkWriteS3Example.scala ---
    @@ -0,0 +1,124 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.carbondata.sdk
    +
    +import org.slf4j.{Logger, LoggerFactory}
    +
    +import org.apache.carbondata.core.metadata.datatype.DataTypes
    +import org.apache.carbondata.sdk.file.{CarbonReader, CarbonWriter, Field, Schema}
    +
    +/**
    + * Generate data and write data to S3 by SDK, no spark
    + */
    +object SDKWriteS3Example {
    +
    +  // scalastyle:off println
    +  /**
    +   * This example demonstrate usage of
    +   *
    +   * @param args require three parameters "Access-key" "Secret-key"
    +   *             "s3-endpoint", other is optional
    +   */
    +  def main(args: Array[String]) {
    +    val logger: Logger = LoggerFactory.getLogger(this.getClass)
    +    if (args.length < 2 || args.length > 6) {
    +      logger.error("Usage: java CarbonS3Example: <access-key> <secret-key>" +
    +        "<s3-endpoint> [table-path-on-s3] [number-of-rows] [persistSchema]")
    +      System.exit(0)
    +    }
    +
    +    val path = if (args.length > 3) {
    +      args(3)
    +    } else {
    +      "s3a://sdk/WriterOutput"
    +    }
    +
    +    val num = if (args.length > 4) {
    +      Integer.parseInt(args(4))
    +    } else {
    +      3
    +    }
    +
    +    val persistSchema = if (args.length > 5) {
    +      if (args(5).equalsIgnoreCase("true")) {
    +        true
    +      } else {
    +        false
    +      }
    +    } else {
    +      true
    +    }
    +
    +    // getCanonicalPath gives path with \, so code expects /.
    +    val writerPath = path.replace("\\", "/");
    +
    +    val fields: Array[Field] = new Array[Field](3)
    +    fields(0) = new Field("name", DataTypes.STRING)
    +    fields(1) = new Field("age", DataTypes.INT)
    +    fields(2) = new Field("height", DataTypes.DOUBLE)
    +
    +    try {
    +      val builder = CarbonWriter.builder()
    +        .withSchema(new Schema(fields))
    +        .outputPath(writerPath)
    +        .isTransactionalTable(true)
    --- End diff --
   
    Please write test cases for Non Transactional table in reader and writer.
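
A note on the argument handling in the diff quoted above: the Scaladoc says three parameters are required, yet the guard `args.length < 2` lets a two-argument call through, the usage string is missing a space before `<s3-endpoint>`, and the error path exits with status 0. The sketch below shows one possible stricter check. It is not part of the PR; `S3ExampleArgs` and `S3ArgsParser` are hypothetical names introduced only for illustration, and the defaults are copied from the diff.

```scala
// Hypothetical holder for the example's command-line settings (not in the PR).
final case class S3ExampleArgs(
    accessKey: String,
    secretKey: String,
    endpoint: String,
    path: String,
    numRows: Int,
    persistSchema: Boolean)

// Hypothetical helper (not in the PR) that validates and defaults the arguments.
object S3ArgsParser {
  val Usage: String =
    "Usage: SDKWriteS3Example <access-key> <secret-key> <s3-endpoint> " +
      "[table-path-on-s3] [number-of-rows] [persistSchema]"

  // Returns Left(Usage) unless between 3 and 6 arguments are supplied.
  // Like the original, a non-numeric row count throws NumberFormatException.
  def parse(args: Array[String]): Either[String, S3ExampleArgs] =
    if (args.length < 3 || args.length > 6) {
      Left(Usage)
    } else {
      // Defaults mirror the ones in the quoted diff.
      val path = if (args.length > 3) args(3) else "s3a://sdk/WriterOutput"
      val numRows = if (args.length > 4) args(4).toInt else 3
      val persistSchema =
        if (args.length > 5) args(5).equalsIgnoreCase("true") else true
      Right(
        S3ExampleArgs(
          args(0),
          args(1),
          args(2),
          path.replace("\\", "/"), // normalise backslashes, as the diff does
          numRows,
          persistSchema))
    }
}
```

A caller could match on the result, log `Usage` and call `sys.exit(1)` on `Left`, and pass the fields of the `Right` value on to the writer builder.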


---

[GitHub] carbondata pull request #2226: [CARBONDATA-2384] SDK support write/read data...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2226#discussion_r184036532
 
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/sdk/SDkWriteS3Example.scala ---
    @@ -0,0 +1,124 @@
    +    try {
    +      val builder = CarbonWriter.builder()
    +        .withSchema(new Schema(fields))
    +        .outputPath(writerPath)
    +        .isTransactionalTable(true)
    --- End diff --
   
    @sounakr @xubo245
    What does `Transactional table` mean in CarbonData?


---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4241/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5408/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2226
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4546/



---

[GitHub] carbondata pull request #2226: [CARBONDATA-2384] SDK support write/read data...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2226#discussion_r184360833
 
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/sdk/SDkWriteS3Example.scala ---
    @@ -0,0 +1,124 @@
    +    try {
    +      val builder = CarbonWriter.builder()
    +        .withSchema(new Schema(fields))
    +        .outputPath(writerPath)
    +        .isTransactionalTable(true)
    --- End diff --
   
    @sounakr I changed the code so that writing a Non Transactional table can be configured. But for now CarbonReader doesn't support reading a Non Transactional table. I will raise another PR to support this function.


---

[GitHub] carbondata pull request #2226: [CARBONDATA-2384] SDK support write/read data...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2226#discussion_r184361451
 
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/sdk/SDkWriteS3Example.scala ---
    @@ -0,0 +1,124 @@
    +    try {
    +      val builder = CarbonWriter.builder()
    +        .withSchema(new Schema(fields))
    +        .outputPath(writerPath)
    +        .isTransactionalTable(true)
    --- End diff --
   
    If `isTransactionalTable` is set to false, the writer writes the carbondata and carbonindex files in a flat folder structure. Please check https://github.com/apache/carbondata/blob/master/docs/sdk-writer-guide.md. @xuchuanyin


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4262/



---