[GitHub] carbondata pull request #2749: [WIP] simplify SDK API interfaces


qiuchenjian-2
GitHub user ajantha-bhat opened a pull request:

    https://github.com/apache/carbondata/pull/2749

    [WIP] simplify SDK API interfaces

    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ajantha-bhat/carbondata sdk

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2749.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2749
   
----
commit b56a71146cc865bb37b3a140f5328639d069ac6c
Author: ajantha-bhat <ajanthabhat@...>
Date:   2018-09-21T11:02:56Z

    remove AK SK

commit b44b76b100fe90282b9470f63149549b42c3c89e
Author: ajantha-bhat <ajanthabhat@...>
Date:   2018-09-21T12:24:34Z

    withHadoopconf and withThreads

commit d565845a38af9fbfaa08acf9063427a00aa9ba0b
Author: ajantha-bhat <ajanthabhat@...>
Date:   2018-09-21T15:31:59Z

    build API changes

----


---

[GitHub] carbondata issue #2749: [WIP] simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/399/



---

[GitHub] carbondata issue #2749: [WIP] simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Failed  with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8647/



---

[GitHub] carbondata issue #2749: [WIP] simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/577/



---

[GitHub] carbondata issue #2749: [WIP] simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/406/



---

[GitHub] carbondata issue #2749: [WIP] simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Failed  with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8655/



---

[GitHub] carbondata issue #2749: [WIP] simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/585/



---

[GitHub] carbondata issue #2749: [WIP] simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/408/



---

[GitHub] carbondata issue #2749: [WIP] simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Failed  with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8657/



---

[GitHub] carbondata issue #2749: [CARBONDATA-2961] Simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/587/



---

[GitHub] carbondata issue #2749: [CARBONDATA-2961] Simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/409/



---

[GitHub] carbondata issue #2749: [CARBONDATA-2961] Simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Failed  with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8658/



---

[GitHub] carbondata issue #2749: [CARBONDATA-2961] Simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/588/



---

[GitHub] carbondata pull request #2749: [CARBONDATA-2961] Simplify SDK API interfaces

qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2749#discussion_r219703810
 
    --- Diff: docs/sdk-guide.md ---
    @@ -377,91 +366,69 @@ public CarbonWriterBuilder withLoadOptions(Map<String, String> options);
     public CarbonWriterBuilder withTableProperties(Map<String, String> options);
     ```
     
    -
     ```
     /**
    -* this writer is not thread safe, use buildThreadSafeWriterForCSVInput in multi thread environment
    -* Build a {@link CarbonWriter}, which accepts row in CSV format object
    -* @param schema carbon Schema object {org.apache.carbondata.sdk.file.Schema}
    -* @param configuration hadoop configuration object.
    -* @return CSVCarbonWriter
    -* @throws IOException
    -* @throws InvalidLoadOptionException
    +* To make sdk writer thread safe.
    +*
    +* @param numOfThreads should be the number of threads in which the writer is called in a multi-thread scenario.
    +*                     By default the sdk writer is not thread safe;
    +*                     one writer instance can be used in one thread only.
    +* @return updated CarbonWriterBuilder
     */
    -public CarbonWriter buildWriterForCSVInput(org.apache.carbondata.sdk.file.Schema schema, Configuration configuration) throws IOException, InvalidLoadOptionException;
    +public CarbonWriterBuilder withThreadSafe(short numOfThreads);
     ```
     
     ```
     /**
    -* Can use this writer in multi-thread instance.
    -* Build a {@link CarbonWriter}, which accepts row in CSV format
    -* @param schema carbon Schema object {org.apache.carbondata.sdk.file.Schema}
    -* @param numOfThreads number of threads() in which .write will be called.    
    -* @param configuration hadoop configuration object          
    -* @return CSVCarbonWriter
    -* @throws IOException
    -* @throws InvalidLoadOptionException
    +* To support hadoop configuration
    +*
    +* @param conf hadoop configuration support, can set s3a AK,SK,end point and other conf with this
    +* @return updated CarbonWriterBuilder
     */
    -public CarbonWriter buildThreadSafeWriterForCSVInput(Schema schema, short numOfThreads, Configuration configuration)
    -  throws IOException, InvalidLoadOptionException;
    +public CarbonWriterBuilder withHadoopConf(Configuration conf)
     ```
     
    -
    -```  
    +```
     /**
    -* this writer is not thread safe, use buildThreadSafeWriterForAvroInput in multi thread environment
    -* Build a {@link CarbonWriter}, which accepts Avro format object
    -* @param avroSchema avro Schema object {org.apache.avro.Schema}
    -* @param configuration hadoop configuration object
    -* @return AvroCarbonWriter
    -* @throws IOException
    -* @throws InvalidLoadOptionException
    +* to build a {@link CarbonWriter}, which accepts row in CSV format
    +*
    +* @param schema carbon Schema object {org.apache.carbondata.sdk.file.Schema}
    +* @return CarbonWriterBuilder
     */
    -public CarbonWriter buildWriterForAvroInput(org.apache.avro.Schema schema, Configuration configuration) throws IOException, InvalidLoadOptionException;
    +public CarbonWriterBuilder forCsvInput(Schema schema);
    --- End diff --
   
    Better to rename this to `withCsvInput`.
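
The API shape discussed in this diff (configuration methods that each return the builder, with the input format chosen by a `with...Input` call rather than a separate `buildWriterFor...` method) can be sketched roughly as below. This is only a minimal, self-contained model: `SdkBuilderSketch`, `WriterBuilder`, `Writer`, and `describe()` are hypothetical stand-ins and not the real CarbonData SDK classes, a plain `Map` stands in for the Hadoop `Configuration`, column names stand in for the carbon `Schema`, and a final no-argument `build()` step is assumed.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; NOT the real CarbonData SDK classes.
final class SdkBuilderSketch {

    /** Minimal model of the writer that the builder produces. */
    static final class Writer {
        private final String inputFormat;
        private final String outputPath;
        private final short numOfThreads;
        private final String[] columns;
        private final Map<String, String> conf;

        Writer(String inputFormat, String outputPath, short numOfThreads,
               String[] columns, Map<String, String> conf) {
            this.inputFormat = inputFormat;
            this.outputPath = outputPath;
            this.numOfThreads = numOfThreads;
            this.columns = columns;
            this.conf = conf;
        }

        /** Summarizes the settings the builder collected. */
        String describe() {
            return inputFormat + " writer -> " + outputPath
                + ", threads=" + numOfThreads
                + ", columns=" + columns.length
                + ", confKeys=" + conf.size();
        }
    }

    /** Every configuration method returns the builder so calls can be chained. */
    static final class WriterBuilder {
        private String outputPath;
        private short numOfThreads = 1; // single-threaded unless stated otherwise
        private String inputFormat;
        private String[] columns;
        private final Map<String, String> conf = new HashMap<>();

        WriterBuilder outputPath(String path) {
            this.outputPath = path;
            return this;
        }

        WriterBuilder withThreadSafe(short numOfThreads) {
            if (numOfThreads < 1) {
                throw new IllegalArgumentException("numOfThreads must be >= 1");
            }
            this.numOfThreads = numOfThreads;
            return this;
        }

        // A Map is an assumption standing in for a Hadoop Configuration object.
        WriterBuilder withHadoopConf(Map<String, String> conf) {
            this.conf.putAll(conf);
            return this;
        }

        // Column names are an assumption standing in for the carbon Schema object.
        WriterBuilder withCsvInput(String... columns) {
            this.inputFormat = "csv";
            this.columns = columns.clone();
            return this;
        }

        Writer build() {
            if (outputPath == null || inputFormat == null) {
                throw new IllegalStateException("outputPath and an input format are required");
            }
            return new Writer(inputFormat, outputPath, numOfThreads, columns, new HashMap<>(conf));
        }
    }

    public static void main(String[] args) {
        Writer writer = new WriterBuilder()
            .outputPath("/tmp/carbon-out")
            .withThreadSafe((short) 4)
            .withHadoopConf(Collections.singletonMap("fs.s3a.endpoint", "example-endpoint"))
            .withCsvInput("name", "age")
            .build();
        // prints "csv writer -> /tmp/carbon-out, threads=4, columns=2, confKeys=1"
        System.out.println(writer.describe());
    }
}
```

With this shape, thread count and Hadoop settings become optional chained calls instead of extra parameters on each `build...` variant, which is the simplification the removed `buildWriterForCSVInput` and `buildThreadSafeWriterForCSVInput` signatures suggest.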


---

[GitHub] carbondata pull request #2749: [CARBONDATA-2961] Simplify SDK API interfaces

qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2749#discussion_r219732821
 
   
    done


---

[GitHub] carbondata issue #2749: [CARBONDATA-2961] Simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/424/



---

[GitHub] carbondata issue #2749: [CARBONDATA-2961] Simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/426/



---

[GitHub] carbondata issue #2749: [CARBONDATA-2961] Simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/605/



---

[GitHub] carbondata issue #2749: [CARBONDATA-2961] Simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8675/



---

[GitHub] carbondata issue #2749: [CARBONDATA-2961] Simplify SDK API interfaces

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2749
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/433/



---