[GitHub] carbondata pull request #2653: [CARBONDATA-2874] Support SDK writer as threa...

qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2653#discussion_r212538229
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java ---
    @@ -268,6 +268,49 @@ public synchronized OutputCommitter getOutputCommitter(TaskAttemptContext contex
             executorService);
       }
     
    +  public RecordWriter<NullWritable, ObjectArrayWritable> getMultiThreadRecordWriter(
    --- End diff --
   
    1) Added comments.
   
    2) Cannot add a number-of-threads argument to getRecordWriter(), as it is an overridden method from FileOutputFormat. Hence added a new method.
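
The constraint described above can be illustrated with a minimal sketch. It is not the PR's code: `BaseOutputFormat`, `TableOutputFormat`, and the `String` return values are hypothetical stand-ins for `FileOutputFormat`, `CarbonTableOutputFormat`, and the real `RecordWriter`, and the rejection of non-positive thread counts is an assumption. It only shows why the extra argument ends up on a separately named method next to the inherited one.

```java
public class OverloadSketch {

  /** Hypothetical stand-in for a framework base class such as FileOutputFormat. */
  public abstract static class BaseOutputFormat {
    // The framework calls this exact signature. A subclass method with an
    // extra parameter would not override it and would never be called by
    // the framework.
    public abstract String getRecordWriter(String context);
  }

  /** Hypothetical stand-in for CarbonTableOutputFormat. */
  public static class TableOutputFormat extends BaseOutputFormat {
    @Override
    public String getRecordWriter(String context) {
      return "single-thread writer for " + context;
    }

    // Separately named method that carries the extra argument. Only callers
    // that know about multi-threaded writes invoke it directly.
    public String getMultiThreadRecordWriter(String context, int numThreads) {
      if (numThreads <= 0) {
        // Assumption for this sketch: non-positive counts are rejected.
        throw new IllegalArgumentException("numThreads must be positive: " + numThreads);
      }
      return numThreads + "-thread writer for " + context;
    }
  }

  public static void main(String[] args) {
    TableOutputFormat format = new TableOutputFormat();
    System.out.println(format.getRecordWriter("task-1"));
    System.out.println(format.getMultiThreadRecordWriter("task-1", 4));
  }
}
```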


---

[GitHub] carbondata pull request #2653: [CARBONDATA-2874] Support SDK writer as threa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2653#discussion_r212538378
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java ---
    @@ -434,4 +477,61 @@ public CarbonLoadModel getLoadModel() {
           return loadModel;
         }
       }
    +
    +  public static class CarbonMultiRecordWriter
    --- End diff --
   
    CarbonMultiRecordWriter is used only with CarbonTableOutputFormat, so it is better to keep it as a nested class of that format.


---

[GitHub] carbondata pull request #2653: [CARBONDATA-2874] Support SDK writer as threa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2653#discussion_r212538635
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonDataProcessorUtil.java ---
    @@ -659,9 +659,14 @@ public static boolean isRawDataRequired(CarbonDataLoadConfiguration configuratio
        * @return
        */
       public static List<CarbonIterator<Object[]>>[] partitionInputReaderIterators(
    -      CarbonIterator<Object[]>[] inputIterators) {
    +      CarbonIterator<Object[]>[] inputIterators, short sdkUserCores) {
         // Get the number of cores configured in property.
    -    int numberOfCores = CarbonProperties.getInstance().getNumberOfCores();
    +    int numberOfCores;
    +    if (sdkUserCores != 0) {
    --- End diff --
   
    This is already handled at the SDK input level itself, so negative numbers cannot reach this point.
   
    But I will change it.
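
A minimal sketch of the more defensive check discussed here, assuming the agreed change is to treat only positive values as a user override. `resolveNumberOfCores` and `configuredCores` are hypothetical names; in the diff above the fallback value comes from `CarbonProperties.getInstance().getNumberOfCores()`.

```java
public class CoreCountSketch {

  /**
   * Returns the SDK user's core count when it is positive; otherwise falls
   * back to the configured value. Using "> 0" rather than "!= 0" means a
   * negative value can never be used as a thread count, even if upstream
   * validation is bypassed.
   */
  public static int resolveNumberOfCores(short sdkUserCores, int configuredCores) {
    return sdkUserCores > 0 ? sdkUserCores : configuredCores;
  }

  public static void main(String[] args) {
    System.out.println(resolveNumberOfCores((short) 4, 2));  // user value wins
    System.out.println(resolveNumberOfCores((short) 0, 2));  // falls back
    System.out.println(resolveNumberOfCores((short) -1, 2)); // falls back
  }
}
```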


---

[GitHub] carbondata pull request #2653: [CARBONDATA-2874] Support SDK writer as threa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2653#discussion_r212540287
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java ---
    @@ -434,4 +477,61 @@ public CarbonLoadModel getLoadModel() {
           return loadModel;
         }
       }
    +
    +  public static class CarbonMultiRecordWriter
    --- End diff --
   
    There are only two methods, and both need to be overridden, so there is no benefit in doing this.


---

[GitHub] carbondata pull request #2653: [CARBONDATA-2874] Support SDK writer as threa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2653#discussion_r212542081
 
    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java ---
    @@ -339,6 +339,28 @@ public CarbonWriter buildWriterForCSVInput(Schema schema)
         return new CSVCarbonWriter(loadModel);
       }
     
    +  /**
    +   * Build a {@link CarbonWriter}, which accepts row in CSV format
    +   * @param schema carbon Schema object {org.apache.carbondata.sdk.file.Schema}
    +   * @param numOfThreads number of threads() in which .write will be called.
    +   * @return CSVCarbonWriter
    +   * @throws IOException
    +   * @throws InvalidLoadOptionException
    +   */
    +  public CarbonWriter buildWriterForCSVInput(Schema schema, short numOfThreads)
    --- End diff --
   
    Agreed. Changed the name and updated the doc.


---

[GitHub] carbondata pull request #2653: [CARBONDATA-2874] Support SDK writer as threa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2653#discussion_r212542105
 
    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java ---
    @@ -360,6 +382,33 @@ public CarbonWriter buildWriterForAvroInput(org.apache.avro.Schema avroSchema)
         return new AvroCarbonWriter(loadModel);
       }
     
    +  /**
    +   * Build a {@link CarbonWriter}, which accepts Avro object
    +   * @param avroSchema avro Schema object {org.apache.avro.Schema}
    +   * @param numOfThreads number of threads() in which .write will be called.
    +   * @return AvroCarbonWriter
    +   * @throws IOException
    +   * @throws InvalidLoadOptionException
    +   */
    +  public CarbonWriter buildWriterForAvroInput(org.apache.avro.Schema avroSchema, short numOfThreads)
    --- End diff --
   
    Done.


---

[GitHub] carbondata issue #2653: [CARBONDATA-2874] Support SDK writer as thread safe ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2653
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6375/



---

[GitHub] carbondata issue #2653: [CARBONDATA-2874] Support SDK writer as thread safe ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2653
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8028/



---

[GitHub] carbondata issue #2653: [CARBONDATA-2874] Support SDK writer as thread safe ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2653
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6751/



---

[GitHub] carbondata issue #2653: [CARBONDATA-2874] Support SDK writer as thread safe ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on the issue:

    https://github.com/apache/carbondata/pull/2653
 
    @ravipesala: The PR is ready. Please check.


---

[GitHub] carbondata pull request #2653: [CARBONDATA-2874] Support SDK writer as threa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2653#discussion_r212600099
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java ---
    @@ -434,4 +477,61 @@ public CarbonLoadModel getLoadModel() {
           return loadModel;
         }
       }
    +
    +  public static class CarbonMultiRecordWriter
    --- End diff --
   
    Done. Removed this.


---

[GitHub] carbondata pull request #2653: [CARBONDATA-2874] Support SDK writer as threa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2653#discussion_r212600140
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java ---
    @@ -434,4 +477,61 @@ public CarbonLoadModel getLoadModel() {
           return loadModel;
         }
       }
    +
    +  public static class CarbonMultiRecordWriter
    --- End diff --
   
    Done. Overrode the methods.


---

[GitHub] carbondata issue #2653: [CARBONDATA-2874] Support SDK writer as thread safe ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on the issue:

    https://github.com/apache/carbondata/pull/2653
 
    @ravipesala: Addressed all the comments. Please review.


---

[GitHub] carbondata issue #2653: [CARBONDATA-2874] Support SDK writer as thread safe ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2653
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6389/



---

[GitHub] carbondata issue #2653: [CARBONDATA-2874] Support SDK writer as thread safe ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2653
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8045/



---

[GitHub] carbondata issue #2653: [CARBONDATA-2874] Support SDK writer as thread safe ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2653
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6391/



---

[GitHub] carbondata issue #2653: [CARBONDATA-2874] Support SDK writer as thread safe ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2653
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6767/



---

[GitHub] carbondata issue #2653: [CARBONDATA-2874] Support SDK writer as thread safe ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2653
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8043/



---

[GitHub] carbondata issue #2653: [CARBONDATA-2874] Support SDK writer as thread safe ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2653
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6392/



---

[GitHub] carbondata issue #2653: [CARBONDATA-2874] Support SDK writer as thread safe ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2653
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8046/



---