[GitHub] carbondata pull request #2850: Added concurrent reading through SDK

classic Classic list List threaded Threaded
29 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2850: [CARBONDATA-3056] Added concurrent reading th...

qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2850#discussion_r230272267
 
    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java ---
    @@ -114,6 +115,57 @@ public static CarbonReaderBuilder builder(String tablePath) {
         return builder(tablePath, tableName);
       }
     
    +  /**
    +   * Breaks the list of CarbonRecordReader in CarbonReader into multiple
    +   * CarbonReader objects, each iterating through some 'carbondata' files
    +   * and return that list of CarbonReader objects
    +   *
    +   * If the no. of files is greater than maxSplits, then break the
    +   * CarbonReader into maxSplits splits, with each split iterating
    +   * through >= 1 file.
    +   *
    +   * If the no. of files is less than maxSplits, then return list of
    +   * CarbonReader with size as the no. of files, with each CarbonReader
    +   * iterating through exactly one file
    +   *
    +   * @param maxSplits: Int
    +   * @return list of {@link CarbonReader} objects
    +   */
    +  public List<CarbonReader> split(int maxSplits) throws IOException {
    --- End diff --
   
    @ravipesala : Adding to builder will break builder pattern, recently we removed arguments from build() and make it as separate API for SDK writer. Reader also followed same.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2850: [CARBONDATA-3056] Added concurrent reading th...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2850#discussion_r230297864
 
    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java ---
    @@ -114,6 +115,57 @@ public static CarbonReaderBuilder builder(String tablePath) {
         return builder(tablePath, tableName);
       }
     
    +  /**
    +   * Breaks the list of CarbonRecordReader in CarbonReader into multiple
    +   * CarbonReader objects, each iterating through some 'carbondata' files
    +   * and return that list of CarbonReader objects
    +   *
    +   * If the no. of files is greater than maxSplits, then break the
    +   * CarbonReader into maxSplits splits, with each split iterating
    +   * through >= 1 file.
    +   *
    +   * If the no. of files is less than maxSplits, then return list of
    +   * CarbonReader with size as the no. of files, with each CarbonReader
    +   * iterating through exactly one file
    +   *
    +   * @param maxSplits: Int
    +   * @return list of {@link CarbonReader} objects
    +   */
    +  public List<CarbonReader> split(int maxSplits) throws IOException {
    --- End diff --
   
    ok


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2850: [CARBONDATA-3056] Added concurrent reading th...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kunal642 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2850#discussion_r230315785
 
    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java ---
    @@ -114,6 +115,57 @@ public static CarbonReaderBuilder builder(String tablePath) {
         return builder(tablePath, tableName);
       }
     
    +  /**
    +   * Breaks the list of CarbonRecordReader in CarbonReader into multiple
    +   * CarbonReader objects, each iterating through some 'carbondata' files
    +   * and return that list of CarbonReader objects
    +   *
    +   * If the no. of files is greater than maxSplits, then break the
    +   * CarbonReader into maxSplits splits, with each split iterating
    +   * through >= 1 file.
    +   *
    +   * If the no. of files is less than maxSplits, then return list of
    +   * CarbonReader with size as the no. of files, with each CarbonReader
    +   * iterating through exactly one file
    +   *
    +   * @param maxSplits: Int
    +   * @return list of {@link CarbonReader} objects
    +   */
    +  public List<CarbonReader> split(int maxSplits) throws IOException {
    +    validateReader();
    +    if (maxSplits < 1) {
    +      throw new RuntimeException(
    +          this.getClass().getSimpleName() + ".split: maxSplits must be positive");
    +    }
    +
    +    List<CarbonReader> carbonReaders = new ArrayList<>();
    +
    +    if (maxSplits < this.readers.size()) {
    --- End diff --
   
    @ravipesala Let us add test cases in a separate PR. would it be okay?


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2850: [CARBONDATA-3056] Added concurrent reading through S...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2850
 
    retest this please


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2850: [CARBONDATA-3056] Added concurrent reading through S...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2850
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1244/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2850: [CARBONDATA-3056] Added concurrent reading through S...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2850
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1460/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2850: [CARBONDATA-3056] Added concurrent reading through S...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2850
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9509/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2850: [CARBONDATA-3056] Added concurrent reading through S...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kunal642 commented on the issue:

    https://github.com/apache/carbondata/pull/2850
 
    LGTM



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2850: [CARBONDATA-3056] Added concurrent reading th...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2850


---
12