[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

qiuchenjian-2
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2804#discussion_r229198062
 
    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java ---
    @@ -59,11 +60,30 @@ public static Schema readSchemaInSchemaFile(String schemaFilePath) throws IOExce
       /**
        * Read carbondata file and return the schema
        *
    -   * @param dataFilePath complete path including carbondata file name
    +   * @param path complete path including carbondata file name
        * @return Schema object
        * @throws IOException
        */
    -  public static Schema readSchemaInDataFile(String dataFilePath) throws IOException {
    +  public static Schema readSchemaInDataFile(String path) throws IOException {
    +    String dataFilePath = path;
    +    if (!(dataFilePath.contains(".carbondata"))) {
    +      CarbonFile[] carbonFiles = FileFactory
    +          .getCarbonFile(path)
    +          .listFiles(new CarbonFileFilter() {
    +            @Override
    +            public boolean accept(CarbonFile file) {
    +              if (file == null) {
    +                return false;
    +              }
    +              return file.getName().endsWith(".carbondata");
    +            }
    +          });
    +      if (carbonFiles == null || carbonFiles.length < 1) {
    +        throw new RuntimeException("Carbon data file not exists.");
    +      }
    +      dataFilePath = carbonFiles[0].getAbsolutePath();
    --- End diff --
   
    OK, I have added readSchemaFromFirstDataFile and readSchemaFromFirstIndexFile.


---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1148/



---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1359/



---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9413/



---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1365/



---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xubo245 commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    retest this please


---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1158/



---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9420/



---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1374/



---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1161/



---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9425/



---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xubo245 commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    CI passed, please check again.


---

[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2804#discussion_r229583983
 
    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java ---
    @@ -64,11 +66,70 @@ public static Schema readSchemaInSchemaFile(String schemaFilePath) throws IOExce
       /**
        * Read carbondata file and return the schema
        *
    -   * @param dataFilePath complete path including carbondata file name
    +   * @param path carbondata store path
        * @return Schema object
        * @throws IOException
        */
    -  public static Schema readSchemaInDataFile(String dataFilePath) throws IOException {
    +  public static Schema readSchemaFromFirstDataFile(String path) throws IOException {
    +    String dataFilePath = getFirstCarbonDataFile(path);
    +    return readSchemaInDataFile(dataFilePath);
    +  }
    +
    +  /**
    +   * get first carbondata file in path and don't check all files schema
    +   *
    +   * @param path carbondata file path
    +   * @return first carbondata file name
    +   */
    +  public static String getFirstCarbonDataFile(String path) {
    --- End diff --
   
    I already suggested keeping a single getFirstCarbonFile(path, extension), which returns the first data file or index file depending on the extension.
   
    There is no need to duplicate the code for index and data files.
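
The suggestion above, one lookup parameterized by extension, might be sketched as follows. This is only an illustration built on java.nio.file, not the CarbonData implementation: the class name FirstCarbonFileSketch, the Optional return type, and the choice of "first" as the lexicographically smallest path are assumptions, and the real code lists files through FileFactory and CarbonFileFilter as shown in the earlier diff.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical helper, not part of the CarbonData SDK.
final class FirstCarbonFileSketch {

  private FirstCarbonFileSketch() {
  }

  /**
   * Returns the first regular file directly under dir whose name ends with
   * the given extension, or an empty Optional if there is none.
   * Assumption: "first" means smallest in Path ordering.
   */
  static Optional<Path> getFirstCarbonFile(Path dir, String extension) throws IOException {
    // Files.list must be closed, so it is wrapped in try-with-resources.
    try (Stream<Path> entries = Files.list(dir)) {
      return entries
          .filter(Files::isRegularFile)
          .filter(p -> p.getFileName().toString().endsWith(extension))
          .sorted()
          .findFirst();
    }
  }
}
```

With one helper of this shape, the data-file and index-file readers would differ only in the extension they pass (".carbondata" or ".carbonindex").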


---

[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2804#discussion_r229608660
 
    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java ---
    @@ -64,11 +66,70 @@ public static Schema readSchemaInSchemaFile(String schemaFilePath) throws IOExce
       /**
        * Read carbondata file and return the schema
        *
    -   * @param dataFilePath complete path including carbondata file name
    +   * @param path carbondata store path
        * @return Schema object
        * @throws IOException
        */
    -  public static Schema readSchemaInDataFile(String dataFilePath) throws IOException {
    +  public static Schema readSchemaFromFirstDataFile(String path) throws IOException {
    +    String dataFilePath = getFirstCarbonDataFile(path);
    +    return readSchemaInDataFile(dataFilePath);
    +  }
    +
    +  /**
    +   * get first carbondata file in path and don't check all files schema
    +   *
    +   * @param path carbondata file path
    +   * @return first carbondata file name
    +   */
    +  public static String getFirstCarbonDataFile(String path) {
    --- End diff --
   
    OK, I misunderstood, sorry.
    Updated.


---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9448/



---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1401/



---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9453/



---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xubo245 commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    @ajantha-bhat @kunal642 @KanakaKumar  Please check it again.


---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user KanakaKumar commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    I think we can enhance the existing APIs themselves to take a folder path and then read from the first available data file or index file. Adding many APIs to the SDK for the same functionality will cause confusion. I think we can unify them into a single method such as getSchemaFromPath() that returns the schema regardless of whether the path contains index files, data files, or multiple subfolders.
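
A rough sketch of the path resolution such a unified method would need is given below. It is not the SDK code: the class SchemaSourceResolverSketch and its resolve method are hypothetical, it uses java.nio.file instead of CarbonData's FileFactory, and preferring an index file over a data file when both exist is an assumption made only for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical resolver, not part of the CarbonData SDK.
final class SchemaSourceResolverSketch {
  static final String INDEX_EXT = ".carbonindex";
  static final String DATA_EXT = ".carbondata";

  private SchemaSourceResolverSketch() {
  }

  /** Maps a file or folder path to the single file the schema would be read from. */
  static Path resolve(Path path) throws IOException {
    if (Files.isRegularFile(path)) {
      String name = path.getFileName().toString();
      if (name.endsWith(INDEX_EXT) || name.endsWith(DATA_EXT)) {
        return path;
      }
      throw new IllegalArgumentException("Not a carbonindex or carbondata file: " + path);
    }
    // Folder case: walk the folder and all of its subfolders.
    List<Path> files;
    try (Stream<Path> walk = Files.walk(path)) {
      files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
    }
    // Assumption: try index files first, then fall back to data files.
    for (String ext : new String[] {INDEX_EXT, DATA_EXT}) {
      for (Path file : files) {
        if (file.getFileName().toString().endsWith(ext)) {
          return file;
        }
      }
    }
    throw new NoSuchFileException(path.toString(), null, "no carbonindex or carbondata file found");
  }
}
```

A caller would pass whatever path the user supplied and hand the resolved file to the existing file-level reader, so the public API needs only one entry point.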


---

[GitHub] carbondata issue #2804: [CARBONDATA-2996] CarbonSchemaReader support read sc...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xubo245 commented on the issue:

    https://github.com/apache/carbondata/pull/2804
 
    @KanakaKumar OK, I have unified the interface into readSchema.
   
     [CARBONDATA-2996] CarbonSchemaReader support read schema from folder path
        1. Deprecated readSchemaInIndexFile and readSchemaInDataFile and unified them into readSchema.
        2. Deprecated readSchemaInSchemaFile.
        3. readSchema supports reading the schema from a folder path, a carbonindex file, or a carbondata file, and the user can decide whether to check the schema of all files.
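
The optional check across all files in item 3 could look roughly like the sketch below. Everything here is hypothetical: the class ValidatedSchemaReadSketch, the validateSchema flag name, and the SchemaLoader callback, which stands in for the real file reader so the example stays self-contained. The signature of readSchema in the actual PR is not shown in this thread.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

// Hypothetical sketch, not the CarbonSchemaReader implementation.
final class ValidatedSchemaReadSketch {

  /** Stand-in for whatever reads a schema out of one carbonindex or carbondata file. */
  @FunctionalInterface
  interface SchemaLoader<S> {
    S load(Path file) throws IOException;
  }

  private ValidatedSchemaReadSketch() {
  }

  /**
   * Returns the schema of the first file. When validateSchema is true, every
   * other file is loaded too and must have an equal schema.
   */
  static <S> S readSchema(List<Path> files, boolean validateSchema, SchemaLoader<S> loader)
      throws IOException {
    if (files.isEmpty()) {
      throw new IllegalArgumentException("no files to read a schema from");
    }
    S first = loader.load(files.get(0));
    if (validateSchema) {
      for (Path file : files.subList(1, files.size())) {
        if (!first.equals(loader.load(file))) {
          throw new IOException(
              "Schema in " + file + " differs from schema in " + files.get(0));
        }
      }
    }
    return first;
  }
}
```

Skipping validation reads a single file, which is the cheap default discussed earlier in the thread; enabling it costs one read per file in exchange for detecting mixed schemas.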
   



---