[GitHub] carbondata pull request #2829: [CARBONDATA-3025]add more metadata in carbon ...

classic Classic list List threaded Threaded
70 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2829: [CARBONDATA-3025]add more metadata in carbon ...

qiuchenjian-2
GitHub user akashrn5 opened a pull request:

    https://github.com/apache/carbondata/pull/2829

    [CARBONDATA-3025]add more metadata in carbon file footer

    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/akashrn5/incubator-carbondata integrate1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2829.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2829
   
----
commit 9b398144f83bf5378b4378ba1ddb47f50bf7a6da
Author: akashrn5 <akashnilugal@...>
Date:   2018-10-17T14:26:49Z

    add more metadata in carbon file footer

----


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2829: [CARBONDATA-3025]add more metadata in carbon file fo...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2829
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/839/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2829: [CARBONDATA-3025]add more metadata in carbon file fo...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2829
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1037/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2829: [CARBONDATA-3025]add more metadata in carbon file fo...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2829
 
    Build Failed  with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9105/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2829: [CARBONDATA-3025]add more metadata in carbon file fo...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2829
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/846/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2829: [CARBONDATA-3025]add more metadata in carbon file fo...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2829
 
    Build Failed  with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9111/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2829: [CARBONDATA-3025]add more metadata in carbon file fo...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2829
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1043/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2829: [CARBONDATA-3025]add more metadata in carbon ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2829#discussion_r226218717
 
    --- Diff: format/src/main/thrift/carbondata.thrift ---
    @@ -206,6 +206,8 @@ struct FileFooter3{
         4: optional list<BlockletInfo3> blocklet_info_list3; // Information about blocklets of all columns in this file for V3 format
         5: optional dictionary.ColumnDictionaryChunk dictionary; // Blocklet local dictionary
         6: optional bool is_sort; // True if the data is sorted in this file, it is used for compaction to decide whether to use merge sort or not
    +    7: optional string written_by; // written by is used to write who wrote the file, it can be LOAD, or SDK etc
    --- End diff --
   
    please add a map<string, string> property, and put these two new fields into this property map. We can extend any field by using this property map instead of adding more fields


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2829: [CARBONDATA-3025]add more metadata in carbon ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2829#discussion_r226218950
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCarbonFileInputFormatWithExternalCarbonTable.scala ---
    @@ -56,7 +56,7 @@ class TestCarbonFileInputFormatWithExternalCarbonTable extends QueryTest with Be
           val builder = CarbonWriter.builder()
           val writer =
             builder.outputPath(writerPath + "/Fact/Part0/Segment_null")
    -          .withCsvInput(Schema.parseJson(schema)).build()
    +          .withCsvInput(Schema.parseJson(schema)).writtenBy("SDK").build()
    --- End diff --
   
    ```suggestion
              .withCsvInput(Schema.parseJson(schema)).writtenBy("TestCarbonFileInputFormatWithExternalCarbonTable").build()
    ```


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2829: [CARBONDATA-3025]add more metadata in carbon ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2829#discussion_r226219140
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestNonTransactionalCarbonTable.scala ---
    @@ -139,13 +139,13 @@ class TestNonTransactionalCarbonTable extends QueryTest with BeforeAndAfterAll {
                 .sortBy(sortColumns.toArray)
                 .uniqueIdentifier(
                   System.currentTimeMillis).withBlockSize(2).withLoadOptions(options)
    -            .withCsvInput(Schema.parseJson(schema)).build()
    +            .withCsvInput(Schema.parseJson(schema)).writtenBy("SDK").build()
    --- End diff --
   
    ```suggestion
                .withCsvInput(Schema.parseJson(schema)).writtenBy("TestNonTransactionalCarbonTable").build()
    ```


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2829: [CARBONDATA-3025]add more metadata in carbon ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2829#discussion_r226219429
 
    --- Diff: integration/spark-datasource/src/test/scala/org/apache/spark/sql/carbondata/datasource/SparkCarbonDataSourceTest.scala ---
    @@ -984,7 +984,7 @@ class SparkCarbonDataSourceTest extends FunSuite with BeforeAndAfterAll {
           val writer =
             builder.outputPath(path)
               .uniqueIdentifier(System.nanoTime()).withBlockSize(2)
    -          .withCsvInput(new Schema(structType)).build()
    +          .withCsvInput(new Schema(structType)).writtenBy("DataSource").build()
    --- End diff --
   
    use this class name instead


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2829: [CARBONDATA-3025]add more metadata in carbon ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2829#discussion_r226219753
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/CarbonDataLoadConfiguration.java ---
    @@ -460,4 +464,20 @@ public String getColumnCompressor() {
       public void setColumnCompressor(String columnCompressor) {
         this.columnCompressor = columnCompressor;
       }
    +
    +  public String getAppName() {
    --- End diff --
   
    It is better use the same name: writtenBy


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2829: [CARBONDATA-3025]add more metadata in carbon ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2829#discussion_r226219943
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/CarbonDataLoadConfiguration.java ---
    @@ -460,4 +464,20 @@ public String getColumnCompressor() {
       public void setColumnCompressor(String columnCompressor) {
         this.columnCompressor = columnCompressor;
       }
    +
    +  public String getAppName() {
    +    return appName;
    +  }
    +
    +  public void setAppName(String appName) {
    +    this.appName = appName;
    +  }
    +
    +  public String getVersion() {
    +    return version;
    +  }
    +
    +  public void setVersion(String version) {
    --- End diff --
   
    Is this carbon jar version? or application version? Application version should be given in the writtenBy


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2829: [CARBONDATA-3025]add more metadata in carbon ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2829#discussion_r226220780
 
    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java ---
    @@ -76,6 +80,17 @@ public static Schema readSchemaInDataFile(String dataFilePath) throws IOExceptio
         return new Schema(columnSchemaList);
       }
     
    +  public static String getVersionDetails(String dataFilePath) throws IOException {
    --- End diff --
   
    Why add this method in SchemaReader? This information is from Footer, right?


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2829: [CARBONDATA-3025]add more metadata in carbon ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user akashrn5 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2829#discussion_r226229213
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/CarbonDataLoadConfiguration.java ---
    @@ -460,4 +464,20 @@ public String getColumnCompressor() {
       public void setColumnCompressor(String columnCompressor) {
         this.columnCompressor = columnCompressor;
       }
    +
    +  public String getAppName() {
    +    return appName;
    +  }
    +
    +  public void setAppName(String appName) {
    +    this.appName = appName;
    +  }
    +
    +  public String getVersion() {
    +    return version;
    +  }
    +
    +  public void setVersion(String version) {
    --- End diff --
   
    it is carbon version


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2829: [CARBONDATA-3025]add more metadata in carbon ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user akashrn5 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2829#discussion_r226233891
 
    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java ---
    @@ -76,6 +80,17 @@ public static Schema readSchemaInDataFile(String dataFilePath) throws IOExceptio
         return new Schema(columnSchemaList);
       }
     
    +  public static String getVersionDetails(String dataFilePath) throws IOException {
    --- End diff --
   
    yes this info is from footer, in SDK we expose API is schema reader right, i thought may be i can expose one more API there only to get the version details, will this be ok? or need to write this in new class for footer info? please suggest


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2829: [CARBONDATA-3025]add more metadata in carbon file fo...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2829
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/858/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2829: [CARBONDATA-3025]add more metadata in carbon file fo...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2829
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/860/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2829: [CARBONDATA-3025]add more metadata in carbon file fo...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2829
 
    Build Failed  with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9124/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2829: [CARBONDATA-3025]add more metadata in carbon file fo...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2829
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1058/



---
1234