[GitHub] carbondata pull request #2829: [CARBONDATA-3025]add more metadata in carbon ...

Github user akashrn5 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2829#discussion_r228123905

--- Diff: format/src/main/thrift/carbondata.thrift ---
@@ -206,6 +206,8 @@ struct FileFooter3{
4: optional list<BlockletInfo3> blocklet_info_list3; // Information about blocklets of all columns in this file for V3 format
5: optional dictionary.ColumnDictionaryChunk dictionary; // Blocklet local dictionary
6: optional bool is_sort; // True if the data is sorted in this file, it is used for compaction to decide whether to use merge sort or not
+ 7: optional string written_by; // written by is used to write who wrote the file, it can be LOAD, or SDK etc
--- End diff --

Replaced the written_by string with a map.

---

Github user akashrn5 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2829#discussion_r228123974

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCarbonFileInputFormatWithExternalCarbonTable.scala ---
@@ -56,7 +56,7 @@ class TestCarbonFileInputFormatWithExternalCarbonTable extends QueryTest with Be
val builder = CarbonWriter.builder()
val writer =
builder.outputPath(writerPath + "/Fact/Part0/Segment_null")
- .withCsvInput(Schema.parseJson(schema)).build()
+ .withCsvInput(Schema.parseJson(schema)).writtenBy("SDK").build()
--- End diff --

Added the class name for writtenBy.

---

Github user akashrn5 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2829#discussion_r228124017

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestNonTransactionalCarbonTable.scala ---
@@ -139,13 +139,13 @@ class TestNonTransactionalCarbonTable extends QueryTest with BeforeAndAfterAll {
.sortBy(sortColumns.toArray)
.uniqueIdentifier(
System.currentTimeMillis).withBlockSize(2).withLoadOptions(options)
- .withCsvInput(Schema.parseJson(schema)).build()
+ .withCsvInput(Schema.parseJson(schema)).writtenBy("SDK").build()
--- End diff --

changed

---

Github user akashrn5 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2829#discussion_r228124063

--- Diff: integration/spark-datasource/src/test/scala/org/apache/spark/sql/carbondata/datasource/SparkCarbonDataSourceTest.scala ---
@@ -984,7 +984,7 @@ class SparkCarbonDataSourceTest extends FunSuite with BeforeAndAfterAll {
val writer =
builder.outputPath(path)
.uniqueIdentifier(System.nanoTime()).withBlockSize(2)
- .withCsvInput(new Schema(structType)).build()
+ .withCsvInput(new Schema(structType)).writtenBy("DataSource").build()
--- End diff --

ok

---

Github user akashrn5 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2829#discussion_r228124157

--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/CarbonDataLoadConfiguration.java ---
@@ -460,4 +464,20 @@ public String getColumnCompressor() {
public void setColumnCompressor(String columnCompressor) {
this.columnCompressor = columnCompressor;
}
+
+ public String getAppName() {
--- End diff --

Handled as a Carbon property.

---

Github user akashrn5 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2829#discussion_r228124241

--- Diff: format/src/main/thrift/carbondata.thrift ---
@@ -206,6 +206,7 @@ struct FileFooter3{
4: optional list<BlockletInfo3> blocklet_info_list3; // Information about blocklets of all columns in this file for V3 format
5: optional dictionary.ColumnDictionaryChunk dictionary; // Blocklet local dictionary
6: optional bool is_sort; // True if the data is sorted in this file, it is used for compaction to decide whether to use merge sort or not
+ 7: optional map<string, string> extra_info; // written by is used to write who wrote the file, it can be Aplication name, or SDK etc and version in which this carbondata file is written etc
--- End diff --

modified

---

Github user akashrn5 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2829#discussion_r228124274

--- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java ---
@@ -371,8 +381,14 @@ public CarbonWriter build() throws IOException, InvalidLoadOptionException {
"Writer type is not set, use withCsvInput() or withAvroInput() or withJsonInput() "
+ "API based on input");
}
+ if (this.writtenByApp == null) {
--- End diff --

added
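
The diff above shows only the start of the null check on writtenByApp. As a rough illustration of what such a guard might look like, here is a minimal sketch. WriterBuilderSketch, its methods, and the choice to throw IllegalStateException are all hypothetical stand-ins, not the real CarbonWriterBuilder API or its actual behaviour.

```java
import java.util.Objects;

// Hypothetical stand-in for an SDK writer builder; NOT the real CarbonWriterBuilder.
final class WriterBuilderSketch {
  private String outputPath;
  private String writtenByApp;

  WriterBuilderSketch outputPath(String path) {
    this.outputPath = Objects.requireNonNull(path, "path");
    return this;
  }

  // Records the name of the application that is producing the file.
  WriterBuilderSketch writtenBy(String appName) {
    this.writtenByApp = appName;
    return this;
  }

  // Assumed behaviour: refuse to build when the writer identity is missing,
  // so that no file is produced without it.
  String build() {
    if (this.outputPath == null) {
      throw new IllegalStateException("outputPath is not set");
    }
    if (this.writtenByApp == null) {
      throw new IllegalStateException(
          "writtenBy is not set, call writtenBy() with the application name");
    }
    // A real builder would return a writer; a description string keeps the sketch small.
    return outputPath + " written by " + writtenByApp;
  }
}
```

Under this assumption, a call chain such as outputPath(...).writtenBy("SDK").build() succeeds, while omitting writtenBy() fails at build time rather than at write time.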

---

[GitHub] carbondata issue #2829: [CARBONDATA-3025]add more metadata in carbon file fo...

Github user akashrn5 commented on the issue:

https://github.com/apache/carbondata/pull/2829

> @akashrn5 Instead of changing many classes to pass writtenBy and appName, can't we set them in CarbonProperties, read them from there in the writer step, and write them to thrift?

handled
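
The suggestion quoted above can be pictured roughly as follows: the caller sets values in a process-wide property store once, and the writer step reads them back to fill the footer map. This is only a sketch under stated assumptions. FooterInfoSketch, the key names, and the "unknown" default are hypothetical and do not reflect the real CarbonProperties API or the keys used in the PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration; none of these names come from CarbonData itself.
final class FooterInfoSketch {
  // Stand-in for a process-wide property store such as CarbonProperties.
  private static final Map<String, String> PROPERTIES = new ConcurrentHashMap<>();

  // Assumed key names for the footer map.
  static final String WRITTEN_BY_KEY = "written_by";
  static final String VERSION_KEY = "version";

  // Called once by the load or SDK entry point, instead of threading
  // the values through every intermediate class.
  static void setProperty(String key, String value) {
    PROPERTIES.put(key, value);
  }

  // Writer step: collect the values that would go into the footer's
  // map<string, string> field.
  static Map<String, String> buildExtraInfo() {
    Map<String, String> extraInfo = new LinkedHashMap<>();
    extraInfo.put(WRITTEN_BY_KEY, PROPERTIES.getOrDefault(WRITTEN_BY_KEY, "unknown"));
    String version = PROPERTIES.get(VERSION_KEY);
    if (version != null) {
      extraInfo.put(VERSION_KEY, version);
    }
    return extraInfo;
  }
}
```

The trade-off in this design is that a shared store avoids changing many method signatures, but the values become global state for the process, so concurrent writers with different identities would need extra care.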

---

Github user kunal642 commented on the issue:

https://github.com/apache/carbondata/pull/2829

LGTM

---

Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2829

---
