[GitHub] carbondata pull request #2356: [CARBONDATA-2566] Optimize CarbonReaderExampl...

[GitHub] carbondata pull request #2356: [CARBONDATA-2566] Optimize CarbonReaderExampl...

GitHub user xubo245 opened a pull request:

https://github.com/apache/carbondata/pull/2356

[CARBONDATA-2566] Optimize CarbonReaderExample

Optimize CarbonReaderExample:
1. Add different data types, including date and timestamp.
2. Update the doc.
3. Invoke the schema reader:
Schema schema = CarbonSchemaReader
.readSchemaInSchemaFile(dataFiles[0].getAbsolutePath())
.asOriginOrder();

- [ ] Any interfaces changed?
No
- [ ] Any backward compatibility impacted?
No
- [ ] Document update required?
Yes, updated
- [ ] Testing done
update the example
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
No

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xubo245/carbondata CARBONDATA-2566-OptimizeCarbonReaderExample

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2356.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2356

----
commit 98b78adcb76f8aa13fb6985cd890ae9c5a1a7488
Author: xubo245 <xubo29@...>
Date: 2018-05-31T07:52:57Z

[CARBONDATA-2566] Optimize CarbonReaderExample

----

---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2356

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6204/

---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2356

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5042/

---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2356

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5176/

---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2356

LGTM

---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2356

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6208/

---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2356

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5177/

---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2356

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5046/

---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2356

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5178/

---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2356

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6212/

---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2356

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5050/

---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2356

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5181/

---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

Github user xubo245 commented on the issue:

https://github.com/apache/carbondata/pull/2356

retest sdv please

---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2356

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5186/

---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2356

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5189/

---

[GitHub] carbondata pull request #2356: [CARBONDATA-2566] Optimize CarbonReaderExampl...

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2356#discussion_r192281801

--- Diff: docs/sdk-guide.md ---
@@ -408,17 +408,30 @@ External client can make use of this reader to read CarbonData files without Car
String path = "./testWriteFiles";
CarbonReader reader = CarbonReader
.builder(path, "_temp")
- .projection(new String[]{"name", "age"})
+ .projection(new String[]{"stringField", "shortField", "intField", "longField",
+ "doubleField", "boolField", "dateField", "timeField", "decimalField"})
.build();

// 2. Read data
+ long day = 24L * 3600 * 1000;
int i = 0;
while (reader.hasNext()) {
- Object[] row = (Object[]) reader.readNextRow();
- System.out.println(row[0] + "\t" + row[1]);
- i++;
+ Object[] row = (Object[]) reader.readNextRow();
+ System.out.println(
+ i + ":\t" +
--- End diff --

change to use `String.format`
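
As an illustration of this suggestion, here is a minimal sketch of printing one row with `String.format` instead of chained `+` concatenation. It does not use the CarbonData SDK: the `Object[]` row is built by hand, the class and method names are hypothetical, and it assumes a date value arrives as an integer count of days since 1970-01-01, which the `day` constant in the diff suggests but this thread does not confirm.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

public class RowFormatSketch {
  // Same constant as `day` in the diff: milliseconds in one day.
  static final long DAY_MILLIS = 24L * 3600 * 1000;

  // Formats a hypothetical 3-column row: string, date (days since epoch), decimal.
  static String formatRow(int index, Object[] row) {
    int days = (Integer) row[1];
    // Assumption: the date column holds days since 1970-01-01 (UTC).
    LocalDate date = Instant.ofEpochMilli(DAY_MILLIS * days)
        .atZone(ZoneOffset.UTC)
        .toLocalDate();
    // One format string replaces the chain of "+" and "\t" concatenations.
    return String.format("%d:\t%s\t%s\t%s", index, row[0], date, row[2]);
  }

  public static void main(String[] args) {
    // Hand-built row standing in for the result of reader.readNextRow().
    Object[] row = {"robot0", 17957, "12.35"};
    System.out.println(formatRow(0, row));
  }
}
```

Keeping the column layout in a single format string makes it easier to see at a glance how many fields are printed and in which order.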

---

[GitHub] carbondata pull request #2356: [CARBONDATA-2566] Optimize CarbonReaderExampl...

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2356#discussion_r192281824

--- Diff: examples/spark2/src/main/java/org/apache/carbondata/examples/sdk/CarbonReaderExample.java ---
@@ -39,36 +42,89 @@ public static void main(String[] args) {
try {
FileUtils.deleteDirectory(new File(path));

- Field[] fields = new Field[2];
- fields[0] = new Field("name", DataTypes.STRING);
- fields[1] = new Field("age", DataTypes.INT);
+ Field[] fields = new Field[9];
+ fields[0] = new Field("stringField", DataTypes.STRING);
+ fields[1] = new Field("shortField", DataTypes.SHORT);
+ fields[2] = new Field("intField", DataTypes.INT);
+ fields[3] = new Field("longField", DataTypes.LONG);
+ fields[4] = new Field("doubleField", DataTypes.DOUBLE);
+ fields[5] = new Field("boolField", DataTypes.BOOLEAN);
+ fields[6] = new Field("dateField", DataTypes.DATE);
+ fields[7] = new Field("timeField", DataTypes.TIMESTAMP);
+ fields[8] = new Field("decimalField", DataTypes.createDecimalType(8, 2));

CarbonWriter writer = CarbonWriter.builder()
- .outputPath(path)
- .persistSchemaFile(true)
- .buildWriterForCSVInput(new Schema(fields));
+ .outputPath(path)
+ .buildWriterForCSVInput(new Schema(fields));

for (int i = 0; i < 10; i++) {
- writer.write(new String[]{"robot" + (i % 10), String.valueOf(i)});
+ String[] row2 = new String[]{
+ "robot" + (i % 10),
+ String.valueOf(i),
+ String.valueOf(i),
+ String.valueOf(Long.MAX_VALUE - i),
+ String.valueOf((double) i / 2),
+ String.valueOf(true),
+ "2019-03-02",
+ "2019-02-12 03:03:34",
+ "12.345"
+ };
+ writer.write(row2);
}
writer.close();

+ File[] dataFiles = new File(path).listFiles(new FilenameFilter() {
+ @Override
+ public boolean accept(File dir, String name) {
+ if (name == null) {
+ return false;
+ }
+ return name.endsWith("carbonindex");
+ }
+ });
+ if (dataFiles == null || dataFiles.length < 1) {
+ throw new RuntimeException("Carbon index file not exists.");
+ }
+ Schema schema = CarbonSchemaReader
+ .readSchemaInIndexFile(dataFiles[0].getAbsolutePath())
+ .asOriginOrder();
+ // Transform the schema
+ String[] strings = new String[schema.getFields().length];
+ for (int i = 0; i < schema.getFields().length; i++) {
+ strings[i] = (schema.getFields())[i].getFieldName();
+ }
+
// Read data
CarbonReader reader = CarbonReader
- .builder(path, "_temp")
- .projection(new String[]{"name", "age"})
- .build();
+ .builder(path, "_temp")
+ .projection(strings)
+ .build();

System.out.println("\nData:");
+ long day = 24L * 3600 * 1000;
+ int i = 0;
while (reader.hasNext()) {
Object[] row = (Object[]) reader.readNextRow();
- System.out.println(row[0] + " " + row[1]);
+ System.out.println(
+ i + ":\t" +
--- End diff --

change to use `String.format`
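
Separately from the formatting comment, the index-file lookup in the diff above could be expressed more compactly on Java 8 or later. The following is a sketch using only `java.io` and `java.nio.file`: the class name, method name, and the sample file names are hypothetical, and a temporary directory stands in for the writer's output path.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class IndexFileLookupSketch {
  // Returns all files in dir whose names end with "carbonindex",
  // mirroring the anonymous FilenameFilter in the diff.
  static File[] findIndexFiles(File dir) {
    File[] files = dir.listFiles((d, name) -> name != null && name.endsWith("carbonindex"));
    // listFiles returns null if dir is not a directory or an I/O error occurs.
    if (files == null || files.length < 1) {
      throw new RuntimeException("No carbonindex file found in " + dir);
    }
    return files;
  }

  public static void main(String[] args) throws IOException {
    // Temporary directory with made-up file names, standing in for the output path.
    File dir = Files.createTempDirectory("indexLookup").toFile();
    new File(dir, "part-0.carbondata").createNewFile();
    new File(dir, "0.carbonindex").createNewFile();
    for (File f : findIndexFiles(dir)) {
      System.out.println(f.getName());
    }
  }
}
```

The two-argument lambda resolves to `FilenameFilter`, so the behavior matches the anonymous class while dropping the boilerplate.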

---

[GitHub] carbondata pull request #2356: [CARBONDATA-2566] Optimize CarbonReaderExampl...

Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2356#discussion_r192284339

OK, done.

---

[GitHub] carbondata pull request #2356: [CARBONDATA-2566] Optimize CarbonReaderExampl...

Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2356#discussion_r192284737

ok, done

---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2356

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6217/

---