GitHub user xubo245 opened a pull request:

https://github.com/apache/carbondata/pull/2356

[CARBONDATA-2566] Optimize CarbonReaderExample

1. Add different data types, including date and timestamp.
2. Update the doc.
3. Invoke:

       Schema schema = CarbonSchemaReader
           .readSchemaInSchemaFile(dataFiles[0].getAbsolutePath())
           .asOriginOrder();

- [ ] Any interfaces changed? No
- [ ] Any backward compatibility impacted? No
- [ ] Document update required? Yes, updated
- [ ] Testing done: updated the example
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. No

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xubo245/carbondata CARBONDATA-2566-OptimizeCarbonReaderExample

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2356.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2356

----
commit 98b78adcb76f8aa13fb6985cd890ae9c5a1a7488
Author: xubo245 <xubo29@...>
Date: 2018-05-31T07:52:57Z

    [CARBONDATA-2566] Optimize CarbonReaderExample
----

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2356

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6204/

---
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2356

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5042/

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2356

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5176/

---
Github user sgururajshetty commented on the issue:
https://github.com/apache/carbondata/pull/2356

LGTM

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2356

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6208/

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2356

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5177/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2356

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5046/

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2356

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5178/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2356

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6212/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2356

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5050/

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2356

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5181/

---
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/2356

retest sdv please

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2356

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5186/

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2356

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5189/

---
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2356#discussion_r192281801

--- Diff: docs/sdk-guide.md ---
@@ -408,17 +408,30 @@ External client can make use of this reader to read CarbonData files without Car
 String path = "./testWriteFiles";
 CarbonReader reader = CarbonReader
     .builder(path, "_temp")
-    .projection(new String[]{"name", "age"})
+    .projection(new String[]{"stringField", "shortField", "intField", "longField",
+        "doubleField", "boolField", "dateField", "timeField", "decimalField"})
     .build();

 // 2. Read data
+long day = 24L * 3600 * 1000;
 int i = 0;
 while (reader.hasNext()) {
-  Object[] row = (Object[]) reader.readNextRow();
-  System.out.println(row[0] + "\t" + row[1]);
-  i++;
+  Object[] row = (Object[]) reader.readNextRow();
+  System.out.println(
+      i + ":\t" +
--- End diff --

change to use `String.format`

---
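The suggested change could look like the following minimal sketch. The `RowFormatExample` class and `formatRow` helper are invented for illustration and are not code from the PR; it only shows replacing chained concatenation with `String.format`:

```java
// Hypothetical helper illustrating the reviewer's String.format suggestion:
// one format specifier per projected column instead of chained "+" concatenation.
public class RowFormatExample {
    static String formatRow(int i, Object[] row) {
        // %d for the row counter, then one %s per projected field, tab-separated
        return String.format("%d:\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s",
            i, row[0], row[1], row[2], row[3], row[4], row[5], row[6], row[7], row[8]);
    }

    public static void main(String[] args) {
        Object[] row = {"robot0", "0", "0", "9223372036854775807",
            "0.0", "true", "2019-03-02", "2019-02-12 03:03:34", "12.345"};
        System.out.println(formatRow(0, row));
    }
}
```

A single format string also makes the column layout easier to review than a long concatenation chain.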
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2356#discussion_r192281824

--- Diff: examples/spark2/src/main/java/org/apache/carbondata/examples/sdk/CarbonReaderExample.java ---
@@ -39,36 +42,89 @@ public static void main(String[] args) {
     try {
         FileUtils.deleteDirectory(new File(path));

-        Field[] fields = new Field[2];
-        fields[0] = new Field("name", DataTypes.STRING);
-        fields[1] = new Field("age", DataTypes.INT);
+        Field[] fields = new Field[9];
+        fields[0] = new Field("stringField", DataTypes.STRING);
+        fields[1] = new Field("shortField", DataTypes.SHORT);
+        fields[2] = new Field("intField", DataTypes.INT);
+        fields[3] = new Field("longField", DataTypes.LONG);
+        fields[4] = new Field("doubleField", DataTypes.DOUBLE);
+        fields[5] = new Field("boolField", DataTypes.BOOLEAN);
+        fields[6] = new Field("dateField", DataTypes.DATE);
+        fields[7] = new Field("timeField", DataTypes.TIMESTAMP);
+        fields[8] = new Field("decimalField", DataTypes.createDecimalType(8, 2));

         CarbonWriter writer = CarbonWriter.builder()
-            .outputPath(path)
-            .persistSchemaFile(true)
-            .buildWriterForCSVInput(new Schema(fields));
+            .outputPath(path)
+            .buildWriterForCSVInput(new Schema(fields));

         for (int i = 0; i < 10; i++) {
-            writer.write(new String[]{"robot" + (i % 10), String.valueOf(i)});
+            String[] row2 = new String[]{
+                "robot" + (i % 10),
+                String.valueOf(i),
+                String.valueOf(i),
+                String.valueOf(Long.MAX_VALUE - i),
+                String.valueOf((double) i / 2),
+                String.valueOf(true),
+                "2019-03-02",
+                "2019-02-12 03:03:34",
+                "12.345"
+            };
+            writer.write(row2);
         }
         writer.close();

+        File[] dataFiles = new File(path).listFiles(new FilenameFilter() {
+            @Override
+            public boolean accept(File dir, String name) {
+                if (name == null) {
+                    return false;
+                }
+                return name.endsWith("carbonindex");
+            }
+        });
+        if (dataFiles == null || dataFiles.length < 1) {
+            throw new RuntimeException("Carbon index file not exists.");
+        }
+        Schema schema = CarbonSchemaReader
+            .readSchemaInIndexFile(dataFiles[0].getAbsolutePath())
+            .asOriginOrder();
+        // Transform the schema
+        String[] strings = new String[schema.getFields().length];
+        for (int i = 0; i < schema.getFields().length; i++) {
+            strings[i] = (schema.getFields())[i].getFieldName();
+        }
+
+        // Read data
         CarbonReader reader = CarbonReader
-            .builder(path, "_temp")
-            .projection(new String[]{"name", "age"})
-            .build();
+            .builder(path, "_temp")
+            .projection(strings)
+            .build();

         System.out.println("\nData:");
+        long day = 24L * 3600 * 1000;
+        int i = 0;
         while (reader.hasNext()) {
             Object[] row = (Object[]) reader.readNextRow();
-            System.out.println(row[0] + " " + row[1]);
+            System.out.println(
+                i + ":\t" +
--- End diff --

change to use `String.format`

---
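The index-file lookup in the diff above can be isolated as a small, self-contained sketch. The `IndexFileLookup` class and `firstIndexFile` method are assumptions introduced for illustration; only the filter logic (accept names ending with "carbonindex", reject null) mirrors the quoted code:

```java
import java.io.File;
import java.io.FilenameFilter;

// Sketch of the lookup used in the example: find a carbonindex file in the
// writer's output directory and fail fast when none is present.
public class IndexFileLookup {
    // Accepts only CarbonData index files; null names are rejected defensively.
    static final FilenameFilter INDEX_FILTER = new FilenameFilter() {
        @Override
        public boolean accept(File dir, String name) {
            return name != null && name.endsWith("carbonindex");
        }
    };

    static File firstIndexFile(String path) {
        File[] dataFiles = new File(path).listFiles(INDEX_FILTER);
        if (dataFiles == null || dataFiles.length < 1) {
            throw new RuntimeException("Carbon index file does not exist.");
        }
        return dataFiles[0];
    }
}
```

Failing fast here gives a clear error if the writer produced no output, instead of a NullPointerException when the schema is read later.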
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2356#discussion_r192284339

--- Diff: examples/spark2/src/main/java/org/apache/carbondata/examples/sdk/CarbonReaderExample.java ---
+            System.out.println(
+                i + ":\t" +
--- End diff --

ok, done

---
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2356#discussion_r192284737

--- Diff: docs/sdk-guide.md ---
+    System.out.println(
+        i + ":\t" +
--- End diff --

ok, done

---
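The `day` constant in the diffs above (`24L * 3600 * 1000`, one day in milliseconds) suggests the DATE column is read back as a count of days since the Unix epoch and multiplied up for display. A hedged sketch of that conversion, with an invented class and method name (this is an inference from the constant, not code shown in the PR):

```java
import java.sql.Date;

// Sketch: convert a DATE value stored as days-since-epoch into java.sql.Date.
public class EpochDayConversion {
    static final long DAY_MILLIS = 24L * 3600 * 1000; // one day in milliseconds

    static Date toSqlDate(int epochDays) {
        // java.sql.Date takes milliseconds since 1970-01-01T00:00:00 UTC
        return new Date(DAY_MILLIS * epochDays);
    }
}
```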
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2356

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6217/

---