[GitHub] carbondata pull request #2356: [CARBONDATA-2566] Optimize CarbonReaderExampl...


[GitHub] carbondata pull request #2356: [CARBONDATA-2566] Optimize CarbonReaderExampl...

qiuchenjian-2
GitHub user xubo245 opened a pull request:

    https://github.com/apache/carbondata/pull/2356

    [CARBONDATA-2566] Optimize CarbonReaderExample

    Optimize CarbonReaderExample:
    1. Add different data types, including date and timestamp.
    2. Update the doc.
    3. Invoke the following to read the schema:
     Schema schema = CarbonSchemaReader
                    .readSchemaInSchemaFile(dataFiles[0].getAbsolutePath())
                    .asOriginOrder();
   
     - [ ] Any interfaces changed?
     No
     - [ ] Any backward compatibility impacted?
     No
     - [ ] Document update required?
     Yes, updated
     - [ ] Testing done
     Updated the example
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
     No
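
Step 3 of the description reads the schema from `dataFiles[0]`. As a minimal sketch of how such a file array might be collected with plain `java.io`, assuming the files of interest end in `carbonindex` as in the diff quoted later in this thread, the lookup could be written as below. The class name `IndexFileLookup`, the helper `findBySuffix`, and the temporary directory with its two empty files are hypothetical and only stand in for a real CarbonData output directory.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class IndexFileLookup {
    // Returns the files directly under dir whose names end with suffix.
    // File.listFiles returns null when dir is not a readable directory,
    // so that case is mapped to an empty array.
    static File[] findBySuffix(File dir, String suffix) {
        File[] found = dir.listFiles((d, name) -> name.endsWith(suffix));
        return found == null ? new File[0] : found;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for the writer's output directory.
        File dir = Files.createTempDirectory("carbon-example").toFile();
        new File(dir, "part-0.carbonindex").createNewFile();
        new File(dir, "part-0.carbondata").createNewFile();

        File[] indexFiles = findBySuffix(dir, "carbonindex");
        if (indexFiles.length < 1) {
            throw new RuntimeException("Carbon index file does not exist.");
        }
        // This path is what would be handed to the schema reader.
        System.out.println(indexFiles[0].getAbsolutePath());
    }
}
```

Returning an empty array instead of `null` lets the caller use a single length check before indexing into the result.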

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xubo245/carbondata CARBONDATA-2566-OptimizeCarbonReaderExample

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2356.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2356
   
----
commit 98b78adcb76f8aa13fb6985cd890ae9c5a1a7488
Author: xubo245 <xubo29@...>
Date:   2018-05-31T07:52:57Z

    [CARBONDATA-2566] Optimize CarbonReaderExample

----


---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2356
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6204/



---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2356
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5042/



---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2356
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5176/



---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sgururajshetty commented on the issue:

    https://github.com/apache/carbondata/pull/2356
 
    LGTM


---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2356
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6208/



---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2356
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5177/



---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2356
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5046/



---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2356
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5178/



---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2356
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6212/



---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2356
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5050/



---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2356
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5181/



---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xubo245 commented on the issue:

    https://github.com/apache/carbondata/pull/2356
 
    Retest SDV, please.


---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2356
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5186/



---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2356
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5189/



---

[GitHub] carbondata pull request #2356: [CARBONDATA-2566] Optimize CarbonReaderExampl...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2356#discussion_r192281801
 
    --- Diff: docs/sdk-guide.md ---
    @@ -408,17 +408,30 @@ External client can make use of this reader to read CarbonData files without Car
         String path = "./testWriteFiles";
         CarbonReader reader = CarbonReader
             .builder(path, "_temp")
    -        .projection(new String[]{"name", "age"})
    +        .projection(new String[]{"stringField", "shortField", "intField", "longField",
    +                "doubleField", "boolField", "dateField", "timeField", "decimalField"})
             .build();
     
         // 2. Read data
    +    long day = 24L * 3600 * 1000;
         int i = 0;
         while (reader.hasNext()) {
    -      Object[] row = (Object[]) reader.readNextRow();
    -      System.out.println(row[0] + "\t" + row[1]);
    -      i++;
    +        Object[] row = (Object[]) reader.readNextRow();
    +        System.out.println(
    +            i + ":\t" +
    --- End diff --
   
    Change this to use `String.format`.
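
As a rough illustration of this suggestion, the sketch below formats one row with `String.format` instead of chained `+` concatenation. The class name `RowFormatExample`, the helper `formatRow`, and the three-column sample row are hypothetical; the example in the pull request prints nine columns read through `CarbonReader`.

```java
import java.util.Locale;

public class RowFormatExample {
    // Builds "<index>:\t<col0>\t<col1>\t<col2>" from the first three columns.
    // %s calls toString() on each column, so any column type is accepted.
    static String formatRow(int index, Object[] row) {
        return String.format(Locale.ROOT, "%d:\t%s\t%s\t%s",
            index, row[0], row[1], row[2]);
    }

    public static void main(String[] args) {
        // Hypothetical row standing in for the Object[] from reader.readNextRow().
        Object[] row = {"robot0", (short) 0, 12.34};
        System.out.println(formatRow(0, row));
    }
}
```

Keeping the layout in a single format string makes it easier to see and change the column separators than a long chain of `+ "\t" +` fragments.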


---

[GitHub] carbondata pull request #2356: [CARBONDATA-2566] Optimize CarbonReaderExampl...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2356#discussion_r192281824
 
    --- Diff: examples/spark2/src/main/java/org/apache/carbondata/examples/sdk/CarbonReaderExample.java ---
    @@ -39,36 +42,89 @@ public static void main(String[] args) {
             try {
                 FileUtils.deleteDirectory(new File(path));
     
    -            Field[] fields = new Field[2];
    -            fields[0] = new Field("name", DataTypes.STRING);
    -            fields[1] = new Field("age", DataTypes.INT);
    +            Field[] fields = new Field[9];
    +            fields[0] = new Field("stringField", DataTypes.STRING);
    +            fields[1] = new Field("shortField", DataTypes.SHORT);
    +            fields[2] = new Field("intField", DataTypes.INT);
    +            fields[3] = new Field("longField", DataTypes.LONG);
    +            fields[4] = new Field("doubleField", DataTypes.DOUBLE);
    +            fields[5] = new Field("boolField", DataTypes.BOOLEAN);
    +            fields[6] = new Field("dateField", DataTypes.DATE);
    +            fields[7] = new Field("timeField", DataTypes.TIMESTAMP);
    +            fields[8] = new Field("decimalField", DataTypes.createDecimalType(8, 2));
     
                 CarbonWriter writer = CarbonWriter.builder()
    -                    .outputPath(path)
    -                    .persistSchemaFile(true)
    -                    .buildWriterForCSVInput(new Schema(fields));
    +                .outputPath(path)
    +                .buildWriterForCSVInput(new Schema(fields));
     
                 for (int i = 0; i < 10; i++) {
    -                writer.write(new String[]{"robot" + (i % 10), String.valueOf(i)});
    +                String[] row2 = new String[]{
    +                    "robot" + (i % 10),
    +                    String.valueOf(i),
    +                    String.valueOf(i),
    +                    String.valueOf(Long.MAX_VALUE - i),
    +                    String.valueOf((double) i / 2),
    +                    String.valueOf(true),
    +                    "2019-03-02",
    +                    "2019-02-12 03:03:34",
    +                    "12.345"
    +                };
    +                writer.write(row2);
                 }
                 writer.close();
     
    +            File[] dataFiles = new File(path).listFiles(new FilenameFilter() {
    +                @Override
    +                public boolean accept(File dir, String name) {
    +                    if (name == null) {
    +                        return false;
    +                    }
    +                    return name.endsWith("carbonindex");
    +                }
    +            });
    +            if (dataFiles == null || dataFiles.length < 1) {
    +                throw new RuntimeException("Carbon index file not exists.");
    +            }
    +            Schema schema = CarbonSchemaReader
    +                .readSchemaInIndexFile(dataFiles[0].getAbsolutePath())
    +                .asOriginOrder();
    +            // Transform the schema
    +            String[] strings = new String[schema.getFields().length];
    +            for (int i = 0; i < schema.getFields().length; i++) {
    +                strings[i] = (schema.getFields())[i].getFieldName();
    +            }
    +
                 // Read data
                 CarbonReader reader = CarbonReader
    -                    .builder(path, "_temp")
    -                    .projection(new String[]{"name", "age"})
    -                    .build();
    +                .builder(path, "_temp")
    +                .projection(strings)
    +                .build();
     
                 System.out.println("\nData:");
    +            long day = 24L * 3600 * 1000;
    +            int i = 0;
                 while (reader.hasNext()) {
                     Object[] row = (Object[]) reader.readNextRow();
    -                System.out.println(row[0] + " " + row[1]);
    +                System.out.println(
    +                    i + ":\t" +
    --- End diff --
   
    Change this to use `String.format`.
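
The `day` constant in the diff above (`24L * 3600 * 1000`) is the number of milliseconds in a day. The diff is truncated before `day` is used, so the following is only a sketch under an assumption: that the date column comes back as an integer count of days since 1970-01-01. The names `EpochDayExample`, `daysToMillis`, and `toLocalDate` are hypothetical.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

public class EpochDayExample {
    // Milliseconds in one day. The long literal keeps the later
    // multiplication by a day count from overflowing int arithmetic.
    static final long DAY_MILLIS = 24L * 3600 * 1000;

    // Converts days since 1970-01-01 to epoch milliseconds at UTC midnight.
    static long daysToMillis(int days) {
        return DAY_MILLIS * days;
    }

    // Converts days since 1970-01-01 to a calendar date, evaluated in UTC
    // so the result does not depend on the machine's default time zone.
    static LocalDate toLocalDate(int days) {
        return Instant.ofEpochMilli(daysToMillis(days))
            .atZone(ZoneOffset.UTC)
            .toLocalDate();
    }

    public static void main(String[] args) {
        // Assumed stored form of the sample date written by the example.
        int days = (int) LocalDate.of(2019, 3, 2).toEpochDay();
        System.out.println(days + " days -> " + toLocalDate(days));
    }
}
```

When only the calendar date is needed, `LocalDate.ofEpochDay(days)` gives the same result without going through milliseconds.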


---

[GitHub] carbondata pull request #2356: [CARBONDATA-2566] Optimize CarbonReaderExampl...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2356#discussion_r192284339
 
    --- Diff: examples/spark2/src/main/java/org/apache/carbondata/examples/sdk/CarbonReaderExample.java ---
    @@ -39,36 +42,89 @@ public static void main(String[] args) {
             try {
                 FileUtils.deleteDirectory(new File(path));
     
    -            Field[] fields = new Field[2];
    -            fields[0] = new Field("name", DataTypes.STRING);
    -            fields[1] = new Field("age", DataTypes.INT);
    +            Field[] fields = new Field[9];
    +            fields[0] = new Field("stringField", DataTypes.STRING);
    +            fields[1] = new Field("shortField", DataTypes.SHORT);
    +            fields[2] = new Field("intField", DataTypes.INT);
    +            fields[3] = new Field("longField", DataTypes.LONG);
    +            fields[4] = new Field("doubleField", DataTypes.DOUBLE);
    +            fields[5] = new Field("boolField", DataTypes.BOOLEAN);
    +            fields[6] = new Field("dateField", DataTypes.DATE);
    +            fields[7] = new Field("timeField", DataTypes.TIMESTAMP);
    +            fields[8] = new Field("decimalField", DataTypes.createDecimalType(8, 2));
     
                 CarbonWriter writer = CarbonWriter.builder()
    -                    .outputPath(path)
    -                    .persistSchemaFile(true)
    -                    .buildWriterForCSVInput(new Schema(fields));
    +                .outputPath(path)
    +                .buildWriterForCSVInput(new Schema(fields));
     
                 for (int i = 0; i < 10; i++) {
    -                writer.write(new String[]{"robot" + (i % 10), String.valueOf(i)});
    +                String[] row2 = new String[]{
    +                    "robot" + (i % 10),
    +                    String.valueOf(i),
    +                    String.valueOf(i),
    +                    String.valueOf(Long.MAX_VALUE - i),
    +                    String.valueOf((double) i / 2),
    +                    String.valueOf(true),
    +                    "2019-03-02",
    +                    "2019-02-12 03:03:34",
    +                    "12.345"
    +                };
    +                writer.write(row2);
                 }
                 writer.close();
     
    +            File[] dataFiles = new File(path).listFiles(new FilenameFilter() {
    +                @Override
    +                public boolean accept(File dir, String name) {
    +                    if (name == null) {
    +                        return false;
    +                    }
    +                    return name.endsWith("carbonindex");
    +                }
    +            });
    +            if (dataFiles == null || dataFiles.length < 1) {
    +                throw new RuntimeException("Carbon index file not exists.");
    +            }
    +            Schema schema = CarbonSchemaReader
    +                .readSchemaInIndexFile(dataFiles[0].getAbsolutePath())
    +                .asOriginOrder();
    +            // Transform the schema
    +            String[] strings = new String[schema.getFields().length];
    +            for (int i = 0; i < schema.getFields().length; i++) {
    +                strings[i] = (schema.getFields())[i].getFieldName();
    +            }
    +
                 // Read data
                 CarbonReader reader = CarbonReader
    -                    .builder(path, "_temp")
    -                    .projection(new String[]{"name", "age"})
    -                    .build();
    +                .builder(path, "_temp")
    +                .projection(strings)
    +                .build();
     
                 System.out.println("\nData:");
    +            long day = 24L * 3600 * 1000;
    +            int i = 0;
                 while (reader.hasNext()) {
                     Object[] row = (Object[]) reader.readNextRow();
    -                System.out.println(row[0] + " " + row[1]);
    +                System.out.println(
    +                    i + ":\t" +
    --- End diff --
   
    OK, done.


---

[GitHub] carbondata pull request #2356: [CARBONDATA-2566] Optimize CarbonReaderExampl...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2356#discussion_r192284737
 
    --- Diff: docs/sdk-guide.md ---
    @@ -408,17 +408,30 @@ External client can make use of this reader to read CarbonData files without Car
         String path = "./testWriteFiles";
         CarbonReader reader = CarbonReader
             .builder(path, "_temp")
    -        .projection(new String[]{"name", "age"})
    +        .projection(new String[]{"stringField", "shortField", "intField", "longField",
    +                "doubleField", "boolField", "dateField", "timeField", "decimalField"})
             .build();
     
         // 2. Read data
    +    long day = 24L * 3600 * 1000;
         int i = 0;
         while (reader.hasNext()) {
    -      Object[] row = (Object[]) reader.readNextRow();
    -      System.out.println(row[0] + "\t" + row[1]);
    -      i++;
    +        Object[] row = (Object[]) reader.readNextRow();
    +        System.out.println(
    +            i + ":\t" +
    --- End diff --
   
    OK, done.


---

[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2356
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6217/



---