[jira] [Commented] (CARBONDATA-296) 1.Add CSVInputFormat to read csv files.

Akash R Nilugal (Jira)

    [ https://issues.apache.org/jira/browse/CARBONDATA-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15574283#comment-15574283 ]

ASF GitHub Bot commented on CARBONDATA-296:
-------------------------------------------

Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/233#discussion_r83359938
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/util/CSVInputFormatUtil.java ---
    @@ -0,0 +1,57 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +package org.apache.carbondata.hadoop.util;
    +
    +import com.univocity.parsers.csv.CsvParserSettings;
    +import org.apache.hadoop.conf.Configuration;
    +
    +/**
    + * CSVInputFormatUtil is a util class.
    + */
    +public class CSVInputFormatUtil {
    +
    +  public static final String DELIMITER = "carbon.csvinputformat.delimiter";
    +  public static final String DELIMITER_DEFAULT = ",";
    +  public static final String COMMENT = "carbon.csvinputformat.comment";
    +  public static final String COMMENT_DEFAULT = "#";
    +  public static final String QUOTE = "carbon.csvinputformat.quote";
    +  public static final String QUOTE_DEFAULT = "\"";
    +  public static final String ESCAPE = "carbon.csvinputformat.escape";
    +  public static final String ESCAPE_DEFAULT = "\\";
    +  public static final String HEADER_PRESENT = "caron.csvinputformat.header.present";
    +  public static final boolean HEADER_PRESENT_DEFAULT = false;
    +
    +  public static CsvParserSettings extractCsvParserSettings(Configuration job, long start) {
    --- End diff --
   
    I think this class is not needed; move this function into the CSVRecordReader class.
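
To illustrate the suggestion, here is a minimal sketch of how the helper could sit inside the reader instead of a separate util class. It is not the PR's code: `CsvRecordReaderSketch` and `ParserSettings` are hypothetical names, a plain `Map<String, String>` stands in for Hadoop's `Configuration`, and `ParserSettings` stands in for univocity's `CsvParserSettings` so the example needs only the JDK. The body of `extractCsvParserSettings` is not shown in the diff, so the logic below, including the use of `start` to apply the header flag only to the split at offset 0, is an assumption. The keys and defaults are copied from the diff as-is (the header key is spelled `caron` there).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the record reader named in the review comment.
class CsvRecordReaderSketch {

  // Keys and defaults copied verbatim from the diff.
  static final String DELIMITER = "carbon.csvinputformat.delimiter";
  static final String DELIMITER_DEFAULT = ",";
  static final String COMMENT = "carbon.csvinputformat.comment";
  static final String COMMENT_DEFAULT = "#";
  static final String QUOTE = "carbon.csvinputformat.quote";
  static final String QUOTE_DEFAULT = "\"";
  static final String ESCAPE = "carbon.csvinputformat.escape";
  static final String ESCAPE_DEFAULT = "\\";
  static final String HEADER_PRESENT = "caron.csvinputformat.header.present";
  static final boolean HEADER_PRESENT_DEFAULT = false;

  // Hypothetical holder standing in for univocity's CsvParserSettings.
  static final class ParserSettings {
    final char delimiter;
    final char comment;
    final char quote;
    final char escape;
    final boolean skipHeader;

    ParserSettings(char delimiter, char comment, char quote, char escape, boolean skipHeader) {
      this.delimiter = delimiter;
      this.comment = comment;
      this.quote = quote;
      this.escape = escape;
      this.skipHeader = skipHeader;
    }
  }

  // The helper now lives next to its only caller; no separate util class is needed.
  static ParserSettings extractCsvParserSettings(Map<String, String> job, long start) {
    boolean headerPresent = Boolean.parseBoolean(
        job.getOrDefault(HEADER_PRESENT, String.valueOf(HEADER_PRESENT_DEFAULT)));
    // Assumption: only the split that starts at offset 0 contains the header row.
    boolean skipHeader = headerPresent && start == 0;
    return new ParserSettings(
        firstChar(job, DELIMITER, DELIMITER_DEFAULT),
        firstChar(job, COMMENT, COMMENT_DEFAULT),
        firstChar(job, QUOTE, QUOTE_DEFAULT),
        firstChar(job, ESCAPE, ESCAPE_DEFAULT),
        skipHeader);
  }

  // Reads a single-character option, falling back to the default when unset or empty.
  private static char firstChar(Map<String, String> job, String key, String defaultValue) {
    String value = job.getOrDefault(key, defaultValue);
    return value.isEmpty() ? defaultValue.charAt(0) : value.charAt(0);
  }

  public static void main(String[] args) {
    Map<String, String> job = new HashMap<>();
    job.put(DELIMITER, "|");
    job.put(HEADER_PRESENT, "true");
    ParserSettings first = extractCsvParserSettings(job, 0L);
    ParserSettings later = extractCsvParserSettings(job, 4096L);
    System.out.println("delimiter=" + first.delimiter
        + " skipHeader(first split)=" + first.skipHeader
        + " skipHeader(later split)=" + later.skipHeader);
  }
}
```

In the real reader, the same method would take the Hadoop `Configuration` and return `CsvParserSettings`, as in the signature shown in the diff.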


> 1.Add CSVInputFormat to read csv files.
> ---------------------------------------
>
>                 Key: CARBONDATA-296
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-296
>             Project: CarbonData
>          Issue Type: Sub-task
>            Reporter: Ravindra Pesala
>            Assignee: QiangCai
>             Fix For: 0.2.0-incubating
>
>
> Add CSVInputFormat to read CSV files. It should use the Univocity parser to read the files for optimal performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)