[jira] [Commented] (CARBONDATA-297) 2. Add interfaces for data loading.

Akash R Nilugal (Jira)

    [ https://issues.apache.org/jira/browse/CARBONDATA-297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15571835#comment-15571835 ]

ASF GitHub Bot commented on CARBONDATA-297:
-------------------------------------------

Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/229#discussion_r83208961
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/DataLoadProcessorStep.java ---
    @@ -0,0 +1,40 @@
    +package org.apache.carbondata.processing.newflow;
    +
    +import java.util.Iterator;
    +
    +import org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException;
    +
    +/**
    + * This is the base interface for data loading. It can do transformation jobs as per the implementation.
    + *
    + */
    +public interface DataLoadProcessorStep {
    +
    +  /**
    +   * The output metadata for this step. The data returned from this step conforms to this metadata.
    +   * @return
    +   */
    +  DataField[] getOutput();
    +
    +  /**
    +   * Initialization process for this step.
    +   * @param configuration
    +   * @param child
    +   * @throws CarbonDataLoadingException
    +   */
    +  void intialize(CarbonDataLoadConfiguration configuration, DataLoadProcessorStep child) throws
    +      CarbonDataLoadingException;
    +
    +  /**
    +   * Transform the data as per the implementation.
    +   * @return Iterator of data
    +   * @throws CarbonDataLoadingException
    +   */
    +  Iterator<Object[]> execute() throws CarbonDataLoadingException;
    --- End diff --
   
    I thought the SortStep was a singleton object within the executor. If there is only one executor per datanode, then the SortStep sorts the data at datanode scope, which is what we want. Synchronization means the SortStep is thread-safe, so that multiple tasks can insert rows into it.
    Does your design look like this? Otherwise, how do you ensure that the data is sorted within a datanode?
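
For illustration, the design described in this comment could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code from the pull request: `SharedSortBuffer`, `addRow`, `sortedRows`, and `clear` are hypothetical names, all rows are assumed to fit in memory, and "one instance per executor" is modelled as one instance per JVM.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a shared sort step; not the SortStep from the PR.
final class SharedSortBuffer {

  // Assumption: one executor corresponds to one JVM, so a static instance
  // is shared by every task running in that executor.
  private static final SharedSortBuffer INSTANCE = new SharedSortBuffer();

  // Guarded by the monitor of this object.
  private final List<Object[]> rows = new ArrayList<>();

  private SharedSortBuffer() { }

  static SharedSortBuffer getInstance() {
    return INSTANCE;
  }

  // Called concurrently by many tasks; synchronized makes the insert thread-safe.
  synchronized void addRow(Object[] row) {
    rows.add(row);
  }

  // Sorts a snapshot of everything inserted so far, across all tasks.
  synchronized Iterator<Object[]> sortedRows(Comparator<Object[]> order) {
    List<Object[]> snapshot = new ArrayList<>(rows);
    snapshot.sort(order);
    return snapshot.iterator();
  }

  // Resets the buffer between loads.
  synchronized void clear() {
    rows.clear();
  }
}
```

Each task would call `SharedSortBuffer.getInstance().addRow(row)`, and a single caller would later read `sortedRows(...)` to get the rows of all tasks in one order. Whether the result is really sorted at datanode scope still depends on there being exactly one executor per datanode, which is the open question above.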



> 2. Add interfaces for data loading.
> -----------------------------------
>
>                 Key: CARBONDATA-297
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-297
>             Project: CarbonData
>          Issue Type: Sub-task
>            Reporter: Ravindra Pesala
>            Assignee: Ravindra Pesala
>             Fix For: 0.2.0-incubating
>
>
> Add the major interface classes for data loading so that the following JIRAs can use these interfaces in their implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)