    [ https://issues.apache.org/jira/browse/CARBONDATA-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614075#comment-15614075 ]

ASF GitHub Bot commented on CARBONDATA-284:
-------------------------------------------

Github user QiangCai commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/208#discussion_r85463723

    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/internal/segment/Segment.java ---
    @@ -0,0 +1,94 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +package org.apache.carbondata.hadoop.internal.segment;
    +
    +import java.io.IOException;
    +import java.util.ArrayList;
    +import java.util.List;
    +
    +import org.apache.carbondata.scan.filter.resolver.FilterResolverIntf;
    +import org.apache.hadoop.fs.FileStatus;
    +import org.apache.hadoop.fs.FileSystem;
    +import org.apache.hadoop.fs.Path;
    +import org.apache.hadoop.mapreduce.InputSplit;
    +import org.apache.hadoop.mapreduce.JobContext;
    +
    +/**
    + * Within a carbon table, each data load becomes one Segment, which stores all data files belong to this load in
    + * the segment folder.
    + */
    +public abstract class Segment {
    +
    +  protected String id;
    +
    +  /**
    +   * Path of the segment folder
    +   */
    +  private String path;
    +
    +  public Segment(String id, String path) {
    +    this.id = id;
    +    this.path = path;
    +  }
    +
    +  public String getId() {
    +    return id;
    +  }
    +
    +  public String getPath() {
    +    return path;
    +  }
    +
    +  /**
    +   * return all InputSplit of this segment, each file is a InputSplit
    +   * @param job job context
    +   * @return all InputSplit
    +   * @throws IOException
    +   */
    +  public List<InputSplit> getAllSplits(JobContext job) throws IOException {
    --- End diff --

    I suggest returning List<CarbonInputSplit> here.


> Abstracting Index and Segment interface
> ---------------------------------------
>
>                 Key: CARBONDATA-284
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-284
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: hadoop-integration
>    Affects Versions: 0.1.0-incubating
>            Reporter: Jacky Li
>             Fix For: 0.3.0-incubating
>
>
> This issue is intended to abstract the developer API and user API to achieve the following goals:
> Goal 1: Users can choose where to store index data. It can be stored in the
> processing framework's memory space (such as Spark driver memory) or in
> another service outside the processing framework (such as an
> independent database service, which can be shared across clients).
> Goal 2: Developers can add more indices of their choice to CarbonData files.
> Besides the B+ tree on a multi-dimensional key, which CarbonData currently supports,
> developers are free to add other indexing technologies to make certain
> workloads faster. These new indices should be added in a pluggable way.
> This Jira has been discussed on the mailing list:
> http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Abstracting-CarbonData-s-Index-Interface-td1587.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
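
The review suggestion above (return List<CarbonInputSplit> instead of List<InputSplit>) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: InputSplit and CarbonInputSplit are simplified stand-ins with no Hadoop dependency, not the real org.apache.hadoop.mapreduce.InputSplit or the project's CarbonInputSplit, and InMemorySegment, its file names, and the JobContext-free signature are hypothetical. It only shows why a concrete element type spares callers a cast.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TypedSplitsSketch {

  // Stand-in for Hadoop's InputSplit (assumption: simplified, no Hadoop on the classpath).
  static abstract class InputSplit {
    abstract long getLength();
  }

  // Hypothetical stand-in for CarbonInputSplit; the fields are illustrative only.
  static class CarbonInputSplit extends InputSplit {
    private final String segmentId;
    private final String filePath;
    private final long length;

    CarbonInputSplit(String segmentId, String filePath, long length) {
      this.segmentId = segmentId;
      this.filePath = filePath;
      this.length = length;
    }

    String getSegmentId() { return segmentId; }

    String getFilePath() { return filePath; }

    @Override
    long getLength() { return length; }
  }

  // Mirrors the shape of the Segment class in the diff, minus JobContext and IOException.
  static abstract class Segment {
    protected final String id;
    private final String path;

    Segment(String id, String path) {
      this.id = id;
      this.path = path;
    }

    public String getId() { return id; }

    public String getPath() { return path; }

    // Concrete element type: callers can read Carbon-specific fields without casting.
    abstract List<CarbonInputSplit> getAllSplits();
  }

  // Hypothetical segment backed by a fixed list of file names instead of a FileSystem listing.
  static class InMemorySegment extends Segment {
    private final List<String> files;

    InMemorySegment(String id, String path, List<String> files) {
      super(id, path);
      this.files = files;
    }

    @Override
    List<CarbonInputSplit> getAllSplits() {
      List<CarbonInputSplit> splits = new ArrayList<>();
      for (String file : files) {
        // One split per file, as the Javadoc in the diff describes; length is a placeholder.
        splits.add(new CarbonInputSplit(id, getPath() + "/" + file, 0L));
      }
      return splits;
    }
  }

  public static void main(String[] args) {
    Segment segment = new InMemorySegment(
        "0", "/store/t1/Segment_0", Arrays.asList("part-0.carbondata", "part-1.carbondata"));
    for (CarbonInputSplit split : segment.getAllSplits()) {
      // No cast needed to reach the Carbon-specific accessors.
      System.out.println(split.getSegmentId() + " -> " + split.getFilePath());
    }
    // Code that needs the base type can copy into a List<InputSplit>.
    List<InputSplit> generic = new ArrayList<InputSplit>(segment.getAllSplits());
    System.out.println(generic.size());
  }
}
```

One trade-off worth noting: in Java a List<CarbonInputSplit> is not a List<InputSplit>, so code that must hand back the base type has to copy the list, as the last lines of main do, or the method would have to be declared with a wildcard such as List<? extends InputSplit>.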