[ https://issues.apache.org/jira/browse/CARBONDATA-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15612029#comment-15612029 ]

ASF GitHub Bot commented on CARBONDATA-284:
-------------------------------------------

Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/208#discussion_r85346673

    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/internal/segment/impl/IndexedSegment.java ---
    @@ -0,0 +1,73 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +package org.apache.carbondata.hadoop.internal.segment.impl;
    +
    +import java.io.IOException;
    +import java.util.LinkedList;
    +import java.util.List;
    +
    +import org.apache.carbondata.hadoop.CarbonInputSplit;
    +import org.apache.carbondata.hadoop.internal.index.Block;
    +import org.apache.carbondata.hadoop.internal.segment.Segment;
    +import org.apache.carbondata.hadoop.internal.index.Index;
    +import org.apache.carbondata.hadoop.internal.index.IndexLoader;
    +import org.apache.carbondata.scan.filter.resolver.FilterResolverIntf;
    +import org.apache.hadoop.mapreduce.InputSplit;
    +import org.apache.hadoop.mapreduce.JobContext;
    +
    +/**
    + * This segment is backed by index, thus getSplits can use the index to do file pruning.
    + */
    +public class IndexedSegment extends Segment {
    +
    +  private IndexLoader loader;
    +
    +  public IndexedSegment(String name, String path, IndexLoader loader) {
    +    super(name, path);
    +    this.loader = loader;
    +  }
    +
    +  @Override
    +  public List<InputSplit> getSplits(JobContext job, FilterResolverIntf filterResolver)
    +      throws IOException {
    +    // do as following
    +    // 1. create the index or get from cache by the filter name in the configuration
    +    // 2. filter by index to get the filtered block
    +    // 3. create input split from filtered block
    +
    +    List<InputSplit> output = new LinkedList<>();
    +    Index index = loader.load(job.getConfiguration());
    --- End diff --

    Is it required to load the index every time? I guess we are just creating an instance of the index here, so why not use a factory?

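    One way to read the suggestion above, as a rough sketch only: wrap the loader so that the index for a given segment is built once and reused on later getSplits calls. The Index and IndexLoader types below are simplified, hypothetical stand-ins for the ones in the diff (the real loader takes a Hadoop Configuration, not a String), and CachingIndexLoader is an invented name that is not part of the pull request.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class CachedIndexSketch {

  /** Hypothetical stand-in for the Index type in the diff. */
  interface Index {
    String name();
  }

  /** Hypothetical stand-in for IndexLoader; keyed by segment path for simplicity. */
  interface IndexLoader {
    Index load(String segmentPath);
  }

  /** Assumed wrapper: delegates to the real loader at most once per segment path. */
  static final class CachingIndexLoader implements IndexLoader {
    private final IndexLoader delegate;
    private final Map<String, Index> cache = new ConcurrentHashMap<>();

    CachingIndexLoader(IndexLoader delegate) {
      this.delegate = delegate;
    }

    @Override
    public Index load(String segmentPath) {
      // computeIfAbsent invokes the delegate only when the path is not cached yet.
      return cache.computeIfAbsent(segmentPath, delegate::load);
    }
  }

  public static void main(String[] args) {
    AtomicInteger loads = new AtomicInteger();

    // Simulated expensive loader that counts how often it is really invoked.
    IndexLoader expensive = path -> {
      loads.incrementAndGet();
      return () -> "index-for-" + path;
    };

    IndexLoader cached = new CachingIndexLoader(expensive);
    Index first = cached.load("/store/segment_0");
    Index second = cached.load("/store/segment_0");

    System.out.println(first == second); // same cached instance is returned
    System.out.println(loads.get());     // underlying loader ran once
  }
}
```

    With a wrapper like this, IndexedSegment could keep calling loader.load(...) unchanged, and whether the index is rebuilt or served from a cache becomes a decision of whoever constructs the loader.
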
> Abstracting Index and Segment interface
> ---------------------------------------
>
>                 Key: CARBONDATA-284
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-284
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: hadoop-integration
>    Affects Versions: 0.1.0-incubating
>            Reporter: Jacky Li
>             Fix For: 0.3.0-incubating
>
>
> This issue is intended to abstract the developer API and user API to achieve the following goals:
> Goal 1: Users can choose where to store index data. It can be stored in the
> processing framework's memory space (such as Spark driver memory) or in
> another service outside of the processing framework (such as an
> independent database service, which can be shared across clients).
> Goal 2: Developers can add more indices of their choice to CarbonData files.
> Besides the B+ tree on a multi-dimensional key, which CarbonData currently supports,
> developers are free to add other indexing technologies to make certain
> workloads faster. These new indices should be added in a pluggable way.
> This Jira has been discussed on the mailing list:
> http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Abstracting-CarbonData-s-Index-Interface-td1587.html

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)