[GitHub] carbondata pull request #1297: [WIP] Added a value based compression for dec...

classic Classic list List threaded Threaded
43 messages Options
123
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/114/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/118/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1297
 
    Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder1/15/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1297
 
    Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder1/17/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1297
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/741/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1297
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/743/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1297
 
    retest this please


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r138915973
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/DecimalColumnPage.java ---
    @@ -0,0 +1,96 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.core.datastore.page;
    +
    +import org.apache.carbondata.core.datastore.TableSpec;
    +import org.apache.carbondata.core.metadata.datatype.DataType;
    +import org.apache.carbondata.core.metadata.datatype.DecimalConverterFactory;
    +
    +/**
    + * Represent a columnar data in one page for one column of decimal data type
    + */
    +public abstract class DecimalColumnPage extends VarLengthColumnPageBase {
    +
    +  /**
    +   * decimal converter instance
    +   */
    +  DecimalConverterFactory.DecimalConverter decimalConverter;
    +
    +  DecimalColumnPage(TableSpec.ColumnSpec columnSpec, DataType dataType, int pageSize) {
    +    super(columnSpec, dataType, pageSize);
    +    decimalConverter = DecimalConverterFactory.INSTANCE
    +        .getDecimalConverter(columnSpec.getPrecision(), columnSpec.getScale());
    +  }
    +
    +  public DecimalConverterFactory.DecimalConverter getDecimalConverter() {
    +    return decimalConverter;
    +  }
    +
    +  @Override public byte[] getBytePage() {
    +    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +  }
    +
    +  @Override public short[] getShortPage() {
    +    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +  }
    +
    +  @Override public byte[] getShortIntPage() {
    +    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +  }
    +
    +  @Override public int[] getIntPage() {
    +    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +  }
    +
    +  @Override public long[] getLongPage() {
    +    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +  }
    +
    +  @Override public float[] getFloatPage() {
    +    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +  }
    +
    +  @Override public double[] getDoublePage() {
    --- End diff --
   
    Can you move all `@Override` to previous line


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r138916072
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/LazyColumnPage.java ---
    @@ -91,9 +93,24 @@ public float getFloat(int rowId) {
         throw new UnsupportedOperationException("internal error");
       }
     
    -  @Override
    -  public BigDecimal getDecimal(int rowId) {
    -    return columnPage.getDecimal(rowId);
    +  @Override public BigDecimal getDecimal(int rowId) {
    --- End diff --
   
    Can you move all @Override to previous line


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/1297
 
    @jackylk ..handled review comments


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r138931778
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeDecimalColumnPage.java ---
    @@ -0,0 +1,274 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.core.datastore.page;
    +
    +import java.math.BigDecimal;
    +
    +import org.apache.carbondata.core.datastore.TableSpec;
    +import org.apache.carbondata.core.memory.CarbonUnsafe;
    +import org.apache.carbondata.core.memory.MemoryException;
    +import org.apache.carbondata.core.memory.UnsafeMemoryManager;
    +import org.apache.carbondata.core.metadata.datatype.DataType;
    +import org.apache.carbondata.core.util.ByteUtil;
    +
    +/**
    + * Represents a columnar data for decimal data type column for one page
    + */
    +public class UnsafeDecimalColumnPage extends DecimalColumnPage {
    +
    +  UnsafeDecimalColumnPage(TableSpec.ColumnSpec columnSpec, DataType dataType, int pageSize)
    +      throws MemoryException {
    +    super(columnSpec, dataType, pageSize);
    +    capacity = (int) (pageSize * DEFAULT_ROW_SIZE * FACTOR);
    +    initMemory();
    +  }
    +
    +  UnsafeDecimalColumnPage(TableSpec.ColumnSpec columnSpec, DataType dataType, int pageSize,
    +      int capacity) throws MemoryException {
    +    super(columnSpec, dataType, pageSize);
    +    this.capacity = capacity;
    +    initMemory();
    +  }
    +
    +  private void initMemory() throws MemoryException {
    +    switch (dataType) {
    +      case BYTE:
    +      case SHORT:
    +      case INT:
    +      case LONG:
    +        int size = pageSize << dataType.getSizeBits();
    +        memoryBlock = UnsafeMemoryManager.allocateMemoryWithRetry(taskId, size);
    +        baseAddress = memoryBlock.getBaseObject();
    +        baseOffset = memoryBlock.getBaseOffset();
    +        break;
    +      case SHORT_INT:
    +        size = pageSize * 3;
    +        memoryBlock = UnsafeMemoryManager.allocateMemoryWithRetry(taskId, size);
    +        baseAddress = memoryBlock.getBaseObject();
    +        baseOffset = memoryBlock.getBaseOffset();
    +        break;
    +      case DECIMAL:
    +      case STRING:
    --- End diff --
   
    It can not be String, right?


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r138936085
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/VarLengthColumnPageBase.java ---
    @@ -22,21 +22,48 @@
     import java.util.List;
     
     import org.apache.carbondata.core.datastore.TableSpec;
    +import org.apache.carbondata.core.memory.CarbonUnsafe;
    +import org.apache.carbondata.core.memory.MemoryBlock;
     import org.apache.carbondata.core.memory.MemoryException;
    +import org.apache.carbondata.core.memory.UnsafeMemoryManager;
     import org.apache.carbondata.core.metadata.datatype.DataType;
     import org.apache.carbondata.core.metadata.datatype.DecimalConverterFactory;
     import org.apache.carbondata.core.util.ByteUtil;
    +import org.apache.carbondata.core.util.ThreadLocalTaskInfo;
     
    +import static org.apache.carbondata.core.metadata.datatype.DataType.BYTE;
     import static org.apache.carbondata.core.metadata.datatype.DataType.DECIMAL;
     
     public abstract class VarLengthColumnPageBase extends ColumnPage {
     
    +  static final int byteBits = BYTE.getSizeBits();
    +  static final int shortBits = DataType.SHORT.getSizeBits();
    +  static final int intBits = DataType.INT.getSizeBits();
    +  static final int longBits = DataType.LONG.getSizeBits();
    +  // default size for each row, grows as needed
    +  static final int DEFAULT_ROW_SIZE = 8;
    +
    +  static final double FACTOR = 1.25;
    +
    +  final long taskId = ThreadLocalTaskInfo.getCarbonTaskInfo().getTaskId();
    +
    +  // memory allocated by Unsafe
    +  MemoryBlock memoryBlock;
    --- End diff --
   
    This is for Unsafe column page only, right


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1297
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/775/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r138941549
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/VarLengthColumnPageBase.java ---
    @@ -22,21 +22,48 @@
     import java.util.List;
     
     import org.apache.carbondata.core.datastore.TableSpec;
    +import org.apache.carbondata.core.memory.CarbonUnsafe;
    +import org.apache.carbondata.core.memory.MemoryBlock;
     import org.apache.carbondata.core.memory.MemoryException;
    +import org.apache.carbondata.core.memory.UnsafeMemoryManager;
     import org.apache.carbondata.core.metadata.datatype.DataType;
     import org.apache.carbondata.core.metadata.datatype.DecimalConverterFactory;
     import org.apache.carbondata.core.util.ByteUtil;
    +import org.apache.carbondata.core.util.ThreadLocalTaskInfo;
     
    +import static org.apache.carbondata.core.metadata.datatype.DataType.BYTE;
     import static org.apache.carbondata.core.metadata.datatype.DataType.DECIMAL;
     
     public abstract class VarLengthColumnPageBase extends ColumnPage {
     
    +  static final int byteBits = BYTE.getSizeBits();
    +  static final int shortBits = DataType.SHORT.getSizeBits();
    +  static final int intBits = DataType.INT.getSizeBits();
    +  static final int longBits = DataType.LONG.getSizeBits();
    +  // default size for each row, grows as needed
    +  static final int DEFAULT_ROW_SIZE = 8;
    +
    +  static final double FACTOR = 1.25;
    +
    +  final long taskId = ThreadLocalTaskInfo.getCarbonTaskInfo().getTaskId();
    +
    +  // memory allocated by Unsafe
    +  MemoryBlock memoryBlock;
    --- End diff --
   
    SafeDecimalColumnPage  will handle both fixed and variable implementation of decimal data based on precsion. So it is also required to be extended from VarLengthColumnPageBase


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r138941633
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/VarLengthColumnPageBase.java ---
    @@ -22,21 +22,48 @@
     import java.util.List;
     
     import org.apache.carbondata.core.datastore.TableSpec;
    +import org.apache.carbondata.core.memory.CarbonUnsafe;
    +import org.apache.carbondata.core.memory.MemoryBlock;
     import org.apache.carbondata.core.memory.MemoryException;
    +import org.apache.carbondata.core.memory.UnsafeMemoryManager;
     import org.apache.carbondata.core.metadata.datatype.DataType;
     import org.apache.carbondata.core.metadata.datatype.DecimalConverterFactory;
     import org.apache.carbondata.core.util.ByteUtil;
    +import org.apache.carbondata.core.util.ThreadLocalTaskInfo;
     
    +import static org.apache.carbondata.core.metadata.datatype.DataType.BYTE;
     import static org.apache.carbondata.core.metadata.datatype.DataType.DECIMAL;
     
     public abstract class VarLengthColumnPageBase extends ColumnPage {
     
    +  static final int byteBits = BYTE.getSizeBits();
    +  static final int shortBits = DataType.SHORT.getSizeBits();
    +  static final int intBits = DataType.INT.getSizeBits();
    +  static final int longBits = DataType.LONG.getSizeBits();
    +  // default size for each row, grows as needed
    +  static final int DEFAULT_ROW_SIZE = 8;
    +
    +  static final double FACTOR = 1.25;
    +
    +  final long taskId = ThreadLocalTaskInfo.getCarbonTaskInfo().getTaskId();
    +
    +  // memory allocated by Unsafe
    +  MemoryBlock memoryBlock;
    --- End diff --
   
    SafeDecimalColumnPage  will handle both fixed and variable implementation of decimal data based on precsion. So it is also required to be extended from VarLengthColumnPageBase


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r138941702
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeDecimalColumnPage.java ---
    @@ -0,0 +1,274 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.core.datastore.page;
    +
    +import java.math.BigDecimal;
    +
    +import org.apache.carbondata.core.datastore.TableSpec;
    +import org.apache.carbondata.core.memory.CarbonUnsafe;
    +import org.apache.carbondata.core.memory.MemoryException;
    +import org.apache.carbondata.core.memory.UnsafeMemoryManager;
    +import org.apache.carbondata.core.metadata.datatype.DataType;
    +import org.apache.carbondata.core.util.ByteUtil;
    +
    +/**
    + * Represents a columnar data for decimal data type column for one page
    + */
    +public class UnsafeDecimalColumnPage extends DecimalColumnPage {
    +
    +  UnsafeDecimalColumnPage(TableSpec.ColumnSpec columnSpec, DataType dataType, int pageSize)
    +      throws MemoryException {
    +    super(columnSpec, dataType, pageSize);
    +    capacity = (int) (pageSize * DEFAULT_ROW_SIZE * FACTOR);
    +    initMemory();
    +  }
    +
    +  UnsafeDecimalColumnPage(TableSpec.ColumnSpec columnSpec, DataType dataType, int pageSize,
    +      int capacity) throws MemoryException {
    +    super(columnSpec, dataType, pageSize);
    +    this.capacity = capacity;
    +    initMemory();
    +  }
    +
    +  private void initMemory() throws MemoryException {
    +    switch (dataType) {
    +      case BYTE:
    +      case SHORT:
    +      case INT:
    +      case LONG:
    +        int size = pageSize << dataType.getSizeBits();
    +        memoryBlock = UnsafeMemoryManager.allocateMemoryWithRetry(taskId, size);
    +        baseAddress = memoryBlock.getBaseObject();
    +        baseOffset = memoryBlock.getBaseOffset();
    +        break;
    +      case SHORT_INT:
    +        size = pageSize * 3;
    +        memoryBlock = UnsafeMemoryManager.allocateMemoryWithRetry(taskId, size);
    +        baseAddress = memoryBlock.getBaseObject();
    +        baseOffset = memoryBlock.getBaseOffset();
    +        break;
    +      case DECIMAL:
    +      case STRING:
    --- End diff --
   
    yes it will not be string..I will remove


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user QACarbonData commented on the issue:

    https://github.com/apache/carbondata/pull/1297
 
    Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/27/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1297
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/779/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/164/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1297
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/793/



---
123