[GitHub] carbondata pull request #2209: [WIP][Non Transactional Table]Avro Record Com...

classic Classic list List threaded Threaded
107 messages Options
123456
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184918753
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/datatypes/ArrayDataType.java ---
    @@ -183,7 +187,27 @@ public void parseAndBitPack(ByteBuffer byteArrayInput, DataOutputStream dataOutp
         for (int i = 0; i < dataLength; i++) {
           children.parseAndBitPack(byteArrayInput, dataOutputStream, generator);
         }
    +  }
    +
    +  @Override
    +  public int parseComplexValue(ByteBuffer byteArrayInput, DataOutputStream dataOutputStream,
    +      KeyGenerator[] generator, Boolean[][] complexDictionaryIndentification, int startOffset)
    +      throws IOException, KeyGenException {
    +    int dataLength = byteArrayInput.getInt();
    +    startOffset += Integer.SIZE / Byte.SIZE;
     
    +    dataOutputStream.writeInt(dataLength);
    +    if (children instanceof PrimitiveDataType) {
    +      if (children.getIsColumnDictionary()) {
    +        dataOutputStream.writeInt(generator[children.getSurrogateIndex()].getKeySizeInBytes());
    +        startOffset += Integer.SIZE / Byte.SIZE;
    --- End diff --
   
    No need to calculate every time , just use 4 and add comment


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184918803
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java ---
    @@ -245,6 +312,28 @@ public void parseAndBitPack(ByteBuffer byteArrayInput, DataOutputStream dataOutp
         dataOutputStream.write(v);
       }
     
    +  @Override
    +  public int parseComplexValue(ByteBuffer byteArrayInput, DataOutputStream dataOutputStream,
    +      KeyGenerator[] generator, Boolean[][] complexDictionaryIndentification, int startOffset)
    --- End diff --
   
    Boolean[][] complexDictionaryIndentification is not required as each Primitive type field knows whether it is of dictionary/ no dictionary type.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184919017
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java ---
    @@ -245,6 +312,28 @@ public void parseAndBitPack(ByteBuffer byteArrayInput, DataOutputStream dataOutp
         dataOutputStream.write(v);
       }
     
    +  @Override
    +  public int parseComplexValue(ByteBuffer byteArrayInput, DataOutputStream dataOutputStream,
    +      KeyGenerator[] generator, Boolean[][] complexDictionaryIndentification, int startOffset)
    +      throws IOException, KeyGenException {
    +    if (!this.isDictionary) {
    +      int sizeOfData = byteArrayInput.getInt();
    +      startOffset += Integer.SIZE / Byte.SIZE;
    --- End diff --
   
    No need to calculate every time , just use 4 and add comment , Do it in all places


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184919245
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java ---
    @@ -609,5 +616,13 @@ public short getWritingCoresCount() {
       public DataMapWriterListener getDataMapWriterlistener() {
         return dataMapWriterlistener;
       }
    +
    +  public Boolean[][] getComplexDictionaryFields() {
    --- End diff --
   
    change this to boolean[][] getComplexDictionaryFields()


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184919308
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java ---
    @@ -245,6 +312,28 @@ public void parseAndBitPack(ByteBuffer byteArrayInput, DataOutputStream dataOutp
         dataOutputStream.write(v);
       }
     
    +  @Override
    +  public int parseComplexValue(ByteBuffer byteArrayInput, DataOutputStream dataOutputStream,
    +      KeyGenerator[] generator, Boolean[][] complexDictionaryIndentification, int startOffset)
    +      throws IOException, KeyGenException {
    +    if (!this.isDictionary) {
    +      int sizeOfData = byteArrayInput.getInt();
    +      startOffset += Integer.SIZE / Byte.SIZE;
    +      dataOutputStream.writeInt(sizeOfData);
    +      byte[] bb = new byte[sizeOfData];
    +      byteArrayInput.get(bb, 0, sizeOfData);
    +      dataOutputStream.write(bb);
    +      startOffset += sizeOfData;
    +    } else {
    +      int data = byteArrayInput.getInt();
    +      startOffset += Integer.SIZE / Byte.SIZE;
    --- End diff --
   
    I think this is not required, if it is only writeoffset


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184919435
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonDataProcessorUtil.java ---
    @@ -267,14 +267,24 @@ private static String getComplexTypeString(DataField[] dataFields) {
         return dimString.toString();
       }
     
    +  private static String isDictionaryType(CarbonDimension dimension) {
    --- End diff --
   
    why this method return type is string??? Change this to string


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184919569
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java ---
    @@ -245,6 +312,28 @@ public void parseAndBitPack(ByteBuffer byteArrayInput, DataOutputStream dataOutp
         dataOutputStream.write(v);
       }
     
    +  @Override
    +  public int parseComplexValue(ByteBuffer byteArrayInput, DataOutputStream dataOutputStream,
    +      KeyGenerator[] generator, Boolean[][] complexDictionaryIndentification, int startOffset)
    --- End diff --
   
    What is the use of `complexDictionaryIndentification` if not used


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184919620
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/CarbonDataLoadConfiguration.java ---
    @@ -348,6 +356,33 @@ public void setCardinalityFinder(DictionaryCardinalityFinder cardinalityFinder)
         return complexKeyGenerators;
       }
     
    +  public Boolean[][] createComplexDictionaryFieldIdentification() {
    --- End diff --
   
    If not required, please remove this method


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184919901
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/TablePage.java ---
    @@ -222,8 +222,9 @@ private void addComplexColumn(int index, int rowId, byte[] complexColumns) {
           ByteBuffer byteArrayInput = ByteBuffer.wrap(complexColumns);
           ByteArrayOutputStream byteArrayOutput = new ByteArrayOutputStream();
           DataOutputStream dataOutputStream = new DataOutputStream(byteArrayOutput);
    -      complexDataType.parseAndBitPack(byteArrayInput, dataOutputStream,
    -          model.getComplexDimensionKeyGenerator());
    +      int startOffset = 0;
    --- End diff --
   
    What is the use of `startOffset` when not used?


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184920004
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/datatypes/GenericDataType.java ---
    @@ -80,7 +82,13 @@ void writeByteArray(T input, DataOutputStream dataOutputStream)
        * @throws KeyGenException
        */
       void parseAndBitPack(ByteBuffer byteArrayInput, DataOutputStream dataOutputStream,
    -      KeyGenerator[] generator) throws IOException, KeyGenException;
    +      KeyGenerator[] generator)
    +      throws IOException, KeyGenException;
    +
    +
    +  int parseComplexValue(ByteBuffer byteArrayInput, DataOutputStream dataOutputStream,
    +      KeyGenerator[] generator, Boolean[][] complexDictionaryIndentification, int startOffset)
    +      throws IOException, KeyGenException;
    --- End diff --
   
    Remove `complexDictionaryIndentification` and `startOffset` if not used


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2209: [CARBONDATA-2388][SDK]Avro Record Complex Type Imple...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2209
 
    @sounakr Please add testcases for complextype also not only in AVRO.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184927388
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/scan/complextypes/PrimitiveQueryType.java ---
    @@ -53,6 +55,21 @@ public PrimitiveQueryType(String name, String parentname, int blockIndex,
         this.name = name;
         this.parentname = parentname;
         this.isDirectDictionary = isDirectDictionary;
    +    this.isDictionary = true;
    +  }
    +
    +
    +  public PrimitiveQueryType(String name, String parentname, int blockIndex,
    --- End diff --
   
    Done. But we need the parameter as dictionary field is always set.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184927567
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/SchemaReader.java ---
    @@ -81,10 +81,13 @@ public static TableInfo getTableInfo(AbsoluteTableIdentifier identifier)
       }
     
       public static TableInfo inferSchema(AbsoluteTableIdentifier identifier,
    -      boolean isCarbonFileProvider) throws IOException {
    +      boolean isCarbonFileProvider, TableInfo tableInfoFromCache) throws IOException {
         // This routine is going to infer schema from the carbondata file footer
         // Convert the ColumnSchema -> TableSchema -> TableInfo.
         // Return the TableInfo.
    +    if (tableInfoFromCache != null) {
    --- End diff --
   
    Done


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184928085
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/column/ColumnSchema.java ---
    @@ -43,6 +43,9 @@
        */
       private static final long serialVersionUID = 7676766554874863763L;
     
    +  public void columnSchema() {
    --- End diff --
   
    Removed. No Use.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184928356
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/scan/collector/impl/DictionaryBasedResultCollector.java ---
    @@ -134,7 +134,13 @@ void fillDimensionData(BlockletScannedResult scannedResult, int[] surrogateResul
               row[order[i]] =
                   DataTypeUtil.getDataBasedOnDataType(scannedResult.getBlockletId(), DataTypes.STRING);
             }
    -      } else {
    +      } else if (complexDataTypeArray[i]) {
    +        // Complex Type With No Dictionary Encoding.
    +        row[order[i]] = comlexDimensionInfoMap.get(queryDimensions[i].getDimension().getOrdinal())
    +            .getDataBasedOnDataTypeFromNoDictionary(
    +                ByteBuffer.wrap(complexTypeKeyArray[complexTypeColumnIndex++]));
    +      } else
    +      {
    --- End diff --
   
    Done


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184929215
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/scan/complextypes/PrimitiveQueryType.java ---
    @@ -53,6 +55,21 @@ public PrimitiveQueryType(String name, String parentname, int blockIndex,
         this.name = name;
         this.parentname = parentname;
         this.isDirectDictionary = isDirectDictionary;
    +    this.isDictionary = true;
    +  }
    +
    +
    +  public PrimitiveQueryType(String name, String parentname, int blockIndex,
    --- End diff --
   
    Moved into a single constructor. But we cannot check is the column is a dictionary or nodictionary through the existing variable dictionary as it is always going to get filled up by default.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184956997
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/scan/complextypes/PrimitiveQueryType.java ---
    @@ -53,6 +55,21 @@ public PrimitiveQueryType(String name, String parentname, int blockIndex,
         this.name = name;
         this.parentname = parentname;
         this.isDirectDictionary = isDirectDictionary;
    +    this.isDictionary = true;
    +  }
    +
    +
    +  public PrimitiveQueryType(String name, String parentname, int blockIndex,
    +      org.apache.carbondata.core.metadata.datatype.DataType dataType, int keySize,
    +      Dictionary dictionary, boolean isDirectDictionary, boolean isDictionary) {
    --- End diff --
   
    Done


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184961725
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestNonTransactionalCarbonTable.scala ---
    @@ -175,9 +175,31 @@ class TestNonTransactionalCarbonTable extends QueryTest with BeforeAndAfterAll {
     
       test("test create External Table with Schema with partition, should ignore schema and partition")
       {
    -    buildTestDataSingleFile()
    +    sql("DROP TABLE IF EXISTS sdkOutputTable")
    +
    +    // with partition
    +    sql("CREATE EXTERNAL TABLE sdkOutputTable(name string) PARTITIONED BY (age int) STORED BY 'carbondata' LOCATION '/home/root1/avro/files' ")
    +//
    +//    checkAnswer(sql("select * from sdkOutputTable"), Seq(Row("robot0", 0, 0.0),
    +//      Row("robot1", 1, 0.5),
    +//      Row("robot2", 2, 1.0)))
    --- End diff --
   
    Done


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184961740
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/scan/complextypes/PrimitiveQueryType.java ---
    @@ -108,4 +128,14 @@ public PrimitiveQueryType(String name, String parentname, int blockIndex,
         }
         return actualData;
       }
    +
    +  @Override public Object getDataBasedOnDataTypeFromNoDictionary(ByteBuffer data) {
    --- End diff --
   
    Done


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2209: [CARBONDATA-2388][SDK]Avro Record Complex Typ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2209#discussion_r184962348
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/datatypes/ArrayDataType.java ---
    @@ -171,9 +175,9 @@ public void fillCardinality(List<Integer> dimCardWithComplex) {
       /**
        * parse byte array and bit pack
        */
    -  @Override
    -  public void parseAndBitPack(ByteBuffer byteArrayInput, DataOutputStream dataOutputStream,
    -      KeyGenerator[] generator) throws IOException, KeyGenException {
    +  @Override public void parseAndBitPack(ByteBuffer byteArrayInput,
    --- End diff --
   
    Done


---
123456