[GitHub] carbondata pull request #2265: Added Performance Optimization for Presto by ...


qiuchenjian-2
Github user bhavya411 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2265#discussion_r196799513
 
    --- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataPageSourceProvider.java ---
    @@ -129,23 +135,31 @@ private QueryModel createQueryModel(CarbondataSplit carbondataSplit,
           String carbonTablePath = carbonTable.getAbsoluteTableIdentifier().getTablePath();
     
           conf.set(CarbonTableInputFormat.INPUT_DIR, carbonTablePath);
    +      conf.set("query.id", queryId);
           JobConf jobConf = new JobConf(conf);
           CarbonTableInputFormat carbonTableInputFormat = createInputFormat(jobConf, carbonTable,
               PrestoFilterUtil.parseFilterExpression(carbondataSplit.getConstraints()),
               carbonProjection);
           TaskAttemptContextImpl hadoopAttemptContext =
               new TaskAttemptContextImpl(jobConf, new TaskAttemptID("", 1, TaskType.MAP, 0, 0));
    -      CarbonInputSplit carbonInputSplit =
    -          CarbonLocalInputSplit.convertSplit(carbondataSplit.getLocalInputSplit());
    +      CarbonMultiBlockSplit carbonInputSplit =
    +          CarbonLocalMultiBlockSplit.convertSplit(carbondataSplit.getLocalInputSplit());
           QueryModel queryModel =
               carbonTableInputFormat.createQueryModel(carbonInputSplit, hadoopAttemptContext);
    +      queryModel.setQueryId(queryId);
           queryModel.setVectorReader(true);
    +      queryModel.setStatisticsRecorder(
    +          CarbonTimeStatisticsFactory.createExecutorRecorder(queryModel.getQueryId()));
     
    +      /*
           List<CarbonInputSplit> splitList = new ArrayList<>(1);
    -      splitList.add(carbonInputSplit);
    -      List<TableBlockInfo> tableBlockInfoList = CarbonInputSplit.createBlocks(splitList);
    --- End diff --
   
    This has been fixed


---

Github user bhavya411 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2265#discussion_r196799725
 
    --- Diff: integration/presto/README.md ---
    @@ -113,6 +116,10 @@ Please follow the below steps to query carbondata in presto
       enable.unsafe.in.query.processing property by default is true in CarbonData system, the carbon.unsafe.working.memory.in.mb
       property defines the limit for Unsafe Memory usage in Mega Bytes, the default value is 512 MB.
       If your tables are big you can increase the unsafe memory, or disable unsafe via setting enable.unsafe.in.query.processing=false.
    +  
    +  If you do not want to use unsafe memory at all please set the below properties to false as well.
    --- End diff --
   
    This has been corrected
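
For readers following the README hunk above, here is a minimal sketch of the two settings it names. It assumes nothing beyond the two property names and the 512 MB default quoted in the diff. The class `UnsafeMemoryConfigSketch` and its `build` method are hypothetical helpers for illustration, not part of CarbonData or the Presto connector, and the further properties that the added README line refers to are not visible in the diff, so they are not covered here.

```java
import java.util.Properties;

// Hypothetical helper, not part of CarbonData: it only collects the two
// property names quoted in the README diff into a java.util.Properties object.
class UnsafeMemoryConfigSketch {

    // Property names exactly as they appear in the README text.
    static final String ENABLE_UNSAFE = "enable.unsafe.in.query.processing";
    static final String UNSAFE_WORKING_MEMORY_MB = "carbon.unsafe.working.memory.in.mb";

    // The README states 512 MB as the default unsafe working memory.
    static final int DEFAULT_WORKING_MEMORY_MB = 512;

    // Builds the property set for a given choice of unsafe processing and limit.
    static Properties build(boolean enableUnsafe, int workingMemoryMb) {
        if (workingMemoryMb <= 0) {
            throw new IllegalArgumentException("workingMemoryMb must be positive");
        }
        Properties props = new Properties();
        props.setProperty(ENABLE_UNSAFE, String.valueOf(enableUnsafe));
        props.setProperty(UNSAFE_WORKING_MEMORY_MB, String.valueOf(workingMemoryMb));
        return props;
    }

    public static void main(String[] args) {
        // Option 1 from the README: big tables, so raise the unsafe memory limit.
        // The value 2048 is an arbitrary example, not a recommendation.
        Properties bigTables = build(true, 2048);

        // Option 2 from the README: disable unsafe query processing.
        Properties noUnsafe = build(false, DEFAULT_WORKING_MEMORY_MB);

        for (Properties p : new Properties[] {bigTables, noUnsafe}) {
            System.out.println(ENABLE_UNSAFE + "=" + p.getProperty(ENABLE_UNSAFE));
            System.out.println(UNSAFE_WORKING_MEMORY_MB + "="
                + p.getProperty(UNSAFE_WORKING_MEMORY_MB));
        }
    }
}
```

The printed key=value pairs would then be placed wherever the deployment keeps its CarbonData properties; that location is an assumption about the setup and is not shown in the diff.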


---

Github user bhavya411 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2265#discussion_r196799800
 
    --- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataMetadata.java ---
    @@ -152,19 +176,20 @@ private ConnectorTableMetadata getTableMetadata(SchemaTableName schemaTableName)
     
           Type spiType = carbonDataType2SpiMapper(cs);
           columnHandles.put(cs.getColumnName(),
    -          new CarbondataColumnHandle(connectorId, cs.getColumnName(), spiType, column.getSchemaOrdinal(),
    -              column.getKeyOrdinal(), column.getColumnGroupOrdinal(), false, cs.getColumnGroupId(),
    -              cs.getColumnUniqueId(), cs.isUseInvertedIndex(), cs.getPrecision(), cs.getScale()));
    +          new CarbondataColumnHandle(connectorId, cs.getColumnName(), spiType,
    +              column.getSchemaOrdinal(), column.getKeyOrdinal(), column.getColumnGroupOrdinal(),
    +              false, cs.getColumnGroupId(), cs.getColumnUniqueId(), cs.isUseInvertedIndex(),
    +              cs.getPrecision(), cs.getScale()));
         }
     
         for (CarbonMeasure measure : cb.getMeasureByTableName(tableName)) {
           ColumnSchema cs = measure.getColumnSchema();
    -
           Type spiType = carbonDataType2SpiMapper(cs);
           columnHandles.put(cs.getColumnName(),
    -          new CarbondataColumnHandle(connectorId, cs.getColumnName(), spiType, cs.getSchemaOrdinal(),
    -              measure.getOrdinal(), cs.getColumnGroupId(), true, cs.getColumnGroupId(),
    -              cs.getColumnUniqueId(), cs.isUseInvertedIndex(), cs.getPrecision(), cs.getScale()));
    +          new CarbondataColumnHandle(connectorId, cs.getColumnName(), spiType,
    +              cs.getSchemaOrdinal(), measure.getOrdinal(), cs.getColumnGroupId(), true,
    +              cs.getColumnGroupId(), cs.getColumnUniqueId(), cs.isUseInvertedIndex(),
    +              cs.getPrecision(), cs.getScale()));
         }
     
         //should i cache it?
    --- End diff --
   
    Removed the comment


---

[GitHub] carbondata issue #2265: Added Performance Optimization for Presto by using M...

Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    Verified, looks good to me.


---

Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2265


---

Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5254/



---