Why is building the ChunkRowIterator object slow in the Presto plugin of CarbonData?


rui qin
Hi all,
    Based on TPC-H, I have tested some SQL queries with the Presto plugin of CarbonData, but the results are not very good. I found that building the ChunkRowIterator object for each split is time-consuming (more than 2 seconds per split).
Are there any suggestions for optimization?

Issue code (org.apache.carbondata.presto.CarbondataRecordSet):

      public RecordCursor cursor() {
              ...
              // Executing the query model and wrapping the batched results in a
              // ChunkRowIterator is where the per-split cost shows up.
              CarbonIterator<Object[]> carbonIterator =
                  new ChunkRowIterator((CarbonIterator<BatchResult>) queryExecutor.execute(queryModel));
              ...
      }
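
For reference, one way to confirm where the time goes is to instrument the call directly (an illustrative patch, not part of the plugin; the construction cost likely covers both executing the query model and fetching the first result batch, which the ChunkRowIterator constructor triggers):

      public RecordCursor cursor() {
              ...
              long start = System.nanoTime();
              CarbonIterator<Object[]> carbonIterator =
                  new ChunkRowIterator((CarbonIterator<BatchResult>) queryExecutor.execute(queryModel));
              // Illustrative logging only: reports per-split time for execute()
              // plus the first batch fetch, in milliseconds.
              System.out.println("split took " + (System.nanoTime() - start) / 1_000_000 + " ms");
              ...
      }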

Re: Why is building the ChunkRowIterator object slow in the Presto plugin of CarbonData?

bhavya411
Can you please let me know the data size you are using to test and how you loaded the data into the Carbon table? In the code above, the query model is created and the results are returned in chunks, so with those details I will try to see where it is going wrong. Also, can you please add the properties below to your carbon.properties and share the statistics?

"carbon.enable.vector.reader", "true"
"enable.unsafe.sort", "true"
"enable.query.statistics", "true"



Re: Why is building the ChunkRowIterator object slow in the Presto plugin of CarbonData?

rui qin
Hi,
   The data size is 1024 MB (the default). I load data into the Carbon table through spark-shell. The Presto plugin of Carbon supports only the "carbondata-store" property, so how do I add the properties you listed?
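
For reference, the catalog file for the connector only exposes that one store property, along these lines (file location and store path are illustrative, assuming the usual etc/catalog layout):

      # etc/catalog/carbondata.properties
      connector.name=carbondata
      carbondata-store=hdfs://<namenode>:<port>/user/hive/carbon.store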

Re: Why is building the ChunkRowIterator object slow in the Presto plugin of CarbonData?

Liang Chen
Hi

In spark-shell, you can use the script below:

import org.apache.carbondata.core.util.CarbonProperties
import org.apache.carbondata.core.constants.CarbonCommonConstants

// Enable the vectorized reader and unsafe/off-heap sort before loading or querying
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_VECTOR_READER, "true")
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_OFFHEAP_SORT, "true")
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_UNSAFE_SORT, "true")
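
If you also want the query statistics property that bhavya411 mentioned, the same pattern should work (assuming ENABLE_QUERY_STATISTICS is present in your CarbonData version):

CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_QUERY_STATISTICS, "true")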

Regards
Liang

Re: Why is building the ChunkRowIterator object slow in the Presto plugin of CarbonData?

bhavya411
What configuration properties are you using for Presto, and which version are you testing with? We are using the configuration below for Presto:

*Master*
*config.properties*
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8086
query.max-memory=5GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://<ip-address>:8086

*jvm.config*
-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p

*Slave 1*
*config.properties*
coordinator=false
http-server.http.port=8086
query.max-memory=5GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://<ip-address>:8086

*jvm.config*
-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p

*Slave 2*
*config.properties*
coordinator=false
http-server.http.port=8086
query.max-memory=5GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://<ip-address>:8086

*jvm.config*
-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p



Regards
Bhavya
