Why is building the ChunkRowIterator object slow in the Presto plugin of CarbonData?


rui qin
Hi all,
    Based on TPC-H, I have tested some SQL queries with the Presto plugin of CarbonData, but the results are not very good. I found that building the ChunkRowIterator object for each split is time-consuming (more than 2 seconds per split).
Are there any suggestions for optimization?

Issue code (org.apache.carbondata.presto.CarbondataRecordSet):

      public RecordCursor cursor() {
              ...
              // Executing the query model and wrapping the batched results in a
              // ChunkRowIterator is where the per-split cost shows up.
              CarbonIterator<Object[]> carbonIterator =
                  new ChunkRowIterator((CarbonIterator<BatchResult>) queryExecutor.execute(queryModel));
              ...
      }
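
For reference, one way to confirm where the time goes is to instrument the call directly (an illustrative patch, not part of the plugin; the construction cost likely covers both executing the query model and fetching the first result batch, which the ChunkRowIterator constructor triggers):

      public RecordCursor cursor() {
              ...
              long start = System.nanoTime();
              CarbonIterator<Object[]> carbonIterator =
                  new ChunkRowIterator((CarbonIterator<BatchResult>) queryExecutor.execute(queryModel));
              // Illustrative logging only: reports per-split time for execute()
              // plus the first batch fetch, in milliseconds.
              System.out.println("split took " + (System.nanoTime() - start) / 1_000_000 + " ms");
              ...
      }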

Re: Why is building the ChunkRowIterator object slow in the Presto plugin of CarbonData?

bhavya411
Can you please let me know the data size you are using to test and how you loaded the data into the Carbon table? In the code above, the query model is created and the results are returned in chunks, so with those details I will try to see where it is going wrong. Also, can you please add the properties below to your carbon.properties and share the statistics?

"carbon.enable.vector.reader", "true"
"enable.unsafe.sort", "true"
"enable.query.statistics", "true"



Re: Why is building the ChunkRowIterator object slow in the Presto plugin of CarbonData?

rui qin
Hi,
   The data size is 1024 MB (the default). I load data into the Carbon table through spark-shell. The Presto plugin of Carbon supports only the "carbondata-store" property, so how do I add the properties you listed?
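
For reference, the catalog file for the connector only exposes that one store property, along these lines (file location and store path are illustrative, assuming the usual etc/catalog layout):

      # etc/catalog/carbondata.properties
      connector.name=carbondata
      carbondata-store=hdfs://<namenode>:<port>/user/hive/carbon.store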

Re: Why is building the ChunkRowIterator object slow in the Presto plugin of CarbonData?

Liang Chen
Hi

In spark-shell, you can use the script below:

import org.apache.carbondata.core.util.CarbonProperties
import org.apache.carbondata.core.constants.CarbonCommonConstants

// Enable the vectorized reader and unsafe/off-heap sort before loading or querying
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_VECTOR_READER, "true")
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_OFFHEAP_SORT, "true")
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_UNSAFE_SORT, "true")
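
If you also want the query statistics property that bhavya411 mentioned, the same pattern should work (assuming ENABLE_QUERY_STATISTICS is present in your CarbonData version):

CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_QUERY_STATISTICS, "true")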

Regards
Liang

Re: Why is building the ChunkRowIterator object slow in the Presto plugin of CarbonData?

bhavya411
What configuration properties are you using for Presto, and which version are you testing with? We are using the configuration below for Presto:

*Master*
*config.properties*
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8086
query.max-memory=5GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://<ip-address>:8086

*jvm.config*
-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p

*Slave 1*
*config.properties*
coordinator=false
http-server.http.port=8086
query.max-memory=5GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://<ip-address>:8086

*jvm.config*
-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p

*Slave 2*
*config.properties*
coordinator=false
http-server.http.port=8086
query.max-memory=5GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://<ip-address>:8086

*jvm.config*
-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p



Regards
Bhavya
