Fw:carbonthriftserver can not be load many times

Fw:carbonthriftserver can not be load many times

dylan






-------- Forwarding messages --------
From: "dylan" <[hidden email]>
Date: 2017-09-12 16:25:56
To: user <[hidden email]>
Subject: carbonthriftserver can not be load many times
Hello,
     When I use CarbonData, I follow these steps:
        1. Create a table and load data.
        2. Query through CarbonThriftServer: select * from table limit 1 (this works).
        3. Update the table.
        4. Query through CarbonThriftServer again: select * from table limit 1 (this fails). The error is:

       I know that CarbonThriftServer caches the carbonindex in a B-tree.
       When I update the table the index changes, but CarbonThriftServer does not notice the change,
       so I have to restart CarbonThriftServer every time. Have you run into this problem?
       Is this a design flaw, or is there a better way to solve it? Thanks!


 



 


Re: Fw:carbonthriftserver can not be load many times

ravipesala
Hi,

This is not the intended behavior of CarbonData, so it must be a bug.
Normally, after an update the cache is refreshed for the next query.
Please provide the following information:
1. The CarbonData and Spark versions you are using.
2. A test case that reproduces this issue.

Regards,
Ravindra.

On 12 September 2017 at 14:18, dylan <[hidden email]> wrote:

> [...]



--
Thanks & Regards,
Ravi
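
The expectation described above, that an update is picked up by the cache on the next query, can be illustrated with a small toy model. This is only a sketch and not CarbonData code: `Store`, `TableMeta` and `IndexCache` are hypothetical names, and the "last modified" counter stands in for whatever change marker the real store keeps. It contrasts a cache that revalidates its entry on every lookup with one that never does, which is the behavior reported in this thread.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the table metadata kept in the store.
final case class TableMeta(lastModified: Long, index: Vector[String])

// Hypothetical store; every write bumps a logical clock.
class Store {
  private val tables = mutable.Map.empty[String, TableMeta]
  private var clock = 0L
  def write(name: String, index: Vector[String]): Unit = {
    clock += 1
    tables(name) = TableMeta(clock, index)
  }
  def read(name: String): TableMeta = tables(name)
}

// Cache that, when `validate` is true, reloads an entry whose
// timestamp no longer matches the one reported by the store.
class IndexCache(store: Store, validate: Boolean) {
  private val cached = mutable.Map.empty[String, TableMeta]
  def lookup(name: String): Vector[String] = {
    val fresh = store.read(name)
    val entry = cached.get(name) match {
      case Some(hit) if !validate || hit.lastModified == fresh.lastModified => hit
      case _ =>
        cached(name) = fresh // first access or stale entry: reload
        fresh
    }
    entry.index
  }
}

object CacheRefreshDemo {
  // Returns what each cache serves after the table has been rewritten.
  def run(): (Vector[String], Vector[String]) = {
    val store = new Store
    store.write("test_table", Vector("segment_0"))
    val refreshing = new IndexCache(store, validate = true)
    val stale = new IndexCache(store, validate = false)
    refreshing.lookup("test_table") // warm both caches
    stale.lookup("test_table")
    store.write("test_table", Vector("segment_0_new")) // simulated update
    (refreshing.lookup("test_table"), stale.lookup("test_table"))
  }

  def main(args: Array[String]): Unit = {
    val (validated, unvalidated) = run()
    println(s"validating cache:     $validated")
    println(s"non-validating cache: $unvalidated")
  }
}
```

In this model the validating cache serves the new index after the update, while the non-validating cache keeps serving the old one until the process is restarted.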

Re: Fw:carbonthriftserver can not be load many times

dylan

Hello ravipesala,
    Thanks for your reply.
    I am using CarbonData 1.1.0 and Spark 1.6.0.
    I reproduced the issue again by following the official quick-start guide:
    1. Create a table
cc.sql("create table IF NOT EXISTS carbondb.test_table(id string,name String,city String,age int) stored by 'carbondata'")

    2. Load data into the table
cc.sql("load data inpath 'hdfs://nameservice1/user/zz/sample.csv' into table carbondb.test_table")

    3. Start CarbonThriftServer
/home/zz/spark-1.6.0-bin-hadoop2.6/bin/spark-submit --master local[*] \
  --driver-java-options="-Dcarbon.properties.filepath=/home/zz/spark-1.6.0-bin-hadoop2.6/conf/carbon.properties" \
  --executor-memory 4G --driver-memory 2g \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf "spark.sql.shuffle.partitions=3" --conf spark.speculation=true \
  --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
  /home/zz/spark-1.6.0-bin-hadoop2.6/carbonlib/carbondata_2.10-1.1.0-shade-hadoop2.2.0.jar \
  hdfs://nameservice1/user/zz/rp_carbon_store

   4. Connect to the CarbonData Thrift Server using Beeline

         <http://chuantu.biz/t6/47/1505284993x2890202558.jpg>

   5. Drop the table
cc.sql("drop table carbondb.test_table")

   6. Recreate the table and load the data
cc.sql("create table IF NOT EXISTS carbondb.test_table(id string,name String,city String,age int) stored by 'carbondata'")
cc.sql("load data inpath 'hdfs://nameservice1/user/zz/sample.csv' into table carbondb.test_table")

   7. Select the data using Beeline
    <http://chuantu.biz/t6/47/1505284937x1034817476.jpg>
   As the error above shows, the cache is not updated.

   One last question:
   if I skip step 5 and simply load the data again, the query works,
   but the data is appended rather than overwritten. Is this by design, or is it a bug?

Could you please take a look? Thank you!




--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: Fw:carbonthriftserver can not be load many times

ravipesala
Hi,

I am a bit confused here.

Are steps 1 and 2 done through one beeline session, and steps 3, 4 and 5
from another beeline session?

Also, can you try the current master branch and check whether the same
issue exists there?


Regards,
Ravindra.

On 13 September 2017 at 15:14, dylan <[hidden email]> wrote:

> [...]



--
Thanks & Regards,
Ravi

Re: Fw:carbonthriftserver can not be load many times

dylan
Hi Ravi,
   In my case, steps 1, 2, 5 and 6 run in one spark-shell session, and
steps 4 and 7 run in one beeline session.

 Following your suggestion, I tested this case on the current master branch.

 In beeline there is no longer a Btree load failed message, but the table
shows no data: every value is NULL. In spark-shell the result is correct.
spark-shell:
+---+-----+--------+---+
| id| name|    city|age|
+---+-----+--------+---+
|  1|david|shenzhen| 31|
|  2|eason|shenzhen| 27|
|  3|jarry|   wuhan| 35|
+---+-----+--------+---+

beeline:
0: jdbc:hive2://localhost:10000> select * from  carbondb.test_table;
+-------+-------+-------+-------+--+
|  id   | name  | city  |  age  |
+-------+-------+-------+-------+--+
| NULL  | NULL  | NULL  | NULL  |
| NULL  | NULL  | NULL  | NULL  |
| NULL  | NULL  | NULL  | NULL  |
+-------+-------+-------+-------+--+


thanks!






Re: Fw:carbonthriftserver can not be load many times

ravipesala
Hi,

I think you are using a MySQL Hive metastore and have the thrift server and
spark-shell connected at the same time, so two drivers are accessing the
carbon store simultaneously and changing its metadata. It seems CarbonData
has some refresh issues in this case. Please raise a JIRA ticket and we will
look into it.

Regards,
Ravindra.

On 15 September 2017 at 14:27, dylan <[hidden email]> wrote:

> [...]



--
Thanks & Regards,
Ravi
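
The two-driver situation described above can be illustrated with a toy model. This is a sketch under stated assumptions, not CarbonData's implementation: `SharedStore` and `Driver` are hypothetical names, and the numeric table id is an invented stand-in for whatever identifies one incarnation of a table. Each driver caches the id it saw first and clears that cache only on its own DDL, so a drop and recreate issued by one driver leaves the other one holding a stale id.

```scala
import scala.collection.mutable

// Hypothetical shared store: table name -> (table id, rows).
class SharedStore {
  val tables = mutable.Map.empty[String, (Int, Vector[String])]
  private var nextId = 0
  def create(name: String, rows: Vector[String]): Unit = {
    nextId += 1 // a recreated table gets a new id
    tables(name) = (nextId, rows)
  }
  def drop(name: String): Unit = tables.remove(name)
}

// Hypothetical driver (spark-shell or thrift server) with a private cache.
class Driver(store: SharedStore) {
  private val idCache = mutable.Map.empty[String, Int]

  // DDL issued by this driver invalidates only this driver's cache.
  def create(name: String, rows: Vector[String]): Unit = {
    store.create(name, rows)
    idCache.remove(name)
  }
  def drop(name: String): Unit = {
    store.drop(name)
    idCache.remove(name)
  }

  // Returns rows only if the cached id still matches the table in the store.
  def select(name: String): Option[Vector[String]] = {
    val cachedId = idCache.getOrElseUpdate(name, store.tables(name)._1)
    store.tables.get(name).collect { case (id, rows) if id == cachedId => rows }
  }
}

object TwoDriversDemo {
  def run(): (Option[Vector[String]], Option[Vector[String]]) = {
    val store = new SharedStore
    val sparkShell = new Driver(store)
    val thriftServer = new Driver(store)

    sparkShell.create("test_table", Vector("david", "eason", "jarry"))
    thriftServer.select("test_table") // thrift server caches the first id

    sparkShell.drop("test_table") // steps 5 and 6 from the thread
    sparkShell.create("test_table", Vector("david", "eason", "jarry"))

    (sparkShell.select("test_table"), thriftServer.select("test_table"))
  }

  def main(args: Array[String]): Unit = {
    val (fromShell, fromThrift) = run()
    println(s"spark-shell:   $fromShell")
    println(s"thrift server: $fromThrift")
  }
}
```

In this model the driver that ran the DDL reads the recreated table, while the other driver gets nothing back until its cached id is refreshed.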