
Re: carbon data

Posted by ZhuWilliam on Nov 29, 2016; 11:57am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/carbon-data-tp3305p3357.html

Try code like this:
```
if (cc.tableNames().filter(f => f == _cfg.get("tableName").get).size == 0) {
  // Table does not exist yet: drop any stale definition and let
  // CarbonData create the table via an Overwrite write.
  df.sqlContext.sql(s"DROP TABLE IF EXISTS ${_cfg.get("tableName").get}")
  writer.options(_cfg).mode(SaveMode.Overwrite).format(_format).save()
} else {
  // Table already exists: honor the configured save mode (e.g. Append).
  writer.options(_cfg).mode(SaveMode.valueOf(_mode)).format(_format).save()
}
```
You can use SaveMode.Append only once the table has been created; otherwise
you should use SaveMode.Overwrite so that CarbonData creates the table for you.
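The rule above can be sketched as a small mode-selection helper. This is a minimal plain-Scala sketch with hypothetical names (`SaveModeRule`, `effectiveSaveMode`, a string-based mode); in real code the table list would come from `cc.tableNames()` as in the snippet above:

```scala
object SaveModeRule {
  // Append is only valid once the table exists; otherwise fall back to
  // Overwrite so CarbonData creates the table on the first write.
  // `tables` stands in for the result of cc.tableNames().
  def effectiveSaveMode(tables: Seq[String], tableName: String, requested: String): String =
    if (tables.exists(_.equalsIgnoreCase(tableName))) requested else "Overwrite"
}
```

For example, requesting `"Append"` against an empty table list yields `"Overwrite"`, which is exactly the situation behind the "table not found" error in the quoted message below.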

On Tue, Nov 29, 2016 at 5:56 PM, Lu Cao <[hidden email]> wrote:

> Thank you for the response, Liang. I think I have followed the example, but
> it still returns an error:
>        Data loading failed. table not found: default.carbontest
> My code is attached below: I read data from a Hive table with HiveContext,
> convert it to a CarbonContext, then generate the df and save it to HDFS. I'm
> not sure whether it's correct to generate the DataFrame with
> sc.parallelize(sc.Files, 25). Do you have any other method we can use to
> generate the DF?
>
> object SparkConvert {
>
>   def main(args: Array[String]): Unit = {
>     val conf = new SparkConf().setAppName("CarbonTest")
>     val sc = new SparkContext(conf)
>     val path = "hdfs:///user/appuser/lucao/CarbonTest_001.carbon"
>     val hqlContext = new HiveContext(sc)
>     val df = hqlContext.sql("select * from default.test_data_all")
>     println("the count is:" + df.count())
>     val cc = createCarbonContext(df.sqlContext.sparkContext, path)
>     writeDataFrame(cc, "CarbonTest", SaveMode.Append)
>   }
>
>   def createCarbonContext(sc: SparkContext, storePath: String): CarbonContext = {
>     val cc = new CarbonContext(sc, storePath)
>     cc
>   }
>
>   def writeDataFrame(cc: CarbonContext, tableName: String, mode: SaveMode): Unit = {
>     import cc.implicits._
>     val sc = cc.sparkContext
>     val df = sc.parallelize(sc.files, 25).toDF("col1", "col2", "col3"..."coln")
>     df.write
>       .format("carbondata")
>       .option("tableName", tableName)
>       .option("compress", "true")
>       .mode(mode)
>       .save()
>   }
>
> }
>



--
Best Regards
_______________________________________________________________
Broaden your horizons; focus on development
WilliamZhu   祝海林      [hidden email]
Product Division - Infrastructure Platform - Search & Data Mining
Mobile: 18601315052
MSN: [hidden email]
Weibo: @PrinceCharmingJ  http://weibo.com/PrinceCharmingJ
Address: 12/F, Tower B, Fuma Building, Building 1, Yard 33, Guangshun North Street, Chaoyang District, Beijing
_______________________________________________________________