Apache CarbonData Dev Mailing List archive - can't apply mappartitions to dataframe generated from carboncontext

Apache CarbonData Dev Mailing List archive

can't apply mappartitions to dataframe generated from carboncontext

Posted by 孙而焓 on Jun 12, 2017; 3:22am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/can-t-apply-mappartitions-to-dataframe-generated-from-carboncontext-tp14565.html

hi:

appendix are the full error message

I try to modify dataframe row in non-sql way and get Task not serializable,my test procedure like follows:

1.val df=cc.sql(select * from t1)

2.def function1 (iterator: Iterator[Row]):Iterator[Row]={

var list=scala.collection.mutable.ListBuffer[Row]()

while (iterator.hasNext) {

var r=iterator.next

if(r.getAs[String]("col1").toString.equalsIgnoreCase(r.getAs[String]("col2").toString)) list+=r}

list.iterator

}

3.df.mapPartitions(r=>function1 (r))

I also apply function1 to dataframe generated from sqlContext work fine,

so i believe carboncontext is refering some outer variables that are not serializable.

[hidden email]

孙而焓【FFCS研究院】