Login  Register

can't apply mappartitions to dataframe generated from carboncontext

Posted by 孙而焓 on Jun 12, 2017; 3:22am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/can-t-apply-mappartitions-to-dataframe-generated-from-carboncontext-tp14565.html

hi:
    appendix are the full error message
    I try to modify dataframe row in non-sql way and get Task not serializable,my test procedure like follows:
            1.val df=cc.sql(select * from t1)
            2.def function1 (iterator: Iterator[Row]):Iterator[Row]={
                    var list=scala.collection.mutable.ListBuffer[Row]()
                    while (iterator.hasNext) {
        var r=iterator.next
                        if(r.getAs[String]("col1").toString.equalsIgnoreCase(r.getAs[String]("col2").toString)) list+=r}
                        list.iterator
                }
            3.df.mapPartitions(r=>function1 (r))    
   I also apply function1 to dataframe generated from sqlContext work fine,
    so i believe carboncontext is refering some outer variables that are not serializable.

    


孙而焓【FFCS研究院】