user yarn needs the hdfs access when loading data?


user yarn needs the hdfs access when loading data?

Li Peng
Hi,
   
   I use the user "spark" to create the table and to run a Spark Streaming application.
   I'm confused about why the user "yarn" needs HDFS write access. If it does, I can't run the app as the spark user, only as the yarn user.



org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=yarn, access=WRITE, inode="/carbondata/carbonstore/default/sale/Metadata/schema":spark:hdfs:drwxr-xr-x



INFO  21-12 11:07:52,389 - ********starting clean up**********
WARN  21-12 11:07:52,442 - Exception while invoking ClientNamenodeProtocolTranslatorPB.delete over dpnode02/192.168.9.2:8020. Not retrying because try once and fail.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=yarn, access=WRITE, inode="/carbondata/carbonstore/sale/sale/Fact/Part0/Segment_0":spark:hdfs:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)



  Thanks
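For context, HDFS enforces POSIX-style mode bits, so with the inode owned by spark:hdfs and mode drwxr-xr-x (755), any user other than the owner fails the write check unless it is in the owning group with group-write set. A minimal sketch of that check (simplified: it ignores the HDFS superuser, sticky bits, and ACLs; the user/group names are just the ones from the log above):

```python
def can_write(mode, owner, group, user, user_groups):
    """Simplified HDFS/POSIX permission check for the write bit."""
    if user == owner:
        bits = (mode >> 6) & 0o7   # owner bits
    elif group in user_groups:
        bits = (mode >> 3) & 0o7   # group bits
    else:
        bits = mode & 0o7          # other bits
    return bool(bits & 0o2)       # test the write bit

# inode ".../Metadata/schema" is spark:hdfs with mode 755 (drwxr-xr-x)
print(can_write(0o755, "spark", "hdfs", "yarn", {"yarn"}))   # False: denied
print(can_write(0o755, "spark", "hdfs", "spark", {"spark"})) # True: owner
```

This is why the load fails when any part of the job ends up running as "yarn": that user matches neither the owner nor the group, and the "other" bits (r-x) carry no write permission.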

Re: user yarn needs the hdfs access when loading data?

David CaiQiang
Please provide more info:
1.  How do you use Spark: JDBCServer, Spark shell, or Spark SQL?
2.  Which release: the open-source release or the business edition?
Best Regards
David Cai

Re: user yarn needs the hdfs access when loading data?

Li Peng
Hi,
   1. I created the carbon table in the Spark shell as user 'spark'; the HDFS permission of the table store location is 755.
       I run a Spark Streaming application in yarn-cluster mode as user 'spark'.
       The application stores a DataFrame into the carbon table, and 'carbon.ddl.base.hdfs.url' in carbon.properties is '/user/spark'.

   2. I use the CarbonData 0.2.0 release.

   Why does user 'yarn' need write access when loading data? Right now I must use 'yarn' to create the table and submit the app.


Thanks.



Re: user yarn needs the hdfs access when loading data?

lionel061201
Hi,
You can refer to this ticket; I met the same issue:
https://issues.apache.org/jira/browse/CARBONDATA-559
Check your kettle home settings in carbon.properties on the executor side.
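For reference, the kettle-home entry in carbon.properties looks roughly like the fragment below. The path shown is only an illustrative placeholder; whatever value you use has to resolve on every executor (NodeManager) host, not just on the driver, otherwise the load step can fall back to paths owned by the 'yarn' user:

```properties
# Hypothetical local path; must exist and be readable on every executor node
carbon.kettle.home=/opt/carbondata/carbonplugins
```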

Thanks,
Lionel

On Wed, Dec 28, 2016 at 10:50 AM, Li Peng <[hidden email]> wrote:
