user yarn needs the hdfs access when loading data?


user yarn needs the hdfs access when loading data?

Li Peng
Hi,
   
   I use the user "spark" to create the table and to run a Spark Streaming application.
   I'm confused about why the user "yarn" needs HDFS write access. If it does, I can't run the app as the spark user, only as the yarn user.



org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=yarn, access=WRITE, inode="/carbondata/carbonstore/default/sale/Metadata/schema":spark:hdfs:drwxr-xr-x



INFO  21-12 11:07:52,389 - ********starting clean up**********
WARN  21-12 11:07:52,442 - Exception while invoking ClientNamenodeProtocolTranslatorPB.delete over dpnode02/192.168.9.2:8020. Not retrying because try once and fail.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=yarn, access=WRITE, inode="/carbondata/carbonstore/sale/sale/Fact/Part0/Segment_0":spark:hdfs:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)



  Thanks
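For context, HDFS enforces POSIX-style mode bits, so with the inode owned by spark:hdfs and mode drwxr-xr-x (755), any user other than the owner fails the write check unless it is in the owning group with group-write set. A minimal sketch of that check (simplified: it ignores the HDFS superuser, sticky bits, and ACLs; the user/group names are just the ones from the log above):

```python
def can_write(mode, owner, group, user, user_groups):
    """Simplified HDFS/POSIX permission check for the write bit."""
    if user == owner:
        bits = (mode >> 6) & 0o7   # owner bits
    elif group in user_groups:
        bits = (mode >> 3) & 0o7   # group bits
    else:
        bits = mode & 0o7          # other bits
    return bool(bits & 0o2)       # test the write bit

# inode ".../Metadata/schema" is spark:hdfs with mode 755 (drwxr-xr-x)
print(can_write(0o755, "spark", "hdfs", "yarn", {"yarn"}))   # False: denied
print(can_write(0o755, "spark", "hdfs", "spark", {"spark"})) # True: owner
```

This is why the load fails when any part of the job ends up running as "yarn": that user matches neither the owner nor the group, and the "other" bits (r-x) carry no write permission.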

Re: user yarn needs the hdfs access when loading data?

David CaiQiang
Please provide more info:
1.  How do you use Spark: JDBCServer, Spark shell, or Spark SQL?
2.  Which release: the open-source release or the business edition?
Best Regards
David Cai

Re: user yarn needs the hdfs access when loading data?

Li Peng
Hi,
   1. I created the carbon table in the Spark shell as user 'spark'; the HDFS permission of the table store location is 755.
       I run a Spark Streaming application in yarn-cluster mode as user 'spark'.
       The application stores a DataFrame into the carbon table, and 'carbon.ddl.base.hdfs.url' in carbon.properties is '/user/spark'.

   2. I use the CarbonData 0.2.0 release.

   Why does user 'yarn' need write access when loading data? Right now I must use 'yarn' to create the table and submit the app.


Thanks.



Re: user yarn needs the hdfs access when loading data?

lionel061201
Hi,
You can refer to this ticket; I met the same issue:
https://issues.apache.org/jira/browse/CARBONDATA-559
Check your kettle home settings in carbon.properties on the executor side.
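For reference, the kettle-home entry in carbon.properties looks roughly like the fragment below. The path shown is only an illustrative placeholder; whatever value you use has to resolve on every executor (NodeManager) host, not just on the driver, otherwise the load step can fall back to paths owned by the 'yarn' user:

```properties
# Hypothetical local path; must exist and be readable on every executor node
carbon.kettle.home=/opt/carbondata/carbonplugins
```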

Thanks,
Lionel

On Wed, Dec 28, 2016 at 10:50 AM, Li Peng <[hidden email]> wrote:
