Hi,

Currently, CarbonData writes audit log entries into the same log file as logs of all other levels, which makes it hard for users to check the audit trail. The audit information is also sometimes incomplete, because it depends on each Command invoking the Logger in its run function.

To improve this, I propose a new audit log implementation as follows:

1. Separate the audit log from the normal log, so that users can configure log4j to write the audit log to a separate file.
2. Give the audit log a common format that includes at least: time, username, operation name, an operation id that identifies the operation, status (success or failure), and other extra information such as data loading size and time spent.
3. Write the audit log in JSON format to enable analytic tool support in the future.

For example, the audit log will look like the following:

{"time":"2018-10-31 15:02:12","username":"anonymous","opName":"CREATE TABLE","opId":"115794874155743","opStatus":"START"}
{"time":"2018-10-31 15:02:12","username":"anonymous","opName":"CREATE TABLE","opId":"115794874155743","opStatus":"SUCCESS","opTime":"542 ms","tableId":"default.t1","extraInfo":{"external":"false"}}
{"time":"2018-10-31 15:02:15","username":"anonymous","opName":"INSERT INTO","opId":"115797876187366","opStatus":"START"}
{"time":"2018-10-31 15:02:19","username":"anonymous","opName":"INSERT INTO","opId":"115797876187366","opStatus":"SUCCESS","opTime":"4043 ms","tableId":"default.t1","extraInfo":{"SegmentId":"0","DataSize":"403.0B","IndexSize":"246.0B"}}
{"time":"2018-10-31 15:02:33","username":"anonymous","opName":"DROP TABLE","opId":"115816322828613","opStatus":"START"}
{"time":"2018-10-31 15:02:34","username":"anonymous","opName":"DROP TABLE","opId":"115816322828613","opStatus":"SUCCESS","opTime":"131 ms","tableId":"default.t1","extraInfo":{}}
{"time":"2018-10-31 15:02:49","username":"anonymous","opName":"SHOW SEGMENTS","opId":"115831939703565","opStatus":"START"}
{"time":"2018-10-31 15:02:49","username":"anonymous","opName":"SHOW SEGMENTS","opId":"115831939703565","opStatus":"SUCCESS","opTime":"30 ms","tableId":"default.t2","extraInfo":{}}
{"time":"2018-10-31 15:03:54","username":"anonymous","opName":"INSERT OVERWRITE","opId":"115896869484042","opStatus":"START"}
{"time":"2018-10-31 15:03:56","username":"anonymous","opName":"INSERT OVERWRITE","opId":"115896869484042","opStatus":"SUCCESS","opTime":"2039 ms","tableId":"default.t2","extraInfo":{"SegmentId":"0","DataSize":"403.0B","IndexSize":"246.0B"}}

What do you think about it?

Regards,
Jacky
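To illustrate the "analytic tool support" point, here is a minimal sketch of how an external tool might consume a log in the proposed format. It is written in Python only for brevity and is not part of CarbonData; the function name `summarize` and the shape of the summary are assumptions made for this example. The sample lines and field names are taken from the proposal above.

```python
import json

# Sample entries copied from the proposed format (one JSON object per line).
SAMPLE_LOG = """\
{"time":"2018-10-31 15:02:12","username":"anonymous","opName":"CREATE TABLE","opId":"115794874155743","opStatus":"START"}
{"time":"2018-10-31 15:02:12","username":"anonymous","opName":"CREATE TABLE","opId":"115794874155743","opStatus":"SUCCESS","opTime":"542 ms","tableId":"default.t1","extraInfo":{"external":"false"}}
{"time":"2018-10-31 15:02:15","username":"anonymous","opName":"INSERT INTO","opId":"115797876187366","opStatus":"START"}
{"time":"2018-10-31 15:02:19","username":"anonymous","opName":"INSERT INTO","opId":"115797876187366","opStatus":"SUCCESS","opTime":"4043 ms","tableId":"default.t1","extraInfo":{"SegmentId":"0","DataSize":"403.0B","IndexSize":"246.0B"}}
"""


def summarize(log_text):
    """Group audit entries by opId and return one summary dict per operation.

    Hypothetical helper for this sketch: an operation that has only a START
    entry keeps status None, which makes unfinished operations easy to spot.
    """
    operations = {}
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        op = operations.setdefault(
            entry["opId"],
            {"opName": entry["opName"], "status": None, "millis": None, "table": None},
        )
        if entry["opStatus"] != "START":
            op["status"] = entry["opStatus"]
            op["table"] = entry.get("tableId")
            op_time = entry.get("opTime")  # e.g. "542 ms"
            if op_time is not None:
                op["millis"] = int(op_time.split()[0])
    return operations


summary = summarize(SAMPLE_LOG)
for op_id, op in summary.items():
    print(op_id, op["opName"], op["status"], op["millis"], op["table"])
```

Because every entry carries the same opId at start and at completion, pairing the two lines needs no knowledge of the individual command.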
Nice job. It would be even better to also record the number of data records per load.
-- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
In reply to this post by Jacky Li
+1
I have a few questions about this:
1. Should the field be called 'tableId' or 'table'?
2. For which kinds of statements will the operations be audited?
I am planning to do it by adding a small framework in AtomicRunnableCommand so that all commands will be audited automatically. After that, I will remove the old audit logging from each command.
OK, I will change tableId to table.

Regards,
Jacky

> On Oct 31, 2018, at 3:49 PM, xuchuanyin <[hidden email]> wrote:
>
> +1
>
> I've few questions about this:
> 1. Is it OK to call it 'tableId' or 'table'
>
> 2. For what kind of statements will you audit the operations?
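The idea of auditing every command from one place can be sketched with a template-method base class. This is an illustration in Python only, not CarbonData code: the real AtomicRunnableCommand API is not shown in this thread, so `AuditedCommand`, `process`, the logger name "audit", the "FAILED" status value, and the way the operation id is generated are all assumptions. The field is named `table`, following the rename agreed above.

```python
import io
import json
import logging
import time

# Dedicated logger so audit entries can be routed separately from normal logs.
# The name "audit" is an assumption for this sketch.
audit_logger = logging.getLogger("audit")
audit_logger.setLevel(logging.INFO)
audit_logger.propagate = False  # keep audit lines out of the root logger's handlers


class AuditedCommand:
    """Hypothetical base class: subclasses implement process(), run() audits it."""

    op_name = "UNKNOWN"

    def __init__(self, username="anonymous"):
        self.username = username
        self.op_id = str(time.monotonic_ns())  # placeholder id, not CarbonData's scheme
        self.table = None
        self.extra_info = {}

    def process(self):
        raise NotImplementedError

    def _audit(self, status, **fields):
        entry = {
            "time": time.strftime("%Y-%m-%d %H:%M:%S"),
            "username": self.username,
            "opName": self.op_name,
            "opId": self.op_id,
            "opStatus": status,
        }
        entry.update(fields)
        audit_logger.info(json.dumps(entry))

    def run(self):
        """Log START, run the command, then log SUCCESS or FAILED with timing."""
        self._audit("START")
        started = time.monotonic()
        try:
            result = self.process()
        except Exception:
            elapsed = int((time.monotonic() - started) * 1000)
            self._audit("FAILED", opTime=f"{elapsed} ms", table=self.table,
                        extraInfo=self.extra_info)
            raise
        elapsed = int((time.monotonic() - started) * 1000)
        self._audit("SUCCESS", opTime=f"{elapsed} ms", table=self.table,
                    extraInfo=self.extra_info)
        return result


class CreateTable(AuditedCommand):
    """Example command: it only does its work and fills in audit details."""

    op_name = "CREATE TABLE"

    def process(self):
        self.table = "default.t1"
        self.extra_info = {"external": "false"}


# Capture the audit output in memory to show what gets written.
buffer = io.StringIO()
audit_logger.addHandler(logging.StreamHandler(buffer))
CreateTable().run()
entries = [json.loads(line) for line in buffer.getvalue().splitlines()]
```

With this shape, an individual command never calls the audit logger itself, so no command can forget to write its START or completion entry.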