Apache CarbonData Dev Mailing List archive - Re: [Discussion] Improve the reading/writing performance on the big tablestatus file

Posted by David CaiQiang on Sep 02, 2020; 3:56am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Discussion-Improve-the-reading-writing-performance-on-the-big-tablestatus-file-tp99716p99718.html

Adding solution 4, which separates the status file by segment status.

*Solution 4:* Based on solution 2, add support for a status.inprogress file

1) New tablestatus file format
{
"statusFileName":"status-uuid1",
"inProgressStatusFileName":"status-uuid2.inprogress",
"updateStatusFileName":"updatestatus-timestamp1",
"historyStatusFileName":"status.history",
"segmentMaxId":"1000"
}
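
As a rough illustration of how a reader could resolve this pointer file, here is a minimal Python sketch. Only the JSON keys come from the format above; the `TableStatusPointer` class and the `parse_pointer` function are hypothetical names for illustration, not CarbonData APIs.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TableStatusPointer:
    """Hypothetical in-memory view of the proposed tablestatus file."""
    status_file: str
    in_progress_status_file: str
    update_status_file: str
    history_status_file: str
    segment_max_id: int


def parse_pointer(text: str) -> TableStatusPointer:
    """Parse the JSON body of the tablestatus file into a pointer object."""
    raw = json.loads(text)
    return TableStatusPointer(
        status_file=raw["statusFileName"],
        in_progress_status_file=raw["inProgressStatusFileName"],
        update_status_file=raw["updateStatusFileName"],
        history_status_file=raw["historyStatusFileName"],
        # The proposal stores the id as a string; convert it to an int here.
        segment_max_id=int(raw["segmentMaxId"]),
    )


# Example content, mirroring the format shown in the proposal.
EXAMPLE = """
{
"statusFileName":"status-uuid1",
"inProgressStatusFileName":"status-uuid2.inprogress",
"updateStatusFileName":"updatestatus-timestamp1",
"historyStatusFileName":"status.history",
"segmentMaxId":"1000"
}
"""

pointer = parse_pointer(EXAMPLE)
```

With this layout, a query would only need to follow `pointer.status_file`, while load and compaction would also follow `pointer.in_progress_status_file`.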

2) The status.inprogress file stores the in-progress segment metadata

Write: at the beginning of loading/compaction, add the in-progress segment
metadata to status-uuid2.inprogress. At the end, move it to status-uuid1.

Read: queries read status-uuid1 only. Other operations read
status-uuid2.inprogress if needed.
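
The write and read paths described above could be sketched roughly as follows in Python. This is an assumption-heavy illustration: it assumes each status file holds a JSON list of segment entries, the field names (`segmentId`, `status`), the status values, and all function names are hypothetical, and it leaves out locking and whatever file versioning solution 2 prescribes.

```python
import json
import tempfile
from pathlib import Path

# File names taken from the example tablestatus content in the proposal.
STATUS = "status-uuid1"
IN_PROGRESS = "status-uuid2.inprogress"


def _load(path: Path) -> list:
    """Return the segment entries in a status file, or [] if it is absent."""
    return json.loads(path.read_text()) if path.exists() else []


def _store(path: Path, entries: list) -> None:
    path.write_text(json.dumps(entries))


def begin_load(table_dir: Path, segment_id: str) -> None:
    """Start of loading/compaction: record the segment as in progress."""
    path = table_dir / IN_PROGRESS
    entries = _load(path)
    # "In Progress" is a placeholder status value, not CarbonData's own.
    entries.append({"segmentId": segment_id, "status": "In Progress"})
    _store(path, entries)


def finish_load(table_dir: Path, segment_id: str) -> None:
    """End of loading/compaction: move the entry to the stable status file."""
    in_progress_path = table_dir / IN_PROGRESS
    status_path = table_dir / STATUS
    pending = _load(in_progress_path)
    done = [e for e in pending if e["segmentId"] == segment_id]
    # Write the stable file first so the entry is never missing from both.
    stable = _load(status_path)
    stable.extend({**e, "status": "Success"} for e in done)
    _store(status_path, stable)
    _store(in_progress_path,
           [e for e in pending if e["segmentId"] != segment_id])


def read_for_query(table_dir: Path) -> list:
    """Queries read the stable status file only."""
    return _load(table_dir / STATUS)


def read_in_progress(table_dir: Path) -> list:
    """Other operations may also consult the in-progress file."""
    return _load(table_dir / IN_PROGRESS)


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    table_dir = Path(tmp)
    begin_load(table_dir, "1001")
    during_query = read_for_query(table_dir)
    during_pending = read_in_progress(table_dir)
    finish_load(table_dir, "1001")
    after_query = read_for_query(table_dir)
    after_pending = read_in_progress(table_dir)
```

In this sketch the query path never touches the in-progress file, so entries added or removed during loading do not change what a query has to parse.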

-----
Best Regards
David Cai