Posted by vikramahuja1001
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Clean-files-enhancement-tp100088p100531.html
Hi all,
After considering all the suggestions, the trash folder mechanism in
CarbonData will be implemented in two phases.
Phase 1:
1. Create a generic trash folder at the table level. The trash folder will
be hidden (for example .trash or .recyclebin) and will be stored in the
table directory.
2. Any file or folder deleted from a table will be moved to the trash
folder of that table. (The call that moves it to the trash will be added to
the FileFactory delete APIs.)
3. A trash manager will be created to keep track of every file that has
been deleted and moved to the trash, along with the time at which it was
deleted. All of the trash manager's APIs will be called from the
FileFactory class.
4. Each file moved to the trash will have an expiration time associated
with it. On the clean files command, trash entries whose expiration time
has passed will be cleared.
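A minimal sketch of how the four points above could fit together, written in
Python only for brevity (CarbonData itself is Java/Scala). Everything in it
is an assumption for illustration and not CarbonData's API: the function
names move_to_trash and clean_trash, the 7-day default retention, and the
choice to record the deletion time as a millisecond-timestamp folder under
.trash instead of a separate tracking file.

```python
import os
import shutil
import time

TRASH_DIR_NAME = ".trash"                # hypothetical; the post suggests .trash or .recyclebin
DEFAULT_EXPIRY_SECONDS = 7 * 24 * 3600   # hypothetical default retention


def move_to_trash(table_dir, path, now=None):
    """Move a file or folder under table_dir into that table's trash folder.

    The deletion time is encoded as a millisecond-timestamp subfolder, so the
    trash itself records when each entry was deleted.
    """
    now = time.time() if now is None else now
    rel = os.path.relpath(path, table_dir)
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        raise ValueError("path is not inside the table directory: " + path)
    dest = os.path.join(table_dir, TRASH_DIR_NAME, str(int(now * 1000)), rel)
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    shutil.move(path, dest)
    return dest


def clean_trash(table_dir, expiry_seconds=DEFAULT_EXPIRY_SECONDS, now=None):
    """Delete trash entries whose retention time has elapsed.

    Returns the names of the timestamp folders that were removed.
    """
    now = time.time() if now is None else now
    trash = os.path.join(table_dir, TRASH_DIR_NAME)
    removed = []
    if not os.path.isdir(trash):
        return removed
    for name in sorted(os.listdir(trash)):
        if not name.isdigit():
            continue  # ignore anything that is not a timestamp folder
        deleted_at = int(name) / 1000.0
        if now - deleted_at >= expiry_seconds:
            shutil.rmtree(os.path.join(trash, name))
            removed.append(name)
    return removed
```

In the proposal itself, the move would be triggered from the FileFactory
delete APIs and the sweep from the clean files command.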
Phase 2: Further enhancements are planned for phase 2 and will be
implemented after phase 1 is completed. The plan for phase 2 development
and changes will be posted in this mail thread.
Thanks
Vikram Ahuja
On Wed, Sep 16, 2020 at 8:43 AM PickUpOldDriver <[hidden email]> wrote:
> Hi Vikram,
>
> I agree with building a trash folder, +1.
>
> Currently, the data loading/compaction/update/merge flows each clean up
> files automatically, but the cleanup logic is written separately in each.
> Most of it is aimed at deleting stale segments
> (MARKED_FOR_DELETE/COMPACTED), and it relies on the table status being
> accurate. If you could build a general clean files function, it could
> replace the current automatic deletion of stale folders.
>
> Besides, having a trash folder handled by CarbonData would be good: we
> could find the deleted segments through this API.
>
> I think we should also consider the INSERT_IN_PROGRESS and
> INSERT_OVERWRITE_IN_PROGRESS statuses.