Hi all,
This mail is regarding enhancing the clean files command.

Current behaviour: when clean files is called, the segments which are MARKED_FOR_DELETE or COMPACTED are deleted and their entries are removed from the tablestatus file, the Fact folder and the Metadata/segments folder.

Enhancement idea: create a trash folder (like a Recycle Bin, with 777 permissions) which can be stored in the /tmp folder (or a user-defined folder; a new property will be exposed). Whenever a segment is cleaned, the necessary carbondata files (no other files) are copied to this folder. The recycle-bin folder can have one folder per table, named like DBName_TableName. We can keep the carbondata files there for 3 days (or as long as the user wants; a carbon property will be exposed for this) and delete them once they have not been modified for 3 days, or as per the property. A thread can be maintained which checks the aging time and deletes the expired carbondata files from the trash folder.

Apart from that, INSERT_IN_PROGRESS segments will also be cleaned, but only after trying to acquire the segment lock first. If the code is able to acquire the segment lock, the folder is stale and can be cleaned. If the code is not able to acquire the segment lock, a load or some other operation is in progress, and in that case the INSERT_IN_PROGRESS segment will not be cleaned.

Please provide input and suggestions for this enhancement idea.

Thanks
Vikram Ahuja
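A minimal sketch of the proposed lock check for INSERT_IN_PROGRESS segments; SegmentLock, tryAcquire and moveToTrash are hypothetical names used only for illustration, and CarbonData's actual locking API may differ:

// Hedged sketch: clean an INSERT_IN_PROGRESS segment only when its segment lock can be
// acquired. SegmentLock, tryAcquire(), release() and moveToTrash() are hypothetical names.
import java.nio.file.Path;

public class StaleSegmentCleaner {

  interface SegmentLock {
    boolean tryAcquire();   // non-blocking attempt to take the segment lock
    void release();
  }

  void cleanIfStale(Path segmentDir, SegmentLock lock) {
    if (lock.tryAcquire()) {
      try {
        // Lock acquired: no load is running on this segment, so it is stale and safe to clean.
        moveToTrash(segmentDir);
      } finally {
        lock.release();
      }
    }
    // else: a load or another operation holds the lock, so leave the segment untouched.
  }

  void moveToTrash(Path segmentDir) {
    // copy the carbondata files into the per-table trash folder (omitted in this sketch)
  }
}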
Hi Vikram,
Thanks for proposing this.
a) If the file system is HDFS, *HDFS already supports trash*: when data is deleted in HDFS, it is moved to the trash instead of being permanently deleted (the trash interval can also be configured via *fs.trash.interval*).

b) If the file system is object storage like S3A or OBS, *they support bucket versioning*. The user should configure it and go back to a previous snapshot:
https://docs.aws.amazon.com/AmazonS3/latest/user-guide/undelete-objects.html

*So basically this functionality has to be at the underlying file system, not at the CarbonData layer.* Keeping a trash folder with many configurations and checking the aging of the trash folder can work, but it makes the system complex and adds the additional overhead of maintaining this functionality.

Based on this, *-1 from my side for this feature*. You can wait for other people's opinions on this before concluding.

Thanks,
Ajantha
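For reference, a minimal sketch of relying on the HDFS trash from Java, assuming the standard Hadoop FileSystem/Trash API; the path and the client-side interval setting are placeholders for illustration:

// Hedged sketch: delete via the Hadoop trash instead of a permanent delete.
// fs.trash.interval is normally configured cluster-wide in core-site.xml; it is set here
// only for illustration, and the segment path below is a placeholder.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;

public class HdfsTrashExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.trash.interval", "4320");   // keep trashed files for ~3 days (value in minutes)
    FileSystem fs = FileSystem.get(conf);
    Path segmentDir = new Path("/user/hive/warehouse/db1/t1/Fact/Part0/Segment_0");
    // Moves the path into the user's .Trash directory; HDFS expires it after the interval.
    boolean moved = Trash.moveToAppropriateTrash(fs, segmentDir, conf);
    System.out.println("Moved to trash: " + moved);
  }
}

The S3/OBS equivalent would be enabling bucket versioning on the warehouse bucket, as linked above.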
In reply to this post by vikramahuja1001
+1 for this feature.
1. Providing better reliability, especially data integrity, is our top priority. I believe the trash helps a lot when problems happen.

2. It is tough to recover data on S3 in a big-data environment (too many files and too much data); recovery costs a lot of time and confidence. We would rather be able to recover the data ourselves than push that onto the user. A trash will be helpful.
In reply to this post by vikramahuja1001
1. Cleaning an in-progress segment is very dangerous; please remove this part from the code. Only when the user explicitly uses the clean files command with the option "clean_in_progressing"="true" should we check the segment lock and clean the segment.

2. If the status of a segment is MARKED_FOR_DELETE/COMPACTED, we can delete the segment directly without a backup.

3. Remove the code which cleans stale data and partial data from the loading/compaction/update/delete features and so on. It is better to use a UUID as the segment folder name and let cleaning stale data be an optional operation; even if we don't clean stale data, the table still works fine.

5. The trash folder can be under the table path, so each table has a separate trash folder. If we clean uncertain data, we can use the trash folder to store it, with a separate sub-folder for each transaction.

-----
Best Regards
David Cai
Hi Vikram,
Moving to a Trash folder or keeping the data inside the Fact/Part0/ folder does not really matter; after the configurable time it will be deleted either way. Moving to Trash adds extra IO and time during data loading. Everything works fine as long as tablestatus gives the correct status.

Do not delete the data physically in automatic clean files; just clean the table status, with a proper backup. For physical deletion, let the user call the clean command, which should first run a sanity check such as getting the count before deletion, then move the segments to be deleted to some other folder [TRASH] and run the count again. If both counts match, delete the data; otherwise move the data back from TRASH. We need to enhance the current clean command along these lines.

-Regards
Kumar Vishal
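Reading this as comparing the table's visible row count before and after the physical move, a rough sketch could look like the following; countTable, moveDir and deleteDir are hypothetical placeholders, not CarbonData APIs:

// Hedged sketch of the verify-before-delete flow: count, move the stale segments to a
// trash folder, count again, then either delete permanently or restore on mismatch.
import java.nio.file.Path;
import java.util.List;

public class SafeSegmentClean {

  void cleanStaleSegments(String tableName, List<Path> staleSegments, Path trashDir) {
    long before = countTable(tableName);                 // visible row count before the move
    for (Path seg : staleSegments) {
      moveDir(seg, trashDir.resolve(seg.getFileName())); // move each stale segment into TRASH
    }
    long after = countTable(tableName);                  // count again after the move
    if (before == after) {
      deleteDir(trashDir);                               // counts match: safe to delete for good
    } else {
      for (Path seg : staleSegments) {                   // mismatch: restore everything
        moveDir(trashDir.resolve(seg.getFileName()), seg);
      }
    }
  }

  long countTable(String tableName) { return 0L; }       // placeholder: SELECT COUNT(*) on the table
  void moveDir(Path from, Path to) { }                   // placeholder: file-system move
  void deleteDir(Path dir) { }                           // placeholder: recursive delete
}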
In reply to this post by David CaiQiang
Hi David,
1. We cannot remove the clean-up code from all commands, because in case of any failure, if we do not clean the stale files, there can be issues of wrong or extra data.

What I think is: we are calling APIs which do, say, X amount of work, while we may need only Y amount of clean-up (X > Y). So maybe what we can do is refactor this properly, so that each command deletes or cleans only the files or folders specific to it, instead of calling the general or common clean-up APIs which create problems for us.

2. Yes, I agree that there is no need to clean up in-progress segments in commands.

Regards,
Akash R Nilugal
+1 to Vishal's proposal.
It is not safe to clean the data automatically without ensuring data integrity. Let's enhance the clean command to do a sanity check before removing anything. Deleting data should be administrative work, not an automatic feature of the framework; the user can call it when he needs to delete the data.

Regards,
Ravindra.
In reply to this post by vikramahuja1001
Hi Vikram,
I agree with building a trash folder, +1.

Currently, the data loading/compaction/update/merge flows have automatic clean-files actions, but they are written separately. Most of them are aimed at deleting stale segments (MARKED_FOR_DELETE/COMPACTED), and they rely on the table status being precise. If you could build a general clean files function, it could substitute for the current automatic deletion of stale folders.

Besides, having a trash folder handled by CarbonData will be good; we can find the deleted segments through this API.

And I think we should also consider the INSERT_IN_PROGRESS and INSERT_OVERWRITE_IN_PROGRESS statuses.
Hi all,
After all the suggestions, the trash folder mechanism in CarbonData will be implemented in 2 phases.

Phase 1:
1. Create a generic trash folder at table level. Trash folders will be hidden/invisible (like .trash or .recyclebin). The trash folder will be stored in the table directory.
2. If we delete any file/folder from a table, it will be moved to the trash folder of the corresponding table (the call for adding to trash will be added in the FileFactory delete APIs).
3. A trash manager will be created, which will keep track of all the files that have been deleted and moved to the trash, along with the time at which each was deleted. All the trash manager's APIs will be called from the FileFactory class.
4. On the clean files command, the trash folders will be cleared if the expiry time has been met. Each file moved to the trash will have an expiration time associated with it (a rough sketch of such a trash record is given below).

Phase 2: More enhancements are planned for phase 2 and will be implemented after phase 1 is completed. The plan for the phase 2 development and changes will be posted in this mail thread.

Thanks
Vikram Ahuja
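A minimal sketch of what such a trash-manager record and expiry check could look like; TrashRecord, recordTrashed and the retention handling are hypothetical illustrations, not the actual CarbonData implementation:

// Hedged sketch of a per-table trash record with an expiry check, for illustration only.
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class TrashManagerSketch {

  static class TrashRecord {
    final Path trashedPath;
    final long deletedAtMillis;
    TrashRecord(Path trashedPath, long deletedAtMillis) {
      this.trashedPath = trashedPath;
      this.deletedAtMillis = deletedAtMillis;
    }
  }

  private final List<TrashRecord> records = new ArrayList<>();
  private final long retentionMillis;

  TrashManagerSketch(long retentionDays) {
    this.retentionMillis = TimeUnit.DAYS.toMillis(retentionDays);
  }

  // Called from the delete path: remember what was moved to trash and when.
  void recordTrashed(Path trashedPath) {
    records.add(new TrashRecord(trashedPath, System.currentTimeMillis()));
  }

  // Called from the clean files command: permanently remove expired entries.
  void purgeExpired() {
    long now = System.currentTimeMillis();
    for (Iterator<TrashRecord> it = records.iterator(); it.hasNext(); ) {
      TrashRecord r = it.next();
      if (now - r.deletedAtMillis >= retentionMillis) {
        deletePermanently(r.trashedPath);
        it.remove();
      }
    }
  }

  void deletePermanently(Path p) { /* file-system delete, omitted */ }
}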
-1
I don't see any reason why we should use a trash folder. How does it change the behaviour?

1. Are you still going with automatic clean-up? If yes, then you are adding extra time to move the data to trash (for the S3 file system).

2. Even if you move the data and keep the time to live as 3 days in the trash, what if the user realises that the data is wrong or lost only after that time period?

Regards,
Ravindra
Agree with Ravindra,
1. Stop all automatic data cleaning in load/insert/compact/update/delete...

2. When the clean files command cleans in-progress or uncertain data, we can move it to the data trash. This prevents deleting useful data by mistake; we have already seen this issue in some scenarios. Other cases (for example cleaning MARKED_FOR_DELETE/COMPACTED segments) should not use the data trash folder and should clean the data directly.

3. No need for data trash management; I suggest keeping it simple. The clean files command should support emptying the trash immediately, that will be enough.

-----
Best Regards
David Cai
Hi Ravi and David,
1. All automatic data cleaning in the case of load/insert/compact/delete will be removed, so cleaning will only happen when the clean files command is called.

2. We will only add data to the trash when we clean data which is in the IN_PROGRESS state. Compacted / Marked for Delete segments will not be moved to the trash; they will be deleted directly. The user will only be able to recover the in-progress segments, if the user wants to. @Ravi -> is this okay for trash usage, i.e. only using it for in-progress segments?

3. No trash management will be implemented; the data will ONLY BE REMOVED from the trash folder when the clean files command is called. There will be no time to live, so the data can be kept in the trash folder until the user triggers the clean files command.

Let me know if you have any questions.

Vikram Ahuja
Hi Vikram,
+1

It is good to remove the automatic clean-up. But I am still worried about the clean files command executed by the user as well. We need to enhance the clean files command to introduce a dry run that prints which segments are going to be deleted and which are left. If the user is OK with the dry run result, then he can go for the actual run.

Regards,
Ravindra.
+1 for Ravi's comment.
Better to show what would be deleted or moved to trash.

Regards,
Kunal Kapoor
In reply to this post by ravipesala
+1 for Ravi's comment. It's cleaner and safer.
Regards,
Akash R Nilugal
Thanks for the suggestion, Ravi.
We can include a property in the clean files command which decides whether to do a dry run:

clean files on table t1 options('dry_run' = true)

This will only show the segments which would be removed and will not clean/delete those segments or any data for that matter. By default, dry_run will be set to false, and the user can enable it when they want to use it.

Rgds,
Vikram
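For illustration, a rough sketch of how the dry-run branch might behave; SegmentStatus, Segment and deleteSegment are hypothetical stand-ins rather than the real CarbonData classes:

// Hedged sketch: with dry_run=true, only report the segments that would be cleaned.
import java.util.List;

public class CleanFilesDryRunSketch {

  enum SegmentStatus { SUCCESS, MARKED_FOR_DELETE, COMPACTED, INSERT_IN_PROGRESS }

  static class Segment {
    final String id;
    final SegmentStatus status;
    Segment(String id, SegmentStatus status) { this.id = id; this.status = status; }
  }

  void cleanFiles(List<Segment> segments, boolean dryRun) {
    for (Segment s : segments) {
      boolean cleanable = s.status == SegmentStatus.MARKED_FOR_DELETE
          || s.status == SegmentStatus.COMPACTED;
      if (!cleanable) {
        continue;
      }
      if (dryRun) {
        System.out.println("would clean segment " + s.id);   // report only, no deletion
      } else {
        deleteSegment(s);                                     // actual physical clean-up
      }
    }
  }

  void deleteSegment(Segment s) { /* omitted */ }
}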
Hi all,

PFA the design document. Please provide suggestions or feedback.

Vikram Ahuja
Hi all,

PFA the document that lists all the changes done as part of the Clean Files Enhancement. The changes were done in 2 phases.

Regards
Vikram