Hi dev,
The current implementation of S3 support has a few limitations, which are listed below.

*Problem (Locking):* While writing a file onto HDFS, a lock is acquired to ensure synchronisation. This is not feasible on S3 because, unlike HDFS, S3 does not provide any lease (a guarantee that only one user can write to a file at a time).
*Solution:* Introduce a memory lock, held in the driver JVM, which can take care of the above problem (a rough sketch is attached at the end of this mail).

*Problem (Write with append mode):* Every time a thrift-related file stream is opened, append mode is used, which is not supported in S3. Therefore, while writing index files in append mode, the existing file is read into memory, rewritten with overwrite set to true, and only then is the new content written to the file.
*Solution:* Change the current implementation of ThriftWriter (for S3) to collect the contents of the index file in a buffer, add the new content, and overwrite the whole file at once (see the buffered-write sketch below).

*Problem (Alter rename):* On rename, the table path is currently also updated with the new table name. But S3 does not support a forced rename: the rename copies files onto the new path, which can be very time consuming. The current implementation can therefore be changed as follows:
- Rename the table in metadata without altering the table path (the table path will not be updated with the new table name).
- If the user then tries to create a table with the old table name, create the path with a UUID appended to the table name (see the path sketch below).

For example, suppose the table name is table1 and the table path is store/db1/table1. When renaming to table2, the table name in metadata will be updated to table2 but the path will remain the same. If the user then tries to create a new table named table1, its table path would be table1-<UUID>.

*Problem (Preaggregate transaction):* Pre-aggregate transaction support relies heavily on renaming the table status file, as follows:
- Write the main table segment as In-Progress in the tablestatus file.
- Write the aggregate table segment as In-Progress in the tablestatus file.
- When the load for an aggregate table completes, write the Success segment into a new table status file named tablestatus-UUID.
- When the loads for all aggregate tables are complete, rename the tablestatus file to tablestatus_backup_UUID, rename tablestatus-UUID to tablestatus, and remove all files with _backup_UUID once done. If everything succeeds, change the segment status to Success for the main table. If anything fails, use the _backup_UUID file to restore the aggregate table to its old state.

*Proposed Solution:* If we use a DB to store the table status of the aggregate table on S3, this problem will not arise, as the DB can ensure transactional behaviour during updates (a minimal JDBC sketch is below).

Any suggestion from the community is most welcome. Please let me know if anything needs clarification.

Regards
Kunal Kapoor
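P.S. To make the proposals concrete, here are rough sketches for discussion. All class, method, and column names below are illustrative assumptions, not existing CarbonData APIs.

First, a minimal sketch of the memory lock, keyed by the path the HDFS lock file would otherwise use. Note it only works within a single driver JVM:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative in-memory lock, keyed by the lock file path that would
// otherwise be created on HDFS. It coordinates threads within one
// driver JVM only; it cannot coordinate multiple drivers.
public final class MemoryLockManager {
  private static final ConcurrentHashMap<String, ReentrantLock> LOCKS =
      new ConcurrentHashMap<>();

  public static boolean tryLock(String lockPath) {
    return LOCKS.computeIfAbsent(lockPath, k -> new ReentrantLock()).tryLock();
  }

  public static void unlock(String lockPath) {
    ReentrantLock lock = LOCKS.get(lockPath);
    if (lock != null && lock.isHeldByCurrentThread()) {
      lock.unlock();
    }
  }
}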
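Next, a sketch of the buffered overwrite for index files. It uses java.nio.file for brevity; the real implementation would go through CarbonData's file abstraction:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative buffered writer: instead of opening the index file in
// append mode (unsupported on S3), read the existing content into a
// buffer, append the new bytes, and overwrite the whole file at once.
public final class BufferedIndexWriter {
  public static void appendByOverwrite(Path indexFile, byte[] newContent)
      throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    if (Files.exists(indexFile)) {
      buffer.write(Files.readAllBytes(indexFile)); // existing content
    }
    buffer.write(newContent); // new entries
    // Single overwrite; on S3 this becomes one PUT of the full object.
    Files.write(indexFile, buffer.toByteArray());
  }
}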
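For the rename case, a sketch of the path resolution when the old table name is reused after a rename:

import java.util.UUID;

// Illustrative path resolution for CREATE TABLE: if the default path
// (store/db1/<tableName>) is still occupied by a renamed table, append
// a UUID so the new table gets a fresh folder.
public final class TablePathResolver {
  public static String resolve(String storePath, String dbName,
      String tableName, boolean defaultPathInUse) {
    String basePath = storePath + "/" + dbName + "/" + tableName;
    if (!defaultPathInUse) {
      return basePath; // e.g. store/db1/table1
    }
    return basePath + "-" + UUID.randomUUID(); // e.g. store/db1/table1-<UUID>
  }
}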
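Finally, a sketch of the DB-backed table status update. The table_status schema and JDBC URL are assumptions; the point is that commit/rollback replaces the tablestatus rename-and-backup dance:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Illustrative transactional status update: the main table segment and
// the aggregate table segment flip to SUCCESS atomically, or not at all.
public final class DbTableStatusUpdater {
  public static void markSuccess(String jdbcUrl, String mainTable,
      String mainSegment, String aggTable, String aggSegment)
      throws SQLException {
    try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
      conn.setAutoCommit(false);
      try (PreparedStatement ps = conn.prepareStatement(
          "UPDATE table_status SET status = 'SUCCESS' "
              + "WHERE table_name = ? AND segment_id = ?")) {
        ps.setString(1, mainTable);
        ps.setString(2, mainSegment);
        ps.executeUpdate();
        ps.setString(1, aggTable);
        ps.setString(2, aggSegment);
        ps.executeUpdate();
        conn.commit(); // both updates become visible together
      } catch (SQLException e) {
        conn.rollback(); // neither update is visible on failure
        throw e;
      }
    }
  }
}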
Hi Kunal,
I have some questions.

*Problem (Locking):* Does the memory lock support multiple drivers concurrently loading data to the same table? If not, maybe this limitation should be noted.

*Problem (Write with append mode):* 1. Atomicity: if the overwrite operation fails, the old file may be destroyed. It should be possible to recover the old file.

*Problem (Alter rename):* If the table folder differs from the table name, maybe the "refresh table" command should be enhanced.

-----
Best Regards
David Cai
Hi David,
Thanks for the suggestions.

1. The memory lock cannot support multiple drivers. The documentation will be updated with this limitation.
2. I agree that in case of failure, reverting the changes is necessary. I will take care of this point (a rough sketch of one way to do it is below).
3. You are right, refresh using the table name would not work. I think we can introduce refresh using the path for this scenario.

Thanks
Kunal Kapoor
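Regarding point 2, a rough sketch of the backup-and-restore idea, again with illustrative names and java.nio.file standing in for the real file abstraction:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustrative recovery-aware overwrite: keep a backup of the old file
// so it can be restored if the overwrite fails part-way.
public final class SafeOverwrite {
  public static void overwrite(Path file, byte[] fullContent)
      throws IOException {
    Path backup = file.resolveSibling(file.getFileName() + ".backup");
    if (Files.exists(file)) {
      Files.copy(file, backup, StandardCopyOption.REPLACE_EXISTING);
    }
    try {
      Files.write(file, fullContent);
      Files.deleteIfExists(backup); // success: backup no longer needed
    } catch (IOException e) {
      if (Files.exists(backup)) {
        // restore the old content before propagating the failure
        Files.copy(backup, file, StandardCopyOption.REPLACE_EXISTING);
      }
      throw e;
    }
  }
}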