[ https://issues.apache.org/jira/browse/CARBONDATA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravindra Pesala resolved CARBONDATA-1690.
-----------------------------------------
    Resolution: Fixed

> Query failed after swap table by renaming
> -----------------------------------------
>
>                 Key: CARBONDATA-1690
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1690
>             Project: CarbonData
>          Issue Type: Bug
>          Components: spark-integration
>    Affects Versions: 1.3.0
>            Reporter: xuchuanyin
>            Assignee: xuchuanyin
>             Fix For: 1.3.0
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> # SCENARIO
> I encountered a query error after swapping two tables by renaming them. The steps to reproduce this bug are listed below.
> These steps work fine:
> 1. CREATE TABLE `t1`;
> 2. LOAD DATA TO `t1`;
> 3. CREATE TABLE `t2`;
> 4. LOAD DATA TO `t2`;
> 5. RENAME `t1` TO `t3`;
> 6. RENAME `t2` TO `t1`;
> 7. QUERY `t1`;
> These steps fail:
> 1. CREATE TABLE `t1`;
> 2. LOAD DATA TO `t1`;
> 3. CREATE TABLE `t2`;
> 4. LOAD DATA TO `t2`;
> **5. QUERY `t1`;** --- Added this step
> 6. RENAME `t1` TO `t3`;
> 7. RENAME `t2` TO `t1`;
> 8. QUERY `t1`; --- This step will cause failure
> The two scenarios differ only in that the second one adds Step 5; the error is thrown in Step 8. The error message in the Spark SQL shell looks like this:
> ```
> Error: java.io.FileNotFoundException: File hdfs://slave1:9000/carbonstore/default/test_table/Fact/Part0/Segment_0/part-0-0_batchno0-0-1510144676427.carbondata does not exist. (state=,code=0)
> ```
> # Analyze
> Renaming a table in CarbonData is actually done by renaming the corresponding data folder. In addition, CarbonData also refreshes the metadata and its cache.
> From the error message above, we can see that the file name is exactly the one from before the rename operation. We guess the problem may lie in the data map.
> In the second scenario, when we query `t1` before renaming (Step 5), the corresponding data map is loaded and cached. Since the data map is keyed by table name, when we query `t1` again after renaming (Step 8), the previously cached data map is used. It is outdated and incorrect, and therefore causes the `FileNotFoundException` error.
> In the first scenario, the query on `t1` (Step 7) is the first time the data map is loaded, so the correct data is read; that is why it works.
> # Resolve
> There are two ways to fix this bug:
> 1. Change the index key of the data map: use `table_name + table_schema_last_update_time` in place of `table_name`.
> 2. Clear the corresponding data map when performing the rename operation.
> I prefer the second one since it is easy to implement: just one line of code.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
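The stale-cache behavior described in the issue, and the two proposed fixes, can be pictured with a minimal Python sketch. Everything in it is hypothetical: `TableStore`, its methods, the `/store/...` paths, and the integer clock standing in for `table_schema_last_update_time` only model the idea and are not CarbonData's actual data map code.

```python
class TableStore:
    """Toy model of a table store with a cached 'data map' (hypothetical names)."""

    def __init__(self, clear_on_rename=False, versioned_key=False):
        self.files = set()          # full paths of data files currently on "disk"
        self.last_update = {}       # table name -> logical schema update time
        self.datamap_cache = {}     # (table, version) -> cached list of file paths
        self.clock = 0              # logical clock, stands in for a timestamp
        self.clear_on_rename = clear_on_rename   # fix 2 from the issue
        self.versioned_key = versioned_key       # fix 1 from the issue

    def _touch(self, table):
        self.clock += 1
        self.last_update[table] = self.clock

    def _key(self, table):
        # Without fix 1 the version is constant, so the key is effectively
        # the table name alone, as in the reported bug.
        version = self.last_update[table] if self.versioned_key else 0
        return (table, version)

    def create_and_load(self, table, part):
        self.files.add(f"/store/{table}/{part}")
        self._touch(table)

    def rename(self, old, new):
        # Renaming a table is modeled as renaming its data folder.
        prefix = f"/store/{old}/"
        moved = {f"/store/{new}/{p[len(prefix):]}"
                 for p in self.files if p.startswith(prefix)}
        self.files = {p for p in self.files if not p.startswith(prefix)} | moved
        del self.last_update[old]
        self._touch(new)
        if self.clear_on_rename:
            # Fix 2: drop cached data maps for both names involved.
            self.datamap_cache = {k: v for k, v in self.datamap_cache.items()
                                  if k[0] not in (old, new)}

    def query(self, table):
        key = self._key(table)
        if key not in self.datamap_cache:
            prefix = f"/store/{table}/"
            self.datamap_cache[key] = sorted(
                p for p in self.files if p.startswith(prefix))
        paths = self.datamap_cache[key]
        for p in paths:
            if p not in self.files:
                raise FileNotFoundError(f"File {p} does not exist.")
        return paths


def swap_after_query(store):
    """Replay the failing scenario: query t1, swap t1 and t2, query t1 again."""
    store.create_and_load("t1", "part-a.carbondata")
    store.create_and_load("t2", "part-b.carbondata")
    store.query("t1")            # Step 5: caches the data map for t1
    store.rename("t1", "t3")     # Step 6
    store.rename("t2", "t1")     # Step 7
    return store.query("t1")     # Step 8
```

With the default settings, `swap_after_query(TableStore())` raises `FileNotFoundError` for the pre-rename path, while passing `clear_on_rename=True` or `versioned_key=True` makes the final query rebuild the data map from the renamed folder. In this sketch the versioned key leaves the outdated entry in the cache until something evicts it, whereas clearing on rename removes it immediately.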