[GitHub] carbondata pull request #2898: [WIP] Fixed query failure in fileformat due s...


qiuchenjian-2
GitHub user manishgupta88 opened a pull request:

    https://github.com/apache/carbondata/pull/2898

    [WIP] Fixed query failure in fileformat due stale cache issue

    **Problem**
    When a table is created, dropped, and then recreated with the same name through the FileFormat API, queries fail with a schema mismatch.
   
    **Analysis**
    When carbondata is used through the FileFormat API and a table is dropped and then recreated with the same name, the dataMap still holds the stale carbon table, so a schema mismatch exception is thrown.
   
    **Solution**
    To avoid such scenarios, always update the retrieved carbon table object to the latest one.
   
     - [ ] Any interfaces changed?
    No
     - [ ] Any backward compatibility impacted?
     No
     - [ ] Document update required?
    No
     - [ ] Testing done
    Added UT to verify the scenario
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
    NA
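
The scenario in the Problem and Analysis sections above can be reproduced in miniature. The following is only a sketch under stated assumptions: `TableSchema`, `CachedFactory`, `getWithoutRefresh` and `getWithRefresh` are hypothetical names invented for illustration and are not CarbonData classes or methods. It shows a cache keyed by table name that keeps serving an object built for a dropped table, and how refreshing the held table on every lookup avoids the mismatch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StaleTableCacheSketch {

    /** Hypothetical stand-in for a table schema object (not CarbonData's CarbonTable). */
    public static final class TableSchema {
        public final String tableName;
        public final List<String> columns;

        public TableSchema(String tableName, List<String> columns) {
            this.tableName = tableName;
            this.columns = columns;
        }
    }

    /** Hypothetical cached per-table object that holds a reference to a schema. */
    public static final class CachedFactory {
        private TableSchema table;

        public CachedFactory(TableSchema table) {
            this.table = table;
        }

        public void setTable(TableSchema latest) {
            this.table = latest;
        }

        /** Throws if the schema used by the query differs from the one held here. */
        public void validate(TableSchema querySchema) {
            if (!table.columns.equals(querySchema.columns)) {
                throw new IllegalStateException("schema mismatch for " + table.tableName);
            }
        }
    }

    // Cache keyed by table name only, so a recreated table hits the old entry.
    private final Map<String, CachedFactory> cache = new HashMap<>();

    /** Reuses the cached entry unchanged; this is the failing behaviour. */
    public CachedFactory getWithoutRefresh(TableSchema current) {
        return cache.computeIfAbsent(current.tableName, k -> new CachedFactory(current));
    }

    /** Points the cached entry at the latest table on every lookup. */
    public CachedFactory getWithRefresh(TableSchema current) {
        CachedFactory factory =
            cache.computeIfAbsent(current.tableName, k -> new CachedFactory(current));
        factory.setTable(current);
        return factory;
    }

    public static void main(String[] args) {
        StaleTableCacheSketch store = new StaleTableCacheSketch();
        TableSchema v1 = new TableSchema("t1", Arrays.asList("id", "name"));
        store.getWithoutRefresh(v1).validate(v1); // first query populates the cache

        // The table is dropped and recreated with the same name, but the drop
        // never reaches this layer, so the cache entry is not cleared.
        TableSchema v2 = new TableSchema("t1", Arrays.asList("id", "name", "age"));
        try {
            store.getWithoutRefresh(v2).validate(v2);
        } catch (IllegalStateException e) {
            System.out.println("stale entry: " + e.getMessage());
        }

        store.getWithRefresh(v2).validate(v2); // passes once the table is refreshed
        System.out.println("refreshed entry: ok");
    }
}
```

Note that the refresh only replaces the table reference; anything else the cached object derived from the old table would still need to be cleared separately.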


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/carbondata stale_carbon_table

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2898.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2898
   
----
commit 2b6789ee5464f90f43ecac3654e58424257eaa29
Author: m00258959 <manish.gupta@...>
Date:   2018-11-05T10:15:46Z

    Fixed select query failure due to stale carbonTable in dataMapFactory class

----


---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    I think the current modification does not fix the root of the problem. If you think the table information is not being cleared, you should clear it, not just update it when you need it.
    The current implementation means that, for some time, outdated table information is still kept somewhere.


---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    @xuchuanyin ...your point is correct. To explain this in detail:
    1. We already have a way to clear the cached DataMaps through the API call `DataMapStoreManager.getInstance().clearDataMaps(AbsoluteTableIdentifier identifier)`. This call ensures that all the dataMaps for a given table are cleared.
    2. In the FileFormat case, if the customer has not integrated the above API, the drop table call may never reach the carbondata layer, and a few stale objects can remain that cause query failures.
    This PR is raised to handle the 2nd case. The other stale DataMaps are already taken care of by the LRU cache, which clears stale entries once the LRU cache threshold is reached.
    Let me know if you still have doubts.
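
The difference between the two clean-up paths mentioned above (an explicit clear call versus waiting for LRU eviction) can be illustrated with a minimal sketch. This is not CarbonData's LRU cache: `LruCache`, its entry-count capacity and the `db.t1`-style keys are assumptions made for the example, built on the standard `java.util.LinkedHashMap` access-order mode.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruVersusExplicitClear {

    /** Tiny LRU map: drops the least recently accessed entry once size exceeds capacity. */
    public static final class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        public LruCache(int capacity) {
            super(16, 0.75f, true); // true = iterate in access order
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity;
        }
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("db.t1", "dataMaps built for the dropped t1");
        cache.put("db.t2", "dataMaps for t2");

        // Without an explicit clear, the entry for the dropped table lingers.
        System.out.println(cache.containsKey("db.t1")); // true

        // It only goes away when the threshold is crossed and it is the eldest entry.
        cache.put("db.t3", "dataMaps for t3");
        System.out.println(cache.containsKey("db.t1")); // false

        // An explicit clear removes an entry immediately, regardless of cache pressure.
        cache.remove("db.t2");
        System.out.println(cache.containsKey("db.t2")); // false
    }
}
```

Until eviction happens, a lookup for a recreated table with the same key can still return the old entry, which is the window the PR is concerned with.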


---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1491/



---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9540/



---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1278/



---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    retest this please


---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1504/



---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9550/



---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1289/



---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    @manishgupta88 What if the user uses a fileformat carbontable and a normal carbontable at the same time? For example, creating/using/dropping a fileformat table and then creating/using/dropping a normal carbon table with the same name. Will this be OK?


---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    @xuchuanyin ...yes, this scenario will work fine. Dropping a normal table goes through the CarbonSession flow, and the drop table command already takes care of clearing the datamaps.
    When a fileFormat table is dropped and the customer has not integrated the clear dataMap API, the changes in this PR ensure that only the latest carbon table is referred to.


---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    retest this please


---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1511/



---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1300/



---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9561/



---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    retest this please


---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    @manishgupta88 It solves part of the problem (the schema mismatch issue). But when you call getDataMaps, it will still give you stale datamaps, right? How can those be updated?


---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1306/



---

[GitHub] carbondata issue #2898: [CARBONDATA-3077] Fixed query failure in fileformat ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/2898
 
    @ravipesala ...which method exactly are you referring to? In all `getDataMap` methods the latest `carbonTable` object is passed and used for fetching the dataMaps. There is only one `getAllDataMaps` method that does not take any parameter, but it is used only in the test cases.


---