[GitHub] carbondata pull request #2666: [WIP] Fix double boundary condition and clear...

classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2666: [WIP] Fix double boundary condition and clear...

qiuchenjian-2
GitHub user ravipesala opened a pull request:

    https://github.com/apache/carbondata/pull/2666

    [WIP] Fix double boundary condition and clear datamaps issue

    This PR depends on https://github.com/apache/carbondata/pull/2659
    1.DataMaps are not clearing properly as it creates temp table for each request. Now it searches the datamap using the tablepath to clear and also to get the datamap
   
    2. In double value bounadry cases  loading fails as carbon does not handle infinite properly. Now added a check for infinite value.
   
    3. Added validations for sort columns cannot be used while infering the schema.
   
    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravipesala/incubator-carbondata stats-issue

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2666.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2666
   
----
commit f8881fb3da31337f071e0fba936d714d46a4afd6
Author: ravipesala <ravi.pesala@...>
Date:   2018-08-24T15:13:07Z

    Fix complex filters on spark carbon file format

commit 18ae8ae9a5ed36005af5c7917c0439405e6b661a
Author: ravipesala <ravi.pesala@...>
Date:   2018-08-24T15:32:53Z

    Add jar for sdvtest

commit 9d67d830d123cfbb9619b8054dd7a748e02bc023
Author: ravipesala <ravi.pesala@...>
Date:   2018-08-26T17:14:22Z

    Fix comments

commit 348cd9c3aee82d54f76bdd5beaf31217827cdf91
Author: ravipesala <ravi.pesala@...>
Date:   2018-08-27T09:11:02Z

    Support all types of datatypes in carbon fileformat

commit 7d412cd7f81f2985f28647516931f3a4a3739419
Author: ravipesala <ravi.pesala@...>
Date:   2018-08-27T15:48:06Z

    Fix multiple issues

commit 4a5a7efdf85e51c22d5db995262b65198b8372a5
Author: ravipesala <ravi.pesala@...>
Date:   2018-08-28T01:30:50Z

    Fix error of supporting more columns

commit febd215ffa9f52367c7eccb9010932f3be2ef455
Author: ravipesala <ravi.pesala@...>
Date:   2018-08-28T05:18:51Z

    Fix tests 2.1

commit d402633d0363a2e36b57872f3a551d366d21dfe8
Author: ravipesala <ravi.pesala@...>
Date:   2018-08-28T07:14:58Z

    Fix comment

commit 851d09feb9119c92499a037fceea3fbc55fb50ff
Author: ravipesala <ravi.pesala@...>
Date:   2018-08-28T11:47:38Z

    Fix double boundary condition and clear datamaps issue

----


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2666: [WIP] Fix double boundary condition and clear datama...

qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2666
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6442/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2666: [WIP] Fix double boundary condition and clear...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user KanakaKumar commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2666#discussion_r213330337
 
    --- Diff: integration/spark-datasource/src/main/scala/org/apache/spark/sql/carbondata/execution/datasources/SparkCarbonFileFormat.scala ---
    @@ -79,10 +79,22 @@ class SparkCarbonFileFormat extends FileFormat
           options: Map[String, String],
           files: Seq[FileStatus]): Option[StructType] = {
         val tablePath = options.get("path") match {
    -      case Some(path) => path
    -      case _ => FileFactory.getUpdatedFilePath(files.head.getPath.getParent.toUri.toString)
    +      case Some(path) =>
    +        val defaultFsUrl =
    +          sparkSession.sparkContext.hadoopConfiguration.get(CarbonCommonConstants.FS_DEFAULT_FS)
    +        if (defaultFsUrl == null) {
    +          path
    +        } else {
    +          defaultFsUrl + CarbonCommonConstants.FILE_SEPARATOR + path
    --- End diff --
   
    Check if the path already has  schema prefix before adding the defaultFSURL ( hdfs://, s3a:/,etc/)


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2666: [WIP] Fix double boundary condition and clear...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user KanakaKumar commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2666#discussion_r213334033
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java ---
    @@ -341,6 +348,18 @@ public TableDataMap getDataMap(CarbonTable table, DataMapSchema dataMapSchema) {
         return dataMap;
       }
     
    +  private String getKeyUsingTablePath(String tablePath) {
    +    if (tablePath != null) {
    +      // Try get using table path
    +      for (Map.Entry<String, String> entry : tablePathMap.entrySet()) {
    +        if (new Path(entry.getValue()).equals(new Path(tablePath))) {
    --- End diff --
   
    I think we can directly use tablePathMap.get(tablePath).  new Path does not change comparison


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2666: [WIP] Fix double boundary condition and clear datama...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2666
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6445/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2666: [WIP] Fix double boundary condition and clear datama...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2666
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8125/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2666: [WIP] Fix double boundary condition and clear datama...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2666
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/50/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2666: [WIP] Fix double boundary condition and clear datama...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2666
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6446/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2666: [CARBONDATA-2898] Fix double boundary condition and ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2666
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8129/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2666: [CARBONDATA-2898] Fix double boundary condition and ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2666
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/58/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2666: [CARBONDATA-2898] Fix double boundary conditi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2666#discussion_r213655978
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java ---
    @@ -341,6 +348,18 @@ public TableDataMap getDataMap(CarbonTable table, DataMapSchema dataMapSchema) {
         return dataMap;
       }
     
    +  private String getKeyUsingTablePath(String tablePath) {
    +    if (tablePath != null) {
    +      // Try get using table path
    +      for (Map.Entry<String, String> entry : tablePathMap.entrySet()) {
    +        if (new Path(entry.getValue()).equals(new Path(tablePath))) {
    --- End diff --
   
    Path has utilities to verify the path in all scenarios, so it is better to use path


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2666: [CARBONDATA-2898] Fix double boundary condition and ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2666
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6459/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2666: [CARBONDATA-2898] Fix double boundary condition and ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user KanakaKumar commented on the issue:

    https://github.com/apache/carbondata/pull/2666
 
    LGTM


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2666: [CARBONDATA-2898] Fix double boundary condition and ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2666
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8145/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2666: [CARBONDATA-2898] Fix double boundary condition and ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2666
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/74/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2666: [CARBONDATA-2898] Fix double boundary condition and ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on the issue:

    https://github.com/apache/carbondata/pull/2666
 
    LGTM


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2666: [CARBONDATA-2898] Fix double boundary conditi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2666


---