GitHub user xuchuanyin opened a pull request:
https://github.com/apache/carbondata/pull/2200 [CARBONDATA-2373][DataMap] Add bloom datamap to support precise equal query

For each indexed column, a bloom filter is created for each blocklet to indicate whether a given value may belong to that blocklet. Currently the bloom filter uses the Guava implementation.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

- [x] Any interfaces changed? `Yes, added interface in DataMapMeta`
- [x] Any backward compatibility impacted? `NO`
- [x] Document update required? `NO`
- [x] Testing done. Please provide details on:
  - Whether new unit test cases have been added or why no new tests are required? `Added tests`
  - How is it tested? Please attach test report. `Tested on a local machine`
  - Is it a performance related change? Please attach the performance test report. `The bloom datamap can reduce the number of blocklets scanned in precise equal query scenarios and enhance query performance`
  - Any additional information to help reviewers in testing this change. `NO`
- [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. `Not related`

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xuchuanyin/carbondata 0421_bloom_datamap

Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2200.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2200

----
commit 160b0f42248fe719f898c10cb84ab2d32eafdaac
Author: xuchuanyin <xuchuanyin@...>
Date: 2018-04-21T02:59:04Z

    Add bloom datamap using bloom filter

    For each indexed column, a bloom filter is created for each blocklet to indicate whether a given value may belong to that blocklet. Currently the bloom filter uses the Guava implementation.
----
---
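For readers new to the design, here is a minimal self-contained sketch of the core idea, using the same Guava `BloomFilter` the PR builds on; the class name and sample value are illustrative, while the sizing mirrors the `BLOOM_FILTER_SIZE = 32000 * 20` and false-positive rate used by the writer in this PR.

```java
import java.nio.charset.StandardCharsets;

import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

public class BloomBlockletSketch {
  public static void main(String[] args) {
    // One filter per indexed column per blocklet. 32000 rows per page times
    // 20 pages = 640000 expected insertions, with a 0.00001 false-positive
    // rate, matching the parameters in this PR's writer.
    BloomFilter<byte[]> blockletFilter =
        BloomFilter.create(Funnels.byteArrayFunnel(), 32000 * 20, 0.00001d);

    // Write path: record every value of the indexed column in the blocklet.
    blockletFilter.put("city_5".getBytes(StandardCharsets.UTF_8));

    // Query path for `city = 'city_5'`: false lets the scan skip the whole
    // blocklet; true means it must still be scanned, since bloom filters
    // give false positives but never false negatives.
    System.out.println("scan this blocklet? "
        + blockletFilter.mightContain("city_5".getBytes(StandardCharsets.UTF_8)));
  }
}
```

This is exactly why the datamap is coarse-grained: a negative answer prunes a blocklet outright, while a positive answer only says the blocklet might contain the value.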
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2200#discussion_r183198962 --- Diff: datamap/bloom/pom.xml --- @@ -0,0 +1,88 @@ +<project xmlns="http://maven.apache.org/POM/4.0.0" + xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" + xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> + <modelVersion>4.0.0</modelVersion> + + <parent> + <groupId>org.apache.carbondata</groupId> + <artifactId>carbondata-parent</artifactId> + <version>1.4.0-SNAPSHOT</version> + <relativePath>../../pom.xml</relativePath> + </parent> + + <artifactId>carbondata-bloom</artifactId> + <name>Apache CarbonData :: Bloom Index DataMap</name> + + <properties> + <dev.path>${basedir}/../../dev</dev.path> + <lucene.version>6.3.0</lucene.version> --- End diff -- can you move this definition to the parent pom ---
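A sketch of what the reviewer is asking for, assuming the version moves into the parent pom's `<properties>` section (surrounding elements abbreviated):

```xml
<!-- carbondata-parent pom.xml: define the version once -->
<properties>
  <lucene.version>6.3.0</lucene.version>
</properties>

<!-- datamap/bloom/pom.xml: the module then inherits ${lucene.version}
     and no longer declares its own copy in <properties> -->
```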
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2200#discussion_r183198993 --- Diff: datamap/bloom/pom.xml --- @@ -0,0 +1,88 @@ +<project xmlns="http://maven.apache.org/POM/4.0.0" + xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" + xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> + <modelVersion>4.0.0</modelVersion> + + <parent> + <groupId>org.apache.carbondata</groupId> + <artifactId>carbondata-parent</artifactId> + <version>1.4.0-SNAPSHOT</version> + <relativePath>../../pom.xml</relativePath> + </parent> + + <artifactId>carbondata-bloom</artifactId> + <name>Apache CarbonData :: Bloom Index DataMap</name> + + <properties> + <dev.path>${basedir}/../../dev</dev.path> + <lucene.version>6.3.0</lucene.version> + <solr.version>6.3.0</solr.version> + </properties> + + <dependencies> + <dependency> + <groupId>org.apache.carbondata</groupId> + <artifactId>carbondata-spark2</artifactId> + <version>${project.version}</version> + </dependency> + <dependency> + <groupId>org.apache.commons</groupId> + <artifactId>commons-lang3</artifactId> + <version>3.3.2</version> --- End diff -- can you move this version definition to parent pom --- |
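For a dependency version (as opposed to a plain property), the conventional Maven pattern is a `<dependencyManagement>` block in the parent pom; a sketch under that assumption:

```xml
<!-- carbondata-parent pom.xml: manage the version centrally -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-lang3</artifactId>
      <version>3.3.2</version>
    </dependency>
  </dependencies>
</dependencyManagement>

<!-- datamap/bloom/pom.xml: the <version> element can then be dropped -->
<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-lang3</artifactId>
</dependency>
```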
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2200#discussion_r183199019 --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java --- @@ -0,0 +1,243 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.datamap.bloom; + +import java.io.DataInputStream; +import java.io.EOFException; +import java.io.IOException; +import java.io.ObjectInputStream; +import java.io.UnsupportedEncodingException; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Set; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.datamap.dev.DataMapModel; +import org.apache.carbondata.core.datamap.dev.cgdatamap.CoarseGrainDataMap; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.datastore.impl.FileFactory; +import org.apache.carbondata.core.indexstore.Blocklet; +import org.apache.carbondata.core.indexstore.PartitionSpec; +import org.apache.carbondata.core.memory.MemoryException; +import org.apache.carbondata.core.metadata.datatype.DataType; +import org.apache.carbondata.core.metadata.datatype.DataTypes; +import org.apache.carbondata.core.scan.expression.ColumnExpression; +import org.apache.carbondata.core.scan.expression.Expression; +import org.apache.carbondata.core.scan.expression.LiteralExpression; +import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression; +import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf; +import org.apache.carbondata.core.util.CarbonUtil; + +import com.google.common.collect.ArrayListMultimap; +import com.google.common.collect.Multimap; +import org.apache.commons.lang3.StringUtils; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.PathFilter; + +public class BloomCoarseGrainDataMap extends CoarseGrainDataMap { --- End diff -- add @InterfaceAudience.Internal --- |
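The requested change, sketched. The annotation and its import already appear on `BloomDataMapWriter` elsewhere in this PR, so only the class header changes; the same request applies to `BloomCoarseGrainDataMapFactory` below.

```java
import org.apache.carbondata.common.annotations.InterfaceAudience;

// Marks the class as internal API, not for use by end users.
@InterfaceAudience.Internal
public class BloomCoarseGrainDataMap extends CoarseGrainDataMap {
  // ... body unchanged
}
```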
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2200#discussion_r183199028 --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java --- @@ -0,0 +1,243 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.datamap.bloom; + +import java.io.DataInputStream; +import java.io.EOFException; +import java.io.IOException; +import java.io.ObjectInputStream; +import java.io.UnsupportedEncodingException; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Set; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.datamap.dev.DataMapModel; +import org.apache.carbondata.core.datamap.dev.cgdatamap.CoarseGrainDataMap; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.datastore.impl.FileFactory; +import org.apache.carbondata.core.indexstore.Blocklet; +import org.apache.carbondata.core.indexstore.PartitionSpec; +import org.apache.carbondata.core.memory.MemoryException; +import org.apache.carbondata.core.metadata.datatype.DataType; +import org.apache.carbondata.core.metadata.datatype.DataTypes; +import org.apache.carbondata.core.scan.expression.ColumnExpression; +import org.apache.carbondata.core.scan.expression.Expression; +import org.apache.carbondata.core.scan.expression.LiteralExpression; +import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression; +import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf; +import org.apache.carbondata.core.util.CarbonUtil; + +import com.google.common.collect.ArrayListMultimap; +import com.google.common.collect.Multimap; +import org.apache.commons.lang3.StringUtils; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.PathFilter; + +public class BloomCoarseGrainDataMap extends CoarseGrainDataMap { + private static final LogService LOGGER = + LogServiceFactory.getLogService(BloomCoarseGrainDataMap.class.getName()); + private String[] indexFilePath; + private Set<String> indexedColumn; + private List<BloomDMModel> bloomIndexList; + private Multimap<String, List<BloomDMModel>> indexCol2BloomDMList; + + @Override + public void init(DataMapModel dataMapModel) throws MemoryException, IOException { + Path indexPath = FileFactory.getPath(dataMapModel.getFilePath()); + FileSystem fs = FileFactory.getFileSystem(indexPath); + if (!fs.exists(indexPath)) { + throw new IOException( + String.format("Path %s for Bloom index dataMap does not exist", indexPath)); + } + if (!fs.isDirectory(indexPath)) { + throw new IOException( + 
String.format("Path %s for Bloom index dataMap must be a directory", indexPath)); + } + + FileStatus[] indexFileStatus = fs.listStatus(indexPath, new PathFilter() { + @Override public boolean accept(Path path) { + return path.getName().endsWith(".bloomindex"); --- End diff -- make a constant string for `.bloomindex` --- |
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2200#discussion_r183199053 --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java --- @@ -0,0 +1,243 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.datamap.bloom; + +import java.io.DataInputStream; +import java.io.EOFException; +import java.io.IOException; +import java.io.ObjectInputStream; +import java.io.UnsupportedEncodingException; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Set; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.datamap.dev.DataMapModel; +import org.apache.carbondata.core.datamap.dev.cgdatamap.CoarseGrainDataMap; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.datastore.impl.FileFactory; +import org.apache.carbondata.core.indexstore.Blocklet; +import org.apache.carbondata.core.indexstore.PartitionSpec; +import org.apache.carbondata.core.memory.MemoryException; +import org.apache.carbondata.core.metadata.datatype.DataType; +import org.apache.carbondata.core.metadata.datatype.DataTypes; +import org.apache.carbondata.core.scan.expression.ColumnExpression; +import org.apache.carbondata.core.scan.expression.Expression; +import org.apache.carbondata.core.scan.expression.LiteralExpression; +import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression; +import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf; +import org.apache.carbondata.core.util.CarbonUtil; + +import com.google.common.collect.ArrayListMultimap; +import com.google.common.collect.Multimap; +import org.apache.commons.lang3.StringUtils; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.PathFilter; + +public class BloomCoarseGrainDataMap extends CoarseGrainDataMap { + private static final LogService LOGGER = + LogServiceFactory.getLogService(BloomCoarseGrainDataMap.class.getName()); + private String[] indexFilePath; + private Set<String> indexedColumn; + private List<BloomDMModel> bloomIndexList; + private Multimap<String, List<BloomDMModel>> indexCol2BloomDMList; + + @Override + public void init(DataMapModel dataMapModel) throws MemoryException, IOException { + Path indexPath = FileFactory.getPath(dataMapModel.getFilePath()); + FileSystem fs = FileFactory.getFileSystem(indexPath); + if (!fs.exists(indexPath)) { + throw new IOException( + String.format("Path %s for Bloom index dataMap does not exist", indexPath)); + } + if (!fs.isDirectory(indexPath)) { + throw new IOException( + 
String.format("Path %s for Bloom index dataMap must be a directory", indexPath)); + } + + FileStatus[] indexFileStatus = fs.listStatus(indexPath, new PathFilter() { + @Override public boolean accept(Path path) { + return path.getName().endsWith(".bloomindex"); + } + }); + indexFilePath = new String[indexFileStatus.length]; + indexedColumn = new HashSet<String>(); + bloomIndexList = new ArrayList<BloomDMModel>(); + indexCol2BloomDMList = ArrayListMultimap.create(); + for (int i = 0; i < indexFileStatus.length; i++) { + indexFilePath[i] = indexFileStatus[i].getPath().toString(); + String indexCol = StringUtils.substringBetween(indexFilePath[i], ".carbondata.", + ".bloomindex"); + indexedColumn.add(indexCol); + bloomIndexList.addAll(readBloomIndex(indexFilePath[i])); + indexCol2BloomDMList.put(indexCol, readBloomIndex(indexFilePath[i])); + } + LOGGER.info("find bloom index datamap for column: " + + StringUtils.join(indexedColumn, ", ")); + } + + private List<BloomDMModel> readBloomIndex(String indexFile) throws IOException { + LOGGER.info("read bloom index from file: " + indexFile); + List<BloomDMModel> bloomDMModelList = new ArrayList<BloomDMModel>(); + DataInputStream dataInStream = null; + ObjectInputStream objectInStream = null; + try { + dataInStream = FileFactory.getDataInputStream(indexFile, FileFactory.getFileType(indexFile)); + objectInStream = new ObjectInputStream(dataInStream); + try { + BloomDMModel model = null; + while ((model = (BloomDMModel) objectInStream.readObject()) != null) { + LOGGER.info("read bloom index: " + model); + bloomDMModelList.add(model); + } + } catch (EOFException e) { + LOGGER.info("read " + bloomDMModelList.size() + " bloom indices from " + indexFile); + } + return bloomDMModelList; + } catch (ClassNotFoundException e) { + LOGGER.error("Error occrus while reading bloom index"); + throw new RuntimeException("Error occrus while reading bloom index", e); + } finally { + CarbonUtil.closeStreams(objectInStream, dataInStream); + } + } + + @Override + public List<Blocklet> prune(FilterResolverIntf filterExp, SegmentProperties segmentProperties, + List<PartitionSpec> partitions) throws IOException { + List<Blocklet> hitBlocklets = new ArrayList<Blocklet>(); + if (filterExp == null) { + return null; --- End diff -- better to return an empty list --- |
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2200#discussion_r183199149 --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java --- @@ -0,0 +1,192 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.datamap.bloom; + +import java.io.File; +import java.io.IOException; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Objects; +import java.util.Set; + +import org.apache.carbondata.common.exceptions.sql.MalformedDataMapCommandException; +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.datamap.DataMapDistributable; +import org.apache.carbondata.core.datamap.DataMapLevel; +import org.apache.carbondata.core.datamap.DataMapMeta; +import org.apache.carbondata.core.datamap.Segment; +import org.apache.carbondata.core.datamap.dev.DataMapFactory; +import org.apache.carbondata.core.datamap.dev.DataMapModel; +import org.apache.carbondata.core.datamap.dev.DataMapWriter; +import org.apache.carbondata.core.datamap.dev.cgdatamap.CoarseGrainDataMap; +import org.apache.carbondata.core.datastore.filesystem.CarbonFile; +import org.apache.carbondata.core.datastore.impl.FileFactory; +import org.apache.carbondata.core.memory.MemoryException; +import org.apache.carbondata.core.metadata.CarbonMetadata; +import org.apache.carbondata.core.metadata.schema.table.CarbonTable; +import org.apache.carbondata.core.metadata.schema.table.DataMapSchema; +import org.apache.carbondata.core.metadata.schema.table.column.CarbonColumn; +import org.apache.carbondata.core.readcommitter.ReadCommittedScope; +import org.apache.carbondata.core.scan.filter.intf.ExpressionType; +import org.apache.carbondata.core.statusmanager.SegmentStatusManager; +import org.apache.carbondata.core.util.CarbonUtil; +import org.apache.carbondata.core.util.path.CarbonTablePath; +import org.apache.carbondata.events.Event; + +import org.apache.commons.lang3.StringUtils; + +public class BloomCoarseGrainDataMapFactory implements DataMapFactory<CoarseGrainDataMap> { --- End diff -- add @InterfaceAudience.Internal --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2200 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4081/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2200 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5262/ --- |
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2200#discussion_r183201359 --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java --- @@ -0,0 +1,243 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.datamap.bloom; + +import java.io.DataInputStream; +import java.io.EOFException; +import java.io.IOException; +import java.io.ObjectInputStream; +import java.io.UnsupportedEncodingException; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Set; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.datamap.dev.DataMapModel; +import org.apache.carbondata.core.datamap.dev.cgdatamap.CoarseGrainDataMap; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.datastore.impl.FileFactory; +import org.apache.carbondata.core.indexstore.Blocklet; +import org.apache.carbondata.core.indexstore.PartitionSpec; +import org.apache.carbondata.core.memory.MemoryException; +import org.apache.carbondata.core.metadata.datatype.DataType; +import org.apache.carbondata.core.metadata.datatype.DataTypes; +import org.apache.carbondata.core.scan.expression.ColumnExpression; +import org.apache.carbondata.core.scan.expression.Expression; +import org.apache.carbondata.core.scan.expression.LiteralExpression; +import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression; +import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf; +import org.apache.carbondata.core.util.CarbonUtil; + +import com.google.common.collect.ArrayListMultimap; +import com.google.common.collect.Multimap; +import org.apache.commons.lang3.StringUtils; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.PathFilter; + +public class BloomCoarseGrainDataMap extends CoarseGrainDataMap { + private static final LogService LOGGER = + LogServiceFactory.getLogService(BloomCoarseGrainDataMap.class.getName()); + private String[] indexFilePath; + private Set<String> indexedColumn; + private List<BloomDMModel> bloomIndexList; + private Multimap<String, List<BloomDMModel>> indexCol2BloomDMList; + + @Override + public void init(DataMapModel dataMapModel) throws MemoryException, IOException { + Path indexPath = FileFactory.getPath(dataMapModel.getFilePath()); + FileSystem fs = FileFactory.getFileSystem(indexPath); + if (!fs.exists(indexPath)) { + throw new IOException( + String.format("Path %s for Bloom index dataMap does not exist", indexPath)); + } + if (!fs.isDirectory(indexPath)) { + throw new IOException( + 
String.format("Path %s for Bloom index dataMap must be a directory", indexPath)); + } + + FileStatus[] indexFileStatus = fs.listStatus(indexPath, new PathFilter() { + @Override public boolean accept(Path path) { + return path.getName().endsWith(".bloomindex"); + } + }); + indexFilePath = new String[indexFileStatus.length]; + indexedColumn = new HashSet<String>(); + bloomIndexList = new ArrayList<BloomDMModel>(); + indexCol2BloomDMList = ArrayListMultimap.create(); + for (int i = 0; i < indexFileStatus.length; i++) { + indexFilePath[i] = indexFileStatus[i].getPath().toString(); + String indexCol = StringUtils.substringBetween(indexFilePath[i], ".carbondata.", + ".bloomindex"); + indexedColumn.add(indexCol); + bloomIndexList.addAll(readBloomIndex(indexFilePath[i])); + indexCol2BloomDMList.put(indexCol, readBloomIndex(indexFilePath[i])); + } + LOGGER.info("find bloom index datamap for column: " + + StringUtils.join(indexedColumn, ", ")); + } + + private List<BloomDMModel> readBloomIndex(String indexFile) throws IOException { + LOGGER.info("read bloom index from file: " + indexFile); + List<BloomDMModel> bloomDMModelList = new ArrayList<BloomDMModel>(); + DataInputStream dataInStream = null; + ObjectInputStream objectInStream = null; + try { + dataInStream = FileFactory.getDataInputStream(indexFile, FileFactory.getFileType(indexFile)); + objectInStream = new ObjectInputStream(dataInStream); + try { + BloomDMModel model = null; + while ((model = (BloomDMModel) objectInStream.readObject()) != null) { + LOGGER.info("read bloom index: " + model); + bloomDMModelList.add(model); + } + } catch (EOFException e) { + LOGGER.info("read " + bloomDMModelList.size() + " bloom indices from " + indexFile); + } + return bloomDMModelList; + } catch (ClassNotFoundException e) { + LOGGER.error("Error occrus while reading bloom index"); + throw new RuntimeException("Error occrus while reading bloom index", e); + } finally { + CarbonUtil.closeStreams(objectInStream, dataInStream); + } + } + + @Override + public List<Blocklet> prune(FilterResolverIntf filterExp, SegmentProperties segmentProperties, + List<PartitionSpec> partitions) throws IOException { + List<Blocklet> hitBlocklets = new ArrayList<Blocklet>(); + if (filterExp == null) { + return null; --- End diff -- I learned it LuceneDataMap. Besides, null and empty is not the same here. Empty means that we have pruned all the blocklets here, meaning there is no blocklet to scan later. --- |
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/2200 @jackylk All review comments have been fixed except this one: https://github.com/apache/carbondata/pull/2200#discussion_r183201359 --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2200 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4088/ --- |
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2200#discussion_r183203455 --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java --- @@ -0,0 +1,243 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.datamap.bloom; + +import java.io.DataInputStream; +import java.io.EOFException; +import java.io.IOException; +import java.io.ObjectInputStream; +import java.io.UnsupportedEncodingException; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Set; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.datamap.dev.DataMapModel; +import org.apache.carbondata.core.datamap.dev.cgdatamap.CoarseGrainDataMap; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.datastore.impl.FileFactory; +import org.apache.carbondata.core.indexstore.Blocklet; +import org.apache.carbondata.core.indexstore.PartitionSpec; +import org.apache.carbondata.core.memory.MemoryException; +import org.apache.carbondata.core.metadata.datatype.DataType; +import org.apache.carbondata.core.metadata.datatype.DataTypes; +import org.apache.carbondata.core.scan.expression.ColumnExpression; +import org.apache.carbondata.core.scan.expression.Expression; +import org.apache.carbondata.core.scan.expression.LiteralExpression; +import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression; +import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf; +import org.apache.carbondata.core.util.CarbonUtil; + +import com.google.common.collect.ArrayListMultimap; +import com.google.common.collect.Multimap; +import org.apache.commons.lang3.StringUtils; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.PathFilter; + +public class BloomCoarseGrainDataMap extends CoarseGrainDataMap { + private static final LogService LOGGER = + LogServiceFactory.getLogService(BloomCoarseGrainDataMap.class.getName()); + private String[] indexFilePath; + private Set<String> indexedColumn; + private List<BloomDMModel> bloomIndexList; + private Multimap<String, List<BloomDMModel>> indexCol2BloomDMList; + + @Override + public void init(DataMapModel dataMapModel) throws MemoryException, IOException { + Path indexPath = FileFactory.getPath(dataMapModel.getFilePath()); + FileSystem fs = FileFactory.getFileSystem(indexPath); + if (!fs.exists(indexPath)) { + throw new IOException( + String.format("Path %s for Bloom index dataMap does not exist", indexPath)); + } + if (!fs.isDirectory(indexPath)) { + throw new IOException( + 
String.format("Path %s for Bloom index dataMap must be a directory", indexPath)); + } + + FileStatus[] indexFileStatus = fs.listStatus(indexPath, new PathFilter() { + @Override public boolean accept(Path path) { + return path.getName().endsWith(".bloomindex"); + } + }); + indexFilePath = new String[indexFileStatus.length]; + indexedColumn = new HashSet<String>(); + bloomIndexList = new ArrayList<BloomDMModel>(); + indexCol2BloomDMList = ArrayListMultimap.create(); + for (int i = 0; i < indexFileStatus.length; i++) { + indexFilePath[i] = indexFileStatus[i].getPath().toString(); + String indexCol = StringUtils.substringBetween(indexFilePath[i], ".carbondata.", + ".bloomindex"); + indexedColumn.add(indexCol); + bloomIndexList.addAll(readBloomIndex(indexFilePath[i])); + indexCol2BloomDMList.put(indexCol, readBloomIndex(indexFilePath[i])); + } + LOGGER.info("find bloom index datamap for column: " + + StringUtils.join(indexedColumn, ", ")); + } + + private List<BloomDMModel> readBloomIndex(String indexFile) throws IOException { + LOGGER.info("read bloom index from file: " + indexFile); + List<BloomDMModel> bloomDMModelList = new ArrayList<BloomDMModel>(); + DataInputStream dataInStream = null; + ObjectInputStream objectInStream = null; + try { + dataInStream = FileFactory.getDataInputStream(indexFile, FileFactory.getFileType(indexFile)); + objectInStream = new ObjectInputStream(dataInStream); + try { + BloomDMModel model = null; + while ((model = (BloomDMModel) objectInStream.readObject()) != null) { + LOGGER.info("read bloom index: " + model); + bloomDMModelList.add(model); + } + } catch (EOFException e) { + LOGGER.info("read " + bloomDMModelList.size() + " bloom indices from " + indexFile); + } + return bloomDMModelList; + } catch (ClassNotFoundException e) { + LOGGER.error("Error occrus while reading bloom index"); + throw new RuntimeException("Error occrus while reading bloom index", e); + } finally { + CarbonUtil.closeStreams(objectInStream, dataInStream); + } + } + + @Override + public List<Blocklet> prune(FilterResolverIntf filterExp, SegmentProperties segmentProperties, + List<PartitionSpec> partitions) throws IOException { + List<Blocklet> hitBlocklets = new ArrayList<Blocklet>(); + if (filterExp == null) { + return null; --- End diff -- Oh, then we should document it clearly somewhere, otherwise it is confusing --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2200 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5269/ --- |
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2200#discussion_r183210168 --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java --- @@ -0,0 +1,243 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.datamap.bloom; + +import java.io.DataInputStream; +import java.io.EOFException; +import java.io.IOException; +import java.io.ObjectInputStream; +import java.io.UnsupportedEncodingException; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Set; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.datamap.dev.DataMapModel; +import org.apache.carbondata.core.datamap.dev.cgdatamap.CoarseGrainDataMap; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.datastore.impl.FileFactory; +import org.apache.carbondata.core.indexstore.Blocklet; +import org.apache.carbondata.core.indexstore.PartitionSpec; +import org.apache.carbondata.core.memory.MemoryException; +import org.apache.carbondata.core.metadata.datatype.DataType; +import org.apache.carbondata.core.metadata.datatype.DataTypes; +import org.apache.carbondata.core.scan.expression.ColumnExpression; +import org.apache.carbondata.core.scan.expression.Expression; +import org.apache.carbondata.core.scan.expression.LiteralExpression; +import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression; +import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf; +import org.apache.carbondata.core.util.CarbonUtil; + +import com.google.common.collect.ArrayListMultimap; +import com.google.common.collect.Multimap; +import org.apache.commons.lang3.StringUtils; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.PathFilter; + +public class BloomCoarseGrainDataMap extends CoarseGrainDataMap { + private static final LogService LOGGER = + LogServiceFactory.getLogService(BloomCoarseGrainDataMap.class.getName()); + private String[] indexFilePath; + private Set<String> indexedColumn; + private List<BloomDMModel> bloomIndexList; + private Multimap<String, List<BloomDMModel>> indexCol2BloomDMList; + + @Override + public void init(DataMapModel dataMapModel) throws MemoryException, IOException { + Path indexPath = FileFactory.getPath(dataMapModel.getFilePath()); + FileSystem fs = FileFactory.getFileSystem(indexPath); + if (!fs.exists(indexPath)) { + throw new IOException( + String.format("Path %s for Bloom index dataMap does not exist", indexPath)); + } + if (!fs.isDirectory(indexPath)) { + throw new IOException( + 
String.format("Path %s for Bloom index dataMap must be a directory", indexPath)); + } + + FileStatus[] indexFileStatus = fs.listStatus(indexPath, new PathFilter() { + @Override public boolean accept(Path path) { + return path.getName().endsWith(".bloomindex"); + } + }); + indexFilePath = new String[indexFileStatus.length]; + indexedColumn = new HashSet<String>(); + bloomIndexList = new ArrayList<BloomDMModel>(); + indexCol2BloomDMList = ArrayListMultimap.create(); + for (int i = 0; i < indexFileStatus.length; i++) { + indexFilePath[i] = indexFileStatus[i].getPath().toString(); + String indexCol = StringUtils.substringBetween(indexFilePath[i], ".carbondata.", + ".bloomindex"); + indexedColumn.add(indexCol); + bloomIndexList.addAll(readBloomIndex(indexFilePath[i])); + indexCol2BloomDMList.put(indexCol, readBloomIndex(indexFilePath[i])); + } + LOGGER.info("find bloom index datamap for column: " + + StringUtils.join(indexedColumn, ", ")); + } + + private List<BloomDMModel> readBloomIndex(String indexFile) throws IOException { + LOGGER.info("read bloom index from file: " + indexFile); + List<BloomDMModel> bloomDMModelList = new ArrayList<BloomDMModel>(); + DataInputStream dataInStream = null; + ObjectInputStream objectInStream = null; + try { + dataInStream = FileFactory.getDataInputStream(indexFile, FileFactory.getFileType(indexFile)); + objectInStream = new ObjectInputStream(dataInStream); + try { + BloomDMModel model = null; + while ((model = (BloomDMModel) objectInStream.readObject()) != null) { + LOGGER.info("read bloom index: " + model); + bloomDMModelList.add(model); + } + } catch (EOFException e) { + LOGGER.info("read " + bloomDMModelList.size() + " bloom indices from " + indexFile); + } + return bloomDMModelList; + } catch (ClassNotFoundException e) { + LOGGER.error("Error occrus while reading bloom index"); + throw new RuntimeException("Error occrus while reading bloom index", e); + } finally { + CarbonUtil.closeStreams(objectInStream, dataInStream); + } + } + + @Override + public List<Blocklet> prune(FilterResolverIntf filterExp, SegmentProperties segmentProperties, + List<PartitionSpec> partitions) throws IOException { + List<Blocklet> hitBlocklets = new ArrayList<Blocklet>(); + if (filterExp == null) { + return null; --- End diff -- :ok_hand: fine~ --- |
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/2200 All review comments have been fixed --- |
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2200#discussion_r183210621 --- Diff: datamap/bloom/src/test/scala/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapSuite.scala --- @@ -0,0 +1,123 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.datamap.bloom + +import java.io.{File, PrintWriter} +import java.util.UUID + +import scala.util.Random + +import org.apache.spark.sql.Row +import org.apache.spark.sql.test.util.QueryTest +import org.scalatest.BeforeAndAfterAll + +class BloomCoarseGrainDataMapSuite extends QueryTest with BeforeAndAfterAll { + val inputFile = s"$resourcesPath/bloom_datamap_input.csv" + val normalTable = "carbon_normal" + val bloomDMSampleTable = "carbon_bloom" + val dataMapName = "bloom_dm" + val lineNum = 500000 + + override protected def beforeAll(): Unit = { + createFile(inputFile, line = lineNum, start = 0) + sql(s"DROP TABLE IF EXISTS $normalTable") + sql(s"DROP TABLE IF EXISTS $bloomDMSampleTable") + } + + test("test bloom datamap") { + sql( + s""" + | CREATE TABLE $normalTable(id INT, name STRING, city STRING, age INT, + | s1 STRING, s2 STRING, s3 STRING, s4 STRING, s5 STRING, s6 STRING, s7 STRING, s8 STRING) + | STORED BY 'carbondata' TBLPROPERTIES('table_blocksize'='128') + | """.stripMargin) + sql( + s""" + | CREATE TABLE $bloomDMSampleTable(id INT, name STRING, city STRING, age INT, + | s1 STRING, s2 STRING, s3 STRING, s4 STRING, s5 STRING, s6 STRING, s7 STRING, s8 STRING) + | STORED BY 'carbondata' TBLPROPERTIES('table_blocksize'='128') + | """.stripMargin) + sql( + s""" + | CREATE DATAMAP $dataMapName ON TABLE $bloomDMSampleTable + | USING '${classOf[BloomCoarseGrainDataMapFactory].getName}' + | DMProperties('BLOOM_COLUMNS'='city,id') + """.stripMargin) + + sql( + s""" + | LOAD DATA LOCAL INPATH '$inputFile' INTO TABLE $normalTable + | OPTIONS('header'='false') + """.stripMargin) + sql( + s""" + | LOAD DATA LOCAL INPATH '$inputFile' INTO TABLE $bloomDMSampleTable + | OPTIONS('header'='false') + """.stripMargin) + + sql(s"show datamap on table $bloomDMSampleTable").show(false) + sql(s"select * from $bloomDMSampleTable where city = 'city_5'").show(false) --- End diff -- can you also assert the bloom index file is created in the file system? --- |
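A sketch of the assertion the reviewer asks for. It assumes QueryTest exposes the store path as `storeLocation` (as other CarbonData suites do) and that the table directory follows the `database/table` layout; both are assumptions, not the suite's final code.

```scala
import java.io.File

// Hypothetical helper: recursively collect .bloomindex files under a dir.
def listBloomIndexFiles(dir: File): Seq[File] = {
  val children = Option(dir.listFiles()).map(_.toSeq).getOrElse(Seq.empty)
  val (subDirs, files) = children.partition(_.isDirectory)
  files.filter(_.getName.endsWith(".bloomindex")) ++
    subDirs.flatMap(listBloomIndexFiles)
}

val tableDir = new File(s"$storeLocation/default/$bloomDMSampleTable")
assert(listBloomIndexFiles(tableDir).nonEmpty,
  "expected the load to produce at least one .bloomindex file")
```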
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2200#discussion_r183210634 --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomDataMapWriter.java --- @@ -0,0 +1,216 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.datamap.bloom; + +import java.io.DataOutputStream; +import java.io.File; +import java.io.IOException; +import java.io.ObjectOutputStream; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +import org.apache.carbondata.common.annotations.InterfaceAudience; +import org.apache.carbondata.core.datamap.DataMapMeta; +import org.apache.carbondata.core.datamap.Segment; +import org.apache.carbondata.core.datamap.dev.DataMapWriter; +import org.apache.carbondata.core.datastore.impl.FileFactory; +import org.apache.carbondata.core.datastore.page.ColumnPage; +import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier; +import org.apache.carbondata.core.metadata.datatype.DataType; +import org.apache.carbondata.core.metadata.datatype.DataTypes; +import org.apache.carbondata.core.util.CarbonUtil; + +import com.google.common.hash.BloomFilter; +import com.google.common.hash.Funnels; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; + +@InterfaceAudience.Internal +public class BloomDataMapWriter extends DataMapWriter { + /** + * suppose one blocklet contains 20 page and all the indexed value is distinct. + * later we can make it configurable. 
+ */ + private static final int BLOOM_FILTER_SIZE = 32000 * 20; + private String dataMapName; + private List<String> indexedColumns; + // map column name to ordinal in pages + private Map<String, Integer> col2Ordianl; + private Map<String, DataType> col2DataType; + private String currentBlockId; + private int currentBlockletId; + private List<String> currentDMFiles; + private List<DataOutputStream> currentDataOutStreams; + private List<ObjectOutputStream> currentObjectOutStreams; + private List<BloomFilter<byte[]>> indexBloomFilters; + + public BloomDataMapWriter(AbsoluteTableIdentifier identifier, DataMapMeta dataMapMeta, + Segment segment, String writeDirectoryPath) { + super(identifier, segment, writeDirectoryPath); + dataMapName = dataMapMeta.getDataMapName(); + indexedColumns = dataMapMeta.getIndexedColumns(); + col2Ordianl = new HashMap<String, Integer>(indexedColumns.size()); + col2DataType = new HashMap<String, DataType>(indexedColumns.size()); + + currentDMFiles = new ArrayList<String>(indexedColumns.size()); + currentDataOutStreams = new ArrayList<DataOutputStream>(indexedColumns.size()); + currentObjectOutStreams = new ArrayList<ObjectOutputStream>(indexedColumns.size()); + + indexBloomFilters = new ArrayList<BloomFilter<byte[]>>(indexedColumns.size()); + } + + @Override + public void onBlockStart(String blockId, long taskId) throws IOException { + this.currentBlockId = blockId; + this.currentBlockletId = 0; + currentDMFiles.clear(); + currentDataOutStreams.clear(); + currentObjectOutStreams.clear(); + initDataMapFile(); + } + + @Override + public void onBlockEnd(String blockId) throws IOException { + for (int indexColId = 0; indexColId < indexedColumns.size(); indexColId++) { + CarbonUtil.closeStreams(this.currentDataOutStreams.get(indexColId), + this.currentObjectOutStreams.get(indexColId)); + commitFile(this.currentDMFiles.get(indexColId)); + } + } + + @Override public void onBlockletStart(int blockletId) { --- End diff -- move @Override to previous line --- |
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2200#discussion_r183210637 --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomDataMapWriter.java --- @@ -0,0 +1,216 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.datamap.bloom; + +import java.io.DataOutputStream; +import java.io.File; +import java.io.IOException; +import java.io.ObjectOutputStream; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +import org.apache.carbondata.common.annotations.InterfaceAudience; +import org.apache.carbondata.core.datamap.DataMapMeta; +import org.apache.carbondata.core.datamap.Segment; +import org.apache.carbondata.core.datamap.dev.DataMapWriter; +import org.apache.carbondata.core.datastore.impl.FileFactory; +import org.apache.carbondata.core.datastore.page.ColumnPage; +import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier; +import org.apache.carbondata.core.metadata.datatype.DataType; +import org.apache.carbondata.core.metadata.datatype.DataTypes; +import org.apache.carbondata.core.util.CarbonUtil; + +import com.google.common.hash.BloomFilter; +import com.google.common.hash.Funnels; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; + +@InterfaceAudience.Internal +public class BloomDataMapWriter extends DataMapWriter { + /** + * suppose one blocklet contains 20 page and all the indexed value is distinct. + * later we can make it configurable. 
+ */ + private static final int BLOOM_FILTER_SIZE = 32000 * 20; + private String dataMapName; + private List<String> indexedColumns; + // map column name to ordinal in pages + private Map<String, Integer> col2Ordianl; + private Map<String, DataType> col2DataType; + private String currentBlockId; + private int currentBlockletId; + private List<String> currentDMFiles; + private List<DataOutputStream> currentDataOutStreams; + private List<ObjectOutputStream> currentObjectOutStreams; + private List<BloomFilter<byte[]>> indexBloomFilters; + + public BloomDataMapWriter(AbsoluteTableIdentifier identifier, DataMapMeta dataMapMeta, + Segment segment, String writeDirectoryPath) { + super(identifier, segment, writeDirectoryPath); + dataMapName = dataMapMeta.getDataMapName(); + indexedColumns = dataMapMeta.getIndexedColumns(); + col2Ordianl = new HashMap<String, Integer>(indexedColumns.size()); + col2DataType = new HashMap<String, DataType>(indexedColumns.size()); + + currentDMFiles = new ArrayList<String>(indexedColumns.size()); + currentDataOutStreams = new ArrayList<DataOutputStream>(indexedColumns.size()); + currentObjectOutStreams = new ArrayList<ObjectOutputStream>(indexedColumns.size()); + + indexBloomFilters = new ArrayList<BloomFilter<byte[]>>(indexedColumns.size()); + } + + @Override + public void onBlockStart(String blockId, long taskId) throws IOException { + this.currentBlockId = blockId; + this.currentBlockletId = 0; + currentDMFiles.clear(); + currentDataOutStreams.clear(); + currentObjectOutStreams.clear(); + initDataMapFile(); + } + + @Override + public void onBlockEnd(String blockId) throws IOException { + for (int indexColId = 0; indexColId < indexedColumns.size(); indexColId++) { + CarbonUtil.closeStreams(this.currentDataOutStreams.get(indexColId), + this.currentObjectOutStreams.get(indexColId)); + commitFile(this.currentDMFiles.get(indexColId)); + } + } + + @Override public void onBlockletStart(int blockletId) { + this.currentBlockletId = blockletId; + indexBloomFilters.clear(); + for (int i = 0; i < indexedColumns.size(); i++) { + indexBloomFilters.add(BloomFilter.create(Funnels.byteArrayFunnel(), + BLOOM_FILTER_SIZE, 0.00001d)); + } + } + + @Override + public void onBlockletEnd(int blockletId) { + try { + writeBloomDataMapFile(); + } catch (Exception e) { + for (ObjectOutputStream objectOutputStream : currentObjectOutStreams) { + CarbonUtil.closeStreams(objectOutputStream); + } + for (DataOutputStream dataOutputStream : currentDataOutStreams) { + CarbonUtil.closeStreams(dataOutputStream); + } + throw new RuntimeException(e); + } + } + + @Override public void onPageAdded(int blockletId, int pageId, ColumnPage[] pages) --- End diff -- move @override to previous line --- |
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2200#discussion_r183210742 --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomDataMapWriter.java --- @@ -0,0 +1,216 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.datamap.bloom; + +import java.io.DataOutputStream; +import java.io.File; +import java.io.IOException; +import java.io.ObjectOutputStream; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +import org.apache.carbondata.common.annotations.InterfaceAudience; +import org.apache.carbondata.core.datamap.DataMapMeta; +import org.apache.carbondata.core.datamap.Segment; +import org.apache.carbondata.core.datamap.dev.DataMapWriter; +import org.apache.carbondata.core.datastore.impl.FileFactory; +import org.apache.carbondata.core.datastore.page.ColumnPage; +import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier; +import org.apache.carbondata.core.metadata.datatype.DataType; +import org.apache.carbondata.core.metadata.datatype.DataTypes; +import org.apache.carbondata.core.util.CarbonUtil; + +import com.google.common.hash.BloomFilter; +import com.google.common.hash.Funnels; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; + +@InterfaceAudience.Internal +public class BloomDataMapWriter extends DataMapWriter { + /** + * suppose one blocklet contains 20 page and all the indexed value is distinct. + * later we can make it configurable. + */ + private static final int BLOOM_FILTER_SIZE = 32000 * 20; + private String dataMapName; + private List<String> indexedColumns; + // map column name to ordinal in pages + private Map<String, Integer> col2Ordianl; + private Map<String, DataType> col2DataType; + private String currentBlockId; + private int currentBlockletId; + private List<String> currentDMFiles; + private List<DataOutputStream> currentDataOutStreams; + private List<ObjectOutputStream> currentObjectOutStreams; + private List<BloomFilter<byte[]>> indexBloomFilters; + + public BloomDataMapWriter(AbsoluteTableIdentifier identifier, DataMapMeta dataMapMeta, --- End diff -- Add @InterfaceAudience And can you add description for: 1. BloomFilter is constructed in what level? page, blocklet, block? 2. bloomindex is written one file for one block, or one file for one write task? --- |
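From the writer code quoted above, the answers appear to be: one `BloomFilter` per indexed column per blocklet (created in `onBlockletStart`), serialized into one index file per indexed column per block (opened in `onBlockStart`, committed in `onBlockEnd`). A sketch of the class comment the reviewer asks for; this is a reading of the code as posted, not the author's confirmed answer:

```java
/**
 * Writer for the bloom index datamap.
 *
 * A BloomFilter is built per indexed column per blocklet; all filters for
 * one column within a block are serialized into a single .bloomindex file,
 * so each block produces as many index files as there are indexed columns.
 */
@InterfaceAudience.Internal
public class BloomDataMapWriter extends DataMapWriter {
  // ... body unchanged
}
```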