GitHub user kevinjmh opened a pull request:
https://github.com/apache/carbondata/pull/2732

[WIP] lz4 as column compressor in final store

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
  Please provide details on
  - Whether new unit test cases have been added or why no new tests are required?
  - How it is tested? Please attach test report.
  - Is it a performance related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kevinjmh/carbondata lz4

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2732.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2732

----
commit b5a4353c9f7536973f8aa1900757e2266cde31ee
Author: Manhua <kevinjmh@...>
Date: 2018-09-18T11:41:51Z

    lz4 test
----
--- |
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2732 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/331/ --- |
In reply to this post by qiuchenjian-2
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2732 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8578/ --- |
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2732 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/508/ --- |
GitHub user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2732#discussion_r218477379 --- Diff: core/src/main/java/net/jpountz/lz4/LZ4CompressorWithLength.java --- @@ -0,0 +1,225 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +// code ported from https://github.com/lz4/lz4-java/issues/119 +// remove this class when new version > 1.4.1 released +// this is only for test + +/* + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package net.jpountz.lz4; + +import java.nio.ByteBuffer; +import java.util.Arrays; + +/** + * Covenience class to include the length of the original decompressed data + * in the output compressed data, so that the user does not need to save + * the length at anywhere else. 
The compressed data must be decompressed by + * {@link LZ4DecompressorWithLength} and is NOT compatible with any other + * decompressors in lz4-java or any other lz4 tools. This class deliberately + * does not extend {@link LZ4Compressor} because they are not interchangable. + */ + +public class LZ4CompressorWithLength { --- End diff -- Is this copied from net.jpountz.lz4? --- |
GitHub user kevinjmh commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2732#discussion_r218635891 --- Diff: core/src/main/java/net/jpountz/lz4/LZ4CompressorWithLength.java --- @@ -0,0 +1,225 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +// code ported from https://github.com/lz4/lz4-java/issues/119 +// remove this class when new version > 1.4.1 released +// this is only for test + +/* + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package net.jpountz.lz4; + +import java.nio.ByteBuffer; +import java.util.Arrays; + +/** + * Covenience class to include the length of the original decompressed data + * in the output compressed data, so that the user does not need to save + * the length at anywhere else. 
The compressed data must be decompressed by + * {@link LZ4DecompressorWithLength} and is NOT compatible with any other + * decompressors in lz4-java or any other lz4 tools. This class deliberately + * does not extend {@link LZ4Compressor} because they are not interchangable. + */ + +public class LZ4CompressorWithLength { --- End diff -- Yes. This code is not packaged in a released jar, so we copied it here only for testing. See the comment at L18-20. --- |
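The length-prefixed layout described in the Javadoc above can be sketched with the JDK alone. This is a minimal illustration, not the lz4-java code: the class name `LengthPrefixSketch` and its helper methods are hypothetical, and the byte order is assumed to be little-endian to match the `getDecompressedLength` decoding quoted later in this thread.

```java
import java.util.Arrays;

// Hypothetical sketch of a 4-byte length header; not part of lz4-java or CarbonData.
class LengthPrefixSketch {
  static final int HEADER_BYTES = 4;

  // Prepends the original (uncompressed) length, little-endian, to the compressed bytes.
  static byte[] frame(int originalLength, byte[] compressed) {
    byte[] framed = new byte[HEADER_BYTES + compressed.length];
    framed[0] = (byte) originalLength;
    framed[1] = (byte) (originalLength >>> 8);
    framed[2] = (byte) (originalLength >>> 16);
    framed[3] = (byte) (originalLength >>> 24);
    System.arraycopy(compressed, 0, framed, HEADER_BYTES, compressed.length);
    return framed;
  }

  // Reads the header back using shift-and-mask decoding.
  static int originalLength(byte[] framed) {
    return (framed[0] & 0xFF)
        | (framed[1] & 0xFF) << 8
        | (framed[2] & 0xFF) << 16
        | (framed[3] & 0xFF) << 24;
  }

  // Returns the compressed payload that follows the header.
  static byte[] payload(byte[] framed) {
    return Arrays.copyOfRange(framed, HEADER_BYTES, framed.length);
  }

  public static void main(String[] args) {
    byte[] compressed = {10, 20, 30}; // stand-in for real compressed bytes
    byte[] framed = frame(70000, compressed);
    System.out.println(originalLength(framed));           // 70000
    System.out.println(Arrays.toString(payload(framed))); // [10, 20, 30]
  }
}
```

The extra four bytes are what make the output unreadable to tools that expect a raw LZ4 block, which is presumably why the Javadoc warns that the two formats are not interchangeable.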
GitHub user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2732#discussion_r218644727 --- Diff: core/src/main/java/net/jpountz/lz4/LZ4DecompressorWithLength.java --- @@ -0,0 +1,191 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +// code ported from https://github.com/lz4/lz4-java/issues/119 +// remove this class when new version > 1.4.1 released +// this is only for test + +/* + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package net.jpountz.lz4; + +import java.nio.ByteBuffer; + +// code ported from https://github.com/lz4/lz4-java/issues/119 +// remove this when new version > 1.4.1 released + +/** + * Convenience class to decompress data compressed by {@link LZ4CompressorWithLength}. 
+ * This decompressor is NOT compatible with any other compressors in lz4-java + * or any other lz4 tools. + * The user does not need to specify the length of the compressed data or + * original data because the length of the original decompressed data is + * included in the compressed data. + */ + +public class LZ4DecompressorWithLength { + + private final LZ4FastDecompressor decompressor; + + /** + * Returns the decompressed length of compressed data in <code>src</code>. + * + * @param src the compressed data + * @return the decompressed length + */ + public static int getDecompressedLength(byte[] src) { + return getDecompressedLength(src, 0); + } + + /** + * Returns the decompressed length of compressed data in <code>src[srcOff:]</code>. + * + * @param src the compressed data + * @param srcOff the start offset in src + * @return the decompressed length + */ + public static int getDecompressedLength(byte[] src, int srcOff) { + return (src[srcOff] & 0xFF) | (src[srcOff + 1] & 0xFF) << 8 | + (src[srcOff + 2] & 0xFF) << 16 | src[srcOff + 3] << 24; + } + + /** + * Returns the decompressed length of compressed data in <code>src</code>. + * + * @param src the compressed data + * @return the decompressed length + */ + public static int getDecompressedLength(ByteBuffer src) { + return getDecompressedLength(src, src.position()); + } + + /** + * Returns the decompressed length of compressed data in <code>src[srcOff:]</code>. + * + * @param src the compressed data + * @param srcOff the start offset in src + * @return the decompressed length + */ + public static int getDecompressedLength(ByteBuffer src, int srcOff) { + return (src.get(srcOff) & 0xFF) | (src.get(srcOff + 1) & 0xFF) << 8 | + (src.get(srcOff + 2) & 0xFF) << 16 | src.get(srcOff + 3) << 24; + } + + /** + * Creates a new decompressor to decompress data compressed by {@link LZ4CompressorWithLength}. 
+ * + * @param decompressor decompressor to use + */ + public LZ4DecompressorWithLength(LZ4FastDecompressor decompressor) { + this.decompressor = decompressor; + } + + /** + * Convenience method, equivalent to calling + * {@link #decompress(byte[], int, byte[], int) decompress(src, 0, dest, 0)}. + * + * @param src the compressed data + * @param dest the destination buffer to store the decompressed data + * @return the number of bytes read to restore the original input + */ + public int decompress(byte[] src, byte[] dest) { --- End diff -- As we discussed offline, the return value may not be the uncompressed size; we need to handle that. --- |
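The point raised here can be illustrated without lz4-java. In the sketch below, a toy run-length decoder (an assumption made purely for illustration; it is not LZ4) follows the same contract as the quoted Javadoc: it returns the number of bytes read from the source, not the number of bytes produced. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; the toy decoder only stands in for a "fast" decompressor.
class ReturnValueSketch {
  // Decodes (count, value) pairs until destLen bytes are written.
  // Returns the number of bytes READ from src, mirroring the quoted contract.
  static int toyDecompress(byte[] src, int srcOff, byte[] dest, int destOff, int destLen) {
    int in = srcOff;
    int out = destOff;
    int end = destOff + destLen;
    while (out < end) {
      int run = src[in] & 0xFF;
      byte value = src[in + 1];
      if (run == 0 || out + run > end) {
        throw new IllegalArgumentException("malformed input");
      }
      Arrays.fill(dest, out, out + run, value);
      in += 2;
      out += run;
    }
    return in - srcOff;
  }

  // Reads the 4-byte little-endian length header.
  static int readLength(byte[] src, int off) {
    return (src[off] & 0xFF)
        | (src[off + 1] & 0xFF) << 8
        | (src[off + 2] & 0xFF) << 16
        | (src[off + 3] & 0xFF) << 24;
  }

  // Sizes the destination from the header and uses the return value
  // only to check that the whole payload was consumed.
  static byte[] decompressFramed(byte[] framed) {
    int originalLength = readLength(framed, 0);
    byte[] dest = new byte[originalLength];
    int bytesRead = toyDecompress(framed, 4, dest, 0, originalLength);
    if (bytesRead != framed.length - 4) {
      throw new IllegalStateException("compressed payload size mismatch");
    }
    return dest;
  }

  public static void main(String[] args) {
    // Header says 6 bytes; payload is 4 x 'a' followed by 2 x 'b'.
    byte[] framed = {6, 0, 0, 0, 4, 'a', 2, 'b'};
    byte[] restored = decompressFramed(framed);
    System.out.println(new String(restored, StandardCharsets.US_ASCII)); // aaaabb
  }
}
```

Here the decoder reads 4 bytes but produces 6, so a caller that treated the return value as the uncompressed size would truncate the result.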
GitHub user kevinjmh commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2732#discussion_r218645316 --- Diff: core/src/main/java/net/jpountz/lz4/LZ4DecompressorWithLength.java --- @@ -0,0 +1,191 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +// code ported from https://github.com/lz4/lz4-java/issues/119 +// remove this class when new version > 1.4.1 released +// this is only for test + +/* + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package net.jpountz.lz4; + +import java.nio.ByteBuffer; + +// code ported from https://github.com/lz4/lz4-java/issues/119 +// remove this when new version > 1.4.1 released + +/** + * Convenience class to decompress data compressed by {@link LZ4CompressorWithLength}. 
+ * This decompressor is NOT compatible with any other compressors in lz4-java + * or any other lz4 tools. + * The user does not need to specify the length of the compressed data or + * original data because the length of the original decompressed data is + * included in the compressed data. + */ + +public class LZ4DecompressorWithLength { + + private final LZ4FastDecompressor decompressor; + + /** + * Returns the decompressed length of compressed data in <code>src</code>. + * + * @param src the compressed data + * @return the decompressed length + */ + public static int getDecompressedLength(byte[] src) { + return getDecompressedLength(src, 0); + } + + /** + * Returns the decompressed length of compressed data in <code>src[srcOff:]</code>. + * + * @param src the compressed data + * @param srcOff the start offset in src + * @return the decompressed length + */ + public static int getDecompressedLength(byte[] src, int srcOff) { + return (src[srcOff] & 0xFF) | (src[srcOff + 1] & 0xFF) << 8 | + (src[srcOff + 2] & 0xFF) << 16 | src[srcOff + 3] << 24; + } + + /** + * Returns the decompressed length of compressed data in <code>src</code>. + * + * @param src the compressed data + * @return the decompressed length + */ + public static int getDecompressedLength(ByteBuffer src) { + return getDecompressedLength(src, src.position()); + } + + /** + * Returns the decompressed length of compressed data in <code>src[srcOff:]</code>. + * + * @param src the compressed data + * @param srcOff the start offset in src + * @return the decompressed length + */ + public static int getDecompressedLength(ByteBuffer src, int srcOff) { + return (src.get(srcOff) & 0xFF) | (src.get(srcOff + 1) & 0xFF) << 8 | + (src.get(srcOff + 2) & 0xFF) << 16 | src.get(srcOff + 3) << 24; + } + + /** + * Creates a new decompressor to decompress data compressed by {@link LZ4CompressorWithLength}. 
+ * + * @param decompressor decompressor to use + */ + public LZ4DecompressorWithLength(LZ4FastDecompressor decompressor) { + this.decompressor = decompressor; + } + + /** + * Convenience method, equivalent to calling + * {@link #decompress(byte[], int, byte[], int) decompress(src, 0, dest, 0)}. + * + * @param src the compressed data + * @param dest the destination buffer to store the decompressed data + * @return the number of bytes read to restore the original input + */ + public int decompress(byte[] src, byte[] dest) { --- End diff -- ok, fixed --- |
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2732 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/341/ --- |
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2732 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8588/ --- |
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2732 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/518/ --- |
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2732 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/349/ --- |
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2732 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/350/ --- |
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2732 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/527/ --- |
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2732 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8597/ --- |
GitHub user kevinjmh commented on the issue:
https://github.com/apache/carbondata/pull/2732 retest this please --- |
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2732 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/360/ --- |
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2732 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8607/ --- |
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2732 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/537/ --- |
GitHub user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2732#discussion_r219024705 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/compression/Lz4Compressor.java --- @@ -0,0 +1,198 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.core.datastore.compression; + +import java.io.IOException; +import java.nio.ByteBuffer; +import java.nio.DoubleBuffer; +import java.nio.FloatBuffer; +import java.nio.IntBuffer; +import java.nio.LongBuffer; +import java.nio.ShortBuffer; +import java.util.Arrays; + +import org.apache.carbondata.core.util.ByteUtil; + +import net.jpountz.lz4.LZ4Compressor; +import net.jpountz.lz4.LZ4Factory; +import net.jpountz.lz4.LZ4FastDecompressor; + + +public class Lz4Compressor implements Compressor { + + private LZ4Compressor compressor; + private LZ4FastDecompressor decompressor; + + public Lz4Compressor() { + LZ4Factory factory = LZ4Factory.fastestInstance(); + compressor = factory.fastCompressor(); + decompressor = factory.fastDecompressor(); + } + + @Override + public String getName() { + return "lz4"; + } + + @Override + public byte[] compressByte(byte[] unCompInput) { + // get max compressed length --- End diff -- If this code is copied from LZ4, better to 
declare that explicitly. --- |