GitHub user xubo245 opened a pull request:
https://github.com/apache/carbondata/pull/2792

[CARBONDATA-2981] Support read primitive data type in CSDK

1.support readNextCarbonRow
2.support read different primitive data type in c code from java side: int double short long string
3.support some data type and convert: date timestamp varchar decimal array<T>
    3.1 return int when read date
    3.2 return long when read timestamp
    3.3 return string when read varchar
    3.4 return string when read decimal
    3.5 support array<string>

This PR is based on PR2738, and will remove related commit after PR2738 merged.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

 - [ ] Any interfaces changed? add new interface
 - [ ] Any backward compatibility impacted? No
 - [ ] Document update required? Yes
 - [ ] Testing done update test case in c code
 - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. jira 2951

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xubo245/carbondata CARBONDATA-2981_primitiveDataType

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2792.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2792

----
commit 5f93bfc999dc7309671d59b1e73e4085d2684d58
Author: xubo245 <xubo29@...>
Date: 2018-09-20T10:35:34Z

    [CARBONDATA-2952] Provide CarbonReader C++ interface for SDK
    1.init carbonreader,config data path and tablename
    2.config ak sk endpoing for S3
    3.configure projection
    4.build carbon reader
    5.hasNext
    6.readNextRow
    7.close
    optimize

commit cd181b91c33d32e66a3f0026f1e3167a148b37e7
Author: xubo245 <xubo29@...>
Date: 2018-09-29T09:06:03Z

    [CARBONDATA-2981] Support read primitive data type in CSDK
    1.support readNextCarbonRow
    2.support read different primitive data type in c code from java side: int double short long string
    3.support some data type and convert: date timestamp varchar decimal array<T> su
----
--- |
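Editor's sketch of the conversions listed in items 3.1 to 3.4 above, offered only as an illustration. The class and method names are hypothetical and are not part of CarbonData. The units are assumptions too: the description only says that a date is returned as an int and a timestamp as a long, so days since 1970-01-01 and microseconds since the epoch are guesses made for this example.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.time.LocalDate;

// Hypothetical helper, not CarbonData code: shows the value shapes the PR
// description lists for types that have no direct JNI primitive.
final class CsdkValueSketch {
  // date -> int (ASSUMPTION: days since 1970-01-01)
  static int dateToInt(LocalDate date) {
    return Math.toIntExact(date.toEpochDay());
  }

  // timestamp -> long (ASSUMPTION: microseconds since the epoch)
  static long timestampToLong(Instant ts) {
    long wholeSeconds = Math.multiplyExact(ts.getEpochSecond(), 1_000_000L);
    return Math.addExact(wholeSeconds, ts.getNano() / 1_000);
  }

  // decimal -> string, mirroring the toString() call in the PR's getDecimal
  static String decimalToString(BigDecimal value) {
    return value.toString();
  }

  public static void main(String[] args) {
    System.out.println(dateToInt(LocalDate.of(2018, 9, 29)));
    System.out.println(timestampToLong(Instant.parse("2018-09-29T09:06:03Z")));
    System.out.println(decimalToString(new BigDecimal("12.50")));
  }
}
```

Passing a decimal as a string keeps its scale and precision intact, at the cost of parsing on the C side.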
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2792 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/657/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2792 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8919/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2792 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/851/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2792 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/661/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2792 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/856/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2792 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8924/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2792 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/662/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2792 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/663/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2792 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8926/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2792 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/664/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2792 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8927/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2792 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/859/ --- |
Github user kunal642 commented on the issue:
https://github.com/apache/carbondata/pull/2792 @xubo245 Please add a link to the CSDK guide in the README file. --- |
Github user KanakaKumar commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2792#discussion_r222686566

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/row/CarbonRow.java ---
@@ -57,6 +74,154 @@ public String getString(int ordinal) {
     return (String) data[ordinal];
   }

+  /**
+   * get short type data by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public short getShort(int ordinal) {
+    return (short) data[ordinal];
+  }
+
+  /**
+   * get int data type data by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public int getInt(int ordinal) {
+    return (Integer) data[ordinal];
+  }
+
+  /**
+   * get long data type data by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public long getLong(int ordinal) {
+    return (long) data[ordinal];
+  }
+
+  /**
+   * get array data type data by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public Object[] getArray(int ordinal) {
+    return (Object[]) data[ordinal];
+  }
+
+  /**
+   * get double data type data by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public double getDouble(int ordinal) {
+    return (double) data[ordinal];
+  }
+
+  /**
+   * get boolean data type data by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public boolean getBoolean(int ordinal) {
+    return (boolean) data[ordinal];
+  }
+
+  /**
+   * get byte data type data by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public Byte getByte(int ordinal) {
+    return (Byte) data[ordinal];
+  }
+
+  /**
+   * get float data type data by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public float getFloat(int ordinal) {
+    return (float) data[ordinal];
+  }
+
+  /**
+   * get varchar data type data by ordinal
+   * This is for CSDK
+   * JNI don't support varchar, so carbon convert decimal to string
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public String getVarchar(int ordinal) {
+    return (String) data[ordinal];
+  }
+
+  /**
+   * get decimal data type data by ordinal
+   * This is for CSDK
+   * JNI don't support Decimal, so carbon convert decimal to string
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public String getDecimal(int ordinal) {
+    return ((BigDecimal) data[ordinal]).toString();
+  }
+
+  /**
+   * get data type by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public DataType getDataType(int ordinal) {
+    return dataTypes[ordinal];
+  }
+
+  /**
+   * get data type name by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public String getDataTypeName(int ordinal) {
+    return dataTypes[ordinal].getName();
+  }
+
+  /**
+   * get element type name by ordinal
+   * child schema data type name
+   * for example: return STRING if it's Array<String> in java
+   *
+   * @param ordinal the data index of carbonRow
+   * @return element type name
+   */
+  public String getElementTypeName(int ordinal) {
--- End diff --

If this method can work only for Array, we can rename it to getArrayElementTypeName and throw an exception if it's not an array type. Returning null causes integration errors for unsupported data types.
--- |
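Editor's sketch of the suggestion above: fail fast with an exception instead of returning null when the column is not an array. `ColumnType` is a hypothetical stand-in defined here so the example is self-contained; it is not CarbonData's `DataType`, and the choice of `UnsupportedOperationException` is an assumption, since the review does not name an exception type.

```java
// Hypothetical illustration only; none of these types come from CarbonData.
final class ArrayElementTypeSketch {
  // Stand-in for a column data type.
  static final class ColumnType {
    final String name;
    final ColumnType elementType; // null unless this is an array type

    ColumnType(String name, ColumnType elementType) {
      this.name = name;
      this.elementType = elementType;
    }

    boolean isArray() {
      return elementType != null;
    }
  }

  // Returns the element type name for array columns and throws otherwise,
  // so callers never have to handle a null result.
  static String getArrayElementTypeName(ColumnType[] dataTypes, int ordinal) {
    ColumnType type = dataTypes[ordinal];
    if (!type.isArray()) {
      throw new UnsupportedOperationException(
          "Column " + ordinal + " is " + type.name + ", not an array type");
    }
    return type.elementType.name;
  }

  public static void main(String[] args) {
    ColumnType string = new ColumnType("STRING", null);
    ColumnType[] schema = {new ColumnType("INT", null), new ColumnType("ARRAY", string)};
    System.out.println(getArrayElementTypeName(schema, 1));
    try {
      getArrayElementTypeName(schema, 0);
    } catch (UnsupportedOperationException e) {
      System.out.println(e.getMessage());
    }
  }
}
```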
Github user KanakaKumar commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2792#discussion_r222689040

--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonRecordReader.java ---
@@ -116,6 +117,25 @@ public void initialize(InputSplit inputSplit, TaskAttemptContext context)
     return readSupport.readRow(carbonIterator.next());
   }

+  /**
+   * get CarbonRow data, including data and datatypes
+   *
+   * @return carbonRow object or data array or T
+   * @throws IOException
+   * @throws InterruptedException
+   */
+  public T getCarbonRow() throws IOException, InterruptedException {
--- End diff --

I think that instead of the confusing T, we can define the return type as CarbonRow itself.
--- |
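Editor's sketch of the point above, under stated assumptions: a reader that is generic in `T` can still expose a row accessor with a concrete return type, so callers need no cast. `RecordReaderSketch`, `Row`, and `ReadSupport` are hypothetical stand-ins written for this example and are not the CarbonData classes of similar names.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; not CarbonRecordReader.
final class RecordReaderSketch<T> {
  // Stand-in for the row type discussed in the review.
  static final class Row {
    private final Object[] data;

    Row(Object[] data) {
      this.data = data;
    }

    Object get(int ordinal) {
      return data[ordinal];
    }
  }

  // Stand-in for a read-support converter.
  interface ReadSupport<T> {
    T readRow(Object[] data);
  }

  private final Iterator<Object[]> rows;
  private final ReadSupport<T> readSupport;

  RecordReaderSketch(Iterator<Object[]> rows, ReadSupport<T> readSupport) {
    this.rows = rows;
    this.readSupport = readSupport;
  }

  // Generic path: what T means depends on the configured read support.
  T getCurrentValue() {
    return readSupport.readRow(rows.next());
  }

  // Concrete path: the signature states exactly what the caller receives.
  Row getCarbonRow() {
    return new Row(rows.next());
  }

  public static void main(String[] args) {
    List<Object[]> data = Arrays.<Object[]>asList(new Object[] {"a", 1}, new Object[] {"b", 2});
    RecordReaderSketch<Object[]> reader = new RecordReaderSketch<>(data.iterator(), d -> d);
    Row row = reader.getCarbonRow();
    System.out.println(row.get(0));
  }
}
```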
Github user KanakaKumar commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2792#discussion_r222690123

--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/readsupport/impl/DictionaryDecodeReadSupport.java ---
@@ -81,7 +82,24 @@
         data[i] = dictionaries[i].getDictionaryValueForKey((int) data[i]);
       }
     }
-    return (T)data;
+    return (T) data;
+  }
+
+  /**
+   * get carbonRow, including data and datatpes
+   *
+   * @param data row data
+   * @return CarbonRow Object
+   */
+  public T readCarbonRow(Object[] data) {
--- End diff --

Instead of changing the hierarchy of DictionaryDecodeReadSupport and other classes, I suggest using a new Row class as a utility and providing just the required methods, to avoid impact on the base code.
--- |
Github user KanakaKumar commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2792#discussion_r222691023

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/row/CarbonRow.java ---
@@ -18,8 +18,11 @@
 package org.apache.carbondata.core.datastore.row;

 import java.io.Serializable;
+import java.math.BigDecimal;
--- End diff --

CarbonRow has various fields such as data, rawData, rangeID, etc. It does not seem to be intended as an end-user API. I think we can add a simple Row class for SDK scope.
--- |
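Editor's sketch of what a simple SDK-scoped row might look like: it holds only the value array and exposes typed getters, with none of the internal fields mentioned above. `SdkRowSketch` is a hypothetical name and shape chosen for illustration; the getter bodies mirror the casts shown in the PR diff, and this is not the class that was eventually merged.

```java
import java.math.BigDecimal;

// Hypothetical illustration only; not a CarbonData class.
final class SdkRowSketch {
  private final Object[] data;

  SdkRowSketch(Object[] data) {
    // Defensive copy so later changes to the caller's array do not leak in.
    this.data = data.clone();
  }

  int getLength() {
    return data.length;
  }

  int getInt(int ordinal) {
    return (Integer) data[ordinal];
  }

  long getLong(int ordinal) {
    return (Long) data[ordinal];
  }

  double getDouble(int ordinal) {
    return (Double) data[ordinal];
  }

  String getString(int ordinal) {
    return (String) data[ordinal];
  }

  // Decimal is handed over as a string, as in the PR's getDecimal.
  String getDecimal(int ordinal) {
    return ((BigDecimal) data[ordinal]).toString();
  }

  public static void main(String[] args) {
    SdkRowSketch row = new SdkRowSketch(new Object[] {7, 8L, 1.5, "robot", new BigDecimal("12.50")});
    System.out.println(row.getInt(0) + " " + row.getString(3) + " " + row.getDecimal(4));
  }
}
```

Keeping this type separate from the internal row class means the base read path stays unchanged, which is the concern raised in the review.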
Github user KanakaKumar commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2792#discussion_r222691292

--- Diff: store/CSDK/CarbonReader.cpp ---
@@ -89,10 +89,18 @@ jboolean CarbonReader::hasNext() {
     return hasNext;
 }

+jobject CarbonReader::readNextCarbonRow() {
+    jclass carbonReader = jniEnv->GetObjectClass(carbonReaderObject);
+    jmethodID readNextCarbonRowID = jniEnv->GetMethodID(carbonReader, "readNextCarbonRow",
+        "()Lorg/apache/carbondata/core/datastore/row/CarbonRow;");
+    jobject carbonRow = (jobject) jniEnv->CallObjectMethod(carbonReaderObject, readNextCarbonRowID);
+    return carbonRow;
+}
+
 jobjectArray CarbonReader::readNextRow() {
     jclass carbonReader = jniEnv->GetObjectClass(carbonReaderObject);
-    jmethodID readNextRow2ID = jniEnv->GetMethodID(carbonReader, "readNextStringRow", "()[Ljava/lang/Object;");
-    jobjectArray row = (jobjectArray) jniEnv->CallObjectMethod(carbonReaderObject, readNextRow2ID);
+    jmethodID readNextStringRowID = jniEnv->GetMethodID(carbonReader, "readNextStringRow", "()[Ljava/lang/Object;");
--- End diff --

We can remove "readNextStringRow" and add a utility method in JNI to achieve the same.
--- |
Github user xubo245 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2792#discussion_r223238343

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/row/CarbonRow.java ---
@@ -57,6 +74,154 @@ public String getString(int ordinal) {
     return (String) data[ordinal];
   }

+  /**
+   * get short type data by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public short getShort(int ordinal) {
+    return (short) data[ordinal];
+  }
+
+  /**
+   * get int data type data by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public int getInt(int ordinal) {
+    return (Integer) data[ordinal];
+  }
+
+  /**
+   * get long data type data by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public long getLong(int ordinal) {
+    return (long) data[ordinal];
+  }
+
+  /**
+   * get array data type data by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public Object[] getArray(int ordinal) {
+    return (Object[]) data[ordinal];
+  }
+
+  /**
+   * get double data type data by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public double getDouble(int ordinal) {
+    return (double) data[ordinal];
+  }
+
+  /**
+   * get boolean data type data by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public boolean getBoolean(int ordinal) {
+    return (boolean) data[ordinal];
+  }
+
+  /**
+   * get byte data type data by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public Byte getByte(int ordinal) {
+    return (Byte) data[ordinal];
+  }
+
+  /**
+   * get float data type data by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public float getFloat(int ordinal) {
+    return (float) data[ordinal];
+  }
+
+  /**
+   * get varchar data type data by ordinal
+   * This is for CSDK
+   * JNI don't support varchar, so carbon convert decimal to string
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public String getVarchar(int ordinal) {
+    return (String) data[ordinal];
+  }
+
+  /**
+   * get decimal data type data by ordinal
+   * This is for CSDK
+   * JNI don't support Decimal, so carbon convert decimal to string
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public String getDecimal(int ordinal) {
+    return ((BigDecimal) data[ordinal]).toString();
+  }
+
+  /**
+   * get data type by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public DataType getDataType(int ordinal) {
+    return dataTypes[ordinal];
+  }
+
+  /**
+   * get data type name by ordinal
+   *
+   * @param ordinal the data index of carbonRow
+   * @return
+   */
+  public String getDataTypeName(int ordinal) {
+    return dataTypes[ordinal].getName();
+  }
+
+  /**
+   * get element type name by ordinal
+   * child schema data type name
+   * for example: return STRING if it's Array<String> in java
+   *
+   * @param ordinal the data index of carbonRow
+   * @return element type name
+   */
+  public String getElementTypeName(int ordinal) {
--- End diff --

ok, done
--- |