GitHub user xubo245 opened a pull request:
https://github.com/apache/carbondata/pull/2691

[CARBONDATA-2912] Support CSV table load csv data with spark2.2

In branch-1.3, a CSV table can't load CSV data with Spark 2.2. Carbon needs to upgrade the commons-lang3 version.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done. Please provide details on:
  - Whether new unit test cases have been added, or why no new tests are required?
  - How is it tested? Please attach the test report.
  - Is it a performance-related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xubo245/carbondata CARBONDATA-2912_twoInsert1.3.2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2691.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #2691

----

commit c055c8f33123bfb6e1103456bea23a0ff8c944ca
Author: ravipesala <ravi.pesala@...>
Date: 2018-02-03T20:31:00Z

[maven-release-plugin] prepare release apache-carbondata-1.3.0-rc2

commit 607b4cef646b2b9a3c2a8fc687dc40342165979a
Author: ravipesala <ravi.pesala@...>
Date: 2018-02-03T20:31:53Z

[maven-release-plugin] prepare for next development iteration

commit 449668ad9cda869b14f31dcc2c6df6454701cddc
Author: dhatchayani <dhatcha.official@...>
Date: 2018-02-05T10:51:09Z

[CARBONDATA-2131] Alter table adding long datatype is failing but create table with long type is successful, in Spark 2.1

Modified code to make the "create table" supported data types and the "alter add columns" supported data types consistent.

This closes #1932

commit a3b97f38412cf96ee041b6ebfbd7c39af54e391d
Author: kumarvishal <kumarvishal.1802@...>
Date: 2018-02-05T09:47:02Z

[CARBONDATA-2142] Fixed Pre-Aggregate datamap creation issue

Fixed the issue where reverting changes fails when pre-aggregate datamap creation fails. Removed the look-up while creating the pre-aggregate datamap. Removed unused code.

This closes #1943

commit 2c5ecfbfe5ce3357d041207cad8edcf587e4115f
Author: akashrn5 <akashnilugal@...>
Date: 2018-02-07T13:14:33Z

[CARBONDATA-2119] Deserialization issue for carbonloadmodel

Problem: The load model was not getting de-serialized in the executor, due to which 2 different carbon table objects were being created.
Solution: Reconstruct carbonTable from tableInfo if not already created.

This closes #1947

commit 8b105a1e1f6e7e7e3b0bc13d44c1bf93fd821e31
Author: m00258959 <manish.gupta@...>
Date: 2018-02-07T06:37:33Z

[CARBONDATA-2143] Fixed query memory leak issue for task failure during initialization of record reader

Problem: Whenever a query is executed, the record reader is initialized in the internalCompute method of the CarbonScanRdd class. A task completion listener is attached to each task after initialization of the record reader. During record reader initialization, queryResultIterator is initialized and one blocklet is processed. The blocklet processed will use the available unsafe memory. Say there are 100 columns and 80 columns get space, but there is no space left for the remaining columns to be stored in unsafe memory. This will result in a memory exception, and record reader initialization will fail, leading to query failure. In this case the unsafe memory allocated for the 80 columns will not be freed and will remain occupied as long as the JVM process persists.
Impact: It is a memory leak in the system and can lead to failures for queries executed after one query fails due to the above reason.
Solution: Attach the task completion listener before record reader initialization, so that if the query fails at the very first instance after using unsafe memory, that memory is still cleared.

This closes #1948

commit 9f73f0e60611c52278d2d475a89d42adebf32f60
Author: m00258959 <manish.gupta@...>
Date: 2018-02-05T11:40:18Z

[CARBONDATA-2134] Prevent implicit column filter list from getting serialized while submitting task to executor

Problem: In the current store, blocklet pruning happens in the driver and no further pruning takes place on the executor side, but the implicit column filter list is still being sent to the executor. As the size of the list grows, the cost of serializing and deserializing it increases, which can impact query performance.
Solution: Remove the list from the filter expression before submitting the task to the executor.

This closes #1935

commit 1137c285f55dfdc0de24bdebf81d78187df93f8a
Author: kunal642 <kunalkapoor642@...>
Date: 2018-02-08T06:20:23Z

[CARBONDATA-1763] Drop table if an exception is thrown during creation

A pre-aggregate table is not getting dropped when creation fails, because exceptions from undo metadata are not handled. If the pre-aggregate table is not registered with the main table (main table updation fails), then it is not dropped from the metastore.

This closes #1951

commit 6e435de5e04ace63fe5b105e2f180ef0932d80d3
Author: rahulforallp <rahul.kumar@...>
Date: 2018-02-06T13:11:35Z

[CARBONDATA-2137] Delete query performance improved

The following configuration was used: SPARK_EXECUTOR_MEMORY: 200G, SPARK_DRIVER_MEMORY: 20G, SPARK_EXECUTOR_CORES: 32, SPARK_EXECUTOR_INSTANCES: 3. Earlier it was taking 20 minutes; now it takes approximately 5 minutes.

This closes #1937

commit bc3f825107517ad1e39a385c488beadd6022ab8e
Author: akashrn5 <akashnilugal@...>
Date: 2018-02-08T17:40:43Z

[CARBONDATA-2150] Unwanted updatetable status files are being generated for delete operations where no records are deleted

Problem: Unwanted updatetable status files are being generated for delete operations where no records are deleted.
Analysis: When the filter value for the delete operation is less than the maximum value in that column, getSplits() will return the block, and hence the delete logic was creating an update table status file even though no delete operation was done. Also added spark context to the create database event.

This closes #1957

commit 15cc7fa97722d055ad5627b3a915ee6d2b6817d6
Author: akashrn5 <akashnilugal@...>
Date: 2018-02-14T13:37:15Z

[CARBONDATA-2182] Added one more param called extraParams in SessionParams and added carbonSessionInfo to CarbonEnvInitPreEvent

Add one more param called extraParams in SessionParams for session-level operations, and pass carbonSessionInfo to the event so that the user can save session-level information in carbonSessionInfo.

This closes #1978

commit 27634deee82d7a1560e75f8dfc09333eb8df51db
Author: anubhav100 <anubhav.tarar@...>
Date: 2018-02-06T08:03:39Z

[CARBONDATA-2133] Fixed exception displayed after performing select query on newly added Boolean type

Problem: In RestructureUtil and RestructureBasedVectorResultCollector, the case for the boolean data type was missing when getting the default value of a measure type, and in DataTypeUtil the case for storing the boolean default value in bytes was missing.
Solution: Add the Boolean data type case.

This closes #1934

commit aff3b39efd772a881590432816369a05d0cb5855
Author: akashrn5 <akashnilugal@...>
Date: 2018-02-15T13:30:26Z

[CARBONDATA-2103] Optimize show tables for filtering datamaps

Problem: Show tables was taking more time, as the lookup happened twice to filter out the datamaps.
Solution: Add a hive table property which is true for all tables and false for datamaps (such as pre-aggregate tables), and have show tables filter these tables out based on the property.

This closes #1980

commit 7beef112b59c9ccfe14baca87ae841cfe77e4dce
Author: akashrn5 <akashnilugal@...>
Date: 2018-02-14T10:15:04Z

[CARBONDATA-2183] Fix compaction when segment is deleted during compaction, and remove unnecessary parameters in functions

Problem: When compaction has started and the job is running, and in parallel a segment involved in the compaction is deleted using DeleteSegmentByID, the compaction succeeds.
Solution: In that case the compaction should be aborted and fail, and a proper error message should be thrown to the user. This PR also removes unnecessary parameters in functions.

This closes #1979

commit 39ac94e462e6571414dee8f58c174e44a79f8ad4
Author: kunal642 <kunalkapoor642@...>
Date: 2018-02-12T19:23:31Z

[CARBONDATA-2142] [CARBONDATA-1763] Fixed issues while creating concurrent datamaps

Analysis:
1. GenerateTableSchemaString in CarbonMetastore did not have any specific implementation for hive metastore, due to which carbon tables were being cached in MetaData. As there is no way to refresh a table in hive metastore, this is wrong: all queries should get the latest carbon table from the metastore and not from the cache.
2. If updating the main table status fails, the revertMainTableChanges method is called to revert the changes. The revert logic was wrong, which led to the wrong entry getting deleted from the schema.
3. Moved the force-remove logic before taking locks, as deletion from the metastore should happen even if the lock is not present, because the table is in a stale state (the entry is not in the parent but is available in the metastore).

This closes #1975

commit c2785b352f7b7cb2dd524811b0696fb18c12d5b0
Author: BJangir <babulaljangir111@...>
Date: 2018-02-11T19:32:30Z

[CARBONDATA-2161] Update mergeTo column for compacted segment of streaming table

This closes #1971

commit f8a62a9bd8ba39cd6bc247c587a7a3e1afd99254
Author: QiangCai <qiangcai@...>
Date: 2018-02-11T08:06:01Z

[CARBONDATA-2151][Streaming] Fix filter query issue on streaming table

1. Fix filter query issue for timestamp, date, decimal.
2. Add more test cases. Data types: int, string, float, double, decimal, timestamp, date, complex. Operations: =, <>, >=, >, <, <=, in, like, between, is null, is not null.

This closes #1969

commit 4bbbd4b1df444163cfb72cf74a05c1a9d09e1200
Author: BJangir <babulaljangir111@...>
Date: 2018-02-19T17:01:00Z

[CARBONDATA-2185] Add InputMetrics for Streaming Reader

This closes #1985

commit 6f9016db52dd3f9c31ba20e585debfc283e2594e
Author: Zhang Zhichao <441586683@...>
Date: 2018-02-09T09:32:54Z

[CARBONDATA-2149] Fix complex type data displaying error when using DataFrame to write complex type data

The default values of 'complex_delimiter_level_1' and 'complex_delimiter_level_2' are wrong; they must be '$' and ':', not '$' and '\:'. The escape character '\' needs to be added only when using the delimiters in ArrayParserImpl or StructParserImpl.

This closes #1962

commit b0a2fabcc8584dfba24ad0ea135948f5365a7335
Author: QiangCai <qiangcai@...>
Date: 2018-02-25T10:53:41Z

[CARBONDATA-2200] Fix bug of LIKE operation on streaming table

A LIKE operation is converted to a StartsWith / EndsWith / Contains expression, and Carbon uses RowLevelFilterExecuterImpl to evaluate this expression, so the streaming table should also implement RowLevelFilterExecuterImpl.

This closes #1996

commit e363dd1a68e2138591a930055dd1701a1245825f
Author: rahulforallp <rahul.kumar@...>
Date: 2018-02-25T09:55:26Z

[CARBONDATA-2201] NPE fixed while triggering the LoadTablePreExecutionEvent before streaming

While triggering the LoadTablePreExecutionEvent we require the options provided by the user as well as the final options. In the case of streaming both are the same, and passing null may cause an NPE.

This closes #1997

commit 0f210c86ca3ee9f0fa845cdeaef418ed9253b6f8
Author: Zhang Zhichao <441586683@...>
Date: 2018-02-04T04:54:24Z

[MINOR] Remove dependency on Java 1.8

This closes #1928

commit 758d03e783e324f70b6599be7feb1951b1034f51
Author: ravipesala <ravi.pesala@...>
Date: 2018-02-09T04:07:02Z

[CARBONDATA-2168] Support global sort for standard hive partitioning

This closes #1972

commit 1997ca235f90b5746262c9654b685b9b6bd3f16a
Author: ravipesala <ravi.pesala@...>
Date: 2018-02-14T19:01:56Z

[CARBONDATA-2187][PARTITION] Partition restructure for new folder structure and supporting partition location feature

This closes #1984

commit b51d8186a82818672067dfd0387af6ff505f940c
Author: Jatin <jatin.demla@...>
Date: 2018-02-23T11:26:17Z

[CARBONDATA-2199] Fixed dimension column getting wrong block datatype after restructure

Problem: Changing the datatype of a measure that is in sort_columns calls for a restructure, after which the datatype is changed back to the actual datatype; accessing the data with the changed datatype then gives an incorrect-length exception.
Solution: Store the datatype in DimensionInfo while restructuring and use that same datatype to get the block data type.

This closes #1993

commit 7726b4f9b379b0eec4b9fff6571415f47fa55587
Author: Jatin <jatin.demla@...>
Date: 2018-02-27T10:43:40Z

[CARBONDATA-2207] Fix testcases after using hive metastore

CarbonTable was getting null in the case of hive metastore, so fetch it from the metastore instead of carbon.

This closes #2005

commit b360f9084f873bc096d7fabfde20730fbc752350
Author: chenliang613 <chenliang613@...>
Date: 2018-02-08T17:32:38Z

[HOTFIX] Add partition usage code

This closes #1956

commit b9a6b68658fd0f7f408102374b3ef31dcfe44cea
Author: akashrn5 <akashnilugal@...>
Date: 2018-02-28T11:58:43Z

[CARBONDATA-2217] Fix drop partition for non-existing partition, and set FactTimeStamp during compaction for partition table

Problem:
1) When drop partition is fired for a column which does not exist, it throws a null pointer exception.
2) select * is not working when the clean files operation is fired after a second level of compaction; it sometimes throws an exception.
3) A new segment is getting created for all the segments if any one partition is dropped.
Solution:
1) Add a null check for the case where the column does not exist.
2) Give a different timestamp to fact files during compaction, to avoid deletion of files during clean files.
3) Write a new segment file only for the partition which is dropped, not for all partitions.
4) This PR also contains a fix for creating a pre-aggregate table with the same name as one already created in another database.

This closes #2017

commit 660190fb544e338acd131e7cc30de171e7600df6
Author: akashrn5 <akashnilugal@...>
Date: 2018-02-28T12:08:50Z

[CARBONDATA-2103] Make show datamaps configurable in show tables command

Make the display of datamaps in show tables configurable: a new carbon property called carbon.query.show.datamaps is added. By default it is true, so show tables lists all tables, including main tables and datamaps. To filter datamaps out of show tables, set it to false.

This closes #2015

commit 092b5d58a50498a0a66bf6166907965612eb1fc5
Author: ravipesala <ravi.pesala@...>
Date: 2018-03-01T06:34:53Z

[CARBONDATA-2219] Added validation for external partition location to use same schema

This closes #2018

----
---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2691 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8288/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2691 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/219/ ---
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/2691 @xubo245 What does 'CSV table' mean in the title? ---
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/2691

> Carbon need upgrade commons-lang3 vision

Typo: "vision" should be "version". ---
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/2691 @xuchuanyin A CSV table is one created with "create table ... using csv options ...". In branch-1.3, a CSV table can't load CSV data with Spark 2.2 because, with a low commons-lang version, the default timestampFormat of yyyy-MM-dd'T'HH:mm:ss.SSSXXX is an illegal argument that cannot be recognized after upgrading Spark from 2.1 to 2.2. It needs to be set explicitly when you are writing the dataframe out.
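For illustration, a minimal sketch of the scenario under discussion, assuming a Spark 2.x session; the table name, path, and timestamp pattern below are hypothetical, not taken from the PR:

```scala
import org.apache.spark.sql.SparkSession

object CsvTableLoadExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("CsvTableLoadExample")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // A "CSV table" is a Spark datasource table backed by CSV files,
    // not a carbon table.
    spark.sql(
      """CREATE TABLE csv_table (id INT, ts TIMESTAMP)
        |USING csv
        |OPTIONS (path '/tmp/csv_table', header 'true')""".stripMargin)

    // Workaround: set timestampFormat explicitly when writing the
    // dataframe out, instead of relying on the default
    // yyyy-MM-dd'T'HH:mm:ss.SSSXXX pattern, which an old commons-lang
    // based parser cannot handle.
    val df = Seq((1, java.sql.Timestamp.valueOf("2018-09-05 10:00:00")))
      .toDF("id", "ts")
    df.write
      .option("header", "true")
      .option("timestampFormat", "yyyy-MM-dd HH:mm:ss")
      .mode("append")
      .csv("/tmp/csv_table")

    // Refresh so the table picks up the newly written files.
    spark.sql("REFRESH TABLE csv_table")
    spark.sql("SELECT * FROM csv_table").show()
    spark.stop()
  }
}
```
---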
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/2691 retest this please ---
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/2691 Oh, I didn't know this syntax before. Is a CSV table a carbon table or a spark table? ---
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/2691 Spark; it supports using parquet too.
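The same datasource syntax works for other formats as well; for example (a hypothetical snippet, reusing the spark session from the sketch above):

```scala
// A parquet-backed Spark table uses the same "USING <format>" clause.
spark.sql(
  """CREATE TABLE parquet_table (id INT, name STRING)
    |USING parquet
    |OPTIONS (path '/tmp/parquet_table')""".stripMargin)
```
---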
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2691 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8313/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2691 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/243/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2691 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/3/ ---
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/2691 retest this please ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2691 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8340/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2691 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/102/ ---
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/2691 retest this please ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2691 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/270/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2691 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8343/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2691 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/105/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2691 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/273/ ---