[GitHub] [carbondata] KanakaKumar commented on a change in pull request #3183: [CARBONDATA-3349] Show sort_columns for each segment


GitBox
KanakaKumar commented on a change in pull request #3183: [CARBONDATA-3349] Show sort_columns for each segment
URL: https://github.com/apache/carbondata/pull/3183#discussion_r281179465
 
 

 ##########
 File path: integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
 ##########
 @@ -137,14 +159,65 @@ object CarbonStore {
               mergedTo,
               load.getFileFormat.toString,
               Strings.formatSize(dataSize.toFloat),
-              Strings.formatSize(indexSize.toFloat))
+              Strings.formatSize(indexSize.toFloat),
+              isSorted,
+              sortColumns)
           }
         }.toSeq
     } else {
       Seq.empty
     }
   }
 
+  private def getSortColumnsOfSegment(
+      load: LoadMetadataDetails,
+      readCommitScope: ReadCommittedScope,
+      tableDataMap: TableDataMap,
+      hadoopConf: Configuration
+  ): (String, String) = {
+    // isSorted has 3 options: true, false, ""(for legacy store, before version 1.5.1)
+    var isSorted = ""
+    // when isSorted is true, need show sort_columns
+    var sortColumns = ""
+    if (load.getFileFormat == FileFormat.ROW_V1) {
+      isSorted = "false"
+    } else if (tableDataMap != null && load.getVisibility.equalsIgnoreCase("true")) {
+      val indexHeader = SegmentIndexFileStore
+        .getIndexHeaderOfSegment(load,
 
 Review comment:
   If a customer has a few thousand segments, reading the header files of all of them will take a very long time.
   Some alternative options to consider:
   1) Launch a job to read the headers and collect the data.
   2) Enhance the segment status to hold the is-sorted flag and the sort column names.
   3) Provide a parameter so that sort_columns is shown only when the user asks for it.
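
A minimal sketch of option 3, to make the idea concrete. It is not CarbonData code: `SegmentInfo`, `ShowSegmentsSketch`, `buildRows`, `includeSortColumns`, and `readSortInfo` are all hypothetical names, and `readSortInfo` only stands in for the expensive index-header read. The point is that the header read is skipped entirely unless the caller opts in.

```scala
// Hypothetical stand-in for a segment entry; not CarbonData's LoadMetadataDetails.
final case class SegmentInfo(id: String, visible: Boolean)

object ShowSegmentsSketch {

  // Builds one output row per segment.
  // `readSortInfo` is an assumed callback representing the costly header read;
  // it is invoked only when `includeSortColumns` is true and the segment is visible.
  def buildRows(
      segments: Seq[SegmentInfo],
      includeSortColumns: Boolean,
      readSortInfo: SegmentInfo => (String, String)
  ): Seq[Seq[String]] = {
    segments.map { seg =>
      val base = Seq(seg.id, seg.visible.toString)
      val sortInfo: Seq[String] =
        if (!includeSortColumns) {
          // Default path: no header files are touched.
          Seq.empty
        } else if (seg.visible) {
          val (isSorted, sortColumns) = readSortInfo(seg)
          Seq(isSorted, sortColumns)
        } else {
          // Keep the column count consistent for segments that are not read.
          Seq("", "")
        }
      base ++ sortInfo
    }
  }
}
```

With this shape the default listing costs the same as before the change, and the per-segment read is paid only by users who explicitly request the sort information. How the flag would be exposed (command syntax or a property) is left open here.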

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services