
Es external index #2 (Open)

LinhongLiu wants to merge 13 commits into master from es-external-index

Conversation

@LinhongLiu (Owner)

What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

@@ -0,0 +1,94 @@
package org.apache.spark.sql.execution.datasources.parquet;
Collaborator:

Add Spark LICENSE?


private val client = getEsClient

private val mappings = {
Collaborator:

`number_of_shards` and `number_of_replicas` should be configurable.
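One way to address this review comment is to read the shard and replica counts from table options with sensible defaults rather than hard-coding them. This is a minimal sketch; the option names (`es.number_of_shards`, `es.number_of_replicas`) and the default values are assumptions, not part of this PR:

```scala
// Hypothetical helper: resolve index settings from user-supplied
// options, falling back to assumed defaults when absent.
object EsIndexSettings {
  val DefaultShards = 5    // assumed default, matches classic ES defaults
  val DefaultReplicas = 1

  def resolve(options: Map[String, String]): (Int, Int) = {
    val shards = options.get("es.number_of_shards").map(_.toInt).getOrElse(DefaultShards)
    val replicas = options.get("es.number_of_replicas").map(_.toInt).getOrElse(DefaultReplicas)
    (shards, replicas)
  }
}
```

The resolved pair would then be passed into the `Settings.builder` call when the index mapping is created.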

bulkRequest.add(indexRequest)
}
bulkRequest.execute().actionGet()
client.admin().indices().refresh(new RefreshRequest(tableName)).actionGet()
Collaborator:

We need to confirm whether the refresh operation is necessary to make the indexes available.

Owner Author:

Yes, it's necessary.
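The reason the refresh is necessary: Elasticsearch search is near-real-time, so bulk-indexed documents are not visible to queries until a refresh happens. A toy model (not the ES client API) that illustrates this visibility gap:

```scala
// Toy model of near-real-time visibility: bulk-added documents sit
// in an in-flight buffer and only become searchable after refresh(),
// mirroring why the explicit RefreshRequest above is needed.
class ToyIndex {
  private var pending = Vector.empty[String]
  private var searchable = Vector.empty[String]

  def bulkAdd(docs: Seq[String]): Unit = pending = pending ++ docs
  def refresh(): Unit = { searchable = searchable ++ pending; pending = Vector.empty }
  def search(term: String): Seq[String] = searchable.filter(_.contains(term))
}
```

Without the `refresh()` call a query issued immediately after the bulk insert would see nothing, which is exactly the situation the `RefreshRequest` in the diff avoids.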

*/
override protected def getEsClient: TransportClient = {
val settings = Settings.builder
.put("cluster.name", "elasticsearch")
@LuciferYang (Collaborator), Aug 13, 2018:

`cluster.name` should be configurable; not every cluster is named `elasticsearch`.

Owner Author:

Let's do this after all the configs are figured out, for example: host, port.
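Once those configs are figured out, the connection settings could be bundled together and resolved from options in one place. A sketch under stated assumptions; the key names (`es.cluster.name`, `es.host`, `es.port`) and defaults are hypothetical, not decided in this PR:

```scala
// Hypothetical connection config resolved from user options, with
// the current hard-coded values kept only as fallback defaults.
case class EsClientConf(clusterName: String, host: String, port: Int)

object EsClientConf {
  def fromOptions(options: Map[String, String]): EsClientConf =
    EsClientConf(
      clusterName = options.getOrElse("es.cluster.name", "elasticsearch"),
      host = options.getOrElse("es.host", "localhost"),
      port = options.getOrElse("es.port", "9300").toInt)
}
```

`getEsClient` would then build its `Settings` from an `EsClientConf` instead of literals.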

Collaborator:

ok

Guo Chenzhao and others added 12 commits August 14, 2018 12:40
## What changes were proposed in this pull request?

First subtask of the steps outlined in #757.

- Refactored the class hierarchy of `FiberSensor`
- Combined the two `FiberxxSensor` classes into one
- Removed unused code

## How was this patch tested?

Existing tests.
* add profile for spark 2.1 and 2.2

* Move spark code to spark2.1

* add 2.2 SqlBase.g4

* add 2.2 ColumnBatch.java

* add 2.2 DAGScheduler.scala

* Add 2.2 DataSource.scala

* add 2.2 DataSourceStrategy

* revert changes in FileFormatWriter when moving

* add 2.2 FileFormatWriter.scala

* add 2.2 FileSourceStrategy.scala

* revert changes in InsertIntoHadoopFsRelationCommand when moving code

* add 2.2 OutputWriter.scala

* fix FileFormatWriter

* add 2.2 DataSourceScanExec

* update 2.2 SparkSqlParser

* update HiveThriftServer2.scala

* update 2.2 SparkSQLCLIDriver

* update 2.2 ApiRootResource.scala

* update 2.2 BlockId

* update 2.2 BitSet

* fix OapConf

* update travis to test 2.2

* revert changes in 2.1 OutputWriter

* Fix CaseInsensitiveMap compatible issue

* fix InputFileNameHolder compatible issue

* fix compatible issue in indexPlans

* Fix ParquetFilters inaccessible issue

* Fix OapIndexInfoStatusSerDe compatible issue

* Fix FileIndex compatible issue

* Add LogicalPlanAdapter

* Add AggregateFunctionAdapter

* Fix listFiles

* Add FileSourceScanExecAdapter

* Fix OapConf

* Add RpcEndpointRefAdapter

* Fix ApiRootResource

* Add OapSessionStateBuilder and fix OapEnv and OapSession for 2.2

* Fix OutputWriter

* Fix more

* sqlConf -> sqlContext.conf

* Fix listFiles again

* Fix OapStrategies

* change strategies order

* update java version and fix SparkConf issue in test

* Fix task write result in FileFormatWriter

* add PARQUET_INT64_AS_TIMESTAMP_MILLIS as LuciferYang suggests

* import ColumnarBatchScan

* update ColumnarBatchScan

* fix unit tests

* update travis

* update travis to test both 2.1 and 2.2

* remove FileIndex.scala

* update OapEnv and readme

* tweak
## What changes were proposed in this pull request?
Subtask 1 of #825 (#829): import `VectorizedRleValuesReader` and `VectorizedPlainValuesReader` from the Spark code for both 2.1 and 2.2; this will change some access modifiers of these classes.

## How was this patch tested?

mvn test pass
* update pom.xml

* Spark 2.3 support - import spark 2.2 source code

* Revert "update pom.xml"

This reverts commit 24e4b0c.

* quit Revert "Revert "update pom.xml""

This reverts commit 8b5914b.

* Revert "Spark 2.3 support - import spark 2.2 source code"

This reverts commit 7d70529.

* update pom.xml with exclude
4 participants