
Es external index #2 (Open)

LinhongLiu wants to merge 13 commits into master from es-external-index

Conversation

@LinhongLiu (Owner)

What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

@@ -0,0 +1,94 @@
package org.apache.spark.sql.execution.datasources.parquet;
Collaborator:

Add Spark LICENSE?


private val client = getEsClient

private val mappings = {
Collaborator:

`number_of_shards` and `number_of_replicas` should be configurable.
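One way to address this review comment is to read the shard and replica counts from table options with sensible defaults rather than hard-coding them. This is a minimal sketch; the option names (`es.number_of_shards`, `es.number_of_replicas`) and the default values are assumptions, not part of this PR:

```scala
// Hypothetical helper: resolve index settings from user-supplied
// options, falling back to assumed defaults when absent.
object EsIndexSettings {
  val DefaultShards = 5    // assumed default, matches classic ES defaults
  val DefaultReplicas = 1

  def resolve(options: Map[String, String]): (Int, Int) = {
    val shards = options.get("es.number_of_shards").map(_.toInt).getOrElse(DefaultShards)
    val replicas = options.get("es.number_of_replicas").map(_.toInt).getOrElse(DefaultReplicas)
    (shards, replicas)
  }
}
```

The resolved pair would then be passed into the `Settings.builder` call when the index mapping is created.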

bulkRequest.add(indexRequest)
}
bulkRequest.execute().actionGet()
client.admin().indices().refresh(new RefreshRequest(tableName)).actionGet()
Collaborator:

We need to confirm whether the refresh operation is necessary to make the indexes available.

Owner Author:

Yes, it's necessary.
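The reason the refresh is necessary: Elasticsearch search is near-real-time, so bulk-indexed documents are not visible to queries until a refresh happens. A toy model (not the ES client API) that illustrates this visibility gap:

```scala
// Toy model of near-real-time visibility: bulk-added documents sit
// in an in-flight buffer and only become searchable after refresh(),
// mirroring why the explicit RefreshRequest above is needed.
class ToyIndex {
  private var pending = Vector.empty[String]
  private var searchable = Vector.empty[String]

  def bulkAdd(docs: Seq[String]): Unit = pending = pending ++ docs
  def refresh(): Unit = { searchable = searchable ++ pending; pending = Vector.empty }
  def search(term: String): Seq[String] = searchable.filter(_.contains(term))
}
```

Without the `refresh()` call a query issued immediately after the bulk insert would see nothing, which is exactly the situation the `RefreshRequest` in the diff avoids.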

*/
override protected def getEsClient: TransportClient = {
val settings = Settings.builder
.put("cluster.name", "elasticsearch")
@LuciferYang (Collaborator), Aug 13, 2018:

`cluster.name` should be configurable; not every cluster is named `elasticsearch`.

Owner Author:

Let's do this after all the configs are figured out, for example: host, port.
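Once those configs are figured out, the connection settings could be bundled together and resolved from options in one place. A sketch under stated assumptions; the key names (`es.cluster.name`, `es.host`, `es.port`) and defaults are hypothetical, not decided in this PR:

```scala
// Hypothetical connection config resolved from user options, with
// the current hard-coded values kept only as fallback defaults.
case class EsClientConf(clusterName: String, host: String, port: Int)

object EsClientConf {
  def fromOptions(options: Map[String, String]): EsClientConf =
    EsClientConf(
      clusterName = options.getOrElse("es.cluster.name", "elasticsearch"),
      host = options.getOrElse("es.host", "localhost"),
      port = options.getOrElse("es.port", "9300").toInt)
}
```

`getEsClient` would then build its `Settings` from an `EsClientConf` instead of literals.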

Collaborator:

ok

Guo Chenzhao and others added 12 commits August 14, 2018 12:40
## What changes were proposed in this pull request?

First subtask of the steps outlined in #757.

- Refactored the class hierarchy of `FiberSensor`
- Combined the two `FiberxxSensor` classes into one
- Removed unused code

## How was this patch tested?

Existing tests.
* add profile for spark 2.1 and 2.2

* Move spark code to spark2.1

* add 2.2 SqlBase.g4

* add 2.2 ColumnBatch.java

* add 2.2 DAGScheduler.scala

* Add 2.2 DataSource.scala

* add 2.2 DataSourceStrategy

* revert changes in FileFormatWriter when moving

* add 2.2 FileFormatWriter.scala

* add 2.2 FileSourceStrategy.scala

* revert changes in InsertIntoHadoopFsRelationCommand when moving code

* add 2.2 OutputWriter.scala

* fix FileFormatWriter

* add 2.2 DataSourceScanExec

* update 2.2 SparkSqlParser

* update HiveThriftServer2.scala

* update 2.2 SparkSQLCLIDriver

* update 2.2 ApiRootResource.scala

* update 2.2 BlockId

* update 2.2 BitSet

* fix OapConf

* update travis to test 2.2

* revert changes in 2.1 OutputWriter

* Fix CaseInsensitiveMap compatible issue

* fix InputFileNameHolder compatible issue

* fix compatible issue in indexPlans

* Fix ParquetFilters inaccessible issue

* Fix OapIndexInfoStatusSerDe compatible issue

* Fix FileIndex compatible issue

* Add LogicalPlanAdapter

* Add AggregateFunctionAdapter

* Fix listFiles

* Add FileSourceScanExecAdapter

* Fix OapConf

* Add RpcEndpointRefAdapter

* Fix ApiRootResource

* Add OapSessionStateBuilder and fix OapEnv and OapSession for 2.2

* Fix OutputWriter

* Fix more

* sqlConf -> sqlContext.conf

* Fix listFiles again

* Fix OapStrategies

* change strategies order

* update java version and fix SparkConf issue in test

* Fix task write result in FileFormatWriter

* add PARQUET_INT64_AS_TIMESTAMP_MILLIS as LuciferYang suggests

* import ColumnarBatchScan

* update ColumnarBatchScan

* fix unit tests

* update travis

* update travis to test both 2.1 and 2.2

* remove FileIndex.scala

* update OapEnv and readme

* tweak
## What changes were proposed in this pull request?
Subtask 1 of #825 (#829): import `VectorizedRleValuesReader` and `VectorizedPlainValuesReader` from the Spark code for both 2.1 and 2.2; this will change some access modifiers of these classes.

## How was this patch tested?

mvn test pass
* update pom.xml

* Spark 2.3 support - import spark 2.2 source code

* Revert "update pom.xml"

This reverts commit 24e4b0c.

* quit Revert "Revert "update pom.xml""

This reverts commit 8b5914b.

* Revert "Spark 2.3 support - import spark 2.2 source code"

This reverts commit 7d70529.

* update pom.xml with exclude
4 participants