Ingestion re-implement on updated Elastic.Ingest.Elasticsearch #2755
Open
…mappings
Replace manual channel orchestration with IncrementalSyncOrchestrator<T> and source-generated ElasticsearchTypeContext from Elastic.Mapping 0.4.0. Add field type attributes ([Keyword], [Text], [Object], etc.) directly on DocumentationDocument to drive the mapping source generator, replacing verbose manual JSON mappings.
- Update Elastic.Ingest.Elasticsearch 0.17.1 → 0.19.0, add Elastic.Mapping 0.4.0
- Add mapping attributes to DocumentationDocument and IndexedProduct
- Create DocumentationMappingConfig.cs with two Entity variants (lexical/semantic)
- Rewrite ElasticsearchMarkdownExporter to use orchestrator for dual-index mode
- Delete ElasticsearchIngestChannel.cs and ElasticsearchIngestChannel.Mapping.cs
- Remove unused ReindexAsync from ElasticsearchOperations
- Update SearchBootstrapFixture to use IngestChannel with semantic type context
Replaces `ElasticsearchOptions` with `DocumentationEndpoints` as the single source of truth for
Elasticsearch configuration across all API apps, MCP server, and integration tests.
- Adds `IndexName` property to `ElasticsearchEndpoint` with a field-backed getter defaulting to
`{IndexNamePrefix}-dev-latest`.
- Creates `ElasticsearchEndpointFactory` in `ServiceDefaults` to centralize user-secrets and
environment variable reading, eliminating the duplicated `72f50f33` secrets ID pattern.
- Registers `DocumentationEndpoints` as a singleton in `AddDocumentationServiceDefaults`.
- Updates `ElasticsearchClientAccessor` to accept `DocumentationEndpoints` instead of
`ElasticsearchOptions`, supporting both API key and basic authentication.
- Updates all gateway consumers (`NavigationSearchGateway`, `FullSearchGateway`,
`DocumentGateway`, `ElasticsearchAskAiMessageFeedbackGateway`) to use endpoint properties.
- Simplifies all three integration test files (`SearchRelevanceTests`,
`McpToolsIntegrationTestsBase`, `SearchBootstrapFixture`) to use `ElasticsearchEndpointFactory`
and `ElasticsearchTransportFactory`, removing manual config construction.
- Deletes `ElasticsearchOptions.cs` and removes `Microsoft.Extensions.Configuration.UserSecrets`
from the Search project.
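To illustrate what "supporting both API key and basic authentication" amounts to at the HTTP level, here is a minimal sketch. `EndpointCredentials` and `AuthHeaderSketch` are hypothetical names, not types from this PR, and the precedence (API key first) is an assumption; the actual accessor presumably delegates this to the Elasticsearch transport rather than building headers by hand.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for the credential fields of an endpoint; not the project's type.
public sealed record EndpointCredentials(string? ApiKey, string? Username, string? Password);

public static class AuthHeaderSketch
{
    // Returns the Authorization header value, or null when no credentials are configured.
    public static string? Build(EndpointCredentials credentials)
    {
        // Assumed precedence: an API key wins when both kinds of credentials are present.
        if (!string.IsNullOrEmpty(credentials.ApiKey))
            return $"ApiKey {credentials.ApiKey}";

        if (!string.IsNullOrEmpty(credentials.Username) && credentials.Password is not null)
        {
            // HTTP Basic: base64 of "username:password".
            var raw = Encoding.UTF8.GetBytes($"{credentials.Username}:{credentials.Password}");
            return $"Basic {Convert.ToBase64String(raw)}";
        }

        return null;
    }
}
```

Keeping the choice in one place means gateways only need the endpoint object and never inspect which kind of credential was supplied.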
Move mapping context (DocumentationMappingContext, LexicalConfig, SemanticConfig, DocumentationAnalysisFactory) from Elastic.Markdown to Elastic.Documentation so both indexing and search derive index names from the same source. Add ContentHash helper to avoid Elastic.Ingest.Elasticsearch dependency in Elastic.Documentation. Remove IndexName from ElasticsearchEndpoint, add Namespace to DocumentationEndpoints. ElasticsearchEndpointFactory resolves namespace from DOCUMENTATION_ELASTIC_INDEX env var (backward compat), DOTNET_ENVIRONMENT, ENVIRONMENT, or falls back to "dev". ElasticsearchClientAccessor derives SearchIndex and RulesetName from namespace instead of parsing the old IndexName string. Remove ExtractRulesetName and all hardcoded "semantic-docs-dev-latest" assignments from tests and config files.
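The commit above mentions a `ContentHash` helper but does not show it. As a rough sketch of what such a dependency-free helper could look like, the following hashes a document's fields with SHA-256. The class name, the separator, and the choice of SHA-256 are assumptions for illustration, not the PR's implementation.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ContentHashSketch
{
    // Produces a lowercase hex digest over the given parts.
    public static string Create(params string?[] parts)
    {
        // Join with a unit-separator character so ("ab", "c") and ("a", "bc") hash differently.
        // Null parts are treated as empty strings by string.Join.
        var joined = string.Join("\u001f", parts);
        var digest = SHA256.HashData(Encoding.UTF8.GetBytes(joined));
        return Convert.ToHexString(digest).ToLowerInvariant();
    }
}
```

A stable hash like this lets an indexer skip documents whose content has not changed since the previous run, which is the purpose of hash-based change detection.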
Enable IndexPatternUseBatchDate now that Elastic.Mapping supports it, and pass batchTimestamp to IngestChannelOptions in the lexical-only path so the channel uses the exporter's timestamp for index name computation.
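As an illustration of computing an index name from a shared batch timestamp, here is a small sketch. The `yyyy.MM.dd.HHmmss` format is inferred from example names such as `lexical-docs-edge-2025.10.23.120521` in this PR; the helper name and the use of UTC are assumptions, and the library performs this computation internally.

```csharp
using System;
using System.Globalization;

public static class BatchIndexNameSketch
{
    // Format inferred from the example suffix "2025.10.23.120521".
    private const string TimestampFormat = "yyyy.MM.dd.HHmmss";

    // Appends the batch timestamp (assumed UTC) to a write target to form a concrete index name.
    public static string ForBatch(string writeTarget, DateTimeOffset batchTimestamp) =>
        $"{writeTarget}-{batchTimestamp.UtcDateTime.ToString(TimestampFormat, CultureInfo.InvariantCulture)}";
}
```

Passing one timestamp from the exporter to every channel ensures that all indices written in the same run share the same suffix.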
…meter Simplify DocumentationTooling endpoint resolution by delegating to ElasticsearchEndpointFactory. Add missing skipOpenApi parameter to IsolatedIndexService.Index call.
The lexical-only code path manually reimplemented drain, delete-stale, refresh, and alias logic that the orchestrator handles automatically. Remove the flag end-to-end: CLI parameters, configuration, exporter branching, and CLI documentation.
Add .jina-embeddings-v5-text-small inference on 6 fields (title, abstract, ai_rag_optimized_summary, ai_questions, ai_use_cases, stripped_body) to enable hybrid sparse+dense retrieval. Rename InferenceId to ElserInferenceId for clarity.
Use source-generated IStaticMappingResolver delegates for auto-stamping BatchIndexDate and LastUpdated instead of manual assignment. Replace DocumentationAnalysisFactory.CreateContext with direct context customization via WithIndexName() and record-with expressions. Pass IndexSettings for default_pipeline conditionally at runtime.
…nment
Rename indexNamespace to buildType throughout the exporter pipeline so
callers pass the build type (assembler, isolated, codex) instead of the
environment name. Search services now hardcode "assembler" as the type
since they always target assembler indices.
ResolveNamespace renamed to ResolveEnvironment and updated to parse the
old production index format ({variant}-docs-{env}-{timestamp}) to
extract the environment name.
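The resolution described across these commits could be sketched roughly as follows. The precedence (legacy `DOCUMENTATION_ELASTIC_INDEX`, then `DOTNET_ENVIRONMENT`, then `ENVIRONMENT`, then `"dev"`) is taken from the earlier commit message; the class name, the regular expression, the accepted suffixes, and the fall-through on an unparsable legacy value are assumptions rather than the actual `ResolveEnvironment` code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EnvironmentResolverSketch
{
    // Old format: {variant}-docs-{env}-{timestamp}, e.g. lexical-docs-edge-2025.10.23.120521.
    // Assumes env names contain no hyphens; also tolerates a "-latest" suffix or no suffix.
    private static readonly Regex OldIndexFormat = new(
        @"^(?<variant>lexical|semantic)-docs-(?<env>[a-z0-9]+)(-(latest|\d{4}\.\d{2}\.\d{2}\.\d{6}))?$",
        RegexOptions.CultureInvariant);

    // Returns the environment segment of an old-style index name, or null if it does not match.
    public static string? ParseEnvironment(string indexName)
    {
        var match = OldIndexFormat.Match(indexName);
        return match.Success ? match.Groups["env"].Value : null;
    }

    // getEnv abstracts Environment.GetEnvironmentVariable so the logic is testable.
    public static string Resolve(Func<string, string?> getEnv)
    {
        var legacy = getEnv("DOCUMENTATION_ELASTIC_INDEX");
        if (!string.IsNullOrWhiteSpace(legacy) && ParseEnvironment(legacy) is { } parsed)
            return parsed;

        foreach (var name in new[] { "DOTNET_ENVIRONMENT", "ENVIRONMENT" })
        {
            var value = getEnv(name);
            if (!string.IsNullOrWhiteSpace(value))
                return value;
        }

        return "dev";
    }
}
```

In production code the delegate would typically be `Environment.GetEnvironmentVariable`; tests can pass a lambda over a dictionary instead.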
… to simplify index naming logic. Update Elasticsearch dependencies to version 0.28.0.
reakaleek approved these changes on Feb 24, 2026
Summary
Migrate Elasticsearch indexing to source-generated mappings via `Elastic.Mapping` and the `IncrementalSyncOrchestrator` from `Elastic.Ingest.Elasticsearch`, replacing ~420 lines of hand-rolled ingest channel code. Introduces a clear separation between build type (`{type}`) and environment (`{env}`) in all index naming.

Key changes
- `DocumentationMappingConfig.cs` declares index structure, field mappings, and analysis settings using `[Entity<T>]` attributes. The source generator produces a typed `CreateContext(type:, env:)` factory, eliminating manual index name construction.
- `IncrementalSyncOrchestrator` replaces the two manually managed `ElasticsearchLexicalIngestChannel` / `ElasticsearchSemanticIngestChannel` classes. Dual-index writes, alias rotation, and hash-based change detection are now handled by the library.
- `ElasticsearchEndpointFactory` resolves Elasticsearch URL, credentials, and environment from user secrets and env vars in one place, shared by all services.
- … `{type}`, while `{env}` auto-resolves from the `ENVIRONMENT` env var. Previously `{type}` was incorrectly receiving the environment name.

Elasticsearch resource naming changes
All resources now follow a structured `docs-{type}.{variant}-{env}` convention:

| Old name (env=edge) | New name (type=assembler, env=edge) |
| --- | --- |
| `lexical-docs-edge-2025.10.23.120521` | `docs-assembler.lexical-edge-2025.10.23.120521` |
| `lexical-docs-edge-latest` | `docs-assembler.lexical-edge-latest` |
| `lexical-docs-edge` | `docs-assembler.lexical-edge-latest` |
| `lexical-docs-edge-template` | `docs-assembler.lexical-edge-template` |
| `semantic-docs-edge-2025.10.23.120521` | `docs-assembler.semantic-edge-2025.10.23.120521` |
| `semantic-docs-edge-latest` | `docs-assembler.semantic-edge-latest` |
| `semantic-docs-edge` | `docs-assembler.semantic-edge-latest` |
| `semantic-docs-edge-template` | `docs-assembler.semantic-edge-template` |
| `docs-edge` | `docs-assembler` |
| `docs-ruleset-edge` | `docs-ruleset-assembler` |

The old `{variant}-docs-{env}` pattern placed the variant (lexical/semantic) first and embedded only the environment. The new `docs-{type}.{variant}-{env}` pattern groups all docs indices under a common `docs-*` prefix, encodes the build type, and uses a dot-separated structure for ILM/SLM grouping.

Library version bumps
`Elastic.Ingest.Elasticsearch` 0.17.1 → 0.28.0 and `Elastic.Mapping` (new) 0.28.0:
- `IncrementalSyncOrchestrator<T>`: manages dual-index (primary + secondary) writes with coordinated alias rotation, replacing two bespoke channel wrappers and manual multiplex/reindex strategy logic.
- `[ElasticsearchMappingContext]`: generates typed mapping builders, field configurators, and `CreateContext()` from `[Entity<T>]` attributes, eliminating runtime reflection and hand-built JSON mapping strings.
- `HashedBulkUpdate`: content-hash-based deduplication built into the channel, replacing manual hash computation for change detection.
- `BootstrapMethod` / `PreBootstrapTask`: declarative bootstrap lifecycle hooks replace imperative init sequences for synonyms, query rules, and enrichment policies.
- `ConfigureAnalysis` / `IndexSettings` on `ElasticsearchTypeContext`: analysis and settings are composed at context creation rather than injected via callback overrides.

Net effect: deleted `ElasticsearchIngestChannel.cs` (161 lines) and `ElasticsearchIngestChannel.Mapping.cs` (260 lines), and simplified the exporter constructor and lifecycle significantly.

Test plan
- `dotnet build` passes (verified)
- … `docs-assembler.semantic-edge-latest`
- … `docs-assembler` (not `docs-edge`)
- … `docs-assembler.semantic-*` read alias
- … `DOCUMENTATION_ELASTIC_INDEX` env var (e.g. `lexical-docs-edge-2025.10.23.120521`) correctly parses environment as `edge`
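
For reference, the `docs-{type}.{variant}-{env}` convention from the naming section can be expressed as a few string helpers. This is only an illustrative sketch: the class and method names are hypothetical, and in the PR these names come from the source-generated `CreateContext(type:, env:)` factory rather than hand-written code.

```csharp
public static class IndexNamingSketch
{
    // Base name shared by the concrete index, alias, and template: docs-{type}.{variant}-{env}
    public static string BaseName(string type, string variant, string env) =>
        $"docs-{type}.{variant}-{env}";

    // Alias pointing at the most recent index, e.g. docs-assembler.lexical-edge-latest
    public static string LatestAlias(string type, string variant, string env) =>
        $"{BaseName(type, variant, env)}-latest";

    // Index template name, e.g. docs-assembler.lexical-edge-template
    public static string Template(string type, string variant, string env) =>
        $"{BaseName(type, variant, env)}-template";

    // Ruleset name is keyed by build type only, e.g. docs-ruleset-assembler
    public static string Ruleset(string type) => $"docs-ruleset-{type}";
}
```

With `type = "assembler"`, `variant = "semantic"`, and `env = "edge"`, `LatestAlias` yields `docs-assembler.semantic-edge-latest`, matching the table in the naming section.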