From e4a815818e55c383d6958612bae27c1664e74bb5 Mon Sep 17 00:00:00 2001 From: Jeff Mesnil Date: Thu, 29 Jan 2026 10:01:26 +0100 Subject: [PATCH 1/6] chore: Add PRD to improve performance The insights-runtime-extractor must be able to scan only a subset of containers (requested by the Insights Operator) instead of collecting data for *all* containers when the Insights Operator has a hard limit on number of containers that are gathered. This addresses https://issues.redhat.com/browse/CCXDEV-15960. Signed-off-by: Jeff Mesnil --- docs/prd/prd-0001-container-ids-filtering.md | 401 +++++++++++++++++++ 1 file changed, 401 insertions(+) create mode 100644 docs/prd/prd-0001-container-ids-filtering.md diff --git a/docs/prd/prd-0001-container-ids-filtering.md b/docs/prd/prd-0001-container-ids-filtering.md new file mode 100644 index 00000000..076ebc32 --- /dev/null +++ b/docs/prd/prd-0001-container-ids-filtering.md @@ -0,0 +1,401 @@ +# PRD: Container IDs Filtering for Runtime Extraction + +## Overview + +This document describes the changes required to support targeted container scanning via a list of container IDs. Currently, the system either scans all running containers or a single container. This enhancement will allow clients to request scanning of a specific subset of containers by providing their IDs in a POST request. + +## Motivation + +- **Performance**: Scanning only specific containers reduces resource usage and response time. + +--- + +## Current Architecture + +### Request Flow +``` +HTTP Client → Exporter (Go) → Extractor Server (Rust) → Coordinator (Rust) → Fingerprints + GET /gather_runtime_info TCP 3000 /coordinator +``` + +### Current Capabilities +- **Exporter**: Only supports `GET /gather_runtime_info?hash=true|false` +- **Extractor Server**: Receives empty TCP payload, spawns coordinator +- **Coordinator**: Accepts optional single `--container-id ` argument (not used by the extractor server) + +--- + +## Proposed Changes + +### 1. Exporter HTTP Server (Go) + +**File**: `exporter/cmd/exporter/main.go` + +#### New Endpoint +Add a new HTTP POST endpoint alongside the existing GET endpoint: + +| Method | Path | Request Body | Description | +|--------|------|--------------|-------------| +| POST | `/gather_runtime_info` | JSON object with container IDs | Scan specific containers | + +#### Request Format +```json +{ + "containerIds": ["abc123...", "def456...", "ghi789..."] +} +``` + +#### Validation Rules +- Request body must be valid JSON +- `containerIds` field must be present and be an array +- Array can be empty (equivalent to scanning all containers) +- Each container ID must be a non-empty string +- Container IDs should be the short or full CRI-O container ID format + +#### Changes Required + +1. **Add JSON request struct**: + ```go + type GatherRuntimeInfoRequest struct { + ContainerIds []string `json:"containerIds"` + } + ``` + +2. **Update HTTP handler**: + - Parse request method (GET vs POST) + - For POST: parse JSON body and validate + - Pass container IDs to `triggerRuntimeInfoExtraction()` + +3. **Update `triggerRuntimeInfoExtraction()` function**: + - Add parameter: `containerIds []string` + - Serialize container IDs as a single string of comma-separated container IDs before sending over TCP + - Send string payload instead of empty payload to extractor server + +4. **TCP Protocol Change**: + - Current: Empty payload triggers extraction + - New: String payload with a list of comma-separated container IDs + ``` + "abc123,def456,..." + ``` + - Empty string means "scan all containers" + +#### Backward Compatibility +- GET endpoint behavior remains unchanged (scans all containers) +- GET internally sends an empty string `""` to maintain protocol consistency + +--- + +### 2. Extractor Server (Rust) + +**File**: `extractor/src/bin/extractor_server.rs` + +#### Changes Required + +1. **Update `handle_trigger_extraction()` function**: + - Read incoming data from TCP stream (currently ignored) as a String + - Pas the incoming data to the CLI argument if not empty. Wrap them in a String. + +2. **Update coordinator invocation**: + - Current: + ```rust + Command::new("/coordinator") + .arg("--log-level") + .arg(log_level) + .output() + ``` + - New (when container IDs provided): + ```rust + Command::new("/coordinator") + .arg("--log-level") + .arg(log_level) + .arg(container_ids) + .output() + ``` + +3. **Error Handling**: + - Invalid String: Return error response via TCP + - Log validation errors at warn level + +#### TCP Protocol + +| Direction | Format | Example | +|-----------|--------|---------| +| Request | Plain text | `abc123,def456` | +| Response | Plain text (path) | `data/out-1234567890` | + +--- + +### 3. Coordinator (Rust) + +**File**: `extractor/src/bin/coordinator.rs` + +#### Current Arguments +```rust +#[arg(short, long)] +log_level: Option + +#[arg(help = "ID of the container to scan. If absent, all containers are scanned")] +container_id: Option +``` + +#### New Arguments +```rust +#[arg(short, long)] +log_level: Option + +#[arg(help = "Comma-separated list of container IDs to scan")] +container_ids: Option +``` + +#### Changes Required + +1. **Add new CLI argument**: + - Change `container_id` to `container_ids` so that multiple containers ID can be passed (separated by commas) + - Parse them into `Vec` + +2. **Update container filtering logic**: + + - Current location in coordinator.rs `main()`: + ```rust + let containers = match args.container_id { + None => get_containers(), + Some(container_id) => match get_container(&container_id) { + Some(container) => vec![container], + None => vec![], + }, + }; + ``` + - New logic: + ```rust + let containers = match args.container_ids { + None => get_containers(), + Some(container_ids) => { + // split them from commas, + // iterate on them: for each trimmed non-empty container ID, + // call get_container(container_id) + // and collect them in a Vector + }, + }; + + ``` +3. **Update container information retrieval**: + + - Update `get_container(container_id: &String)` to invoke `crictl inspect $container_id` instead of `crictl ps`. + +4. **Logging**: + - Log at info level: "Scanning X containers" + +--- + +## Data Flow Diagram + +``` +┌─────────────────────────────────────────────────────────────────────────────┐ +│ HTTP Client │ +└─────────────────────────────────┬───────────────────────────────────────────┘ + │ + POST /gather_runtime_info + Body: {"containerIds": ["abc123", "def456"]} + │ + ▼ +┌─────────────────────────────────────────────────────────────────────────────┐ +│ EXPORTER (Go) - Port 8000 │ +│ │ +│ 1. Parse POST body as JSON │ +│ 2. Validate containerIds array │ +│ 3. Send comma-separated containerIds to extractor_server via TCP │ +└─────────────────────────────────┬───────────────────────────────────────────┘ + │ + TCP: abc123,def456 + │ + ▼ +┌─────────────────────────────────────────────────────────────────────────────┐ +│ EXTRACTOR_SERVER (Rust) - Port 3000 │ +│ │ +│ 1. Read String from TCP stream │ +│ 2. Execute: /coordinator --log-level info "abc123,def456" │ +└─────────────────────────────────┬───────────────────────────────────────────┘ + │ + Subprocess with args + │ + ▼ +┌─────────────────────────────────────────────────────────────────────────────┐ +│ COORDINATOR (Rust) │ +│ │ +│ 1. Parse executable argument │ +│ 2. Get all running containers via crictl │ +│ 4. Scan filtered containers │ +│ 5. Output path to stdout │ +└─────────────────────────────────────────────────────────────────────────────┘ +``` + +--- + +## API Specification + +### POST /gather_runtime_info + +#### Request + +**Headers**: +``` +Content-Type: application/json +``` + +**Body**: +```json +{ + "containerIds": ["container-id-1", "container-id-2"] +} +``` + +**Field Descriptions**: +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| containerIds | string[] | Yes | List of container IDs to scan. Empty array scans all containers. | + +#### Response + +Same as existing GET endpoint response: + +```json +{ + "node-name": { + "namespace-1": { + "pod-name-1": { + "container-name-1": { + "os": "rhel", + "osVersion": "8.9", + "kind": "Java", + "kindVersion": "17.0.9", + "kindImplementer": "Red Hat, Inc.", + "runtimes": [ + {"name": "Quarkus", "version": "3.2.0"} + ] + } + } + } + } +} +``` + +#### Error Responses + +| Status | Condition | Response Body | +|--------|-----------|---------------| +| 400 | Invalid JSON | `{"error": "Invalid JSON in request body"}` | +| 400 | Missing containerIds | `{"error": "containerIds field is required"}` | +| 400 | Invalid container ID format | `{"error": "Container ID at index N is invalid"}` | +| 500 | Extraction failed | `{"error": "Runtime extraction failed:
"}` | + +--- + +## Testing Requirements + +### Unit Tests + +1. **Exporter**: + - JSON parsing of valid request body + - Validation of container ID formats + - Empty array handling + - Malformed JSON rejection + +2. **Extractor Server**: + - JSON deserialization from TCP stream + - Container ID validation + - Command argument construction + +3. **Coordinator**: + - Comma-separated parsing of the main argument + - Container filtering logic + +### Integration Tests + +1. **End-to-end with specific containers**: + - POST request with known container IDs + - Verify only those containers appear in response + +2. **Empty array behavior**: + - POST with `{"containerIds": []}` should scan all containers + - Result should match GET endpoint + +3. **Non-existent container IDs**: + - POST with IDs that don't match running containers + - Should return empty/partial results gracefully + +4. **Mixed valid/invalid IDs**: + - Some IDs exist, some don't + - Should scan existing ones, log warnings for missing + +### E2E Tests + +Add test cases to existing e2e test suite in `runtime-samples/`: +- Test POST endpoint with subset of containers +- Verify filtering works correctly in Kubernetes environment + +--- + +## Implementation Phases + +### Phase 1: Coordinator Changes +1. Change main argument `container_ids` to `container_id` +2. Implement container filtering logic +3. Add unit tests +4. Manual testing with direct coordinator invocation + +### Phase 2: Extractor Server Changes +1. Update coordinator invocation +2. Add error handling +3. Add unit tests + +### Phase 3: Exporter Changes +1. Add POST endpoint handler +2. Implement JSON validation +3. Update TCP communication +4. Add unit tests + +### Phase 4: Integration & E2E +1. Integration tests +2. E2E tests in runtime-samples +3. Documentation updates + +--- + +## Backward Compatibility + +| Component | Backward Compatible | Notes | +|-----------|---------------------|-------| +| Exporter | Yes | GET endpoint unchanged | +| Extractor Server | Yes | Empty payload treated as empty containerIds | +| Coordinator | Yes | passing a singular container ID still works | + +--- + +## Security Considerations + +- No new privileges required +- No shell injection risk (IDs passed as single argument, not executed) +- Logging does not expose sensitive data + +--- + +## Open Questions + +1. **Container ID format**: Should we support both short (12-char) and full (64-char) container IDs? + - Recommendation: Yes, support both with prefix matching + +2. **Maximum number of container IDs**: Should there be a limit? + - Recommendation: No hard limit, but document performance implications + +3. **Response for non-existent IDs**: Should we return an error or just skip missing containers? + - Recommendation: Skip missing, return results for found containers, log warnings + +--- + +## Files to Modify + +| File | Changes | +|------|---------| +| `exporter/cmd/exporter/main.go` | Add POST handler, JSON parsing, update TCP payload | +| `extractor/src/bin/extractor_server.rs` | Add JSON deserialization, update coordinator args | +| `extractor/src/bin/coordinator.rs` | Change main argument, filtering logic | From cefcfea6e58916362a5306b44e110127fb1cded2 Mon Sep 17 00:00:00 2001 From: Jeff Mesnil Date: Thu, 29 Jan 2026 11:28:46 +0100 Subject: [PATCH 2/6] chore: Bump to use ocp/4.22 base images Signed-off-by: Jeff Mesnil --- .ci-operator.yaml | 2 +- Containerfile-exporter | 4 ++-- Containerfile-extractor | 6 +++--- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/.ci-operator.yaml b/.ci-operator.yaml index e307e5af..284a9100 100644 --- a/.ci-operator.yaml +++ b/.ci-operator.yaml @@ -1,4 +1,4 @@ build_root_image: name: release namespace: openshift - tag: rhel-9-release-golang-1.24-openshift-4.21 + tag: rhel-9-release-golang-1.24-openshift-4.22 diff --git a/Containerfile-exporter b/Containerfile-exporter index 26047258..f8e0f03d 100644 --- a/Containerfile-exporter +++ b/Containerfile-exporter @@ -1,4 +1,4 @@ -FROM registry.ci.openshift.org/ocp/builder:rhel-9-golang-1.24-openshift-4.21 AS go-builder +FROM registry.ci.openshift.org/ocp/builder:rhel-9-golang-1.24-openshift-4.22 AS go-builder WORKDIR /workspace/exporter COPY exporter . @@ -6,7 +6,7 @@ ARG GO_LDFLAGS="" ENV GOEXPERIMENT=strictfipsruntime RUN CGO_ENABLED=0 GOOS=linux GO111MODULE=on make build -FROM registry.ci.openshift.org/ocp/4.21:base-rhel9 +FROM registry.ci.openshift.org/ocp/4.22:base-rhel9 COPY --from=go-builder /workspace/exporter/bin/exporter / ENTRYPOINT [ "/exporter" ] \ No newline at end of file diff --git a/Containerfile-extractor b/Containerfile-extractor index b186ca43..18734261 100644 --- a/Containerfile-extractor +++ b/Containerfile-extractor @@ -1,4 +1,4 @@ -FROM registry.ci.openshift.org/ocp/4.21:base-rhel9 AS rust-builder +FROM registry.ci.openshift.org/ocp/4.22:base-rhel9 AS rust-builder ARG TARGETARCH RUN dnf update -y && dnf -y install \ @@ -8,7 +8,7 @@ WORKDIR /workspace/extractor COPY extractor . RUN make TARGETARCH=${TARGETARCH} -FROM registry.ci.openshift.org/ocp/builder:rhel-9-golang-1.24-openshift-4.21 AS go-builder +FROM registry.ci.openshift.org/ocp/builder:rhel-9-golang-1.24-openshift-4.22 AS go-builder WORKDIR /workspace/fingerprints COPY fingerprints . @@ -16,7 +16,7 @@ ARG GO_LDFLAGS="" ENV GOEXPERIMENT=strictfipsruntime RUN CGO_ENABLED=0 GOOS=linux GO111MODULE=on make build -FROM registry.ci.openshift.org/ocp/4.21:base-rhel9 +FROM registry.ci.openshift.org/ocp/4.22:base-rhel9 RUN dnf update -y && dnf -y install \ cri-tools From 6498edb0f2d4e64456f76902ea514e220bdf3dee Mon Sep 17 00:00:00 2001 From: Jeff Mesnil Date: Fri, 30 Jan 2026 11:58:00 +0100 Subject: [PATCH 3/6] feat: Update coordinator to scan a list of containers The coordinator executable now accepts a comma-separated list of container ID that needs to be collected. If this list is empty, *all* running containers are collected. Signed-off-by: Jeff Mesnil --- extractor/src/bin/coordinator.rs | 28 +++++--- extractor/src/insights_runtime_extractor.rs | 2 +- .../insights_runtime_extractor/container.rs | 66 ++++--------------- extractor/src/lib.rs | 1 - 4 files changed, 32 insertions(+), 65 deletions(-) diff --git a/extractor/src/bin/coordinator.rs b/extractor/src/bin/coordinator.rs index f82d1aac..83c77f1f 100644 --- a/extractor/src/bin/coordinator.rs +++ b/extractor/src/bin/coordinator.rs @@ -2,7 +2,7 @@ use clap::Parser; use log::info; use std::time::{SystemTime, UNIX_EPOCH}; -use insights_runtime_extractor::{config, file, get_container, get_containers, perms}; +use insights_runtime_extractor::{config, file, get_containers, perms}; #[derive(Parser, Debug)] #[command(about, long_about = None)] @@ -14,8 +14,10 @@ struct Args { )] log_level: Option, - #[arg(help = "ID of the container to scan. If absent, all containers are scanned")] - container_id: Option, + #[arg( + help = "Comma-separated list of container IDs to scan. If absent, all containers are scanned" + )] + container_ids: Option, } fn main() { @@ -43,13 +45,19 @@ fn main() { &exec_dir ); - let containers = match args.container_id { - None => get_containers(), - Some(container_id) => match get_container(&container_id) { - Some(container) => vec![container], - None => vec![], - }, - }; + let container_ids: Vec = args + .container_ids + .map(|ids| { + ids.split(',') + .map(|id| id.trim().to_string()) + .filter(|id| !id.is_empty()) + .collect() + }) + .unwrap_or_default(); + + let containers = get_containers(container_ids); + + info!("Scanning {} containers", containers.len()); for container in containers { info!( diff --git a/extractor/src/insights_runtime_extractor.rs b/extractor/src/insights_runtime_extractor.rs index acd049ec..e09be686 100644 --- a/extractor/src/insights_runtime_extractor.rs +++ b/extractor/src/insights_runtime_extractor.rs @@ -71,7 +71,7 @@ pub fn scan_container(config: &Config, out: &String, container: &Container) { let leaves = process::get_process_leaves(&root_pid); // fingerprint only the first process - info!("🔎 Fingerprinting {} processes...", leaves.len()); + debug!("🔎 Fingerprinting {} processes...", leaves.len()); if let Some(process) = leaves.get(0) { // create a directory to store this process' fingerprints diff --git a/extractor/src/insights_runtime_extractor/container.rs b/extractor/src/insights_runtime_extractor/container.rs index c7bf0a1a..1cd31f57 100644 --- a/extractor/src/insights_runtime_extractor/container.rs +++ b/extractor/src/insights_runtime_extractor/container.rs @@ -18,58 +18,13 @@ pub struct Container { pub pid: u32, } -pub fn get_container(container_id: &String) -> Option { - info!("🔎 Reading container information with crictl..."); - - let output = Command::new("crictl") - .args(["ps", "-o", "json", "-s", "RUNNING"]) - .output() - .expect("List containers with crictl"); - let json = String::from_utf8(output.stdout).unwrap(); - - let deserialized_containers: Value = serde_json::from_str(&json).unwrap(); - - let container_id = match container_id.strip_prefix("cri-o://") { - Some(container_id) => container_id.to_string(), - None => container_id.to_string(), - }; - - for c in deserialized_containers["containers"].as_array().unwrap() { - let id = c["id"].as_str().unwrap().to_string(); - - match id == container_id { - false => {} - true => { - let pod_namespace = c["labels"]["io.kubernetes.pod.namespace"] - .as_str() - .unwrap() - .to_string(); - let image_ref = c["imageRef"].as_str().unwrap().to_string(); - let name = c["labels"]["io.kubernetes.container.name"] - .as_str() - .unwrap() - .to_string(); - let pod_name = c["labels"]["io.kubernetes.pod.name"] - .as_str() - .unwrap() - .to_string(); - let pid: u32 = get_root_pid(&id); - return Some(Container { - id: "cri-o://".to_owned() + &id, - image_ref, - name, - pod_name, - pod_namespace, - pid, - }); - } - } - } - - None -} +pub fn get_containers(container_ids: Vec) -> Vec { + // Normalize container_ids by stripping the cri-o:// prefix + let ids_to_collect: Vec = container_ids + .iter() + .map(|id| id.strip_prefix("cri-o://").unwrap_or(id).to_string()) + .collect(); -pub fn get_containers() -> Vec { info!("🔎 Reading container information with crictl..."); let output = Command::new("crictl") @@ -85,12 +40,17 @@ pub fn get_containers() -> Vec { let mut containers: Vec = Vec::new(); for c in deserialized_containers["containers"].as_array().unwrap() { + let id = c["id"].as_str().unwrap().to_string(); + + // If ids_to_collect is not empty, skip containers not in the list + if !ids_to_collect.is_empty() && !ids_to_collect.contains(&id) { + continue; + } + let pod_namespace = c["labels"]["io.kubernetes.pod.namespace"] .as_str() .unwrap() .to_string(); - - let id = c["id"].as_str().unwrap().to_string(); let image_ref = c["imageRef"].as_str().unwrap().to_string(); let name = c["labels"]["io.kubernetes.container.name"] .as_str() diff --git a/extractor/src/lib.rs b/extractor/src/lib.rs index d495e98a..5396f916 100644 --- a/extractor/src/lib.rs +++ b/extractor/src/lib.rs @@ -1,7 +1,6 @@ mod insights_runtime_extractor; pub use crate::insights_runtime_extractor::config; -pub use crate::insights_runtime_extractor::container::get_container; pub use crate::insights_runtime_extractor::container::get_containers; pub use crate::insights_runtime_extractor::file; pub use crate::insights_runtime_extractor::perms; From d51bacaf4e7c1ed58b71862b6ed98981fa18efd7 Mon Sep 17 00:00:00 2001 From: Jeff Mesnil Date: Fri, 30 Jan 2026 13:33:57 +0100 Subject: [PATCH 4/6] feat: accept list of container ids from the extractor TCP server When a TCP connection is created, read a String that is expected to contain a list of container IDs. If the string is empty (as was previously the case), all containers on the node are scanned. Signed-off-by: Jeff Mesnil --- extractor/src/bin/extractor_server.rs | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/extractor/src/bin/extractor_server.rs b/extractor/src/bin/extractor_server.rs index c6f7c868..b3105e4b 100644 --- a/extractor/src/bin/extractor_server.rs +++ b/extractor/src/bin/extractor_server.rs @@ -1,7 +1,7 @@ use clap::Parser; -use log::{error, info, trace}; +use log::{error, info, trace, warn}; use std::fs; -use std::io::Write; +use std::io::{Read, Write}; use std::net::{Shutdown, TcpListener, TcpStream}; use std::process::Command; use std::thread; @@ -60,10 +60,22 @@ fn handle_trigger_extraction(mut stream: TcpStream, log_level: String) { let start = Instant::now(); + // Read container IDs from TCP stream until EOF + let mut buffer = String::new(); + if let Err(e) = stream.read_to_string(&mut buffer) { + warn!("Failed to read from socket; err = {:?}", e); + if let Err(e) = stream.shutdown(Shutdown::Both) { + error!("Failed to shutdown socket; err = {:?}", e); + } + return; + } + let container_ids = buffer.trim().to_string(); + // Execute the "extractor_coordinator" program let output = Command::new("/coordinator") .arg("--log-level") .arg(log_level) + .arg(container_ids) .output(); match output { Ok(output) => { From 7d0c1c3233b66f45999ba8fe5970580e1dc00169 Mon Sep 17 00:00:00 2001 From: Jeff Mesnil Date: Fri, 30 Jan 2026 15:24:16 +0100 Subject: [PATCH 5/6] feat: Add POST endpoint to filter containers by ID Add POST /gather_runtime_info endpoint that accepts a JSON body with containerIds array to scan specific containers. The container IDs are sent as comma-separated values over TCP to the extractor server. Uses CloseWrite() to signal EOF for proper TCP stream handling. Signed-off-by: Jeff Mesnil --- exporter/cmd/exporter/main.go | 44 +++++++++++++++++++++++++++++++---- 1 file changed, 39 insertions(+), 5 deletions(-) diff --git a/exporter/cmd/exporter/main.go b/exporter/cmd/exporter/main.go index 2aef5f8a..5f6b0d6f 100644 --- a/exporter/cmd/exporter/main.go +++ b/exporter/cmd/exporter/main.go @@ -22,10 +22,32 @@ const ( EXTRACTOR_ADDRESS string = "127.0.0.1:3000" ) +// GatherRuntimeInfoRequest represents the JSON body for POST requests +type GatherRuntimeInfoRequest struct { + ContainerIds []string `json:"containerIds"` +} + // gatherRuntimeInfo will trigger a new extraction of runtime info // and reply with a JSON payload func gatherRuntimeInfo(w http.ResponseWriter, r *http.Request) { - if r.Method != "GET" { + var containerIds []string + + switch r.Method { + case "GET": + // GET scans all containers (empty containerIds) + containerIds = []string{} + case "POST": + var req GatherRuntimeInfoRequest + if err := json.NewDecoder(r.Body).Decode(&req); err != nil { + http.Error(w, `{"error": "Invalid JSON in request body"}`, http.StatusBadRequest) + return + } + if req.ContainerIds == nil { + http.Error(w, `{"error": "containerIds field is required"}`, http.StatusBadRequest) + return + } + containerIds = req.ContainerIds + default: http.Error(w, "Method is not supported.", http.StatusNotFound) return } @@ -34,7 +56,8 @@ func gatherRuntimeInfo(w http.ResponseWriter, r *http.Request) { hash := hashParam == "" || hashParam == "true" startTime := time.Now() - dataPath, err := triggerRuntimeInfoExtraction() + dataPath, err := triggerRuntimeInfoExtraction(containerIds) + if err != nil { http.Error(w, err.Error(), http.StatusInternalServerError) return @@ -63,7 +86,7 @@ func gatherRuntimeInfo(w http.ResponseWriter, r *http.Request) { w.Write(response) } -func triggerRuntimeInfoExtraction() (string, error) { +func triggerRuntimeInfoExtraction(containerIds []string) (string, error) { conn, err := net.Dial("tcp", EXTRACTOR_ADDRESS) if err != nil { return "", err @@ -71,10 +94,21 @@ func triggerRuntimeInfoExtraction() (string, error) { defer conn.Close() log.Println("Requesting a new runtime extraction") + tcpConn, ok := conn.(*net.TCPConn) + if !ok { + return "", fmt.Errorf("failed to get TCP connection") + } + + // Send comma-separated container IDs (empty string if no specific containers requested) + payload := strings.Join(containerIds, ",") // write to TCP connection to trigger a runtime extraction - fmt.Fprintf(conn, "") + fmt.Fprintf(tcpConn, "%s", payload) + // and close the write side to signal EOF to the server + if err := tcpConn.CloseWrite(); err != nil { + return "", fmt.Errorf("failed to close write side: %w", err) + } - dataPath, err := bufio.NewReader(conn).ReadString('\n') + dataPath, err := bufio.NewReader(tcpConn).ReadString('\n') if err != nil { return "", err } From e18320a7988c5a0d7238af8b7ad2d8878f3ec725 Mon Sep 17 00:00:00 2001 From: Jeff Mesnil Date: Fri, 30 Jan 2026 15:34:39 +0100 Subject: [PATCH 6/6] chore: update PRD for performance improvement Signed-off-by: Jeff Mesnil --- docs/prd/prd-0001-container-ids-filtering.md | 303 +++++++------------ 1 file changed, 117 insertions(+), 186 deletions(-) diff --git a/docs/prd/prd-0001-container-ids-filtering.md b/docs/prd/prd-0001-container-ids-filtering.md index 076ebc32..c2e2bbe7 100644 --- a/docs/prd/prd-0001-container-ids-filtering.md +++ b/docs/prd/prd-0001-container-ids-filtering.md @@ -2,7 +2,7 @@ ## Overview -This document describes the changes required to support targeted container scanning via a list of container IDs. Currently, the system either scans all running containers or a single container. This enhancement will allow clients to request scanning of a specific subset of containers by providing their IDs in a POST request. +This document describes the changes implemented to support targeted container scanning via a list of container IDs. Previously, the system only scanned all running containers. This enhancement allows clients to request scanning of a specific subset of containers by providing their IDs in a POST request. ## Motivation @@ -10,35 +10,35 @@ This document describes the changes required to support targeted container scann --- -## Current Architecture +## Architecture ### Request Flow ``` HTTP Client → Exporter (Go) → Extractor Server (Rust) → Coordinator (Rust) → Fingerprints - GET /gather_runtime_info TCP 3000 /coordinator + GET/POST /gather_runtime_info TCP 3000 /coordinator ``` -### Current Capabilities -- **Exporter**: Only supports `GET /gather_runtime_info?hash=true|false` -- **Extractor Server**: Receives empty TCP payload, spawns coordinator -- **Coordinator**: Accepts optional single `--container-id ` argument (not used by the extractor server) +### Capabilities +- **Exporter**: Supports `GET /gather_runtime_info?hash=true|false` (scans all) and `POST /gather_runtime_info` (scans specific containers) +- **Extractor Server**: Receives comma-separated container IDs via TCP, spawns coordinator with IDs as argument +- **Coordinator**: Accepts optional `container_ids` positional argument (comma-separated list) --- -## Proposed Changes +## Implemented Changes ### 1. Exporter HTTP Server (Go) **File**: `exporter/cmd/exporter/main.go` -#### New Endpoint -Add a new HTTP POST endpoint alongside the existing GET endpoint: +#### Endpoints | Method | Path | Request Body | Description | |--------|------|--------------|-------------| +| GET | `/gather_runtime_info` | None | Scan all containers | | POST | `/gather_runtime_info` | JSON object with container IDs | Scan specific containers | -#### Request Format +#### Request Format (POST) ```json { "containerIds": ["abc123...", "def456...", "ghi789..."] @@ -49,38 +49,35 @@ Add a new HTTP POST endpoint alongside the existing GET endpoint: - Request body must be valid JSON - `containerIds` field must be present and be an array - Array can be empty (equivalent to scanning all containers) -- Each container ID must be a non-empty string -- Container IDs should be the short or full CRI-O container ID format +- Container IDs must be in full CRI-O format, with or without `cri-o://` prefix -#### Changes Required +#### Implementation -1. **Add JSON request struct**: +1. **JSON request struct**: ```go type GatherRuntimeInfoRequest struct { ContainerIds []string `json:"containerIds"` } ``` -2. **Update HTTP handler**: - - Parse request method (GET vs POST) - - For POST: parse JSON body and validate - - Pass container IDs to `triggerRuntimeInfoExtraction()` +2. **HTTP handler** (`gatherRuntimeInfo`): + - For GET: uses empty `containerIds` slice + - For POST: decodes JSON body and validates `containerIds` field is present + - Passes container IDs to `triggerRuntimeInfoExtraction()` -3. **Update `triggerRuntimeInfoExtraction()` function**: - - Add parameter: `containerIds []string` - - Serialize container IDs as a single string of comma-separated container IDs before sending over TCP - - Send string payload instead of empty payload to extractor server +3. **`triggerRuntimeInfoExtraction()` function**: + - Accepts `containerIds []string` parameter + - Joins container IDs with commas: `strings.Join(containerIds, ",")` + - Sends payload via TCP + - **Calls `tcpConn.CloseWrite()` to signal EOF** - this is critical because the Rust server uses `read_to_string()` which reads until EOF -4. **TCP Protocol Change**: - - Current: Empty payload triggers extraction - - New: String payload with a list of comma-separated container IDs - ``` - "abc123,def456,..." - ``` - - Empty string means "scan all containers" +#### TCP Protocol +- Client sends comma-separated container IDs as plain text +- Client calls `CloseWrite()` to signal end of data +- Server responds with path to extracted data #### Backward Compatibility -- GET endpoint behavior remains unchanged (scans all containers) +- GET endpoint behavior unchanged (scans all containers) - GET internally sends an empty string `""` to maintain protocol consistency --- @@ -89,39 +86,28 @@ Add a new HTTP POST endpoint alongside the existing GET endpoint: **File**: `extractor/src/bin/extractor_server.rs` -#### Changes Required - -1. **Update `handle_trigger_extraction()` function**: - - Read incoming data from TCP stream (currently ignored) as a String - - Pas the incoming data to the CLI argument if not empty. Wrap them in a String. - -2. **Update coordinator invocation**: - - Current: - ```rust - Command::new("/coordinator") - .arg("--log-level") - .arg(log_level) - .output() - ``` - - New (when container IDs provided): - ```rust - Command::new("/coordinator") - .arg("--log-level") - .arg(log_level) - .arg(container_ids) - .output() - ``` - -3. **Error Handling**: - - Invalid String: Return error response via TCP - - Log validation errors at warn level +#### Implementation + +1. **`handle_trigger_extraction()` function**: + - Reads incoming data from TCP stream using `read_to_string()` (reads until EOF) + - Trims the received string + - Passes container IDs as positional argument to coordinator + +2. **Coordinator invocation**: + ```rust + Command::new("/coordinator") + .arg("--log-level") + .arg(log_level) + .arg(container_ids) // comma-separated string, passed even if empty + .output() + ``` #### TCP Protocol | Direction | Format | Example | |-----------|--------|---------| -| Request | Plain text | `abc123,def456` | -| Response | Plain text (path) | `data/out-1234567890` | +| Request | Plain text (read until EOF) | `abc123,def456` | +| Response | Plain text (path) | `data/out-1234567890\n` | --- @@ -129,61 +115,50 @@ Add a new HTTP POST endpoint alongside the existing GET endpoint: **File**: `extractor/src/bin/coordinator.rs` -#### Current Arguments +#### CLI Arguments ```rust #[arg(short, long)] log_level: Option -#[arg(help = "ID of the container to scan. If absent, all containers are scanned")] -container_id: Option +#[arg(help = "Comma-separated list of container IDs to scan. If absent, all containers are scanned")] +container_ids: Option ``` -#### New Arguments -```rust -#[arg(short, long)] -log_level: Option +#### Implementation + +1. **Parsing container IDs**: + ```rust + let container_ids: Vec = args + .container_ids + .map(|ids| { + ids.split(',') + .map(|id| id.trim().to_string()) + .filter(|id| !id.is_empty()) + .collect() + }) + .unwrap_or_default(); + ``` -#[arg(help = "Comma-separated list of container IDs to scan")] -container_ids: Option -``` +2. **Container retrieval**: + - Calls `get_containers(container_ids)` passing the parsed vector + - Empty vector means "scan all containers" -#### Changes Required - -1. **Add new CLI argument**: - - Change `container_id` to `container_ids` so that multiple containers ID can be passed (separated by commas) - - Parse them into `Vec` - -2. **Update container filtering logic**: - - - Current location in coordinator.rs `main()`: - ```rust - let containers = match args.container_id { - None => get_containers(), - Some(container_id) => match get_container(&container_id) { - Some(container) => vec![container], - None => vec![], - }, - }; - ``` - - New logic: - ```rust - let containers = match args.container_ids { - None => get_containers(), - Some(container_ids) => { - // split them from commas, - // iterate on them: for each trimmed non-empty container ID, - // call get_container(container_id) - // and collect them in a Vector - }, - }; - - ``` -3. **Update container information retrieval**: - - - Update `get_container(container_id: &String)` to invoke `crictl inspect $container_id` instead of `crictl ps`. - -4. **Logging**: - - Log at info level: "Scanning X containers" +3. **Logging**: + - Logs "Scanning X containers" at info level + +--- + +### 4. Container Module (Rust) + +**File**: `extractor/src/insights_runtime_extractor/container.rs` + +#### Implementation + +1. **`get_containers(container_ids: Vec)` function**: + - Normalizes container IDs by stripping `cri-o://` prefix if present + - Runs `crictl ps -o json -s RUNNING` to get all running containers + - If `container_ids` is not empty, filters to only include matching containers + - Returns `Vec` --- @@ -202,17 +177,18 @@ container_ids: Option │ EXPORTER (Go) - Port 8000 │ │ │ │ 1. Parse POST body as JSON │ -│ 2. Validate containerIds array │ +│ 2. Validate containerIds field is present │ │ 3. Send comma-separated containerIds to extractor_server via TCP │ +│ 4. Call CloseWrite() to signal EOF │ └─────────────────────────────────┬───────────────────────────────────────────┘ │ - TCP: abc123,def456 + TCP: "abc123,def456" + EOF │ ▼ ┌─────────────────────────────────────────────────────────────────────────────┐ │ EXTRACTOR_SERVER (Rust) - Port 3000 │ │ │ -│ 1. Read String from TCP stream │ +│ 1. Read String from TCP stream until EOF (read_to_string) │ │ 2. Execute: /coordinator --log-level info "abc123,def456" │ └─────────────────────────────────┬───────────────────────────────────────────┘ │ @@ -222,8 +198,9 @@ container_ids: Option ┌─────────────────────────────────────────────────────────────────────────────┐ │ COORDINATOR (Rust) │ │ │ -│ 1. Parse executable argument │ -│ 2. Get all running containers via crictl │ +│ 1. Parse comma-separated container IDs from argument │ +│ 2. Get all running containers via crictl ps │ +│ 3. Filter to only requested containers (if any specified) │ │ 4. Scan filtered containers │ │ 5. Output path to stdout │ └─────────────────────────────────────────────────────────────────────────────┘ @@ -252,7 +229,7 @@ Content-Type: application/json **Field Descriptions**: | Field | Type | Required | Description | |-------|------|----------|-------------| -| containerIds | string[] | Yes | List of container IDs to scan. Empty array scans all containers. | +| containerIds | string[] | Yes | List of container IDs to scan. Empty array scans all containers. Supports full CRI-O IDs, with or without `cri-o://` prefix. | #### Response @@ -260,19 +237,17 @@ Same as existing GET endpoint response: ```json { - "node-name": { - "namespace-1": { - "pod-name-1": { - "container-name-1": { - "os": "rhel", - "osVersion": "8.9", - "kind": "Java", - "kindVersion": "17.0.9", - "kindImplementer": "Red Hat, Inc.", - "runtimes": [ - {"name": "Quarkus", "version": "3.2.0"} - ] - } + "namespace-1": { + "pod-name-1": { + "container-id-1": { + "os": "rhel", + "osVersion": "8.9", + "kind": "Java", + "kindVersion": "17.0.9", + "kindImplementer": "Red Hat, Inc.", + "runtimes": [ + {"name": "Quarkus", "version": "3.2.0"} + ] } } } @@ -285,30 +260,10 @@ Same as existing GET endpoint response: |--------|-----------|---------------| | 400 | Invalid JSON | `{"error": "Invalid JSON in request body"}` | | 400 | Missing containerIds | `{"error": "containerIds field is required"}` | -| 400 | Invalid container ID format | `{"error": "Container ID at index N is invalid"}` | -| 500 | Extraction failed | `{"error": "Runtime extraction failed:
"}` | +| 500 | Extraction failed | Error message from extraction process | --- -## Testing Requirements - -### Unit Tests - -1. **Exporter**: - - JSON parsing of valid request body - - Validation of container ID formats - - Empty array handling - - Malformed JSON rejection - -2. **Extractor Server**: - - JSON deserialization from TCP stream - - Container ID validation - - Command argument construction - -3. **Coordinator**: - - Comma-separated parsing of the main argument - - Container filtering logic - ### Integration Tests 1. **End-to-end with specific containers**: @@ -325,7 +280,7 @@ Same as existing GET endpoint response: 4. **Mixed valid/invalid IDs**: - Some IDs exist, some don't - - Should scan existing ones, log warnings for missing + - Should scan existing ones only ### E2E Tests @@ -335,39 +290,13 @@ Add test cases to existing e2e test suite in `runtime-samples/`: --- -## Implementation Phases - -### Phase 1: Coordinator Changes -1. Change main argument `container_ids` to `container_id` -2. Implement container filtering logic -3. Add unit tests -4. Manual testing with direct coordinator invocation - -### Phase 2: Extractor Server Changes -1. Update coordinator invocation -2. Add error handling -3. Add unit tests - -### Phase 3: Exporter Changes -1. Add POST endpoint handler -2. Implement JSON validation -3. Update TCP communication -4. Add unit tests - -### Phase 4: Integration & E2E -1. Integration tests -2. E2E tests in runtime-samples -3. Documentation updates - ---- - ## Backward Compatibility | Component | Backward Compatible | Notes | |-----------|---------------------|-------| | Exporter | Yes | GET endpoint unchanged | -| Extractor Server | Yes | Empty payload treated as empty containerIds | -| Coordinator | Yes | passing a singular container ID still works | +| Extractor Server | Yes | Empty string treated as "scan all" | +| Coordinator | Yes | Empty/missing argument scans all containers | --- @@ -379,23 +308,25 @@ Add test cases to existing e2e test suite in `runtime-samples/`: --- -## Open Questions +## Design Decisions + +1. **Container ID format**: Full (64-char) container IDs are supported. The `cri-o://` prefix is automatically stripped if present. + +2. **Maximum number of container IDs**: No hard limit imposed. -1. **Container ID format**: Should we support both short (12-char) and full (64-char) container IDs? - - Recommendation: Yes, support both with prefix matching +3. **Response for non-existent IDs**: Missing containers are silently skipped. Results are returned for found containers only. -2. **Maximum number of container IDs**: Should there be a limit? - - Recommendation: No hard limit, but document performance implications +4. **TCP EOF signaling**: The Go exporter uses `CloseWrite()` to signal EOF because the Rust server uses `read_to_string()` which reads until EOF. -3. **Response for non-existent IDs**: Should we return an error or just skip missing containers? - - Recommendation: Skip missing, return results for found containers, log warnings +5. **Filtering approach**: Instead of calling `crictl inspect` for each container ID, the implementation fetches all running containers with `crictl ps` and filters in memory. This is simpler and avoids multiple subprocess calls. --- -## Files to Modify +## Files Modified | File | Changes | |------|---------| -| `exporter/cmd/exporter/main.go` | Add POST handler, JSON parsing, update TCP payload | -| `extractor/src/bin/extractor_server.rs` | Add JSON deserialization, update coordinator args | -| `extractor/src/bin/coordinator.rs` | Change main argument, filtering logic | +| `exporter/cmd/exporter/main.go` | Add POST handler, JSON parsing, TCP EOF signaling with CloseWrite() | +| `extractor/src/bin/extractor_server.rs` | Read container IDs from TCP stream, pass to coordinator | +| `extractor/src/bin/coordinator.rs` | Accept container_ids positional argument, parse and pass to get_containers | +| `extractor/src/insights_runtime_extractor/container.rs` | Update get_containers to accept Vec and filter containers |