diff --git a/.github/instructions/code-review.instructions.md b/.github/instructions/code-review.instructions.md new file mode 100644 index 0000000..0849eb8 --- /dev/null +++ b/.github/instructions/code-review.instructions.md @@ -0,0 +1,26 @@ +--- +applyTo: "**/*.rs" +--- + +# Code Review Instructions for xml-sec + +## Scope Rules (CRITICAL) + +- **Review ONLY code within the PR's diff.** Do not suggest inline fixes for unchanged lines. +- For issues in code **outside the diff**, suggest creating a **separate issue** instead of proposing code changes. Example: "Consider opening an issue to add namespace validation here — this is outside this PR's scope." +- **Read the PR description carefully.** If the PR body has an "out of scope" section listing items handled by other PRs, do NOT flag those items. + +## Rust Standards + +- `unsafe` blocks require `// SAFETY:` comments explaining the invariant +- Prefer `#[expect(lint)]` over `#[allow(lint)]` — `#[expect]` warns when suppression becomes unnecessary +- Use `TryFrom`/`TryInto` for fallible conversions; `as` casts need `#[expect(clippy::cast_possible_truncation)]` with reason +- No `unwrap()` / `expect()` on I/O paths — use `Result` propagation +- `expect()` is acceptable for programmer invariants (lock poisoning) with `#[expect(clippy::expect_used, reason = "...")]` +- Code must pass `cargo clippy --all-features -- -D warnings` + +## Testing + +- Corruption/validation tests: tamper the relevant field (e.g., digest value, signature bytes, base64 encoding) and assert the error +- Use same serialization as production (e.g., canonical XML output, not shortcuts) +- Test naming: `fn __()` diff --git a/.github/instructions/rust.instructions.md b/.github/instructions/rust.instructions.md new file mode 100644 index 0000000..7d7a51d --- /dev/null +++ b/.github/instructions/rust.instructions.md @@ -0,0 +1,80 @@ +--- +applyTo: "**/*.rs" +--- + +# Rust Code Review Instructions + +## Review Priority (HIGH -> LOW) + +Focus review effort on real bugs, not cosmetics. Stop after finding issues in higher tiers -- do not pad reviews with low-priority nitpicks. + +### Tier 1 -- Logic Bugs and Correctness (MUST flag) +- Data corruption: wrong algorithm, incorrect ordering, dropped or duplicated data +- Off-by-one in boundaries, index lookups, or range operations +- Checksum/hash mismatches: computing over wrong byte range, verifying against stale value +- TOCTOU: checking state then acting on it without holding a lock or atomic operation +- Missing validation: unchecked index, unvalidated input from network/disk/external source +- Resource leaks: unclosed file handles, missing cleanup on error paths +- Concurrency: data races, lock ordering violations, missing synchronization +- Error swallowing: `let _ = fallible_call()` silently dropping errors that affect correctness +- Integer overflow/truncation on sizes, offsets, lengths, or security-critical values + +### Tier 2 -- Safety and Crash Recovery (MUST flag) +- `unsafe` without `// SAFETY:` invariant explanation +- `unwrap()`/`expect()` on I/O, network, or deserialization paths (must use `Result` propagation) +- Crash safety: write ordering that leaves data unrecoverable after power loss +- Sensitive data (keys, passwords, tokens) not wrapped in `Zeroize`/`Zeroizing` +- Constant-time comparison not used for cryptographic MACs/checksums +- Hardcoded secrets, credentials, or private URLs + +### Tier 3 -- API Design and Robustness (flag if clear improvement) +- Public API missing `#[must_use]` on builder-style methods or non-`Result` types callers might discard +- `pub` visibility where `pub(crate)` suffices +- Missing `Send + Sync` bounds on types used across threads +- `Clone` on large types where a reference would work +- Fallible operations returning `()` instead of `Result` + +### Tier 4 -- Style (ONLY flag if misleading or confusing) +- Variable/function names that actively mislead about behavior +- Dead code (unused functions, unreachable branches) + +## DO NOT Flag (Explicit Exclusions) + +These are not actionable review findings. Do not raise them: + +- **Comment wording vs code behavior**: If a comment says "flush when full" but the threshold uses `>=` not `>`, the intent is clear. Do not suggest rewording comments to match exact operators or implementation details. Comments describe intent, not repeat the code. +- **Comment precision**: "returns the key" when it technically returns `Result` -- the comment conveys meaning, not type signature. +- **Magic numbers with context**: `4` in `assert_eq!(header.len(), 4, "expected u32 checksum")` -- the assertion message provides context. Do not suggest a named constant when the value is used once in a test with an explanatory message. +- **Domain constants**: Specific numeric values for block sizes (e.g., `4096`), key sizes, protocol version numbers, port numbers, or configuration defaults are domain constants, not magic numbers, when used with surrounding context. +- **Minor naming preferences**: `lvl` vs `level`, `opts` vs `options`, `enc_part` vs `encrypted_part` -- these are team style, not bugs. +- **Import ordering**: Import grouping or ordering style. Unused imports are NOT cosmetic -- they cause `clippy -D warnings` failures and must be removed. +- **Test code style**: Tests prioritize readability and explicitness over DRY. Repeated setup code in tests is acceptable. +- **`#[allow(clippy::...)]` with justification comment**: When `#[allow]` has an adjacent comment explaining why it is used instead of `#[expect]`, do not suggest switching. Common reason: the lint does not fire on the current code but the suppression is kept defensively. `#[expect]` would fail the build with "unfulfilled expectation" in that case. +- **`#[allow(clippy::...)]` in existing/unchanged code**: Do not flag `#[allow]` suppressions in unchanged lines. New code should use `#[expect]` when the lint actually fires. +- **Temporary directory strategies in existing code**: Existing tests using manual temp paths are not a finding. New tests should prefer `tempfile::tempdir()`. +- **Semver concerns on pre-1.0 crates**: Adding required methods to public traits, changing public API signatures, or restructuring modules in a crate with version `0.x.y` is expected and does not require a default implementation or sealed trait pattern. Semver breakage only matters at `>= 1.0.0`. +- **Test infrastructure scripts (shell)**: Shell scripts in `tests/` that set up Docker containers, KDCs, databases, or other test infrastructure are not production code. Do not flag hardcoded test credentials (they are overridable via env vars), redundant principal creation (required by the specific technology), or non-production patterns in these scripts. +- **Previously resolved review threads**: If the same suggestion was already raised and resolved in a prior review round on this PR, do not re-raise it. Check the resolved threads before flagging. + +## Scope Rules + +- **Review ONLY code within the PR's diff.** Do not suggest inline fixes for unchanged lines. +- For issues **outside the diff**, suggest opening a separate issue. +- **Read the PR description.** If it lists known limitations or deferred items, do not re-flag them. + +## Rust-Specific Standards + +- Prefer `#[expect(lint)]` over `#[allow(lint)]` when the lint actually fires -- `#[expect]` warns when suppression becomes unnecessary. Use `#[allow(lint)]` with a justification comment when the lint does not fire but defensive suppression is desired. +- `TryFrom`/`TryInto` for fallible conversions; `as` casts need justification +- No `unwrap()` / `expect()` on I/O paths -- use `?` propagation +- `expect()` is acceptable for programmer invariants (e.g., lock poisoning, `const` construction) with reason +- Code must pass `cargo clippy --all-features -- -D warnings` +- RFC/spec compliance comments (e.g., "RFC 4120 section 7.5.1") are documentation -- preserve them + +## Testing Standards + +- Test naming: `fn __()` or `fn test_()` +- Use `tempfile::tempdir()` for test directories -- ensures cleanup even on panic +- Integration tests that require infrastructure use `#[ignore = "reason"]` +- Prefer `assert_eq!` with message over bare `assert!` for better failure output +- Hardcoded values in tests are fine when accompanied by explanatory comments or assertion messages diff --git a/src/xmldsig/mod.rs b/src/xmldsig/mod.rs index c68dd30..36a5fbc 100644 --- a/src/xmldsig/mod.rs +++ b/src/xmldsig/mod.rs @@ -8,7 +8,9 @@ //! - ID attribute resolution with configurable attribute names //! - Node set types for the transform pipeline +pub mod transforms; pub mod types; pub mod uri; +pub use transforms::{execute_transforms, parse_transforms, Transform}; pub use types::{NodeSet, TransformData, TransformError}; diff --git a/src/xmldsig/transforms.rs b/src/xmldsig/transforms.rs new file mode 100644 index 0000000..ce613c2 --- /dev/null +++ b/src/xmldsig/transforms.rs @@ -0,0 +1,677 @@ +//! Transform pipeline for XMLDSig `` processing. +//! +//! Implements [XMLDSig §6.6](https://www.w3.org/TR/xmldsig-core1/#sec-Transforms): +//! each `` specifies a chain of transforms applied sequentially to +//! produce bytes for digest computation. +//! +//! The pipeline is a simple `Vec` iterated front-to-back — a dramatic +//! simplification of xmlsec1's bidirectional push/pop doubly-linked list with +//! auto-inserted type adapters. +//! +//! ## Supported transforms +//! +//! | Transform | Direction | Priority | +//! |-----------|-----------|----------| +//! | Enveloped signature | NodeSet → NodeSet | P0 (SAML) | +//! | Inclusive C14N 1.0/1.1 | NodeSet → Binary | P0 | +//! | Exclusive C14N 1.0 | NodeSet → Binary | P0 | +//! | Base64 decode | Binary → Binary | P1 (future) | + +use roxmltree::Node; + +use super::types::{TransformData, TransformError}; +use crate::c14n::{self, C14nAlgorithm}; + +/// The algorithm URI for the enveloped signature transform. +pub const ENVELOPED_SIGNATURE_URI: &str = "http://www.w3.org/2000/09/xmldsig#enveloped-signature"; + +/// XMLDSig namespace URI for `` elements. +const XMLDSIG_NS_URI: &str = "http://www.w3.org/2000/09/xmldsig#"; + +/// Namespace URI for Exclusive C14N `` elements. +const EXCLUSIVE_C14N_NS_URI: &str = "http://www.w3.org/2001/10/xml-exc-c14n#"; + +/// A single transform in the pipeline. +#[derive(Debug, Clone)] +pub enum Transform { + /// Enveloped signature: removes the `` element subtree + /// that contains the `` being processed. + /// + /// Input: `NodeSet` → Output: `NodeSet` + Enveloped, + + /// XML Canonicalization (any supported variant). + /// + /// Input: `NodeSet` → Output: `Binary` + C14n(C14nAlgorithm), +} + +/// Apply a single transform to the pipeline data. +/// +/// `signature_node` is the `` element that contains the +/// `` being processed. It is used by the enveloped transform +/// to know which signature subtree to exclude. The node must belong to the +/// same document as the `NodeSet` in `input`; a cross-document mismatch +/// returns [`TransformError::CrossDocumentSignatureNode`]. +pub(crate) fn apply_transform<'a>( + signature_node: Node<'a, 'a>, + transform: &Transform, + input: TransformData<'a>, +) -> Result, TransformError> { + match transform { + Transform::Enveloped => { + let mut nodes = input.into_node_set()?; + // Exclude the Signature element and all its descendants from + // the node set. This is the core mechanism of the enveloped + // signature transform: the digest is computed as if the + // were not present in the document. + // + // xmlsec1 equivalent: + // xmlSecNodeSetGetChildren(doc, signatureNode, 1, 1) // inverted tree + // xmlSecNodeSetAdd(inNodes, children, Intersection) // intersect = subtract + if !std::ptr::eq(signature_node.document(), nodes.document()) { + return Err(TransformError::CrossDocumentSignatureNode); + } + nodes.exclude_subtree(signature_node); + Ok(TransformData::NodeSet(nodes)) + } + Transform::C14n(algo) => { + let nodes = input.into_node_set()?; + let mut output = Vec::new(); + let predicate = |node: Node| nodes.contains(node); + c14n::canonicalize(nodes.document(), Some(&predicate), algo, &mut output) + .map_err(|e| TransformError::C14n(e.to_string()))?; + Ok(TransformData::Binary(output)) + } + } +} + +/// Execute a chain of transforms for a single ``. +/// +/// 1. Start with `initial_data` (from URI dereference). +/// 2. Apply each transform sequentially. +/// 3. If the result is still a `NodeSet`, apply default inclusive C14N 1.0 +/// to produce bytes (per [XMLDSig §4.3.3.2](https://www.w3.org/TR/xmldsig-core1/#sec-ReferenceProcessingModel)). +/// +/// Returns the final byte sequence ready for digest computation. +pub fn execute_transforms<'a>( + signature_node: Node<'a, 'a>, + initial_data: TransformData<'a>, + transforms: &[Transform], +) -> Result, TransformError> { + let mut data = initial_data; + + for transform in transforms { + data = apply_transform(signature_node, transform, data)?; + } + + // Final coercion: if the result is still a NodeSet, canonicalize with + // default inclusive C14N 1.0 per XMLDSig spec §4.3.3.2. + match data { + TransformData::Binary(bytes) => Ok(bytes), + TransformData::NodeSet(nodes) => { + #[expect(clippy::expect_used, reason = "hardcoded URI is a known constant")] + let algo = C14nAlgorithm::from_uri("http://www.w3.org/TR/2001/REC-xml-c14n-20010315") + .expect("default C14N algorithm URI must be supported by C14nAlgorithm::from_uri"); + let mut output = Vec::new(); + let predicate = |node: Node| nodes.contains(node); + c14n::canonicalize(nodes.document(), Some(&predicate), &algo, &mut output) + .map_err(|e| TransformError::C14n(e.to_string()))?; + Ok(output) + } + } +} + +/// Parse a `` element into a `Vec`. +/// +/// Reads each `` child element and constructs +/// the corresponding [`Transform`] variant. Unrecognized algorithm URIs +/// produce an error. +/// +/// For Exclusive C14N, also parses the optional `` child element. +pub fn parse_transforms(transforms_node: Node) -> Result, TransformError> { + // Validate that we received a element. + if !transforms_node.is_element() { + return Err(TransformError::UnsupportedTransform( + "expected element but got non-element node".into(), + )); + } + let transforms_tag = transforms_node.tag_name(); + if transforms_tag.name() != "Transforms" || transforms_tag.namespace() != Some(XMLDSIG_NS_URI) { + return Err(TransformError::UnsupportedTransform( + "expected element in XMLDSig namespace".into(), + )); + } + + let mut chain = Vec::new(); + + for child in transforms_node.children() { + if !child.is_element() { + continue; + } + + // Only children are allowed; fail closed on any other element. + let tag = child.tag_name(); + if tag.name() != "Transform" || tag.namespace() != Some(XMLDSIG_NS_URI) { + return Err(TransformError::UnsupportedTransform( + "unexpected child element of ; only is allowed" + .into(), + )); + } + let uri = child.attribute("Algorithm").ok_or_else(|| { + TransformError::UnsupportedTransform( + "missing Algorithm attribute on ".into(), + ) + })?; + + let transform = if uri == ENVELOPED_SIGNATURE_URI { + Transform::Enveloped + } else if let Some(mut algo) = C14nAlgorithm::from_uri(uri) { + // For exclusive C14N, check for InclusiveNamespaces child + if algo.mode() == c14n::C14nMode::Exclusive1_0 { + if let Some(prefix_list) = parse_inclusive_prefixes(child)? { + algo = algo.with_prefix_list(&prefix_list); + } + } + Transform::C14n(algo) + } else { + return Err(TransformError::UnsupportedTransform(uri.to_string())); + }; + chain.push(transform); + } + + Ok(chain) +} + +/// Parse the `PrefixList` attribute from an `` child +/// element, if present. +/// +/// Per the [Exclusive C14N spec](https://www.w3.org/TR/xml-exc-c14n/#def-InclusiveNamespaces-PrefixList), +/// the element MUST be in the `http://www.w3.org/2001/10/xml-exc-c14n#` namespace. +/// Elements with the same local name but a different namespace are ignored. +/// +/// Returns `Ok(None)` if no `` child is present. +/// Returns `Err` if the element exists but lacks the required `PrefixList` attribute +/// (fail-closed: malformed control elements are rejected, not silently ignored). +/// +/// The element is typically: +/// ```xml +/// +/// ``` +fn parse_inclusive_prefixes(transform_node: Node) -> Result, TransformError> { + for child in transform_node.children() { + if child.is_element() { + let tag = child.tag_name(); + if tag.name() == "InclusiveNamespaces" && tag.namespace() == Some(EXCLUSIVE_C14N_NS_URI) + { + let prefix_list = child.attribute("PrefixList").ok_or_else(|| { + TransformError::UnsupportedTransform( + "missing PrefixList attribute on ".into(), + ) + })?; + return Ok(Some(prefix_list.to_string())); + } + } + } + Ok(None) +} + +#[cfg(test)] +#[expect(clippy::unwrap_used, reason = "tests use trusted XML fixtures")] +mod tests { + use super::*; + use crate::xmldsig::NodeSet; + use roxmltree::Document; + + // ── Enveloped transform ────────────────────────────────────────── + + #[test] + fn enveloped_excludes_signature_subtree() { + // Simulates a SAML-like document with an enveloped signature + let xml = r#" + hello + + + abc + + "#; + let doc = Document::parse(xml).unwrap(); + + // Find the Signature element + let sig_node = doc + .descendants() + .find(|n| n.is_element() && n.tag_name().name() == "Signature") + .unwrap(); + + // Start with entire document without comments (empty URI) + let node_set = NodeSet::entire_document_without_comments(&doc); + let data = TransformData::NodeSet(node_set); + + // Apply enveloped transform + let result = apply_transform(sig_node, &Transform::Enveloped, data).unwrap(); + let node_set = result.into_node_set().unwrap(); + + // Root and data should be in the set + assert!(node_set.contains(doc.root_element())); + let data_elem = doc + .descendants() + .find(|n| n.is_element() && n.tag_name().name() == "data") + .unwrap(); + assert!(node_set.contains(data_elem)); + + // Signature and its children should be excluded + assert!( + !node_set.contains(sig_node), + "Signature element should be excluded" + ); + let signed_info = doc + .descendants() + .find(|n| n.is_element() && n.tag_name().name() == "SignedInfo") + .unwrap(); + assert!( + !node_set.contains(signed_info), + "SignedInfo (child of Signature) should be excluded" + ); + } + + #[test] + fn enveloped_requires_node_set_input() { + let xml = ""; + let doc = Document::parse(xml).unwrap(); + // Binary input should fail with TypeMismatch + let data = TransformData::Binary(vec![1, 2, 3]); + let result = apply_transform(doc.root_element(), &Transform::Enveloped, data); + assert!(result.is_err()); + match result.unwrap_err() { + TransformError::TypeMismatch { expected, got } => { + assert_eq!(expected, "NodeSet"); + assert_eq!(got, "Binary"); + } + other => panic!("expected TypeMismatch, got: {other:?}"), + } + } + + #[test] + fn enveloped_rejects_cross_document_signature_node() { + // Signature node from a different Document must be rejected, + // not silently used to exclude wrong subtree. + let xml = r#""#; + let doc1 = Document::parse(xml).unwrap(); + let doc2 = Document::parse(xml).unwrap(); + + // NodeSet from doc1, Signature node from doc2 + let node_set = NodeSet::entire_document_without_comments(&doc1); + let input = TransformData::NodeSet(node_set); + let sig_from_doc2 = doc2 + .descendants() + .find(|n| n.is_element() && n.tag_name().name() == "Signature") + .unwrap(); + + let result = apply_transform(sig_from_doc2, &Transform::Enveloped, input); + assert!(matches!( + result, + Err(TransformError::CrossDocumentSignatureNode) + )); + } + + // ── C14N transform ─────────────────────────────────────────────── + + #[test] + fn c14n_transform_produces_bytes() { + let xml = r#""#; + let doc = Document::parse(xml).unwrap(); + + let node_set = NodeSet::entire_document_without_comments(&doc); + let data = TransformData::NodeSet(node_set); + + let algo = + C14nAlgorithm::from_uri("http://www.w3.org/TR/2001/REC-xml-c14n-20010315").unwrap(); + let result = apply_transform(doc.root_element(), &Transform::C14n(algo), data).unwrap(); + + let bytes = result.into_binary().unwrap(); + let output = String::from_utf8(bytes).unwrap(); + // Attributes sorted, empty element expanded + assert_eq!(output, r#""#); + } + + #[test] + fn c14n_transform_requires_node_set() { + let xml = ""; + let doc = Document::parse(xml).unwrap(); + + let algo = + C14nAlgorithm::from_uri("http://www.w3.org/TR/2001/REC-xml-c14n-20010315").unwrap(); + let data = TransformData::Binary(vec![1, 2, 3]); + let result = apply_transform(doc.root_element(), &Transform::C14n(algo), data); + + assert!(result.is_err()); + assert!(matches!( + result.unwrap_err(), + TransformError::TypeMismatch { .. } + )); + } + + // ── Pipeline execution ─────────────────────────────────────────── + + #[test] + fn pipeline_enveloped_then_c14n() { + // Standard SAML transform chain: enveloped-signature → exc-c14n + let xml = r#" + hello + + + abc + + "#; + let doc = Document::parse(xml).unwrap(); + + let sig_node = doc + .descendants() + .find(|n| n.is_element() && n.tag_name().name() == "Signature") + .unwrap(); + + let initial = TransformData::NodeSet(NodeSet::entire_document_without_comments(&doc)); + let transforms = vec![ + Transform::Enveloped, + Transform::C14n( + C14nAlgorithm::from_uri("http://www.w3.org/2001/10/xml-exc-c14n#").unwrap(), + ), + ]; + + let result = execute_transforms(sig_node, initial, &transforms).unwrap(); + + let output = String::from_utf8(result).unwrap(); + // Signature subtree should be gone; attributes sorted + assert!(!output.contains("Signature")); + assert!(!output.contains("SignedInfo")); + assert!(!output.contains("SignatureValue")); + assert!(output.contains("hello")); + } + + #[test] + fn pipeline_no_transforms_applies_default_c14n() { + // No explicit transforms → pipeline falls back to inclusive C14N 1.0 + let xml = r#""#; + let doc = Document::parse(xml).unwrap(); + + let initial = TransformData::NodeSet(NodeSet::entire_document_without_comments(&doc)); + let result = execute_transforms(doc.root_element(), initial, &[]).unwrap(); + + let output = String::from_utf8(result).unwrap(); + assert_eq!(output, r#""#); + } + + #[test] + fn pipeline_binary_passthrough() { + // If initial data is already binary (unusual, but spec-compliant) + // and no transforms, returns bytes directly + let xml = ""; + let doc = Document::parse(xml).unwrap(); + + let initial = TransformData::Binary(b"raw bytes".to_vec()); + let result = execute_transforms(doc.root_element(), initial, &[]).unwrap(); + + assert_eq!(result, b"raw bytes"); + } + + // ── Nested signatures ──────────────────────────────────────────── + + #[test] + fn enveloped_only_excludes_own_signature() { + // Two real elements: enveloped transform should only + // exclude the specific one being verified, not the other. + let xml = r#" + hello + + + + + + + "#; + let doc = Document::parse(xml).unwrap(); + + // We are verifying sig-target, not sig-other + let sig_node = doc + .descendants() + .find(|n| n.is_element() && n.attribute("Id") == Some("sig-target")) + .unwrap(); + + let node_set = NodeSet::entire_document_without_comments(&doc); + let data = TransformData::NodeSet(node_set); + + let result = apply_transform(sig_node, &Transform::Enveloped, data).unwrap(); + let node_set = result.into_node_set().unwrap(); + + // sig-other should still be in the set + let sig_other = doc + .descendants() + .find(|n| n.is_element() && n.attribute("Id") == Some("sig-other")) + .unwrap(); + assert!( + node_set.contains(sig_other), + "other Signature elements should NOT be excluded" + ); + + // Signature should be excluded + assert!( + !node_set.contains(sig_node), + "the specific Signature being verified should be excluded" + ); + } + + // ── parse_transforms ───────────────────────────────────────────── + + #[test] + fn parse_transforms_enveloped_and_exc_c14n() { + let xml = r#" + + + "#; + let doc = Document::parse(xml).unwrap(); + let transforms_node = doc.root_element(); + + let chain = parse_transforms(transforms_node).unwrap(); + assert_eq!(chain.len(), 2); + assert!(matches!(chain[0], Transform::Enveloped)); + assert!(matches!(chain[1], Transform::C14n(_))); + } + + #[test] + fn parse_transforms_with_inclusive_prefixes() { + let xml = r#" + + + + "#; + let doc = Document::parse(xml).unwrap(); + let transforms_node = doc.root_element(); + + let chain = parse_transforms(transforms_node).unwrap(); + assert_eq!(chain.len(), 1); + match &chain[0] { + Transform::C14n(algo) => { + assert!(algo.inclusive_prefixes().contains("ds")); + assert!(algo.inclusive_prefixes().contains("saml")); + assert!(algo.inclusive_prefixes().contains("")); // #default + } + other => panic!("expected C14n, got: {other:?}"), + } + } + + #[test] + fn parse_transforms_ignores_wrong_ns_inclusive_namespaces() { + // InclusiveNamespaces in a foreign namespace should be ignored — + // only elements in http://www.w3.org/2001/10/xml-exc-c14n# are valid. + let xml = r#" + + + + "#; + let doc = Document::parse(xml).unwrap(); + + let chain = parse_transforms(doc.root_element()).unwrap(); + assert_eq!(chain.len(), 1); + match &chain[0] { + Transform::C14n(algo) => { + // PrefixList from wrong namespace should NOT be honoured + assert!( + algo.inclusive_prefixes().is_empty(), + "should ignore InclusiveNamespaces in wrong namespace" + ); + } + other => panic!("expected C14n, got: {other:?}"), + } + } + + #[test] + fn parse_transforms_missing_prefix_list_is_error() { + // InclusiveNamespaces in correct namespace but without PrefixList + // attribute should be rejected (fail-closed), not silently ignored. + let xml = r#" + + + + "#; + let doc = Document::parse(xml).unwrap(); + + let result = parse_transforms(doc.root_element()); + assert!(result.is_err()); + assert!(matches!( + result.unwrap_err(), + TransformError::UnsupportedTransform(_) + )); + } + + #[test] + fn parse_transforms_unsupported_algorithm() { + let xml = r#" + + "#; + let doc = Document::parse(xml).unwrap(); + + let result = parse_transforms(doc.root_element()); + assert!(result.is_err()); + assert!(matches!( + result.unwrap_err(), + TransformError::UnsupportedTransform(_) + )); + } + + #[test] + fn parse_transforms_missing_algorithm() { + let xml = r#" + + "#; + let doc = Document::parse(xml).unwrap(); + + let result = parse_transforms(doc.root_element()); + assert!(result.is_err()); + assert!(matches!( + result.unwrap_err(), + TransformError::UnsupportedTransform(_) + )); + } + + #[test] + fn parse_transforms_empty() { + let xml = r#""#; + let doc = Document::parse(xml).unwrap(); + + let chain = parse_transforms(doc.root_element()).unwrap(); + assert!(chain.is_empty()); + } + + #[test] + fn parse_transforms_inclusive_c14n_variants() { + let xml = r#" + + + + "#; + let doc = Document::parse(xml).unwrap(); + + let chain = parse_transforms(doc.root_element()).unwrap(); + assert_eq!(chain.len(), 3); + // All should be C14n variants + for t in &chain { + assert!(matches!(t, Transform::C14n(_))); + } + } + + // ── Integration: SAML-like full pipeline ───────────────────────── + + #[test] + fn saml_enveloped_signature_full_pipeline() { + // Realistic SAML Response with enveloped signature + let xml = r#" + + user@example.com + + + + + + + + + + + fakesig== + + "#; + let doc = Document::parse(xml).unwrap(); + + // Find the Signature element + let sig_node = doc + .descendants() + .find(|n| n.is_element() && n.tag_name().name() == "Signature") + .unwrap(); + + // Parse the transforms from the XML + let reference = doc + .descendants() + .find(|n| n.is_element() && n.tag_name().name() == "Reference") + .unwrap(); + let transforms_elem = reference + .children() + .find(|n| n.is_element() && n.tag_name().name() == "Transforms") + .unwrap(); + let transforms = parse_transforms(transforms_elem).unwrap(); + assert_eq!(transforms.len(), 2); + + // Execute the pipeline with empty URI (entire document) + let initial = TransformData::NodeSet(NodeSet::entire_document_without_comments(&doc)); + let result = execute_transforms(sig_node, initial, &transforms).unwrap(); + + let output = String::from_utf8(result).unwrap(); + + // Signature subtree must be completely absent + assert!(!output.contains("Signature"), "Signature should be removed"); + assert!( + !output.contains("SignedInfo"), + "SignedInfo should be removed" + ); + assert!( + !output.contains("SignatureValue"), + "SignatureValue should be removed" + ); + assert!( + !output.contains("fakesig"), + "signature value should be removed" + ); + + // Document content should be present and canonicalized + assert!(output.contains("samlp:Response")); + assert!(output.contains("saml:Assertion")); + assert!(output.contains("user@example.com")); + } +} diff --git a/src/xmldsig/types.rs b/src/xmldsig/types.rs index 8513bd6..27314e6 100644 --- a/src/xmldsig/types.rs +++ b/src/xmldsig/types.rs @@ -198,7 +198,10 @@ fn collect_subtree_ids(node: Node<'_, '_>, ids: &mut HashSet) { } /// Errors during transform processing. +/// +/// New variants may be added as more transforms are implemented (Base64, XPath). #[derive(Debug, thiserror::Error)] +#[non_exhaustive] pub enum TransformError { /// Data type mismatch between transforms. #[error("type mismatch: expected {expected}, got {got}")] @@ -224,4 +227,9 @@ pub enum TransformError { /// Canonicalization error during transform. #[error("C14N error: {0}")] C14n(String), + + /// The Signature node passed to the enveloped transform belongs to a + /// different `Document` than the input `NodeSet`. + #[error("enveloped-signature transform: invalid Signature node for this document")] + CrossDocumentSignatureNode, } diff --git a/tests/transforms_integration.rs b/tests/transforms_integration.rs new file mode 100644 index 0000000..f54f864 --- /dev/null +++ b/tests/transforms_integration.rs @@ -0,0 +1,487 @@ +//! Integration tests: transform pipeline (enveloped signature + C14N). +//! +//! Tests the full chain: URI dereference → parse_transforms → execute_transforms +//! → verify canonical output matches expected bytes. +//! +//! These tests exercise multi-component interaction: +//! - `UriReferenceResolver` (URI dereference) +//! - `parse_transforms` (XML → Vec) +//! - `execute_transforms` (sequential pipeline) +//! - `NodeSet::exclude_subtree` (enveloped-signature) +//! - `c14n::canonicalize` (C14N serializer) + +use xml_sec::c14n::{C14nAlgorithm, C14nMode}; +use xml_sec::xmldsig::transforms::{execute_transforms, parse_transforms, Transform}; +use xml_sec::xmldsig::uri::UriReferenceResolver; +use xml_sec::xmldsig::NodeSet; + +// ── Helper ─────────────────────────────────────────────────────────────────── + +/// Find a descendant element by local name. +fn find_element<'a>(doc: &'a roxmltree::Document<'a>, name: &str) -> roxmltree::Node<'a, 'a> { + doc.descendants() + .find(|n| n.is_element() && n.tag_name().name() == name) + .unwrap_or_else(|| panic!("element <{name}> not found")) +} + +// ── Enveloped + Exclusive C14N (most common SAML pattern) ──────────────────── + +#[test] +fn enveloped_exc_c14n_saml_response() { + // Standard SAML Response with enveloped signature over entire document. + // Transform chain: enveloped-signature → exclusive C14N. + let xml = r#" + + user@example.com + + + + + + + + + + + + + placeholder== + + + fakesig== + +"#; + + let doc = roxmltree::Document::parse(xml).unwrap(); + let resolver = UriReferenceResolver::new(&doc); + let sig_node = find_element(&doc, "Signature"); + + // Parse transforms from the XML + let reference = find_element(&doc, "Reference"); + let transforms_elem = reference + .children() + .find(|n| n.is_element() && n.tag_name().name() == "Transforms") + .unwrap(); + let transforms = parse_transforms(transforms_elem).unwrap(); + assert_eq!(transforms.len(), 2); + + // Dereference empty URI → entire document without comments + let initial = resolver.dereference("").unwrap(); + + // Execute pipeline + let result = execute_transforms(sig_node, initial, &transforms).unwrap(); + let output = String::from_utf8(result).unwrap(); + + // Signature must be completely absent + assert!(!output.contains("Signature"), "output: {output}"); + assert!(!output.contains("SignedInfo"), "output: {output}"); + assert!(!output.contains("SignatureValue"), "output: {output}"); + assert!(!output.contains("fakesig"), "output: {output}"); + assert!(!output.contains("DigestValue"), "output: {output}"); + + // Document content must be present + assert!(output.contains("samlp:Response"), "output: {output}"); + assert!(output.contains("saml:Assertion"), "output: {output}"); + assert!(output.contains("user@example.com"), "output: {output}"); + assert!(output.contains("_assert1"), "output: {output}"); + assert!(output.contains("_resp1"), "output: {output}"); + + // Exclusive C14N: only visibly-utilized namespaces. + // samlp: is used on Response, saml: is used on Assertion/Subject/Conditions + assert!( + output.contains("xmlns:samlp="), + "samlp ns should be present: {output}" + ); + assert!( + output.contains("xmlns:saml="), + "saml ns should be present: {output}" + ); +} + +// ── Enveloped + Inclusive C14N ──────────────────────────────────────────────── + +#[test] +fn enveloped_inclusive_c14n() { + // Less common but valid: enveloped + inclusive C14N 1.0 + let xml = r#" + content + + + sig + +"#; + + let doc = roxmltree::Document::parse(xml).unwrap(); + let resolver = UriReferenceResolver::new(&doc); + let sig_node = find_element(&doc, "Signature"); + + let initial = resolver.dereference("").unwrap(); + let transforms = vec![ + Transform::Enveloped, + Transform::C14n(C14nAlgorithm::new(C14nMode::Inclusive1_0, false)), + ]; + + let result = execute_transforms(sig_node, initial, &transforms).unwrap(); + let output = String::from_utf8(result).unwrap(); + + // Signature gone + assert!(!output.contains("Signature"), "output: {output}"); + assert!(!output.contains("SignedInfo"), "output: {output}"); + + // Content present + assert!(output.contains(" + content + + + +"#; + + let doc = roxmltree::Document::parse(xml).unwrap(); + let sig_node = find_element(&doc, "Signature"); + + let initial = + xml_sec::xmldsig::TransformData::NodeSet(NodeSet::entire_document_without_comments(&doc)); + let transforms = vec![ + Transform::Enveloped, + Transform::C14n(C14nAlgorithm::new(C14nMode::Inclusive1_1, false)), + ]; + + let result = execute_transforms(sig_node, initial, &transforms).unwrap(); + let output = String::from_utf8(result).unwrap(); + + assert!(!output.contains("Signature"), "output: {output}"); + assert!(output.contains("content"), "output: {output}"); +} + +// ── #id URI + enveloped (signature inside assertion) ───────────────────────── + +#[test] +fn id_uri_enveloped_assertion_level_signature() { + // Signature inside the Assertion, signing only the Assertion (not the whole Response). + // URI="#_assert1" → subtree of Assertion, then enveloped removes Signature within it. + let xml = r#" + + user@example.com + + + sig + + +"#; + + let doc = roxmltree::Document::parse(xml).unwrap(); + let resolver = UriReferenceResolver::new(&doc); + let sig_node = find_element(&doc, "Signature"); + + // Dereference #_assert1 → subtree of Assertion element + let initial = resolver.dereference("#_assert1").unwrap(); + + let transforms = vec![ + Transform::Enveloped, + Transform::C14n( + C14nAlgorithm::from_uri("http://www.w3.org/2001/10/xml-exc-c14n#").unwrap(), + ), + ]; + + let result = execute_transforms(sig_node, initial, &transforms).unwrap(); + let output = String::from_utf8(result).unwrap(); + + // Should contain the Assertion element and Subject + assert!(output.contains("saml:Assertion"), "output: {output}"); + assert!(output.contains("user@example.com"), "output: {output}"); + + // Signature subtree removed + assert!(!output.contains("Signature"), "output: {output}"); + assert!(!output.contains("SignedInfo"), "output: {output}"); + + // Response should NOT appear (we dereferenced only the assertion subtree) + assert!(!output.contains("samlp:Response"), "output: {output}"); +} + +// ── Nested signatures ──────────────────────────────────────────────────────── + +#[test] +fn nested_signatures_only_own_excluded() { + // Two signatures: one at Response level, one inside Assertion. + // When verifying the Response-level signature, only IT should be excluded. + let xml = r#" + + user + + + + + + + outer + +"#; + + let doc = roxmltree::Document::parse(xml).unwrap(); + let resolver = UriReferenceResolver::new(&doc); + + // The outer signature (Id="outer-sig") is being verified + let outer_sig = doc + .descendants() + .find(|n| n.is_element() && n.attribute("Id") == Some("outer-sig")) + .unwrap(); + + let initial = resolver.dereference("").unwrap(); + let transforms = vec![ + Transform::Enveloped, + Transform::C14n(C14nAlgorithm::new(C14nMode::Inclusive1_0, false)), + ]; + + let result = execute_transforms(outer_sig, initial, &transforms).unwrap(); + let output = String::from_utf8(result).unwrap(); + + // Outer signature should be excluded + assert!(!output.contains("outer-sig"), "output: {output}"); + assert!(!output.contains("outer"), "output: {output}"); + + // Inner signature should REMAIN (it's not the one being verified) + assert!( + output.contains("inner-sig"), + "inner signature should remain: {output}" + ); + + // Document content should be present + assert!( + output.contains("user"), + "output: {output}" + ); + assert!(output.contains("Assertion"), "output: {output}"); +} + +// ── No explicit C14N → default inclusive C14N 1.0 ──────────────────────────── + +#[test] +fn enveloped_only_falls_back_to_default_c14n() { + // If transform chain has only enveloped (no explicit C14N), + // the pipeline applies default inclusive C14N 1.0 per XMLDSig spec. + let xml = r#" + content + + + +"#; + + let doc = roxmltree::Document::parse(xml).unwrap(); + let sig_node = find_element(&doc, "Signature"); + + let initial = + xml_sec::xmldsig::TransformData::NodeSet(NodeSet::entire_document_without_comments(&doc)); + let transforms = vec![Transform::Enveloped]; + + let result = execute_transforms(sig_node, initial, &transforms).unwrap(); + let output = String::from_utf8(result).unwrap(); + + // Signature gone + assert!(!output.contains("Signature"), "output: {output}"); + + // Attributes sorted (inclusive C14N) + assert!( + output.contains(r#"a="1" b="2""#), + "attributes should be sorted: {output}" + ); +} + +// ── Exclusive C14N with InclusiveNamespaces PrefixList ─────────────────────── + +#[test] +fn exc_c14n_with_prefix_list_from_xml() { + // Parse a transform chain that includes InclusiveNamespaces PrefixList, + // then execute the pipeline. + let xml = r#" + text + + + + + + + + + + + + sig + +"#; + + let doc = roxmltree::Document::parse(xml).unwrap(); + let resolver = UriReferenceResolver::new(&doc); + let sig_node = find_element(&doc, "Signature"); + + // Parse transforms from the XML + let reference = find_element(&doc, "Reference"); + let transforms_elem = reference + .children() + .find(|n| n.is_element() && n.tag_name().name() == "Transforms") + .unwrap(); + let transforms = parse_transforms(transforms_elem).unwrap(); + assert_eq!(transforms.len(), 2); + + // Verify PrefixList was parsed + match &transforms[1] { + Transform::C14n(algo) => { + assert!( + algo.inclusive_prefixes().contains("ns2"), + "PrefixList should include ns2" + ); + } + other => panic!("expected C14n transform, got: {other:?}"), + } + + // Execute pipeline + let initial = resolver.dereference("").unwrap(); + let result = execute_transforms(sig_node, initial, &transforms).unwrap(); + let output = String::from_utf8(result).unwrap(); + + // Signature removed + assert!(!output.contains("Signature"), "output: {output}"); + + // ns1: is visibly utilized (attr on child) → should appear + assert!( + output.contains("xmlns:ns1="), + "ns1 should be present (visibly utilized): {output}" + ); + + // ns2: is forced via PrefixList → should appear even though not visibly utilized on child + // (It IS on root via inclusive ns rendering forced by PrefixList) + assert!( + output.contains("xmlns:ns2="), + "ns2 should be present (forced by PrefixList): {output}" + ); +} + +// ── parse_transforms end-to-end with realistic SAML ────────────────────────── + +#[test] +fn parse_and_execute_transforms_roundtrip() { + // Parse transforms from XML, execute them, verify output. + // This tests the full integration path that verification will use. + let xml = r#" + important data + + + + + + + + + + base64sig== + +"#; + + let doc = roxmltree::Document::parse(xml).unwrap(); + let resolver = UriReferenceResolver::new(&doc); + let sig_node = find_element(&doc, "Signature"); + + // Step 1: Parse transforms from XML + let reference = find_element(&doc, "Reference"); + let transforms_elem = reference + .children() + .find(|n| n.is_element() && n.tag_name().name() == "Transforms") + .unwrap(); + let transforms = parse_transforms(transforms_elem).unwrap(); + + // Step 2: Dereference URI + let uri = reference.attribute("URI").unwrap_or(""); + let initial = resolver.dereference(uri).unwrap(); + + // Step 3: Execute transforms + let result = execute_transforms(sig_node, initial, &transforms).unwrap(); + let output = String::from_utf8(result).unwrap(); + + // Step 4: Verify + assert!(!output.contains("Signature")); + assert!(output.contains(r#"important data"#)); +} + +// ── Edge case: signature is the only child ─────────────────────────────────── + +#[test] +fn enveloped_signature_only_child() { + // Edge case: the Signature is the only element child of root. + // After the enveloped transform, the Signature subtree is removed and + // only the root element (plus any surrounding whitespace text nodes) + // remains in the NodeSet / canonical output. + let xml = r#" + + + sig + +"#; + + let doc = roxmltree::Document::parse(xml).unwrap(); + let sig_node = find_element(&doc, "Signature"); + + let initial = + xml_sec::xmldsig::TransformData::NodeSet(NodeSet::entire_document_without_comments(&doc)); + let transforms = vec![ + Transform::Enveloped, + Transform::C14n(C14nAlgorithm::new(C14nMode::Inclusive1_0, false)), + ]; + + let result = execute_transforms(sig_node, initial, &transforms).unwrap(); + let output = String::from_utf8(result).unwrap(); + + // The Signature subtree should be removed; only the root element and any + // surrounding whitespace text nodes should remain. + assert!(!output.contains("Signature"), "output: {output}"); + assert!(output.contains(""), "output: {output}"); +} + +// ── Whitespace handling ────────────────────────────────────────────────────── + +#[test] +fn enveloped_preserves_surrounding_whitespace() { + // Whitespace text nodes between elements should be preserved + // (they are part of the document, not part of the Signature subtree) + let xml = "\n text\n \n \n \n"; + + let doc = roxmltree::Document::parse(xml).unwrap(); + let sig_node = find_element(&doc, "Signature"); + + let initial = + xml_sec::xmldsig::TransformData::NodeSet(NodeSet::entire_document_without_comments(&doc)); + let transforms = vec![ + Transform::Enveloped, + Transform::C14n(C14nAlgorithm::new(C14nMode::Inclusive1_0, false)), + ]; + + let result = execute_transforms(sig_node, initial, &transforms).unwrap(); + let output = String::from_utf8(result).unwrap(); + + // The text nodes ("\n ") around Signature are preserved — they're not + // part of the Signature subtree (they're siblings). C14N will emit them. + assert!(output.contains("text"), "output: {output}"); + // Verify whitespace around removed Signature is actually in the output + assert!( + output.contains("\n text\n"), + "whitespace around Signature should be preserved: {output}" + ); + assert!(!output.contains("SignedInfo"), "output: {output}"); +}