added anomaly model to MSstatsClean call #14

Merged
devonjkohler merged 1 commit into devel from feature-diann-anomaly on Mar 2, 2026
Conversation

@devonjkohler (Contributor) commented Mar 2, 2026

Motivation and Context

Integrate MSstats+ into the MSstatsBig DIA-NN converter.

Changes

  • add anomaly features to MSstatsClean

Testing

Please describe any unit tests you added or modified to verify your changes.

Checklist Before Requesting a Review

  • I have read the MSstats contributing guidelines
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

Motivation & Context

This PR integrates MSstats+ anomaly detection capabilities into the MSstatsBig DIA-NN converter pipeline. The motivation is to enable users to leverage MSstats+ anomaly scoring features when processing DIA-NN proteomics data. The solution adds anomaly model parameters (calculateAnomalyScores and anomalyModelFeatures) to the data processing functions and passes them through to the underlying MSstatsClean call, along with improved support for annotation files.

Changes

Core Functionality

  • R/clean_DIANN.R:

    • Extended the public function reduceBigDIANN() with three new optional parameters: calculateAnomalyScores (default: FALSE), which enables anomaly detection scoring in MSstats+; anomalyModelFeatures (default: empty vector), which specifies the features for the anomaly model; and annotation (default: NULL), an annotation data frame or file path
    • Propagated these parameters through the internal cleanDIANNChunk() helper function
    • Updated the MSstatsClean() call to pass calculateAnomalyScores and anomalyModelFeatures
    • Modified cleanDIANNChunk() to call MSstatsMakeAnnotation() to handle annotation data
    • Fixed the read_delim_chunked() invocation to determine the delimiter (delim) dynamically based on input file type (CSV, TSV/XLS, or semicolon)
  • R/converters.R: Modified bigDIANNtoMSstatsFormat() to:

    • Add annotation parameter (default: NULL) immediately after input_file in the signature
    • Forward calculateAnomalyScores, anomalyModelFeatures, and annotation to the reduceBigDIANN() call
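
The delimiter fix listed above can be illustrated with a small sketch. This is not the code from the PR: the helper name guess_delim(), the choice to key on the file extension, and the semicolon fallback are all assumptions made for illustration.

```r
# Hypothetical helper, not taken from the PR: pick a delimiter from the
# file extension. The extension-based rule and the ";" fallback are assumptions.
guess_delim <- function(path) {
  ext <- tolower(tools::file_ext(path))
  if (ext == "csv") {
    ","
  } else if (ext %in% c("tsv", "xls")) {
    "\t"
  } else {
    ";"
  }
}

guess_delim("report.csv")
guess_delim("report.tsv")
```

A value chosen this way could then be supplied as the delim argument of readr::read_delim_chunked() instead of a hard-coded separator.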

Documentation & Dependencies

  • NAMESPACE: Added import of MSstatsMakeAnnotation from MSstatsConvert package
  • man/reduceBigDIANN.Rd: Added documentation for new parameters: calculateAnomalyScores, anomalyModelFeatures, and annotation
  • man/cleanDIANNChunk.Rd: Added documentation for the three new parameters in the internal helper function
  • man/bigDIANNtoMSstatsFormat.Rd: Added annotation parameter to function signature and documentation

Unit Tests

No unit tests were added or modified in this PR to cover the new anomaly scoring parameters. Existing tests in tests/testthat/test-clean_DIANN.R verify annotation handling, but nothing exercises calculateAnomalyScores or anomalyModelFeatures.

Coding Guidelines

No violations of coding guidelines are apparent. The implementation:

  • Follows existing naming conventions and parameter ordering patterns
  • Includes roxygen2 documentation for all new parameters
  • Uses appropriate default values (FALSE, empty vector, NULL)
  • Maintains backward compatibility through default parameter values
  • Properly propagates parameters through the call chain
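
The propagation and backward-compatibility points above can be sketched with stand-in functions. Everything with a _sketch suffix is hypothetical and only mimics the shape of the call chain (converter, reducer, cleaner); the column-selection logic is a toy and is not what MSstatsClean() does.

```r
# Stand-ins for the real call chain; all *_sketch names are hypothetical.
clean_sketch <- function(chunk, calculateAnomalyScores = FALSE,
                         anomalyModelFeatures = c()) {
  # Toy behavior: keep the extra feature columns only when scoring is on.
  cols <- "Intensity"
  if (calculateAnomalyScores) {
    cols <- c(cols, anomalyModelFeatures)
  }
  chunk[, cols, drop = FALSE]
}

reduce_sketch <- function(chunk, calculateAnomalyScores = FALSE,
                          anomalyModelFeatures = c()) {
  # Forward both arguments by name so reordering cannot swap them.
  clean_sketch(chunk,
               calculateAnomalyScores = calculateAnomalyScores,
               anomalyModelFeatures = anomalyModelFeatures)
}

convert_sketch <- function(chunk, calculateAnomalyScores = FALSE,
                           anomalyModelFeatures = c()) {
  reduce_sketch(chunk,
                calculateAnomalyScores = calculateAnomalyScores,
                anomalyModelFeatures = anomalyModelFeatures)
}

chunk <- data.frame(Intensity = c(10, 20), FeatureA = c(0.1, 0.9))

# Old-style call: defaults leave the behavior unchanged.
names(convert_sketch(chunk))

# New-style call: the options reach the innermost function.
names(convert_sketch(chunk, calculateAnomalyScores = TRUE,
                     anomalyModelFeatures = "FeatureA"))
```

Because every layer defaults to FALSE and an empty vector, callers that never mention the new arguments get the previous behavior, which is the backward-compatibility property the list describes.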

@devonjkohler merged commit 24d3d1b into devel on Mar 2, 2026
1 of 2 checks passed

coderabbitai bot commented Mar 2, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f5eed6f and 51782b6.

📒 Files selected for processing (6)
  • NAMESPACE
  • R/clean_DIANN.R
  • R/converters.R
  • man/bigDIANNtoMSstatsFormat.Rd
  • man/cleanDIANNChunk.Rd
  • man/reduceBigDIANN.Rd

📝 Walkthrough

The PR extends the DIANN-to-MSstats preprocessing pipeline by introducing anomaly scoring parameters and annotation support, integrating MSstatsMakeAnnotation from MSstatsConvert, and propagating these new capabilities through the conversion and cleaning functions via parameter threading.

Changes

Cohort / File(s) | Summary
Namespace & Imports (NAMESPACE) | Added importFrom entry for MSstatsConvert::MSstatsMakeAnnotation to extend package's public dependency surface.
DIANN Cleaning Pipeline (R/clean_DIANN.R, R/converters.R) | Introduced calculateAnomalyScores and anomalyModelFeatures parameters to reduceBigDIANN and cleanDIANNChunk, threading them through to MSstatsClean invocations. Updated reduceBigDIANN call in bigDIANNtoMSstatsFormat converter to forward new parameters.
Documentation Updates (man/reduceBigDIANN.Rd, man/cleanDIANNChunk.Rd, man/bigDIANNtoMSstatsFormat.Rd) | Added parameter documentation for calculateAnomalyScores (Boolean), anomalyModelFeatures (character vector), and annotation (file or data frame) across function signatures and argument descriptions.
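
Since the summary describes annotation as accepting either a file or a data frame, a rough sketch of how such an argument might be normalized is given here. The helper resolve_annotation() is hypothetical, CSV is assumed as the file format, and the PR itself delegates annotation handling to MSstatsMakeAnnotation(), whose internals are not shown in this conversation.

```r
# Hypothetical helper, not part of MSstatsBig or MSstatsConvert:
# accept NULL, a CSV file path, or a data frame and return a data frame or NULL.
resolve_annotation <- function(annotation) {
  if (is.null(annotation)) {
    return(NULL)
  }
  if (is.data.frame(annotation)) {
    return(annotation)
  }
  if (is.character(annotation) && length(annotation) == 1) {
    # Assumption: the file is comma-separated.
    return(utils::read.csv(annotation, stringsAsFactors = FALSE))
  }
  stop("annotation must be NULL, a file path, or a data frame")
}

# In-memory annotation with illustrative values.
ann <- data.frame(Run = "run1", Condition = "A", BioReplicate = 1)
resolve_annotation(ann)
```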

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • Rudhik1904
  • tonywu1999

Poem

🐰 Anomalies beware, parameters flow free,
Through DIANN and cleaners, annotation's the key!
MSstats and scoring now dance hand in hand,
A pipeline enhanced across all the land! ✨
