Skip to content

feat(getSubnetworkFromIndra): Add filter for including infinity fold change#80

Merged
tonywu1999 merged 3 commits intodevelfrom
filter-infinity
Mar 5, 2026
Merged

feat(getSubnetworkFromIndra): Add filter for including infinity fold change#80
tonywu1999 merged 3 commits intodevelfrom
filter-infinity

Conversation

@tonywu1999
Copy link
Contributor

@tonywu1999 tonywu1999 commented Mar 5, 2026

Motivation and Context

This PR adds support for including proteins with infinite log fold change values in network construction from INDRA and adds directional filtering for regulation. Proteins with infinite logFC arise when they are observed in only one experimental condition (e.g., zero in one condition and non-zero in the other); previously these proteins could be excluded. The new options give users explicit control to include such proteins and to filter by regulation direction ("both", "up", "down") while preserving backward compatibility.

Solution Summary

  • Public API: getSubnetworkFromIndra gains two new parameters:
    • include_infinite_fc (logical, default FALSE) — whether to include proteins with infinite log2 fold change.
    • direction (one of "both", "up", "down") — direction-based filtering of log2FC.
  • Filtering pipeline: include_infinite_fc and direction are propagated into the internal input-filtering routine, which now can collect and prepend infinite-logFC proteins, apply directional filtering, and deduplicate before downstream INDRA processing.
  • Visualization: logFC→color mapping was updated to handle positive/negative infinity by clamping infinite values to finite computed bounds so color scales remain stable.
  • Metadata: new internal helper .addAdditionalMetadataToIndraEdge(edge, input) was introduced to enrich INDRA edge metadata from the filtered input.

Detailed Changes

  • R/getSubnetworkFromIndra.R

    • Added parameters include_infinite_fc = FALSE and direction (c("both","up","down")) to the function signature and Roxygen docs.
    • Call to .filterGetSubnetworkFromIndraInput updated to pass include_infinite_fc and direction.
  • R/utils_getSubnetworkFromIndra.R

    • .filterGetSubnetworkFromIndraInput signature expanded to accept include_infinite_fc and direction.
    • Implemented detection of proteins with infinite log2FC (is.infinite) when include_infinite_fc is TRUE and integration of those proteins with exempt/filtered sets, followed by deduplication.
    • Added directional filtering logic (both/up/down) applied to filtered input.
    • Removed prior ad-hoc filtering based on an "issue" column.
    • Added internal helper .addAdditionalMetadataToIndraEdge(edge, input) to attach extra metadata (e.g., PTM site extraction and GlobalProtein mapping) to INDRA edges.
  • R/utils_cytoscapeNetwork.R

    • Updated .mapLogFCToColor to:
      • Identify positive and negative infinities and compute finite_values for scale computation.
      • Compute max_logFC from finite absolute values with fallback default_max = 2, set min_logFC = -max_logFC.
      • Clamp infinite logFC entries to min/max before normalization so color mapping handles infinities correctly.
      • Preserve NA handling (mapped to middle color).
  • man/getSubnetworkFromIndra.Rd

    • Documentation updated to include include_infinite_fc and direction parameters and their descriptions.
  • Minor: Roxygen and function documentation updated to reflect new parameters.

Unit Tests

  • New unit tests were added in this PR to cover:
    • include_infinite_fc behavior (including/integrating infinite-logFC proteins).
    • direction filtering ("both"/"up"/"down") behavior.
    • Infinite value clamping in .mapLogFCToColor (positive/negative infinities).

@coderabbitai
Copy link

coderabbitai bot commented Mar 5, 2026

📝 Walkthrough

Walkthrough

Adds include_infinite_fc and direction parameters to getSubnetworkFromIndra and propagates them through input filtering and color-mapping; enhances handling of proteins with infinite logFC, adds edge metadata enrichment, and clamps infinite logFC for visualization.

Changes

Cohort / File(s) Summary
Public API & Docs
R/getSubnetworkFromIndra.R, man/getSubnetworkFromIndra.Rd
Added include_infinite_fc (default FALSE) and direction (c("both","up","down")) to signature and docs; new args passed into internal filter.
Input Filtering & Metadata
R/utils_getSubnetworkFromIndra.R
.filterGetSubnetworkFromIndraInput signature extended with include_infinite_fc and direction; implements collection/merging/deduplication of infinite-logFC proteins, directional filtering, removes ad-hoc "issue" filtering, and adds .addAdditionalMetadataToIndraEdge.
Visualization / Color Mapping
R/utils_cytoscapeNetwork.R
.mapLogFCToColor detects positive/negative infinities, computes color scale from finite values with fallback, clamps infinities to bounds, preserves NA handling.

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

Review effort 3/5

Poem

🐰 I nibble through proteins, far and near,
Infinite folds now hop into clear.
I stitch metadata, colors that gleam,
Directions chosen — up, down, or both in a dream.
Hooray for edges that sparkle and sing! 🎶

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Description check ⚠️ Warning The pull request description is empty with no actual content provided; only the template structure is mentioned but not filled in by the author. Provide a complete pull request description including: (1) Motivation and Context explaining the problem and solution, (2) Detailed bullet-point list of Changes, (3) Description of Testing and any unit tests added, (4) Confirmation of the pre-submission checklist items.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title addresses one aspect of the changes (including infinity fold change), but the raw summary shows the PR also adds a directional filtering feature (direction parameter). The title is partially related but doesn't fully capture the main scope of changes.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch filter-infinity

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
R/utils_getSubnetworkFromIndra.R (1)

158-184: ⚠️ Potential issue | 🟡 Minor

Potential issue: infinite values may be dropped by logfc_cutoff filter before re-merging.

The current logic extracts infinite_fc_proteins at line 160, but the subsequent logfc_cutoff filter at line 171 uses abs(input$log2FC) > logfc_cutoff. Since abs(Inf) > logfc_cutoff evaluates to TRUE, infinite values would actually pass this filter. However, if include_infinite_fc = FALSE and logfc_cutoff is set, infinite values would still pass the logfc filter and remain in the result, which may not be the intended behavior.

Consider explicitly excluding infinite values during the logfc_cutoff filtering when include_infinite_fc = FALSE:

Proposed fix
     if (!is.null(logfc_cutoff)) {
         if (!is.numeric(logfc_cutoff) || length(logfc_cutoff) != 1 || logfc_cutoff < 0) {
             stop("logfc_cutoff must be a single positive numeric value")
         }
-        input <- input[!is.na(input$log2FC) & abs(input$log2FC) > logfc_cutoff, ]
+        input <- input[!is.na(input$log2FC) & is.finite(input$log2FC) & abs(input$log2FC) > logfc_cutoff, ]
     }

This ensures infinite values are only included when explicitly requested via include_infinite_fc = TRUE.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/utils_getSubnetworkFromIndra.R` around lines 158 - 184, The filter for
logfc_cutoff currently lets infinite log2FC values pass unless
include_infinite_fc is TRUE; update the logfc cutoff step in the function to
explicitly exclude infinite values when include_infinite_fc is FALSE by changing
the filter from input <- input[!is.na(input$log2FC) & abs(input$log2FC) >
logfc_cutoff, ] to a conditional that adds !is.infinite(input$log2FC) when
include_infinite_fc is FALSE (keep existing extraction of infinite_fc_proteins
when include_infinite_fc is TRUE and continue to re-merge them only when
requested).
🧹 Nitpick comments (1)
R/utils_getSubnetworkFromIndra.R (1)

174-184: Consider consolidating the merge operations for clarity.

The two separate rbind + deduplication blocks could be combined into a single operation for better readability and maintainability. The current approach works correctly but the precedence order (exempt → infinite → filtered) could be made more explicit.

Optional consolidation
-    # Combine filtered data with exempt proteins and remove duplicates
-    if (!is.null(exempt_proteins) && nrow(exempt_proteins) > 0) {
-        combined_input <- rbind(exempt_proteins, input)
-        # Remove duplicates based on Protein column, keeping first occurrence
-        input <- combined_input[!duplicated(combined_input$Protein), ]
-    }
-    
-    if (!is.null(infinite_fc_proteins) && nrow(infinite_fc_proteins) > 0) {
-        combined_input <- rbind(infinite_fc_proteins, input)
-        input <- combined_input[!duplicated(combined_input$Protein), ]
-    }
+    # Combine filtered data with exempt and infinite proteins
+    # Priority order: exempt > infinite > filtered (first occurrence kept)
+    parts_to_merge <- list(exempt_proteins, infinite_fc_proteins, input)
+    parts_to_merge <- Filter(function(x) !is.null(x) && nrow(x) > 0, parts_to_merge)
+    if (length(parts_to_merge) > 1) {
+        combined_input <- do.call(rbind, parts_to_merge)
+        input <- combined_input[!duplicated(combined_input$Protein), ]
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/utils_getSubnetworkFromIndra.R` around lines 174 - 184, Consolidate the two
rbind+dedup blocks by building a single stacked data.frame in the explicit
precedence order (exempt_proteins first, then infinite_fc_proteins, then input)
and then remove duplicates by Protein keeping the first occurrence; locate the
logic using the variables exempt_proteins, infinite_fc_proteins, input and
replace the two separate combined_input creations with one combined_input <-
rbind(...) followed by the single deduplication step input <-
combined_input[!duplicated(combined_input$Protein), ] to make precedence and
intent clearer.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@R/utils_getSubnetworkFromIndra.R`:
- Around line 158-184: The filter for logfc_cutoff currently lets infinite
log2FC values pass unless include_infinite_fc is TRUE; update the logfc cutoff
step in the function to explicitly exclude infinite values when
include_infinite_fc is FALSE by changing the filter from input <-
input[!is.na(input$log2FC) & abs(input$log2FC) > logfc_cutoff, ] to a
conditional that adds !is.infinite(input$log2FC) when include_infinite_fc is
FALSE (keep existing extraction of infinite_fc_proteins when include_infinite_fc
is TRUE and continue to re-merge them only when requested).

---

Nitpick comments:
In `@R/utils_getSubnetworkFromIndra.R`:
- Around line 174-184: Consolidate the two rbind+dedup blocks by building a
single stacked data.frame in the explicit precedence order (exempt_proteins
first, then infinite_fc_proteins, then input) and then remove duplicates by
Protein keeping the first occurrence; locate the logic using the variables
exempt_proteins, infinite_fc_proteins, input and replace the two separate
combined_input creations with one combined_input <- rbind(...) followed by the
single deduplication step input <-
combined_input[!duplicated(combined_input$Protein), ] to make precedence and
intent clearer.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 05f1e744-d301-40de-b149-1210e02160e4

📥 Commits

Reviewing files that changed from the base of the PR and between 1db5b56 and 7654ca7.

📒 Files selected for processing (4)
  • R/getSubnetworkFromIndra.R
  • R/utils_cytoscapeNetwork.R
  • R/utils_getSubnetworkFromIndra.R
  • man/getSubnetworkFromIndra.Rd

@codecov-commenter
Copy link

codecov-commenter commented Mar 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.09%. Comparing base (1db5b56) to head (cc9306d).
⚠️ Report is 2 commits behind head on devel.

Additional details and impacted files
@@            Coverage Diff             @@
##            devel      #80      +/-   ##
==========================================
+ Coverage   78.66%   80.09%   +1.42%     
==========================================
  Files           7        7              
  Lines         825      839      +14     
==========================================
+ Hits          649      672      +23     
+ Misses        176      167       -9     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
R/utils_getSubnetworkFromIndra.R (1)

187-191: Consider adding test coverage for direction filtering.

The Codecov report indicates missing coverage in this file. The "up" and "down" direction branches should be tested to ensure the filtering behaves correctly for both directions, including edge cases with infinite log2FC values.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/utils_getSubnetworkFromIndra.R` around lines 187 - 191, Add unit tests for
the direction filtering logic in getSubnetworkFromIndra: create test cases that
call getSubnetworkFromIndra (or the helper containing the shown block) with
direction = "up" and direction = "down" and an input data.frame that includes
NA, positive, negative, zero, Inf and -Inf values for the log2FC column to
assert only the expected rows remain (positive and finite for "up", negative and
finite for "down"), and include edge assertions that infinite values are
excluded and NA rows are dropped; ensure tests cover both branches that use
input[!is.na(input$log2FC) & input$log2FC > 0, ] and input[!is.na(input$log2FC)
& input$log2FC < 0, ] so Codecov reflects the coverage.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@R/utils_getSubnetworkFromIndra.R`:
- Around line 187-191: Add unit tests for the direction filtering logic in
getSubnetworkFromIndra: create test cases that call getSubnetworkFromIndra (or
the helper containing the shown block) with direction = "up" and direction =
"down" and an input data.frame that includes NA, positive, negative, zero, Inf
and -Inf values for the log2FC column to assert only the expected rows remain
(positive and finite for "up", negative and finite for "down"), and include edge
assertions that infinite values are excluded and NA rows are dropped; ensure
tests cover both branches that use input[!is.na(input$log2FC) & input$log2FC >
0, ] and input[!is.na(input$log2FC) & input$log2FC < 0, ] so Codecov reflects
the coverage.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 71cd13dd-32a0-4a81-be17-02c0508a80cd

📥 Commits

Reviewing files that changed from the base of the PR and between 7654ca7 and 9cf4739.

📒 Files selected for processing (3)
  • R/getSubnetworkFromIndra.R
  • R/utils_getSubnetworkFromIndra.R
  • man/getSubnetworkFromIndra.Rd

@tonywu1999 tonywu1999 merged commit 5fc89b1 into devel Mar 5, 2026
4 of 5 checks passed
@tonywu1999 tonywu1999 deleted the filter-infinity branch March 5, 2026 23:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants