Skip to content

duckdb changes#6

Open
DeJong-Kees wants to merge 14 commits intorefactoringfrom
dev_duckdb
Open

duckdb changes#6
DeJong-Kees wants to merge 14 commits intorefactoringfrom
dev_duckdb

Conversation

@DeJong-Kees
Copy link

Ha Paul,

Dit zijn de veranderingen zodat de refactoring branch met duckdb als backend werkt.

Ik heb de rvat tutorials gerund en deze werken met duckdb als backend. Misschien nog wel goed om ook de unit tests te doen, voor een meer volledige check.

Een aandachtspunt is nog dit stukje code in gbd-utils.R:

DBI::dbExecute(gdb, "update var set VAR_id=rowid") ##TODO: see how to do this with duckdb (where rowid is not supported?)
DBI::dbExecute(gdb, "update dosage set VAR_id=rowid")

duckdb support namelijk geen rowid. Hier zal ik nog naar kijken.

(en zou jij weer copilot kunnen toevoegen als reviewer?)

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates RVAT’s database layer so the refactoring branch can run with DuckDB as the backend instead of SQLite/RSQLite, adjusting connections, SQL queries, and gdb schema/build logic accordingly.

Changes:

  • Switch gdb/aggdb constructors and DB operations to DuckDB via DBI + duckdb.
  • Update schema/build pipeline to generate VAR_id in DuckDB (sequence + RETURNING) and pass var_id into insertDosageRecord().
  • Adjust various SQL placeholders and attach/index-copy logic to be more DuckDB-compatible.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
R/spatialClust.R Replaces RSQLite-specific result handling with DBI equivalents.
R/mapToCDS.R Uses DBI::dbGetQuery() for overlap queries.
R/gdb.R Switches gdb connection to DuckDB and updates path retrieval + query helper usage.
R/gdb-utils.R Updates attach parameterization and index-copy logic; still contains SQLite rowid remap.
R/gdb-shared-helpers.R Switches to positional parameters and updates meta/index logic (introduces meta regression).
R/gdb-getGT.R Switches genotype query to DBI::dbGetQuery().
R/gdb-buildGdb.R Reworks schema + insert pipeline for DuckDB (RETURNING VAR_id, explicit dosage.VAR_id, drop/recreate helpers).
R/gdb-anno-cohort.R Switches delete statement parameter style to positional placeholders.
R/allGenerics.R Updates insertDosageRecord() generic signature to accept var_id.
R/allClasses.R Changes gdb/aggdb classes to extend duckdb_connection (roxygen import tag currently malformed).
R/aggDb.R Switches aggdb connections to DuckDB.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

params = list("genomeBuild", genomeBuild)
)
}
if (!genomeBuild %in% names(nonPAR)) {
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (!genomeBuild %in% names(nonPAR)) will throw “argument is of length zero” when genomeBuild is NULL (default in buildGdb). Guard this check (and the warning) with !is.null(genomeBuild) or handle NULL explicitly before using %in%.

Suggested change
if (!genomeBuild %in% names(nonPAR)) {
if (!is.null(genomeBuild) && !genomeBuild %in% names(nonPAR)) {

Copilot uses AI. Check for mistakes.
DBI::dbExecute(
gdb,
"INSERT INTO meta VALUES (:name, :value)",
"INSERT INTO meta VALUES (?, ?)",
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.gdb_populate_meta_table() no longer inserts a creationDate row into meta, but getCreationDate() expects it and existing tests rely on it. Reintroduce writing creationDate (and consider writing genomeBuild as NA_character_ when NULL to preserve the previous metadata shape).

Copilot uses AI. Check for mistakes.
R/gdb-utils.R Outdated
Comment on lines 114 to 115
DBI::dbExecute(gdb, "update var set VAR_id=rowid") ##TODO: see how to do this with duckdb (where rowid is not supported?)
DBI::dbExecute(gdb, "update dosage set VAR_id=rowid")
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DuckDB does not support SQLite’s rowid, so update var set VAR_id=rowid / update dosage set VAR_id=rowid will fail (and concatGdb() defaults to running this). Replace this remapping step with a DuckDB-compatible approach (e.g., assign sequential ids using row_number() OVER (ORDER BY ...) and update dosage via a join/mapping).

Copilot uses AI. Check for mistakes.
R/allClasses.R Outdated
#' @rdname gdb
#' @import DBI
#' @importClassesFrom RSQLite SQLiteConnection
#' @importClassesFrom duckdb_connection
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The roxygen tag is malformed: @importClassesFrom must specify a package and one or more class names (e.g. @importClassesFrom duckdb duckdb_connection). As written, roxygen/R CMD check is likely to fail, and the duckdb_connection dependency for setClass('gdb', contains=...) won’t be declared correctly.

Suggested change
#' @importClassesFrom duckdb_connection
#' @importClassesFrom duckdb duckdb_connection

Copilot uses AI. Check for mistakes.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants