Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates RVAT’s database layer so the refactoring branch can run with DuckDB as the backend instead of SQLite/RSQLite, adjusting connections, SQL queries, and gdb schema/build logic accordingly.
Changes:
- Switch
gdb/aggdbconstructors and DB operations to DuckDB viaDBI+duckdb. - Update schema/build pipeline to generate
VAR_idin DuckDB (sequence +RETURNING) and passvar_idintoinsertDosageRecord(). - Adjust various SQL placeholders and attach/index-copy logic to be more DuckDB-compatible.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| R/spatialClust.R | Replaces RSQLite-specific result handling with DBI equivalents. |
| R/mapToCDS.R | Uses DBI::dbGetQuery() for overlap queries. |
| R/gdb.R | Switches gdb connection to DuckDB and updates path retrieval + query helper usage. |
| R/gdb-utils.R | Updates attach parameterization and index-copy logic; still contains SQLite rowid remap. |
| R/gdb-shared-helpers.R | Switches to positional parameters and updates meta/index logic (introduces meta regression). |
| R/gdb-getGT.R | Switches genotype query to DBI::dbGetQuery(). |
| R/gdb-buildGdb.R | Reworks schema + insert pipeline for DuckDB (RETURNING VAR_id, explicit dosage.VAR_id, drop/recreate helpers). |
| R/gdb-anno-cohort.R | Switches delete statement parameter style to positional placeholders. |
| R/allGenerics.R | Updates insertDosageRecord() generic signature to accept var_id. |
| R/allClasses.R | Changes gdb/aggdb classes to extend duckdb_connection (roxygen import tag currently malformed). |
| R/aggDb.R | Switches aggdb connections to DuckDB. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| params = list("genomeBuild", genomeBuild) | ||
| ) | ||
| } | ||
| if (!genomeBuild %in% names(nonPAR)) { |
There was a problem hiding this comment.
if (!genomeBuild %in% names(nonPAR)) will throw “argument is of length zero” when genomeBuild is NULL (default in buildGdb). Guard this check (and the warning) with !is.null(genomeBuild) or handle NULL explicitly before using %in%.
| if (!genomeBuild %in% names(nonPAR)) { | |
| if (!is.null(genomeBuild) && !genomeBuild %in% names(nonPAR)) { |
| DBI::dbExecute( | ||
| gdb, | ||
| "INSERT INTO meta VALUES (:name, :value)", | ||
| "INSERT INTO meta VALUES (?, ?)", |
There was a problem hiding this comment.
.gdb_populate_meta_table() no longer inserts a creationDate row into meta, but getCreationDate() expects it and existing tests rely on it. Reintroduce writing creationDate (and consider writing genomeBuild as NA_character_ when NULL to preserve the previous metadata shape).
R/gdb-utils.R
Outdated
| DBI::dbExecute(gdb, "update var set VAR_id=rowid") ##TODO: see how to do this with duckdb (where rowid is not supported?) | ||
| DBI::dbExecute(gdb, "update dosage set VAR_id=rowid") |
There was a problem hiding this comment.
DuckDB does not support SQLite’s rowid, so update var set VAR_id=rowid / update dosage set VAR_id=rowid will fail (and concatGdb() defaults to running this). Replace this remapping step with a DuckDB-compatible approach (e.g., assign sequential ids using row_number() OVER (ORDER BY ...) and update dosage via a join/mapping).
R/allClasses.R
Outdated
| #' @rdname gdb | ||
| #' @import DBI | ||
| #' @importClassesFrom RSQLite SQLiteConnection | ||
| #' @importClassesFrom duckdb_connection |
There was a problem hiding this comment.
The roxygen tag is malformed: @importClassesFrom must specify a package and one or more class names (e.g. @importClassesFrom duckdb duckdb_connection). As written, roxygen/R CMD check is likely to fail, and the duckdb_connection dependency for setClass('gdb', contains=...) won’t be declared correctly.
| #' @importClassesFrom duckdb_connection | |
| #' @importClassesFrom duckdb duckdb_connection |
…e in getAnno using read_csv_auto
minor fixes + update units tests for duckdb implementation
couple of fixes based on copilot review
latest updates from dev_duckdb_paul
Ha Paul,
Dit zijn de veranderingen zodat de refactoring branch met duckdb als backend werkt.
Ik heb de rvat tutorials gerund en deze werken met duckdb als backend. Misschien nog wel goed om ook de unit tests te doen, voor een meer volledige check.
Een aandachtspunt is nog dit stukje code in gbd-utils.R:
DBI::dbExecute(gdb, "update var set VAR_id=rowid") ##TODO: see how to do this with duckdb (where rowid is not supported?)
DBI::dbExecute(gdb, "update dosage set VAR_id=rowid")
duckdb support namelijk geen rowid. Hier zal ik nog naar kijken.
(en zou jij weer copilot kunnen toevoegen als reviewer?)