Merged
59 commits
d850a85
split into separate pages for single/independent/paired proportions, …
petelaud Feb 18, 2026
79b7d2e
reflect updated SAS option name for CL=Newcombe instead of CL=Wilson …
petelaud Feb 18, 2026
99316f0
initial minor edits, add reference to macro available on GitHub
petelaud Feb 18, 2026
84bf12c
move introduction to separate page for general information on CIs for…
petelaud Feb 19, 2026
1610a98
typos & clarification
petelaud Feb 19, 2026
3fb8a76
add reference to new COMMONRISKDIFF option in SAS Viya
petelaud Feb 19, 2026
3060c33
add warning that PROC FREQ fails to produce an interval for 0/n1 vs 0/n2
petelaud Feb 19, 2026
797fc4b
add text to categorical data subheadings to make connection with the …
petelaud Feb 19, 2026
507f8e8
add further details about continuity adjustment and consistency with …
petelaud Feb 19, 2026
3404973
move intro page to method_summary folder, and add to index page
petelaud Feb 19, 2026
01ee908
clarification about COMMONRISKDIFF option and new sub-options in SAS …
petelaud Feb 19, 2026
a9b3659
remove intro text. Initial outline sections added and first draft of …
petelaud Feb 19, 2026
feffacc
split R page into separate sections for single/independent/paired pro…
petelaud Feb 19, 2026
5c4a92f
remove packages not relevant to the section, & add placeholder for ad…
petelaud Feb 19, 2026
c12abf1
replace intro text with reference to new summary section
petelaud Feb 19, 2026
9b08c89
add placeholders for new sections - content to be added
petelaud Feb 19, 2026
8e3a83f
add note about development version of ratesci::rateci()
petelaud Feb 19, 2026
af55a7b
clarification for MN/Mee
petelaud Feb 22, 2026
1a9a5e6
remove 'asymptotic', which is non-specific
petelaud Feb 22, 2026
fd3bd47
add section describing methods that apply across different contrast p…
petelaud Feb 22, 2026
43c1d72
add detail about historic advice for small cell counts, & some other …
petelaud Feb 22, 2026
532dfa6
initial creation of references file for easier management of citation…
petelaud Feb 23, 2026
181b59b
tidy up text about small cell counts
petelaud Feb 23, 2026
4ae7bba
add comment about one-sided coverage and central location
petelaud Feb 23, 2026
12ba11b
extend details on asymptotic score, SCAS, MOVER & mid-P methods
petelaud Feb 23, 2026
a0dd7c8
remove rogue quotation mark (and test commit/push from RStudio on Mac)
petelaud Feb 23, 2026
b8bf4d7
add citations
petelaud Feb 23, 2026
15bf671
add subscripts for p1 and p2
petelaud Feb 23, 2026
3d8eb57
make headings consistent
petelaud Feb 23, 2026
949103a
mention continuity adjustment for Jeffreys
petelaud Feb 23, 2026
18e1dd5
add details for continuity adjusted methods and SAS PROC FREQ options
petelaud Feb 23, 2026
2daec26
intro improvements & additions
petelaud Feb 26, 2026
fe1bf37
several general edits/clarifications, plus correction to Jeffreys and…
petelaud Feb 26, 2026
5d7307a
make some subtle language improvements and add Cai reference
petelaud Mar 5, 2026
498a517
add bibliography file
petelaud Mar 5, 2026
affd708
expand introduction, and add link to method summary page
petelaud Mar 5, 2026
521a9e4
add SAS code for importing the csv file
petelaud Mar 5, 2026
785c827
add some references
petelaud Mar 5, 2026
1463683
tidy up description of available methods in PROC FREQ, and add exampl…
petelaud Mar 5, 2026
bfd5281
improve description of Wald and Newcome methods, and add brief descri…
petelaud Mar 5, 2026
fa586bd
expand section on continuity adjustments
petelaud Mar 5, 2026
f84a0fc
tidy up references
petelaud Mar 5, 2026
2d75571
expand intro with warning for situations when SAS fails to produce an…
petelaud Mar 5, 2026
f512e98
add sort step in code to facilitate later examples
petelaud Mar 5, 2026
7e34beb
move example code to end section
petelaud Mar 5, 2026
b2412bd
add further description for score methods
petelaud Mar 5, 2026
f82e2ac
improve description of 'exact' methods
petelaud Mar 5, 2026
3926be6
combine sections for RR and RR, since SAS provides a similar set of m…
petelaud Mar 5, 2026
a16922b
minor updates to content and formatting of continuity adjusted section
petelaud Mar 5, 2026
e6279a8
updates to section on consistency with hypothesis tests
petelaud Mar 5, 2026
9cde348
update text around example code
petelaud Mar 5, 2026
0fce38a
update ci_for_prop table entries
petelaud Mar 5, 2026
a93b208
minor edits
petelaud Mar 5, 2026
80da58f
add .csl file for citations
petelaud Mar 5, 2026
69a9e82
Merge branch 'main' into ratesci
petelaud Mar 5, 2026
7489122
remove #| eval: false from SAS chunks
petelaud Mar 5, 2026
2670cd2
delete spurious SAS chunk
petelaud Mar 5, 2026
2191c0d
add eval: false in header
petelaud Mar 6, 2026
c9c76d7
add eval: false to header
petelaud Mar 6, 2026
179 changes: 179 additions & 0 deletions R/ci_for_2indep_prop.qmd
@@ -0,0 +1,179 @@
---
title: "Confidence Intervals for Independent Proportions in R"
---

## Introduction

\[See separate page for general introductory information on confidence intervals for proportions.\]

\[Note: information about cicalc package will be added to this page soon.\]

## Data used

The adcibc data stored [here](../data/adcibc.csv) was used in this example, creating a binary treatment variable `trt` taking the values of `ACT` or `PBO` and a binary response variable `resp` taking the values of `Yes` or `No`. For this example, a response is defined as a score greater than 4.

```{r}
#| echo: false
#| include: false
library(tidyverse)
library(cardx)
library(DescTools)
adcibc2 <- read_csv("../data/adcibc.csv")

adcibc <- adcibc2 |>
select(AVAL, TRTP) |>
mutate(
resp = if_else(AVAL > 4, "Yes", "No"),
respn = if_else(AVAL > 4, 1, 0),
trt = if_else(TRTP == "Placebo", "PBO", "ACT"),
trtn = if_else(TRTP == "Placebo", 0, 1)
) |>
select(trt, trtn, resp, respn)

# cardx package required a vector with 0 and 1s for a single proportion CI
act <- filter(adcibc, trt == "ACT") |>
select(respn)
act2 <- act$respn
```

The output below shows that for the active treatment there are 36 responders out of 154 subjects, a proportion of 0.234 (23.4% responders).

```{r}
#| echo: false
adcibc |>
group_by(trt, resp) |>
tally()
```

## Packages

**The {cardx} package** is an extension of the {cards} package, providing additional functions to create Analysis Results Data Objects (ARDs)^1^. It was developed as part of {NEST} and pharmaverse. This package requires the binary endpoint to be a logical (TRUE/FALSE) vector or a numeric/integer coded as (0, 1) with 1 (TRUE) being the success you want to calculate the confidence interval for.

See [here](R:%20Functions%20for%20Calculating%20Proportion%20Confidence%20Intervals) for full description of the {cardx} proportions equations.

If calculating the CI for a difference in proportions, the package requires both the response and the treatment variable to be numeric/integer coded as (0, 1) (or logical vector).

Instead of the code presented below, you can use `ard_categorical_ci(data, variables=resp, method ='wilson')` for example. This invokes the code below but returns an analysis results dataset (ARD) format as the output.

Methods included are \[TBC\] methods for 2 independent samples.

**The {ratesci} package** is ... \[TBC\]

**The {DescTools} package** has a function `BinomDiffCI()` which produces CIs for two independent proportions (unmatched pairs), including the Agresti-Caffo, Wald, Wald with continuity correction, Newcombe score and Newcombe score with continuity correction methods, and more computationally intensive methods such as Miettinen-Nurminen, Mee, Brown and Li's Jeffreys, Hauck-Anderson and Haldane. See [here](https://search.r-project.org/CRAN/refmans/DescTools/html/BinomDiffCI.html) for more detail.

**The {presize} package** has a function `prec_prop()` which also calculates CIs for 2 independent samples using the Wilson, Agresti-Coull, Exact or Wald approaches. The package is not described in further detail here since in most cases **{DescTools}** will be able to compute what is needed. However, it is mentioned because of its other functionality, such as sample size and precision calculations for AUC, correlations, Cronbach's alpha, intraclass correlation, Cohen's kappa, likelihood ratios, means, mean differences, odds ratios, rates, rate ratios, risk differences and risk ratios.

## Methods for Calculating Confidence Intervals for Proportion Difference from 2 independent samples

This [paper](https://www.lexjansen.com/wuss/2016/127_Final_Paper_PDF.pdf)^4^ describes many methods for the calculation of confidence intervals for 2 independent proportions.

### Normal Approximation Method (Also known as the Wald or asymptotic CI Method) using {cardx}

For more technical information regarding the Wald method see the corresponding [SAS page](https://psiaims.github.io/CAMIS/SAS/ci_for_prop.html).

#### Example code

`cardx::ard_stats_prop_test()` uses `stats::prop.test()`, which also allows a continuity correction to be applied.

Although the documentation [here](https://rdrr.io/r/stats/prop.test.html) and [here](https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/prop.test) references Newcombe for the CI that this function uses, replicating the results by hand and comparing them with SAS shows that the results below match the Normal Approximation (Wald) method.

Both the treatment variable (ACT, PBO) and the response variable (Yes, No) have to be numeric (0, 1) or logical (TRUE, FALSE) variables.

With 2 groups, `prop.test()` by default tests the null hypothesis that the proportions in each group are the same and returns a 2-sided CI.

```{r}
indat1 <- adcibc2 |>
select(AVAL, TRTP) |>
mutate(
resp = if_else(AVAL > 4, "Yes", "No"),
respn = if_else(AVAL > 4, 1, 0),
trt = if_else(TRTP == "Placebo", "PBO", "ACT"),
trtn = if_else(TRTP == "Placebo", 1, 0)
) |>
select(trt, trtn, resp, respn)

# cardx package required a vector with 0 and 1s for a single proportion CI
# To get the comparison the correct way around Placebo must be 1, and Active 0

indat <- select(indat1, trtn, respn)

cardx::ard_stats_prop_test(
data = indat,
by = trtn,
variables = respn,
conf.level = 0.95,
correct = FALSE
)
cardx::ard_stats_prop_test(
data = indat,
by = trtn,
variables = respn,
conf.level = 0.95,
correct = TRUE
)
```
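
As a cross-check on the statement that `stats::prop.test()` returns the Wald interval for two groups, the sketch below recomputes the interval by hand in base R. It assumes the counts quoted elsewhere on this page (36/154 responders on active, 12/77 on placebo) and the usual Wald form $\hat p_1 - \hat p_2 \pm z_{1-\alpha/2}\sqrt{\hat p_1(1-\hat p_1)/n_1 + \hat p_2(1-\hat p_2)/n_2}$, with $\tfrac{1}{2}(1/n_1 + 1/n_2)$ added to the half-width for the continuity-corrected version. The object names are illustrative and not part of any package.

```{r}
# Counts assumed from the example data: active = 36/154, placebo = 12/77
x <- c(36, 12)
n <- c(154, 77)

p_hat <- x / n
d_hat <- p_hat[1] - p_hat[2]
z <- qnorm(0.975)
se <- sqrt(sum(p_hat * (1 - p_hat) / n))

# Wald interval, without and with the continuity correction
wald_ci <- d_hat + c(-1, 1) * z * se
wald_cc_ci <- d_hat + c(-1, 1) * (z * se + 0.5 * sum(1 / n))

# The same counts passed directly to prop.test()
pt_ci <- as.numeric(prop.test(x = x, n = n, correct = FALSE)$conf.int)
pt_cc_ci <- as.numeric(prop.test(x = x, n = n, correct = TRUE)$conf.int)

# Both comparisons are expected to agree to numerical tolerance
all.equal(wald_ci, pt_ci)
all.equal(wald_cc_ci, pt_cc_ci)
```

Working from the counts rather than the subject-level data keeps the check independent of how the treatment and response variables are coded.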

### Normal Approximation (Wald) and Other Methods for 2 independent samples using {DescTools}

For more technical information regarding the derivations of these methods see the corresponding [SAS page](https://psiaims.github.io/CAMIS/SAS/ci_for_prop.html) or the {DescTools} package documentation [here](https://search.r-project.org/CRAN/refmans/DescTools/html/BinomDiffCI.html). The `BinomDiffCI()` function in **{DescTools}** produces CIs for two independent proportions (unmatched pairs), including the Agresti-Caffo, Wald, Wald with continuity correction, Newcombe score and Newcombe score with continuity correction methods, and more computationally intensive (less commonly used) methods such as Miettinen-Nurminen, Mee, Brown and Li's Jeffreys, Hauck-Anderson, Haldane and Jeffreys-Perks.

#### Example code

With 2 groups, the null hypothesis is that the proportions in each group are the same, and a 2-sided CI is calculated for the difference in proportions.

```{r}
count_dat <- indat |>
count(trtn, respn)
count_dat

# BinomDiffCI requires
# x1 = successes in active, n1 = total subjects in active,
# x2 = successes in placebo, n2 = total subjects in placebo

DescTools::BinomDiffCI(
x1 = 36,
n1 = 154,
x2 = 12,
n2 = 77,
conf.level = 0.95,
sides = c("two.sided"),
method = c(
"wald",
"waldcc",
"score",
"scorecc",
"ac",
"mn",
"mee",
"blj",
"ha",
"hal",
"jp"
)
)
```
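
To illustrate what one of these methods does, the sketch below builds the Newcombe score interval by hand in base R: a Wilson score interval is calculated for each proportion, and the distances from each estimate to its limits are combined by "square and add". The helper names `wilson_ci()` and `newcombe_ci()` are hypothetical and written for this illustration only. The result should agree with the `"score"` row from `BinomDiffCI()`, but that agreement is an assumption to check against the output above, not something asserted here.

```{r}
# Wilson score interval for a single proportion (no continuity correction)
wilson_ci <- function(x, n, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)
  p <- x / n
  centre <- (p + z^2 / (2 * n)) / (1 + z^2 / n)
  half <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
  c(lower = centre - half, upper = centre + half)
}

# Newcombe interval for p1 - p2: square-and-add the distances between
# each estimate and the relevant Wilson limit
newcombe_ci <- function(x1, n1, x2, n2, conf.level = 0.95) {
  p1 <- x1 / n1
  p2 <- x2 / n2
  w1 <- wilson_ci(x1, n1, conf.level)
  w2 <- wilson_ci(x2, n2, conf.level)
  d <- p1 - p2
  c(
    est = d,
    lwr.ci = d - sqrt((p1 - w1[["lower"]])^2 + (w2[["upper"]] - p2)^2),
    upr.ci = d + sqrt((w1[["upper"]] - p1)^2 + (p2 - w2[["lower"]])^2)
  )
}

# Same counts as used in the BinomDiffCI() call
newcombe_ci(x1 = 36, n1 = 154, x2 = 12, n2 = 77)
```

Unlike the Wald interval, this interval is generally not symmetric about the point estimate, because the Wilson limits for each group are not symmetric about that group's proportion.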

## Methods for Calculating Confidence Intervals for Relative Risk from 2 independent samples

\[TBC\]

## Methods for Calculating Confidence Intervals for Odds Ratio from 2 independent samples

\[TBC\]

## Continuity Adjusted Methods

\[TBC\]

## Consistency with hypothesis tests

\[TBC\] - cf. chi-squared tests

## References

1. [pharmaverse cardx package](https://insightsengineering.github.io/cardx/main/#:~:text=The%20%7Bcardx%7D%20package%20is%20an%20extension%20of%20the,Data%20Objects%20%28ARDs%29%20using%20the%20R%20programming%20language.)
2. [PropCIs package](https://cran.r-project.org/web//packages/PropCIs/PropCIs.pdf)
3. D. Altman, D. Machin, T. Bryant, M. Gardner (eds). Statistics with Confidence: Confidence Intervals and Statistical Guidelines, 2nd edition. John Wiley and Sons 2000.
4. <https://www.lexjansen.com/wuss/2016/127_Final_Paper_PDF.pdf>
134 changes: 134 additions & 0 deletions R/ci_for_paired_prop.qmd
@@ -0,0 +1,134 @@
---
title: "Confidence Intervals for Paired Proportions in R"
---

## Introduction

\[See separate page for general introductory information on confidence intervals for proportions.\]

\[Note: information about cicalc package will be added to this page soon.\]

## Data used

The adcibc data stored [here](../data/adcibc.csv) was used in this example, creating a binary treatment variable `trt` taking the values of `ACT` or `PBO` and a binary response variable `resp` taking the values of `Yes` or `No`. For this example, a response is defined as a score greater than 4.

```{r}
#| echo: false
#| include: false
library(tidyverse)
library(cardx)
library(DescTools)
adcibc2 <- read_csv("../data/adcibc.csv")

adcibc <- adcibc2 |>
select(AVAL, TRTP) |>
mutate(
resp = if_else(AVAL > 4, "Yes", "No"),
respn = if_else(AVAL > 4, 1, 0),
trt = if_else(TRTP == "Placebo", "PBO", "ACT"),
trtn = if_else(TRTP == "Placebo", 0, 1)
) |>
select(trt, trtn, resp, respn)

# cardx package required a vector with 0 and 1s for a single proportion CI
act <- filter(adcibc, trt == "ACT") |>
select(respn)
act2 <- act$respn
```

The output below shows that for the active treatment there are 36 responders out of 154 subjects, a proportion of 0.234 (23.4% responders).

```{r}
#| echo: false
adcibc |>
group_by(trt, resp) |>
tally()
```

## Packages

**The {ratesci} package** is ... \[TBC\]

**The {ExactCIdiff} package** produces exact CIs for two dependent proportions (matched pairs).

## Methods for Calculating Confidence Intervals for Proportion Difference from matched pairs using {ExactCIdiff} and {ratesci}

For more information about the detailed methods for calculating confidence intervals for a matched pair proportion see [here](https://psiaims.github.io/CAMIS/SAS/ci_for_prop.html#methods-for-calculating-confidence-intervals-for-a-matched-pair-proportion). When you have 2 measurements on the same subject, the 2 sets of measures are not independent and you have a matched pair of responses.

To date we have not found an R package which calculates a CI for matched pair proportions using the normal approximation or Wilson methods, although they can be calculated by hand using the equations provided on the SAS page linked above.

**The {ExactCIdiff} package** produces exact CIs for two dependent proportions (matched pairs), and claims to be the first R package to implement this method. However, it should only be used when the sample size is not too large, as it can be computationally intensive.\
NOTE that the {ExactNumCI} package should not be used for this task. More detail on these two packages can be found [here](RJ-2013-026.pdf).

Using a cross over study as our example, a 2 x 2 table can be formed as follows:

+-----------------------+---------------+---------------+---------------+
| | Placebo\ | Placebo\ | Total |
| | Response= Yes | Response = No | |
+=======================+===============+===============+===============+
| Active Response = Yes | r | s | r+s |
+-----------------------+---------------+---------------+---------------+
| Active Response = No | t | u | t+u |
+-----------------------+---------------+---------------+---------------+
| Total | r+t | s+u | N = r+s+t+u |
+-----------------------+---------------+---------------+---------------+

The proportions of subjects responding on each treatment are:

Active: $\hat p_1 = (r+s)/N$ and Placebo: $\hat p_2 = (r+t)/N$

The difference between the proportions for the two treatments is: $D = \hat p_1 - \hat p_2 = (s-t)/N$

Suppose:

+-----------------------+---------------+---------------+------------------+
| | Placebo\ | Placebo\ | Total |
| | Response= Yes | Response = No | |
+=======================+===============+===============+==================+
| Active Response = Yes | r = 20 | s = 15 | r+s = 35 |
+-----------------------+---------------+---------------+------------------+
| Active Response = No | t = 6 | u = 5 | t+u = 11 |
+-----------------------+---------------+---------------+------------------+
| Total | r+t = 26 | s+u = 20 | N = r+s+t+u = 46 |
+-----------------------+---------------+---------------+------------------+

Active: $\hat p_1 = (r+s)/N = 35/46 = 0.761$ and Placebo: $\hat p_2 = (r+t)/N = 26/46 = 0.565$

The difference is 0.761 - 0.565 = 0.196. The `PairedCI()` function provides an exact 95% confidence interval for this difference of -0.00339 to 0.38065, using the code shown below.

```{r}
#| eval: false
# ExactCIdiff::PairedCI(s, r+u, t, conf.level = 0.95)

CI <- ExactCIdiff::PairedCI(15, 25, 6, conf.level = 0.95)$ExactCI
CI
```
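
Since no R package was found for the normal approximation method for matched pairs, the sketch below shows how it could be calculated by hand for the same example. It assumes the standard Wald form for a paired difference, $D \pm z_{1-\alpha/2}\,\frac{1}{N}\sqrt{(s+t) - (s-t)^2/N}$, which should be checked against the equations on the SAS page before use. The object names are illustrative only.

```{r}
# Discordant counts and total number of pairs from the worked example
n_s <- 15    # active = Yes, placebo = No  (s in the table)
n_t <- 6     # active = No,  placebo = Yes (t in the table)
n_tot <- 46  # total number of pairs (N in the table)

# Point estimate of the difference in paired proportions
d_hat <- (n_s - n_t) / n_tot

# Wald standard error for a paired difference (assumed standard form)
se <- sqrt((n_s + n_t) - (n_s - n_t)^2 / n_tot) / n_tot

z <- qnorm(0.975)
wald_paired_ci <- d_hat + c(-1, 1) * z * se
wald_paired_ci
```

Only the discordant cells `s` and `t` and the total `N` enter the calculation; the concordant cells `r` and `u` affect the interval only through `N`.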

## Methods for Calculating Confidence Intervals for Relative Risk from matched pairs using {ratesci}

\[TBC\]

## Methods for Calculating Confidence Intervals for Conditional Odds Ratio from matched pairs using {ratesci}

\[TBC\]

## Continuity Adjusted Methods

\[TBC\]

## Consistency with Hypothesis Tests

\[TBC\] - cf. McNemar test

## References

1. [pharmaverse cardx package](https://insightsengineering.github.io/cardx/main/#:~:text=The%20%7Bcardx%7D%20package%20is%20an%20extension%20of%20the,Data%20Objects%20%28ARDs%29%20using%20the%20R%20programming%20language.)
2. [PropCIs package](https://cran.r-project.org/web//packages/PropCIs/PropCIs.pdf)
3. D. Altman, D. Machin, T. Bryant, M. Gardner (eds). Statistics with Confidence: Confidence Intervals and Statistical Guidelines, 2nd edition. John Wiley and Sons 2000.
4. <https://www.lexjansen.com/wuss/2016/127_Final_Paper_PDF.pdf>