Dear Professor Grayling,
As I mentioned in my previous post, I am having some difficulty interpreting the printed tables. This is surely my fault, since I have not yet studied your project in depth, and it is indeed complex.
Until I understand more thoroughly how it works, I would like to simplify provisionally (I hope not unduly!): my aim is to obtain a kind of local alpha that accounts for the sequential design, so that "the overall test" is guaranteed to stay below, say, alpha=0.05. I would then use it as a sequentially adjusted alpha level in any multiple comparison I perform afterwards.
As an example, I report the output of the rpact application, which, however, does not allow for multi-arm designs.
Power calculation for a binary endpoint
Sequential analysis with a maximum of 2 looks (group sequential design), overall
significance level 5% (two-sided).
The results were calculated for a two-sample test for rates (normal approximation),
H0: pi(1) - pi(2) = 0, H1: treatment rate pi(1) = 0.75, control rate pi(2) = 0.4,
maximum number of subjects = 80.
| Stage | 1 | 2 |
| --- | --- | --- |
| Information rate | 50% | 100% |
| Efficacy boundary (z-value scale) | 2.178 | 2.178 |
| Overall power | 0.5477 | 0.8716 |
| Expected number of subjects | 58.1 | |
| Number of subjects | 40.0 | 80.0 |
| Cumulative alpha spent | 0.0294 | 0.0500 |
| Two-sided local significance level | 0.0294 | 0.0294 |
| Lower efficacy boundary (t) | 0.341 | 0.243 |
| Upper efficacy boundary (t) | -0.299 | -0.221 |
| Exit probability for efficacy (under H0) | 0.0294 | |
| Exit probability for efficacy (under H1) | 0.5477 | |

Legend:
(t): treatment effect scale
Here, as I understand it, the alpha level I am looking for at the first interim analysis is 0.0294. It should be corrected for sequentiality only, since it relates to a two-arm design. It equals the level at the final stage because Pocock's design was chosen. With the O'Brien & Fleming design, the two numbers become:
Two-sided local significance level 0.0052 0.0480
in agreement with my expectations.
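To check my reading of these numbers, here is a minimal sketch in Python (standard library only). The function names are my own and are not part of rpact. It assumes two equally spaced looks, so that under H0 the two z-statistics have correlation sqrt(0.5), and it solves numerically for the boundaries at which the overall two-sided crossing probability equals alpha.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def crossing_prob(c1, c2, steps=2000):
    """P(|Z1| >= c1 or |Z2| >= c2) under H0 for two equally spaced looks.

    Z2 = (Z1 + W) / sqrt(2) with W an independent standard normal,
    so corr(Z1, Z2) = sqrt(0.5).
    """
    root2 = 2 ** 0.5
    h = 2 * c1 / steps
    stay = 0.0  # probability of staying inside both boundaries
    for i in range(steps):
        z1 = -c1 + (i + 0.5) * h  # midpoint rule over (-c1, c1)
        inner = STD.cdf(root2 * c2 - z1) - STD.cdf(-root2 * c2 - z1)
        stay += STD.pdf(z1) * inner * h
    return 1.0 - stay


def boundaries(alpha, design):
    """Return the z-boundaries (c1, c2); design is 'pocock' or 'obf'."""
    ratio = 1.0 if design == "pocock" else 2 ** 0.5  # c1 / c2
    lo, hi = 1.0, 5.0
    for _ in range(50):  # bisection on the final-stage boundary
        mid = (lo + hi) / 2
        if crossing_prob(ratio * mid, mid) > alpha:
            lo = mid
        else:
            hi = mid
    c2 = (lo + hi) / 2
    return ratio * c2, c2


def local_level(c):
    """Two-sided local significance level for a z-boundary c."""
    return 2 * (1 - STD.cdf(c))


for design in ("pocock", "obf"):
    c1, c2 = boundaries(0.05, design)
    print(design, round(c1, 3), round(c2, 3),
          round(local_level(c1), 4), round(local_level(c2), 4))
```

If my understanding is right, the printed local levels should agree, up to rounding, with the rpact values quoted above for both designs.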
In the "Operating characteristics summary" (key, error rates and others), I am unsure how to interpret the large amount of data correctly, so, to be safe, I would prefer to use a simpler, well-identified quantity.
Could you please tell me which number I am looking for?
Alternatively, could you tell me which numbers I should look at to obtain the alpha level to use with the omnibus test and with the direct comparisons between the control and experimental groups?
Thank you in advance!
Paolo
PS This information is easy to find in your function for a one-stage design, where the multiplicity correction method can be chosen.
For example:
Design summary
Inputs
The following choices were made:
K=3 experimental treatments will be included in the trial.
A significance level of α=0.05 will be used, in combination with the Holm-Bonferroni correction.
The response rate in the control arm will be assumed to be: π0=0.3 .
The marginal power for each null hypothesis will be controlled to level 1−β=0.8 under each of their respective least favourable configurations.
The interesting and uninteresting treatment effects will be: δ1=0.45 and δ0=0 respectively.
The target allocation to each of the experimental arms will be: the same as the control arm.
The sample size in each arm will be required to be an integer.
Plots will not be produced.
Outputs
The total required sample size is: N=72 .
The required sample size in each arm is: (n0,…,nK)=(18,18,18,18) .
Therefore, the realised allocation ratios to the experimental arms are: (r1,…,rK)=(1,1,1) .
The maximum familywise error-rate is: 0.043 .
The minimum marginal power is: 0.816 .
The following critical thresholds should be used with the chosen multiple comparison correction: (0.017,0.025,0.05) .
Here the output is clearer: in particular, the sample size in each arm is 18, and the critical thresholds for multiple comparisons with the step-down method are (0.017,0.025,0.05). I am not sure, however, what exactly the "maximum familywise error rate" = 0.043 represents, i.e. why it is not 0.05. (I believe it can be set aside, since it seems to be linked only to the three comparisons under the Holm-Bonferroni method and appears to be independent of any specific data.)
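As I understand the step-down thresholds, they are alpha/K, alpha/(K-1), ..., alpha, applied to the p-values sorted from smallest to largest. A minimal Python sketch of this reading follows; the function names and the p-values in the usage line are hypothetical and are not taken from your package or from any real data.

```python
def holm_thresholds(alpha, k):
    """Step-down thresholds, smallest p-value first: alpha/k, ..., alpha."""
    return [alpha / (k - j) for j in range(k)]


def holm_reject(p_values, alpha):
    """Return booleans (in input order) marking the rejected hypotheses."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    rejected = [False] * k
    for step, i in enumerate(order):
        if p_values[i] > alpha / (k - step):
            break  # stop at the first non-rejection
        rejected[i] = True
    return rejected


# Thresholds for K = 3 comparisons at alpha = 0.05
print([round(t, 3) for t in holm_thresholds(0.05, 3)])

# Hypothetical p-values for the three comparisons with control
print(holm_reject([0.030, 0.004, 0.040], 0.05))
```

With K=3 and alpha=0.05 the thresholds round to the triple reported in the output above. This sketch does not explain the 0.043 figure.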
In the sequential section I would have expected something similar. However, the correction method is no longer selectable.
I guess there is a reason for this, but I hope there is a way to get the equivalent information, maybe in the tables ...
For instance, suppose the single-stage design above became one stage of a two-stage design (for a total of N=72*2=144 subjects), with the aim of reducing the effect to be detected while allowing the trial to stop early if a more substantial effect were present. By how much would the multiple comparison thresholds of the Holm-Bonferroni correction be penalized in the two stages under the Pocock design? Is there a simple way to answer this question with your package?
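To make the question concrete, here is a rough Python sketch (standard library only) of the kind of number I have in mind. It is only my own assumption, not what your package does: for each Holm-type overall level alpha/3, alpha/2 and alpha, it computes the two-sided local level of a two-look Pocock design with equal information at both looks. I realise that only the first, Bonferroni-type level is clearly conservative, and that carrying the step-down logic across stages may need more care than this.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def pocock_local_level(alpha, steps=2000):
    """Two-sided local level of a 2-look Pocock design (equal information)."""
    root2 = 2 ** 0.5

    def crossing(c):
        # P(|Z1| >= c or |Z2| >= c) under H0, with corr(Z1, Z2) = sqrt(0.5)
        h = 2 * c / steps
        stay = 0.0
        for i in range(steps):
            z1 = -c + (i + 0.5) * h  # midpoint rule over (-c, c)
            inner = STD.cdf(root2 * c - z1) - STD.cdf(-root2 * c - z1)
            stay += STD.pdf(z1) * inner * h
        return 1.0 - stay

    lo, hi = 1.0, 6.0
    for _ in range(50):  # bisection on the common boundary c
        mid = (lo + hi) / 2
        if crossing(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 2 * (1 - STD.cdf((lo + hi) / 2))


alpha, k = 0.05, 3
for j in range(k):
    overall = alpha / (k - j)  # Holm-type overall level for this step
    print(round(overall, 4), round(pocock_local_level(overall), 4))
```

The last printed line corresponds to the unadjusted overall level of 0.05 and should reproduce the Pocock local level from the rpact output; the first two lines show how much smaller the per-stage levels become once the overall level is split across the three comparisons.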