1. Table of Contents


This project explores sample size and power calculations for proportion comparison tests in clinical research using various helpful packages in R. The factors to be assessed prior to determining appropriate sample sizes were evaluated for the One-Sample, Unpaired Two-Sample, Paired Two-Sample and Multiple-Sample One-Way ANOVA Pairwise designs. Power analyses were conducted to address the requirements across different study hypotheses, including Tests of Equality, Non-Inferiority, Superiority, Equivalence and Categorical Shift. All results are consolidated in a Summary presented at the end of the document.

Proportion comparison tests in clinical research refer to trials evaluated in terms of discrete clinical endpoints. The objectives of such trials usually include the evaluation of clinical response (including complete response, partial response and stable disease), survival in cancer trials, and the presence of adverse events. The equations applied in this study (mostly obtained from the book Sample Size Calculations in Clinical Research and implemented using base R) provide calculations for the sample size and statistical power based on the study objectives and hypotheses of interest.

1.1 One-Sample Design (OSD)


One-Sample Design involves individual binary response classes from sampled subjects. In clinical research, classes could be the binary indicator of responsiveness to the intervention applied within a treatment group. Responses are assumed to be independent and identically distributed binary random variables whose success probability equals the true response rate, with the tests relying on the usual large-sample normal approximation. Parameters needed for the computations are as follows:

[A] True Response Rate : True response rate obtained from sampled subjects

[B] Reference : Baseline value for comparison

[C] Alpha : Maximum allowed Type I error

[D] Beta : Maximum allowed Type II error

[E] Delta :
     [E.1] Minimum testing margin to establish that the true response rate is much better than the reference
     [E.2] Minimum testing margin to establish that the true response rate is not much worse than the reference
     [E.3] Minimum testing margin to establish that the true response rate is neither much better nor much worse than the reference

A case scenario is provided below to demonstrate sample size computation and power analysis as applied to different types of hypotheses:

[Case Scenario] A clinical research study is being pursued concerning osteoporosis in post-menopausal women. Osteoporosis and osteopenia (or decreased bone mass) most commonly develop in post-menopausal women. The consequences of osteoporosis are vertebral crush fractures and hip fractures. It is of interest to evaluate the treatment effect in terms of the response rate at the end of the study. A subject may be defined as a responder if there is an improvement in bone density by more than one standard deviation (SD) or 30% of the measurements of bone density.

1.1.1 Test for Equality (OSD_EQUALITY)


[Research Question] Suppose that the response rate of the patient population under study after treatment is expected to range from 40% to 60% in increments of 5%. For prevention of progression from osteopenia to osteoporosis, the study aims to demonstrate that the true response rate post-treatment and a reference rate of 15% are different. Given a level of significance of 0.05, evaluate the sample size range for a test of equality to achieve statistical test powers equal to 70%, 80% and 90%.

[Null Hypothesis] Post-treatment true response rate - Reference response rate equals 0

[Alternative Hypothesis] Post-treatment true response rate - Reference response rate does not equal 0

[Hypothesis Test] Rejection of the null hypothesis implies difference between the true response rate post-treatment and reference rate

[A] True Response Rate : 0.40 to 0.60 at 0.05 intervals

[B] Reference : 0.15

[C] Alpha : 0.05

[D] Beta : 0.30, 0.20 and 0.10 corresponding to 70%, 80% and 90% statistical powers

[E] Delta : 0 testing margin for the Test of Equality hypothesis
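
The code chunk below implements the one-sample test-of-equality formulas from Sample Size Calculations in Clinical Research, with p the hypothesized true response rate, p0 the reference rate and z_q the q-quantile of the standard normal distribution:

```latex
n = p(1-p)\left(\frac{z_{1-\alpha/2} + z_{1-\beta}}{p - p_{0}}\right)^{2},
\qquad
\text{Power} = \Phi\left(z - z_{1-\alpha/2}\right) + \Phi\left(-z - z_{1-\alpha/2}\right),
\quad \text{where } z = \frac{p - p_{0}}{\sqrt{p(1-p)/n}}
```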

Code Chunk | Output
##################################
# Loading R libraries
##################################
library(moments)
library(car)
library(multcomp)
library(effects)
library(psych)
library(ggplot2)
library(dplyr)
library(ggpubr)
library(rstatix)
library(ggfortify)
library(trend)
library(tidyr)

##################################
# Defining a range of possible hypothesized values for p_ResponseRate
##################################
p_ResponseRate=seq(0.4,0.6,0.05)

##################################
# Defining a fixed hypothesized value for p_Reference
##################################
p_Reference=0.15

##################################
# Defining a fixed hypothesized value for the Type I error
##################################
alpha=0.05

##################################
# Defining a range of possible hypothesized values for the Type II error
##################################
beta=c(0.30,0.20,0.10)

##################################
# Computing the range of sample sizes based on different levels of Type II errors
##################################
beta_1=beta[1]
(n_Beta30=p_ResponseRate*(1-p_ResponseRate)*
    ((qnorm(1-alpha/2)+qnorm(1-beta_1))/(p_ResponseRate-p_Reference))^2)
## [1] 23.700737 16.973184 12.596055  9.547416  7.315042
ceiling(n_Beta30)
## [1] 24 17 13 10  8
z_Beta30=(p_ResponseRate-p_Reference)/sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta30)
(Power_Beta30=pnorm(z_Beta30-qnorm(1-alpha/2))+pnorm(-z_Beta30-qnorm(1-alpha/2)))
## [1] 0.7000044 0.7000044 0.7000044 0.7000044 0.7000044
beta_2=beta[2]
(n_Beta20=p_ResponseRate*(1-p_ResponseRate)*
    ((qnorm(1-alpha/2)+qnorm(1-beta_2))/(p_ResponseRate-p_Reference))^2)
## [1] 30.139698 21.584419 16.018122 12.141236  9.302376
ceiling(n_Beta20)
## [1] 31 22 17 13 10
z_Beta20=(p_ResponseRate-p_Reference)/sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta20)
(Power_Beta20=pnorm(z_Beta20-qnorm(1-alpha/2))+pnorm(-z_Beta20-qnorm(1-alpha/2)))
## [1] 0.800001 0.800001 0.800001 0.800001 0.800001
beta_3=beta[3]
(n_Beta10=p_ResponseRate*(1-p_ResponseRate)*
    ((qnorm(1-alpha/2)+qnorm(1-beta_3))/(p_ResponseRate-p_Reference))^2)
## [1] 40.34850 28.89541 21.44372 16.25367 12.45324
ceiling(n_Beta10)
## [1] 41 29 22 17 13
z_Beta10=(p_ResponseRate-p_Reference)/sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta10)
(Power_Beta10=pnorm(z_Beta10-qnorm(1-alpha/2))+pnorm(-z_Beta10-qnorm(1-alpha/2)))
## [1] 0.9000001 0.9000001 0.9000001 0.9000001 0.9000001
OneSampleDesign.Equality <- as.data.frame(cbind(p_ResponseRate,n_Beta30,n_Beta20,n_Beta10))
names(OneSampleDesign.Equality) <- c("Posttreatment.True.Response.Rate",
                                 "Power=70%",
                                 "Power=80%",
                                 "Power=90%")

##################################
# Restructuring the data
##################################
OneSampleDesign.Equality.Reshaped <- gather(OneSampleDesign.Equality,
                                            "Power=70%","Power=80%","Power=90%",
                                            key="Power",
                                            value="Sample.Size")
OneSampleDesign.Equality.Reshaped$Label <- rep("OSD_EQUALITY",nrow(OneSampleDesign.Equality.Reshaped))

OneSampleDesign.Equality.Reshaped
##    Posttreatment.True.Response.Rate     Power Sample.Size        Label
## 1                              0.40 Power=70%   23.700737 OSD_EQUALITY
## 2                              0.45 Power=70%   16.973184 OSD_EQUALITY
## 3                              0.50 Power=70%   12.596055 OSD_EQUALITY
## 4                              0.55 Power=70%    9.547416 OSD_EQUALITY
## 5                              0.60 Power=70%    7.315042 OSD_EQUALITY
## 6                              0.40 Power=80%   30.139698 OSD_EQUALITY
## 7                              0.45 Power=80%   21.584419 OSD_EQUALITY
## 8                              0.50 Power=80%   16.018122 OSD_EQUALITY
## 9                              0.55 Power=80%   12.141236 OSD_EQUALITY
## 10                             0.60 Power=80%    9.302376 OSD_EQUALITY
## 11                             0.40 Power=90%   40.348505 OSD_EQUALITY
## 12                             0.45 Power=90%   28.895413 OSD_EQUALITY
## 13                             0.50 Power=90%   21.443721 OSD_EQUALITY
## 14                             0.55 Power=90%   16.253670 OSD_EQUALITY
## 15                             0.60 Power=90%   12.453242 OSD_EQUALITY
##################################
# Plotting the sample sizes and power curves
##################################
(OneSampleDesign.Equality.PowerAnalysis <- ggplot(OneSampleDesign.Equality.Reshaped,
                                                      aes(x=Posttreatment.True.Response.Rate,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Post-Treatment True Response Rate",
                      limits=c(0.40,0.60),
                      breaks=seq(0.40,0.60,
                                 by=0.05)) +
  scale_y_continuous(name="Treatment Group Sample Size",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50)) +
  ggtitle("One-Sample Design : Test for Equality"))

1.1.2 Test for Non-Inferiority (OSD_NONINFERIORITY)


[Research Question] Suppose that the response rate of the patient population under study after treatment is expected to range from 40% to 60% in increments of 5%. For prevention of progression from osteopenia to osteoporosis, the study aims to demonstrate that the true response rate post-treatment is not much worse than the reference rate of 15% by a clinically meaningful difference equal to −5%. Given a level of significance of 0.05, evaluate the sample size range for a test of non-inferiority to achieve statistical test powers equal to 70%, 80% and 90%.

[Null Hypothesis] Post-treatment true response rate - Reference response rate <= Clinically meaningful difference

[Alternative Hypothesis] Post-treatment true response rate - Reference response rate > Clinically meaningful difference

[Hypothesis Test] Rejection of the null hypothesis implies non-inferiority of the true response rate post-treatment as compared to the reference rate, given the delta

[A] True Response Rate : 0.40 to 0.60 at 0.05 intervals

[B] Reference : 0.15

[C] Alpha : 0.05

[D] Beta : 0.30, 0.20 and 0.10 corresponding to 70%, 80% and 90% statistical powers

[E] Delta : -5% testing margin for the Test of Non-Inferiority hypothesis
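
The code chunk below implements the corresponding one-sided (non-inferiority) formulas, where the margin delta enters the effect size and z_{1-α} replaces z_{1-α/2}; the second power term is negligible for a one-sided test and is retained for symmetry with the two-sided case:

```latex
n = p(1-p)\left(\frac{z_{1-\alpha} + z_{1-\beta}}{p - p_{0} - \delta}\right)^{2},
\qquad
\text{Power} = \Phi\left(z - z_{1-\alpha}\right) + \Phi\left(-z - z_{1-\alpha}\right),
\quad \text{where } z = \frac{p - p_{0} - \delta}{\sqrt{p(1-p)/n}}
```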

Code Chunk | Output
##################################
# Defining a range of possible hypothesized values for p_ResponseRate
##################################
p_ResponseRate=seq(0.4,0.6,0.05)

##################################
# Defining a fixed hypothesized value for p_Reference
##################################
p_Reference=0.15

##################################
# Defining a fixed hypothesized value for the Type I error
##################################
alpha=0.05

##################################
# Defining a range of possible hypothesized values for the Type II error
##################################
beta=c(0.30,0.20,0.10)

##################################
# Defining the non-inferiority delta
##################################
delta=-0.05

##################################
# Computing the range of sample sizes based on different levels of Type II errors
##################################
beta_1=beta[1]
(n_Beta30=p_ResponseRate*(1-p_ResponseRate)*
    ((qnorm(1-alpha)+qnorm(1-beta_1))/(p_ResponseRate-p_Reference-delta))^2)
## [1] 12.548436  9.507361  7.352599  5.751367  4.517437
ceiling(n_Beta30)
## [1] 13 10  8  6  5
z_Beta30=(p_ResponseRate-p_Reference-delta)/sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta30)
(Power_Beta30=pnorm(z_Beta30-qnorm(1-alpha))+pnorm(-z_Beta30-qnorm(1-alpha)))
## [1] 0.7000683 0.7000683 0.7000683 0.7000683 0.7000683
beta_2=beta[2]
(n_Beta20=p_ResponseRate*(1-p_ResponseRate)*
    ((qnorm(1-alpha)+qnorm(1-beta_2))/(p_ResponseRate-p_Reference-delta))^2)
## [1] 16.486819 12.491289  9.660246  7.556459  5.935255
ceiling(n_Beta20)
## [1] 17 13 10  8  6
z_Beta20=(p_ResponseRate-p_Reference-delta)/sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta20)
(Power_Beta20=pnorm(z_Beta20-qnorm(1-alpha))+pnorm(-z_Beta20-qnorm(1-alpha)))
## [1] 0.800018 0.800018 0.800018 0.800018 0.800018
beta_3=beta[3]
(n_Beta10=p_ResponseRate*(1-p_ResponseRate)*
    ((qnorm(1-alpha)+qnorm(1-beta_3))/(p_ResponseRate-p_Reference-delta))^2)
## [1] 22.836926 17.302467 13.381011 10.466925  8.221293
ceiling(n_Beta10)
## [1] 23 18 14 11  9
z_Beta10=(p_ResponseRate-p_Reference-delta)/sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta10)
(Power_Beta10=pnorm(z_Beta10-qnorm(1-alpha))+pnorm(-z_Beta10-qnorm(1-alpha)))
## [1] 0.9000024 0.9000024 0.9000024 0.9000024 0.9000024
OneSampleDesign.NonInferiority <- as.data.frame(cbind(p_ResponseRate,n_Beta30,n_Beta20,n_Beta10))
names(OneSampleDesign.NonInferiority) <- c("Posttreatment.True.Response.Rate",
                                 "Power=70%",
                                 "Power=80%",
                                 "Power=90%")

##################################
# Restructuring the data
##################################
OneSampleDesign.NonInferiority.Reshaped <- gather(OneSampleDesign.NonInferiority,
                                            "Power=70%","Power=80%","Power=90%",
                                            key="Power",
                                            value="Sample.Size")
OneSampleDesign.NonInferiority.Reshaped$Label <- rep("OSD_NONINFERIORITY",nrow(OneSampleDesign.NonInferiority.Reshaped))

OneSampleDesign.NonInferiority.Reshaped
##    Posttreatment.True.Response.Rate     Power Sample.Size              Label
## 1                              0.40 Power=70%   12.548436 OSD_NONINFERIORITY
## 2                              0.45 Power=70%    9.507361 OSD_NONINFERIORITY
## 3                              0.50 Power=70%    7.352599 OSD_NONINFERIORITY
## 4                              0.55 Power=70%    5.751367 OSD_NONINFERIORITY
## 5                              0.60 Power=70%    4.517437 OSD_NONINFERIORITY
## 6                              0.40 Power=80%   16.486819 OSD_NONINFERIORITY
## 7                              0.45 Power=80%   12.491289 OSD_NONINFERIORITY
## 8                              0.50 Power=80%    9.660246 OSD_NONINFERIORITY
## 9                              0.55 Power=80%    7.556459 OSD_NONINFERIORITY
## 10                             0.60 Power=80%    5.935255 OSD_NONINFERIORITY
## 11                             0.40 Power=90%   22.836926 OSD_NONINFERIORITY
## 12                             0.45 Power=90%   17.302467 OSD_NONINFERIORITY
## 13                             0.50 Power=90%   13.381011 OSD_NONINFERIORITY
## 14                             0.55 Power=90%   10.466925 OSD_NONINFERIORITY
## 15                             0.60 Power=90%    8.221293 OSD_NONINFERIORITY
##################################
# Plotting the sample sizes and power curves
##################################
(OneSampleDesign.NonInferiority.PowerAnalysis <- ggplot(OneSampleDesign.NonInferiority.Reshaped,
                                                      aes(x=Posttreatment.True.Response.Rate,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Post-Treatment True Response Rate",
                      limits=c(0.40,0.60),
                      breaks=seq(0.40,0.60,
                                 by=0.05)) +
  scale_y_continuous(name="Treatment Group Sample Size",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50)) +
  ggtitle("One-Sample Design : Test for Non-Inferiority"))

1.1.3 Test for Superiority (OSD_SUPERIORITY)


[Research Question] Suppose that the response rate of the patient population under study after treatment is expected to range from 40% to 60% in increments of 5%. For prevention of progression from osteopenia to osteoporosis, the study aims to demonstrate that the true response rate post-treatment is much better than the reference rate of 15% by a clinically meaningful difference equal to +5%. Given a level of significance of 0.05, evaluate the sample size range for a test of superiority to achieve statistical test powers equal to 70%, 80% and 90%.

[Null Hypothesis] Post-treatment true response rate - Reference response rate <= Clinically meaningful difference

[Alternative Hypothesis] Post-treatment true response rate - Reference response rate > Clinically meaningful difference

[Hypothesis Test] Rejection of the null hypothesis implies superiority of the true response rate post-treatment as compared to the reference rate, given the delta

[A] True Response Rate : 0.40 to 0.60 at 0.05 intervals

[B] Reference : 0.15

[C] Alpha : 0.05

[D] Beta : 0.30, 0.20 and 0.10 corresponding to 70%, 80% and 90% statistical powers

[E] Delta : +5% testing margin for the Test of Superiority hypothesis
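
The test for superiority applies the same one-sided formulas as the test for non-inferiority; only the sign of the margin differs. A positive delta shrinks the effect size p − p0 − δ, which is why superiority demands larger samples than non-inferiority at the same power:

```latex
n = p(1-p)\left(\frac{z_{1-\alpha} + z_{1-\beta}}{p - p_{0} - \delta}\right)^{2},
\qquad
\text{Power} = \Phi\left(z - z_{1-\alpha}\right) + \Phi\left(-z - z_{1-\alpha}\right),
\quad \text{where } z = \frac{p - p_{0} - \delta}{\sqrt{p(1-p)/n}},\ \delta > 0
```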

Code Chunk | Output
##################################
# Defining a range of possible hypothesized values for p_ResponseRate
##################################
p_ResponseRate=seq(0.4,0.6,0.05)

##################################
# Defining a fixed hypothesized value for p_Reference
##################################
p_Reference=0.15

##################################
# Defining a fixed hypothesized value for the Type I error
##################################
alpha=0.05

##################################
# Defining a range of possible hypothesized values for the Type II error
##################################
beta=c(0.30,0.20,0.10)

##################################
# Defining the Superiority delta
##################################
delta=0.05

##################################
# Computing the range of sample sizes based on different levels of Type II errors
##################################
beta_1=beta[1]
(n_Beta30=p_ResponseRate*(1-p_ResponseRate)*
    ((qnorm(1-alpha)+qnorm(1-beta_1))/(p_ResponseRate-p_Reference-delta))^2)
## [1] 28.233981 18.634428 13.071288  9.507361  7.058495
ceiling(n_Beta30)
## [1] 29 19 14 10  8
z_Beta30=(p_ResponseRate-p_Reference-delta)/sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta30)
(Power_Beta30=pnorm(z_Beta30-qnorm(1-alpha))+pnorm(-z_Beta30-qnorm(1-alpha)))
## [1] 0.7000683 0.7000683 0.7000683 0.7000683 0.7000683
beta_2=beta[2]
(n_Beta20=p_ResponseRate*(1-p_ResponseRate)*
    ((qnorm(1-alpha)+qnorm(1-beta_2))/(p_ResponseRate-p_Reference-delta))^2)
## [1] 37.095343 24.482927 17.173770 12.491289  9.273836
ceiling(n_Beta20)
## [1] 38 25 18 13 10
z_Beta20=(p_ResponseRate-p_Reference-delta)/sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta20)
(Power_Beta20=pnorm(z_Beta20-qnorm(1-alpha))+pnorm(-z_Beta20-qnorm(1-alpha)))
## [1] 0.800018 0.800018 0.800018 0.800018 0.800018
beta_3=beta[3]
(n_Beta10=p_ResponseRate*(1-p_ResponseRate)*
    ((qnorm(1-alpha)+qnorm(1-beta_3))/(p_ResponseRate-p_Reference-delta))^2)
## [1] 51.38308 33.91284 23.78846 17.30247 12.84577
ceiling(n_Beta10)
## [1] 52 34 24 18 13
z_Beta10=(p_ResponseRate-p_Reference-delta)/sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta10)
(Power_Beta10=pnorm(z_Beta10-qnorm(1-alpha))+pnorm(-z_Beta10-qnorm(1-alpha)))
## [1] 0.9000024 0.9000024 0.9000024 0.9000024 0.9000024
OneSampleDesign.Superiority <- as.data.frame(cbind(p_ResponseRate,n_Beta30,n_Beta20,n_Beta10))
names(OneSampleDesign.Superiority) <- c("Posttreatment.True.Response.Rate",
                                 "Power=70%",
                                 "Power=80%",
                                 "Power=90%")

##################################
# Restructuring the data
##################################
OneSampleDesign.Superiority.Reshaped <- gather(OneSampleDesign.Superiority,
                                            "Power=70%","Power=80%","Power=90%",
                                            key="Power",
                                            value="Sample.Size")
OneSampleDesign.Superiority.Reshaped$Label <- rep("OSD_SUPERIORITY",nrow(OneSampleDesign.Superiority.Reshaped))

OneSampleDesign.Superiority.Reshaped
##    Posttreatment.True.Response.Rate     Power Sample.Size           Label
## 1                              0.40 Power=70%   28.233981 OSD_SUPERIORITY
## 2                              0.45 Power=70%   18.634428 OSD_SUPERIORITY
## 3                              0.50 Power=70%   13.071288 OSD_SUPERIORITY
## 4                              0.55 Power=70%    9.507361 OSD_SUPERIORITY
## 5                              0.60 Power=70%    7.058495 OSD_SUPERIORITY
## 6                              0.40 Power=80%   37.095343 OSD_SUPERIORITY
## 7                              0.45 Power=80%   24.482927 OSD_SUPERIORITY
## 8                              0.50 Power=80%   17.173770 OSD_SUPERIORITY
## 9                              0.55 Power=80%   12.491289 OSD_SUPERIORITY
## 10                             0.60 Power=80%    9.273836 OSD_SUPERIORITY
## 11                             0.40 Power=90%   51.383084 OSD_SUPERIORITY
## 12                             0.45 Power=90%   33.912836 OSD_SUPERIORITY
## 13                             0.50 Power=90%   23.788465 OSD_SUPERIORITY
## 14                             0.55 Power=90%   17.302467 OSD_SUPERIORITY
## 15                             0.60 Power=90%   12.845771 OSD_SUPERIORITY
##################################
# Plotting the sample sizes and power curves
##################################
(OneSampleDesign.Superiority.PowerAnalysis <- ggplot(OneSampleDesign.Superiority.Reshaped,
                                                      aes(x=Posttreatment.True.Response.Rate,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Post-Treatment True Response Rate",
                      limits=c(0.40,0.60),
                      breaks=seq(0.40,0.60,
                                 by=0.05)) +
  scale_y_continuous(name="Treatment Group Sample Size",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50)) +
  ggtitle("One-Sample Design : Test for Superiority"))

1.1.4 Test for Equivalence (OSD_EQUIVALENCE)


[Research Question] Suppose that the response rate of the patient population under study after treatment is expected to range from 40% to 60% in increments of 5%. For a special investigation, the study aims to demonstrate that the true response rate post-treatment is neither much worse nor much better than the reference rate of 15% by a clinically meaningful difference equal to 5%. Given a level of significance of 0.05, evaluate the sample size range for a test of equivalence to achieve statistical test powers equal to 70%, 80% and 90%.

[Null Hypothesis] Absolute value of (Post-treatment true response rate - Reference response rate) >= Clinically meaningful difference

[Alternative Hypothesis] Absolute value of (Post-treatment true response rate - Reference response rate) < Clinically meaningful difference

[Hypothesis Test] Rejection of the null hypothesis implies equivalence of the true response rate post-treatment as compared to the reference rate, given the delta

[A] True Response Rate : 0.40 to 0.60 at 0.05 intervals

[B] Reference : 0.15

[C] Alpha : 0.05

[D] Beta : 0.30, 0.20 and 0.10 corresponding to 70%, 80% and 90% statistical powers

[E] Delta : +5% testing margin for the Test of Equivalence hypothesis
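
The code chunk below implements the equivalence formulas, in which the Type II error is split across the two one-sided tests (hence z_{1-β/2}) and the margin is compared against the absolute difference:

```latex
n = p(1-p)\left(\frac{z_{1-\alpha} + z_{1-\beta/2}}{\,|p - p_{0}| - \delta\,}\right)^{2},
\qquad
\text{Power} = 2\left[\Phi\left(z - z_{1-\alpha}\right) + \Phi\left(-z - z_{1-\alpha}\right)\right] - 1,
\quad \text{where } z = \frac{|p - p_{0}| - \delta}{\sqrt{p(1-p)/n}}
```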

Code Chunk | Output
##################################
# Defining a range of possible hypothesized values for p_ResponseRate
##################################
p_ResponseRate=seq(0.4,0.6,0.05)

##################################
# Defining a fixed hypothesized value for p_Reference
##################################
p_Reference=0.15

##################################
# Defining a fixed hypothesized value for the Type I error
##################################
alpha=0.05

##################################
# Defining a range of possible hypothesized values for the Type II error
##################################
beta=c(0.30,0.20,0.10)

##################################
# Defining the Equivalence delta
##################################
delta=0.05

##################################
# Computing the range of sample sizes based on different levels of Type II errors
##################################
beta_1=beta[1]
(n_Beta30=p_ResponseRate*(1-p_ResponseRate)*
    ((qnorm(1-alpha)+qnorm(1-beta_1/2))/(abs(p_ResponseRate-p_Reference)-delta))^2)
## [1] 43.13580 28.46963 19.97028 14.52532 10.78395
ceiling(n_Beta30)
## [1] 44 29 20 15 11
z_Beta30=(abs(p_ResponseRate-p_Reference)-delta)/
  sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta30)
(Power_Beta30=2*(pnorm(z_Beta30-qnorm(1-alpha))+pnorm(-z_Beta30-qnorm(1-alpha)))-1)
## [1] 0.7000152 0.7000152 0.7000152 0.7000152 0.7000152
beta_2=beta[2]
(n_Beta20=p_ResponseRate*(1-p_ResponseRate)*
    ((qnorm(1-alpha)+qnorm(1-beta_2/2))/(abs(p_ResponseRate-p_Reference)-delta))^2)
## [1] 51.38308 33.91284 23.78846 17.30247 12.84577
ceiling(n_Beta20)
## [1] 52 34 24 18 13
z_Beta20=(abs(p_ResponseRate-p_Reference)-delta)/
  sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta20)
(Power_Beta20=2*(pnorm(z_Beta20-qnorm(1-alpha))+pnorm(-z_Beta20-qnorm(1-alpha)))-1)
## [1] 0.8000048 0.8000048 0.8000048 0.8000048 0.8000048
beta_3=beta[3]
(n_Beta10=p_ResponseRate*(1-p_ResponseRate)*
    ((qnorm(1-alpha)+qnorm(1-beta_3/2))/(abs(p_ResponseRate-p_Reference)-delta))^2)
## [1] 64.93304 42.85581 30.06159 21.86521 16.23326
ceiling(n_Beta10)
## [1] 65 43 31 22 17
z_Beta10=(abs(p_ResponseRate-p_Reference)-delta)/
  sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta10)
(Power_Beta10=2*(pnorm(z_Beta10-qnorm(1-alpha))+pnorm(-z_Beta10-qnorm(1-alpha)))-1)
## [1] 0.9000008 0.9000008 0.9000008 0.9000008 0.9000008
OneSampleDesign.Equivalence <- as.data.frame(cbind(p_ResponseRate,n_Beta30,n_Beta20,n_Beta10))
names(OneSampleDesign.Equivalence) <- c("Posttreatment.True.Response.Rate",
                                 "Power=70%",
                                 "Power=80%",
                                 "Power=90%")

##################################
# Restructuring the data
##################################
OneSampleDesign.Equivalence.Reshaped <- gather(OneSampleDesign.Equivalence,
                                            "Power=70%","Power=80%","Power=90%",
                                            key="Power",
                                            value="Sample.Size")
OneSampleDesign.Equivalence.Reshaped$Label <- rep("OSD_EQUIVALENCE",nrow(OneSampleDesign.Equivalence.Reshaped))

OneSampleDesign.Equivalence.Reshaped
##    Posttreatment.True.Response.Rate     Power Sample.Size           Label
## 1                              0.40 Power=70%    43.13580 OSD_EQUIVALENCE
## 2                              0.45 Power=70%    28.46963 OSD_EQUIVALENCE
## 3                              0.50 Power=70%    19.97028 OSD_EQUIVALENCE
## 4                              0.55 Power=70%    14.52532 OSD_EQUIVALENCE
## 5                              0.60 Power=70%    10.78395 OSD_EQUIVALENCE
## 6                              0.40 Power=80%    51.38308 OSD_EQUIVALENCE
## 7                              0.45 Power=80%    33.91284 OSD_EQUIVALENCE
## 8                              0.50 Power=80%    23.78846 OSD_EQUIVALENCE
## 9                              0.55 Power=80%    17.30247 OSD_EQUIVALENCE
## 10                             0.60 Power=80%    12.84577 OSD_EQUIVALENCE
## 11                             0.40 Power=90%    64.93304 OSD_EQUIVALENCE
## 12                             0.45 Power=90%    42.85581 OSD_EQUIVALENCE
## 13                             0.50 Power=90%    30.06159 OSD_EQUIVALENCE
## 14                             0.55 Power=90%    21.86521 OSD_EQUIVALENCE
## 15                             0.60 Power=90%    16.23326 OSD_EQUIVALENCE
##################################
# Plotting the sample sizes and power curves
##################################
(OneSampleDesign.Equivalence.PowerAnalysis <- ggplot(OneSampleDesign.Equivalence.Reshaped,
                                                      aes(x=Posttreatment.True.Response.Rate,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Post-Treatment True Response Rate",
                      limits=c(0.40,0.60),
                      breaks=seq(0.40,0.60,
                                 by=0.05)) +
  scale_y_continuous(name="Treatment Group Sample Size",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50)) +
  ggtitle("One-Sample Design : Test for Equivalence"))

1.1.5 Consolidated Evaluation


[A] The minimum sample size required for the treatment group in a One-Sample Design study is dependent upon the following factors defined and hypothesized prior to the clinical research:
     [A.1] Study Hypothesis (Equality, Non-Inferiority, Superiority, Equivalence)
     [A.2] Level of significance (Alpha) : Lower value increases sample size
     [A.3] Power of the test (1 - Beta) : Higher power (equivalently, lower Beta) increases sample size
     [A.4] Epsilon (Difference between True Response Rate and Reference Rate) : Lower value increases sample size
     [A.5] Delta (Testing margin, as applicable) : Higher value increases sample size
[B] Given the level of significance = 5%, reference rate = 15% and true response rate after treatment = 40% to 60% (representing epsilon values of 25% to 45%), the range of sample sizes is given in the respective power analysis provided for each study hypothesis -
     [B.1] OSD_EQUALITY :
            [B.1.1] For 70% power, sample size = 8 to 24
            [B.1.2] For 80% power, sample size = 10 to 31
            [B.1.3] For 90% power, sample size = 13 to 41
     [B.2] OSD_NONINFERIORITY with Delta = -5% :
            [B.2.1] For 70% power, sample size = 5 to 13
            [B.2.2] For 80% power, sample size = 6 to 17
            [B.2.3] For 90% power, sample size = 9 to 23
     [B.3] OSD_SUPERIORITY with Delta = +5% :
            [B.3.1] For 70% power, sample size = 8 to 29
            [B.3.2] For 80% power, sample size = 10 to 38
            [B.3.3] For 90% power, sample size = 13 to 52
     [B.4] OSD_EQUIVALENCE with Delta = +10% :
            [B.4.1] For 70% power, sample size = 11 to 44
            [B.4.2] For 80% power, sample size = 13 to 52
            [B.4.3] For 90% power, sample size = 17 to 65

Code Chunk | Output
##################################
# Consolidating the power analyses charts
##################################

OSD_EQUALITY.PowerAnalysis <- ggplot(OneSampleDesign.Equality.Reshaped,
                                     aes(x=Posttreatment.True.Response.Rate,
                                         y=Sample.Size,
                                         color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   facet_grid(. ~ Label) +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Post-Treatment True Response Rate",
                      limits=c(0.40,0.60),
                      breaks=seq(0.40,0.60,
                                 by=0.05)) +
  scale_y_continuous(name="Treatment Group Sample Size",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50)) 

OSD_NONINFERIORITY.PowerAnalysis <- ggplot(OneSampleDesign.NonInferiority.Reshaped,
                                           aes(x=Posttreatment.True.Response.Rate,
                                               y=Sample.Size, 
                                               color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   facet_grid(. ~ Label) +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Post-Treatment True Response Rate",
                      limits=c(0.40,0.60),
                      breaks=seq(0.40,0.60,
                                 by=0.05)) +
  scale_y_continuous(name="Treatment Group Sample Size",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50))

OSD_SUPERIORITY.PowerAnalysis <- ggplot(OneSampleDesign.Superiority.Reshaped,
                                        aes(x=Posttreatment.True.Response.Rate,
                                            y=Sample.Size, 
                                            color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   facet_grid(. ~ Label) +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Post-Treatment True Response Rate",
                      limits=c(0.40,0.60),
                      breaks=seq(0.40,0.60,
                                 by=0.05)) +
  scale_y_continuous(name="Treatment Group Sample Size",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50))

OSD_EQUIVALENCE.PowerAnalysis <- ggplot(OneSampleDesign.Equivalence.Reshaped,
                                        aes(x=Posttreatment.True.Response.Rate,
                                            y=Sample.Size,
                                            color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   facet_grid(. ~ Label) +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Post-Treatment True Response Rate",
                      limits=c(0.40,0.60),
                      breaks=seq(0.40,0.60,
                                 by=0.05)) +
  scale_y_continuous(name="Treatment Group Sample Size",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50))


OSD_PowerAnalysis <- ggarrange(OSD_EQUALITY.PowerAnalysis,
                               OSD_NONINFERIORITY.PowerAnalysis,
                               OSD_SUPERIORITY.PowerAnalysis,
                               OSD_EQUIVALENCE.PowerAnalysis,
                               ncol=2, nrow=2)

annotate_figure(OSD_PowerAnalysis,
                top = text_grob("One-Sample Design Power Analysis by Statistical Hypothesis",
                                color = "black",
                                face = "bold",
                                size = 14))
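As a cross-check on the directional effects listed in the consolidated evaluation, the one-sample equality computation can be factored into a small helper function. This is a minimal sketch assuming the standard one-sample equality formula from Sample Size Calculations in Clinical Research; the function name osd_equality_n is introduced here for illustration only.

```r
##################################
# Sketch: one-sample equality sample size,
# n = p(1-p) * ((z_[1-alpha/2] + z_[1-beta]) / (p - p0))^2
##################################
osd_equality_n <- function(p, p0, alpha, beta) {
  ceiling(p * (1 - p) * ((qnorm(1 - alpha / 2) + qnorm(1 - beta)) /
                           (p - p0))^2)
}

# Reproducing the 80%-power endpoints reported in [B.1.2]
osd_equality_n(p = 0.40, p0 = 0.15, alpha = 0.05, beta = 0.20)
## [1] 31
osd_equality_n(p = 0.60, p0 = 0.15, alpha = 0.05, beta = 0.20)
## [1] 10
```

Lowering alpha or beta in the calls above increases the returned sample size, consistent with factors [A.2] and [A.3].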

1.2 Two-Sample Unpaired Design (TSUD)


Two-Sample Unpaired Design involves individual proportions from sampled subjects in treatment and control groups. In clinical research, responses could be compared between a test drug and either a placebo control or an active control agent. Responses are assumed to be independent and identically distributed normal random variables with proportion equal to the true response rate for each group. Parameters needed for the computations are as follows:

[A] Treatment Response Rate : Response rate obtained from sampled subjects in the treatment group

[B] Control Response Rate : Response rate obtained from sampled subjects in the control group

[C] Epsilon : Response rate differences between the treatment and control groups

[D] Alpha : Maximum allowed Type I error

[E] Beta : Maximum allowed Type II error

[F] Kappa : Matching ratio between the Treatment and Control groups

[G] Delta :
     [G.1] Maximum testing margin by which the treatment response rate may fall below the control while still being considered non-inferior
     [G.2] Minimum testing margin to establish that the treatment response rate is much better than control
     [G.3] Maximum testing margin within which the treatment and control response rates are considered equivalent

A case scenario is provided below to demonstrate sample size computation and power analysis as applied to different types of hypotheses:

[Case Scenario] A clinical research study is being pursued concerning the evaluation of anti-infective agents in the treatment of patients with skin and skin structure infections. As is well known, gram-positive and gram-negative pathogens such as streptococci, staphylococci, and various strains of Enterobacteriaceae are commonly associated with skin and skin structure infections. For the evaluation of the effectiveness of a test antibiotic agent, clinical assessments and cultures are usually done at post-treatment visits (e.g., between 4 and 8 days after treatment has been completed) but prior to treatment with another anti-microbial agent. If the culture is positive, the pathogen(s) is usually identified and susceptibility testing is performed. The effectiveness of therapy is usually assessed based on clinical and bacteriological responses at the post-treatment visit. For example, clinical responses may include cure (e.g., no signs of skin infection at post-treatment visits), improvement (e.g., the skin infection has resolved to the extent that no further systemic antibiotic therapy is needed based on the best judgment of the investigator), and failure (e.g., lack of significant improvement in the signs and symptoms of the skin infection at or before post-treatment visits such that a change in antibiotic treatment is required). On the other hand, bacteriological responses may include cure (e.g., all pathogens eradicated at post-treatment day 4-8 or material suitable for culturing has diminished to a degree that proper cultures cannot be obtained), colonization (e.g., isolation of pathogen(s) from the original site of infection in the absence of local or systemic signs of infection at post-treatment visits), and failure (e.g., any pathogen(s) isolated at post-treatment visits coupled with the investigator’s decision to prescribe alternate antibiotic therapy).
Suppose that a pharmaceutical company is interested in conducting a clinical trial to compare the efficacy, safety, and tolerability of two antimicrobial agents when administered orally once daily in the treatment of patients with skin and skin structure infections.

1.2.1 Test for Equality (TSUD_EQUALITY)


[Research Question] Suppose that a difference of 20% to 30% in the clinical cure rate is considered clinically meaningful between the two compared anti-microbial agents. The study aims to demonstrate that the cure rates of the new anti-microbial agent and the active control agent, which has a true cure rate of 65%, are different. Given a level of significance of 0.05 and a matching ratio between both groups equal to 1, evaluate the range of sample sizes for a test of equality achieving statistical powers of 70%, 80% and 90%.

[Null Hypothesis] Cure rate of the new anti-microbial agent - Cure rate of the active control agent equals 0

[Alternative Hypothesis] Cure rate of the new anti-microbial agent - Cure rate of the active control agent does not equal 0

[Hypothesis Test] Rejection of the null hypothesis implies a difference between the cure rates of the two anti-microbial agents

[A] Epsilon : 20 to 30% at 2% intervals
     [A.1] Control Response Rate : 65%
     [A.2] Treatment Response Rate : 85% to 95% at 2% intervals

[B] Alpha : 0.05

[C] Beta : 0.30, 0.20 and 0.10 corresponding to 70%, 80% and 90% statistical powers

[D] Kappa : 1.0

[E] Delta : 0 testing margin for the Test of Equality hypothesis

Code Chunk | Output
##################################
# Defining a range of possible hypothesized values for epsilon
##################################
epsilon=seq(0.20,0.30,0.02)

##################################
# Defining a fixed hypothesized value for p_Reference
##################################
p_Reference=0.65

##################################
# Defining a range of possible hypothesized values for p_ResponseRate
##################################
(p_ResponseRate=p_Reference + epsilon)
## [1] 0.85 0.87 0.89 0.91 0.93 0.95
##################################
# Defining a fixed hypothesized value for the Type I error
##################################
alpha=0.05

##################################
# Defining a range of possible hypothesized values for the Type II error
##################################
beta=c(0.30,0.20,0.10)

##################################
# Defining a fixed hypothesized value for kappa
##################################
kappa=1.0

##################################
# Computing the range of sample sizes based on different levels of Type II errors
##################################
beta_1=beta[1]
(n_Beta30=(p_ResponseRate*(1-p_ResponseRate)/kappa+p_Reference*
             (1-p_Reference))*((qnorm(1-alpha/2)+qnorm(1-beta_1))/
                                      (p_ResponseRate-p_Reference))^2)
## [1] 54.77709 43.43401 34.86789 28.24908 23.03504 18.85909
ceiling(n_Beta30)
## [1] 55 44 35 29 24 19
z_Beta30=(p_ResponseRate-p_Reference)/
  sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta30/kappa+p_Reference
       *(1-p_Reference)/n_Beta30)
(Power_Beta30=pnorm(z_Beta30-qnorm(1-alpha/2))+pnorm(-z_Beta30-qnorm(1-alpha/2)))
## [1] 0.7000044 0.7000044 0.7000044 0.7000044 0.7000044 0.7000044
beta_2=beta[2]
(n_Beta20=(p_ResponseRate*(1-p_ResponseRate)/kappa+p_Reference*
             (1-p_Reference))*((qnorm(1-alpha/2)+qnorm(1-beta_2))/
                                      (p_ResponseRate-p_Reference))^2)
## [1] 69.65881 55.23406 44.34072 35.92372 29.29314 23.98269
ceiling(n_Beta20)
## [1] 70 56 45 36 30 24
z_Beta20=(p_ResponseRate-p_Reference)/
  sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta20/kappa+p_Reference
       *(1-p_Reference)/n_Beta20)
(Power_Beta20=pnorm(z_Beta20-qnorm(1-alpha/2))+pnorm(-z_Beta20-qnorm(1-alpha/2)))
## [1] 0.800001 0.800001 0.800001 0.800001 0.800001 0.800001
beta_3=beta[3]
(n_Beta10=(p_ResponseRate*(1-p_ResponseRate)/kappa+p_Reference*
             (1-p_Reference))*((qnorm(1-alpha/2)+qnorm(1-beta_3))/
                                      (p_ResponseRate-p_Reference))^2)
## [1] 93.25338 73.94273 59.35964 48.09167 39.21520 32.10601
ceiling(n_Beta10)
## [1] 94 74 60 49 40 33
z_Beta10=(p_ResponseRate-p_Reference)/
  sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta10/kappa+p_Reference
       *(1-p_Reference)/n_Beta10)
(Power_Beta10=pnorm(z_Beta10-qnorm(1-alpha/2))+pnorm(-z_Beta10-qnorm(1-alpha/2)))
## [1] 0.9000001 0.9000001 0.9000001 0.9000001 0.9000001 0.9000001
TwoSampleUnpairedDesign.Equality <- as.data.frame(cbind(epsilon,n_Beta30,n_Beta20,n_Beta10))
names(TwoSampleUnpairedDesign.Equality) <- c("Cure.Rate.Change",
                                 "Power=70%",
                                 "Power=80%",
                                 "Power=90%")

##################################
# Restructuring the data
##################################
TwoSampleUnpairedDesign.Equality.Reshaped <- gather(TwoSampleUnpairedDesign.Equality,
                                            "Power=70%","Power=80%","Power=90%",
                                            key="Power",
                                            value="Sample.Size")
TwoSampleUnpairedDesign.Equality.Reshaped$Label <- rep("TSUD_EQUALITY",nrow(TwoSampleUnpairedDesign.Equality.Reshaped))

TwoSampleUnpairedDesign.Equality.Reshaped
##    Cure.Rate.Change     Power Sample.Size         Label
## 1              0.20 Power=70%    54.77709 TSUD_EQUALITY
## 2              0.22 Power=70%    43.43401 TSUD_EQUALITY
## 3              0.24 Power=70%    34.86789 TSUD_EQUALITY
## 4              0.26 Power=70%    28.24908 TSUD_EQUALITY
## 5              0.28 Power=70%    23.03504 TSUD_EQUALITY
## 6              0.30 Power=70%    18.85909 TSUD_EQUALITY
## 7              0.20 Power=80%    69.65881 TSUD_EQUALITY
## 8              0.22 Power=80%    55.23406 TSUD_EQUALITY
## 9              0.24 Power=80%    44.34072 TSUD_EQUALITY
## 10             0.26 Power=80%    35.92372 TSUD_EQUALITY
## 11             0.28 Power=80%    29.29314 TSUD_EQUALITY
## 12             0.30 Power=80%    23.98269 TSUD_EQUALITY
## 13             0.20 Power=90%    93.25338 TSUD_EQUALITY
## 14             0.22 Power=90%    73.94273 TSUD_EQUALITY
## 15             0.24 Power=90%    59.35964 TSUD_EQUALITY
## 16             0.26 Power=90%    48.09167 TSUD_EQUALITY
## 17             0.28 Power=90%    39.21520 TSUD_EQUALITY
## 18             0.30 Power=90%    32.10601 TSUD_EQUALITY
##################################
# Plotting the sample sizes and power curves
##################################
(TwoSampleUnpairedDesign.Equality.PowerAnalysis <- ggplot(TwoSampleUnpairedDesign.Equality.Reshaped,
                                                      aes(x=Cure.Rate.Change,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Cure Rate Change",
                      limits=c(0.2,0.3),
                      breaks=seq(0.2,0.3,by=0.02)) +
  scale_y_continuous(name="Sample Size Per Group",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50)) +
  ggtitle("Two-Sample Unpaired Design : Test for Equality"))
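The three per-beta computations above apply the same formula with different normal quantiles; they can be factored into a compact helper. This is a sketch only; the function name tsud_equality_n is introduced here and is not used elsewhere in this document.

```r
##################################
# Sketch: two-sample unpaired equality sample size (per treatment group),
# n = (p1(1-p1)/kappa + p2(1-p2)) * ((z_[1-alpha/2] + z_[1-beta]) / (p1 - p2))^2
##################################
tsud_equality_n <- function(p1, p2, alpha, beta, kappa = 1) {
  (p1 * (1 - p1) / kappa + p2 * (1 - p2)) *
    ((qnorm(1 - alpha / 2) + qnorm(1 - beta)) / (p1 - p2))^2
}

# Reproducing the epsilon = 0.20 entries of the three power curves
ceiling(tsud_equality_n(p1 = 0.85, p2 = 0.65, alpha = 0.05, beta = 0.30))
## [1] 55
ceiling(tsud_equality_n(p1 = 0.85, p2 = 0.65, alpha = 0.05, beta = 0.20))
## [1] 70
ceiling(tsud_equality_n(p1 = 0.85, p2 = 0.65, alpha = 0.05, beta = 0.10))
## [1] 94
```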

1.2.2 Test for Non-Inferiority (TSUD_NONINFERIORITY)


[Research Question] Suppose that a difference of 20% to 30% in the clinical cure rate is considered clinically meaningful between the two compared anti-microbial agents. The study aims to demonstrate that the cure rate of the new anti-microbial agent is not much worse than that of the active control agent with a true cure rate of 65%, using a clinically meaningful difference equal to -10%. Given a level of significance of 0.05 and a matching ratio between both groups equal to 1, evaluate the range of sample sizes for a test of non-inferiority achieving statistical powers of 70%, 80% and 90%.

[Null Hypothesis] Cure rate of the new anti-microbial agent - Cure rate of the active control agent <= Clinically meaningful difference

[Alternative Hypothesis] Cure rate of the new anti-microbial agent - Cure rate of the active control agent > Clinically meaningful difference

[Hypothesis Test] Rejection of the null hypothesis implies non-inferiority of the new anti-microbial agent's cure rate relative to the active control agent, given the delta

[A] Epsilon : 20 to 30% at 2% intervals
     [A.1] Control Response Rate : 65%
     [A.2] Treatment Response Rate : 85% to 95% at 2% intervals

[B] Alpha : 0.05

[C] Beta : 0.30, 0.20 and 0.10 corresponding to 70%, 80% and 90% statistical powers

[D] Kappa : 1.0

[E] Delta : -10% testing margin for the Test of Non-Inferiority hypothesis

Code Chunk | Output
##################################
# Defining a range of possible hypothesized values for epsilon
##################################
epsilon=seq(0.20,0.30,0.02)

##################################
# Defining a fixed hypothesized value for p_Reference
##################################
p_Reference=0.65

##################################
# Defining a range of possible hypothesized values for p_ResponseRate
##################################
(p_ResponseRate=p_Reference + epsilon)
## [1] 0.85 0.87 0.89 0.91 0.93 0.95
##################################
# Defining a fixed hypothesized value for the Type I error
##################################
alpha=0.05

##################################
# Defining a range of possible hypothesized values for the Type II error
##################################
beta=c(0.30,0.20,0.10)

##################################
# Defining the non-inferiority delta
##################################
delta=-0.10

##################################
# Defining a fixed hypothesized value for kappa
##################################
kappa=1.0

##################################
# Computing the range of sample sizes based on different levels of Type II errors
##################################
beta_1=beta[1]
(n_Beta30=(p_ResponseRate*(1-p_ResponseRate)/kappa+p_Reference*
             (1-p_Reference))*((qnorm(1-alpha)+qnorm(1-beta_1))/
                                      (p_ResponseRate-p_Reference-delta))^2)
## [1] 18.561228 15.651846 13.245873 11.234045  9.535160  8.087859
ceiling(n_Beta30)
## [1] 19 16 14 12 10  9
z_Beta30=(p_ResponseRate-p_Reference-delta)/
  sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta30/kappa+p_Reference
       *(1-p_Reference)/n_Beta30)
(Power_Beta30=pnorm(z_Beta30-qnorm(1-alpha))+pnorm(-z_Beta30-qnorm(1-alpha)))
## [1] 0.7000683 0.7000683 0.7000683 0.7000683 0.7000683 0.7000683
beta_2=beta[2]
(n_Beta20=(p_ResponseRate*(1-p_ResponseRate)/kappa+p_Reference*
             (1-p_Reference))*((qnorm(1-alpha)+qnorm(1-beta_2))/
                                      (p_ResponseRate-p_Reference-delta))^2)
## [1] 24.38675 20.56425 17.40315 14.75990 12.52781 10.62627
ceiling(n_Beta20)
## [1] 25 21 18 15 13 11
z_Beta20=(p_ResponseRate-p_Reference-delta)/
  sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta20/kappa+p_Reference
       *(1-p_Reference)/n_Beta20)
(Power_Beta20=pnorm(z_Beta20-qnorm(1-alpha))+pnorm(-z_Beta20-qnorm(1-alpha)))
## [1] 0.800018 0.800018 0.800018 0.800018 0.800018 0.800018
beta_3=beta[3]
(n_Beta10=(p_ResponseRate*(1-p_ResponseRate)/kappa+p_Reference*
             (1-p_Reference))*((qnorm(1-alpha)+qnorm(1-beta_3))/
                                      (p_ResponseRate-p_Reference-delta))^2)
## [1] 33.77962 28.48483 24.10619 20.44486 17.35306 14.71911
ceiling(n_Beta10)
## [1] 34 29 25 21 18 15
z_Beta10=(p_ResponseRate-p_Reference-delta)/
  sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta10/kappa+p_Reference
       *(1-p_Reference)/n_Beta10)
(Power_Beta10=pnorm(z_Beta10-qnorm(1-alpha))+pnorm(-z_Beta10-qnorm(1-alpha)))
## [1] 0.9000024 0.9000024 0.9000024 0.9000024 0.9000024 0.9000024
TwoSampleUnpairedDesign.NonInferiority <- as.data.frame(cbind(epsilon,n_Beta30,n_Beta20,n_Beta10))
names(TwoSampleUnpairedDesign.NonInferiority) <- c("Cure.Rate.Change",
                                 "Power=70%",
                                 "Power=80%",
                                 "Power=90%")

##################################
# Restructuring the data
##################################
TwoSampleUnpairedDesign.NonInferiority.Reshaped <- gather(TwoSampleUnpairedDesign.NonInferiority,
                                            "Power=70%","Power=80%","Power=90%",
                                            key="Power",
                                            value="Sample.Size")
TwoSampleUnpairedDesign.NonInferiority.Reshaped$Label <- rep("TSUD_NONINFERIORITY",nrow(TwoSampleUnpairedDesign.NonInferiority.Reshaped))

TwoSampleUnpairedDesign.NonInferiority.Reshaped
##    Cure.Rate.Change     Power Sample.Size               Label
## 1              0.20 Power=70%   18.561228 TSUD_NONINFERIORITY
## 2              0.22 Power=70%   15.651846 TSUD_NONINFERIORITY
## 3              0.24 Power=70%   13.245873 TSUD_NONINFERIORITY
## 4              0.26 Power=70%   11.234045 TSUD_NONINFERIORITY
## 5              0.28 Power=70%    9.535160 TSUD_NONINFERIORITY
## 6              0.30 Power=70%    8.087859 TSUD_NONINFERIORITY
## 7              0.20 Power=80%   24.386754 TSUD_NONINFERIORITY
## 8              0.22 Power=80%   20.564248 TSUD_NONINFERIORITY
## 9              0.24 Power=80%   17.403150 TSUD_NONINFERIORITY
## 10             0.26 Power=80%   14.759901 TSUD_NONINFERIORITY
## 11             0.28 Power=80%   12.527813 TSUD_NONINFERIORITY
## 12             0.30 Power=80%   10.626270 TSUD_NONINFERIORITY
## 13             0.20 Power=90%   33.779620 TSUD_NONINFERIORITY
## 14             0.22 Power=90%   28.484828 TSUD_NONINFERIORITY
## 15             0.24 Power=90%   24.106193 TSUD_NONINFERIORITY
## 16             0.26 Power=90%   20.444864 TSUD_NONINFERIORITY
## 17             0.28 Power=90%   17.353059 TSUD_NONINFERIORITY
## 18             0.30 Power=90%   14.719113 TSUD_NONINFERIORITY
##################################
# Plotting the sample sizes and power curves
##################################
(TwoSampleUnpairedDesign.NonInferiority.PowerAnalysis <- ggplot(TwoSampleUnpairedDesign.NonInferiority.Reshaped,
                                                      aes(x=Cure.Rate.Change,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Cure Rate Change",
                      limits=c(0.2,0.3),
                      breaks=seq(0.2,0.3,by=0.02)) +
  scale_y_continuous(name="Sample Size Per Group",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50)) +
  ggtitle("Two-Sample Unpaired Design : Test for NonInferiority"))
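The non-inferiority computation differs from the equality case only in the one-sided quantile and the margin subtracted from the rate difference, which again lends itself to a helper function. This is a sketch; tsud_noninferiority_n is a hypothetical name introduced here.

```r
##################################
# Sketch: two-sample unpaired non-inferiority sample size (per group),
# n = (p1(1-p1)/kappa + p2(1-p2)) *
#     ((z_[1-alpha] + z_[1-beta]) / (p1 - p2 - delta))^2
##################################
tsud_noninferiority_n <- function(p1, p2, delta, alpha, beta, kappa = 1) {
  (p1 * (1 - p1) / kappa + p2 * (1 - p2)) *
    ((qnorm(1 - alpha) + qnorm(1 - beta)) / (p1 - p2 - delta))^2
}

# Reproducing the epsilon = 0.20, 90%-power entry above
ceiling(tsud_noninferiority_n(p1 = 0.85, p2 = 0.65, delta = -0.10,
                              alpha = 0.05, beta = 0.10))
## [1] 34
```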

1.2.3 Test for Superiority (TSUD_SUPERIORITY)


[Research Question] Suppose that a difference of 20% to 30% in the clinical cure rate is considered clinically meaningful between the two compared anti-microbial agents. The study aims to demonstrate that the cure rate of the new anti-microbial agent is much better than that of the active control agent with a true cure rate of 65%, using a clinically meaningful difference equal to +2%. Given a level of significance of 0.05 and a matching ratio between both groups equal to 1, evaluate the range of sample sizes for a test of superiority achieving statistical powers of 70%, 80% and 90%.

[Null Hypothesis] Cure rate of the new anti-microbial agent - Cure rate of the active control agent <= Clinically meaningful difference

[Alternative Hypothesis] Cure rate of the new anti-microbial agent - Cure rate of the active control agent > Clinically meaningful difference

[Hypothesis Test] Rejection of the null hypothesis implies superiority of the new anti-microbial agent's cure rate over the active control agent, given the delta

[A] Epsilon : 20 to 30% at 2% intervals
     [A.1] Control Response Rate : 65%
     [A.2] Treatment Response Rate : 85% to 95% at 2% intervals

[B] Alpha : 0.05

[C] Beta : 0.30, 0.20 and 0.10 corresponding to 70%, 80% and 90% statistical powers

[D] Kappa : 1.0

[E] Delta : +2% testing margin for the Test of Superiority hypothesis

Code Chunk | Output
##################################
# Defining a range of possible hypothesized values for epsilon
##################################
epsilon=seq(0.20,0.30,0.02)

##################################
# Defining a fixed hypothesized value for p_Reference
##################################
p_Reference=0.65

##################################
# Defining a range of possible hypothesized values for p_ResponseRate
##################################
(p_ResponseRate=p_Reference + epsilon)
## [1] 0.85 0.87 0.89 0.91 0.93 0.95
##################################
# Defining a fixed hypothesized value for the Type I error
##################################
alpha=0.05

##################################
# Defining a range of possible hypothesized values for the Type II error
##################################
beta=c(0.30,0.20,0.10)

##################################
# Defining the superiority delta
##################################
delta= 0.02

##################################
# Defining a fixed hypothesized value for kappa
##################################
kappa=1.0

##################################
# Computing the range of sample sizes based on different levels of Type II errors
##################################
beta_1=beta[1]
(n_Beta30=(p_ResponseRate*(1-p_ResponseRate)/kappa+p_Reference*
             (1-p_Reference))*((qnorm(1-alpha)+qnorm(1-beta_1))/
                                      (p_ResponseRate-p_Reference-delta))^2)
## [1] 51.55897 40.06872 31.63684 25.27660 20.36801 16.50584
ceiling(n_Beta30)
## [1] 52 41 32 26 21 17
z_Beta30=(p_ResponseRate-p_Reference-delta)/
  sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta30/kappa+p_Reference
       *(1-p_Reference)/n_Beta30)
(Power_Beta30=pnorm(z_Beta30-qnorm(1-alpha))+pnorm(-z_Beta30-qnorm(1-alpha)))
## [1] 0.7000683 0.7000683 0.7000683 0.7000683 0.7000683 0.7000683
beta_2=beta[2]
(n_Beta20=(p_ResponseRate*(1-p_ResponseRate)/kappa+p_Reference*
             (1-p_Reference))*((qnorm(1-alpha)+qnorm(1-beta_2))/
                                      (p_ResponseRate-p_Reference-delta))^2)
## [1] 67.74098 52.64447 41.56620 33.20978 26.76060 21.68627
ceiling(n_Beta20)
## [1] 68 53 42 34 27 22
z_Beta20=(p_ResponseRate-p_Reference-delta)/
  sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta20/kappa+p_Reference
       *(1-p_Reference)/n_Beta20)
(Power_Beta20=pnorm(z_Beta20-qnorm(1-alpha))+pnorm(-z_Beta20-qnorm(1-alpha)))
## [1] 0.800018 0.800018 0.800018 0.800018 0.800018 0.800018
beta_3=beta[3]
(n_Beta10=(p_ResponseRate*(1-p_ResponseRate)/kappa+p_Reference*
             (1-p_Reference))*((qnorm(1-alpha)+qnorm(1-beta_3))/
                                      (p_ResponseRate-p_Reference-delta))^2)
## [1] 93.83228 72.92116 57.57595 46.00094 37.06778 30.03901
ceiling(n_Beta10)
## [1] 94 73 58 47 38 31
z_Beta10=(p_ResponseRate-p_Reference-delta)/
  sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta10/kappa+p_Reference
       *(1-p_Reference)/n_Beta10)
(Power_Beta10=pnorm(z_Beta10-qnorm(1-alpha))+pnorm(-z_Beta10-qnorm(1-alpha)))
## [1] 0.9000024 0.9000024 0.9000024 0.9000024 0.9000024 0.9000024
TwoSampleUnpairedDesign.Superiority <- as.data.frame(cbind(epsilon,n_Beta30,n_Beta20,n_Beta10))
names(TwoSampleUnpairedDesign.Superiority) <- c("Cure.Rate.Change",
                                 "Power=70%",
                                 "Power=80%",
                                 "Power=90%")

##################################
# Restructuring the data
##################################
TwoSampleUnpairedDesign.Superiority.Reshaped <- gather(TwoSampleUnpairedDesign.Superiority,
                                            "Power=70%","Power=80%","Power=90%",
                                            key="Power",
                                            value="Sample.Size")
TwoSampleUnpairedDesign.Superiority.Reshaped$Label <- rep("TSUD_SUPERIORITY",nrow(TwoSampleUnpairedDesign.Superiority.Reshaped))

TwoSampleUnpairedDesign.Superiority.Reshaped
##    Cure.Rate.Change     Power Sample.Size            Label
## 1              0.20 Power=70%    51.55897 TSUD_SUPERIORITY
## 2              0.22 Power=70%    40.06872 TSUD_SUPERIORITY
## 3              0.24 Power=70%    31.63684 TSUD_SUPERIORITY
## 4              0.26 Power=70%    25.27660 TSUD_SUPERIORITY
## 5              0.28 Power=70%    20.36801 TSUD_SUPERIORITY
## 6              0.30 Power=70%    16.50584 TSUD_SUPERIORITY
## 7              0.20 Power=80%    67.74098 TSUD_SUPERIORITY
## 8              0.22 Power=80%    52.64447 TSUD_SUPERIORITY
## 9              0.24 Power=80%    41.56620 TSUD_SUPERIORITY
## 10             0.26 Power=80%    33.20978 TSUD_SUPERIORITY
## 11             0.28 Power=80%    26.76060 TSUD_SUPERIORITY
## 12             0.30 Power=80%    21.68627 TSUD_SUPERIORITY
## 13             0.20 Power=90%    93.83228 TSUD_SUPERIORITY
## 14             0.22 Power=90%    72.92116 TSUD_SUPERIORITY
## 15             0.24 Power=90%    57.57595 TSUD_SUPERIORITY
## 16             0.26 Power=90%    46.00094 TSUD_SUPERIORITY
## 17             0.28 Power=90%    37.06778 TSUD_SUPERIORITY
## 18             0.30 Power=90%    30.03901 TSUD_SUPERIORITY
##################################
# Plotting the sample sizes and power curves
##################################
(TwoSampleUnpairedDesign.Superiority.PowerAnalysis <- ggplot(TwoSampleUnpairedDesign.Superiority.Reshaped,
                                                      aes(x=Cure.Rate.Change,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Cure Rate Change",
                      limits=c(0.2,0.3),
                      breaks=seq(0.2,0.3,by=0.02)) +
  scale_y_continuous(name="Sample Size Per Group",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50)) +
  ggtitle("Two-Sample Unpaired Design : Test for Superiority"))
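Note that the superiority computation above reuses the non-inferiority formula with a positive margin; only the sign of delta changes. The sketch below makes this explicit with a shared helper (tsud_margin_n is a name introduced here for illustration).

```r
##################################
# Sketch: shared one-sided margin formula for non-inferiority (delta < 0)
# and superiority (delta > 0) per-group sample sizes
##################################
tsud_margin_n <- function(p1, p2, delta, alpha, beta, kappa = 1) {
  (p1 * (1 - p1) / kappa + p2 * (1 - p2)) *
    ((qnorm(1 - alpha) + qnorm(1 - beta)) / (p1 - p2 - delta))^2
}

# Superiority margin of +2% at epsilon = 0.20 and 70% power
ceiling(tsud_margin_n(p1 = 0.85, p2 = 0.65, delta = 0.02,
                      alpha = 0.05, beta = 0.30))
## [1] 52
# A non-inferiority margin of -10% with the same inputs requires fewer subjects
ceiling(tsud_margin_n(p1 = 0.85, p2 = 0.65, delta = -0.10,
                      alpha = 0.05, beta = 0.30))
## [1] 19
```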

1.2.4 Test for Equivalence (TSUD_EQUIVALENCE)


[Research Question] Suppose that a difference of 20% to 30% in the clinical cure rate is considered clinically meaningful between the two compared anti-microbial agents. The study aims to demonstrate that the cure rate of the new anti-microbial agent is neither much worse nor much better than that of the active control agent with a true cure rate of 65%, using a clinically meaningful difference equal to +2%. Given a level of significance of 0.05 and a matching ratio between both groups equal to 1, evaluate the range of sample sizes for a test of equivalence achieving statistical powers of 70%, 80% and 90%.

[Null Hypothesis] Absolute value of (Cure rate of the new anti-microbial agent - Cure rate of the active control agent) >= Clinically meaningful difference

[Alternative Hypothesis] Absolute value of (Cure rate of the new anti-microbial agent - Cure rate of the active control agent) < Clinically meaningful difference

[Hypothesis Test] Rejection of the null hypothesis implies equivalence between the cure rates of both anti-microbial agents, given the delta

[A] Epsilon : 20 to 30% at 2% intervals
     [A.1] Control Response Rate : 65%
     [A.2] Treatment Response Rate : 85% to 95% at 2% intervals

[B] Alpha : 0.05

[C] Beta : 0.30, 0.20 and 0.10 corresponding to 70%, 80% and 90% statistical powers

[D] Kappa : 1.0

[E] Delta : +2% testing margin for the Test of Equivalence hypothesis

Code Chunk | Output
##################################
# Defining a range of possible hypothesized values for epsilon
##################################
epsilon=seq(0.20,0.30,0.02)

##################################
# Defining a fixed hypothesized value for p_Reference
##################################
p_Reference=0.65

##################################
# Defining a range of possible hypothesized values for p_ResponseRate
##################################
(p_ResponseRate=p_Reference + epsilon)
## [1] 0.85 0.87 0.89 0.91 0.93 0.95
##################################
# Defining a fixed hypothesized value for the Type I error
##################################
alpha=0.05

##################################
# Defining a range of possible hypothesized values for the Type II error
##################################
beta=c(0.30,0.20,0.10)

##################################
# Defining the equivalence testing margin (delta)
##################################
delta= 0.02

##################################
# Defining a fixed hypothesized value for kappa
##################################
kappa=1.0

##################################
# Computing the range of sample sizes based on different levels of Type II errors
##################################
beta_1=beta[1]
(n_Beta30=(p_ResponseRate*(1-p_ResponseRate)/kappa+p_Reference*
             (1-p_Reference))*((qnorm(1-alpha)+qnorm(1-beta_1/2))/
                                      (abs(p_ResponseRate-p_Reference)-delta))^2)
## [1] 78.77165 61.21689 48.33467 38.61752 31.11818 25.21757
ceiling(n_Beta30)
## [1] 79 62 49 39 32 26
z_Beta30=(abs(p_ResponseRate-p_Reference)-delta)/
  sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta30/kappa+p_Reference
       *(1-p_Reference)/n_Beta30)
(Power_Beta30=2*(pnorm(z_Beta30-qnorm(1-alpha))+pnorm(-z_Beta30-qnorm(1-alpha)))-1)
## [1] 0.7000152 0.7000152 0.7000152 0.7000152 0.7000152 0.7000152
beta_2=beta[2]
(n_Beta20=(p_ResponseRate*(1-p_ResponseRate)/kappa+p_Reference*
             (1-p_Reference))*((qnorm(1-alpha)+qnorm(1-beta_2/2))/
                                      (abs(p_ResponseRate-p_Reference)-delta))^2)
## [1] 93.83228 72.92116 57.57595 46.00094 37.06778 30.03901
ceiling(n_Beta20)
## [1] 94 73 58 47 38 31
z_Beta20=(abs(p_ResponseRate-p_Reference)-delta)/
  sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta20/kappa+p_Reference
       *(1-p_Reference)/n_Beta20)
(Power_Beta20=2*(pnorm(z_Beta20-qnorm(1-alpha))+pnorm(-z_Beta20-qnorm(1-alpha)))-1)
## [1] 0.8000048 0.8000048 0.8000048 0.8000048 0.8000048 0.8000048
beta_3=beta[3]
(n_Beta10=(p_ResponseRate*(1-p_ResponseRate)/kappa+p_Reference*
             (1-p_Reference))*((qnorm(1-alpha)+qnorm(1-beta_3/2))/
                                      (abs(p_ResponseRate-p_Reference)-delta))^2)
## [1] 118.57629  92.15081  72.75900  58.13161  46.84272  37.96043
ceiling(n_Beta10)
## [1] 119  93  73  59  47  38
z_Beta10=(abs(p_ResponseRate-p_Reference)-delta)/
  sqrt(p_ResponseRate*(1-p_ResponseRate)/n_Beta10/kappa+p_Reference
       *(1-p_Reference)/n_Beta10)
(Power_Beta10=2*(pnorm(z_Beta10-qnorm(1-alpha))+pnorm(-z_Beta10-qnorm(1-alpha)))-1)
## [1] 0.9000008 0.9000008 0.9000008 0.9000008 0.9000008 0.9000008
TwoSampleUnpairedDesign.Equivalence <- as.data.frame(cbind(epsilon,n_Beta30,n_Beta20,n_Beta10))
names(TwoSampleUnpairedDesign.Equivalence) <- c("Cure.Rate.Change",
                                 "Power=70%",
                                 "Power=80%",
                                 "Power=90%")

##################################
# Restructuring the data
##################################
TwoSampleUnpairedDesign.Equivalence.Reshaped <- gather(TwoSampleUnpairedDesign.Equivalence,
                                            "Power=70%","Power=80%","Power=90%",
                                            key="Power",
                                            value="Sample.Size")
TwoSampleUnpairedDesign.Equivalence.Reshaped$Label <- rep("TSUD_EQUIVALENCE",nrow(TwoSampleUnpairedDesign.Equivalence.Reshaped))

TwoSampleUnpairedDesign.Equivalence.Reshaped
##    Cure.Rate.Change     Power Sample.Size            Label
## 1              0.20 Power=70%    78.77165 TSUD_EQUIVALENCE
## 2              0.22 Power=70%    61.21689 TSUD_EQUIVALENCE
## 3              0.24 Power=70%    48.33467 TSUD_EQUIVALENCE
## 4              0.26 Power=70%    38.61752 TSUD_EQUIVALENCE
## 5              0.28 Power=70%    31.11818 TSUD_EQUIVALENCE
## 6              0.30 Power=70%    25.21757 TSUD_EQUIVALENCE
## 7              0.20 Power=80%    93.83228 TSUD_EQUIVALENCE
## 8              0.22 Power=80%    72.92116 TSUD_EQUIVALENCE
## 9              0.24 Power=80%    57.57595 TSUD_EQUIVALENCE
## 10             0.26 Power=80%    46.00094 TSUD_EQUIVALENCE
## 11             0.28 Power=80%    37.06778 TSUD_EQUIVALENCE
## 12             0.30 Power=80%    30.03901 TSUD_EQUIVALENCE
## 13             0.20 Power=90%   118.57629 TSUD_EQUIVALENCE
## 14             0.22 Power=90%    92.15081 TSUD_EQUIVALENCE
## 15             0.24 Power=90%    72.75900 TSUD_EQUIVALENCE
## 16             0.26 Power=90%    58.13161 TSUD_EQUIVALENCE
## 17             0.28 Power=90%    46.84272 TSUD_EQUIVALENCE
## 18             0.30 Power=90%    37.96043 TSUD_EQUIVALENCE
##################################
# Plotting the sample sizes and power curves
##################################
(TwoSampleUnpairedDesign.Equivalence.PowerAnalysis <- ggplot(TwoSampleUnpairedDesign.Equivalence.Reshaped,
                                                      aes(x=Cure.Rate.Change,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Cure Rate Change",
                      limits=c(0.2,0.3),
                      breaks=seq(0.2,0.3,by=0.02)) +
  scale_y_continuous(name="Sample Size Per Group",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50)) +
  ggtitle("Two-Sample Unpaired Design : Test for Equivalence"))

1.2.5 Consolidated Evaluation


[A] The minimum sample size required for the treatment group in a Two-Sample Unpaired Design study is dependent upon the following factors defined and hypothesized prior to the clinical research:
     [A.1] Study Hypothesis (Equality, Non-Inferiority, Superiority, Equivalence)
     [A.2] Level of significance (Alpha) : Lower value increases sample size
     [A.3] Power of the test (1-Beta) : Higher value increases sample size
     [A.4] Epsilon (Difference between Treatment Response Rate and Control Response Rate) : Lower value increases sample size
     [A.5] Delta (Testing margin, as applicable) : Higher value increases sample size
[B] Given the level of significance = 5%, control response rate = 65% and treatment response rate = 85% to 95% representing epsilon values = 20% to 30%, the range of sample sizes is given in the respective power analyses provided for each study hypothesis -
     [B.1] TSUD_EQUALITY :
            [B.1.1] For 70% power, sample size = 19 to 55
            [B.1.2] For 80% power, sample size = 24 to 70
            [B.1.3] For 90% power, sample size = 33 to 94
     [B.2] TSUD_NONINFERIORITY with Delta = -10% :
            [B.2.1] For 70% power, sample size = 9 to 19
            [B.2.2] For 80% power, sample size = 11 to 25
            [B.2.3] For 90% power, sample size = 15 to 34
     [B.3] TSUD_SUPERIORITY with Delta = +2% :
            [B.3.1] For 70% power, sample size = 17 to 52
            [B.3.2] For 80% power, sample size = 22 to 68
            [B.3.3] For 90% power, sample size = 31 to 94
     [B.4] TSUD_EQUIVALENCE with Delta = +2% :
            [B.4.1] For 70% power, sample size = 26 to 79
            [B.4.2] For 80% power, sample size = 31 to 94
            [B.4.3] For 90% power, sample size = 38 to 119
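The equivalence computations above can be collected into a reusable helper. A minimal sketch, assuming the same normal-approximation formula used in the code chunks (the function name and interface are illustrative, not from the original chunks):

```r
##################################
# Reusable helper consolidating the equivalence
# sample size formula from the chunks above
# (function name and interface are illustrative)
##################################
n.tsud.equivalence <- function(p.treatment, p.control, delta,
                               alpha=0.05, beta=0.20, kappa=1.0) {
  # Variance term under a kappa matching ratio between groups
  variance <- p.treatment*(1-p.treatment)/kappa + p.control*(1-p.control)
  # Normal quantiles for the one-sided alpha and the split beta
  z <- qnorm(1-alpha) + qnorm(1-beta/2)
  variance * (z / (abs(p.treatment - p.control) - delta))^2
}

# Reproduces the first entry of the 70%-power equivalence table
n.tsud.equivalence(p.treatment=0.85, p.control=0.65, delta=0.02, beta=0.30)
## [1] 78.77165
```

Wrapping the formula this way makes it straightforward to regenerate the tables above for other response rates, margins or matching ratios without copying the chunk.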

Code Chunk | Output
TSUD_EQUALITY.PowerAnalysis <- ggplot(TwoSampleUnpairedDesign.Equality.Reshaped,
                                                      aes(x=Cure.Rate.Change,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   facet_grid(. ~ Label) +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Cure Rate Change",
                      limits=c(0.2,0.3),
                      breaks=seq(0.2,0.3,by=0.02)) +
  scale_y_continuous(name="Sample Size Per Group",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50))

TSUD_NONINFERIORITY.PowerAnalysis <- ggplot(TwoSampleUnpairedDesign.NonInferiority.Reshaped,
                                                      aes(x=Cure.Rate.Change,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   facet_grid(. ~ Label) +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Cure Rate Change",
                      limits=c(0.2,0.3),
                      breaks=seq(0.2,0.3,by=0.02)) +
  scale_y_continuous(name="Sample Size Per Group",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50))

TSUD_SUPERIORITY.PowerAnalysis <- ggplot(TwoSampleUnpairedDesign.Superiority.Reshaped,
                                                      aes(x=Cure.Rate.Change,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   facet_grid(. ~ Label) +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Cure Rate Change",
                      limits=c(0.2,0.3),
                      breaks=seq(0.2,0.3,by=0.02)) +
  scale_y_continuous(name="Sample Size Per Group",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50))

TSUD_EQUIVALENCE.PowerAnalysis <- ggplot(TwoSampleUnpairedDesign.Equivalence.Reshaped,
                                                      aes(x=Cure.Rate.Change,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   facet_grid(. ~ Label) +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Cure Rate Change",
                      limits=c(0.2,0.3),
                      breaks=seq(0.2,0.3,by=0.02)) +
  scale_y_continuous(name="Sample Size Per Group",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50))

TSUD_PowerAnalysis <- ggarrange(TSUD_EQUALITY.PowerAnalysis,
                               TSUD_NONINFERIORITY.PowerAnalysis,
                               TSUD_SUPERIORITY.PowerAnalysis,
                               TSUD_EQUIVALENCE.PowerAnalysis,
                               ncol=2, nrow=2)

annotate_figure(TSUD_PowerAnalysis,
                top = text_grob("Two-Sample Unpaired Design Power Analysis by Statistical Hypothesis",
                                color = "black",
                                face = "bold",
                                size = 14))

1.3 Two-Sample Paired Design (TSPD)


Two-Sample Paired Design involves paired binary responses measured on the same sampled subjects under two conditions, typically before and after treatment. In clinical research, it is often of interest to examine any change in laboratory values following the application of the treatment. When the response variable is categorical, this type of change is called a categorical shift. Parameters needed for the computations are as follows:

[A] P_01 : Categorical shift from 0 (normal) in pre-treatment to 1 (abnormal) in post-treatment

[B] P_10 : Categorical shift from 1 (abnormal) in pre-treatment to 0 (normal) in post-treatment

[C] Alpha : Maximum allowed Type I error

[D] Beta : Maximum allowed Type II error

A case scenario was provided as follows to demonstrate sample size computation and power analysis as applied for this type of hypothesis:

[Case Scenario] A clinical research study is being pursued concerning a test compound, evaluated in terms of the proportion of patients with nocturnal hypoglycaemia, defined as an overnight glucose value ≤ 3.5 mmol/L on two consecutive visits. At the first visit (pre-treatment), patients’ overnight glucose levels will be measured every 15 minutes. Whether or not the patient experiences nocturnal hypoglycaemia will be recorded. At the second visit (post-treatment), patients will receive the study compound and the overnight glucose levels will be obtained in a similar manner. Patients’ experience of nocturnal hypoglycaemia will also be recorded.
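To make the categorical shift concrete, the pre-/post-treatment responses can be tabulated in a 2x2 shift table. A sketch with hypothetical counts, chosen so the estimated shift proportions match the P_01 = 0.20 and P_10 = 0.50 values hypothesized below:

```r
##################################
# Hypothetical 2x2 shift table for n=100 patients
# (counts are illustrative, chosen so that
# P_01 = 0.20 and P_10 = 0.50)
##################################
shift.table <- matrix(c(20, 20,    # pre=0 (normal):   20 stay normal, 20 shift to hypoglycaemia
                        50, 10),   # pre=1 (abnormal): 50 shift to normal, 10 stay abnormal
                      nrow=2, byrow=TRUE,
                      dimnames=list(Pre=c("0","1"), Post=c("0","1")))
n <- sum(shift.table)

# Estimated categorical shift proportions from the off-diagonal cells
(p_01.hat <- shift.table["0","1"]/n)
## [1] 0.2
(p_10.hat <- shift.table["1","0"]/n)
## [1] 0.5
```

Only the off-diagonal (discordant) cells carry information about the shift; the diagonal cells record patients whose status did not change.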

1.3.1 Test for Categorical Shift (TSPD_CATEGORICALSHIFT)


[Research Question] According to some pilot studies, it is expected that about 50 to 60% (P_10 = 0.50 to 0.60 with increment = 0.02) of patients will shift from 1 (nocturnal hypoglycaemia pre-treatment) to 0 (normal post-treatment) and 20% (P_01 = 0.20) of patients will shift from 0 (normal pre-treatment) to 1 (nocturnal hypoglycaemia post-treatment). Given a level of significance of 0.05, evaluate the sample size range for a test of categorical shift to achieve statistical test powers equal to 70%, 80% and 90%.

[Null Hypothesis] P_01 (Categorical shift from normal in pre-treatment to hypoglycaemia in post-treatment) - P_10 (Categorical shift from hypoglycaemia in pre-treatment to normal in post-treatment) equals 0

[Alternative Hypothesis] P_01 (Categorical shift from normal in pre-treatment to hypoglycaemia in post-treatment) - P_10 (Categorical shift from hypoglycaemia in pre-treatment to normal in post-treatment) does not equal 0

[Hypothesis Test] Rejection of the null hypothesis implies a significant shift between the hypoglycaemia and normal categories before and after treatment

[A] P_10 : 50 to 60% at 2% intervals

[B] P_01 : 20%

[C] Alpha : 0.05

[D] Beta : 0.30, 0.20 and 0.10 corresponding to 70%, 80% and 90% statistical powers

Code Chunk | Output
##################################
# Defining a range of possible hypothesized values for P10
##################################
p_10=seq(0.50,0.60,0.02)

##################################
# Defining a fixed hypothesized value for P01
##################################
p_01=0.20

##################################
# Defining a fixed hypothesized value for the Type I error
##################################
alpha=0.05

##################################
# Defining a range of possible hypothesized values for the Type II error
##################################
beta=c(0.30,0.20,0.10)

##################################
# Computing the categorical shifts
##################################
p_Disc=p_10 + p_01
p_Diff=p_10 - p_01

##################################
# Computing the range of sample sizes based on different levels of Type II errors
##################################
beta_1=beta[1]
(n_Beta30=((qnorm(1-alpha/2)*sqrt(p_Disc)+qnorm(1-beta_1)*sqrt(p_Disc-p_Diff^2))/p_Diff)^2)
## [1] 46.66682 42.05514 38.16335 34.84341 31.98406 29.50026
ceiling(n_Beta30)
## [1] 47 43 39 35 32 30
x1_Beta30=( p_Diff*sqrt(n_Beta30)-qnorm(1-alpha/2)*sqrt(p_Disc))/sqrt(p_Disc-p_Diff^2)
x2_Beta30=(-p_Diff*sqrt(n_Beta30)-qnorm(1-alpha/2)*sqrt(p_Disc))/sqrt(p_Disc-p_Diff^2)
(Power_Beta30 = pnorm(x1_Beta30)+pnorm(x2_Beta30))
## [1] 0.7000012 0.7000010 0.7000008 0.7000007 0.7000006 0.7000005
beta_2=beta[2]
(n_Beta20=((qnorm(1-alpha/2)*sqrt(p_Disc)+qnorm(1-beta_2)*sqrt(p_Disc-p_Diff^2))/p_Diff)^2)
## [1] 58.63224 52.76633 47.81581 43.59246 39.95477 36.79460
ceiling(n_Beta20)
## [1] 59 53 48 44 40 37
x1_Beta20=( p_Diff*sqrt(n_Beta20)-qnorm(1-alpha/2)*sqrt(p_Disc))/sqrt(p_Disc-p_Diff^2)
x2_Beta20=(-p_Diff*sqrt(n_Beta20)-qnorm(1-alpha/2)*sqrt(p_Disc))/sqrt(p_Disc-p_Diff^2)
(Power_Beta20 = pnorm(x1_Beta20)+pnorm(x2_Beta20))
## [1] 0.8000002 0.8000002 0.8000002 0.8000001 0.8000001 0.8000001
beta_3=beta[3]
(n_Beta10=((qnorm(1-alpha/2)*sqrt(p_Disc)+qnorm(1-beta_3)*sqrt(p_Disc-p_Diff^2))/p_Diff)^2)
## [1] 77.48385 69.62986 63.00128 57.34612 52.47493 48.24297
ceiling(n_Beta10)
## [1] 78 70 64 58 53 49
x1_Beta10=( p_Diff*sqrt(n_Beta10)-qnorm(1-alpha/2)*sqrt(p_Disc))/sqrt(p_Disc-p_Diff^2)
x2_Beta10=(-p_Diff*sqrt(n_Beta10)-qnorm(1-alpha/2)*sqrt(p_Disc))/sqrt(p_Disc-p_Diff^2)
(Power_Beta10 = pnorm(x1_Beta10)+pnorm(x2_Beta10))
## [1] 0.9 0.9 0.9 0.9 0.9 0.9
TwoSamplePairedDesign.CategoricalShift <- as.data.frame(cbind(p_01,p_10,n_Beta30,n_Beta20,n_Beta10))
names(TwoSamplePairedDesign.CategoricalShift) <- c("Categorical.Shift.P01",
                                                   "Categorical.Shift.P10",
                                                   "Power=70%",
                                                   "Power=80%",
                                                   "Power=90%")

##################################
# Restructuring the data
##################################
TwoSamplePairedDesign.CategoricalShift.Reshaped <- gather(TwoSamplePairedDesign.CategoricalShift,
                                            "Power=70%","Power=80%","Power=90%",
                                            key="Power",
                                            value="Sample.Size")
TwoSamplePairedDesign.CategoricalShift.Reshaped$Label <- rep("TSPD_CATEGORICALSHIFT",nrow(TwoSamplePairedDesign.CategoricalShift.Reshaped))

TwoSamplePairedDesign.CategoricalShift.Reshaped
##    Categorical.Shift.P01 Categorical.Shift.P10     Power Sample.Size
## 1                    0.2                  0.50 Power=70%    46.66682
## 2                    0.2                  0.52 Power=70%    42.05514
## 3                    0.2                  0.54 Power=70%    38.16335
## 4                    0.2                  0.56 Power=70%    34.84341
## 5                    0.2                  0.58 Power=70%    31.98406
## 6                    0.2                  0.60 Power=70%    29.50026
## 7                    0.2                  0.50 Power=80%    58.63224
## 8                    0.2                  0.52 Power=80%    52.76633
## 9                    0.2                  0.54 Power=80%    47.81581
## 10                   0.2                  0.56 Power=80%    43.59246
## 11                   0.2                  0.58 Power=80%    39.95477
## 12                   0.2                  0.60 Power=80%    36.79460
## 13                   0.2                  0.50 Power=90%    77.48385
## 14                   0.2                  0.52 Power=90%    69.62986
## 15                   0.2                  0.54 Power=90%    63.00128
## 16                   0.2                  0.56 Power=90%    57.34612
## 17                   0.2                  0.58 Power=90%    52.47493
## 18                   0.2                  0.60 Power=90%    48.24297
##                    Label
## 1  TSPD_CATEGORICALSHIFT
## 2  TSPD_CATEGORICALSHIFT
## 3  TSPD_CATEGORICALSHIFT
## 4  TSPD_CATEGORICALSHIFT
## 5  TSPD_CATEGORICALSHIFT
## 6  TSPD_CATEGORICALSHIFT
## 7  TSPD_CATEGORICALSHIFT
## 8  TSPD_CATEGORICALSHIFT
## 9  TSPD_CATEGORICALSHIFT
## 10 TSPD_CATEGORICALSHIFT
## 11 TSPD_CATEGORICALSHIFT
## 12 TSPD_CATEGORICALSHIFT
## 13 TSPD_CATEGORICALSHIFT
## 14 TSPD_CATEGORICALSHIFT
## 15 TSPD_CATEGORICALSHIFT
## 16 TSPD_CATEGORICALSHIFT
## 17 TSPD_CATEGORICALSHIFT
## 18 TSPD_CATEGORICALSHIFT
##################################
# Plotting the sample sizes and power curves
##################################
(TwoSamplePairedDesign.CategoricalShift.PowerAnalysis <- ggplot(TwoSamplePairedDesign.CategoricalShift.Reshaped,
                                                      aes(x=Categorical.Shift.P10,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Categorical Shift (Hypoglycaemia to Normal)",
                      limits=c(0.5,0.6),
                      breaks=seq(0.5,0.6,by=0.02)) +
  scale_y_continuous(name="Sample Size for Both Groups",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50)) +
  ggtitle("Two-Sample Paired Design : Test for Categorical Shift"))

1.3.2 Consolidated Evaluation


[A] The minimum sample size required for both groups in a Two-Sample Paired Design study to evaluate categorical shift is dependent upon the following factors defined and hypothesized prior to the clinical research:
     [A.1] Level of significance (Alpha) : Lower value increases sample size
     [A.2] Power of the test (1-Beta) : Higher value increases sample size
     [A.3] Categorical Shift (Difference between P_10 and P_01) : Lower value increases sample size
[B] Given the level of significance = 5%, P_01 = 20% and P_10 = 50% to 60%, the range of sample sizes is given in the respective power analyses provided for each study hypothesis -
     [B.1] TSPD_CATEGORICALSHIFT :
            [B.1.1] For 70% power, sample size = 30 to 47
            [B.1.2] For 80% power, sample size = 37 to 59
            [B.1.3] For 90% power, sample size = 49 to 78
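As with the unpaired designs, the categorical shift computation can be wrapped into a helper. A minimal sketch using the same formula as the chunk above (the function name is illustrative):

```r
##################################
# Reusable helper for the categorical shift
# sample size formula (name is illustrative)
##################################
n.tspd.categoricalshift <- function(p_10, p_01, alpha=0.05, beta=0.20) {
  p.disc <- p_10 + p_01   # discordant proportion
  p.diff <- p_10 - p_01   # net categorical shift
  ((qnorm(1-alpha/2)*sqrt(p.disc) +
    qnorm(1-beta)*sqrt(p.disc - p.diff^2)) / p.diff)^2
}

# Reproduces the first entry of the 70%-power table
n.tspd.categoricalshift(p_10=0.50, p_01=0.20, beta=0.30)
## [1] 46.66682
```

Note that the sample size depends on P_10 and P_01 only through the discordant proportion and the net shift, which is why a larger net shift (for fixed P_01) reduces the required sample size.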

Code Chunk | Output
TSPD_CATEGORICALSHIFT.PowerAnalysis <- ggplot(TwoSamplePairedDesign.CategoricalShift.Reshaped,
                                                      aes(x=Categorical.Shift.P10,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   facet_grid(. ~ Label) +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Categorical Shift (Hypoglycaemia to Normal)",
                      limits=c(0.5,0.6),
                      breaks=seq(0.5,0.6,by=0.02)) +
  scale_y_continuous(name="Sample Size for Both Groups",
                     limits=c(0,120),
                     breaks=seq(0,120,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50))

TSPD_PowerAnalysis <- ggarrange(TSPD_CATEGORICALSHIFT.PowerAnalysis,
                               ncol=1, nrow=1)

annotate_figure(TSPD_PowerAnalysis,
                top = text_grob("Two-Sample Paired Design Power Analysis by Statistical Hypothesis",
                                color = "black",
                                face = "bold",
                                size = 14))

1.4 Multiple-Sample Design One-Way ANOVA Pairwise (MSDOWAP)


Multiple-Sample Design One-Way ANOVA Pairwise involves individual proportions from sampled subjects across three or more competing treatment groups. In clinical research, responses could be between multiple test drugs and a placebo control or an active control agent, or between multiple competing agents. Responses are assumed to be independent and identically distributed normal random variables with proportion equal to the true response rate for each group. For a pairwise comparison between treatments, all possible combinations must be analyzed. Because multiple comparisons inflate the Type I error, adjustments are made to control the overall Type I error rate at the desired significance level. Parameters needed for the computations are as follows:

[A] Treatment 1 Response Rate : Response rate obtained from sampled subjects in the first treatment group

[B] Treatment 2 Response Rate : Response rate obtained from sampled subjects in the second treatment group

[C] Epsilon : Response rate difference between the two treatment groups

[D] Alpha : Maximum allowed Type I error

[E] Beta : Maximum allowed Type II error

[F] Tau : Number of treatment groups
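The effect of the Bonferroni-type adjustment on the critical value can be seen directly. A small sketch, assuming the alpha/(2*tau) split used in the code chunks of this section:

```r
##################################
# Effect of the Bonferroni-type adjustment
# on the two-sided critical value (alpha is
# split across the tau pairwise comparisons)
##################################
alpha <- 0.05
tau <- 3

# Unadjusted two-sided critical value
qnorm(1 - alpha/2)
## [1] 1.959964

# Adjusted critical value used in the pairwise comparisons;
# the larger quantile (about 2.39) is what inflates the
# required sample size relative to a single two-sample test
qnorm(1 - alpha/2/tau)
```

Each pairwise comparison is thus tested at level alpha/tau, keeping the overall Type I error rate at or below alpha across the three comparisons.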

A case scenario was provided as follows to demonstrate sample size computation and power analysis as applied on different treatment pairs:

[Case Scenario] A parallel-group clinical research study was conducted to compare three active doses of a test compound against a standard therapy in patients with a specific carcinoma. The treatment groups are defined as A (Placebo), B (10 mg) and C (30 mg).

1.4.1 Test for Equality (MSDOWAP_EQUALITY)


[Research Question] Suppose that the hypothesized response rate for the placebo treatment is 20%, while the clinical responses for Treatments B and C are expected to range from 40% to 50% and 70% to 80%, respectively, with increments of 2%. The study aims to compare the cure rates of the treatments against each other pairwise. Given a level of significance of 0.05, evaluate the sample size range for a test of equality to achieve statistical test powers equal to 70%, 80% and 90%.

[Null Hypothesis] Response rate of the first evaluated treatment equals the response rate of the second evaluated treatment

[Alternative Hypothesis] Response rate of the first evaluated treatment is not equal to the response rate of the second evaluated treatment

[Hypothesis Test] Rejection of the null hypothesis implies a difference between the clinical response rates of both evaluated treatments

[A] Response Rate : Treatment A : 20%

[B] Response Rate : Treatment B : 40% to 50% at 2% intervals

[C] Response Rate : Treatment C : 70% to 80% at 2% intervals

[D] Alpha : 0.05

[E] Beta : 0.30, 0.20 and 0.10 corresponding to 70%, 80% and 90% statistical powers

[F] Tau : 3 representing the number of treatments

Code Chunk | Output
##################################
# Defining a fixed hypothesized value for p_A
##################################
p_AF=0.20

##################################
# Defining a range of hypothesized values for p_B
##################################
p_BR=seq(0.40,0.50,0.02)

##################################
# Defining a fixed hypothesized value for p_B
# using the highest value from the range
##################################
p_BF=0.50

##################################
# Defining a range of hypothesized values for p_C
##################################
p_CR=seq(0.70,0.80,0.02)

##################################
# Defining a fixed hypothesized value for the Type I error
##################################
alpha=0.05

##################################
# Defining a range of possible hypothesized values for the Type II error
##################################
beta=c(0.30,0.20,0.10)

##################################
# Defining the number of treatments
##################################
tau=3

##################################
# Conducting a pairwise comparison
# between Treatments A and B
##################################

##################################
# Computing the range of sample sizes based on different levels of Type II errors
##################################
beta_1=beta[1]
(n_Beta30=(p_AF*(1-p_AF)+p_BR*(1-p_BR))*
    ((qnorm(1-alpha/2/tau)+qnorm(1-beta_1))/(p_AF-p_BR))^2)
## [1] 85.16944 71.02146 60.09177 51.45443 44.49669 38.79941
ceiling(n_Beta30)
## [1] 86 72 61 52 45 39
z_Beta30=(p_AF-p_BR)/sqrt(p_AF*(1-p_AF)/n_Beta30+p_BR*(1-p_BR)/n_Beta30)
(Power_Beta30=pnorm(z_Beta30-qnorm(1-alpha/2/tau))+pnorm(-z_Beta30-qnorm(1-alpha/2/tau)))
## [1] 0.7000001 0.7000001 0.7000001 0.7000001 0.7000001 0.7000001
beta_2=beta[2]
(n_Beta20=(p_AF*(1-p_AF)+p_BR*(1-p_BR))*
    ((qnorm(1-alpha/2/tau)+qnorm(1-beta_2))/(p_AF-p_BR))^2)
## [1] 104.69114  87.30030  73.86542  63.24832  54.69578  47.69263
ceiling(n_Beta20)
## [1] 105  88  74  64  55  48
z_Beta20=(p_AF-p_BR)/sqrt(p_AF*(1-p_AF)/n_Beta20+p_BR*(1-p_BR)/n_Beta20)
(Power_Beta20=pnorm(z_Beta20-qnorm(1-alpha/2/tau))+pnorm(-z_Beta20-qnorm(1-alpha/2/tau)))
## [1] 0.8 0.8 0.8 0.8 0.8 0.8
beta_3=beta[3]
(n_Beta10=(p_AF*(1-p_AF)+p_BR*(1-p_BR))*
    ((qnorm(1-alpha/2/tau)+qnorm(1-beta_3))/(p_AF-p_BR))^2)
## [1] 135.09531 112.65386  95.31725  81.61675  70.58041  61.54342
ceiling(n_Beta10)
## [1] 136 113  96  82  71  62
z_Beta10=(p_AF-p_BR)/sqrt(p_AF*(1-p_AF)/n_Beta10+p_BR*(1-p_BR)/n_Beta10)
(Power_Beta10=pnorm(z_Beta10-qnorm(1-alpha/2/tau))+pnorm(-z_Beta10-qnorm(1-alpha/2/tau)))
## [1] 0.9 0.9 0.9 0.9 0.9 0.9
MultipleSampleDesign.AB.Equality <- as.data.frame(cbind(p_AF,p_BR,n_Beta30,n_Beta20,n_Beta10))
names(MultipleSampleDesign.AB.Equality) <- c("Treatment_A.Response.Rate",
                                             "Treatment_B.Response.Rate",
                                             "Power=70%",
                                             "Power=80%",
                                             "Power=90%")

##################################
# Restructuring the data
##################################
MultipleSampleDesign.AB.Equality.Reshaped <- gather(MultipleSampleDesign.AB.Equality,
                                            "Power=70%","Power=80%","Power=90%",
                                            key="Power",
                                            value="Sample.Size")
MultipleSampleDesign.AB.Equality.Reshaped$Label <- rep("MSDOWAP_AB_EQUALITY",nrow(MultipleSampleDesign.AB.Equality.Reshaped))

MultipleSampleDesign.AB.Equality.Reshaped
##    Treatment_A.Response.Rate Treatment_B.Response.Rate     Power Sample.Size
## 1                        0.2                      0.40 Power=70%    85.16944
## 2                        0.2                      0.42 Power=70%    71.02146
## 3                        0.2                      0.44 Power=70%    60.09177
## 4                        0.2                      0.46 Power=70%    51.45443
## 5                        0.2                      0.48 Power=70%    44.49669
## 6                        0.2                      0.50 Power=70%    38.79941
## 7                        0.2                      0.40 Power=80%   104.69114
## 8                        0.2                      0.42 Power=80%    87.30030
## 9                        0.2                      0.44 Power=80%    73.86542
## 10                       0.2                      0.46 Power=80%    63.24832
## 11                       0.2                      0.48 Power=80%    54.69578
## 12                       0.2                      0.50 Power=80%    47.69263
## 13                       0.2                      0.40 Power=90%   135.09531
## 14                       0.2                      0.42 Power=90%   112.65386
## 15                       0.2                      0.44 Power=90%    95.31725
## 16                       0.2                      0.46 Power=90%    81.61675
## 17                       0.2                      0.48 Power=90%    70.58041
## 18                       0.2                      0.50 Power=90%    61.54342
##                  Label
## 1  MSDOWAP_AB_EQUALITY
## 2  MSDOWAP_AB_EQUALITY
## 3  MSDOWAP_AB_EQUALITY
## 4  MSDOWAP_AB_EQUALITY
## 5  MSDOWAP_AB_EQUALITY
## 6  MSDOWAP_AB_EQUALITY
## 7  MSDOWAP_AB_EQUALITY
## 8  MSDOWAP_AB_EQUALITY
## 9  MSDOWAP_AB_EQUALITY
## 10 MSDOWAP_AB_EQUALITY
## 11 MSDOWAP_AB_EQUALITY
## 12 MSDOWAP_AB_EQUALITY
## 13 MSDOWAP_AB_EQUALITY
## 14 MSDOWAP_AB_EQUALITY
## 15 MSDOWAP_AB_EQUALITY
## 16 MSDOWAP_AB_EQUALITY
## 17 MSDOWAP_AB_EQUALITY
## 18 MSDOWAP_AB_EQUALITY
##################################
# Plotting the sample sizes and power curves
##################################
(MultipleSampleDesign.AB.Equality.PowerAnalysis <-    ggplot(MultipleSampleDesign.AB.Equality.Reshaped,
                                                      aes(x=Treatment_B.Response.Rate,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Clinical Response Rate for Treatment B",
                      limits=c(0.40,0.50),
                      breaks=seq(0.40,0.50,by=0.02)) +
  scale_y_continuous(name="Treatment Group Sample Size",
                     limits=c(0,160),
                     breaks=seq(0,160,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50)) +
  ggtitle("Multiple-Sample Design : Test for Equality Between A Versus B"))

##################################
# Conducting a pairwise comparison
# between Treatments A and C
##################################

##################################
# Computing the range of sample sizes based on different levels of Type II errors
##################################
beta_1=beta[1]
(n_Beta30=(p_AF*(1-p_AF)+p_CR*(1-p_CR))*
    ((qnorm(1-alpha/2/tau)+qnorm(1-beta_1))/(p_AF-p_CR))^2)
## [1] 12.605077 11.389522 10.292767  9.299112  8.395418  7.570617
ceiling(n_Beta30)
## [1] 13 12 11 10  9  8
z_Beta30=(p_AF-p_CR)/sqrt(p_AF*(1-p_AF)/n_Beta30+p_CR*(1-p_CR)/n_Beta30)
(Power_Beta30=pnorm(z_Beta30-qnorm(1-alpha/2/tau))+pnorm(-z_Beta30-qnorm(1-alpha/2/tau)))
## [1] 0.7000001 0.7000001 0.7000001 0.7000001 0.7000001 0.7000001
beta_2=beta[2]
(n_Beta20=(p_AF*(1-p_AF)+p_CR*(1-p_CR))*
    ((qnorm(1-alpha/2/tau)+qnorm(1-beta_2))/(p_AF-p_CR))^2)
## [1] 15.494289 14.000117 12.651975 11.430563 10.319733  9.305879
ceiling(n_Beta20)
## [1] 16 15 13 12 11 10
z_Beta20=(p_AF-p_CR)/sqrt(p_AF*(1-p_AF)/n_Beta20+p_CR*(1-p_CR)/n_Beta20)
(Power_Beta20=pnorm(z_Beta20-qnorm(1-alpha/2/tau))+pnorm(-z_Beta20-qnorm(1-alpha/2/tau)))
## [1] 0.8 0.8 0.8 0.8 0.8 0.8
beta_3=beta[3]
(n_Beta10=(p_AF*(1-p_AF)+p_CR*(1-p_CR))*
    ((qnorm(1-alpha/2/tau)+qnorm(1-beta_3))/(p_AF-p_CR))^2)
## [1] 19.99411 18.06600 16.32633 14.75020 13.31677 12.00847
ceiling(n_Beta10)
## [1] 20 19 17 15 14 13
z_Beta10=(p_AF-p_CR)/sqrt(p_AF*(1-p_AF)/n_Beta10+p_CR*(1-p_CR)/n_Beta10)
(Power_Beta10=pnorm(z_Beta10-qnorm(1-alpha/2/tau))+pnorm(-z_Beta10-qnorm(1-alpha/2/tau)))
## [1] 0.9 0.9 0.9 0.9 0.9 0.9
MultipleSampleDesign.AC.Equality <- as.data.frame(cbind(p_AF,p_CR,n_Beta30,n_Beta20,n_Beta10))
names(MultipleSampleDesign.AC.Equality) <- c("Treatment_A.Response.Rate",
                                             "Treatment_B.Response.Rate",
                                             "Power=70%",
                                             "Power=80%",
                                             "Power=90%")

##################################
# Restructuring the data
##################################
MultipleSampleDesign.AC.Equality.Reshaped <- gather(MultipleSampleDesign.AC.Equality,
                                            "Power=70%","Power=80%","Power=90%",
                                            key="Power",
                                            value="Sample.Size")
MultipleSampleDesign.AC.Equality.Reshaped$Label <- rep("MSDOWAP_AC_EQUALITY",nrow(MultipleSampleDesign.AC.Equality.Reshaped))

MultipleSampleDesign.AC.Equality.Reshaped
##    Treatment_A.Response.Rate Treatment_B.Response.Rate     Power Sample.Size
## 1                        0.2                      0.70 Power=70%   12.605077
## 2                        0.2                      0.72 Power=70%   11.389522
## 3                        0.2                      0.74 Power=70%   10.292767
## 4                        0.2                      0.76 Power=70%    9.299112
## 5                        0.2                      0.78 Power=70%    8.395418
## 6                        0.2                      0.80 Power=70%    7.570617
## 7                        0.2                      0.70 Power=80%   15.494289
## 8                        0.2                      0.72 Power=80%   14.000117
## 9                        0.2                      0.74 Power=80%   12.651975
## 10                       0.2                      0.76 Power=80%   11.430563
## 11                       0.2                      0.78 Power=80%   10.319733
## 12                       0.2                      0.80 Power=80%    9.305879
## 13                       0.2                      0.70 Power=90%   19.994106
## 14                       0.2                      0.72 Power=90%   18.066000
## 15                       0.2                      0.74 Power=90%   16.326333
## 16                       0.2                      0.76 Power=90%   14.750202
## 17                       0.2                      0.78 Power=90%   13.316767
## 18                       0.2                      0.80 Power=90%   12.008472
##                  Label
## 1  MSDOWAP_AC_EQUALITY
## 2  MSDOWAP_AC_EQUALITY
## 3  MSDOWAP_AC_EQUALITY
## 4  MSDOWAP_AC_EQUALITY
## 5  MSDOWAP_AC_EQUALITY
## 6  MSDOWAP_AC_EQUALITY
## 7  MSDOWAP_AC_EQUALITY
## 8  MSDOWAP_AC_EQUALITY
## 9  MSDOWAP_AC_EQUALITY
## 10 MSDOWAP_AC_EQUALITY
## 11 MSDOWAP_AC_EQUALITY
## 12 MSDOWAP_AC_EQUALITY
## 13 MSDOWAP_AC_EQUALITY
## 14 MSDOWAP_AC_EQUALITY
## 15 MSDOWAP_AC_EQUALITY
## 16 MSDOWAP_AC_EQUALITY
## 17 MSDOWAP_AC_EQUALITY
## 18 MSDOWAP_AC_EQUALITY
##################################
# Plotting the sample sizes and power curves
##################################
(MultipleSampleDesign.AC.Equality.PowerAnalysis <-    ggplot(MultipleSampleDesign.AC.Equality.Reshaped,
                                                      aes(x=Treatment_B.Response.Rate,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Clinical Response Rate for Treatment C",
                      limits=c(0.70,0.80),
                      breaks=seq(0.70,0.80,by=0.02)) +
  scale_y_continuous(name="Treatment Group Sample Size",
                     limits=c(0,160),
                     breaks=seq(0,160,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50)) +
  ggtitle("Multiple-Sample Design : Test for Equality Between A Versus C"))

##################################
# Conducting a pairwise comparison
# between Treatments B and C
##################################

##################################
# Computing the range of sample sizes based on different levels of Type II errors
##################################
beta_1=beta[1]
(n_Beta30=(p_BF*(1-p_BF)+p_CR*(1-p_CR))*
    ((qnorm(1-alpha/2/tau)+qnorm(1-beta_1))/(p_BF-p_CR))^2)
## [1] 97.94485 79.46801 65.41486 54.47820 45.80030 38.79941
ceiling(n_Beta30)
## [1] 98 80 66 55 46 39
z_Beta30=(p_BF-p_CR)/sqrt(p_BF*(1-p_BF)/n_Beta30+p_CR*(1-p_CR)/n_Beta30)
(Power_Beta30=pnorm(z_Beta30-qnorm(1-alpha/2/tau))+pnorm(-z_Beta30-qnorm(1-alpha/2/tau)))
## [1] 0.7000001 0.7000001 0.7000001 0.7000001 0.7000001 0.7000001
beta_2=beta[2]
(n_Beta20=(p_BF*(1-p_BF)+p_CR*(1-p_CR))*
    ((qnorm(1-alpha/2/tau)+qnorm(1-beta_2))/(p_BF-p_CR))^2)
## [1] 120.39481  97.68289  80.40861  66.96516  56.29819  47.69263
ceiling(n_Beta20)
## [1] 121  98  81  67  57  48
z_Beta20=(p_BF-p_CR)/sqrt(p_BF*(1-p_BF)/n_Beta20+p_CR*(1-p_CR)/n_Beta20)
(Power_Beta20=pnorm(z_Beta20-qnorm(1-alpha/2/tau))+pnorm(-z_Beta20-qnorm(1-alpha/2/tau)))
## [1] 0.8 0.8 0.8 0.8 0.8 0.8
beta_3=beta[3]
(n_Beta10=(p_BF*(1-p_BF)+p_CR*(1-p_CR))*
    ((qnorm(1-alpha/2/tau)+qnorm(1-beta_3))/(p_BF-p_CR))^2)
## [1] 155.35960 126.05174 103.76070  86.41303  72.64819  61.54342
ceiling(n_Beta10)
## [1] 156 127 104  87  73  62
z_Beta10=(p_BF-p_CR)/sqrt(p_BF*(1-p_BF)/n_Beta10+p_CR*(1-p_CR)/n_Beta10)
(Power_Beta10=pnorm(z_Beta10-qnorm(1-alpha/2/tau))+pnorm(-z_Beta10-qnorm(1-alpha/2/tau)))
## [1] 0.9 0.9 0.9 0.9 0.9 0.9
MultipleSampleDesign.BC.Equality <- as.data.frame(cbind(p_BF,p_CR,n_Beta30,n_Beta20,n_Beta10))
names(MultipleSampleDesign.BC.Equality) <- c("Treatment_A.Response.Rate",
                                             "Treatment_B.Response.Rate",
                                             "Power=70%",
                                             "Power=80%",
                                             "Power=90%")

##################################
# Restructuring the data
##################################
MultipleSampleDesign.BC.Equality.Reshaped <- gather(MultipleSampleDesign.BC.Equality,
                                            "Power=70%","Power=80%","Power=90%",
                                            key="Power",
                                            value="Sample.Size")
MultipleSampleDesign.BC.Equality.Reshaped$Label <- rep("MSDOWAP_BC_EQUALITY",nrow(MultipleSampleDesign.BC.Equality.Reshaped))

MultipleSampleDesign.BC.Equality.Reshaped
##    Treatment_A.Response.Rate Treatment_B.Response.Rate     Power Sample.Size
## 1                        0.5                      0.70 Power=70%    97.94485
## 2                        0.5                      0.72 Power=70%    79.46801
## 3                        0.5                      0.74 Power=70%    65.41486
## 4                        0.5                      0.76 Power=70%    54.47820
## 5                        0.5                      0.78 Power=70%    45.80030
## 6                        0.5                      0.80 Power=70%    38.79941
## 7                        0.5                      0.70 Power=80%   120.39481
## 8                        0.5                      0.72 Power=80%    97.68289
## 9                        0.5                      0.74 Power=80%    80.40861
## 10                       0.5                      0.76 Power=80%    66.96516
## 11                       0.5                      0.78 Power=80%    56.29819
## 12                       0.5                      0.80 Power=80%    47.69263
## 13                       0.5                      0.70 Power=90%   155.35960
## 14                       0.5                      0.72 Power=90%   126.05174
## 15                       0.5                      0.74 Power=90%   103.76070
## 16                       0.5                      0.76 Power=90%    86.41303
## 17                       0.5                      0.78 Power=90%    72.64819
## 18                       0.5                      0.80 Power=90%    61.54342
##                  Label
## 1  MSDOWAP_BC_EQUALITY
## 2  MSDOWAP_BC_EQUALITY
## 3  MSDOWAP_BC_EQUALITY
## 4  MSDOWAP_BC_EQUALITY
## 5  MSDOWAP_BC_EQUALITY
## 6  MSDOWAP_BC_EQUALITY
## 7  MSDOWAP_BC_EQUALITY
## 8  MSDOWAP_BC_EQUALITY
## 9  MSDOWAP_BC_EQUALITY
## 10 MSDOWAP_BC_EQUALITY
## 11 MSDOWAP_BC_EQUALITY
## 12 MSDOWAP_BC_EQUALITY
## 13 MSDOWAP_BC_EQUALITY
## 14 MSDOWAP_BC_EQUALITY
## 15 MSDOWAP_BC_EQUALITY
## 16 MSDOWAP_BC_EQUALITY
## 17 MSDOWAP_BC_EQUALITY
## 18 MSDOWAP_BC_EQUALITY
##################################
# Plotting the sample sizes and power curves
##################################
(MultipleSampleDesign.BC.Equality.PowerAnalysis <-    ggplot(MultipleSampleDesign.BC.Equality.Reshaped,
                                                      aes(x=Treatment_B.Response.Rate,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Clinical Response Rate for Treatment C",
                      limits=c(0.70,0.80),
                      breaks=seq(0.70,0.80,by=0.02)) +
  scale_y_continuous(name="Treatment Group Sample Size",
                     limits=c(0,160),
                     breaks=seq(0,160,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50)) +
  ggtitle("Multiple-Sample Design : Test for Equality Between B Versus C"))

1.4.2 Consolidated Evaluation


[A] The minimum sample size required for each treatment group in a Multiple-Sample One-Way ANOVA Pairwise design depends on the following factors, defined and hypothesized prior to the clinical research:
     [A.1] Level of significance (Alpha) : A lower value increases the sample size
     [A.2] Power of the test (1-Beta) : A higher value increases the sample size
     [A.3] Epsilon (Difference between the treatment response rates) : A lower value increases the sample size
     [A.4] Tau (Number of pairwise comparisons among the treatment groups) : A higher value increases the sample size
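These directional effects can be verified with the same normal-approximation formula applied in the code chunks above; `n_equality` below is a hypothetical helper name that simply restates that formula for a single pairwise comparison.

```r
# Hypothetical helper restating the per-group sample size formula used above:
# n = (p1(1-p1) + p2(1-p2)) * ((z_{1-alpha/(2*tau)} + z_{1-beta}) / (p1-p2))^2
n_equality <- function(p1, p2, alpha, beta, tau) {
  (p1 * (1 - p1) + p2 * (1 - p2)) *
    ((qnorm(1 - alpha / 2 / tau) + qnorm(1 - beta)) / (p1 - p2))^2
}

base <- n_equality(0.2, 0.4, alpha = 0.05, beta = 0.20, tau = 3)
n_equality(0.2, 0.4, alpha = 0.01, beta = 0.20, tau = 3) > base  # [A.1] lower alpha : larger n
n_equality(0.2, 0.4, alpha = 0.05, beta = 0.10, tau = 3) > base  # [A.2] higher power : larger n
n_equality(0.2, 0.5, alpha = 0.05, beta = 0.20, tau = 3) < base  # [A.3] larger difference : smaller n
n_equality(0.2, 0.4, alpha = 0.05, beta = 0.20, tau = 6) > base  # [A.4] more comparisons : larger n
```

Each comparison evaluates to TRUE under these settings, confirming the direction of each factor effect.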
[B] Given a 5% level of significance, three treatment groups and a varying range of hypothesized clinical response rates across the pairwise treatment combinations, the required per-group sample size is the maximum value obtained across all pairwise computations at each level of statistical power:
     [B.1] 70% power : sample size = 98
     [B.2] 80% power : sample size = 121
     [B.3] 90% power : sample size = 156
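The maximum rule in [B] can be rechecked directly at 80% power using the most conservative (smallest-difference) hypothesized rate from each range; `pairwise_n` below is a hypothetical helper restating the formula from the code chunks above.

```r
# Hypothetical helper restating the per-group sample size formula used above
pairwise_n <- function(p1, p2, alpha = 0.05, beta = 0.20, tau = 3) {
  (p1 * (1 - p1) + p2 * (1 - p2)) *
    ((qnorm(1 - alpha / 2 / tau) + qnorm(1 - beta)) / (p1 - p2))^2
}

# Per-group requirements at 80% power for the A-B, A-C and B-C comparisons
n_pairs <- c(AB = pairwise_n(0.2, 0.4),
             AC = pairwise_n(0.2, 0.7),
             BC = pairwise_n(0.5, 0.7))
ceiling(max(n_pairs))
## [1] 121
```

The B-versus-C comparison drives the overall requirement, matching the 80% power figure in [B.2].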

Code Chunk | Output
##################################
# Consolidating the power analyses charts
##################################
MSDOWAP.AB.Equality.PowerAnalysis <- ggplot(MultipleSampleDesign.AB.Equality.Reshaped,
                                                      aes(x=Treatment_B.Response.Rate,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   facet_grid(. ~ Label) +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Clinical Response Rate for Treatment B",
                      limits=c(0.40,0.50),
                      breaks=seq(0.40,0.50,by=0.02)) +
  scale_y_continuous(name="Treatment Group Sample Size",
                     limits=c(0,160),
                     breaks=seq(0,160,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50))

MSDOWAP.AC.Equality.PowerAnalysis <-  ggplot(MultipleSampleDesign.AC.Equality.Reshaped,
                                                      aes(x=Treatment_B.Response.Rate,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   facet_grid(. ~ Label) +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Clinical Response Rate for Treatment C",
                      limits=c(0.70,0.80),
                      breaks=seq(0.70,0.80,by=0.02)) +
  scale_y_continuous(name="Treatment Group Sample Size",
                     limits=c(0,160),
                     breaks=seq(0,160,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50))

MSDOWAP.BC.Equality.PowerAnalysis <- ggplot(MultipleSampleDesign.BC.Equality.Reshaped,
                                                      aes(x=Treatment_B.Response.Rate,
                                                          y=Sample.Size, 
                                                          color=Power)) +
   geom_line(size=1)+
   geom_point(size=4)+
   theme_bw() +
   facet_grid(. ~ Label) +
   scale_color_brewer(palette="Paired") +
   scale_x_continuous(name="Hypothesized Clinical Response Rate for Treatment C",
                      limits=c(0.70,0.80),
                      breaks=seq(0.70,0.80,by=0.02)) +
  scale_y_continuous(name="Treatment Group Sample Size",
                     limits=c(0,160),
                     breaks=seq(0,160,by=10)) +
  theme(axis.title.x=element_text(color="black",face="bold",size=12),
        legend.position="top",
        axis.title.y=element_text(color="black",face="bold",size=12),
        plot.title=element_text(color="black",size=14,face="bold",hjust=0.50))


MSDOWAP_PowerAnalysis <- ggarrange(MSDOWAP.AB.Equality.PowerAnalysis,
                                   MSDOWAP.AC.Equality.PowerAnalysis,
                                   MSDOWAP.BC.Equality.PowerAnalysis,
                                   ncol=2, nrow=2)

annotate_figure(MSDOWAP_PowerAnalysis,
                top = text_grob("Multiple-Sample Design One-Way ANOVA Power Analysis by Pairwise Comparison",
                                color = "black",
                                face = "bold",
                                size = 14))

2. Summary



3. References


[Book] Sample Size Calculations in Clinical Research by Shein-Chung Chow, Jun Shao, Hansheng Wang and Yuliya Lokhnygina
[Book] Clinical Trial Data Analysis Using R and SAS by Ding-Geng Chen, Karl Peace and Pinggao Zhang
[Book] Design and Analysis of Experiments with R by John Lawson
[Book] Design and Analysis of Experiments by Angela Dean, Daniel Voss and Danel Draguljić
[Book] Design and Analysis of Experiments by Douglas Montgomery
[R Package] rstatix by Alboukadel Kassambara
[R Package] car by John Fox and Sanford Weisberg
[R Package] effects by John Fox and Sanford Weisberg
[R Package] multcomp by Torsten Hothorn, Frank Bretz and Peter Westfall
[R Package] psych by William Revelle
[R Package] bootstrap by Robert Tibshirani
[R Package] moments by Lukasz Komsta and Frederick Novomestky
[R Package] dplyr by Hadley Wickham
[R Package] ggplot2 by Hadley Wickham
[R Package] ggpubr by Alboukadel Kassambara
[R Package] ggfortify by Yuan Tang
[R Package] trend by Thorsten Pohlert
[Article] Power and Sample Size by HyLown Consulting Team
[Article] An Introduction to Different Types of Study Design by Hadi Abbas
[Article] Case-Control and Cohort Studies: A Brief Overview by Saul Crandon
[Article] Cohort Studies: Prospective and Retrospective Designs by Izabel de Oliveira
[Article] Prevalence vs. Incidence: What is the Difference by Georgina Ford
[Article] Randomized Clinical Trial (RCT): Simple Definition, Phases, and Types by Statistics-How-To Team