Low Cost SSR | ngyx.eu

Low Cost SSR (Sample Sizing and Randomization)

A. Sample Sizing.

Sample size determination is straightforward IF you can figure out your expectations (effect size). There are 2 formulas to apply, depending on whether the effect relates to continuous variables or to proportions/percentages.

Important: Whatever the variable type, you first need to define:

a. Risk Alpha (a percentage > 0% and < 50%). This is what is usually called the False Positive Risk. Let's phrase it as: "If I set the Risk Alpha at 5% (the usual value) for predicting a disease via (e.g.) a biomarker, it means that I accept at most a 5% chance of predicting a disease that does not actually exist" (see also the Risk Alpha/Beta remark below).

b. Risk Beta / Power (= 1 - Risk Beta). Risk Beta is also known as the False Negative Risk. Here you set the limit for missing an existing disease via (e.g.) a biomarker. Risk Beta should be > 0% and < 50% (usually 20%, i.e. a Power of 80%).

Risk Alpha/Beta remark: In most cases you expect your effect size to go one way, but it can still go the other way, so the test is two-sided. That is why (this one is for statisticians) the formula fetches the Z value at Alpha/2 (e.g. Alpha 5%, but Z read from the tables at 2.5%). This does not apply to Risk Beta because, when you miss a prediction, the side does not matter. Good to know: I was recently involved in a clinical trial (CT) where a biomarker went significantly the other way round than expected... Amazing...
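
For those who prefer code to printed tables, here is a minimal Python sketch of how the two Z constants can be obtained. It assumes Python 3.9+ and its standard statistics.NormalDist class; the function name z_values is only an illustration and is not part of any package.

```python
from statistics import NormalDist


def z_values(alpha: float = 0.05, beta: float = 0.20) -> tuple[float, float]:
    """Return (Z at Alpha/2, Z at Beta) from the standard normal distribution."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Two-sided test: Risk Alpha is split over both tails, so we read at Alpha/2.
    z_alpha_half = std_normal.inv_cdf(1 - alpha / 2)
    # Risk Beta is not split: the side does not matter when a prediction is missed.
    z_beta = std_normal.inv_cdf(1 - beta)
    return z_alpha_half, z_beta


z_a, z_b = z_values(alpha=0.05, beta=0.20)
print(round(z_a, 2), round(z_b, 2))  # 1.96 0.84
```

With the usual Alpha of 5% and Beta of 20% (Power 80%), this gives the familiar table values 1.96 and 0.84.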

I. Continuous Variables.

Let's say that you have a disease-related biomarker that can be measured by testing samples and expressed as a value, e.g. a concentration in blood. In such a case the formula used requires (see also: Required Information here below):

a. Base Level of the variable. e.g. Blood Concentration of the Biomarker in the global population without disease.

b. Expected Level of the variable. e.g. Blood Concentration of the Biomarker in people with disease.

c. Variable Measurement Variability (Standard Deviation of the variable). This one is slightly more difficult, as it can differ between the disease and no-disease subpopulations. If there is clearly a difference, I recommend using the largest one (to stay on the safe side).

The formula for calculating the sample size is:

$$ n = \frac{(Z_{\alpha/2} + Z_{\beta})^2 \times 2\sigma^2}{(\mu_1 - \mu_2)^2} $$

where n, the sample size required in each group, is calculated from:

μ1 =Base Level of the variable.

μ2 = Expected Level of the variable.

σ = Variable Measurement Variability

And extracted from tables, the constants:

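Z α/2 = standard normal value for Risk Alpha, read at Alpha/2 (1.96 for Alpha = 5%)

Z β = standard normal value for Risk Beta (0.84 for Risk Beta = 20%, i.e. Power = 80%)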
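
As a worked illustration of the continuous-variable formula, here is a short Python sketch. The biomarker values (base level 10, expected level 12, variability 4) are invented for the example and the function name is hypothetical. The result is rounded up, since you cannot recruit a fraction of a patient.

```python
import math
from statistics import NormalDist


def n_per_group_continuous(mu1: float, mu2: float, sigma: float,
                           alpha: float = 0.05, beta: float = 0.20) -> int:
    """Sample size per group for comparing two means (two-sided Risk Alpha)."""
    std_normal = NormalDist()
    z_alpha_half = std_normal.inv_cdf(1 - alpha / 2)  # Z at Alpha/2
    z_beta = std_normal.inv_cdf(1 - beta)             # Z at Beta
    n = (z_alpha_half + z_beta) ** 2 * 2 * sigma ** 2 / (mu1 - mu2) ** 2
    return math.ceil(n)  # round up to a whole patient


# Invented example values: base level 10, expected level 12, variability 4.
print(n_per_group_continuous(mu1=10.0, mu2=12.0, sigma=4.0))  # 63
```

Note how quickly n grows when the expected difference shrinks or the variability increases: both enter the formula squared.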

II. Proportions/Percentages.

Imagine that you have data about pain collected via a VAS (Visual Analogue Scale; 0% no pain - 100% full pain) before and after a treatment for a disease. In this case the formula also requires the 2 risks (Alpha and Beta), plus:

a. Base level percentage. What's the % that you will observe in a "non treated disease" population?

b. Expected level percentage. What's the (reduced) percentage that you will observe in the treated population?

The formula for calculating the sample size is:

$$ n = \frac{(Z_{\alpha/2} + Z_{\beta})^2 \times \{p_1 (1 - p_1) + p_2 (1 - p_2)\}}{(p_1 - p_2)^2} $$

where n, the sample size required in each group, is calculated from (values given as an example):

p1 = proportion of subjects cured by Drug A, e.g. 0.50,

p2 = proportion of subjects cured by Placebo, e.g. 0.34,

And extracted from tables, the constants:

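Z α/2 = standard normal value for Risk Alpha, read at Alpha/2 (1.96 for Alpha = 5%)

Z β = standard normal value for Risk Beta (0.84 for Risk Beta = 20%, i.e. Power = 80%)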
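
The same calculation for proportions can be sketched in Python as follows, using the example values p1 = 0.50 and p2 = 0.34 with the usual Alpha of 5% and Beta of 20%. The function name is hypothetical and the result is rounded up to a whole patient.

```python
import math
from statistics import NormalDist


def n_per_group_proportions(p1: float, p2: float,
                            alpha: float = 0.05, beta: float = 0.20) -> int:
    """Sample size per group for comparing two proportions (two-sided Risk Alpha)."""
    std_normal = NormalDist()
    z_alpha_half = std_normal.inv_cdf(1 - alpha / 2)  # Z at Alpha/2
    z_beta = std_normal.inv_cdf(1 - beta)             # Z at Beta
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha_half + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2
    return math.ceil(n)  # round up to a whole patient


print(n_per_group_proportions(p1=0.50, p2=0.34))  # 146
```

So with these example values you would need 146 subjects in each group, before any allowance for drop-outs.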

Also, while using this approach you won't face any problem with submission to regulatory authorities (ICH E9, pp. 16 and following), there is much more to say and to take into consideration when defining the sample size. Click here for more info and discover how we manage these aspects.

B. Randomization.

Once the final sample size is defined, randomization comes next (and not only for blind/double-blind trials...).

Randomization starts with the identification of "groups" (grouping the data). Let's take an example.

Suppose you have a parallel-group study with one product and one placebo. Suppose that you would like to make the distinction between male and female. And also that you would like to consider the data coming from 3 different sites on their own.

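So in fact you will face 12 datasets (2 products x 2 sexes x 3 sites). The product is what gets randomized, so the randomization has to be done independently within each of the 6 sex x site combinations...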

In terms of products you only have 2. And there is only one way to randomize these 2 products perfectly within these 6 datasets (let's call them the predefined stratification datasets).
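
To make the counting explicit, here is a small Python sketch that enumerates the combinations of the example. The labels (A/B, site names) are placeholders chosen for the illustration.

```python
from itertools import product

treatments = ["A", "B"]                 # product and placebo (placeholder labels)
sexes = ["male", "female"]
sites = ["site 1", "site 2", "site 3"]  # placeholder site names

# Stratification datasets: every sex x site combination gets its own randomization list.
strata = list(product(sexes, sites))

# Final datasets: each stratum is split between the 2 products.
datasets = list(product(treatments, sexes, sites))

print(len(strata), len(datasets))  # 6 12
```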

I. The most important part: Seeding! It's the guarantee of reproducibility, whatever the code/system used for generating data at random. AND YOU SHOULD ALWAYS CHECK that with the same seed you regenerate the same outputs.
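
Here is a minimal Python sketch of such a check, assuming the standard random module. The seed value and the function name draw_arms are arbitrary choices for the illustration. The sketch draws with rng.random() because that is the method Python documents as producing the same sequence for the same seed across versions.

```python
import random


def draw_arms(seed: int, n: int) -> list[str]:
    """Draw n purely random A/B allocations from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, independent of any global state
    return ["A" if rng.random() < 0.5 else "B" for _ in range(n)]


first = draw_arms(seed=20240101, n=10)
second = draw_arms(seed=20240101, n=10)

# The check recommended above: same seed, same outputs.
assert first == second
print(first)
```

Keep the seed (and the software version) on file together with the randomization list, so the list can be regenerated on request.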

II. Here is the "Ultimate Sequence":

1. Alea() A or B

2. Alea() A or B

Then all of following...

3. if (count of A - count of B) > 1: B

elsif (count of B - count of A) > 1: A

else: Alea() A or B

...

Nothing else!!!

Prepare the envelopes and distribute them.
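
The sequence above can be sketched in Python as follows. This is only one possible reading of the rule, not a validated tool. The function name, the seeds and the list length of 20 are assumptions made for the illustration, and one independently seeded list is generated per stratification dataset.

```python
import random


def allocation_sequence(n: int, seed: int) -> list[str]:
    """Build n allocations: random draws, forced back once the gap exceeds 1."""
    rng = random.Random(seed)  # seeded, so the list can be regenerated
    count = {"A": 0, "B": 0}
    sequence = []
    for _ in range(n):
        if count["A"] - count["B"] > 1:
            arm = "B"  # A is ahead by 2: force B
        elif count["B"] - count["A"] > 1:
            arm = "A"  # B is ahead by 2: force A
        else:
            # Random draw; the first two allocations always end up here.
            arm = "A" if rng.random() < 0.5 else "B"
        count[arm] += 1
        sequence.append(arm)
    return sequence


# One list per stratification dataset (2 sexes x 3 sites), each with its own seed.
strata = [(sex, site) for sex in ("male", "female") for site in (1, 2, 3)]
lists = {stratum: allocation_sequence(n=20, seed=1000 + i)
         for i, stratum in enumerate(strata)}

for stratum, sequence in lists.items():
    print(stratum, "".join(sequence))
```

With this rule the two products never differ by more than 2 allocations at any point within a list, so each stratification dataset stays balanced even if recruitment stops early.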

Our Low Cost SSR package: 500 Euros + 10 Euros/Theoretical Patient. Click Here if interested.

Feel free to contact us: Info_ClinicalTrials@NGYX.eu

NGYX I.C. details / Coordinates.

Company N°: BE 0537.471.159