Sample Size Determination, P-Value Calculation & Hypothesis Testing

A detailed step-by-step approach to calculating minimum sample size, p-values, and conducting hypothesis tests in statistical studies.

Posted by Alfred Prah on October 06, 2024 · 4 mins read

I've been tinkering with what it would take to accelerate GenAIBro's growth, and have been reflecting quite a bit on experimentation. Granted, GenAIBro is still in its infancy, but with the trajectory it's on, I'd be remiss if I weren't planning next steps already. Now, to the article! When conducting experiments, especially in marketing, it's crucial to:

  • Determine the appropriate sample size to achieve desired precision and confidence.
  • Understand how to calculate p-values to assess the statistical significance of your results.
  • Know when to use different statistical tests and how to interpret them.

I created this guide as a refresher, and it should walk you (read: us) through these concepts using practical examples, focusing on the use case of estimating click-through rates (CTR) on platforms.


Determining Sample Size for Estimating a Proportion

Step-by-Step Method

To determine the minimum sample size (N) required to estimate a population proportion (P) within a margin of error (δ) at a specific confidence level:

  1. Choose the Confidence Level and Z-value:
    Common Z-values for confidence levels are:
    • 90%: Z ≈ 1.645
    • 95%: Z ≈ 1.96
    • 99%: Z ≈ 2.58
  2. Decide on the Margin of Error (δ):
    This represents the maximum acceptable difference between your sample estimate and the true population proportion.
  3. Estimate the Population Proportion (P):
    Use prior data if you have it. If P is unknown, use P = 0.5: this maximizes P(1-P) and therefore gives the most conservative (largest) sample size.
  4. Apply the Sample Size Formula:
    N = (Z * √(P(1-P)) / δ)^2
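
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names `z_for_confidence` and `min_sample_size` are my own, and the small rounding step before the ceiling is an assumption added to keep floating-point noise from bumping the result up by one.

```python
import math
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Two-sided critical value for a confidence level, e.g. 0.95 -> about 1.96."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def min_sample_size(p: float, delta: float, z: float) -> int:
    """N = (Z * sqrt(P(1-P)) / delta)^2, rounded up to a whole observation."""
    raw = (z * math.sqrt(p * (1 - p)) / delta) ** 2
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(raw, 6))


# Illustrative inputs (not from the examples in this post):
# unknown proportion, 5% margin of error, 95% confidence.
z = z_for_confidence(0.95)
print(round(z, 3))                    # 1.96
print(min_sample_size(0.5, 0.05, z))  # 385
```

Rounding up rather than to the nearest integer is deliberate: a fractional result means the next whole observation is needed to meet the margin of error.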

Examples

Example 1: Estimated Proportion Known

  • Estimated CTR (P): 10%
  • Margin of Error (δ): 2%
  • Confidence Level: 95% (Z ≈ 2, rounding 1.96 up for easier arithmetic)

N = (2 × √(0.10 × 0.90) / 0.02)² = (2 × 0.3 / 0.02)² = 30² = 900, so the minimum sample size is 900.

Example 2: Estimated Proportion Unknown

  • P = 0.5
  • δ = 2%
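  • Confidence Level: 95% (Z ≈ 2, rounding 1.96 up for easier arithmetic)

N = (2 × √(0.5 × 0.5) / 0.02)² = (2 × 0.5 / 0.02)² = 50² = 2,500, so the minimum sample size is 2,500.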
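
To see how much the rounded Z value matters, here is a hedged sketch that recomputes both examples with Z = 2 and with the more precise Z = 1.96. The helper `min_sample_size` is a hypothetical name of my own, and the rounding guard before the ceiling is an assumption to absorb floating-point error.

```python
import math


def min_sample_size(p: float, delta: float, z: float) -> int:
    """Sample size from N = Z^2 * P(1-P) / delta^2, rounded up."""
    raw = z ** 2 * p * (1 - p) / delta ** 2
    # Guard against floating-point noise such as 900.0000000000001.
    return math.ceil(round(raw, 6))


for label, p in [("Example 1 (P = 0.10)", 0.10), ("Example 2 (P = 0.50)", 0.50)]:
    with_z2 = min_sample_size(p, delta=0.02, z=2.0)
    with_z196 = min_sample_size(p, delta=0.02, z=1.96)
    print(f"{label}: Z=2 -> {with_z2}, Z=1.96 -> {with_z196}")
# Example 1 (P = 0.10): Z=2 -> 900, Z=1.96 -> 865
# Example 2 (P = 0.50): Z=2 -> 2500, Z=1.96 -> 2401
```

Using Z = 2 slightly overshoots the requirement, which errs on the safe side at the cost of a few extra observations.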


Calculating P-Values and Hypothesis Testing

The framework for hypothesis testing includes:

  1. Formulate Hypotheses:
    The Null Hypothesis (H₀) assumes no effect or difference, while the Alternative Hypothesis (H₁) states that there is an effect or difference.
  2. Choose Significance Level (α):
    Common choices are 0.05, 0.01, or 0.10.
  3. Calculate Test Statistics:
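    For proportions with large sample sizes, use the z-statistic, where P̂ is the observed sample proportion, P₀ is the proportion assumed under H₀, and N is the sample size:
    Z = (P̂ - P₀) / √(P₀(1-P₀) / N)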
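
As a rough sketch, the z-statistic for a single proportion translates directly into Python. The function name `z_statistic` and the input numbers below are illustrative assumptions, chosen so the arithmetic is easy to check by hand.

```python
import math


def z_statistic(p_hat: float, p0: float, n: int) -> float:
    """Z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n) for a one-sample proportion test."""
    # The standard error uses p0, the proportion assumed under the null hypothesis.
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


# Hypothetical data: 12% observed CTR against a 10% baseline with 900 impressions.
# Standard error = sqrt(0.09 / 900) = 0.01, so Z is about 0.02 / 0.01 = 2.
print(round(z_statistic(p_hat=0.12, p0=0.10, n=900), 3))  # 2.0
```

Note that the denominator is built from P₀ rather than P̂, because the test asks how surprising the data would be if the null hypothesis were true.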

When to Use t-Statistic vs. z-Statistic

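  • Z-Statistic: Used when the population standard deviation is known, or when the sample is large (N ≥ 30 is a common rule of thumb). For proportions such as CTR, the z-test is the usual choice because the standard deviation under H₀ follows directly from P₀.
  • T-Statistic: Used when the population standard deviation is unknown and must be estimated from the sample, especially with small samples (N < 30).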

One-Tailed vs. Two-Tailed Tests

  • One-Tailed Test: Tests for an effect in one direction.
  • Two-Tailed Test: Tests for any significant difference, regardless of direction.

Choosing between these depends on whether you are testing for a specific direction or any deviation.
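
The difference shows up in how the p-value is read off the standard normal distribution. Below is a minimal sketch; the function `p_value` and its `alternative` labels ("greater", "less", "two-sided") are naming assumptions of mine, not terms from this guide.

```python
from statistics import NormalDist


def p_value(z: float, alternative: str = "two-sided") -> float:
    """p-value for a z-statistic under the standard normal distribution."""
    cdf = NormalDist().cdf
    if alternative == "greater":    # one-tailed: H1 says the value is larger
        return 1 - cdf(z)
    if alternative == "less":       # one-tailed: H1 says the value is smaller
        return cdf(z)
    if alternative == "two-sided":  # two-tailed: H1 says the value differs
        return 2 * (1 - cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")


z = 1.96
print(round(p_value(z, "greater"), 3))    # 0.025
print(round(p_value(z, "two-sided"), 3))  # 0.05
```

For the same Z, the two-tailed p-value is twice the one-tailed value in the direction of the observed effect, so a result can be significant one-tailed and not two-tailed. That is why the direction should be chosen before looking at the data.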


Comprehensive Example

Scenario: Testing if a new ad design increases CTR from 5%.

  1. Determine Sample Size:
    With a desired margin of error of ±2%, a confidence level of 95%, and an estimated CTR of 5%, you calculate a sample size of approximately 475 (using Z ≈ 2; the more precise Z = 1.96 gives about 457).
  2. Collect Data:
    Sample size: 500
    Observed clicks: 35
    Observed CTR: 7%
  3. Conduct Hypothesis Test:
    Null Hypothesis (H₀): P = 0.05
    Alternative Hypothesis (H₁): P > 0.05
  4. Calculate Test Statistic:
    Z = (0.07 - 0.05) / √(0.05 × 0.95 / 500) ≈ 2.052.
  5. Calculate P-Value:
    One-tailed p-value ≈ 0.0201, which is less than α = 0.05, so we reject the null hypothesis: the data support the claim that the new design's CTR exceeds 5%.
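
For anyone who wants to check the arithmetic, here is a short sketch of steps 2 through 5 using only the Python standard library. The variable names are my own, and the script assumes the normal approximation used throughout this guide.

```python
import math
from statistics import NormalDist

# Observed data and test setup from the scenario above.
n = 500        # sample size
clicks = 35    # observed clicks
p0 = 0.05      # CTR under the null hypothesis
alpha = 0.05   # significance level

p_hat = clicks / n                                 # observed CTR, 0.07
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)    # one-sample proportion z-statistic
p_value = 1 - NormalDist().cdf(z)                  # one-tailed, since H1 is P > 0.05
reject_h0 = p_value < alpha

print(f"observed CTR = {p_hat:.2%}")
print(f"z = {z:.3f}, one-tailed p-value = {p_value:.4f}")
print(f"reject H0 at alpha = {alpha}: {reject_h0}")
```

Running it should give a Z of roughly 2.05 and a one-tailed p-value of roughly 0.02, which is below α = 0.05.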

Conclusion

By following this guide, you can:

  • Determine sample sizes for statistical studies.
  • Calculate p-values for statistical significance.
  • Decide between t-statistics and z-statistics.
  • Choose the appropriate hypothesis testing method for your analysis.

In summary, these statistical principles make experimental findings more reliable and support informed, data-driven decisions.