E-commerce A/B Testing and Experimentation
A/B testing compares two versions of a page, feature, or experience to determine which performs better against defined business metrics. In e-commerce, systematic experimentation is the evidence-based alternative to HiPPO (Highest Paid Person's Opinion) decisions. Companies with mature experimentation programmes make more reliable improvements and avoid costly mistakes.
A/B Testing Mechanics
A random sample of users sees Version A (control); another sample sees Version B (variant). Traffic is typically split 50/50, or unevenly with a small share going to the variant for cautious rollouts. The test runs until the pre-calculated sample size is reached and is then evaluated for statistical significance (typically at a 95% confidence level); stopping as soon as a result first looks significant inflates the false-positive rate. Analysis measures the primary metric (conversion rate, AOV, revenue per visitor) and guards against negative effects on secondary metrics.
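One common way to implement the random split is to hash a stable user identifier, so that a returning user always sees the same version. The following is a minimal sketch under stated assumptions, not a description of any particular platform: the function name assign_variant, the experiment key, and the control_share parameter are all hypothetical.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, control_share: float = 0.5) -> str:
    """Deterministically assign a user to "A" (control) or "B" (variant).

    Hypothetical helper: the names and the 50/50 default are illustrative.
    """
    # Hash the experiment name together with the user ID so the same user
    # always gets the same version within one experiment, while different
    # experiments split the audience independently of each other.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()

    # Map the first 8 hex digits (32 bits) to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000

    return "A" if bucket < control_share else "B"


if __name__ == "__main__":
    # The same user and experiment always produce the same assignment.
    print(assign_variant("user-123", "checkout-button-colour"))
    # A cautious rollout could send only 10% of traffic to the variant.
    print(assign_variant("user-123", "new-checkout-flow", control_share=0.9))
```

Hashing rather than drawing a fresh random number on each visit keeps the experience consistent for each user without storing the assignment anywhere.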
Statistical Rigour
- Sample size: Calculate the required sample size before running the test; underpowered tests produce unreliable results. The baseline conversion rate, minimum detectable effect, significance level, and statistical power together determine the required traffic
- Test duration: Run for full business cycles (minimum 1-2 weeks to capture weekday/weekend variation)
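- Statistical significance: At a 95% confidence level (a 5% significance threshold), a difference this large would appear less than 5% of the time if the two versions truly performed the same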
- Novelty effect: New experiences sometimes perform better simply because they're new — monitor long-term performance
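The sample size and significance points above can be made concrete with a short sketch. It assumes the primary metric is a conversion rate and uses the standard normal-approximation formulas for comparing two proportions; the function names, the 80% power default, and the example figures are illustrative assumptions, not values from this article. Commercial platforms may use different methods (for example sequential or Bayesian statistics).

```python
import math
from statistics import NormalDist


def required_sample_size(baseline_rate, mde_relative, alpha=0.05, power=0.8):
    """Visitors needed per variant for a two-sided two-proportion test.

    baseline_rate: control conversion rate, e.g. 0.03 for 3%
    mde_relative:  minimum detectable effect as a relative lift, e.g. 0.10
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + mde_relative)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical plan: 3% baseline conversion, detect a 10% relative lift.
    print("visitors per variant:", required_sample_size(0.03, 0.10))

    # Hypothetical results: 300/10,000 conversions for A, 360/10,000 for B.
    z, p = two_proportion_z_test(300, 10_000, 360, 10_000)
    print(f"z = {z:.2f}, p = {p:.4f}, significant at 95%: {p < 0.05}")
```

With these example inputs the plan calls for roughly 53,000 visitors per variant, which shows why small lifts on low baseline rates need substantial traffic. The significance test is meant to be run once, after the planned sample size has been collected.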
Testing Tools
Optimizely (enterprise), VWO, AB Tasty, and Convert are full-featured experimentation platforms. Google Optimize was sunset in September 2023. Statsig and GrowthBook are developer-friendly options; GrowthBook is available as open source. Feature flag tools (LaunchDarkly, Unleash) can run experiments within feature flags.