Why do you test any change before implementing it on your website? Because it’s important if you want to improve the user experience, and because you don’t want to lose money. A/B tests are the gold standard in website maintenance; you can even apply them to display ads to increase your ad revenue.
The only problem is that they don’t always give you precise results. That doesn’t mean we should all just ditch A/B testing and stick to the old-fashioned ways of being guided by intuition or doing what everyone else is doing.
A/B tests are useful and important; you just need to avoid the common mistakes that undermine the practice.
Let’s go through 9 common mistakes that ruin the success of A/B testing. When you know what you shouldn’t do, it’s easier to do the right thing.
Cutting A/B Tests Short Too Early In The Process
This is the perfect recipe for statistically insignificant results. We’ve all been there: running an A/B test for three days and concluding that one version is clearly superior to the other. Yes, it’s possible to see a staggering difference between the two versions early on and decide that continuing the test is pointless.
At this stage, there’s an important thing to remember: Do NOT stop the test!
Over a longer term, the variant that seems inferior at first might turn out to be the better one. If you quit the test too early, you’ll lose conversions and frustrate a large part of your audience.
How do you know what the right length of testing is, anyway? You can use VWO’s tool to give you the answer: the A/B Split and Multivariate Test Duration Calculator. It estimates the proper length of an A/B test based on several factors: the minimum improvement in conversions you want to detect, the estimated existing conversion rate, the average number of daily visitors, the number of variations to run, and the percentage of visitors included in the test.
Running A/B Tests Without A Hypothesis
Ask yourself this question: why are you running the A/B test? Is it because all other marketers are doing it? The approach “I’ll just test things until something works” is wrong. You need a hypothesis. You need to be almost sure that something will work, and then you test it just to get the final approval before implementing the update.
Set a hypothesis: the audience will like version A better than version B. Then test to prove that hypothesis right or wrong. The goal is to learn more about the audience. The information you get through A/B testing will help you refine your overall customer theory and come up with better hypotheses to test in the future.
Discarding A Failed Test
There’s a case study that teaches us a lesson. For the improvement of TruckersReport website, the team ran 6 rounds of tests and they ended up using a version that was 79.3% better than the previous one.
Although the variation had a much better design than the old website, the control beat it in the first test by 13.56%. In other words, the first attempt to improve the site’s design failed. Did the team stop there? No. They decided that more testing was needed to figure out why that happened. They set a hypothesis about the potential problem and made further changes to the variation.
During the second test, the original beat the variation by 21.7% on bottom-of-the-funnel conversions. The team continued testing and ended up using Variation 3, which performed 79.3% better than the original landing page.
Here’s the lesson to learn: A single test that shows you failed doesn’t mean you should stop the process and stick to the good old design. You continue testing and figuring out why the variations don’t work until you find one that does.
Running Tests With Overlapping Traffic
It’s interesting to see how people think they can save time by running multiple A/B tests at the same time: on the home page, the features page, the subscription form… everywhere. Sure, you cover a huge load of work in a short period.
The biggest problem with A/B testing on overlapping traffic is uneven distribution. The traffic should always be split evenly between the versions you’re testing; otherwise, the overlap will skew the samples and lead to unrealistic results.
The solution? You can do multiple A/B tests at the same time, but you have to be very careful not to get tricked by overlapped traffic. The best thing would be sticking to simplicity and, if possible, doing one test at a time.
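One common way to keep the split even, and to keep concurrent tests from interfering with each other, is deterministic salted bucketing: hash the user ID together with the experiment name, so each experiment’s assignment is independent of every other one. This is a minimal sketch, not any specific tool’s API; the function name and the choice of SHA-256 are illustrative:

```python
import hashlib

def assign_variation(user_id: str, experiment: str, n_variations: int = 2) -> int:
    """Deterministically assign a user to a variation.

    Hashing (experiment name + user ID) keeps the split even and makes
    each experiment's assignment independent of every other one, so a
    user's bucket in the home-page test doesn't correlate with their
    bucket in the subscription-form test.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % n_variations

# The same user always lands in the same bucket for a given test:
assert assign_variation("user42", "homepage_cta") == assign_variation("user42", "homepage_cta")
```

Because the hash is effectively uniform, a few thousand visitors already split very close to 50/50 without any shared state between page loads.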
Not Running Tests Constantly
Let’s remind ourselves of the reason we’re testing: making informed decisions, improving the conversion rate, learning more about the visitors, improving their user experience, and getting more money from that website. That’s not something we stop doing. Ever!
Not running tests all the time is a big mistake. The online world is constantly evolving. If you don’t keep up with the trends and stop introducing improvements, you’ll eventually lose your audience to the competition. You don’t want that to happen? Never stop testing and optimising!
Doing A/B Testing Without Proper Traffic Or Conversions To Test
When you don’t have the traffic or conversions needed for proper A/B testing, you won’t get accurate results. Even if version B is better than the control version, the A/B test won’t show it. It would take many months of testing to reach statistical significance, and months of running the same test means resources spent, or rather wasted.
When the sample size is big enough to reach statistical significance, you can start doing A/B testing. Until then, save your time and money for improving the promotional campaign and attracting more visits.
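You can run the numbers yourself to see why low traffic hides real improvements. This is a minimal sketch using a standard two-proportion z-test; all the figures are made up for illustration:

```python
import math

def z_score(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-statistic for conversions out of n visitors."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# A genuine 3.0% -> 3.6% improvement is invisible at 500 visitors per arm
# (|z| stays well below the 1.96 threshold for 5% significance)...
small = z_score(15, 500, 18, 500)

# ...but the very same improvement is clearly significant at 15,000
# visitors per arm (|z| exceeds 1.96):
large = z_score(450, 15000, 540, 15000)
```

Same underlying effect, opposite verdicts; the only difference is the sample size.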
Changing Parameters In The Middle Of Testing
When you’re about to launch a test, you have to be committed to it. The experiment settings, the control and variation design, and the test goals stay the same. You shouldn’t change any parameter, including traffic allocations to variations.
Have you heard of Simpson’s Paradox? A trend that appears in several groups of data may disappear, or even reverse, when those groups are combined. That can happen if you change the traffic split between variations in the middle of testing: the change only affects new users, which disturbs the sampling of returning visitors.
Before you start with A/B testing, have a plan. Don’t change the parameters because that practice will lead to confusing, unrealistic results.
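Simpson’s Paradox is easiest to see with numbers. The figures below are the classic textbook ones, not real test data: the variation wins inside each traffic segment, yet loses once the segments are pooled, purely because the two arms ended up with very different segment mixes:

```python
def rate(conversions, visitors):
    """Conversion rate as a fraction."""
    return conversions / visitors

# Illustrative segment data: (control_conv, control_n, variation_conv, variation_n)
new_users       = (234, 270, 81, 87)     # control 86.7% vs variation 93.1%
returning_users = (55, 80, 192, 263)     # control 68.8% vs variation 73.0%

# The variation beats the control in BOTH segments...
for c_conv, c_n, v_conv, v_n in (new_users, returning_users):
    assert rate(v_conv, v_n) > rate(c_conv, c_n)

# ...yet the control wins once the segments are pooled, because the
# traffic was distributed across segments very differently per arm:
control_rate   = rate(234 + 55, 270 + 80)    # 289/350 = 82.6%
variation_rate = rate(81 + 192, 87 + 263)    # 273/350 = 78.0%
assert control_rate > variation_rate
```

A mid-test change to the traffic allocation can create exactly this kind of lopsided segment mix, which is why the allocation should be fixed before launch.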
Running Tests With Too Many Variations
You might get the impression that the more variations you test, the more insights you gain. However, too many variations slow things down. Plus, they increase your chances of getting a false positive and ending up with a variation that isn’t the real winner.
Optimizely, VWO and other A/B testing tools take the multiple comparison problem into consideration and adjust the false positive rate to make sure you’re getting realistic results. However, that still doesn’t mean that testing as many variations as possible is the right thing to do. This practice usually results in a difference that has no statistical significance.
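The multiple comparison problem is simple arithmetic: at a 5% significance level per comparison, the family-wise false positive rate compounds with every extra variation. A quick sketch of the effect, with the Bonferroni correction shown as one standard (if conservative) remedy; the tools above use more sophisticated adjustments:

```python
# Chance of at least one false positive across k independent comparisons,
# each run at a 5% significance level:
alpha = 0.05
for k in (1, 3, 5, 10):
    family_error = 1 - (1 - alpha) ** k
    print(f"{k:2d} comparisons -> {family_error:.0%} chance of a false winner")
    # e.g. 10 comparisons inflate the error rate to roughly 40%

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison threshold that keeps the family-wise error near alpha."""
    return alpha / n_comparisons
```

With ten variations, a “winner” at the naive 5% threshold is close to a coin flip, which is why disciplined tools tighten the per-comparison threshold instead.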
Not Considering Validity Threats
So you run the test and get the data you need. Now all you have to do is pick the version that achieved better results, right? Wrong! Even if the sample size was decent, you ran the test for a sufficient period, and you didn’t have too many variations, there’s still a possibility of getting invalid results.
You have to make sure that every single metric and goal you need to track is being tracked. If you notice that you’re not getting data from some of the metrics, you’ll have to find out what’s wrong, fix the issue and start the test all over again.
Bottom line? Test, test, and test some more! However, those tests have to be effective and adequate. Many tools make testing easy, but that doesn’t mean they can’t lead you to wrong decisions. Start paying attention to the common mistakes described above. Are you making some of them? You’ll improve the way you test as soon as you eliminate these flaws.
[This post first appeared on Adpushup and has been reproduced with permission.]