Randomness and causality
All my life I have been a late developer, and it’s never really bothered me. I was close to being the shortest kid in my school year until I reached 16, and then suddenly grew to be one of the tallest. I founded ClickTracks long after other web analytics products seemed to have made the market space impenetrable, and yet ClickTracks established a firm position. While there is such a thing as first mover advantage, more often than people give credit for, you’re actually better off examining the successes and failures of others and learning before you move.
I therefore hope a recent discovery of mine will be seen as late mover advantage, and not as a sign that I am hopelessly behind the times. I was browsing a bookshop and picked up, at random, a book named ‘Fooled By Randomness’ by Nassim Nicholas Taleb. I have been enraptured by the insight, eloquence and sheer intellect of this author. If you’re reading this article then I assume you need to make rational, data-driven choices, in which case you simply must read this book now. If you’ve already read it, forgive me.
In common with other experts, I have become adept at listening to an opinion or fact and repackaging it as my own in front of a slightly different audience. During my tenure at ClickTracks I worked very closely with Dr. Stephen Turner, who counseled me on probabilities, statistics and simple math, and who steered the architecture and algorithms of web analytics so that results would be correct and easy to understand. Dr. Turner also patiently counseled me over such important questions as statistical significance, causality and data overload. Many will know that I also drank liberally from the well of Edward Tufte, but more on that later.
‘Fooled By Randomness’ has proved an Eldorado* of fundamental thinking that is so profound it could change how you invest your 401k. At a minimum it has important implications for how you apply the discipline of web analytics. In this and following articles I will borrow from Mr. Taleb’s excellent book, and promote the ideas as my own, though of course you should buy the book anyway.
Causality and exit pages
Many web analytics packages provide reports that list exit pages (the last page seen by a visitor before the session ends). Novice web analysts agonize over this data, seeing the sales that would have happened if only the visitor had not exited FROM THIS PARTICULAR PAGE. During the design of the ClickTracks tools, I wanted this report to be provided, since other web analytics tools offered it. Dr. Turner, however, convinced me that the report is useless because there is no causality between exiting and the page you exit from (except during the checkout, where of course the report is still useless because you know exactly which pages those are in advance). The exit pages report is therefore disabled in ClickTracks by default (a shameful compromise on my part: some customer probably insisted on it being available or no sale, but at least you have to dig around to find out how to switch it on). The concept of causality is summarized beautifully by Mr. Taleb on page 214: Suppose babies born at hospital A are 52% boys, and those at B 48%, in the same year. If your baby boy were born at hospital A you surely would not claim that hospital A caused this.
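To see how easily chance alone produces a 52/48 split, here is a minimal sketch in Python. It rests on assumptions of my own, not Mr. Taleb's: each hospital delivers 500 babies a year, and every birth is a fair coin flip. The function name and all the numbers are purely illustrative.

```python
import random

def share_of_boys(births, rng, p_boy=0.5):
    """Return the fraction of boys among `births` simulated births."""
    boys = sum(1 for _ in range(births) if rng.random() < p_boy)
    return boys / births

# Assumed figures: 500 births per hospital-year, 1000 simulated hospital-years.
rng = random.Random(42)
shares = [share_of_boys(500, rng) for _ in range(1000)]

# Fraction of hospital-years at least as lopsided as 52% or 48% boys,
# even though every simulated hospital is identical.
lopsided = sum(1 for s in shares if s >= 0.52 or s <= 0.48) / len(shares)
print(f"Share of hospital-years at or beyond 52/48: {lopsided:.0%}")
```

Under these assumptions a sizeable share of identical hospitals land at or beyond 52/48 purely by luck, which is why the hospital deserves neither credit nor blame.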
The same problem arises in web analytics. Just because people left the site at page X doesn’t mean that page X caused them to exit. The site is complex, as are visitors’ emotions. They build a picture of your company and product over many, many pages. Gradually (or rapidly) that picture becomes something they don’t want, and they exit. It was not the last page that made them exit, but the entire experience leading up to and including it. I am of course assuming you don’t have nasty surprises for your visitors hidden on certain pages. Thus the exit pages report is not just useless, it is potentially TOXIC INFORMATION (to borrow from Mr. Taleb again). The top exit pages are likely to be the same as the top pages. People exit in high numbers from a certain page because the page is popular, and people have to exit somewhere. The page itself does not act as a visitor repellent; it’s the entire site or product or group of pages or experience. If you’re concerned about exits, you need to look at what makes people stay, because that’s where the causality lies. More on this later.
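The claim that the top exit pages mirror the top pages can be illustrated with a small simulation. This is only a sketch under assumptions I have invented for the purpose: five hypothetical pages with made-up popularity weights, and a visitor who leaves with the same 30% probability after every page view, no matter which page it is. No page is a "repellent" here by construction.

```python
import random
from collections import Counter

def simulate_sessions(page_weights, sessions, exit_prob, rng):
    """Simulate visits where the chance of leaving is identical on every page."""
    pages = list(page_weights)
    weights = list(page_weights.values())
    views, exits = Counter(), Counter()
    for _ in range(sessions):
        while True:
            # Pick the next page according to popularity alone.
            page = rng.choices(pages, weights=weights)[0]
            views[page] += 1
            # The visitor leaves with the same probability on any page.
            if rng.random() < exit_prob:
                exits[page] += 1
                break
    return views, exits

# Hypothetical site: the page names and weights are invented for illustration.
page_weights = {"home": 50, "products": 25, "pricing": 15, "about": 7, "contact": 3}
rng = random.Random(1)
views, exits = simulate_sessions(page_weights, sessions=20000, exit_prob=0.3, rng=rng)

top_viewed = [page for page, _ in views.most_common()]
top_exits = [page for page, _ in exits.most_common()]
print("Top pages by views:", top_viewed)
print("Top pages by exits:", top_exits)
for page in top_viewed:
    print(f"{page}: exit rate {exits[page] / views[page]:.2f}")
```

In this toy model the exit ranking simply follows the popularity ranking, while the per-page exit rate stays close to 30% everywhere. A raw exit pages report would single out the home page as the worst offender, even though nothing about it drives anyone away.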
*Forgive the historic references. When I read the book I was on vacation in Granada, southern Spain – the exact place where Isabella and Ferdinand signed the contract with Columbus to chart the western route to the Indies.