<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>showmeanalytics.com &#187; Analytics</title>
	<atom:link href="http://showmeanalytics.com/category/analytics/feed/" rel="self" type="application/rss+xml" />
	<link>http://showmeanalytics.com</link>
	<description>Analytics from the Show Me State</description>
	<lastBuildDate>Wed, 01 Sep 2010 11:42:57 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>You say I&#8217;m engaged, I say you&#8217;re wasting my time</title>
		<link>http://showmeanalytics.com/2010/02/you-say-im-engaged-i-say-youre-wasting-my-time/</link>
		<comments>http://showmeanalytics.com/2010/02/you-say-im-engaged-i-say-youre-wasting-my-time/#comments</comments>
		<pubDate>Fri, 05 Feb 2010 04:36:02 +0000</pubDate>
		<dc:creator>angie</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[engagement]]></category>

		<guid isPermaLink="false">http://showmeanalytics.com/?p=116</guid>
		<description><![CDATA[For content sites, web analysts often look at engagement-related metrics to try to assess whether or not visitors are having a successful visit. After all, there is no transaction like a purchase, to tell us that something &#8220;good&#8221; happened, if not for our visitor then at least for our business. We may look at metrics [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://showmeanalytics.com/wp-content/uploads/2010/02/ring_small.jpg"><img class="alignright size-full wp-image-120" title="ring" src="http://showmeanalytics.com/wp-content/uploads/2010/02/ring_small.jpg" alt="ring" width="250" height="243" /></a>For content sites, web analysts often look at engagement-related metrics to try to assess whether or not visitors are having a successful visit. After all, there is no transaction like a purchase, to tell us that something &#8220;good&#8221; happened, if not for our visitor then at least for our business. We may look at metrics like time on site, content pages viewed per visit, the ratio of navigation page views to content page views, and micro-conversions like viewing a print-ready version of an article, or emailing a link to a colleague.</p>
<p>I&#8217;ve always suspected there is a fine line between engaging people and wasting their time, especially when you&#8217;re dealing with B2B sites. After all, when I&#8217;m looking for information as part of my workday, I don&#8217;t want to spend a lot of time on a site. I don&#8217;t want to view a lot of pages: I want to find the answer to my question right away. Even micro-conversions don&#8217;t necessarily mean that my visit was successful: maybe I&#8217;m printing out pages or emailing myself a link because I don&#8217;t have time to wade through the confusion right now.</p>
<p>I was recently doing an analysis for a site, and was curious about the &#8220;engagement&#8221; level of two important customer segments. I looked at time on site, content pages per visit, bounce rate, navigation to content ratio (a ratio of 2 means that, on average, visitors view 2 navigation pages for every content page viewed), % visits that contained email to friend actions, and % visits that contained at least one print-ready view. I couldn&#8217;t factor in other potential engagement/loyalty metrics (visit frequency, etc.) because of the large number of shared computers and accounts for this particular site, which is heavily used in a workplace setting. Here&#8217;s what I found.</p>
<ul>
<li>Segment B had half the bounce rate of Segment A (although both were pretty low).</li>
<li>Segment B spent 30% more time on the site.</li>
<li>Segment B viewed 30% more content.</li>
<li>Segment B viewed print-ready pages in 4% more visits, close enough that I&#8217;d consider their usage of this function to be roughly equivalent.</li>
<li>Segment B needed to go through 8% fewer navigational pages in order to find content.</li>
<li>Usage of the email function was similar for both segments, just slightly higher for Segment A.</li>
</ul>
<p>By all measures except the two conversions, I would have considered Segment B to be a good bit more &#8220;engaged&#8221; than Segment A. Even on the print and email conversions they were roughly the same. But my satisfaction surveys told a different story. I used the responses to our &#8220;Were you able to find the information you were looking for?&#8221; question to double-check overall satisfaction scores. What did I find?</p>
<ul>
<li>Segment B scored 2 points <em>lower </em>than Segment A on overall satisfaction (on a 0-100 scale).</li>
<li>Segment B respondents were 7% less likely to answer &#8220;Yes&#8221; to the question about whether they found what they were looking for.</li>
</ul>
<p>I don&#8217;t know if the above differences are statistically significant, but what I absolutely do know is that higher performance on engagement-related metrics did not mean that Segment B customers were happier or more satisfied with the site. If anything, it&#8217;s just the opposite.</p>
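<p>One common way to check whether a gap in &#8220;Yes&#8221; rates between two segments is statistically significant is a two-proportion z-test. The sketch below is only an illustration: the function name is hypothetical, the post does not report respondent counts, and any counts you pass in must come from your own survey data.</p>

```python
import math

def two_proportion_z_test(yes_a, n_a, yes_b, n_b):
    """Return (z, two-sided p-value) for the difference between two 'Yes' rates.

    yes_a / n_a and yes_b / n_b are respondent counts for each segment.
    These inputs are assumptions: the post does not publish sample sizes.
    """
    p_a = yes_a / n_a
    p_b = yes_b / n_b
    # Pooled rate under the null hypothesis that both segments answer alike.
    pooled = (yes_a + yes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

<p>A small p-value (0.05 is a common cutoff) would suggest the difference is unlikely to be chance alone. The test assumes independent respondents and reasonably large samples.</p>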
<p>The best thing you can do for a content site (and other sites, too, IMHO) is to install continuous surveys. That&#8217;s the only way you will ever be able to assess the quality of your visitors&#8217; experience online. If you can&#8217;t afford one of the for-pay surveys that can be well-customized (from <a href="http://www.foreseeresults.com/">ForeSee Results</a> or <a href="http://www.iperceptions.com/">iPerceptions</a>), there are still a number of customer satisfaction tools, like <a href="http://www.4qsurvey.com/">4Q </a>and <a href="http://www.kampyle.com/">Kampyle</a>, that allow you to ask your customers straight out whether or not their visit was successful.</p>
]]></content:encoded>
			<wfw:commentRss>http://showmeanalytics.com/2010/02/you-say-im-engaged-i-say-youre-wasting-my-time/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>One visit, two user agents</title>
		<link>http://showmeanalytics.com/2009/07/one-visit-two-user-agents/</link>
		<comments>http://showmeanalytics.com/2009/07/one-visit-two-user-agents/#comments</comments>
		<pubDate>Tue, 14 Jul 2009 12:50:58 +0000</pubDate>
		<dc:creator>angie</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[browsers]]></category>
		<category><![CDATA[Logfiles]]></category>
		<category><![CDATA[visits]]></category>

		<guid isPermaLink="false">http://showmeanalytics.com/?p=110</guid>
		<description><![CDATA[I found out recently that visitors using Internet Explorer 8 on a site that is not compatible with that browser, can exhibit multiple user agent strings during one visit. This is because of a compatibility view provided in IE8 that makes it look and act mostly (but not exactly) like IE7, for sites that don’t [...]]]></description>
			<content:encoded><![CDATA[<p>I found out recently that visitors using Internet Explorer 8 on a site that is not compatible with that browser, can exhibit multiple user agent strings during one visit. This is because of a <a href="http://blogs.msdn.com/ie/archive/2008/08/27/introducing-compatibility-view.aspx">compatibility view</a> provided in IE8 that makes it look and act mostly (<a href="http://blogs.msdn.com/ie/archive/2009/03/12/site-compatibility-and-ie8.aspx">but not exactly</a>) like IE7, for sites that don’t play nicely with the newer browser.  If you are trying to provide a proper browser breakdown in support of a site redesign, or if you are troubleshooting browser-related data or user problems, the compatibility view will complicate things.</p>
<p>I assume that most web analytics tools identify the IE version by looking for <em>MSIE X.Y</em> in the browser string. However, this is no longer valid for IE8. This is because the IE8 user agent string will include <em>MSIE 7.0</em> when in compatibility mode. The difference between the “real” IE7, and IE8 in compatibility mode is the word <em>Trident</em>, which is included in both variants of IE8:</p>
<p><em>Example of a regular IE8 user agent: </em>Mozilla/4.0 (compatible; <strong>MSIE 8.0</strong>; Windows NT 6.0; <strong>Trident</strong>/4.0; SLCC1; Media Center PC 5.0; .NET CLR 3.5.21022)</p>
<p><em>Example of IE8 in compatibility mode:</em> Mozilla/4.0 (compatible; <strong>MSIE 7.0</strong>; Windows NT 6.0; <strong>Trident</strong>/4.0; SLCC1; Media Center PC 5.0; .NET CLR 3.5.21022)</p>
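<p>The rule implied by the two example strings can be sketched in a few lines of Python. This is a minimal illustration, not how any particular analytics tool works: the function name and labels are hypothetical, and the check for <em>Trident/4.0</em> is taken directly from the examples above.</p>

```python
import re

# Matches the version token, e.g. "MSIE 7.0" or "MSIE 8.0".
MSIE_TOKEN = re.compile(r"MSIE (\d+)\.\d+")

def classify_ie(user_agent):
    """Return a browser label for a user agent string (hypothetical helper)."""
    match = MSIE_TOKEN.search(user_agent)
    if match is None:
        return "not IE"
    reported = int(match.group(1))
    # Both IE8 example strings carry Trident/4.0; only the MSIE token differs.
    if "Trident/4.0" in user_agent:
        if reported == 7:
            return "IE8 (compatibility view)"
        return "IE8"
    return "IE" + str(reported)
```

<p>A string with <em>MSIE 7.0</em> and no <em>Trident</em> token falls through to the last line and is labeled IE7.</p>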
<p>Literally thousands of web sites are not compatible with IE8. A list of <a href="http://www.microsoft.com/downloads/thankyou.aspx?familyId=b885e621-91b7-432d-8175-a745b87d2588&amp;displayLang=en">more than 3,000 incompatible sites</a> is maintained by Microsoft.  This list can be downloaded by IE8 users so that the browser can automatically switch itself into compatibility view when a site is encountered that has previously been identified by IE8 users as incompatible. Many more sites are not compatible, but are not on the list because they have lower traffic levels.</p>
<p>Because a visitor can have multiple user agents in one visit, this raises a number of questions:</p>
<ul>
<li>Does your analytics tool keep the user agent string from each individual page view, or does it associate one browser with the entire visit?</li>
<li>If browser is associated with the entire visit, which browser is recorded? If the tool keeps the string from the entry page, then IE8 is likely represented correctly in your data, but you won’t know if users are resorting to compatibility mode in order to view your site. If your analytics tool keeps the last browser string encountered in the visit, then your numbers are likely biased toward IE7 unless your tool is properly grouping this traffic as IE8.</li>
<li>If browser is associated with page views instead of the visit, then adding up visits in your browser report would give you more than the total visits for your site. In other words, browser visits would not be “summable” the way they are when one can assume that each visit has only one browser. This is not the end of the world, just something to be aware of because it’s not intuitive.</li>
<li>Does your analytics tool properly group the browsers with both <em>MSIE 7.0</em> and <em>Trident</em> as IE8? If not, does it expose the entire string so you can do the calculations yourself to see if your site has IE8 issues?</li>
<li>If you are doing logfile analysis without cookies, sessionization is probably based on IP + User Agent. For sites where I’ve transitioned from logfiles to tags in the same tool, my experience has been that IP/User Agent sessionization tends to over-count visits: this issue will increase that inflation even more. Bear in mind that many tag-based tools resort to IP/UA when cookies are blocked, so there could be a small inflation effect regardless of the type of data-collection you use.</li>
</ul>
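<p>Two of the points above, the non-summable browser report and the IP/User Agent over-count, can be seen in a toy example. Everything here is made up for illustration: the page views, the shortened user agent strings, and the simplifying assumption that each cookie (or each IP/UA pair) corresponds to exactly one visit.</p>

```python
# Hypothetical page-view log: (ip, user_agent, cookie_id).
page_views = [
    ("10.0.0.5", "MSIE 8.0; Trident/4.0", "c1"),
    ("10.0.0.5", "MSIE 7.0; Trident/4.0", "c1"),  # same visit, now in compatibility view
    ("10.0.0.9", "MSIE 7.0", "c2"),
]

def count_visits_by_cookie(views):
    """Cookie-based sessionization: one visit per distinct cookie."""
    return len({cookie for _ip, _ua, cookie in views})

def count_visits_by_ip_ua(views):
    """Cookieless sessionization: one visit per distinct IP + User Agent pair."""
    return len({(ip, ua) for ip, ua, _cookie in views})

def visits_per_browser(views):
    """Browser report where a visit is credited to every user agent it showed."""
    seen = {}
    for _ip, ua, cookie in views:
        seen.setdefault(ua, set()).add(cookie)
    return {ua: len(cookies) for ua, cookies in seen.items()}
```

<p>With this sample, cookies give 2 visits, IP/UA gives 3, and the per-browser rows also add up to 3, because the first visit is credited to two user agents.</p>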
<p>I examined a few of my sites and found the percentage of visits with IE8 to be roughly between 5% and 15%, depending on the site. My B2B sites tend to have lower IE8 penetration, while sites that attract high-tech users will tend to show a higher percentage of the latest browsers.</p>
<p>If your web analytics tool exposes the entire browser string (Google Analytics does not), I recommend you search through your user agent strings looking for <em>Trident</em>, and see for yourself if this is an issue for the sites you analyze. One metric I’m looking at is the percentage of my <em>Trident</em> browser visits that also contain <em>MSIE 7</em>, assuming that sites that are not compatible with IE8 will show a higher percentage of users resorting to compatibility mode. For a site with known IE8 issues I calculated 25%, while another site I chose at random came out at 12%. I haven’t examined enough sites yet to know if that means the second site also has IE8 issues, or if it just means it&#8217;s &#8220;normal&#8221; for a certain percentage of IE8 users to surf in compatibility mode. Clearly I have more work to do.</p>
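<p>The metric described above (the share of <em>Trident</em> visits that also show <em>MSIE 7</em>) could be computed along these lines. This is a sketch under assumptions: the function name is hypothetical, and the input is assumed to be a mapping from a visit id to the list of user agent strings seen during that visit, which your tool may or may not be able to export.</p>

```python
def compat_view_share(visits):
    """Share of Trident visits that show MSIE 7 at least once.

    visits maps a visit id to the user agent strings seen in that visit
    (an assumed data layout, not a specific tool's export format).
    """
    trident_visits = 0
    compat_visits = 0
    for agents in visits.values():
        if any("Trident" in ua for ua in agents):
            trident_visits += 1
            if any("Trident" in ua and "MSIE 7" in ua for ua in agents):
                compat_visits += 1
    if trident_visits == 0:
        return 0.0
    return compat_visits / trident_visits
```

<p>Visits without any <em>Trident</em> string, such as real IE7 or other browsers, are left out of both the numerator and the denominator.</p>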
<p><strong>Update</strong>: Last night I received an email from a colleague who had read this post, asking why they should care. It&#8217;s a fair question so I thought I&#8217;d answer it publicly.</p>
<p>First, if you&#8217;re asking then you probably aren&#8217;t in a situation where you need to care. That&#8217;s OK: the lowly browser report isn&#8217;t the most important report in your web analytics tool, not by a long shot.</p>
<p>But I can think of a couple of situations where it&#8217;s important:</p>
<p>1. When deciding whether or not to fund development changes to enable compatibility with certain browsers, &#8220;fewer than 5% of our visits use that browser&#8221; is a lot different than &#8220;nearly 10% of our visits use that browser&#8221;.  The numbers you use for those decisions should be as accurate as practical.</p>
<p>2. Your customer service department may receive emails or phone calls from visitors complaining that they are unable to perform certain tasks on your site (like complete a transaction). When they receive multiple complaints that sound similar but are unable to reproduce the problem in house they may ask you, the analytics ninja, for help defining the scope of the problem. These intermittent issues are difficult to troubleshoot because they&#8217;re often environment-related. One starting point is to examine the user experience through that transaction &#8212; transaction page views per visit is sometimes sufficient, or you may want to look at a funnel chart for the process &#8212; and segment it by different browser versions. If the issue is due to a browser incompatibility, you can sometimes pinpoint it quickly with this type of analysis.</p>
]]></content:encoded>
			<wfw:commentRss>http://showmeanalytics.com/2009/07/one-visit-two-user-agents/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Perverts Make My Job Interesting</title>
		<link>http://showmeanalytics.com/2009/07/perverts-make-my-job-interesting/</link>
		<comments>http://showmeanalytics.com/2009/07/perverts-make-my-job-interesting/#comments</comments>
		<pubDate>Sun, 05 Jul 2009 21:57:58 +0000</pubDate>
		<dc:creator>angie</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Logfiles]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[search keywords]]></category>

		<guid isPermaLink="false">http://showmeanalytics.com/?p=103</guid>
		<description><![CDATA[If you are a web analyst, and you have ever had to Google “zoo porn” as part of your job, you would understand why I loathe the idea of targeted advertising based on user searches. The terms I’ve searched as part of my job have gotten me on the net-nanny list of every employer I’ve [...]]]></description>
			<content:encoded><![CDATA[<p>If you are a web analyst, and you have ever had to Google “zoo porn” as part of your job, you would understand why I loathe the idea of targeted advertising based on user searches. The terms I’ve searched as part of my job have gotten me on the net-nanny list of every employer I’ve had since working in this field. It’s the perverts: they really affect my data.</p>
<div id="attachment_106" class="wp-caption aligncenter" style="width: 451px"><a href="http://showmeanalytics.com/wp-content/uploads/2009/07/fark.jpg"><img class="size-full wp-image-106" title="Screenshot: www.fark.com" src="http://showmeanalytics.com/wp-content/uploads/2009/07/fark.jpg" alt="Screenshot: www.fark.com" width="441" height="66" /></a><p class="wp-caption-text">If Fark is to be believed, the Internet is all about porn anyway.</p></div>
<p>For the record, I don’t analyze porn sites for a living. While I admit I have done analysis for at least one adult-oriented site in the past, this is different. This is the effect of sexually-oriented search terms on websites that have little or nothing to do with sex, websites that I would happily show to my mother. But if you analyze a wide enough variety of sites, you will find that fetishes come in a surprising variety of shapes and sizes, and you’ll be surprised where they, um, pop up.</p>
<p>There are three ways that these “thrill-seekers” may affect your data.</p>
<p style="padding-left: 30px;">1. <strong>By causing a one-time traffic spike</strong>. This is more likely to happen for a blog or a news site, when an article mentions something sexual in a fairly innocuous way. For example, this article contains plenty of keywords that may attract traffic that is not part of my target audience (and if you haven’t bounced by now, welcome to the world of web analytics!). This can happen on news or magazine sites that run features on a variety of subjects, and it can often catch the web analyst off guard. For example, consider the more-or-less legitimate &#8212; if somewhat sensational &#8212; news articles that were all the rage a couple months ago, talking about teens sending naked pictures of themselves to each other on cell phones. When you mention “teens” and “sex” and “naked pictures”  in the same article, you’re bound to attract some of <em>that</em> kind of traffic.</p>
<p style="padding-left: 30px;">This usually only becomes an issue when the traffic spike for a single article is large enough to influence aggregate numbers for the entire week or month. Any sudden spike (or dip) in traffic should always be investigated: it may have been due to a simple editorial choice instead of that awesome marketing campaign that your HiPPO designed.</p>
<p style="padding-left: 30px;">2. <strong>By inflating search engine visits long-term</strong>. Perhaps “inflating” isn’t the best term, since the traffic is real, it’s human, and it’s coming from search engines. This situation happens when there are articles or images on your site that are intended for one audience but end up attracting another audience – the kind that’s not likely to become a customer – and it can wreak havoc with your conversion rates. A prime example is a site that publishes medical information intended for a professional medical audience. A thorough enough site will likely contain pictures of certain body parts or descriptions of rare medical procedures, and a glance through some of your top search terms can yield insights into the human psyche that you wish you didn’t know.</p>
<p style="padding-left: 30px;">Always look past the “Top X” keyword report that is spit out of your web analytics package by default. Look for terms that seem over-represented on a site like yours. Pay careful attention to image searches, and ensure that you can separate image search keywords from text search keywords if necessary.</p>
<p style="padding-left: 30px;">3. <strong>By logging visits that never really happened</strong>. This is fairly rare, and you will likely only catch it if a) your analytics are based on server logs instead of JavaScript tags, and b) your site contains one or more unprotected redirect URLs, “pages” that contain a URL as a value in the query string. The symptom is a sudden appearance in your keyword reports of sexually-oriented phrases that have absolutely nothing to do with your site. The cause is a search engine ranking hack, where a site-of-ill-repute manages to get itself indexed by means of your redirect URLs, using your site’s good reputation to increase its rankings. You can confirm by looking at the entry pages for the offending terms to see if they are the redirect pages.</p>
<p>As with any traffic that is obviously unqualified, you very likely want to segment out the perverts from some of your conversion rate calculations, especially if you are doing optimization efforts on one or more areas of your site. Unqualified traffic volume can be more than enough to skew results and mask changes to real customer behavior. However, I don’t recommend you filter this traffic from your entire data set. If your linking, advertising, or SEO efforts are bringing in the wrong kind of traffic, this is something you really need to know.</p>
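<p>The confirmation step in point 3, checking whether the entry pages for the offending terms are redirect pages, can be roughly automated. The sketch below flags any entry URL whose query string carries another absolute URL as a value. The function name is hypothetical, and the rule is a heuristic: it will also flag legitimate pages that happen to pass a URL as a parameter.</p>

```python
from urllib.parse import urlsplit, parse_qsl

def looks_like_redirect_page(entry_url):
    """True if any query-string value is itself an absolute http(s) URL."""
    query = urlsplit(entry_url).query
    # parse_qsl also decodes percent-encoded values such as http%3A%2F%2F...
    for _name, value in parse_qsl(query):
        if value.lower().startswith(("http://", "https://")):
            return True
    return False
```

<p>Run it over the entry pages attached to the suspicious keywords. If most of them come back true, the redirect hack is a likely explanation.</p>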
]]></content:encoded>
			<wfw:commentRss>http://showmeanalytics.com/2009/07/perverts-make-my-job-interesting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Estimating the effects of cookie-deletion</title>
		<link>http://showmeanalytics.com/2009/04/calculating-the-effects-of-cookie-deletion/</link>
		<comments>http://showmeanalytics.com/2009/04/calculating-the-effects-of-cookie-deletion/#comments</comments>
		<pubDate>Fri, 10 Apr 2009 11:59:43 +0000</pubDate>
		<dc:creator>angie</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[cookie-deletion]]></category>
		<category><![CDATA[cookies]]></category>
		<category><![CDATA[unique visitors]]></category>

		<guid isPermaLink="false">http://showmeanalytics.com/?p=75</guid>
		<description><![CDATA[There are differing opinions on how to label the metric historically known as &#8220;Unique Visitors&#8221;. On one side of the fence are those who think it should be relabeled &#8220;Unique Cookies&#8221;, since that is the most popular method used for calculations. On the other side of the fence are others who think the metric is [...]]]></description>
			<content:encoded><![CDATA[<p>There are differing opinions on how to label the metric historically known as &#8220;Unique Visitors&#8221;. On one side of the fence are those who think it should be relabeled &#8220;Unique Cookies&#8221;, since that is the most popular method used for calculations. On the other side of the fence are others who think the metric is a catch-all for the &#8220;best available&#8221; measurement (authenticated visitors, cookies if those aren&#8217;t available, IP/UA combination if neither is available) and should be replaced with a different term if/when a better, standardized way to measure people comes along. What we all agree on, though, is that a Unique Visitor metric measured with cookies is terribly inaccurate.</p>
<p>How bad is it? As usual, it depends. A site where people tend to visit on a daily basis will see more inflation from cookie-deletion than a site that is only visited once per month. In the former example, one person may count toward the monthly total as many as 30 or so times, while in the latter, even a frequent cookie-deleter would only count once.</p>
<p>Let&#8217;s pretend that we know something about the actual people visiting a site, and see if we can determine by how much our web analytics numbers might be affected by cookie-deletion.</p>
<p>In order to make the calculations easy, I have assumed people only visit or delete their cookies on daily, weekly, or monthly boundaries, and I am only considering a one month time frame. However, the same logic could be applied to more granular data. It ultimately boils down to a matrix algebra problem, but I doubt many of us are eager to get into that level of detail.</p>
<p><strong>An example</strong></p>
<p><a href="http://showmeanalytics.com/wp-content/uploads/2009/04/crowd1.jpg"><img src="http://showmeanalytics.com/wp-content/uploads/2009/04/crowd1.jpg" alt="crowd1" title="crowd1" width="200" height="150" class="alignright size-full wp-image-90" /></a>Consider 10,000 people:  not 10,000 &#8220;cookies&#8221; and not 10,000 &#8220;unique visitors,&#8221; but 10,000 real-life, carbon-based beings. Suppose we are able to observe these people in such a way that we know &#8220;the truth&#8221; about their online behavior. Suppose also we have observed that, on average, 10% of our people delete their cookies every day, 15% delete once per week, and the remainder delete their cookies monthly or less frequently.</p>
<p>We have also observed that 20% of these people visit our website every day, 30% visit once per week, and the remainder only visit once in a given month. There&#8217;s no reason for these people to login to our website &#8212; all visits are anonymous &#8212; and we count Unique Visitors using a cookie.</p>
<p><strong>How bad is it?</strong></p>
<p>Our first step is to find out how many different cookies each person will receive over the course of the month, based on the number of times they visit our site and how often they delete their cookies. For simplicity&#8217;s sake, we&#8217;ll assume each month has 30 days and 4 weeks.</p>
<p>Daily deleters will receive a different cookie each time they visit, so daily visitors will log 30 cookies and weekly visitors log 4.  Weekly deleters who visit every day will log 4 different cookies over the course of the month, as will weekly deleters who visit once per week. Everybody else&#8217;s activity will be logged with one cookie. We can summarize as shown below.</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="160" valign="top"><a name="OLE_LINK1">One month&#8217;s time&#8230;</a></td>
<td width="160" valign="top"><em>Daily   deleters</em></td>
<td width="160" valign="top"><em>Weekly   deleters</em></td>
<td width="160" valign="top"><em>Monthly   deleters</em></td>
</tr>
<tr>
<td width="160" valign="top"><em>Visit   every day </em></td>
<td width="160" valign="top">30 cookies</td>
<td width="160" valign="top">4 cookies</td>
<td width="160" valign="top">1 cookie</td>
</tr>
<tr>
<td width="160" valign="top"><em>Visit   once/week </em></td>
<td width="160" valign="top">4 cookies</td>
<td width="160" valign="top">4 cookies</td>
<td width="160" valign="top">1 cookie</td>
</tr>
<tr>
<td width="160" valign="top"><em>Visit   once/month</em></td>
<td width="160" valign="top">1 cookie</td>
<td width="160" valign="top">1 cookie</td>
<td width="160" valign="top">1 cookie</td>
</tr>
</tbody>
</table>
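<p>Every cell in the table above equals the smaller of two numbers: how many times the person visits in the month and how many times they delete their cookies. That observation is a reading of the table under the simplifications stated earlier (30 days, 4 weeks, activity only on period boundaries). The names in this sketch are hypothetical.</p>

```python
# Periods per month under the post's simplification: 30 days, 4 weeks.
PERIODS_PER_MONTH = {"daily": 30, "weekly": 4, "monthly": 1}

def cookies_per_person(visit_freq, delete_freq):
    """Cookies one person logs in a month for a given visit/deletion habit.

    A person cannot log more cookies than visits, and cannot log more
    cookies than deletion periods, so the count is the smaller of the two.
    """
    return min(PERIODS_PER_MONTH[visit_freq], PERIODS_PER_MONTH[delete_freq])
```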
<p>Using the above factors, we can determine the cookie-contribution from each set of people:</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="157" valign="top">(Monthly Calculations)</td>
<td width="134" valign="top">Delete daily (10%)</td>
<td width="134" valign="top">Delete weekly (15%)</td>
<td width="134" valign="top">Delete monthly (75%)</td>
<td width="79" valign="top"># Cookies</td>
</tr>
<tr>
<td width="157" valign="top">Visit daily (2000)</td>
<td width="134" valign="top">2000 x 10% x 30 = 6000</td>
<td width="134" valign="top">2000 x 15% x  4 =   1200</td>
<td width="134" valign="top">2000 x 75% x 1 = 1500</td>
<td width="79" valign="top">8700</td>
</tr>
<tr>
<td width="157" valign="top">Visit weekly (3000)</td>
<td width="134" valign="top">3000 x 10% x 4 = 1200</td>
<td width="134" valign="top">3000 x 15% x 4 = 1800</td>
<td width="134" valign="top">3000 x 75% x 1 = 2250</td>
<td width="79" valign="top">5250</td>
</tr>
<tr>
<td width="157" valign="top">Visit monthly (5000)</td>
<td width="134" valign="top">5000 x 10% x 1 = 500</td>
<td width="134" valign="top">5000 x 15% x 1 = 750</td>
<td width="134" valign="top">5000 x 75% x 1 = 3750</td>
<td width="79" valign="top">5000</td>
</tr>
<tr>
<td width="157" valign="top"></td>
<td width="134" valign="top">7700</td>
<td width="134" valign="top">3750</td>
<td width="134" valign="top">7500</td>
<td width="79" valign="top"><strong>18,950</strong></td>
</tr>
</tbody>
</table>
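<p>The calculations table can be reproduced with two nested loops. This is a sketch of the arithmetic only. The constant and function names are hypothetical, and it assumes, as the table does, that visiting frequency and deletion frequency are independent of each other.</p>

```python
PEOPLE = 10_000
VISIT_SHARE = {"daily": 0.20, "weekly": 0.30, "monthly": 0.50}
DELETE_SHARE = {"daily": 0.10, "weekly": 0.15, "monthly": 0.75}
PERIODS = {"daily": 30, "weekly": 4, "monthly": 1}

def total_cookies(people, visit_share, delete_share):
    """Sum the cookie contribution of every visit/deletion combination."""
    total = 0.0
    for visit_freq, v_share in visit_share.items():
        for delete_freq, d_share in delete_share.items():
            # Cookies per person: the smaller of visits and deletion periods.
            per_person = min(PERIODS[visit_freq], PERIODS[delete_freq])
            total += people * v_share * d_share * per_person
    return round(total)
```

<p>Calling <code>total_cookies(PEOPLE, VISIT_SHARE, DELETE_SHARE)</code> gives the table's grand total of 18,950.</p>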
<p>Wow. Our 10,000 people are being represented as 18,950 unique visitors: the Unique Visitors number is inflated by 90%!</p>
<p><strong>Visitor loyalty reports are affected, too</strong></p>
<p>Unique Visitors isn&#8217;t the only number that&#8217;s affected by cookie-deletion. Any visitor-based number is going to be off, which makes visitor loyalty hard to understand. You can tell when your efforts to improve loyalty are working, since the numbers will move in the right direction, but the <em>magnitude</em> of change will be misleading.</p>
<p>For the above example, we know that 20% of our people visited daily, 30% visited weekly, and 50% visited monthly, so the number of visits in a month works out to 77,000 (2000 x 30 + 3000 x 4 + 5000). Visits aren&#8217;t affected by cookie-deletion to any great extent (a good argument for visit-based analysis!), so our tool will also report 77,000 visits.</p>
<p>This means our <em>people</em> averaged 7.70 visits (77000/10000) over the course of the month, but our web analytics tool will only report 4.06 (77000/18950) visits per visitor because the visitors are inflated. Our average visits per visitor are under-reported by 47%!</p>
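<p>The same arithmetic as a short script, using only the figures from the example (variable names are just for illustration):</p>

```python
people = 10_000
cookies = 18_950                             # "unique visitors" the tool reports
visits = 2000 * 30 + 3000 * 4 + 5000 * 1     # daily + weekly + monthly visitors

true_avg = visits / people                   # visits per real person
reported_avg = visits / cookies              # visits per cookie-based visitor
under_report = 1 - reported_avg / true_avg   # fraction by which the average is low
```

<p>Note that <code>under_report</code> simplifies to 1 - people / cookies, so the visit count cancels out and the shortfall depends only on how inflated the visitor count is.</p>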
<p>If you prefer to view loyalty using a histogram (# visitors who visited once, twice, three times, etc.), then we need to determine to which bin each visitor&#8217;s cookies will be credited.</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="139" valign="top">(Monthly histogram)</td>
<td width="162" valign="top"><em>Delete daily</em></td>
<td width="162" valign="top"><em>Delete weekly</em></td>
<td width="162" valign="top"><em>Delete monthly</em></td>
</tr>
<tr>
<td width="139" valign="top"><em>Visit daily</em></td>
<td width="162" valign="top" bgcolor="#ffffcc">Each cookie is logged only once</td>
<td width="162" valign="top">Each cookie is seen 7 times in a month</td>
<td width="162" valign="top">Each cookie is seen 30 times (every day)</td>
</tr>
<tr>
<td width="139" valign="top"><em>Visit weekly</em></td>
<td width="162" valign="top" bgcolor="#ffffcc">Each cookie is logged only once</td>
<td width="162" valign="top" bgcolor="#ffffcc">Each cookie is logged only once</td>
<td width="162" valign="top">Each cookie is seen 4 times in a month</td>
</tr>
<tr>
<td width="139" valign="top"><em>Visit monthly</em></td>
<td width="162" valign="top" bgcolor="#ffffcc">Each cookie is logged only once</td>
<td width="162" valign="top" bgcolor="#ffffcc">Each cookie is logged only once</td>
<td width="162" valign="top" bgcolor="#ffffcc">Each cookie is logged only once</td>
</tr>
</tbody>
</table>
<p>The first thing that jumps out of this table is that <em>the majority of the cookies are only encountered once, regardless of how many times someone actually visited</em>. This explains why visitor loyalty graphs, regardless of the tool used, are often overloaded with so many one-time visitors.</p>
<div id="attachment_78" class="wp-caption aligncenter" style="width: 681px"><img class="size-full wp-image-78" title="visitor duration graphs" src="http://showmeanalytics.com/wp-content/uploads/2009/04/duration_graphs.jpg" alt="visitor duration graphs" width="671" height="289" /><p class="wp-caption-text">visitor duration graphs</p></div>
<p>Applying the frequencies in the histogram table to the numbers in the calculations table shows us how our visitor retention graph is affected by cookie-deletion.</p>
<div id="attachment_79" class="wp-caption aligncenter" style="width: 538px"><img class="size-full wp-image-79" title="frequency corrections" src="http://showmeanalytics.com/wp-content/uploads/2009/04/frequency_correction.jpg" alt="visitor frequency corrections" width="528" height="77" /><p class="wp-caption-text">visitor frequency corrections</p></div>
<p>Again, wow! While 20% of our people visited the site every day, with cookie-based visitor counting, only 8% appear in this super-loyal segment. The majority of the &#8220;visitors&#8221; that were added to the site via cookie-deletion appear in the 1 visit bin, inflating that number by almost a factor of 3.</p>
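<p>The bin counts behind those figures can be rebuilt from the two tables. The sketch below is a reconstruction from the tables, not from the chart itself. The names are hypothetical, and visits per cookie are rounded down, which matches the table&#8217;s 7 visits per cookie for daily visitors who delete weekly.</p>

```python
from collections import Counter

PEOPLE = 10_000
VISIT_SHARE = {"daily": 0.20, "weekly": 0.30, "monthly": 0.50}
DELETE_SHARE = {"daily": 0.10, "weekly": 0.15, "monthly": 0.75}
PERIODS = {"daily": 30, "weekly": 4, "monthly": 1}

def cookie_histogram():
    """Map 'visits seen per cookie' to the number of cookies in that bin."""
    bins = Counter()
    for visit_freq, v_share in VISIT_SHARE.items():
        for delete_freq, d_share in DELETE_SHARE.items():
            cookies_each = min(PERIODS[visit_freq], PERIODS[delete_freq])
            # Visits credited to each cookie, rounded down (30 // 4 gives 7).
            visits_per_cookie = PERIODS[visit_freq] // cookies_each
            group = round(PEOPLE * v_share * d_share * cookies_each)
            bins[visits_per_cookie] += group
    return dict(bins)
```

<p>The result has 14,000 cookies in the 1-visit bin against 5,000 real once-a-month people (a factor of 2.8), and 1,500 cookies in the 30-visit bin, which is about 8% of the 18,950 total.</p>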
<p><strong>Adding Authentication</strong></p>
<p>If 100% of the people visiting the above site authenticated, and the authenticated visitor identifier were used to count unique visitors, then the number would be pretty accurate (ignoring shared logins, etc.). But most sites don&#8217;t require authentication to see certain pages, so the likelihood of 100% authentication is low except in special cases, like intranets.</p>
<p>For our example above, what if half the visitors authenticated? Half of the 10,000 people would be more-or-less accurately represented, while our cookie-deletion calculations would apply to the remaining 5,000. The unique visitor multiplier factor decreases with increasing percentage of authenticated people.</p>
<div id="attachment_86" class="wp-caption aligncenter" style="width: 493px"><a href="http://showmeanalytics.com/wp-content/uploads/2009/04/uv_graph.jpg"><img class="size-full wp-image-86" title="uv_graph" src="http://showmeanalytics.com/wp-content/uploads/2009/04/uv_graph.jpg" alt="Assumed: 10/15/75 cookie deletion and 20/30/50 visiting frequency (daily/weekly/monthly)." width="483" height="291" /></a><p class="wp-caption-text">Assumed: 10/15/75 cookie deletion and 20/30/50 visiting frequency (daily/weekly/monthly).</p></div>
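<p>The relationship plotted above can be approximated with a simple blend: authenticated people count once, and everyone else is multiplied by the cookie-deletion factor. This sketch assumes that authenticated and anonymous people share the same visiting and deletion habits, and that the factor is the 1.895 from the earlier example. The function name is hypothetical.</p>

```python
def reported_unique_visitors(people, authenticated_share, cookie_multiplier=1.895):
    """Unique visitors a tool would report when some people authenticate.

    Authenticated people are counted once each; the rest are inflated
    by the cookie-deletion multiplier from the worked example.
    """
    authenticated = people * authenticated_share
    anonymous = people - authenticated
    return authenticated + anonymous * cookie_multiplier
```

<p>With half of the 10,000 people authenticated, this gives about 14,475 reported visitors, or a multiplier of roughly 1.45 instead of 1.9.</p>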
<p><strong>Is it always that bad?</strong></p>
<p>Not necessarily: it could be worse or it could be better. The above examples assumed that 20% of people visited the website every day, and 30% visited weekly. This was an arbitrary example meant to make calculations easier. In real life, you may have far fewer daily visitors (or more, it just depends on the site). Running the numbers assuming 5% daily and 50% weekly visitors, for example, results in a unique visitor inflation factor of about 1.5 instead of the 1.9 calculated in our example.</p>
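<p>For quick what-ifs without a spreadsheet, the whole model collapses to one small matrix product: visiting shares times the cookie-factor table times deletion shares. The sketch below uses the factor table from earlier in the post. The function name and list layout are hypothetical.</p>

```python
# Rows: visit daily / weekly / monthly. Columns: delete daily / weekly / monthly.
FACTORS = [
    [30, 4, 1],
    [4, 4, 1],
    [1, 1, 1],
]

def inflation_factor(visit_share, delete_share):
    """Reported unique visitors divided by real people.

    Both arguments are [daily, weekly, monthly] shares that sum to 1.
    """
    total = 0.0
    for i, v in enumerate(visit_share):
        for j, d in enumerate(delete_share):
            total += v * FACTORS[i][j] * d
    return total
```

<p>For example, <code>inflation_factor([0.20, 0.30, 0.50], [0.10, 0.15, 0.75])</code> is the original scenario, and <code>inflation_factor([0.05, 0.50, 0.45], [0.10, 0.15, 0.75])</code> is the 5% daily, 50% weekly variation.</p>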
<p>I&#8217;ve attached <a href='http://showmeanalytics.com/wp-content/uploads/2009/04/abb_cookie_deletion_200904.xlsx'>my spreadsheet</a> so you can run your own what-ifs.</p>
]]></content:encoded>
			<wfw:commentRss>http://showmeanalytics.com/2009/04/calculating-the-effects-of-cookie-deletion/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
	</channel>
</rss>
