<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>showmeanalytics.com &#187; cookies</title>
	<atom:link href="http://showmeanalytics.com/tag/cookies/feed/" rel="self" type="application/rss+xml" />
	<link>http://showmeanalytics.com</link>
	<description>Analytics from the Show Me State</description>
	<lastBuildDate>Wed, 05 May 2010 23:56:40 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Estimating the effects of cookie-deletion</title>
		<link>http://showmeanalytics.com/2009/04/calculating-the-effects-of-cookie-deletion/</link>
		<comments>http://showmeanalytics.com/2009/04/calculating-the-effects-of-cookie-deletion/#comments</comments>
		<pubDate>Fri, 10 Apr 2009 11:59:43 +0000</pubDate>
		<dc:creator>angie</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[cookie-deletion]]></category>
		<category><![CDATA[cookies]]></category>
		<category><![CDATA[unique visitors]]></category>

		<guid isPermaLink="false">http://showmeanalytics.com/?p=75</guid>
		<description><![CDATA[There are differing opinions on how to label the metric historically known as &#8220;Unique Visitors&#8221;. On one side of the fence are those who think it should be relabeled &#8220;Unique Cookies&#8221;, since that is the most popular method used for calculations. On the other side of the fence are others who think the metric is [...]]]></description>
			<content:encoded><![CDATA[<p>There are differing opinions on how to label the metric historically known as &#8220;Unique Visitors&#8221;. On one side of the fence are those who think it should be relabeled &#8220;Unique Cookies&#8221;, since that is the most popular method used for calculations. On the other side of the fence are others who think the metric is a catch-all for the &#8220;best available&#8221; measurement (authenticated visitors, cookies if those aren&#8217;t available, IP/UA combination if neither is available) and should be replaced with a different term if/when a better, standardized way to measure people comes along. What we all agree on, though, is that a Unique Visitor metric measured with cookies is terribly inaccurate.</p>
<p>How bad is it? As usual, it depends. A site where people tend to visit on a daily basis will see more inflation from cookie-deletion than a site that is only visited once per month. In the former example, one person may count toward the monthly total as many as 30 or so times, while in the latter, even a frequent cookie-deleter would only count once.</p>
<p>Let&#8217;s pretend that we know something about the actual people visiting a site, and see if we can determine by how much our web analytics numbers might be affected by cookie-deletion.</p>
<p>In order to make the calculations easy, I have assumed people only visit or delete their cookies on daily, weekly, or monthly boundaries, and I am only considering a one month time frame. However, the same logic could be applied to more granular data. It ultimately boils down to a matrix algebra problem, but I doubt many of us are eager to get into that level of detail.</p>
<p><strong>An example</strong></p>
<p><a href="http://showmeanalytics.com/wp-content/uploads/2009/04/crowd1.jpg"><img src="http://showmeanalytics.com/wp-content/uploads/2009/04/crowd1.jpg" alt="crowd1" title="crowd1" width="200" height="150" class="alignright size-full wp-image-90" /></a>Consider 10,000 people:  not 10,000 &#8220;cookies&#8221; and not 10,000 &#8220;unique visitors,&#8221; but 10,000 real-life, carbon-based beings. Suppose we are able to observe these people in such a way that we know &#8220;the truth&#8221; about their online behavior. Suppose also we have observed that, on average, 10% of our people delete their cookies every day, 15% delete once per week, and the remainder delete their cookies monthly or less frequently.</p>
<p>We have also observed that 20% of these people visit our website every day, 30% visit once per week, and the remainder only visit once in a given month. There&#8217;s no reason for these people to log in to our website (all visits are anonymous), and we count Unique Visitors using a cookie.</p>
<p><strong>How bad is it?</strong></p>
<p>Our first step is to find out how many different cookies each person will receive over the course of the month, based on the number of times they visit our site and how often they delete their cookies. For simplicity&#8217;s sake, we&#8217;ll assume each month has 30 days and 4 weeks.</p>
<p>Daily deleters will receive a different cookie each time they visit, so those who visit daily will log 30 cookies and those who visit weekly will log 4. Weekly deleters who visit every day will log 4 different cookies over the course of the month, as will weekly deleters who visit once per week. Everybody else&#8217;s activity will be logged with one cookie. We can summarize as shown below.</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="160" valign="top"><a name="OLE_LINK1">One month&#8217;s time&#8230;</a></td>
<td width="160" valign="top"><em>Daily deleters</em></td>
<td width="160" valign="top"><em>Weekly deleters</em></td>
<td width="160" valign="top"><em>Monthly deleters</em></td>
</tr>
<tr>
<td width="160" valign="top"><em>Visit every day</em></td>
<td width="160" valign="top">30 cookies</td>
<td width="160" valign="top">4 cookies</td>
<td width="160" valign="top">1 cookie</td>
</tr>
<tr>
<td width="160" valign="top"><em>Visit once/week</em></td>
<td width="160" valign="top">4 cookies</td>
<td width="160" valign="top">4 cookies</td>
<td width="160" valign="top">1 cookie</td>
</tr>
<tr>
<td width="160" valign="top"><em>Visit once/month</em></td>
<td width="160" valign="top">1 cookie</td>
<td width="160" valign="top">1 cookie</td>
<td width="160" valign="top">1 cookie</td>
</tr>
</tbody>
</table>
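<p>The table follows a simple rule, sketched here in Python. The <code>min</code> rule and all names are assumptions of this sketch rather than anything taken from the spreadsheet: with visits and deletions aligned on day/week/month boundaries, a person logs one cookie for every cookie lifetime in which they visit, which is the smaller of visits per month and deletions per month.</p>

```python
# Events per month under the 30-day, 4-week simplification.
PER_MONTH = {"daily": 30, "weekly": 4, "monthly": 1}


def cookies_per_person(visit: str, delete: str) -> int:
    """Distinct cookies one person logs in a month.

    Assumption: visits and deletions fall on the same day/week/month
    boundaries, so each cookie lifetime with a visit yields one cookie.
    """
    return min(PER_MONTH[visit], PER_MONTH[delete])


# Print one row per visiting frequency, one column per deletion frequency.
for visit in PER_MONTH:
    print(visit, [cookies_per_person(visit, delete) for delete in PER_MONTH])
```

<p>Each printed row corresponds to one row of the table: daily, weekly, then monthly visitors.</p>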
<p>Using the above factors, we can determine the cookie-contribution from each set of people:</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="157" valign="top">(Monthly Calculations)</td>
<td width="134" valign="top">Delete daily (10%)</td>
<td width="134" valign="top">Delete weekly (15%)</td>
<td width="134" valign="top">Delete monthly (75%)</td>
<td width="79" valign="top"># Cookies</td>
</tr>
<tr>
<td width="157" valign="top">Visit daily (2000)</td>
<td width="134" valign="top">2000 x 10% x 30 = 6000</td>
<td width="134" valign="top">2000 x 15% x 4 = 1200</td>
<td width="134" valign="top">2000 x 75% x 1 = 1500</td>
<td width="79" valign="top">8700</td>
</tr>
<tr>
<td width="157" valign="top">Visit weekly (3000)</td>
<td width="134" valign="top">3000 x 10% x 4 = 1200</td>
<td width="134" valign="top">3000 x 15% x 4 = 1800</td>
<td width="134" valign="top">3000 x 75% x 1 = 2250</td>
<td width="79" valign="top">5250</td>
</tr>
<tr>
<td width="157" valign="top">Visit monthly (5000)</td>
<td width="134" valign="top">5000 x 10% x 1 = 500</td>
<td width="134" valign="top">5000 x 15% x 1 = 750</td>
<td width="134" valign="top">5000 x 75% x 1 = 3750</td>
<td width="79" valign="top">5000</td>
</tr>
<tr>
<td width="157" valign="top"></td>
<td width="134" valign="top">7700</td>
<td width="134" valign="top">3750</td>
<td width="134" valign="top">7500</td>
<td width="79" valign="top"><strong>18,950</strong></td>
</tr>
</tbody>
</table>
<p>Wow. Our 10,000 people are being represented as 18,950 unique visitors: the Unique Visitors number is inflated by 90%!</p>
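<p>For anyone who would rather check the arithmetic in code than in a spreadsheet, here is a minimal Python sketch of the same calculation. The variable and function names are made up for illustration, and the sketch assumes (as the table does) that visiting and deleting habits are independent of each other.</p>

```python
PEOPLE = 10_000
# Percent of people in each visiting / deleting segment (from the example).
VISIT_PCT = {"daily": 20, "weekly": 30, "monthly": 50}
DELETE_PCT = {"daily": 10, "weekly": 15, "monthly": 75}

# Cookies logged per person in a month: COOKIES[visit][delete].
COOKIES = {
    "daily": {"daily": 30, "weekly": 4, "monthly": 1},
    "weekly": {"daily": 4, "weekly": 4, "monthly": 1},
    "monthly": {"daily": 1, "weekly": 1, "monthly": 1},
}


def total_cookies(people, visit_pct, delete_pct):
    """Sum the cookie contribution of every visit/delete segment.

    Assumes independence, so a segment holds
    people * visit% * delete% persons.
    """
    total = 0
    for visit, v in visit_pct.items():
        for delete, d in delete_pct.items():
            # Dividing by 10,000 undoes the two percentages; the
            # integer division is exact for the numbers used here.
            segment_people = people * v * d // 10_000
            total += segment_people * COOKIES[visit][delete]
    return total


cookies = total_cookies(PEOPLE, VISIT_PCT, DELETE_PCT)
print(cookies, cookies / PEOPLE)
```

<p>With the example percentages this gives the 18,950 cookies from the table, or roughly 1.9 cookies per person.</p>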
<p><strong>Visitor loyalty reports are affected, too</strong></p>
<p>Unique Visitors isn&#8217;t the only number that&#8217;s affected by cookie-deletion. Any visitor-based number is going to be off, so you will have a lot of trouble understanding visitor loyalty. You can tell when your efforts to improve loyalty are working, since the numbers will move in the right direction, but the <em>magnitude</em> of change will be misleading.</p>
<p>For the above example, we know that 20% of our people visited daily, 30% visited weekly, and 50% visited monthly, so the number of visits in a month works out to 77,000 (2000 x 30 + 3000 x 4 + 5000). Visits aren&#8217;t affected by cookie-deletion to any great extent (a good argument for visit-based analysis!) &#8211; so our tool will also report 77,000 visits.</p>
<p>This means our <em>people</em> averaged 7.70 visits (77000/10000) over the course of the month, but our web analytics tool will only report 4.06 (77000/18950) visits per visitor because the visitor count is inflated. Our average visits per visitor are under-reported by 47%!</p>
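<p>The same arithmetic as a short Python sketch (the names are illustrative only):</p>

```python
PEOPLE = 10_000
COOKIES = 18_950  # cookie-based "unique visitors" from the example

# Visits per month: daily visitors make 30, weekly 4, monthly 1.
visits = 2_000 * 30 + 3_000 * 4 + 5_000 * 1

true_avg = visits / PEOPLE       # visits per real person
reported_avg = visits / COOKIES  # visits per cookie, as a tool reports it
under_reporting = 1 - reported_avg / true_avg

print(visits, round(true_avg, 2), round(reported_avg, 2),
      round(under_reporting * 100))
```

<p>Note that the under-reporting depends only on the ratio of people to cookies (1 - 10000/18950), not on the number of visits.</p>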
<p>If you prefer to view loyalty using a histogram (# visitors who visited once, twice, three times, etc.), then we need to determine to which bin each visitor&#8217;s cookies will be credited.</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="139" valign="top">(Monthly histogram)</td>
<td width="162" valign="top"><em>Delete daily</em></td>
<td width="162" valign="top"><em>Delete weekly</em></td>
<td width="162" valign="top"><em>Delete monthly</em></td>
</tr>
<tr>
<td width="139" valign="top"><em>Visit daily</em></td>
<td width="162" valign="top" bgcolor="#ffffcc">Each cookie is logged only once</td>
<td width="162" valign="top">Each cookie is seen 7 times (once a day for a week)</td>
<td width="162" valign="top">Each cookie is seen 30 times (every day)</td>
</tr>
<tr>
<td width="139" valign="top"><em>Visit weekly</em></td>
<td width="162" valign="top" bgcolor="#ffffcc">Each cookie is logged only once</td>
<td width="162" valign="top" bgcolor="#ffffcc">Each cookie is logged only once</td>
<td width="162" valign="top">Each cookie is seen 4 times in a month</td>
</tr>
<tr>
<td width="139" valign="top"><em>Visit monthly</em></td>
<td width="162" valign="top" bgcolor="#ffffcc">Each cookie is logged only once</td>
<td width="162" valign="top" bgcolor="#ffffcc">Each cookie is logged only once</td>
<td width="162" valign="top" bgcolor="#ffffcc">Each cookie is logged only once</td>
</tr>
</tbody>
</table>
<p>The first thing that jumps out of this table is that <em>the majority of the cookies are only encountered once, regardless of how many times someone actually visited</em>. This explains why visitor loyalty graphs, regardless of the tool used, are often overloaded with so many one-time visitors.</p>
<div id="attachment_78" class="wp-caption aligncenter" style="width: 681px"><img class="size-full wp-image-78" title="visitor duration graphs" src="http://showmeanalytics.com/wp-content/uploads/2009/04/duration_graphs.jpg" alt="visitor duration graphs" width="671" height="289" /><p class="wp-caption-text">visitor duration graphs</p></div>
<p>Applying the frequencies in the histogram table to the numbers in the calculations table shows us how our visitor retention graph is affected by cookie-deletion.</p>
<div id="attachment_79" class="wp-caption aligncenter" style="width: 538px"><img class="size-full wp-image-79" title="frequency corrections" src="http://showmeanalytics.com/wp-content/uploads/2009/04/frequency_correction.jpg" alt="visitor frequency corrections" width="528" height="77" /><p class="wp-caption-text">visitor frequency corrections</p></div>
<p>Again, wow! While 20% of our people visited the site every day, with cookie-based visitor counting, only 8% appear in this super-loyal segment. The majority of the &#8220;visitors&#8221; that were added to the site via cookie-deletion appear in the 1 visit bin, inflating that number by almost a factor of 3.</p>
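<p>The binning can be sketched in Python as follows. This is a reconstruction under the same assumptions as the example (independent habits, 30 days and 4 weeks per month), with made-up names; it is not the spreadsheet itself. Each segment&#8217;s cookies are credited to the bin matching how many visits a single cookie lives to see.</p>

```python
from collections import Counter

PEOPLE = 10_000
VISIT_PCT = {"daily": 20, "weekly": 30, "monthly": 50}
DELETE_PCT = {"daily": 10, "weekly": 15, "monthly": 75}
PER_MONTH = {"daily": 30, "weekly": 4, "monthly": 1}  # 30 days, 4 weeks

actual = Counter()    # people per true monthly visit count
reported = Counter()  # cookies per observed visit count

for visit, v in VISIT_PCT.items():
    for delete, d in DELETE_PCT.items():
        segment_people = PEOPLE * v * d // 10_000
        visits = PER_MONTH[visit]
        cookies_each = min(visits, PER_MONTH[delete])
        # 30 // 4 gives the "7 times" used in the histogram table.
        visits_per_cookie = visits // cookies_each
        actual[visits] += segment_people
        reported[visits_per_cookie] += segment_people * cookies_each

# Bin, cookies the tool reports, people who truly belong there.
for visit_bin in sorted(reported):
    print(visit_bin, reported[visit_bin], actual[visit_bin])
```

<p>Under these assumptions the 30-visit bin holds 1,500 of the 18,950 cookies (about 8%), and the 1-visit bin holds 14,000 cookies against 5,000 true one-time visitors, which is consistent with the figures quoted above.</p>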
<p><strong>Adding Authentication</strong></p>
<p>If 100% of the people visiting the above site authenticated, and the authenticated visitor identifier were used to count unique visitors, then the number would be pretty accurate (ignoring shared logins, etc.). But most sites don&#8217;t require authentication to see certain pages, so the likelihood of 100% authentication is low except in special cases, like intranets.</p>
<p>For our example above, what if half the people authenticated? Half of the 10,000 people would be more-or-less accurately represented, while our cookie-deletion calculations would apply to the remaining 5,000. The unique visitor inflation factor decreases as the percentage of authenticated people increases.</p>
<div id="attachment_86" class="wp-caption aligncenter" style="width: 493px"><a href="http://showmeanalytics.com/wp-content/uploads/2009/04/uv_graph.jpg"><img class="size-full wp-image-86" title="uv_graph" src="http://showmeanalytics.com/wp-content/uploads/2009/04/uv_graph.jpg" alt="Assumed: 10/15/75 cookie deletion and 20/30/50 visiting frequency (daily/weekly/monthly)." width="483" height="291" /></a><p class="wp-caption-text">Assumed: 10/15/75 cookie deletion and 20/30/50 visiting frequency (daily/weekly/monthly).</p></div>
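<p>A minimal Python sketch of that relationship, assuming authenticated people are counted exactly once and that logging in is unrelated to visiting or cookie-deleting habits (both are assumptions of this sketch, and the function name is hypothetical):</p>

```python
COOKIE_ONLY_INFLATION = 1.895  # 18,950 cookies / 10,000 people


def blended_inflation(auth_share: float) -> float:
    """Unique-visitor inflation when a share of people authenticate.

    auth_share is a fraction between 0.0 and 1.0. Authenticated people
    count once; everyone else is inflated by the cookie-only factor.
    """
    return auth_share * 1.0 + (1.0 - auth_share) * COOKIE_ONLY_INFLATION


for pct in (0, 25, 50, 75, 100):
    print(pct, round(blended_inflation(pct / 100), 2))
```

<p>With half the people authenticated, the factor falls from about 1.9 to about 1.45.</p>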
<p><strong>Is it always that bad?</strong></p>
<p>Not necessarily: it could be worse or it could be better. The above examples assumed that 20% of people visited the website every day, and 30% visited weekly. This was an arbitrary example meant to make calculations easier. In real life, you may have far fewer daily visitors (or more; it just depends on the site). Running the numbers assuming 5% daily and 50% weekly visitors, for example, results in a unique visitor inflation factor of about 1.5 instead of the 1.9 calculated in our example.</p>
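<p>This kind of what-if is where the matrix form mentioned earlier comes in handy. A rough Python sketch, with illustrative names and the same 10/15/75 deletion split assumed throughout:</p>

```python
# Rows: visit daily/weekly/monthly. Columns: delete daily/weekly/monthly.
COOKIES = [
    [30, 4, 1],
    [4, 4, 1],
    [1, 1, 1],
]
DELETE_SHARE = [0.10, 0.15, 0.75]


def inflation(visit_share, delete_share=DELETE_SHARE):
    """Cookies per person: visit_share (row) x COOKIES x delete_share."""
    # Average cookies per person within each visiting group.
    per_visit_group = [
        sum(c * d for c, d in zip(row, delete_share)) for row in COOKIES
    ]
    # Weight each group by its share of all people.
    return sum(v * g for v, g in zip(visit_share, per_visit_group))


print(inflation([0.20, 0.30, 0.50]))  # the worked example
print(inflation([0.05, 0.50, 0.45]))  # 5% daily, 50% weekly visitors
```

<p>The two calls come out at roughly 1.9 and 1.5, matching the numbers above; swapping in your own shares is a one-line change.</p>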
<p>I&#8217;ve attached <a href='http://showmeanalytics.com/wp-content/uploads/2009/04/abb_cookie_deletion_200904.xlsx'>my spreadsheet</a> so you can run your own what-ifs.</p>
]]></content:encoded>
			<wfw:commentRss>http://showmeanalytics.com/2009/04/calculating-the-effects-of-cookie-deletion/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
	</channel>
</rss>
