Using Statistical Process Control to Analyze Web Activity
By David Shearer, Technical Director, Northwest Analytical
Evaluating web site activity is tricky business. Because web activity varies
constantly, the challenge is to separate everyday random variation from the
non-random changes due to marketing programs, web site design, or search engine
rank.
Failing to recognize the sources of variation can result in one of two errors:
- Random variation in activity is "explained" by a change in marketing
or the web site when no "real" change in activity occurred.
- A real change in activity is ignored.
Either of the two errors wastes resources and business opportunities.
Enter Statistical Process Control (SPC). Control charts were invented specifically to
separate these two sources of variation. Although control charts are most commonly
used in manufacturing, they work with virtually any time
series. They are also easy to understand and use.
On a control chart, upper and lower control limits are calculated automatically
from the data to separate common (random) variation from special (non-random)
variation. Points inside the control limits are attributed to common variation.
Points outside the limits signal special causes.
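As a rough illustration of how such limits can be calculated, the sketch below uses the individuals (XmR) chart convention: the central line is the mean, and the limits sit 2.66 average moving ranges on either side. The article does not say which chart type or formula Quality Analyst applies, so the chart type, the function names, and the weekly counts are all assumptions made for illustration.

```python
from statistics import mean


def individuals_limits(values):
    """Return (lcl, center, ucl) for an individuals (XmR) chart."""
    if len(values) < 2:
        raise ValueError("need at least two points to form a moving range")
    center = mean(values)
    # Moving range: absolute difference between each point and the one before it.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    # 2.66 is 3 / 1.128, the usual constant for moving ranges of two points.
    half_width = 2.66 * mr_bar
    return center - half_width, center, center + half_width


def points_outside_limits(values, lcl, ucl):
    """Return the indices of points that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Made-up weekly session counts with one unusual week (index 8).
weekly_sessions = [1200, 1150, 1230, 1180, 1210, 1170,
                   1240, 1190, 2400, 1220, 1160, 1205]

lcl, center, ucl = individuals_limits(weekly_sessions)
flagged = points_outside_limits(weekly_sessions, lcl, ucl)
print(f"LCL={lcl:.1f} center={center:.1f} UCL={ucl:.1f} flagged={flagged}")
```

With this made-up data only the spike week lands outside the limits. A single extreme point also widens the moving-range estimate, which is one reason a smaller spike can go undetected in a short series.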
To make the charts more sensitive to small, sustained shifts in level, "pattern
rules" are also used. An example rule is, "Eight consecutive points
above the central line signals a special cause."
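The example rule can be sketched as a simple run counter. The function name and the sample data are hypothetical, and the run length of eight simply follows the rule quoted above; other rule sets use different lengths.

```python
def runs_above_center(values, center, run_length=8):
    """Return indices at which a run of `run_length` or more
    consecutive points above the central line is in progress."""
    signals = []
    streak = 0
    for i, value in enumerate(values):
        # A point on or below the central line resets the run.
        streak = streak + 1 if value > center else 0
        if streak >= run_length:
            signals.append(i)
    return signals


# Made-up series: points 2 through 9 all sit above a central line of 10.
series = [10, 9, 11, 12, 11, 13, 12, 11, 12, 13, 9]
signals = runs_above_center(series, center=10)
print(signals)
```

None of these points needs to be outside the control limits for the rule to fire, which is what makes it useful for small, sustained shifts.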
If a shift in level is identified, a separate set of control limits can be
applied to pre- and post-shift data.
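A minimal sketch of this idea follows: once a shift point has been chosen, each segment gets its own central line and limits. The limit formula (mean plus or minus 2.66 average moving ranges, as on an individuals chart) is an assumption, as are the function names, the shift position, and the data.

```python
from statistics import mean


def xmr_limits(values):
    """Return (lcl, center, ucl) using the individuals-chart convention."""
    if len(values) < 2:
        raise ValueError("each segment needs at least two points")
    center = mean(values)
    mr_bar = mean([abs(b - a) for a, b in zip(values, values[1:])])
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar


def segmented_limits(values, shift_index):
    """Compute separate limits for the data before and after a shift."""
    return xmr_limits(values[:shift_index]), xmr_limits(values[shift_index:])


# Made-up weekly counts with a level shift starting at index 6.
weekly = [1000, 1040, 980, 1020, 1010, 990,
          1600, 1650, 1580, 1620, 1610, 1640]

pre, post = segmented_limits(weekly, shift_index=6)
print("pre-shift :", [round(x, 1) for x in pre])
print("post-shift:", [round(x, 1) for x in post])
```

In this example the two sets of limits do not overlap, so each period is judged against its own level rather than against one wide band stretched over both.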
As an example, our firm, Northwest Analytical (NWA), charts the weekly number
of web sessions on the company site. After eliminating robot and virus (worm)
hits, we chart the count of sessions where three or more pages were hit. Our
SPC software product, NWA Quality Analyst, automatically extracts the data from
a SQL Server database of web logs and charts the data. All the user has to do
is enter the desired date range.
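The counting step described here can be sketched as follows. This is not how Quality Analyst or the SQL Server extraction works internally; the record layout (`day`, `pages`, `is_robot`), the function name, and the choice to start weeks on Monday are all hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def weekly_qualifying_sessions(sessions, min_pages=3):
    """Count sessions per week, skipping robot traffic and short visits."""
    counts = Counter()
    for session in sessions:
        if session["is_robot"] or session["pages"] < min_pages:
            continue
        # Key each session by the Monday that starts its week (an assumption).
        day = session["day"]
        week_start = day - timedelta(days=day.weekday())
        counts[week_start] += 1
    return dict(sorted(counts.items()))


# Made-up session records.
sessions = [
    {"day": date(2002, 4, 1), "pages": 5, "is_robot": False},
    {"day": date(2002, 4, 3), "pages": 1, "is_robot": False},   # too short
    {"day": date(2002, 4, 4), "pages": 12, "is_robot": True},   # robot
    {"day": date(2002, 4, 5), "pages": 3, "is_robot": False},
    {"day": date(2002, 4, 9), "pages": 4, "is_robot": False},
]

weekly_counts = weekly_qualifying_sessions(sessions)
for week_start, count in weekly_counts.items():
    print(week_start.isoformat(), count)
```

The resulting weekly counts are the kind of series that would then be plotted on the control chart.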
The control chart shown in Figure 1 displays the web sessions for January 30,
2000 through April 29, 2002:

The chart reveals the following:
- Out-of-control points around the holidays can be ignored; they occur every
year.
- NWA made a minor change to web site navigation in mid-May of 2001 ("Article
Button"). There was an immediate increase in sessions of three or more pages.
We think visitors are finding the articles through a search engine and are
now more likely to look at the rest of the site. We believe some single-page
sessions have turned into multi-page sessions.
- Beginning in January 2002, we saw a jump of more than 600 sessions per week. Our
Google Ad Select campaign began in early March and yields 170 sessions per week, so
it does not explain the entire increase.
- The chart is very sensitive; it's easy to measure the effect of a marketing
campaign. We can detect a one- or two-week change of 1000 sessions, or a sustained
shift (five to eight weeks) of 200 sessions.

As shown in Figure 2, the hit rate on web site news articles (individual hits,
not sessions) shows the effect of our web site navigation change more dramatically.
There was an increase of 1000 article hits per week after the "Article
Button" was added. Since the start of February we have also noticed an
upward shift in article hits. The unusually high rates during June and August
2000 were due to external technical factors outside of our control.
Quality Analyst software can analyze web log data in other ways. The most popular
25 articles from January 27 to April 27 appear below in Figure 3:

As we can see, the most popular article is a Petrochemicals article.
Figure 4 is a control chart of the petrochemicals article hits. The article
is old (1999), but the number of hits increased in February 2002 from 200 per
week to about 300 per week. Is the petrochemicals sector looking up?

David Shearer is Technical Director at Northwest Analytical where he's responsible
for the statistical and analytical features of NWA's products. He is also a
primary instructor for the company's SPC seminars and provides statistical support
to customers. Dave is also certified for the operation of public firework displays.