Tracking down rogue bandwidth: a story of Comcast data caps and offsite backup

tl;dr: if you use BackBlaze and are subject to a data cap, you should keep an eye on BackBlaze to make sure it doesn’t quietly use up all your data allowance. I’ve switched to CrashPlan, which has better retention, encryption, and backup policies.

Beyond sharing this warning, I wanted to do a longer write-up of what I experienced because I found some pretty interesting things along the way – how to get true visibility into what’s happening on your local network, what’s up with Comcast’s usage meter, and what happens when something goes wrong with offsite backup with BackBlaze.

Earlier in the year Comcast announced they were introducing 1 terabyte data caps in 27 markets across the US (18 of which previously did not have enforced caps). This is a pretty consumer-unfriendly move, and being in California it was happening to me – but personally I had never come close to exceeding the cap, so I wasn't overly worried (and won't be until 4K video becomes more common). In the second week of November 2016 I got a notification in my browser telling me I'd used 90% of my cap. This was surprising in two ways – one, I didn't expect Comcast to hijack an HTTP request to do this, and two, it was saying I'd used over 900 GB of data in just over a week. I was about to head overseas for 8 days, so I decided to shut everything down while I was away and figure out what was using the data once I got back.

While I was away, I saw my data usage continue to increase by approximately 80 GB a day. I had left my Mac Mini server running, a Dropcam, and a Nest, and that was it. I checked remotely to ensure there were no other wireless clients on my network, but couldn't see anything. I used Activity Monitor to keep track of how much bandwidth each process was using on the server, and saw BackBlaze (bztransmit) had transmitted ~50 GB, but my uptime was 3 weeks at that point, so that seemed about right. I was completely baffled – nothing else was being generated by the Mac Mini, and the amount of data uploaded by the Dropcam was also minimal, yet each day the usage kept increasing.

With no obvious culprits on my network, it seemed suspicious that I was exceeding my cap just as the caps were being introduced, so my attention turned to whether Comcast's usage reporting was accurate.

Comcast gets a lot of heat for their data caps, and in particular for their usage meter accuracy. When trying to use the meter to diagnose bandwidth issues, Comcast itself says it should not be relied on. They have a disclaimer on the meter saying it is “delayed by up to 24 hours” (and on the phone I was told it could lag for weeks). By the end of figuring all this out, I actually found their usage meter to be accurate in realtime, but others online report usage changing after the fact. I also found numerous people online complaining that their usage meter did not match their self-evaluated usage (some measured using tools such as using Gargoyle firmware on their router). However, most of these quibbles were off by 10% or so, not the orders of magnitude I was seeing. I was also unable to find well-proven cases of their meter being wildly off except for one instance where a MAC address had been entered with a single character typo. There was only one other example of Comcast rolling back fees for overages – but it only happened after the media became involved, and nobody technically proficient actually checked the network.

Given the meter was likely accurate, I was keen to find an inexpensive (and preferably software-based, so I could do it remotely) method for measuring what was actually going through the cable modem. Most of the advice I found suggested buying a router that supported Gargoyle, but I also discovered that the Motorola SB6121 cable modem I use reports the number of codewords per channel, aka bytes.

In the modem's signal status page, you can see the aggregate codewords for the 4 channels is 275,455,328,290, which is 275.46 gigabytes. This was exactly as I would've expected. I remotely reset the cable modem using the mobile app (another quirk – Comcast removed the ability for me to do that myself through the cable modem's own interface via a firmware update, even though I own the modem), and after doing so my traffic was measured in megabytes per day, which was what I expected. At this point I was baffled, as advice I'd seen repeated a few times online is to use the codewords as the source of truth – but they do NOT include uploads, which Comcast counts towards your cap.
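
A sketch of the codeword arithmetic described above. The aggregate figure is the one from my modem's status page, and treating codewords as bytes is the assumption the online advice (and this post) makes:

```python
# Converting the modem's aggregate codeword count to gigabytes.
TOTAL_CODEWORDS = 275_455_328_290  # sum across the modem's 4 downstream channels

# Treat one codeword as one byte; use decimal GB, matching Comcast's meter units.
gigabytes = TOTAL_CODEWORDS / 1e9
print(f"{gigabytes:.2f} GB")  # → 275.46 GB
```

Note again that this only covers the downstream channels, so it will undercount against Comcast's meter, which includes uploads.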

At this point I didn’t know that, and since I had no idea how they were measuring so much traffic, I kept escalating with Comcast, eventually being told a level 3 support technician would contact me. That was a couple of weeks ago, and I never heard anything more from Comcast.

To be completely sure it wasn't something on my end, my next step was to use SNMP logging to verify the amount of data leaving my network via the ethernet connection to the cable modem. SNMP ("simple network management protocol") is a standardized way of reporting and collating information about managed devices on a network. Routers with SNMP logging are able (amongst other things) to report exactly how much bandwidth is being consumed on each of their many interfaces (which are unhelpfully named, but a good description of each on an Airport Extreme is available here). Helpfully, mgi1 is the interface specifically for the WAN connection – i.e. I was able to measure very specifically every bit of traffic going to and from the cable modem. Unfortunately, Apple took this feature out of the 802.11ac Airport Extreme, which I was using as my main router. As such I reconfigured my entire network to use my older router and an old Airport Express (which also includes SNMP logging).
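
Under the hood, SNMP bandwidth accounting is just periodic polling of per-interface octet counters and taking deltas. A minimal sketch in Python (the sample values and 60-second interval are made up for illustration; the counter names come from the standard IF-MIB, and classic 32-bit counters wrap, which the delta has to tolerate):

```python
# Turning two samples of an SNMP octet counter (IF-MIB ifInOctets/ifOutOctets)
# into an average throughput figure, as monitoring tools do when polling.
COUNTER_MAX = 2 ** 32  # SNMP's Counter32 wraps at 2^32

def throughput_mbps(octets_start: int, octets_end: int, seconds: float) -> float:
    """Average throughput in megabits/sec between two counter samples."""
    delta = (octets_end - octets_start) % COUNTER_MAX  # tolerates one wraparound
    return delta * 8 / seconds / 1_000_000

# 37,500,000 octets (37.5 MB) sent over a 60-second polling interval:
print(throughput_mbps(0, 37_500_000, 60))  # → 5.0 (Mbps)
```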

I now had SNMP logging of all my internal network bandwidth, and to visualize it I used PeakHour 3, which did a great job of making it very easy to see what was happening. I finally had proof my Mac Mini really was uploading a LOT of data, and it matched Comcast's usage numbers. But Activity Monitor still did not show me that anything was out of sorts, so I still didn't know WHAT was causing all the usage.

I researched what other tools I could use to monitor network traffic by process, and found Little Snitch. Little Snitch lets you police all network requests, and approve or deny them, which is pretty nifty, but all I needed was the monitoring tool. This let me see that bztransmit was happily uploading at 5-10 Mbps in bursts every few minutes, and the cumulative total from this process matched exactly the upload traffic seen in the SNMP logging. Leaving it for even just a few hours, it was clear this was the culprit.

I throttled BackBlaze using their preference pane to 128 kbps (it claimed it would upload “approximately 1 GB a day”, compared to the 4 GB/hour it was doing at the time) and contacted their customer support. While I waited to hear back from them, I started reading through the BackBlaze log files, and saw it was uploading the same amount each day, and then a little more, e.g.:

server:bzreports_eventlog admin$ cat 10.log
 2016-11-10 04:55:40 - END_BACKUP: Backed up 2801 FILES / 40617 MB

server:bzreports_eventlog admin$ cat 17.log
 2016-11-17 05:02:54 - END_BACKUP: Backed up 2933 FILES / 51489 MB

They helpfully have a log of the last files uploaded, located in /Library/Backblaze/bzdata/bzlogs/bzreports_lastfilestransmitted. The same files were being uploaded each day.

Another thing I noticed in the log files is that BackBlaze appeared to be downloading updates and/or reinstalling itself basically every day. As far as I can tell, this was resetting the per-process usage in Activity Monitor, which is why it was not a reliable measure.

While I was investigating these logs, I continued monitoring my network, and noticed that while BackBlaze was “throttled”, it ended up uploading nearly 4 GB of data in 24 hours. The bztransmit process was using just under 1 Mbps of bandwidth, approximately 8x the promised throttle limit. To be clear, I have been using megabits for all my bandwidth measures in this post. I have to assume there is a mistake in a conversion somewhere, which would perfectly explain why it was uploading at 128 KBps (kilobytes per second) rather than 128 kbps (kilobits per second). The annotation in their UI and their documentation is ambiguous, as it's all lowercase and abbreviated; however, their estimate implies the numbers shown should be kilobits (128 kbps = 16 KBps; 16 KBps × 86,400 seconds ≈ 1.38 GB/day).
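
The arithmetic behind that estimate, spelled out (a sketch; the 8x factor is exactly the bits-to-bytes ratio):

```python
# If "128 kbps" means kilobits/sec, the daily volume matches BackBlaze's own
# "approximately 1 GB a day" estimate. If the throttle actually runs at 128
# kilobytes/sec, you get 8x that volume instead.
SECONDS_PER_DAY = 86_400

as_kilobits_gb_per_day = 128 * 1_000 / 8 * SECONDS_PER_DAY / 1e9   # kilobits reading
as_kilobytes_gb_per_day = 128 * 1_000 * SECONDS_PER_DAY / 1e9      # kilobytes reading

print(f"{as_kilobits_gb_per_day:.2f} GB/day")   # → 1.38 GB/day
print(f"{as_kilobytes_gb_per_day:.2f} GB/day")  # → 11.06 GB/day
```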

BackBlaze got back to me the following day and asked for a copy of my logs – all of them. They gave me an unsigned tool which gathers all these logs, as well as a full system snapshot – approximately 50 MB of log files in my case. At this point I wasn't very comfortable. BackBlaze encrypts your files with your private key before uploading them, and according to an employee on Reddit, they can't even see the filenames. I really liked this feature – having all my private documents in the cloud is a scary proposition from a security perspective, and even if the contents are encrypted, the filenames themselves leak information (e.g. financial documents, photo folder names, etc). I wasn't particularly keen to send the logs over, as they are very chatty about your system and the files it's working on.

At this point things with BackBlaze broke down. The customer support person I was communicating with ignored my request to provide logs surgically rather than send all of them. And by ignore, I mean they stopped updating the ticket and ignored my updates to it. I was only able to re-engage them by pinging the BackBlaze Twitter account. They then refused to escalate further without the logs, and ignored my report of the throttling bug.

Without BackBlaze making any good faith efforts to remedy the situation (I should also note they were never apologetic about any of this, including just ignoring the ticket), I investigated alternatives, and have switched to CrashPlan. They offer a number of features that improve on BackBlaze's, including file versioning (and keeping deleted files backed up – BackBlaze will delete them after 30 days), 448-bit file encryption (versus 128-bit for BackBlaze), and support for NAS backups. They are $60/year compared to $50/year for BackBlaze.

I had been using BackBlaze for 6 years. I had been an evangelist, recommending it to family and friends, and likely referred at least a dozen new customers to them. To say the least, I was very disappointed by this experience. After this happened, I had a quick look around to see if it had happened to others. What I found was a pattern of issues with customer support, slow restores, other bugs (missing files came up several times), and backups being unexpectedly deleted from the server (e.g. someone goes on vacation and leaves their external drive at home, and while they're away the 30-day trigger hits; this also means that if you do have a catastrophic data failure, you have 30 days to get your computer set up again to do a restore). It was interesting to see the different approaches people took to vent their issues with the company, including Amazon reviews, Facebook, the BBB and even CNet.

Most tellingly I found someone on Reddit reporting the same issue I encountered – a year ago – with other users in the thread reporting the same problems.

At the end of all this, I've spent somewhere in the vicinity of 20-30 hours of my time diagnosing this and talking with customer support, and I've gone over my quota for two months (October and November – with the latter hitting 2 TB in the end), leaving me a single courtesy month with Comcast (after which I never get any courtesy months again). On the plus side, my network is now more secure, I learnt some interesting stuff along the way, and I was able to diagnose the cause literally 1 hour before midnight on November 30th, preventing me from going over again. The bummer is that there are two companies who have shown no interest in fixing issues for the consumer and instead put all the onus on them: BackBlaze needs to fix these bugs, and Comcast needs to provide their customers with better tools for diagnosing and monitoring network traffic if they're going to institute caps for everyone.

nginx, gunicorn and repeated requests

I ran into an interesting issue today that I could not find anywhere on Google, so wanted to document it.

In my move from C++ app development to web development, I suddenly found I was hugely distanced from the operations side of things. Part of this was an unfamiliarity with Linux systems, but it was also because of the more delineated roles between developers, DevOps and operations. More things are automated and handled by DevOps than I previously had to worry about – which leaves me more time to code – but it does mean I am less comfortable on that side than I used to be. However, one of my favorite things to do is still investigating the root cause of particularly tricky or hard-to-reproduce bugs, and this has led me to find some interesting edge cases in our technical setup.

It is common in web development to have multiple layers of routers and load balancers on top of each other performing slightly different roles. e.g. a security / payload manipulation layer on top, then a routing/caching layer underneath routing the request to an app itself. To take advantage of the hardware the software is running on, it is then common for the app itself to have many workers (in a combination of different processes, threads and greenlets) to service the requests.

The app I was working on today used nginx for routing, and gunicorn for actually serving the code. A while ago I'd solved an issue involving an nginx setting called proxy_next_upstream, which by default passes a request that has timed out on to the next node in the pool. For non-idempotent requests this can obviously cause huge issues, e.g. an object being copied multiple times instead of just once. The solution was to change nginx to only retry a request on a different node when the current node is down (i.e. returning a 502 to nginx), and otherwise return an error after the timeout occurred. This is done by changing the setting to proxy_next_upstream error; (compared to its default, proxy_next_upstream error timeout;).
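
For reference, the relevant part of the nginx configuration looks something like this (a sketch; the upstream name and addresses are made up):

```nginx
upstream app_servers {
    server 10.0.0.10:8000;  # hypothetical gunicorn nodes
    server 10.0.0.11:8000;
}

server {
    location / {
        proxy_pass http://app_servers;
        proxy_read_timeout 60s;  # nginx's default

        # Only pass a request to the next node on a connection error,
        # never on a timeout: retrying a timed-out non-idempotent request
        # is what caused objects to be copied multiple times.
        proxy_next_upstream error;
    }
}
```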

Recently I noticed the exact same issue I'd found previously seemed to be back, although all the configs were correct. I found two odd things: 1) the request was being retried every 30 seconds, even though the nginx timeout was set to the default of 60 seconds, and 2) there were NO further logs from the request that timed out once the 30 seconds had elapsed, whereas previously the app continued processing the request as normal.

Eventually I found that Gunicorn has a default worker timeout of 30 seconds – and that when it kills a worker, it does two things. First, it completely stops the worker from completing any more work (which is probably good if there's a deadlock or anything happening), and second, it returns a 502 status code to nginx (Bad Gateway), not a 504 (Gateway Timeout). This means that nginx thinks the request has not been processed at all – and passes it on to the next address in the pool. The reason this hadn't surfaced before was that the app was previously using Waitress – it wasn't until the move to Gunicorn that the problem appeared.

Given the popularity of both nginx and gunicorn, I'm surprised this issue hasn't been documented more. I guess it is rare for a request to take longer than 30 seconds, and when one does, the fact that it was retried has simply been missed.

The solution, of course, is to adjust the timeouts so that Gunicorn's is at least as long as nginx's. I ended up giving ours a little more leeway – so that any downstream connections (with the same timeout value) that might be slowing us down will time out, letting us do any cleanup, log a few things, and then get out, rather than the worker being killed at exactly 60 seconds.
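
Concretely, since gunicorn's config file is just Python, the fix can live in gunicorn.conf.py. This is a sketch: the 15-second leeway is my choice, and the 60 has to match nginx's proxy_read_timeout.

```python
# gunicorn.conf.py (sketch): give the worker timeout headroom over nginx's
# proxy timeout, so nginx gives up first and gunicorn never kills a worker
# out from under a request that nginx would then silently retry.
NGINX_PROXY_READ_TIMEOUT = 60  # seconds; must match the nginx config

# Extra leeway so downstream calls (with the same 60s timeout) can time out,
# and the app can clean up and log, before the worker would be killed.
timeout = NGINX_PROXY_READ_TIMEOUT + 15

workers = 4  # illustrative; tune for the host
```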

As always, it was incredibly satisfying figuring out and solving this issue – although I was quite chagrined to see how relatively simple the original issue was, despite manifesting in some confusing and difficult to pin down side effects.

Micro Python: Python for microcontrollers

A good friend of mine, Damien George, has been busy this year creating “Micro Python” – an implementation of Python 3.3 for microcontrollers. He has a Kickstarter going here – this is an amazingly cool project, please check it out! I’ve gone for the ‘starter kit’ – can’t wait to play with this and see what it can do.

Git and .pycs – not the best of friends

Coming from a C++ background, using Python for the first time was pretty amazing.  The ease with which you can link in other libraries and quickly write code is fantastic.  Learning and using Git for the first time on the other hand – that took some adjustment.

Git and I are on fairly good terms these days, and I’ve learned to love some of the features it offers compared to TFS (the last source control system I was using), especially when used in conjunction with Github.  However I hit a problem the other day that made _zero_ sense to me – and took a surprisingly long time to track down, due to lots of false positives in figuring out the issue.

I’d been switching back and forth between a few branches, and all of a sudden my unit tests were failing, and the app appeared to be testing things that shouldn’t exist in this branch.  Not only that, the app suddenly refused to run – I kept getting a pyramid.exceptions.ConfigurationExecutionError.  I found that if I added in a bunch of routes that existed in the _other_ branch, suddenly everything worked.

Experienced Python/Pyramid coders will probably know exactly what the problem was, but I was stumped.  It turns out that our Git repository is configured to ignore .pyc files – which is normally what you want and works fine.  However, when files exist in one branch and not in another, the stale .pyc files left behind after switching branches can cause all sorts of issues.

There are two main ways of solving this:

  1. Create a clean_pycs alias in your .bashrc file, that you can run to clean your pycs when you need to – i.e. alias clean_pycs='find . -name "*.pyc" -exec rm {} \;'
  2. Use a Git hook to automatically clean up the files every time you change branches – more info here.
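
If you'd rather keep the cleanup shell-agnostic (and option 2's hook can be any executable), the same find-and-delete is a few lines of Python; a sketch, where clean_pycs is my own name for the function:

```python
# The same cleanup as the clean_pycs alias, as a Python function.
# With a shebang line it can also serve as the body of a post-checkout hook.
import os

def clean_pycs(root: str) -> int:
    """Delete every .pyc file under root; return how many were removed."""
    removed = 0
    for dirpath, _subdirs, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".pyc"):
                os.remove(os.path.join(dirpath, name))
                removed += 1
    return removed
```

Functionally this matches the find . -name "*.pyc" -exec rm {} \; from option 1.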

There are other ways you can do it, mentioned in the follow-up to the blog post above, but these are the two easiest.  Which one you use is up to you – I’ve spoken to a few other coders at work, and because we work on a lot of different projects they tend to use the former, rather than having to add the hook in so many Git configs.

Hopefully this blog post helps somebody – I was incredibly thankful when I finally found the blog post linked above!

Learning Python

About a year ago my wife and I packed up all our things and moved to Silicon Valley to work for SurveyMonkey.  As well as changing job, city and country, I switched from being a Windows application developer to being a backend web developer.  I spent over 8 years writing high performance trading applications in C++ and SQL under Windows, and traded it all in for Python, Javascript, open source libraries, virtual Linux environments, and OS X.  Surprisingly, I did get to keep using SQL Server 2008 though!

I have noticed that in Silicon Valley people tend to be a lot more collaborative than in Melbourne.  I suspect it may be because of the heavier use of open source software, and also the rise of distributed source control like Git.  Back in Melbourne most technology companies seemed to use either Java, C++, or C# – when I left, Python and Ruby were becoming more prominent, particularly due to a burgeoning startup community, but they were still relatively rare compared to here.  Those languages don’t seem to lend themselves as well to open source collaboration – however, I will admit my experience was extremely filtered as a developer working in a Windows dev-shop that rarely used a third party library.

I’ve found the many tech blogs about Python and Pyramid extremely helpful when trying to solve problems, and in the spirit of this collaborative effort, I would like to give back a little with what I’ve found.  I also find writing up technology discussions and solutions very helpful for refining and consolidating my own thoughts, so even if this isn’t widely read, it’s a very useful exercise.

To kick off, I thought I’d mention something that seems to have caught surprisingly few people out on the web, but had me completely stumped a few weeks back at work.  It’ll follow in my next post.  Thanks for reading!

This is the golden age for new gadgets

Back in 2002 when I started work on my PhD, I remember feeling in awe of all the amazing new devices being released.  Bluetooth had just come out, PDAs were getting wifi, and mobile phones were getting cameras.

A lot of the designs back then were kind of clunky, but a lot were pretty amazing.  I remember being struck by the sleek design of the iPaq 3870, and the very cool (and thoughtfully ergonomic) T68 from Ericsson.  The Sony Vaios were to be the coolest thing in laptops for the next 8 years (I still love the S58GP/B).

Since then I haven’t been nearly as excited about new devices, and I’ve often wondered if it was just a phase I was going through.  It was certainly better for my wallet.

This year though has been a pretty amazing year for design and technology.  Here are a few of the gadgets wowing me at the moment:

Macbook Air

When the Macbook Air came out a few years ago I was pretty impressed from a technology perspective.  However, from a price/performance balance I was less than enthused.  Incredibly high price coupled with middling performance?  No thank you.  This year Apple suddenly released an even SMALLER Macbook Air as well as a version of the original with bumped specs and much lower prices? Done.  Between the SSD, the higher resolution screen and the lighter weight/longer battery life, the Macbook Air ceases to be a novelty and begins to be a competitor as an everyday computer.  Looking forward to the 15″ version hopefully some day soon.

Kindle 3

The first Kindle was of course groundbreaking, and the second was a worthy successor.  However, the Kindle 3 combined an impressive upgrade (smaller, lighter, and with a better contrast screen), with a killer price point – $139.  Suddenly the greatest ebook reader was a commodity purchase.  Until the Kindle 3, I knew of a single person with a Kindle.  Now, at least a dozen of my close friends own one.

iPad

There were rumours for years.  And tablets were pretty woeful too – so the jokes about a second Newton actually seemed pretty accurate.  However, Apple did two really impressive things with the iPad.  The first is the price – they crammed a really solid mix of tech into a beautiful device, yet without the design premium.  Already they’re ahead – but the second is that, unlike Microsoft with their woeful tablet version of Windows XP, they successfully redesigned the OS paradigm.  They took the best of iPhone and OS X and figured out an all new way to use computers.  And it was an immediate success!  I’m still in awe that they achieved this, and with a 1.0 product too.  I somewhat regret my decision to be a late adopter to the iPad, but I am hoping 2.0 will justify my wait.  As someone who dreamed about ubiquitous computing becoming mainstream as an academic, I’m truly jealous of the designers and engineers at Apple for pulling this off.

Volkswagen Golf Mk 6

This seems a little odd.  Gushing about the design of a car?  Yes, cars are a fairly well-known format, but since having lengthy discussions with the Volkswagen Electronics Research Laboratory I’ve been really excited about the user-centred approach to design employed there.  I originally test-drove the latest Golf as one of several possible cars based only on its looks, but the attention to detail compared to the price point is amazing.  Subtle touches like a cooling glove box, consistent ambient reading lights, 12V chargers in the boot, and a well-designed “second dashboard” UI, give a suitable wow factor.  I’ve never enjoyed using a car as much as I have the Golf Mk 6 (well, except maybe the Audi S5 Coupe).

Canon Powershot S95

I was disappointed with Canon after the apparent peak of point-and-shoots with the Ixus 860IS.  I had planned to just get another Ixus, even though I knew there were issues with the latest sensors; however, when I went to Best Buy to compare them, I stumbled across the S95.  I immediately knew I had to get one.  The S95 boasts a very sharp and clear LCD, a large sensor, a high quality lens and nearly full manual controls.  Since buying this I have retired my Ixus 860IS (apart from at concerts, given its great audio attenuation) as well as my wife’s Rebel XTi.  Mr Pogue sums up my feelings pretty well in his love letter.

Ruby on Rails

Yes, I’m kind of late to the party, but I’m traditionally a Java man, having dallied in Struts and used Spring/Hibernate.  Writing a web-app used to be kind of a big deal.  With Ruby on Rails all the smart decisions have already been made for you.  Ruby is a fun language to write in, and Rails takes care of all the hard stuff.  This is what programming is meant to be like.

Everything, really

Furthermore, have you noticed how much more it feels like we’re living in the future at the moment? I think the mobile/ubiquitous computing nature of devices is fueling this.

  • Identifying a song on the radio using Shazam
  • No longer having to plan every detail of a vacation.  Finding an amazing restaurant using Yelp, booking it on OpenTable and navigating with Google Maps – all on your smartphone.
  • Wireless broadband – how did I make do without you before?
  • Playing Xbox without any controllers at all – I still remember the weird feeling of just not having a cable and how disconcerting that was.
  • VOIP calling anywhere, anytime.
  • Fitbit, WakeMate, RunKeeper, Nike Plus – you can monitor almost all your everyday activities now.
  • Having applications in the cloud – find any computer, and you’ve got access to all your data and applications.
  • A mobile phone with a 960×640 display? The iPhone 4 wows me every time I use one.
  • WordLens – realtime OCR/translation and augmented reality for translating text – all on a mobile device.

It’s difficult to make strong predictions, but 2011 looks to be another amazing year for technology, even if it is “just” incremental (the S95 was “just” an update to the S90).  I’m looking forward in particular to what Apple does next (of course), and to the continued innovation in web applications, particularly in the cloud.  Y Combinator continues to be very successful, and with nearly 80 new companies likely to emerge next year, that will be another exciting scene. It is hard to imagine a strong follow-up to a year which included Xbox Kinect, WordLens, iPhone 4, and the iPad, but I am optimistic.

A long time off the air

Apologies for the long delay.  I had worried I would stop updating the blog of my own volition, but instead a nasty Firestats bug shut down my WordPress install.  Every time I sat down to try and fix it, I’d get interrupted and never had quite enough time to finish figuring it out.  All sorted now though, so back to the usual miscellanea.

A participatory design approach in the engineering of ubiquitous computing systems

I’m very excited to report that I have finally completed all my graduation requirements, including, of course, attending the ceremony, and I am now officially Dr Tim Cederman-Haysom.

Some obligatory photos of floppy hats:

And of course, my thesis is now available to help insomniacs everywhere.  My abstract isn’t a bad summary if you’d rather not wade through the whole thing.

Products I like and wish I actually used

Sometimes you see something that looks so cool that you want to use it, but quickly realise you don’t have any actual compelling need or interest.  I’ve raved a few times about products I’m using at the moment and really enjoying but wanted to mention a few deserving products that I wish I used more often but for whatever reason don’t.

Posterous

Posterous is named as such because they make it preposterously simple to blog.  It’s very easy to use and a great product — I’ve even heard people on the T to work raving about how much they love it.  I think their landing page, with its three steps of use, says it all:

Posterous landing page

I’d really like to enjoy that simplicity myself — I love the site, the implementation, and the look of posterous blogs, but with my comfy custom WordPress installation, I can’t see myself using it anytime soon.  Bummer.

Balsamiq

Balsamiq is what I spent years wanting to have.  It’s a very simple-to-use but powerful creator for wireframes.  Instead of doing the smart thing and inventing my own version of it, I languished in Visio, PowerPoint and Photoshop.  Balsamiq provides a great toolkit for quickly creating digital sketches of UIs and is a joy to use.  While it’s been very useful for my own personal projects on occasion, unfortunately it doesn’t fit in with my current work flow at TripAdvisor where we’re doing a pretty decent job with Photoshop and paper sketches.  I would’ve loved having a tool like this at Trovix though.  Oh, and a hearty congratulations to the Balsamiq team for what sounds like a very successful 2009.

Amazon Kindle

I got to borrow one of these from Google over Thanksgiving and I loved using it.  It meant I had plenty to read while on vacation (where I get the bulk of my book-length reading done), without the bulk of the books.  I bought Under The Dome by Stephen King recently, and wow, there’s a book that shows the utility of the Kindle (1074 pages).

Unfortunately the clunky update speed and grayscale screen don’t do it for me.  The lure of the mythical Apple tablet is proving too strong, and I can’t pull the trigger on one just yet.  More than happy to keep borrowing one of Google’s though.

Google Voice

I managed to snag a GrandCentral account a while back, but the inertia of my existing phone number meant it was more of a technical toy than a serious phone replacement.  I do love the idea of a unified phone system, and with realtime voicemail and transcription, call recording, conference calls and a slew of other great features, it seems like an amazing product… but only if you can get around the limitations of having to change your number, and to call the Google Voice service to take advantage of said features.  I think the rejected-by-AT&T iPhone app would’ve gone a long way to helping me switch.

RadRails

RadRails is one of the few products where I’m not sure whether it’s my fault or theirs that I don’t use it.  As someone who got very comfortable in Eclipse and is a little lazy, I’d like to continue my Rails hacking in a familiar IDE.  Unfortunately I just can’t seem to get RadRails to play nice with the latest releases of Ruby and RoR.  When I get more time I’ll take another crack at it.

In theory though, it’s a great environment for us ex-Eclipse users.  I’m not sure about other users, but I spent a fair bit of time in Eclipse using J2EE/Spring as a framework, and RadRails feels like home.

Edit: updated to add…

Google Website Optimizer

This is an amazing free product that allows for A/B and bucket testing.  Happily we have some very nice pool testing at TripAdvisor already, but perhaps I’ll get to use it on a future side project.