Tracking down rogue bandwidth: a story of Comcast data caps and offsite backup

tl;dr: if you use BackBlaze and are subject to a data cap, you should keep an eye on BackBlaze to make sure it doesn’t quietly use up all your data allowance. I’ve switched to CrashPlan, which has better retention, encryption, and backup policies.

Beyond sharing this warning, I wanted to do a longer write-up of what I experienced because I found some pretty interesting things along the way – how to get true visibility into what’s happening on your local network, what’s up with Comcast’s usage meter, and what happens when something goes wrong with offsite backup with BackBlaze.

Earlier in the year Comcast announced they were introducing 1 terabyte data caps in 27 markets across the US (of which 18 previously did not have enforced caps). This is a pretty consumer-unfriendly move, and since I’m in California, it was happening to me – but personally I had never come close to exceeding the cap, so wasn’t overly worried (and won’t be until 4K video becomes more common). In the second week of November 2016 I got a notification in my browser telling me I’d used 90% of my cap. This was surprising in two ways – one, I didn’t expect Comcast to hijack an HTTP request to do this, and two, it was saying I’d used over 900gb of data in just over a week. I was about to head overseas for 8 days so I decided to shut everything down while I was away and figure out what was using the data once I got back.

While I was away, I saw my usage continue to increase by approximately 80gb a day. I had left my Mac Mini server, a Dropcam, and a Nest running, and that was it. I checked remotely to ensure there were no other wireless clients running on my network, but couldn’t see anything. I used Activity Monitor to keep track of how much bandwidth each process was using on the server, and saw BackBlaze (bztransmit) had transmitted ~50gb, but my uptime was 3 weeks at that point, and so that seemed about right. I was completely baffled – nothing else was being generated by the Mac Mini, and the amount of data uploaded by the Dropcam was also minimal, yet each day it kept increasing.

With no obvious culprits on my network, it seemed suspicious for me to be exceeding my cap just as Comcast introduced caps, so my attention turned to whether Comcast’s usage reporting was accurate.

Comcast gets a lot of heat for their data caps, and in particular for their usage meter accuracy. When trying to use the meter to diagnose bandwidth issues, Comcast itself says it should not be relied on. They have a disclaimer on the meter saying it is “delayed by up to 24 hours” (and on the phone I was told it could lag for weeks). By the end of figuring all this out, I actually found their usage meter to be accurate in realtime, but others online report usage changing after the fact. I also found numerous people online complaining that their usage meter did not match their self-evaluated usage (some measured using tools such as Gargoyle firmware on their router). However, most of these discrepancies were around 10%, not the orders of magnitude I was seeing. I was also unable to find well-proven cases of their meter being wildly off except for one instance where a MAC address had been entered with a single character typo. There was only one other example of Comcast rolling back fees for overages – but it only happened after the media became involved, and nobody technically proficient actually checked the network.

Given the likely accuracy, I was keen to find an inexpensive (and preferably software based, so I could do it remotely) method for measuring what was actually going through the cable modem. Most of the advice I found suggested buying a router that supported Gargoyle, but I also discovered that the Motorola SB6121 cable modem I use reports number of codewords per channel, aka bytes.

In the above example, you can see the aggregate codewords for the 4 channels is 275,455,328,290, which is 275.45 gigabytes. This was exactly as I would’ve expected. I remotely reset the cable modem (another quirk – Comcast removed the ability for me to do that myself on the cable modem through a modem update – I actually own the modem too) using the mobile app, and after doing so my traffic was measured in megabytes per day, which was what I expected. At this point I was baffled, as advice I’d seen repeated a few times online is to use the codewords as the source of truth, but they do NOT include uploads, which Comcast counts towards your cap.

At this point I didn’t know that, and since I had no idea how they were measuring so much traffic, I kept escalating with Comcast, eventually being told a level 3 support technician would contact me. That was a couple of weeks ago, and I never heard anything more from Comcast.

To be completely sure it wasn’t something on my end, my next step to diagnose this was to use SNMP logging to verify the amount of data leaving my network via the ethernet connection to the cable modem. SNMP is “simple network management protocol”, and is a standardized way of reporting and collating information about managed devices on a network. Routers with SNMP logging are able (amongst other things) to report exactly how much bandwidth is being consumed on any of their many interfaces (which are unhelpfully named, but a good description of each on an Airport Extreme is available here). Helpfully, mgi1 is the interface specifically for the WAN connection – i.e. I was able to measure very specifically every bit of traffic going to and from the cable modem. Unfortunately, Apple took this feature out of the 802.11ac Airport Extreme, which I used as my main router. As such I reconfigured my entire network to use my older router and an old Airport Express (which also includes SNMP logging).
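As a rough illustration of how usage can be derived from SNMP interface counters, here is a minimal sketch. It assumes the router exposes the standard 32-bit octet counters (such as ifInOctets/ifOutOctets), which wrap back to zero after 2^32 bytes, and that readings are taken often enough that the counter wraps at most once between them. The sample readings are made up for the example; they are not from my router.

```python
COUNTER32_MODULUS = 2 ** 32  # 32-bit SNMP octet counters wrap at this value


def octets_between(earlier: int, later: int, modulus: int = COUNTER32_MODULUS) -> int:
    """Bytes transferred between two counter readings, allowing for one wraparound."""
    return (later - earlier) % modulus


def total_octets(samples: list[int]) -> int:
    """Sum the per-interval deltas across a series of counter readings."""
    return sum(octets_between(a, b) for a, b in zip(samples, samples[1:]))


# Hypothetical readings from a WAN interface; the counter wraps before the last one.
samples = [4_294_000_000, 4_294_900_000, 1_032_704]
print(total_octets(samples))  # 2000000
```

A tool like PeakHour presumably does something along these lines internally, but that is a guess; the point is only that a naive "last reading minus first reading" would go badly wrong once the counter wraps.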

At this point I had SNMP logging all my internal network bandwidth, and to visualize it I used PeakHour 3, which did a great job of making it very easy to see what was happening. I finally had proof that my Mac Mini really was uploading a LOT of data, and that it matched the Comcast usage. But Activity Monitor still did not show me that anything was out of sorts, so I still didn’t know WHAT was causing all the usage.

I researched what other tools I could use to monitor network traffic by process, and I found Little Snitch. Little Snitch lets you police all network requests, and approve or deny them, which is pretty nifty, but all I needed was the monitoring tool. This let me see that bztransmit was happily uploading at 5-10mbps in bursts every few minutes, and the cumulative from this process matched exactly the upload traffic seen in the SNMP logging. Leaving it for even just a few hours, it was clear this was the culprit.

I throttled BackBlaze using their preference pane to 128kbps (it claimed it would upload “approximately 1gb a day”, compared to the 4gb/hour it was doing at the time) and contacted their customer support. While I waited to hear back from them, I started reading through the BackBlaze log files, and saw it was uploading the same amount each day, and then a little more, e.g.:

server:bzreports_eventlog admin$ cat 10.log
 2016-11-10 04:55:40 - END_BACKUP: Backed up 2801 FILES / 40617 MB

server:bzreports_eventlog admin$ cat 17.log
 2016-11-17 05:02:54 - END_BACKUP: Backed up 2933 FILES / 51489 MB

They helpfully have a log of the last files uploaded, located in /Library/Backblaze/bzdata/bzlogs/bzreports_lastfilestransmitted. The same files were being uploaded each day.
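For anyone who wants to compare days without eyeballing the files, the END_BACKUP lines above can be pulled apart with a small script. This is only a sketch: the pattern is inferred from the two log lines quoted above, so other line types or format changes in the real logs are not handled.

```python
import re

# Pattern inferred from the two END_BACKUP lines quoted above (an assumption,
# not a documented log format).
END_BACKUP = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2}) \d{2}:\d{2}:\d{2} - END_BACKUP: "
    r"Backed up (\d+) FILES / (\d+) MB"
)


def parse_end_backup(line: str):
    """Return (date, file_count, megabytes) for an END_BACKUP line, else None."""
    match = END_BACKUP.match(line)
    if match is None:
        return None
    date, files, megabytes = match.groups()
    return date, int(files), int(megabytes)


lines = [
    " 2016-11-10 04:55:40 - END_BACKUP: Backed up 2801 FILES / 40617 MB",
    " 2016-11-17 05:02:54 - END_BACKUP: Backed up 2933 FILES / 51489 MB",
]
for line in lines:
    print(parse_end_backup(line))
```

Run against the two lines above, this prints one tuple per day, which makes the "same amount each day, and then a little more" pattern easy to spot.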

Another thing I noticed in the log files is that it appeared that BackBlaze was downloading updates and/or reinstalling itself basically every day. As far as I can tell, this was resetting the usage in Activity Monitor and why that was not a reliable measure.

While I was investigating these logs, I continued monitoring my network, and noticed that while BackBlaze was “throttled”, it ended up uploading nearly 4gb of data in 24 hours. The bztransmit process was using just under 1mbps of bandwidth, approximately 8x the promised throttle limit. To be clear, I have been using megabits for all my bandwidth measures in this post. I have to assume that there is a mistake in the conversion somewhere, which would perfectly explain why it was uploading at 128 KBps rather than Kbps. Their annotation in the UI and their documentation is ambiguous as it’s all lowercase and abbreviated; however, their estimates suggest that the numbers shown should be kilobits (128 kbps = 16 KBps = 1,382,400 kbytes/day = ~1.3GB/day).
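The arithmetic behind that estimate can be checked with a few lines of Python. This is just a sketch of the unit conversion, using decimal units (1 GB = 1,000,000 KB); the function name is my own and has nothing to do with BackBlaze’s software.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def gb_per_day(rate_per_second: float, unit_is_bytes: bool) -> float:
    """Daily transfer in decimal GB for a sustained rate given in kilobits/s or kilobytes/s."""
    kilobytes_per_second = rate_per_second if unit_is_bytes else rate_per_second / 8
    return kilobytes_per_second * SECONDS_PER_DAY / 1_000_000


# "128" read as kilobits per second: matches the "approximately 1gb a day" estimate.
print(round(gb_per_day(128, unit_is_bytes=False), 2))  # 1.38

# "128" read as kilobytes per second: eight times as much if sustained all day.
print(round(gb_per_day(128, unit_is_bytes=True), 2))  # 11.06
```

The second figure is what a constant 128 KBps would add up to over a full day; the uploads I saw came in bursts, so the daily total was lower than that, but still well above the promised limit.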

BackBlaze got back to me the following day and asked for a copy of my logs – all of them. They gave me an unsigned tool which gathers all these logs, as well as a full system snapshot – approximately 50mb of log files in my case. At this point I wasn’t very comfortable about this. BackBlaze encrypts your files with your private key before uploading them, and according to an employee on Reddit, they can’t even see the filename. I really liked this feature – having all my private documents in the cloud is a scary proposition from a security perspective, and even if the contents are encrypted, the filenames themselves leak information (e.g. financial documents, photo folder names, etc). I wasn’t particularly keen to send that over, as the logs are very chatty about your system and the files it’s working on.

At this point things with BackBlaze broke down. The customer support person I was communicating with ignored my request to surgically provide logs rather than send all of them. And by ignore, I mean they stopped updating the ticket and ignored my updates to the ticket. I was only able to re-engage them by pinging the BackBlaze Twitter account. He then refused to escalate further without the logs, and ignored my report of the throttling bug.

Without BackBlaze making any good faith efforts to remedy the situation (I should also note they were never apologetic about any of this, including just ignoring the ticket), I investigated alternatives, and have switched to CrashPlan. They offer a variety of better features compared to BackBlaze, including file versioning (including allowing deleted files to stay backed up – BackBlaze will delete after 30 days), 448-bit file encryption (versus 128-bit for BackBlaze), and NAS backups. They are $60/year compared to $50/year for BackBlaze.

I had been using BackBlaze for 6 years. I had been an evangelist, recommending it to family and friends, and likely referred at least a dozen new customers to them. To say the least, I was very disappointed at this experience. After this happened, I did a quick look around to see if this had happened to others. What I found was a pattern of issues with customer support, slow restores, other bugs (folks missing files showed up several times), and backups being unexpectedly deleted from the server (e.g. someone goes on vacation and leaves their external drive at home, and while away the 30 day trigger hits. This also means if you do have a catastrophic data failure, you have 30 days to get your computer set up again to do a restore). It was interesting the different approaches people took to try to vent their issues with the company, including Amazon reviews, Facebook, the BBB and even CNet.

Most tellingly I found someone on Reddit reporting the same issue I encountered – a year ago – with other users in the thread reporting the same problems.

At the end of all this, I’ve spent somewhere in the vicinity of 20-30 hours of my time diagnosing this and talking with customer support, and I’ve gone over my quota for two months (October and November – with the latter hitting 2TB in the end), leaving me a single courtesy month with Comcast (after which I never get any courtesy months again). On the plus side, my network is now more secure, I learnt some interesting stuff along the way, and I was able to diagnose the cause literally 1 hour before midnight on November 30th, preventing me from going over again. The bummer is that there are two companies who have shown no interest in fixing issues for the consumer and put all the onus on them: BackBlaze needs to fix these bugs, and Comcast needs to provide their customers with better tools for diagnosing and monitoring network traffic if they’re going to institute caps for everyone.

This is the golden age for new gadgets

Back in 2002 when I started work on my PhD, I remember feeling in awe of all the amazing new devices being released.  Bluetooth had just come out, PDAs were getting wifi, and mobile phones were getting cameras.

A lot of the designs back then were kind of clunky, but a lot were pretty amazing.  I remember being amazed by the sleek design of the iPaq 3870.  The very cool (and thoughtfully ergonomic) T68 from Ericsson. The Sony Vaios were to be the coolest thing in laptops for the next 8 years (I still love the S58GP/B).

Since then I haven’t been nearly as excited about new devices, and I’ve often wondered if it was just a phase I was going through.  It was certainly better for my wallet.

This year though has been a pretty amazing year for design and technology.  Here are a few of the gadgets wowing me at the moment:

Macbook Air

When the Macbook Air came out a few years ago I was pretty impressed from a technology perspective.  However from a price/performance balance I was less than enthused.  Incredibly high price coupled with middling performance?  No thank you.  This year Apple suddenly releases an even SMALLER Macbook Air as well as a version of the original with bumped specs and much lower prices? Done.  Between the SSD, the higher resolution screen and the lighter weight/longer battery life, the Macbook Air ceases to be a novelty and begins to be a competitor as an every day computer.  Looking forward to the 15″ version hopefully some day soon.

Kindle 3

The first Kindle was of course groundbreaking, and the second was a worthy successor.  However, the Kindle 3 combined an impressive upgrade (smaller, lighter, and with a better contrast screen), with a killer price point – $139.  Suddenly the greatest ebook reader was a commodity purchase.  Until the Kindle 3, I knew of a single person with a Kindle.  Now, at least a dozen of my close friends own one.


iPad

There were rumours for years.  And tablets were pretty woeful too – so the jokes about a second Newton actually seemed pretty accurate.  However Apple did two really impressive things with the iPad.  The first is the price – they crammed in a really solid mix of tech in a beautiful device, yet without the design premium.  Already they’re ahead – but unlike Microsoft with their woeful tablet version of Windows XP, they successfully redesigned the OS paradigm.  They took the best of iPhone and OS X and figured out an all new way to use computers.  And it was an immediate success!  I’m still in awe that they achieved this, and with a 1.0 product too.  I somewhat regret my decision to be a late adopter to the iPad, but I am hoping 2.0 will justify my wait.  As someone who dreamed about ubiquitous computing becoming mainstream as an academic, I’m truly jealous of the designers and engineers at Apple for pulling this off.

Volkswagen Golf Mk 6

This seems a little odd.  Gushing about the design of a car?  Yes, cars are a fairly well-known format, but since having lengthy discussions with the Volkswagen Electronics Research Laboratory I’ve been really excited about the user-centred approach to design employed there.  I originally test-drove the latest Golf as one of several possible cars based only on its looks, but the attention to detail compared to the price point is amazing.  Subtle touches like a cooling glove box, consistent ambient reading lights, 12V chargers in the boot, and a well-designed “second dashboard” UI, give a suitable wow factor.  I’ve never enjoyed using a car as much as I have the Golf Mk 6 (well, except maybe the Audi S5 Coupe).

Canon Powershot S95

I was disappointed with Canon after the apparent peak of point-and-shoots with the Ixus 860IS.  I had planned to just get another Ixus, even though I knew there were issues with the latest sensors.  However, when I went to Best Buy to compare them, I stumbled across the S95.  I immediately knew I had to get one.  The S95 boasts a very sharp and clear LCD, a large sensor, high quality lens and nearly full manual controls.  Since buying this I have retired my Ixus 860IS (apart from for concerts, given its great audio attenuation) as well as my wife’s Rebel XTi.  Mr Pogue sums up my feelings pretty well in his love letter.

Ruby on Rails

Yes, I’m kind of late to the party, but I’m traditionally a Java man, having dallied in Struts and used Spring/Hibernate.  Writing a web-app used to be kind of a big deal.  With Ruby on Rails all the smart decisions have already been made for you.  Ruby is a fun language to write in, and Rails takes care of all the hard stuff.  This is what programming is meant to be like.

Everything, really

Furthermore, have you noticed how much more it feels like we’re living in the future at the moment? I think the mobile/ubiquitous computing nature of devices is fueling this.

  • Identifying a song on the radio using Shazam
  • No longer having to plan every detail of vacation.  Finding an amazing restaurant using Yelp, booking it on OpenTable and navigating with Google Maps – all on your smartphone.
  • Wireless broadband – how did I make do without you before?
  • Playing Xbox without any controllers at all – I still remember the weird feeling of just not having a cable and how disconcerting that was.
  • VOIP calling anywhere, anytime.
  • Fitbit, WakeMate, RunKeeper, Nike Plus – you can monitor almost all your everyday activities now.
  • Having applications in the cloud – find any computer, and you’ve got access to all your data and applications.
  • A mobile phone with a 960×640 display? The iPhone 4 wows me every time I use one.
  • WordLens – realtime OCR/translation and augmented reality for translating text – all on a mobile device.

It’s difficult to make strong predictions, but 2011 looks to be another amazing year for technology, even if it is “just” incremental (the S95 was “just” an update to the S90).  I’m looking forward in particular to what Apple do next (of course), and the continued innovation in web applications, particularly in the cloud.  Y Combinator continues to be very successful and with nearly 80 new companies likely to emerge next year that will be another exciting scene. It is hard to imagine a strong follow-up to a year which included Xbox Kinect, WordLens, iPhone 4, and the iPad, but I am optimistic.

A long time off the air

Apologies for the long delay.  I had worried I would stop updating a blog of my own volition, but instead a nasty Firestats bug shut down my WordPress install.  Every time I sat down to try and fix it, I’d get interrupted and never had quite enough time to finish figuring it out.  All sorted now though, so back to the usual miscellanea.

A participatory design approach in the engineering of ubiquitous computing systems

I’m very excited to report that I finally completed all my graduation requirements, including, of course, attending the ceremony, and I am now officially Dr Tim Cederman-Haysom.

Some obligatory photos of floppy hats:

And of course, my thesis is now available for helping insomniacs everywhere.  My abstract isn’t a bad summary for figuring it all out.

Products I like and wish I actually used

Sometimes you see something that looks so cool that you want to use it, but quickly realise you don’t have any actual compelling need or interest.  I’ve raved a few times about products I’m using at the moment and really enjoying but wanted to mention a few deserving products that I wish I used more often but for whatever reason don’t.

Posterous is named as such because they make it preposterously simple to blog.  It’s very easy to use and a great product — I’ve even heard people on the T to work raving about how much they love it.  I think their landing page, with its three steps of use, says it all:

Posterous landing page

I’d really like to enjoy that simplicity myself — I love the site, the implementation, and the look of posterous blogs, but with my comfy custom WordPress installation, I can’t see myself using it anytime soon.  Bummer.


Balsamiq is what I spent years wanting to have.  It’s a very simple-to-use but powerful creator for wireframes.  Instead of doing the smart thing and inventing my own version of it, I languished in Visio, PowerPoint and Photoshop.  Balsamiq provides a great toolkit for quickly creating digital sketches of UIs and is a joy to use.  While it’s been very useful for my own personal projects on occasion, unfortunately it doesn’t fit in with my current work flow at TripAdvisor where we’re doing a pretty decent job with Photoshop and paper sketches.  I would’ve loved having a tool like this at Trovix though.  Oh, and a hearty congratulations to the Balsamiq team for what sounds like a very successful 2009.

Amazon Kindle

I got to borrow one of these from Google over Thanksgiving and I loved using it.  It meant I had plenty to read while on vacation (where I get the bulk of my book-length reading done), without the bulk of the books.  I bought Under The Dome by Stephen King recently, and wow, there’s a book that shows the utility of the Kindle (1074 pages).

Unfortunately the clunky update speed and grayscale screen don’t do it for me.  The lure of the mythical Apple tablet is proving too strong and I can’t pull the trigger on one just yet.  More than happy to keep borrowing one of Google’s though.

Google Voice

I managed to snag a GrandCentral account a while back, but the inertia of my existing phone number meant it was more of a technical toy than a serious phone replacement.  I do love the idea of a unified phone system, and with realtime voicemail and transcription, call recording, conference calls and a slew of other great features, it seems like an amazing product… but only if you can get around the limitations of having to change your number, and to call the Google Voice service to take advantage of said features.  I think the rejected-by-AT&T iPhone app would’ve gone a long way to helping me switch.


RadRails is one of the few products where I’m not sure whether it’s my fault or theirs that I’m not using it.  As someone who got very comfortable in Eclipse and is a little lazy, I’d like to continue my Rails hacking in a familiar IDE.  Unfortunately I just can’t seem to get RadRails to play nice with the latest releases of Ruby and RoR.  When I get more time I’ll take another crack at it.

In theory though, it’s a great environment for us ex-Eclipse users.  I’m not sure about other users, but I spent a fair bit of time in Eclipse using J2EE/Spring as a framework, and RadRails feels like home.

Edit: updated to add…

Google Website Optimizer

This is an amazing free product that allows for A/B and bucket testing.  Happily we have some very nice pool testing at TripAdvisor already, but perhaps I’ll get to use it on a future side project.

Fixing a corrupted/deleted partition table

About a month ago, while trying to upgrade to Windows 7, I managed to wipe the partition table and in trying to fix it, created a corrupted table.

(incidentally, if you can’t update Vista with the latest service pack, you won’t be able to upgrade to Windows 7, so don’t bother trying without fixing your boot configuration. Turns out my problem was having a dual-boot configuration with XP)

I had backed up my key files, but I wasn’t keen on losing my nice Vista configuration. I posted the whole sordid tale on Superuser.

Happily, I managed to figure out what had happened, what I was actually doing at a low level (sometimes I am a little too lazy and do just blindly run commands, something that Raymond Chen despises), and completely recover. I figured I’d post a link to the solution in case anyone else has their own troubles.

Products I’m really loving right now

I realised tonight that there are quite a variety of tools and products I’ve been using lately that I’ve been really enjoying, including:


I live in Beacon Hill, where owning a car is both expensive and difficult.  As such, I have two RFID cards in my wallet – my monthly MBTA pass and my Zipcard.   Zipcar finally released their iPhone application, which although not as exciting as made out to be (no initial unlocking of the car from your phone, but I do enjoy surreptitiously honking the horn while my wife is driving), does provide a very convenient way of getting a car when you need it last minute.  Their website is actually very nice too, and makes finding and booking a car surprisingly easy — I particularly like how they’ve implemented the calendaring.  The car sharing itself is also great.  $6.13/hour, all-inclusive, for a Prius just 2 blocks from my apartment is very compelling.  The insurance setup is less than ideal (only state minimums), and it depends on the goodness of others to keep the car in decent condition, but I’ve had no serious problems as yet.

Zipcar iPhone app

Yelp iPhone App

While I wait for TripAdvisor’s updated mobile offerings, I continue to enjoy using Yelp’s nifty local review app.  I’ve used it to find things to do when in NYC, somewhere to grab a quick bite, bars I didn’t know about and even write reviews while still at a restaurant.  The augmented reality is a nice toy, but a little gimmicky.  I recently used Yelp when visiting Michigan, and used it to find local favourites like Bates burgers and Bode’s Corned Beef House.

Yelp iPhone app

This is a pretty amazing flight search tool, replacing calendar widgets with a Googlesque search box. Not officially launched yet, but looks like it will be pretty awesome when it does.  Try searching for things like “Brisbane to snow in early December”.


I’m pretty sure I’m not the only person who hates waiting around an airport with no idea what’s going on with my flight.  Flightcaster will tell you the chances of a delay well before the airlines will.  So far I’ve been very impressed with both the prediction system and the UI, which is very intuitive and pretty.  I had a pretty amazing experience with Flightcaster last week when flying to San Francisco from Boston.  Before I left, the flight was listed as on-time, while Flightcaster predicted it was “probably delayed”.  I arrived at the airport, and the flight was delayed by two hours, due to weather at SFO.  While waiting, I checked Flightcaster again and it predicted we would be leaving shortly, and within minutes an announcement came through that the two hour delay had been shortened to a 20 minute delay.  Nifty.


iPhone 3GS

I know the iPhone 3GS has been out for a while, and it’s hardly groundbreaking to proclaim how great iPhones are, but since upgrading from my 2G to the 3GS, I’ve been pleasantly surprised at what a difference it makes.  It’s now fast enough that I’m able to work remotely, it’s great having a longer battery life, and the difference the speed makes to the user experience cannot be overstated.  I particularly like the improved camera and finally having GPS and a compass, great for when I exit a T stop in a bewildered fashion.

iPhone 3GS

Skype iPhone App

A while back I wrote an article about using an iPhone as a home phone using Fring.  Sadly, it was a little too buggy for everyday use, and I continue to use my dedicated Netgear phone.  Happily Skype released an official iPhone app, which does everything I’ve always wanted in a dedicated Skype phone (unlike the disastrous Belkin Skype wifi phone).  While it doesn’t work in the background, or over 3G (yet), it does give me access to IM and voicemail at all times, and I can use Skype-To-Go to make calls over AT&T.  It also makes a great second landline at home while my wife is using our Netgear.

Skype iPhone app

Email disclaimers

Today I got an email from an Australian company and noticed two things at the bottom of their email. The first was the ever-silly “Go Green – please consider the environment before printing this e-mail.” line.

Does putting this in actually make people reconsider printing an email? I wonder who started this trend? I suspect it was started with somewhat passive-aggressive intent somewhere where a lot of “technically unsavvy” folks were printing emails, and spread from there. It also wouldn’t surprise me if this is the net result most of the time:

Ironic email reminder

(By the way, I really liked the way Google knew exactly what I meant when I searched for [reddit print email irony])

The second thing I noticed, and this is something I’ve definitely seen a lot more of from Australian companies for some reason, was the legal disclaimer.

I’m sure most people have seen this. It’s something like:

“This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager. This message contains confidential information and is intended only for the individual named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited.”

They’re all ridiculously long, nobody reads them, and they break up email conversations in an annoying way. So why do so many people have them (with whole websites devoted to them)?

At first it seems reasonable to believe they do offer some legal protection, which would explain their popularity. But do they? I think it would be reasonable to think that if they offered iron-clad legal protection everyone would have them, yet I rarely see them attached to emails from companies in the US, which is perceived as one of the most litigious countries worldwide. Even said website devoted to them says:

If you were to be so unlucky to be sued for the contents of an e-mail, it is not certain whether an email disclaimer will protect you from liability in a court of law.

I’m certainly not the first to question this.  Nor am I the first to think the content is generally ridiculous.  Slate has also covered the issue.  So given its dubious nature, I suspect it persists mainly as a way to reassure the company, without actually doing anything (just like the raptor-repellent I keep in my cube, just in case).