Monthly sum-up for August 2014
Published 2014-09-16 by Jochen Lillich
August was a bit quieter due to vacations. Unfortunately, my own vacation got in the way of finishing and publishing my sum-up for July. That’s why I’ll compare our numbers for August to those I published in my sum-up for June.
In August, our DevOps support took center stage. We spent a significant amount of time working with customers on launching new websites and optimising existing ones. Performance tuning is one of the main concerns here. freistilbox certainly offers everything a high-traffic website needs to master traffic peaks without hiccups. Achieving reliable performance, though, requires optimising the web application so it can fully take advantage of our hosting platform. That’s where our engineering support shines with deep expertise in Drupal and WordPress tuning. We collaborate with our customers via phone, email or web chat as soon as any question or issue arises until it is solved.
We’re continuously expanding our infrastructure. Over the recent weeks, the number of websites we run on freistilbox increased by 22% to 394. With the number of websites, our web traffic also made a jump of 24% to 15.09 TB. Although a growing infrastructure means more points of failure, our monthly uptime stayed at an excellent value of 99.87% (+0.01%).
As I’ve mentioned above, delivering DevOps support is taking up a growing portion of our time. The August numbers for support requests reflect that. That month, we received 193 tickets, 29% more than in June. Nonetheless, we’ve kept our ticket backlog at 39 because we were able to resolve 161 tickets, a whopping 50% improvement!
Unfortunately, our average reaction time went up significantly, by 144%. As the chart shows, we slightly improved in the area of quick responses, but a much higher percentage of customers had to wait for more than a business day compared to June. We’ll investigate whether that’s due to the nature of the actual support requests or whether we need to tweak our Help Center processes. Since satisfaction feedback remained at a perfect 100% “good”, we’re confident that we’re still doing a great job.
With more websites, growth in IT infrastructure is to be expected, and the number of servers our ops team has to maintain did indeed increase by 24%. At 373 hosts, our server:sysadmin ratio is now 187:1.
The number of metrics we collect grew even more, by 26%. We’re now collecting 124,642 metrics every 10 seconds. In order to achieve the necessary I/O performance, we built a new metrics storage on SSDs.
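To give a feeling for the write load behind that figure, here is a quick back-of-the-envelope calculation. It's only a sketch: it assumes every metric produces exactly one data point per 10-second interval, and the function name is made up for illustration.

```python
# Figures taken from the text above.
METRICS_PER_INTERVAL = 124_642
INTERVAL_SECONDS = 10
SECONDS_PER_DAY = 86_400


def write_rates(metrics, interval_seconds):
    """Return (data points per second, data points per day).

    Assumption: each metric yields exactly one data point per interval.
    """
    per_second = metrics / interval_seconds
    per_day = metrics * (SECONDS_PER_DAY // interval_seconds)
    return per_second, per_day


per_second, per_day = write_rates(METRICS_PER_INTERVAL, INTERVAL_SECONDS)
print(f"{per_second:,.1f} data points per second")
print(f"{per_day:,} data points per day")
```

Under that assumption, the storage has to absorb roughly 12,500 writes per second, or a bit more than a billion data points per day.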
A bit of a concern is the fact that the number of on-call alerts went up by 20% in August (1378 alerts total). So PagerDuty published “Let’s talk about Alert Fatigue” at exactly the right time. We’ll especially have to dig deeper into the aspect “Cut alerts that aren’t actionable & adjust thresholds”. Another important improvement will be eliminating alerts that only get triggered as a consequence of previous alerts (for example, identical shared storage space warnings from all the boxes of a freistilbox cluster).
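One way to picture the last idea is to group alerts by cluster and check, so that identical warnings from several boxes turn into a single notification. The following is a minimal sketch of that grouping, not our actual alerting setup; the `Alert` fields, the host and cluster names and the `group_alerts` helper are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    host: str     # box that raised the alert
    cluster: str  # cluster the box belongs to (hypothetical field)
    check: str    # name of the failing check


def group_alerts(alerts):
    """Collapse alerts that share a cluster and a check into one entry."""
    grouped = defaultdict(list)
    for alert in alerts:
        grouped[(alert.cluster, alert.check)].append(alert.host)
    # One entry per (cluster, check), listing the affected hosts in sorted order.
    return {key: sorted(hosts) for key, hosts in grouped.items()}


# Three boxes of the same cluster report the same storage warning.
alerts = [
    Alert("box1", "c42", "shared_storage_space"),
    Alert("box2", "c42", "shared_storage_space"),
    Alert("box3", "c42", "shared_storage_space"),
    Alert("box1", "c42", "load"),
]
notifications = group_alerts(alerts)
for (cluster, check), hosts in notifications.items():
    print(f"{cluster}: {check} on {', '.join(hosts)}")
```

In this example, four incoming alerts result in two notifications: one for the shared storage warning that lists all three boxes, and one for the load alert.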
While our web hosting platform only runs PHP-based applications, we use Ruby for a lot of internal applications and tools. That’s why Markus spent the first August weekend attending eurucamp at the Hasso Plattner Institute in Potsdam. It was amazing to see the inspiration he brought back. This can only be good for our latest Ruby-based project, the freistilbox Hosting API. We’ll let you know more on this important undertaking later. So stay tuned!