
PaaS, bots, alerts and using analytics to improve web performance

Giuseppe Sollazzo
Senior Systems Analyst
St George’s, University of London

Storytelling at Velocity

The second day of the O’Reilly Velocity conference was definitely about storytelling: keynotes and sessions alike were either descriptions of performance-enhancement projects or accounts of particular experiences in systems management, and in all honesty many of these stories resonate with our daily experience of running IT services in an academic environment. I will give a general summary, but also mention the speakers I found most useful.

Evolution in the Internet of Things age
The first session was an attention-catching keynote by Scott Jenson, Google’s Physical Web project lead, centred on a curious observation: most attention in web performance has traditionally been focused on the “body”, the page itself, while the most interesting and performance-challenged part is actually the address bar.

Starting from this point, Scott illustrated how the web is evolving and what its characteristics will be in the Internet of Things age. He advocated for the Physical Web to be an “open” project, rather than Google’s.

Another excellent point he made is that control should be given back to the users. This was illustrated by a comparison between a QR code and an iBeacon: the former requires the user to take action; the latter pushes itself at a passive user. Much as we like the idea of proactive applications, it only takes walking into a room full of them to understand that being in control can be a good thing.

PaaS for Government as a Platform
Most of the conference talks centred on monitoring and analytics as a way to manage performance. Among the most interesting, Anna Shipman of the UK Government Digital Service (GDS) illustrated how they are choosing a Platform-as-a-Service supplier in order to implement their “Government as a Platform” vision.

I’ve argued a lot in the past that UK academia will, sooner or later, need to go through a “GDS moment” to get back to innovating in a way it can control, rather than outsourcing in bulk, and this talk was definitely a reminder of that.

Rise of the bot
As with yesterday’s Velocity sessions, some truly mind-boggling statistics were shared today. One example is that many servers are overwhelmed by web crawlers or “bots”, the automated software agents that index websites for search engines. In his presentation From RUM to robot crawl experience!, Klaus Enzenhofer of Dynatrace told the audience that he had spoken to several companies for which two thirds of all incoming traffic is Google bots. “We need a data centre only for Google”, they say.

Analytics for web performance
There was quite a lot of discussion around monitoring vs. analytics. In his presentation Analytics is the new monitoring: anomaly detection applied to web performance, Bart De Vylder of CoScale argued for adopting data science techniques to build automatic analysis procedures for smart, adaptive alerting on anomalies. This requires an understanding of the domain in which the anomalies occur, so that the monitoring can evolve accordingly, taking into account, for example, seasonal variations in web access.
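
To make the idea concrete, here is a minimal sketch (my own illustration, not CoScale’s implementation) of seasonality-aware alerting in Python: instead of a single fixed threshold, each new measurement is compared with the historical mean and spread for the same hour of the week.

```python
# Minimal sketch of seasonality-aware anomaly detection (illustrative only).
# Each sample is (hour_of_week, response_time_ms); hour_of_week is 0..167.
from collections import defaultdict
from statistics import mean, stdev

def build_baseline(samples):
    """Compute per-hour-of-week mean and standard deviation from history."""
    by_slot = defaultdict(list)
    for slot, value in samples:
        by_slot[slot].append(value)
    return {slot: (mean(vals), stdev(vals))
            for slot, vals in by_slot.items() if len(vals) > 1}

def is_anomaly(baseline, slot, value, k=3.0):
    """Flag values more than k standard deviations above the seasonal mean."""
    mu, sigma = baseline[slot]
    return value > mu + k * sigma

# Example: Monday 09:00 (slot 9) normally sits around 200ms.
history = [(9, 190 + i % 20) for i in range(50)] + [(9, 210 - i % 15) for i in range(50)]
baseline = build_baseline(history)
print(is_anomaly(baseline, 9, 800))  # True  - well outside the seasonal norm
print(is_anomaly(baseline, 9, 205))  # False - ordinary for that hour
```

A real system would use longer histories, more robust statistics and per-metric tuning, but the principle of comparing “like with like” across the weekly cycle is the same.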

Using alerts
On a similar note was the most oversubscribed talk of the day, a 40-minute session by Sarah Wells of the Financial Times which drew over 200 attendees (with many trying to get a glimpse from outside the doors). Sarah told the audience how easy it is to be overwhelmed by alerts: in the FT’s case, 1.5 million checks per day generate over 400 alerts per day, and she gave an account of their experience trimming those figures down. Very interestingly, the FT has adopted the cloud as a technology, but they haven’t bought it from an external supplier: they’ve built it themselves, with great attention to performance, cost, and compliance, which is certainly a strategy I subscribe to.

Conference creation
I also attended an interesting non-technical session by another Financial Times employee, Mark Barnes, who explained how they conceived the idea of an internal tech conference and how they effectively run it.

Hailed as an internal success and attended by their international staff, the conference idea came from an office party and has reportedly helped improve internal communications at all levels. As a conference/unconference organiser myself (OpenDataCamp, UkHealthCamp, WhereCampEU, UKGovCamp, and more), I will find this insight from the Financial Times invaluable for future events.

I’m continuing to fill in this Google doc with technical information and links from the sessions I attend, so have a look if you’re interested.

Disruptive statistics, Linux containers, extreme web performance for mobile devices

Giuseppe Sollazzo
Senior Systems Analyst
St George’s, University of London

Day one at the Velocity conference, Amsterdam

What a first day! O’Reilly Velocity, the conference I’m attending thanks to a UCISA bursary, is off to a great start with a first day oriented towards practical activities and hands-on workshops. The general theme of these workshops is how to build and maintain large-scale IT systems while enhancing their performance. Let me provide a quick summary of the workshops I attended.

Statistics for Engineers
A statistics workshop at 9.30am is something most would find soul-destroying, but this was a great introduction to using statistics in an engineering context: in other words, how to apply statistics to real data in order to gather information with the goal of taking action.

Much of statistics is, indeed, very simple maths, and its more difficult yet powerful parts allow practitioners to understand situations and predict their outcomes.

This workshop illustrated how to apply statistical methods to datasets generated by user applications: support requests, server logs, website visits. Why is this important? Very simply, because service levels need to be planned and agreed upon very carefully. The speaker showed some examples of this. In fact, the title of this workshop should have been “Statistics for engineers and managers”: usage statistics help allocate resources (do we need more? can we reuse some?) and, in turn, financial budgets.

The workshop also illustrated how to generate descriptive statistics and how to use several mathematical tools for forecasting the evolution of service levels. We have had some experience with data collection and evaluation at St George’s, University of London, and this workshop has definitely helped refine the tools and reasoning we will be applying.
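
To give a flavour of both ideas, here is a small Python sketch of my own (the figures are invented, not from the workshop): percentiles summarise response times far better than the mean alone, and a simple least-squares line gives a first forecast of how demand is growing.

```python
# Illustrative only: descriptive statistics and a naive forecast.
from statistics import mean

def percentile(values, p):
    """Nearest-rank percentile: the value below which roughly p% of samples fall."""
    ordered = sorted(values)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

response_times_ms = [120, 135, 140, 150, 160, 180, 210, 250, 400, 900]
print(mean(response_times_ms))            # 264.5 - dragged up by the slow tail
print(percentile(response_times_ms, 50))  # 160   - the typical experience
print(percentile(response_times_ms, 95))  # 900   - the experience to plan capacity around

def linear_forecast(series, steps_ahead=1):
    """Least-squares straight-line fit, extrapolated steps_ahead points forward."""
    n = len(series)
    x_bar, y_bar = (n - 1) / 2, mean(series)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(series)) \
            / sum((x - x_bar) ** 2 for x in range(n))
    intercept = y_bar - slope * x_bar
    return intercept + slope * (n - 1 + steps_ahead)

weekly_requests = [10_000, 10_400, 10_900, 11_300, 11_800]
print(round(linear_forecast(weekly_requests, steps_ahead=4)))  # ~13780 requests in a month
```

The point is less the particular formulas than the habit: measure, summarise with statistics that reflect user experience, and project forward before committing to a service level or a budget.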

Makefile VPS
This talk presented itself as a super-geeky session about Linux containers. Containers are a popular way to run web services without requiring a full-fledged physical or virtual server: they can be easily built, deployed, and managed, yet they are rarely properly understood.

The engineer who presented this workshop showed how his company, SoundCloud, builds its own containers to power a “virtual lab” used to simulate failures and train engineers to react. His technique, based on scripts that build and launch containers at the press of the Enter key, is an effective solution both for quick prototyping and for production deployment whenever Docker or other commercial/free solutions are not viable (because of funding or complexity).
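
For illustration only (this is not SoundCloud’s tooling), the spirit of that script-driven approach can be captured in a few lines of Python driving standard Linux utilities: a fresh PID and mount namespace plus a chroot into a prepared root filesystem already gives you a rudimentary container.

```python
# Hypothetical sketch: launching a rudimentary "container" without Docker.
# Assumes root privileges and a prepared root filesystem at ./rootfs
# (the paths and names here are invented for illustration).
import subprocess

def launch_container(rootfs="./rootfs", command="/bin/sh"):
    # unshare (from util-linux) starts the process in fresh PID and mount
    # namespaces; chroot then confines it to the prepared filesystem tree.
    subprocess.run(
        ["unshare", "--fork", "--pid", "--mount-proc",
         "chroot", rootfs, command],
        check=True,
    )

if __name__ == "__main__":
    launch_container()
```

Real setups layer cgroups, networking and image management on top, and that plumbing is exactly what a scripted “virtual lab” can automate.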

As much as this was quite a hardcore session, it was good to see how services can be run in a way that makes their performance very easy to manage. This is definitely something I will be sharing with my IT colleagues.

Extreme web performance for mobile devices
A lightweight (so to speak!) finale to the day, discussing how mobile websites present a diverse range of performance issues and what techniques can be used to test and improve them. The major contribution of this session, however, was to share some truly extraordinary statistics about mobile traffic and browsers.

For example, on mobile 75% of traffic comes from the browser and 25% from web views (i.e. from within apps), 40% of which is from Facebook. Of course, these figures change from country to country, which makes it hard to launch a website with a single audience in mind. For universities, this becomes incredibly important for international student recruitment.

Similarly striking, we learnt that the combined share of Safari and Chrome, the major mobile browsers, reaches 93% on WiFi networks but only 88% on 3G networks; this suggests that connection speed still matters to people, who may opt for different, more traffic-efficient browsers in connectivity-challenged environments (Opera Mini, for example, goes up from 1% to 4%).

One good practical piece of advice is to adopt the RAIL approach promoted by Google, a user-centric performance model that considers four aspects of performance: response, animation, idle time and load. Each aspect has its own ‘maximum allowed time’ before the user gets frustrated or abandons the activity, and balancing them is a delicate exercise.
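
The budgets below are approximate figures as commonly cited for RAIL at the time (they are my own illustration, not from the talk), but they show how each aspect can be checked against its own time limit.

```python
# Approximate RAIL budgets (illustrative figures, in milliseconds).
RAIL_BUDGETS_MS = {
    "response": 100,   # react to user input within ~100ms
    "animation": 16,   # produce each frame within ~16ms (to sustain 60fps)
    "idle": 50,        # do background work in chunks of at most ~50ms
    "load": 1000,      # show meaningful content within ~1s
}

def over_budget(measurements_ms):
    """Return the RAIL aspects whose measured time exceeds their budget."""
    return [aspect for aspect, value in measurements_ms.items()
            if value > RAIL_BUDGETS_MS[aspect]]

print(over_budget({"response": 80, "animation": 24, "idle": 30, "load": 2600}))
# ['animation', 'load'] - the areas to optimise first
```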

There was also a good level of discussion around the very popular “responsive web design”, a technique that has become a goal in itself. The speaker suggested that it should be just a tool rather than a goal: users don’t care about “responsive”, they care about “fast”. “Never forget the users” is a good motto for everyone working in IT.

Summary
Velocity’s first day has been very hands-on. The overall take-home lesson is simple: managing performance requires some sound science, but with adequate tools and resources it is possible to do it effectively, even on a shoestring budget. As an advocate of keeping resources under internal control and management rather than outsourcing them, today’s talks have certainly given me some great insight into how to achieve this smartly.

Aside from this summary, I’ve also been taking some technical notes, which are available here and will also cover future sessions.