R is for Security

A lot of users, consumers and companies were hit by the recent Facebook outage in early October. Speculation ran high as to the cause, and a fair percentage of us most probably thought instantly: “Cyber Attack!?”

The Russians? Chinese? North Koreans? ISIS? Insulate Britain? The list is endless. Facebook is a target for all of them.

The motives behind attacks such as Denial of Service (DoS) are equally endless.  Money, political gain, industrial espionage, terrorism and simple kudos.  Facebook is a target for all of that.

The post-outage analysis will no doubt rumble on, but one thing caught my eye in the official statement posted by Facebook Engineering.

The last paragraph of their statement reads:

“People and businesses around the world rely on us every day to stay connected. We understand the impact that outages like these have on people’s lives, as well as our responsibility to keep people informed about disruptions to our services. We apologize to all those affected, and we’re working to understand more about what happened today so we can continue to make our infrastructure more resilient.”

And the word that leaps out at me is ‘resilient.’

We are often too embroiled in the world of ‘cyber’, with its new technologies and associated nerdy definitions, and too focused on threats and attacks. In the process we often forget one of the key parts of the whole risk equation: the potential impact.

The Facebook outage is just one example of what we all face daily, at every level, on our networks. It has a lot to do with the threat actors out there, but also with the vulnerabilities and, dare I say it, fragilities of the networks we work with.

The first shout that went out after Facebook went down was “it’s a DNS problem!”. This made me smile. I remember from the early days of my IT career, long before I properly understood what DNS was, that it was always blamed for everything on the network. It’s almost like an ‘IT Crowd’ response (rather than “have you tried turning it off and on again?”). “It’s probably a DNS issue” was normally a correct assumption, but DNS is only as good as the services around it and the administration put into it.

Like pretty much everything in the world of networking and IT.

Every network, regardless of its shape and size, depends on the services, applications and, more importantly, the protocols that hold it together. Those of us who like to work in these areas understand the concepts of OSI and TCP/IP. But when you look at how old most of these protocols are, and how modern and technologically advanced the systems that rely on them are, is it any wonder that occasionally something breaks?

It’s a bit like having a very modern, all singing all dancing car with no petrol at the pumps.  You may be able to control the car but do you control the supply chain?

Technology relies on a lot of external forces to make it work. Most of them are out of your control and some are as old as the hills. Break one link in the chain and it stops.

In the world of ‘Cyber’ and ‘Infosec’ we also have to remember to focus on the boring elements of resilience and redundancy, which allow us to put as big a tick as we can in the ‘Availability’ box.

Your systems can have as much Confidentiality and Integrity built in as you can afford, and be as sophisticated as you like, but if the glue holding them together gives way then you have a problem.

Protocols and applications such as TCP, IP, BGP and DNS all work like the glue on a network.  They remain transparent to us and we rely on them every day.  But as soon as one of them goes wrong or is misconfigured the impact can be huge.
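
As a small illustration of telling a failure in the glue apart from a failure in the service itself, here is a minimal Python sketch. The function name check_service and the three result labels are my own assumptions for illustration, not anything from Facebook's tooling. It simply tries name resolution first and a TCP connection second, using the standard socket module.

```python
import socket


def check_service(hostname, port=443, timeout=3.0,
                  resolve=socket.getaddrinfo,
                  connect=socket.create_connection):
    """Classify a basic availability check for one hostname.

    Returns "dns_failure" if the name cannot be resolved,
    "connect_failure" if it resolves but no TCP connection can be made,
    and "ok" otherwise. The resolve and connect parameters exist only so
    the check can be exercised without touching a real network.
    """
    # Step 1: is the name even resolvable? This is the DNS part of the glue.
    try:
        resolve(hostname, port)
    except socket.gaierror:
        return "dns_failure"

    # Step 2: the name resolves, so can we actually reach the service?
    try:
        conn = connect((hostname, port), timeout=timeout)
    except OSError:
        return "connect_failure"

    conn.close()
    return "ok"
```

Calling check_service("example.com") against a live network returns one of the three labels. The point is less the code than the habit: a check like this only tells you which layer gave way, not why, and it says nothing about routing problems such as BGP that sit underneath both steps.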

The biggest threat to our network is the Insider Threat.

By far the biggest type of insider threat comes from accidental, non-malicious actors. A click on a link, letting in someone you don’t know, sending an attachment to the wrong addressee, pulling out the wrong cable: all of these are far less sexy than blaming the Russians or the Chinese, but they are far more likely.

So until I hear differently – I will go with internal error as it’s probably true.

The biggest task that Facebook has is not finding the culprit or resolving the problem but learning from it and building some resilience into their change management processes.

‘R’ is a big part of Security
