When I write, read or hear a word too many times in a row, it starts to lose all meaning. It stops being a coherent word and starts being a random collection of letters next to each other. It ceases to have one strict definition and is completely open to interpretation.
What does Cloud mean? Depending on what your job is and how it relates to you, it could have several different meanings. Regardless of which one of them it is to you, ultimately it comes back to one simple idea – an abstraction. The replacement of the details of the implementation of something with a representative idea.
If you look at a network diagram, and there is a cloud in it, it means “you don’t know or care what happens in this section”. Most of you probably have ADSL at home. I’m certain that 99% of you have no idea how your traffic gets from your home router onto the Internet at large. It’s irrelevant to you. That network component seamlessly connects you to the things you want; it is transparent.
In 2006 I was hosting my own mail on my own server, running all my own applications. My employer was doing this too, and between the two of us we spent an inordinate amount of time maintaining our infrastructure. This seemed silly to me because I was a Network Engineer, but here I was spending portions of my time being a Systems Administrator. I came to the conclusion that the things I worked on outside my primary competency were either:
a) taking longer than they should, or,
b) not being done as well as they could
This leads me to the essential point of ‘cloud’ from the perspective of someone buying into it. It isn’t necessarily about having a scalable application, or about using dynamic server spool-up.
The essence of cloud is finding a pain point, and abstracting it away, for profit.
The only difference is that now we’re talking about hosting those things online, so they can be delivered to everyone, as easily as possible.
So we’re taking away pain points that used to be owned, operated and managed in-house, and hosting them away somewhere on the Internet. All well and good. Problem is, a lot of those applications were originally designed to be run in-house. In-house has a lot of inherent implications about the environment. It’s probably going to be:
– only IP accessible by clients that are supposed to have access to it
– extremely low latency
– extremely low jitter
– extremely high bandwidth
– relatively static in routing
The Internet is:
– IP accessible by anyone, anywhere
– totally variable in latency
– totally variable in jitter
– totally variable in bandwidth
– totally dynamic in routing
These things are going to have real, noticeable effects on what you plan to do, regardless of what that is.
I’m going to talk about three situations.
1. You’re writing an application online, or you’re a business wanting to move an application online
2. You’re setting up as a cloud computing provider
3. You’re running a startup and have no infrastructure
So, you’re writing an application to host online – that’s great, I build the Internet so you’re securing my job.
1. Location location location
The first thing you need to think about is where your users are. Who is the target audience, and where will they physically be located? This is going to be the first thing that starts crossing potential hosting solutions off your list. If you’re doing any kind of voice, 150ms is the acceptable RTT limit before people start noticing. For users in Australia, this means there’s almost zero chance of hosting it overseas. In the immediate present, that rules out Amazon EC2.
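To see why distance alone can blow a 150ms budget, here is a rough sketch of the best-case round trip between two points. It assumes light in fibre travels at about 200,000 km/s (roughly two-thirds of its speed in a vacuum) and that the cable follows the great-circle path, which real cables never do. The coordinates are approximate and the US east coast site is purely hypothetical; real RTTs will be higher once routing, queuing and equipment are involved.

```python
import math

FIBRE_KM_PER_S = 200_000.0   # assumption: light in fibre travels at roughly 2/3 of c
EARTH_RADIUS_KM = 6371.0
VOICE_RTT_BUDGET_MS = 150.0  # the RTT limit quoted above


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_rtt_ms(distance_km):
    """Best-case round trip: there and back at fibre speed, nothing else."""
    return 2 * distance_km / FIBRE_KM_PER_S * 1000


# Approximate coordinates, for illustration only.
SYDNEY = (-33.87, 151.21)
US_EAST = (39.04, -77.49)   # hypothetical hosting site

distance = great_circle_km(*SYDNEY, *US_EAST)
rtt = min_rtt_ms(distance)
print(f"{distance:.0f} km, best-case RTT {rtt:.0f} ms, "
      f"within voice budget: {rtt <= VOICE_RTT_BUDGET_MS}")
```

Even this physics-only floor lands over the budget for that pair of locations, before a single router has touched the packet.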
2. The Internet breaks all the time
How are your users connecting? If your customers are on 3G and this is a mobile app, they’re going to be transiting between cells, they’re going to lose packets, and if they’re on Vodafone they’re going to lose signal altogether. They’re going to walk away from their wifi coverage area, then they’re going to walk back into it 20 seconds later. Your application needs to be resilient enough to handle these things gracefully, and moreover you need a way to test what your app will do under these conditions. If you’re lucky, they’ll blame the provider, but at the end of the day they had a bad user experience with your app.
So let’s say that we have the best case scenario: your customers are all in Australia, and they’re all on DSL or better. What are your traffic patterns going to look like? And how much traffic could you potentially have? It’s important to differentiate cloud applications from cloud computing. Does it make sense to use an elastic cloud provider? The answer is going to depend on what your growth and scaling look like. If your feasible growth is relatively slow, and your traffic patterns are largely uniform, it may not make any sense to bother.
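One common way to ride out short drop-outs is to retry with exponential backoff, and one way to test it is to wrap the network call in something that fails on purpose. The sketch below shows both. The names `call_with_retries` and `FlakyLink` are made up for illustration, and the delays and attempt counts are arbitrary assumptions, not recommendations.

```python
import random
import time


def call_with_retries(operation, attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep, rng=random.random):
    """Run operation(), retrying on connection errors with exponential backoff.

    The delay doubles after each failure, is capped at max_delay, and is
    scaled by a random factor so many clients do not retry in lockstep.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of retries: let the caller show a real error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * (0.5 + rng() / 2))  # between 50% and 100% of delay


class FlakyLink:
    """Test double (hypothetical): fails a set number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def fetch(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("simulated drop-out")
        return "ok"


# Simulate a user walking out of wifi range for two requests.
link = FlakyLink(failures=2)
waits = []  # record the delays instead of actually sleeping
result = call_with_retries(link.fetch, sleep=waits.append)
print(result, link.calls, len(waits))
```

Passing `sleep` in as a parameter is what makes this testable: the test records the waits rather than sitting through them.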
By all means, develop your application in a scalable way so you can grow it when you need to, but if you’re never going to outgrow a small farm of machines then it’s of questionable wisdom to pay for the privilege of elasticity. If you’re Draw Something, a mobile game, and your app can go viral, elasticity pays for itself. Or perhaps you’re Xero, doing online accounting and everyone logs on at the end of the month to do Payroll, meaning you have a vast difference between the average usage level and the peak. But perhaps you’re not – if you’re BackBlaze, doing online backup, then perhaps you just need to add more of your own hardware as you go.
If your elasticity requirements and immediate potential growth aren’t massive, then you may be much better off, in both cost and complexity, running up a simple VPS or a dedicated server or two. If your application is written to scale well, you can do it manually when the time comes and save a lot of money. IP transit is cheaper than it’s ever been too – it may be worth setting up your own POP. Again, not necessarily the answer, but a question worth asking. And whoever you do decide to go with, make sure you get out there and visit them and talk to them and make sure that they are what they say on the tin. In a world where ‘cloud’ doesn’t mean any one thing, your ‘redundant storage’ cloud could just be two front-end servers connected to the same SAN with double parity. Don’t laugh, this happens, and you won’t be laughing when a failed PSU takes out your entire operation.
Depending on what you’re doing, if you have commercial or government clients you may face a number of hoops to jump through in order to get them using you. If you’re going to be storing credit card information there’s PCI-DSS compliance. If you’re going to be working with government departments then you’ll probably need to host the data in Australia on Australian-owned servers for data-sovereignty reasons. If you’re going to be selling to Defence then there may be physical access requirements around the places where the data is actually stored. No-one considers this stuff until they’re stuck in an elevator with an armoured server rack trying to figure out how to tetris it through an office building. This is not fun, don’t let it be you.
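A quick way to put a number on “vast difference between the average usage level and the peak” is the peak-to-average ratio of your traffic. This is only a sketch: the request counts below are invented, and where you draw the line between “steady” and “spiky” is a judgment call for your own costs.

```python
def peak_to_average(samples):
    """Ratio of the busiest interval to the mean across all intervals."""
    if not samples:
        raise ValueError("need at least one sample")
    mean = sum(samples) / len(samples)
    if mean == 0:
        raise ValueError("no traffic recorded")
    return max(samples) / mean


# Hypothetical requests-per-hour figures, invented for illustration.
steady = [100, 110, 95, 105, 90, 100]   # e.g. a backup-style workload
spiky = [20, 20, 20, 20, 20, 500]       # e.g. an end-of-month rush

print(peak_to_average(steady))
print(peak_to_average(spiky))
```

A ratio close to 1 means hardware sized for the peak is rarely idle, so fixed capacity wastes little. A large ratio means you would be paying for machines that mostly sit around, which is where elasticity starts to earn its keep.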
Your application will be out on the Internet. YOUR APPLICATION WILL BE OUT ON THE INTERNET! That should scare the hell out of you. You need to approach every software release with the idea that the Internet is a swarming mass of black hats who are chomping at the bit to get at your data. They really are all out to get you, and you need to be afraid of that. With this in mind, you should be monitoring everything. Keep stats on all your server activity; it’s easy, and there are plenty of programs to analyse the data and tell you if it’s out of spec for the day, week or month. Anyone who’s had a machine owned because of a software vulnerability knows the pain I’m talking about, and anyone who thinks they haven’t had a machine owned hasn’t been paying enough attention. Minimise your risk profile, know your upstream transit provider’s DDoS mitigation procedures, keep your software up to date and don’t expose anything to the Internet that doesn’t need to be exposed. You will be attacked; all you can do is be ready for it.
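As a rough idea of what “out of spec” can mean in practice, here is a minimal sketch that compares today’s figure for some metric against its recent history and flags it if it is more than a few standard deviations from the mean. The traffic numbers are invented and the threshold of 3 is an assumption; proper monitoring tools do considerably more than this.

```python
import statistics


def out_of_spec(history, today, threshold=3.0):
    """True if today's value is more than `threshold` standard deviations
    away from the mean of the historical values."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    mean = statistics.mean(history)
    spread = statistics.stdev(history)
    if spread == 0:
        # Perfectly flat history: any change at all is unusual.
        return today != mean
    return abs(today - mean) / spread > threshold


# Hypothetical daily outbound traffic in GB for one server.
last_fortnight = [41, 39, 40, 42, 38, 40, 41, 39, 40, 43, 37, 40, 41, 39]

print(out_of_spec(last_fortnight, 42))   # an ordinary day
print(out_of_spec(last_fortnight, 95))   # worth a look
```

The point is less the statistics than the habit: if you are not recording the baseline, you have nothing to compare a bad day against.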
If you’re developing mobile apps, remember that it won’t just be your apps connecting to your servers. If you’re hitting APIs on the public internet, people are going to find them, they’re going to test them, and they’re going to find ways to exploit them. Westfield Bondi learned this lesson in a very public and very embarrassing way when it was discovered that you could exploit the public interface to their parking application to find and track the location, comings and goings of any car, automatically. This wasn’t even hacking, it was just plain watching the data that was being transferred, and supplying your own variables for the search parameters. No exploits necessary.
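The general defence against that class of problem is to check on the server that the caller is entitled to the record they asked for, instead of trusting whatever parameters arrive. The sketch below is a generic illustration of that check, not a description of how any real parking application works; every name in it (`ParkingRecord`, `RECORDS`, `find_my_car`) is hypothetical.

```python
from dataclasses import dataclass


class Forbidden(Exception):
    """Raised when a caller asks for a record they are not entitled to."""


@dataclass(frozen=True)
class ParkingRecord:
    plate: str
    owner_id: str
    bay: str


# Hypothetical in-memory store standing in for a real database.
RECORDS = {
    "ABC123": ParkingRecord("ABC123", "user-1", "L2-114"),
    "XYZ789": ParkingRecord("XYZ789", "user-2", "L3-020"),
}


def find_my_car(authenticated_user_id, plate):
    """Return a record only if it belongs to the authenticated caller.

    `plate` comes from the request and is untrusted; `authenticated_user_id`
    must come from the server-side session, never from the request body.
    """
    record = RECORDS.get(plate)
    # Same error for "missing" and "not yours", so callers cannot probe
    # which plates exist.
    if record is None or record.owner_id != authenticated_user_id:
        raise Forbidden("no such record for this user")
    return record


print(find_my_car("user-1", "ABC123").bay)
```

Someone replaying the request with a different plate gets the same refusal whether or not that car exists.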
This is especially relevant for anyone whose application was hosted in-house and is considering a move to the public Internet, because this is the main problem you’re not used to facing. People doing this as a startup are also vulnerable, because it’s usually their first jump online.
So that’s the application stuff – five main things to consider while developing. There are no ‘right answers’, just ones that make sense for your situation. Next is for the providers.
WHAT TO REMEMBER WHEN BUILDING ANYTHING
Because Cloud is more about behaviour than implementation, there’s a lot of room to move here. But there are a few basic principles that anyone can take on board when trying to implement things of this nature. The first one is very basic.
1. Any data that you don’t have at least two copies of, in two separate locations, is data you don’t care about
This is another one of those ‘repeat after me’ ones. Say this to yourself over and over. You need to be prepared for any part of your system to fail. Redundancy is key to business continuity. Hardware and software are both going to fall over, so you need to figure out why they might do that. It doesn’t just apply to data – also to communications. Fortunately, communications is a lot easier to provide for. This means:
a) having more than one site
b) using dual network cards in servers
c) having redundancy in your switch network
d) having more than one router at each site
e) having more than one ISP
f) having storage available in multiple locations
Even Telstra can go down. Any suburb can lose power. Anyone’s UPS can fail, and any generator can fail to start. Any cable can get pulled out.
A few years back, AAPT caused huge voice outages because a high pressure chilled water loop pipe exploded. It lifted up the floor tile it was on, sprayed water all over its voice POP for QLD, then proceeded to fill the underfloor area of the room with water. People were without phones for days.
Webcentral, who arguably have more redundancy in their power systems and network than just about anyone, relied solely on water for their air conditioning. A truck smashed the water mains coming into their building, and their 10,000L tank on the roof slowly ran itself dry before the refilling contract truck arrived. The datacentre overheated and they had to shut everything down.
It happens to people who literally spend deep six figures on setting up their organisation to survive catastrophic failures – this means it can happen to you. Mitigate that risk by spreading your eggs around a number of baskets, and really work deep into those dependency chains when you’re figuring out potential failure causes.
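The “two copies, two locations” rule is easy to audit if you keep even a crude inventory of where your data lives. Here is a minimal sketch; the inventory, dataset names and site names are all invented, and `location` is meant to be a physical site, since two disks in the same rack share the same truck, pipe and power feed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    dataset: str
    location: str   # a physical site, not a hostname or a disk


def under_protected(copies, min_copies=2, min_locations=2):
    """Return the datasets that break the two-copies, two-locations rule."""
    by_dataset = {}
    for c in copies:
        by_dataset.setdefault(c.dataset, []).append(c.location)
    return sorted(
        name for name, locations in by_dataset.items()
        if len(locations) < min_copies or len(set(locations)) < min_locations
    )


# Hypothetical inventory, invented for illustration.
inventory = [
    Copy("customer-db", "sydney-dc"),
    Copy("customer-db", "melbourne-dc"),
    Copy("source-code", "sydney-dc"),
    Copy("source-code", "sydney-dc"),   # two copies, but one site
    Copy("invoices", "office-nas"),     # a single copy
]

print(under_protected(inventory))
```

Anything this reports is, by the rule above, data you have decided you don’t care about.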
2. Redundancy and backup are different things.
Redundancy is how you deal with ongoing failures in the course of normal business; it’s what keeps you running when something dies. Backup is about disaster recovery, and being able to continue doing business after a critical failure. You need two strategies, centered around two questions:
How will my business keep running if something I am using breaks?
How can I prevent my business from ceasing to exist if EVERYTHING breaks?
Your backup strategy needs to be something that can’t be affected by a failure in the live network. Whether that’s archiving to disk, or to a magnetic drive that then lives on a shelf, or a tape, it needs to be something separate, stored at a separate location. Last year, TV Central ceased to exist after failing to pay attention to their data storage requirements, and then blamed their hosting provider, Ventra IP, loudly and in public. When Ventra posted the full story of what really happened, the fallout made TV Central look exactly like the kind of organisation that would not keep any of its own data. It’s heartbreaking to see years of work go down the toilet, but if you don’t understand where your data is and what risks it’s under, it can happen to you. If you advise your VPS provider to cancel your hosting before you’ve migrated somewhere else, then that’s your fault. If you don’t have a copy of your code and database sitting on a flash drive or computer not attached to your automatic systems, that’s your fault too.
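A backup you have never checked is a hope, not a backup. As a minimal sketch of the “copy on a flash drive” idea, the code below copies a file to a separate directory and confirms the copy matches the original by comparing SHA-256 hashes. The file name and the temporary directories are stand-ins; in real use the destination would be removable or off-site storage that your automated systems cannot reach.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large dumps do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source, destination_dir):
    """Copy `source` into `destination_dir` and confirm the copy matches."""
    destination_dir = Path(destination_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)
    target = destination_dir / Path(source).name
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"backup of {source} does not match the original")
    return target


# Demonstration with throwaway directories standing in for the live system
# and an offline drive; both paths are hypothetical.
with tempfile.TemporaryDirectory() as live, tempfile.TemporaryDirectory() as offline:
    dump = Path(live) / "database.sql"
    dump.write_bytes(b"-- pretend this is a database dump\n")
    copy = backup_and_verify(dump, offline)
    verified = sha256_of(copy) == sha256_of(dump)
    print(copy.name, verified)
```

This only proves the copy is faithful at the moment it was made. It says nothing about whether you can restore from it, which is worth testing separately.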
And finally, what is out there that you, the enterprising young startup can take advantage of?
USEFUL CLOUD-BASED SOFTWARE
The first thing you obviously need to do is get set up. I probably don’t need to tell anyone about Google Apps, but if I do .. well, Google Apps. It’s cheap, easy to manage, works with all your mobile devices and it’s accessible anywhere. There’s a built-in Wiki, and Google Drive, their online storage synchronisation platform, is now integrated and live.
If you don’t want to use Google Drive for file storage & sync, Dropbox is affordable – free for 2GB, $100/year for 50GB, and team packages are available.
If you’re on the Microsoft side of things, their SkyDrive service is actually surprisingly good. If your first thought was ‘Why would I want Microsoft for this?’, the answer is really Office Live. The online versions of Office could easily be used as your primary word processor, something Google Docs sadly can’t compete with.
After that, you’re probably going to need to manage customer relationships and sell things at some stage.
To that end, SugarCRM is cheap and effective, starting at $30 per user per month, and you get access to basically everything for that price including API access. Salesforce starts getting useful at $21 per user per month, but if you want API access to start automating stuff, be prepared to shell out $180 per user per month. You may be able to talk them down to $130 if you play nice, but that’s still a whole lot. If you’re considering automating, definitely Sugar.
To actually do your work, I love Atlassian products. Jira is great for bug tracking, or any kind of support ticketing or request tracking. It’s got powerful workflow tools so if you’re repeating a process, it’s well worth doing. For up to 10 users, they’ll host it for you for $10 per month, and if you don’t need it hosted it’s $10/year. It has a powerful API, and if you need something the API doesn’t have, the database schema is easy enough to understand that you can figure out how to read the database yourself. Just remember, don’t make changes to the DB manually – write through the API if you’re going to do writes. The Atlassian products all kind of integrate into each other pretty well, and Confluence is pretty good for documentation if you’re into that sort of thing. You should probably be into that sort of thing.
If you’re wanting something to start time logging and billing customers for it, MinuteDock is very easy to set up and start using. There are apps to log time remotely and it integrates right into Xero, an online accounting package. Which leads me to my next thing .. Xero, an online accounting package. Unlike most accounting packages, this one has been designed recently, meaning it’s not stuck with 15 years of legacy interface like most of the others you’ve heard about. If Xero’s not your thing, Freshbooks is a bit simpler, and also integrates into MinuteDock.
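As an example of writing through the API, here is a sketch that builds (but deliberately does not send) a request to create an issue. The endpoint path and payload shape follow Jira’s REST API version 2 as I understand it, so check the documentation for the version you are running. The instance URL and project key are made up, and authentication is left out entirely.

```python
import json
import urllib.request


def build_create_issue_request(base_url, project_key, summary, issue_type="Task"):
    """Build (but do not send) a request that creates an issue via the REST API."""
    payload = {
        "fields": {
            "project": {"key": project_key},
            "summary": summary,
            "issuetype": {"name": issue_type},
        }
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/rest/api/2/issue",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical instance URL and project key.
request = build_create_issue_request(
    "https://jira.example.com", "OPS", "Replace failed PSU"
)
print(request.get_method(), request.full_url)
```

Going through the API means Jira still runs its own validation and workflow rules on the write, which is exactly what you skip when you poke the database directly.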
So that’s basically it. I hope you’ve come to learn something from our experiences and from the failure of others, and I hope that at least one thing I’ve said has truly scared you, because it is the correct handling of that fear that will keep you safe.