Most features of Amazon Web Services (AWS) are low risk when it comes to changing your mind later. Don't like an EC2 instance type? Just stop the instance and start it with a new type. Want a larger EBS volume? Simply snapshot the current one and create a larger volume from it. This flexibility and the low cost of errors are some of the great features of the AWS platform.
However, one place where you really need to get things right from the start on the AWS platform is your Virtual Private Cloud (VPC) design. Unfortunately, there isn't a lot of wisdom imparted through the defaults or the documentation provided. The purpose of this post is to lay out some best practices so you won't find yourself up a creek later. If you've already gone partway up a creek, you'll be fine - AWS is a pretty agile canoe and there is no shortage of paddles.
One reason VPC design is so important is that you put things into it. It is a container for much of the other stuff you are going to build in AWS. You'll deploy your EC2 instances into its subnets. You'll terminate your VPN connection into its gateway. You'll put Relational Database Service (RDS) instances and Elastic Load Balancers (ELBs) inside it as well. Put a lot of stuff into a VPC and changing it becomes rather difficult. You don't want to rebuild everything in another VPC because you failed to think ahead.
Another reason VPC design matters is that it is easy to get yourself confused in a VPC. Change some routing rules, add some Network Access Control Lists (NACLs), and configure your VPN endpoint or Direct Connect (DX), and you may be scratching your head over why you can't SSH into your bastion host even though the Security Group (SG) configuration is correct.
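To make that head-scratcher concrete, here is a toy model of the layers a connection has to clear. It is only a sketch, not how AWS evaluates traffic internally: the Rule class and the ssh_reachable function are made up for illustration, and rule numbering, deny rules, protocols and source addresses are all ignored. The one real behavior it leans on is that SGs are stateful while NACLs are stateless, so a NACL has to allow the reply traffic explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical, simplified allow rule (no CIDRs, protocols or denies)."""
    direction: str  # "in" or "out"
    port_from: int
    port_to: int


def allows(rules, direction, port):
    """True if any rule permits this port in this direction."""
    return any(
        r.direction == direction and r.port_from <= port <= r.port_to
        for r in rules
    )


def ssh_reachable(has_route, nacl_rules, sg_rules, client_port=50000):
    """Toy check of the layers an inbound SSH connection must pass."""
    if not has_route:
        return False, "no route"
    if not allows(nacl_rules, "in", 22):
        return False, "NACL blocks inbound 22"
    if not allows(sg_rules, "in", 22):
        return False, "SG blocks inbound 22"
    # SGs are stateful, so the reply is allowed automatically.
    # NACLs are stateless, so the reply to the client's ephemeral
    # port needs its own outbound rule.
    if not allows(nacl_rules, "out", client_port):
        return False, "NACL blocks return traffic"
    return True, "ok"


sg = [Rule("in", 22, 22)]    # the "correct" SG
nacl = [Rule("in", 22, 22)]  # outbound ephemeral ports forgotten
print(ssh_reachable(True, nacl, sg))

nacl_fixed = nacl + [Rule("out", 1024, 65535)]
print(ssh_reachable(True, nacl_fixed, sg))
```

The first call fails on the NACL's return path even though the SG is fine, which is exactly the kind of thing that eats an afternoon.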
The third big reason to worry over your VPC design early on is that it forms the basis of your applications' security. A VPC is very powerful when combined with properly used SGs and good cloud computing architecture. I've seen AWS bludgeoned into looking like a traditional on-premises data center network and security architecture, and it's a real shame. AWS security is powerful, but if you approach it with your existing InfoSec tome of requirements, you'll make a less secure, more fragile and uglier baby. While even ugly babies are lovable, beautiful ones are preferable. And in AWS they generally cost less and perform better too!
I'm not going to give you step-by-step instructions or recipes because these produce mediocrities. Good designs come from holding a set of conflicting goals in mind and imagining them in different combinations until something goes click and you know you've got it (however temporarily).
You Want Security, Resiliency and Efficiency
Security is the fidelity of the design. Does your security allow only intended behaviors and prevent unintended ones? Modern security methodologies and policies often backfire in the cloud (actually many backfire most places). The resulting systems are fragile in the face of changing business requirements, low in fidelity because security policy limits the intended uses, and highly inefficient due to burdensome implementation. AWS allows the use of more flexible, efficient and resilient security patterns, but it does take some learning and mental flexibility.
Resiliency is the long-livedness of the design. How long can you work with your initial VPC before you hit a dead end, sigh, and go make a new one? More resiliency is generally better unless you’ve made a really ugly baby that you realize too late you want to replace. In these cases resiliency can be a bitterly ironic flaw. But you’re not going to put those security appliances running on EC2 instances between each of your subnets like this was an old-school LAN, are you? Good. We’re one step closer to a beautiful baby.
Efficiency is the maximization of ROI. Efficiencies play out in so many ways on AWS that it can be hard to wrap your head around them. In the case of VPC, the primary efficiencies are not wasting IP address space, not having to move resources once deployed, and ease of automation and deployment into the VPC.
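As a rough illustration of the address-space point, here is a minimal sketch using Python's standard ipaddress module. The 10.0.0.0/16 range, the /20 block size and the subnet names are all assumptions picked for the example, not recommendations. The idea is simply to carve the VPC into equal blocks, hand out only what you need now, and keep the rest in reserve so nothing has to move later.

```python
import ipaddress

# Hypothetical VPC range, chosen only for illustration.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Split the VPC into equal /20 blocks (16 blocks of 4096 addresses).
blocks = list(vpc.subnets(new_prefix=20))

# Hypothetical subnet names; assign the first blocks, reserve the rest.
names = ["public-a", "public-b", "private-a", "private-b"]
plan = dict(zip(names, blocks))
reserved = blocks[len(names):]

for name, net in plan.items():
    # AWS reserves five addresses in every subnet, so usable is total - 5.
    usable = net.num_addresses - 5
    print(f"{name:10} {net}  ({usable} usable addresses)")

print(f"{len(reserved)} blocks held in reserve, starting at {reserved[0]}")
```

Because the blocks never overlap and most of the range stays unallocated, adding a subnet later is a matter of taking the next reserved block rather than renumbering what is already deployed.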
You may be wondering where performance fits into the goal list. The good news is that on AWS you get a pretty consistent level of performance from your VPC no matter the logical architecture. The not-so-good news is that this level of performance is under 5 ms of latency, with bandwidth dependent on the instance type, the server's neighborhood within AWS and what is happening around it. With a little added cost, you can gain much higher performance by dropping your instances (of an appropriate type) into a Placement Group.
The problem people run into is that getting resiliency, efficiency and performance out of VPC doesn't look a great deal like getting them from traditional infrastructure. In the next post, I will describe how an ugly VPC is the normal outcome of an implementation based on pre-cloud ways of thinking.