Why and how to ditch AWS for cost reduction

AWS is an extremely evolved and useful ecosystem that makes many things easy. Nevertheless, there comes a time to say goodbye.

Preface

I am a Neanderthal of sorts. Yes, I've really been living under a rock. Well, building my own underground cavern, infrastructure-wise, but yes, I have not actually drunk the public cloud Kool-Aid. This is likely because I am rather system-savvy: at any crossroads where I was offered a choice between a managed solution and my own, the choice was easy. My own was more flexible, more cost-effective, and did not require learning irrelevant stuff. Thus, over the years, I ended up deep in the rabbit hole of building my own virtualization infrastructure.

Client situation

On one hand, this wasn't the best place to be. How relevant is your esoteric know-how when everything around you is "in the cloud" and, worse yet, the container hype is in full swing? However, there does seem to be a place where all of this comes in handy. A certain client is an adtech operation: a heavily loaded system with high-volume traffic (10k requests/sec). Over a period of 6 months, the client was charged increasingly ridiculous amounts for incoming and outgoing traffic and, to a lesser extent, for compute resources. Luckily, all they used was EC2 and load balancers, so they had little to no lock-in to the AWS ecosystem.

Convincing

Took plenty. In fact, I'd say this was half the job. Being quite tech-savvy and having a historical memory of managing infrastructure, they really did not want the headache of running their own rancho. But, apparently, money is stronger than any conviction. It could very well be that AWS perceives a customer's "liquidity" as a product of their traffic volume: a "they're receiving lotsa traffic = they're doing really well" sort of thing. That's not without its logic, but apparently it does not hold in adtech.

Logical steps

Automate the client's local dev environment. This gave us a basic understanding of the app's logical structure and immediately gave the client something of value: an easy-to-set-up local dev tool.
Transform said automation into a multi-host environment by spreading functionality across nodes with the notion of "roles": a role for the DB, for Redis, for the app server, for the queue worker, and so on.
Build a multi-host staging / preprod environment that mimics existing production as closely as possible in terms of scalability, throughput, and data volume. Run QA and stress tests to validate it.
Prepare and execute a migration procedure that minimizes downtime. The hardest thing to pull out of AWS, or any cloud, is the data. In this case it was MongoDB, which does have its share of pain (initial sync is brutal on large DBs).
Plan for rollback. A multi-location replica set is a great way to go if you want to be able to go back easily in case the destination deployment does not meet expectations right after migration, or when other uncertainties justify the extra precaution.
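
To make the rollback idea a bit more concrete, here is a minimal sketch of a replica set config document that spans both locations. The hostnames, the set name and the helper functions are made up for illustration. The idea it assumes is that members at the standby site get priority 0, so they replicate data but are never elected primary, and that cutover (or rollback) is just a reconfig that flips the priorities.

```python
import copy

# Hypothetical hostnames; substitute the real replica set members.
AWS_MEMBERS = ["mongo-aws-1:27017", "mongo-aws-2:27017"]
NEW_MEMBERS = ["mongo-new-1:27017", "mongo-new-2:27017", "mongo-new-3:27017"]


def build_config(name, preferred, standby, version=1):
    """Build a replica set config document spanning both locations.

    Members in `preferred` get priority 1 and may be elected primary.
    Members in `standby` get priority 0: they replicate, but never lead.
    """
    members = []
    for host in preferred:
        members.append({"_id": len(members), "host": host, "priority": 1})
    for host in standby:
        members.append({"_id": len(members), "host": host, "priority": 0})
    return {"_id": name, "version": version, "members": members}


def swap_sites(config):
    """Return a copy with priorities flipped and the version bumped.

    Member _id values are left untouched so each host keeps its identity.
    """
    flipped = copy.deepcopy(config)
    flipped["version"] += 1
    for member in flipped["members"]:
        member["priority"] = 0 if member["priority"] > 0 else 1
    return flipped


before = build_config("rs0", AWS_MEMBERS, NEW_MEMBERS)  # AWS still leads
cutover = swap_sites(before)                            # new site leads
rollback = swap_sites(cutover)                          # back to AWS
```

The resulting dicts have the general shape of a MongoDB replica set configuration document. Applying them to a live set is deliberately left out, since that part depends on the driver and the MongoDB version in use.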

Fabric

It is ancient by today's standards, and it's definitely not as hip as Kubernetes, etc. However, it is a very flexible, low-level, low-overhead tool to ~~get shit done~~ automate things via SSH. In Fabric, tasks are Python functions whose commands are executed on remote hosts. The hosts can be read from the SSH config, which is great as a standard source of truth.
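
To show why the SSH config works well as a source of truth, here is a stdlib-only sketch of the idea. It is not Fabric's own code. The sample config and both function names are assumptions for illustration, and the parser is a simplification that only understands plain `Host` lines.

```python
# Hypothetical ~/.ssh/config contents, inlined so the sketch is self-contained.
SAMPLE_SSH_CONFIG = """
Host app1
    HostName 10.0.0.11
    User deploy

Host db1 db2
    User root

Host *
    ServerAliveInterval 30
"""


def hosts_from_ssh_config(text):
    """Return concrete Host aliases, skipping wildcard and negated patterns."""
    hosts = []
    for line in text.splitlines():
        parts = line.strip().split()
        # Only "Host <alias> [<alias> ...]" lines declare hosts;
        # "HostName", comments and other keywords are ignored.
        if len(parts) >= 2 and parts[0].lower() == "host":
            for alias in parts[1:]:
                if not any(ch in alias for ch in "*?!"):
                    hosts.append(alias)
    return hosts


def ssh_argv(host, command):
    """Build the argv for running one shell command on one host via ssh."""
    return ["ssh", host, command]


hosts = hosts_from_ssh_config(SAMPLE_SSH_CONFIG)
commands = [ssh_argv(host, "uptime") for host in hosts]
```

Each entry in `commands` could be handed to `subprocess.run`. Connection details such as user, address and key stay in the SSH config, so the automation only ever deals in host aliases.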

Roles

What gets executed where (at least by default) is denoted by roles, which can be assigned to hosts. On AWS, hosts can be mapped to roles using AWS tags. On plain SSH hosts, the mapping can be stored locally, in a database, or on each host (dangerous, since the definition is lost along with the host).
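
As a rough sketch of the mapping step, the function below turns a list of tagged host records into a role-to-hosts dict. The inventory, the `role` tag key and the comma-separated convention for multi-role hosts are all assumptions for illustration. The records could come from a cloud tagging API or from a local file; the sketch does not care which.

```python
from collections import defaultdict

# Hypothetical inventory: one record per host, each with a free-form tag dict.
INVENTORY = [
    {"host": "app1", "tags": {"role": "app"}},
    {"host": "app2", "tags": {"role": "app"}},
    {"host": "db1", "tags": {"role": "db"}},
    {"host": "util1", "tags": {"role": "redis,worker"}},
    {"host": "spare1", "tags": {}},
]


def roledefs_from_inventory(inventory, tag_key="role"):
    """Map each role to the hosts carrying it; untagged hosts get no role."""
    roledefs = defaultdict(list)
    for record in inventory:
        raw = record["tags"].get(tag_key, "")
        for role in raw.split(","):
            role = role.strip()
            if role:
                roledefs[role].append(record["host"])
    return dict(roledefs)


roledefs = roledefs_from_inventory(INVENTORY)
```

The result is a plain dict of role name to host list, which is the same shape Fabric 1.x expects in `env.roledefs`.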

Rapid prototyping during development

Fabric makes it easy to automate dev environment setup on a local virtual machine, or even on localhost (ssh root@localhost). From there on, the same code can be used to automate staging and production deployment. This means things can grow organically.
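
A small sketch of what "the same code" can mean in practice: the steps are defined once per role, and only the role-to-host map changes between environments. The environment names, hostnames and service commands are all hypothetical.

```python
# Hypothetical role-to-host maps; only this data differs between environments.
ENVIRONMENTS = {
    "dev": {"db": ["localhost"], "redis": ["localhost"], "app": ["localhost"]},
    "staging": {"db": ["db1"], "redis": ["cache1"], "app": ["app1", "app2"]},
}

# Hypothetical per-role commands, listed in the order they should run.
ROLE_COMMANDS = [
    ("db", "systemctl restart mongod"),
    ("redis", "systemctl restart redis-server"),
    ("app", "systemctl restart myapp"),
]


def plan(environment):
    """Expand the role commands into (host, command) steps for one environment."""
    roledefs = ENVIRONMENTS[environment]
    steps = []
    for role, command in ROLE_COMMANDS:
        for host in roledefs.get(role, []):
            steps.append((host, command))
    return steps


dev_steps = plan("dev")
staging_steps = plan("staging")
```

On dev, every step lands on localhost; on staging, the same three commands fan out across four hosts. Adding production would mean adding one more entry to `ENVIRONMENTS` and nothing else.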

Summary

Fabric allows us to reuse deployment automation and grow it organically from the prototyping phase through to live deployment.
Fabric can be reused across clouds and allows us to migrate stuff away from AWS, so long as there is no vendor lock-in.
Fabric's notion of hosts and roles applies neatly to hardware machines, virtualized environments, and AWS alike.

Being a caveman is not all bad!