Why “mediocrity” is your friend (for now at least)
SOC.OS is a security alert correlation, enrichment and prioritisation tool that was born in an internal incubator program at BAE Systems Applied Intelligence. In June 2020, the SOC.OS team and product spun out from BAE Systems to form a new start-up company. We moved from being part of a large, well-established technology firm to being a new business with fewer engineers than letters in our name.
The more typical progression is to gradually grow a small start-up into a larger business, but we instead found ourselves taking a big step in the opposite direction. To be successful, our “progression” must be accompanied by a change in mindset and consideration of how we embrace the flexibility of being a start-up without forgetting the valuable lessons learnt whilst part of a big company.
While this re-assessment applies to all aspects of the business, I’d like to focus on a specific technical element and highlight how the re-evaluation has impacted our philosophy on automated deployments of the SOC.OS tool.
The nature of BAE Systems’ work and customer base can easily foster an approach to deployment automation that is at odds with the typical start-up mindset. At the more extreme end of the spectrum, this leads to an attitude where data integrity is paramount, deployment windows are set in stone¹, releases are planned months in advance and emphasis is put on exhaustive testing because, if you introduce a bug now, it could be a long time before you can release a fix.
There are huge positives in the above approach – it works for established systems where reliability is key. But as a start-up company you need a faster, iterative approach to development and deployment so that you can move quickly when you receive product feedback and get new features in front of your customers in a matter of weeks rather than months. It’s about striking the right balance between speed and reliability.
So given the slightly unusual progression we have gone through and based on the subsequent learnings, here are the steps that I think need to be considered when deciding how to tackle automated deployment of infrastructure in a start-up company.
Step 1: Find a good tool
It’s pretty obvious but that doesn’t make it any less crucial – you need the right (or right-ish) tool for the job. SOC.OS uses serverless capabilities across a number of cloud providers. So for us, flexibility is key – enter Ansible.
Ansible is open-source software that provides a wonderfully simple but incredibly powerful way to configure, deploy and manage infrastructure. It is the LEGO®² of deployment automation tools. The module library provides sturdy and reliable building blocks that you can fit together in all manner of ways to create whatever beautiful castles you want to (learn more about defending the castles you build with Ansible in our Defending your castle with MITRE ATT&CK® blog post).
Using Ansible does introduce some overhead for us – most of the engineers on our team haven’t used it before so need to learn how to write and debug the scripts. In addition, we have to set up Windows Subsystem for Linux instances to run our scripts on. But all things taken into account, Ansible meets our requirements pretty well and the benefits outweigh the drawbacks. So (like a good start-up) having found a technology that works fine, we stopped looking for alternatives.
Step 2: The innovation balancing act
Hmm. I just said that it’s good that we, a start-up, found one possible deployment automation tool that works OK and then stuck with it. But as a start-up, don’t we need to embrace change and adopt new technologies in order to innovate successfully?
There are many excellent tools out there to help with your deployment automation and it’s an ever-evolving landscape. It may well be that there is a tool that suits our current requirements better than Ansible does. One which is easier for our team to pick up, maybe one that is written in the same language as the rest of our development stack. However, you need to keep in mind that time spent researching, testing approaches and fine-tuning the deployment automation is time that is not being spent on the development of new product features that make our customers’ lives easier. So maybe it’s better to stick with our existing solution?
Again, it’s all about finding the right balance – this time between the effort it would take to change the deployment approach and the time saved and errors avoided by using the new approach. Don’t introduce new technology unless there is a clear argument for doing so and it’s really going to make management of your infrastructure easier now or in the near future. There’s no point spending six months future-proofing your deployment if the company doesn’t make it past year two, which is a realistic scenario in start-up life.
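One rough way to frame that balance is as a break-even calculation. The sketch below is illustrative only: the function names and every figure in it are hypothetical rather than SOC.OS numbers, and it assumes that both the migration effort and the ongoing savings can be estimated in engineer-days.

```python
import math


def break_even_months(migration_days: float, days_saved_per_month: float) -> float:
    """Months until the time saved by a new approach repays the cost of adopting it."""
    if days_saved_per_month <= 0:
        return math.inf  # the change never pays for itself
    return migration_days / days_saved_per_month


def worth_switching(migration_days: float, days_saved_per_month: float,
                    horizon_months: float) -> bool:
    """True if the change pays for itself within the planning horizon."""
    return break_even_months(migration_days, days_saved_per_month) <= horizon_months


# Hypothetical figures: 20 engineer-days to migrate, 4 days saved each month.
print(break_even_months(20, 4))      # 5.0
print(worth_switching(20, 4, 12))    # True
# Roughly six months (120 days) of future-proofing against a 24-month horizon.
print(worth_switching(120, 4, 24))   # False
```

The estimates will always be rough, but writing them down makes it harder to justify a large investment that only pays off after the period you can realistically plan for.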
For us, this point came when we started working with AWS resources. We could have relied on our Ansible scripts and the AWS Command Line Interface to handle resource creation and management, but that would have quickly become messy and unmaintainable. Instead, we split our infrastructure into groups of interacting resources and deploy them using CloudFormation templates – JSON or YAML files that can be used to create and manage AWS resources.
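To make the idea of a “group of interacting resources” concrete, here is a minimal sketch of what one such template might contain, built as a Python dictionary and printed as JSON. The grouping, the resource names (`RawAlertBucket`, `AlertQueue`) and the property values are hypothetical and are not taken from the SOC.OS infrastructure; only the overall template layout and the resource type identifiers are standard CloudFormation.

```python
import json


def build_ingest_template() -> dict:
    """Return a CloudFormation template for one hypothetical group of resources."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Hypothetical alert-ingest group: a bucket plus a queue.",
        "Resources": {
            # Logical names below are illustrative assumptions.
            "RawAlertBucket": {"Type": "AWS::S3::Bucket"},
            "AlertQueue": {
                "Type": "AWS::SQS::Queue",
                "Properties": {"VisibilityTimeout": 60},
            },
        },
        "Outputs": {
            # Outputs let other stacks or scripts look up what was created.
            "QueueArn": {"Value": {"Fn::GetAtt": ["AlertQueue", "Arn"]}},
            "BucketName": {"Value": {"Ref": "RawAlertBucket"}},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_ingest_template(), indent=2))
```

The printed JSON could be saved to a file and deployed as a single stack, for example with the AWS CLI’s `aws cloudformation deploy` command or from an Ansible task. Keeping each group in its own template means the resources in it are created, updated and removed together.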
Step 3: Invite everyone to the party!
Not something you’ll have heard that often during 2020/21…
Once you’ve picked a technology and have an approach for automated deployments in place, get your entire engineering team involved. Work towards the point where a development ticket covers designing the change, developing the code, testing it and updating the deployment scripts. By doing this you end up with more redundancy of skills in the team, more variety of tasks for your engineers and fewer of the errors that creep in when one person documents which updates the deployment scripts need and another team member makes the actual changes. Everyone should be a part-time DevOps engineer.
Step 4: Include your customers
Now you’ve got all your engineers invested in the deployment process, who else can you involve…your customers, perhaps?
OK, so you’re not going to ask them to write deployment scripts, but remember who your customers are. Early adopter customers know that they’re not getting the finished article. They are keen to help drive the direction of the product because they see its potential. Given the choice between zero downtime during a deployment, or a few hours’ downtime plus a shiny, valuable new product feature created in the time saved by not finessing the deployment scripts, as often as not they will choose the latter. So you can involve them by asking for their preference on these things. The importance of clear and upfront communication with your customers cannot be overstated – when you manage expectations, many of them may be happy to tolerate some disruption if it increases the speed of product development.
Ask yourself: how is my work improving the lives of my customers? It is helpful to your customers to have a moderately fast, reliable process in place so that the system remains stable and product deployments can be done regularly. But beyond that, do they care? Obviously the gold standard is continuous, bug-free, seamless deployments, but realistically that is beyond the scope of most start-ups. So don’t get too hung up on having the perfect automated deployment set-up. Have one that is good enough for now and iterate when it’s required or when you have the scope to do so.
True to the lean start-up mindset, your deployment approach should be treated in the same way as your overall product development: build, learn, iterate, improve, and do it quickly and frequently.
¹ To read more idioms in the context of SOC.OS, see our recent blog post – The Needle in the Needlestack
² LEGO® is a trademark of the LEGO Group of companies which does not sponsor, authorize or endorse this site.