Senior Site Reliability Engineer

44,254 P.A.
MMT Digital

About the role

As a Senior Site Reliability Engineer, after an initial induction period in our London office you’ll be working on our client’s site, where you’ll be directly responsible for uptime of client infrastructure and applications. You’ll be working closely with our clients to ensure safe, accurate delivery of services. You’ll own end-to-end availability and performance of key services, and build automation to prevent problem recurrence.

Assisting in the roll-out and deployment of new product features and installations, you’ll be working closely with application and cloud architects to ensure that platforms are designed with scale and operability in mind. You’ll automate manual infrastructure management and alert-handling processes, and find scalability bottlenecks and areas for performance improvement.

You’ll assist in the discussion and formulation of architectural strategies to maximise performance, stability and efficiency of the solution, and work with the Systems Team Delivery Manager and wider Systems Engineering Teams to continually assess and improve ways of working.

As a senior member of the team, you’ll help SREs in your team grow and develop their careers through mentorship. You’ll also share your knowledge and experience by organising Lunch and Learn, Lightning Talk and Brown Bag sessions, where you’ll be able to educate members of the wider business about working in an SRE model and onboard them to it.

About you

In order to flourish in this role, you’ll need the following:

  • Experience in building highly resilient systems in an Agile environment
  • Experience with running production systems, triaging and solving outages
  • Experience with using monitoring and observability tools like Datadog/Splunk
  • In-depth knowledge of at least one infrastructure as code tool such as CloudFormation or Terraform
  • Deep understanding of large-scale system architecture
  • Working knowledge of UNIX/Linux internals
  • Proven success at adopting new methodologies driving an SRE culture
  • In-depth knowledge of AWS, particularly ECS, EC2, CloudWatch, IAM, networking and S3
  • Configuration management using Ansible
  • Hands-on source control experience, knowledge of Docker and container runtimes and experience of writing build-release pipelines, preferably using Azure DevOps
  • Software engineering experience
  • Ability to listen effectively, communicate, challenge and influence team members, peers and your management
  • Excellent troubleshooting skills, able to drive out root causes of complex technical problems

It would also be great (but not essential) if you have:

  • Had the chance to use and implement the Google SRE working model for Enterprise clients
  • Experience working with incident management tools such as PagerDuty
  • Experience with hybrid cloud solutions
  • Versatility in one or more scripting languages (Python, Perl, Bash, PowerShell)
  • Knowledge of test frameworks (Pester, etc.)
  • Working knowledge of Windows Server

What you’ll get

Here at MMT Digital we care about the people that work with us, and we have tried to reflect this in our benefits package.

  • Annual salary review
  • Company bonus scheme
  • An environment which supports and celebrates your personal development and growth
  • The opportunity to join internal groups which foster innovation and technical excellence
  • Lunch & Learn and Brown Bag sessions
  • Magic moments – a personalised peer-to-peer reward and recognition scheme
  • Paid volunteering day, and support for fundraising activities
  • A social events committee that organises events throughout the year, including our awesome Christmas party
  • Enhanced maternity and paternity pay, along with a new baby hamper to celebrate your new arrival
  • Popcorn Pass Rewards
  • Cakes for your birthday - everyone loves cake!

About us

We are MMT Digital, a technical consultancy specialising in the delivery of effective digital solutions.

Join us and you’ll be part of our 170-strong UK team of experts who devise and deliver complex digital solutions to help our clients drive faster change.

We’ve spent the last 20 years working with organisations such as Bacardi, Vodafone and comparethemarket.com to digitally enable their businesses and help them drive the most value to their customers at speed and scale.

We work with CIOs, CTOs and Heads of Digital who need a trusted partner to augment their team with deep technical expertise and delivery capabilities. At the same time, we can upskill their in-house people through knowledge transfer and change management to create high-performing teams. Our collaborative approach means our teams – whether working onsite, offsite or a combination of the two – build open and genuine relationships with our clients that bring success faster.

We became part of Be Heard Partnership in 2016. We have been rated the UK’s most recommended digital partner by clients for the last six years (The Drum Recommends), and we picked up eight awards in 2019, including the prestigious Grand Prix for a second successive year.
