Site Reliability Engineer
TechOps
WHAT YOU’LL BE DOING
The role holder will be responsible for:
- Ensuring high levels of system performance through monitoring, analysis and performance tuning
- Implementing scalability and fault tolerance
- Improving processes through automation or other efficiencies
- Troubleshooting Application and Middleware issues
- Working with Engineering teams to ensure successful running of their software in a high throughput production environment
- Building deployment pipelines that ensure high quality code deployments
WHAT WE’RE LOOKING FOR
The job holder will be an exceptional candidate with a proven track record in a similar role.
Essential
- Experience of working with Microsoft
- Strong working knowledge of Ansible, Docker and Kubernetes
- Fluent in written and spoken Italian and English
- Passion for reliability
- Experience creating and modifying Terraform deployments
- Previous experience of working in an Operations role (ideally a Site Reliability role)
- Ability to work collaboratively across multiple teams, to take ownership of, prioritise and be accountable for your work
- Experience working with Mikrotik networking hardware and software
- Understanding of Ceph storage
- Excellent communication skills
- Experience with monitoring solutions (Azure Application Insights, Log Analytics, New Relic, Zabbix, Elastic or Datadog)
- Experience with scripting/programming languages to assist in automating solutions, e.g. PowerShell (preferred), Bash, C#, Ruby, Python
- Experience supporting web-based applications
Desirable
- Azure DevOps pipelines
- Experience of working with Microsoft Server Operating Systems
- Experience of defining service level objectives/operational requirements for a Cloud-based solution
- Understanding and working knowledge of Microsoft Azure Cloud offerings, especially in the Platform as a Service category (Web Apps, Storage, Functions)
- A good understanding or working knowledge of the following tools: Terraform, Ansible, VSTS, ARM, Puppet, Chef, Jenkins, ELK, Grafana
- A good understanding or working knowledge of DNS, Load Balancer configuration, Active Directory and Cloud-based network infrastructure
- Experience of working in an agile environment and experience with agile methodologies such as TDD, Scrum, Kanban
- Understanding and experience of implementing a monitoring and alerting system for a microservice architecture
- Applied understanding of cloud security best practice
Apply
To apply, please send your CV to diana.barbu@commify.com
Diversity
We’re committed to building a team with a variety of backgrounds, views and skills, embracing our key values. The more diverse and inclusive we are, the stronger we are as a team. We encourage applications from all candidates with the relevant skills and experience.
Legal
Commify is committed to protecting the privacy and security of your information. Personal information submitted as part of the recruitment and selection process will only be used for these purposes. We will retain information for up to 12 months, after which it will be deleted or destroyed.