We are hiring for a next-generation telecoms software company that is seeking a Senior Software Engineer to join its expanding team.
Key Accountabilities & Activities
Utilise Kubernetes to manage containerised applications and ensure their smooth operation and high availability:
- Deploy and manage containerised applications within a Kubernetes cluster, adhering to best practices for security, scalability, and performance.
- Configure and maintain Kubernetes resources (e.g., deployments, pods, services) to ensure effective resource utilisation and application functionality.
- Develop, deploy and maintain Kubernetes components (operators, controllers, etc.), taking advantage of Kubernetes' flexibility to meet application development needs.
- Implement monitoring and alerting strategies to proactively identify and address potential issues within a Kubernetes environment.
- Automate tasks using Kubernetes features (e.g., deployments, rolling updates, scaling) to streamline application lifecycle management and minimise manual intervention.
- Collaborate with DevOps and SRE teams to ensure seamless integration of Kubernetes with existing infrastructure and deployment workflows.
- Troubleshoot and diagnose issues related to containerised applications and Kubernetes resources, identifying root causes and implementing effective solutions.
- Stay up-to-date on the latest advancements and best practices in Kubernetes by continuously learning and expanding your knowledge in this domain.
Implement monitoring and observability practices using Grafana and other relevant tools to gain deep insights into system health and performance:
- Design and implement a comprehensive monitoring and observability strategy using Grafana and other relevant tools (e.g., Prometheus, Loki, Jaeger) to collect, visualise, and analyse system metrics, logs, and traces.
- Configure and maintain dashboards within Grafana to effectively visualise key performance indicators (KPIs) and identify potential anomalies or bottlenecks.
- Set up alerting mechanisms to proactively notify relevant stakeholders of critical issues or performance deviations.
- Correlate and analyse data from various sources (metrics, logs, traces) to gain a deeper understanding of system behaviour and identify root causes of issues.
- Optimise and refine existing monitoring and observability practices based on evolving system needs and insights gained from data analysis.
- Collaborate with engineering teams to translate insights from monitoring data into actionable improvements for system health and performance.
- Stay informed about the latest advancements in monitoring and observability tools and techniques, continuously learning and adapting the approach for optimal results.
Design, develop, and implement scalable and efficient platform-related systems using Go:
- Create HLDs (High-Level Designs) & LLDs (Low-Level Designs) in compliance with security & design authority mandates
- Adhere to high-quality development principles while delivering solutions on time & within budget
- Design, develop, & test software, following established security & architectural standards
- Develop, refine, & tune integrations between application elements
- Package & support deployment of releases following the SRE (Site Reliability Engineering) deployment process
- Prepare reports, manuals & other documentation on the status, operation, & maintenance of the software
Collaborate with DevOps engineers to automate infrastructure provisioning and deployment processes:
- Identify and document opportunities for automation within the infrastructure provisioning and deployment processes.
- Work with DevOps engineers to design and implement automation solutions using relevant tools and technologies (e.g., Infrastructure as Code tools, CI/CD pipelines).
- Contribute to the development and maintenance of automation scripts and configurations, ensuring they are well-documented, efficient, and reliable.
- Test and validate automated deployments, identifying and addressing any potential issues.
- Collaborate on continuous improvement efforts, iterating on existing automation workflows and exploring new approaches to further streamline infrastructure management.
- Participate in code reviews for automation scripts, providing constructive feedback to ensure code quality and maintainability.
- Stay updated on the latest advancements in DevOps practices and automation tools, continuously learning and incorporating new knowledge into the collaboration process.
- Research & evaluate emerging developments, technologies & best practice within the development space
- Undertake ad-hoc projects & other activities as required
Experience & Skills
Essential
1. Strong Kubernetes experience
2. Experience deploying, orchestrating and automating business-critical services
3. Experience using Kustomize and Helm tools for deployment
4. Strong Git & CI/CD experience
5. Proficient in Go software development
6. Cloud experience (AWS, GCP, Azure)
7. Proven ability to work independently & collaboratively in a fast-paced technical environment
8. Proficiency in written & verbal English
Desirable
1. Experience with ArgoCD, PostgreSQL and NATS is a bonus
2. TDD (Test Driven Development) experience
3. Understanding of workflow & orchestration
4. Knowledge & experience of the telecommunications industry & technologies