👨🏻‍💻 Senior Software Engineer – SRE

Alkira

Est. Salary: ₹20 Lacs / year

Posted on: 27 Aug

Job Description

Are you passionate about building cloud networking infrastructure? Do you live and breathe Kubernetes and microservices? Join our innovative team at Alkira, Inc., a Network Infrastructure On-Demand company. As we continue to reinvent networking for the multi-cloud era and our customers continue to grow with us, we are expanding our Engineering team in India.

We are looking for Site Reliability Engineers who can manage, maintain, and troubleshoot Alkira's world-class cloud networking solution around the clock. In this role, you will work in a product company where you can sharpen your existing skills and gain exposure to a wide range of technologies and constructs, from microservices and DevOps methodologies to Kubernetes, Terraform, data networking, and security.

Responsibilities:

- Be responsible for the availability and integrity of the infrastructure that underpins Alkira's Cloud Networking platform.
- Hold the production systems together and troubleshoot issues that arise in production deployments.
- Provide 24x7 coverage as part of a scheduled shift and on-call rotation.
- Work with multiple tools such as Prometheus, Grafana, and Jira to monitor, manage, triage, and document infrastructure issues in real time.
- Mentor and guide junior engineers in best practices for DevOps, SRE, and cloud technologies, fostering a culture of knowledge sharing and continuous learning.
- Automate infrastructure deployment using CI/CD.
- Collaborate with software engineering teams to improve deployment processes, performance tuning, and application scalability.
- Build the tools needed to evolve how we maintain and monitor our solution.
- Develop and execute system and integration test plans.
- Contribute to the development and documentation of standard operating procedures, runbooks, and incident response plans.

Requirements:

- At least 5 years' experience managing production systems.
- Self-starter with a solution-oriented mindset; you see potential challenges as opportunities to learn and grow.
- Very strong hands-on experience with Linux systems.
- Experience with cloud providers: AWS, Azure, or GCP.
- Experience working in a 24x7 operations environment.
- Experience with monitoring and logging solutions (e.g., Prometheus, Grafana) and incident management processes.
- Experience with computer networking and network technologies.
- Experience with CI/CD pipelines (e.g., Concourse CI, Jenkins).
- Experience with Kubernetes.
- Excellent problem-solving skills and the ability to quickly grasp new concepts.
- Highly desirable: HashiCorp Certified: Terraform Associate.
