Position - Site Reliability Engineer
Location - Hyderabad
Exp - 5 to 8 Yrs
Working 2-3 days from client office is required.
There will be F2F internal and client interviews.
JD
TECH STACK:-----------
Containerisation: Docker, Kubernetes, Rancher, EKS, ECS, GKE
Cloud: AWS, GCP
IaC: Terraform, CloudFormation / CloudComposer, Chef / Ansible
Infra Monitoring: Prometheus, Datadog, Alertmanager, Thanos, AWS CloudWatch
CI/CD: GitLab CI/CD, Jenkins
Scripting: Python, Golang
VCS: GitLab, Perforce, Subversion
OS: Ubuntu, CentOS, Amazon Linux, Red Hat Linux
Nice to Have: Experience with supporting systems orchestrated on AWS OpsWorks
RESPONSIBILITIES:-----------------
* Implement, own, maintain, monitor and support the backend servers and microservices infrastructure for the studio titles, which run on a wide variety of tech stacks
* Implement/maintain various automation tools for development, testing, operations and IT infrastructure
* Be available for on-call duty during production outages as part of 24/7 PagerDuty support
* Work closely with all disciplines and stakeholders and keep them informed of everything that affects them
* Define and set development, test, release, update, and support processes for SRE operations
* Bring excellent troubleshooting skills in systems and infrastructure engineering
* Monitor adherence to processes throughout the entire lifecycle, and update existing processes or create new ones to improve them and minimise workflow times
* Encourage and build automated processes wherever possible
* Identify and deploy cybersecurity measures by continuously performing vulnerability assessments and risk management
* Manage incidents and perform root cause analysis
* Monitor and measure customer experience