About Us
Blueshift is a venture-funded startup headquartered in San Francisco. Our AI-powered marketing platform empowers cutting-edge B2C brands to drive 1:1 marketing on every channel. With Blueshift, marketers are in full control of automating various forms of personalized messaging across every engagement channel.
Blueshift is trusted by leading digital brands like Udacity, LendingTree, BBC, and PayPal to automate their customer engagement marketing, and is recognized by Gartner as a 'Cool Vendor for AI in Marketing'.
Blueshift was founded by repeat entrepreneurs who previously built Mertado.com (acquired by Groupon to become Groupon Goods) and were part of the early team behind Kosmix (acquired by Walmart to become @WalmartLabs). We are backed by top-tier VCs including Nexus Venture Partners, Storm Ventures, Luma Partners, and SoftBank Venture Asia.
Blueshift has now started staffing a new development center in Pune, India. As part of Blueshift, you will get to work on cutting-edge technologies including machine learning, artificial intelligence, big data, and large-scale distributed data systems. This is an exciting opportunity for motivated individuals to build a great career.
Site Reliability Engineer / Cloud Operations Engineer
We are looking for an SRE / CloudOps Engineer who will be responsible for managing, building, and scaling our 1000+ node setup, which processes millions of real-time events and personalizations daily. We are looking for candidates with prior infrastructure experience, ideally in a startup or other fast-paced environment.
Responsibilities
- On-call duties to provide application support, incident management, and troubleshooting
- Improve reliability and drive down the burden of toil with tooling and automation
- Analyze complex systems from a reliability, resilience, and performance perspective
- Identify sources of instability in large-scale distributed systems and drive operational excellence
- Hands-on implementation and management of complex virtualized environments
- Implement scale-up / scale-down strategies based on various utilization metrics
- Author incident reports by coordinating with multiple engineering teams
- Identify and fill gaps in the monitoring & alerting system
- Periodic reporting of system status to the organization
Requirements
- 5+ years of relevant industry experience
- Prior hands-on experience with managing AWS and cloud infrastructure scaling to hundreds of nodes
- Experience with managing a container orchestration system
- Deep understanding of large-scale data systems and data pipelines, including managing NoSQL, SQL, and HDFS/Hadoop clusters
- Experience with modern SRE practices & tools
- Hands-on experience with active incident management
- Willingness & ability to work night shifts
Perks And Benefits
- Opportunity to be part of the early team in India
- Competitive salary along with stock option grants
- Excellent hospitalisation, personal accident, and term insurance coverage
- Located in a top-notch facility in Baner - one of the best neighbourhoods for tech startups
- Daily catered breakfast, lunch, and snacks along with a well-stocked pantry