[Remote] Sr. Manager, Incident Management and Site Reliability Engineering
Note: This is a remote position open to candidates in the USA. Peloton Interactive provides a seamless experience for its members and is seeking a Sr. Manager, Incident Management and Site Reliability Engineering, to lead the team responsible for critical business lifecycles. This role involves managing a team of Site Reliability Engineers to ensure the resilience and scalability of the company's global SaaS ecosystem and to improve business continuity through proactive engineering and observability.
Responsibilities
- Lead, mentor, and grow a team of SREs. Conduct 1:1s, define career growth paths, and foster a culture of high accountability and psychological safety
- Transition from reactive support to proactive engineering. Align the team’s quarterly goals with broader Finance and Supply Chain digital transformation initiatives
- Architect observability across complex business paths (e.g., ensuring a customer order flows from e-commerce through supply chain into the financial ledger)
- Partner with business owners to define and track Service Level Objectives (SLOs) and Error Budgets for critical SaaS integrations
- Own the Major Incident Response process for corporate systems. Ensure "War Rooms" are efficient and result in actionable improvements
- Lead the Root Cause Analysis (RCA) process, ensuring a culture of continuous learning and systematic "toil" reduction
- Oversee the reliability of API-driven connections and identity management (Okta/Azure AD) across our tech stack
- Champion "Infrastructure as Code" (IaC) to automate manual hand-offs between business systems using Python, Go, or Terraform
Skills
- 8+ years in SRE, DevOps, or Production Engineering, with 2+ years of direct people management experience
- Deep understanding of Order-to-Cash or Procure-to-Pay cycles. You can translate a 'database lag' into its specific impact on warehouse shipping or financial reconciliation
- Experience managing enterprise ecosystems (NetSuite, SAP, Workday, Salesforce)
- Solid grasp of Networking (SD-WAN, VPNs), Identity (IAM), and Endpoint Management
- Proficiency with Datadog, Splunk, New Relic, or Prometheus
- Proven ability to communicate technical risk to non-technical stakeholders (CFO, General Counsel, Head of People)
Benefits
- Annual equity awards
- Employee Stock Purchase Plan
- Medical, dental and vision insurance
- Generous paid time off policy
- Short-term and long-term disability
- Access to mental health services
- 401(k), tuition reimbursement and student loan paydown plans
- Fertility and adoption support and up to 18 weeks of paid parental leave
- Child care and family care discounts
- Free access to Peloton Digital App and apparel and product discounts
- Commuter benefits and Citi Bike discount
- Pet insurance