Principal Engineer, Product Software

 

  • JR-159040
  • Hybrid
  • Bengaluru
  • Technology
  • Full time

Who are we?

Equinix is the world’s digital infrastructure company®, shortening the path to connectivity to enable the innovations that enrich our work, life and planet. 
 

A place where bold ideas are welcomed, human connection is valued, and everyone has the opportunity to shape their future.

Help us challenge assumptions, uncover bias, and remove barriers—because progress starts with fresh ideas. You’ll find belonging, purpose, and a team that welcomes you—because when you feel valued, you’re empowered to do your best work.

Job Summary

We are looking for a highly skilled and forward-thinking Principal Engineer (SRE Architecture) to define and drive the next-generation reliability architecture across platforms and transformations. This role focuses on ensuring that every new transformation, service, and feature is onboarded with the right SRE foundations, observability practices, DR readiness, reliability hygiene, and AIOps enablement.

You will partner with engineering, platform, and transformation teams to architect reliability by design: defining SLI/SLO frameworks, ensuring Operational Readiness Review (ORR) compliance, building reusable reliability patterns, and driving the unified observability and AIOps strategy across the ecosystem. This role is deeply technical and hands-on, requiring strong architectural judgment, systems thinking, and leadership in guiding teams toward operational excellence.

Responsibilities

  • Lead Reliability Architecture for New Transformations
    Ensure all new programs, services, and platforms are onboarded with the right SRE foundations, covering DR readiness, observability, capacity, performance, and operational hygiene

  • Drive & Enable ORR (Operational Readiness Review) Compliance
    Define architectural guardrails, review technical designs, and ensure teams meet ORR requirements before production launch

  • Define and Operationalize SLI/SLO Frameworks
    Partner with engineering to define service-level indicators and objectives, ensuring transformations and new features adhere to reliability goals

  • Architect Unified Observability & AIOps Integration
    Ensure all services are onboarded correctly onto the unified observability stack, with proper instrumentation, dashboards, alerting, and correlation patterns

  • Define AIOps Enablement Use Cases
    Identify and define patterns that leverage telemetry, automation, and intelligence—including anomaly detection, event deduplication, and predictive insights

  • Reliability Architecture Reviews
    Conduct deep technical reviews of system architecture, focusing on resilience, failure modes, performance, availability, and operational workflows

  • Handhold Transformations Through Hypercare to BAU
    Guide new transformations end to end: architecture reviews → observability setup → DR completion → ORR readiness → launch → BAU (business-as-usual) stabilization

  • Build Reusable SRE Blueprints
    Create standardized templates and patterns for logging, monitoring, alerting, DR design, chaos readiness, and performance baselines

  • Partner with SRE & Platform Teams
    Work closely with Infrastructure, SRE, and Platform Engineering to ensure architectural alignment and drive adoption of reliability engineering best practices

Qualifications

  • 12+ years of experience in large-scale distributed systems, SRE, or platform engineering roles, with deep architectural responsibilities

  • Expertise in SRE foundations: SLI/SLOs, error budgets, incident response, capacity, chaos engineering, DR, reliability patterns

  • Strong hands-on background in observability stacks (Datadog, Splunk, Prometheus, Grafana, OpenTelemetry)

  • Experience with modern cloud-native architecture, container platforms, and microservices

  • Strong familiarity with DevOps practices, CI/CD pipelines, deployment strategies (blue/green, canary, progressive rollout)

  • Experience defining and enforcing ORR, reliability gates, operational hygiene, and launch readiness

  • Ability to influence architecture & engineering teams with strong systems thinking and operational rigor

  • Excellent communication skills with the ability to translate reliability goals into actionable engineering guidance

Equinix is committed to ensuring that our employment process is open to all individuals, including those with a disability.  If you are a qualified candidate and need assistance or an accommodation, please let us know by completing this form.

Equinix is an Equal Employment Opportunity and, in the U.S., an Affirmative Action employer.  All qualified applicants will receive consideration for employment without regard to unlawful consideration of race, color, religion, creed, national or ethnic origin, ancestry, place of birth, citizenship, sex, pregnancy / childbirth or related medical conditions, sexual orientation, gender identity or expression, marital or domestic partnership status, age, veteran or military status, physical or mental disability, medical condition, genetic information, political / organizational affiliation, status as a victim or family member of a victim of crime or abuse, or any other status protected by applicable law. 

We use artificial intelligence in our hiring process.