Senior Staff Cloud Platform Engineer
- JR-161608
- Hybrid
- Bengaluru
- Technology
- Full time
Who are we?
Equinix is the world’s digital infrastructure company®, shortening the path to connectivity to enable the innovations that enrich our work, life and planet.
We are a place where bold ideas are welcomed, human connection is valued, and everyone has the opportunity to shape their future.
Job Summary
Analyzes business and engineering requirements to determine the feasibility of platform and infrastructure designs within time, cost, scalability, and reliability constraints. Designs, builds, and operates the cloud platform and developer tooling that engineering teams build on top of — providing self-service infrastructure, paved-road workflows, and the operational guardrails that keep large-scale systems secure, performant, and cost-efficient. Acts as a senior technical leader across multiple teams and domains.
Responsibilities
Requirements Analysis
Reviews, analyzes, and gives feedback on platform requirements, capacity/scaling needs, and functional designs
Translates product and engineering needs into platform capabilities and self-service tooling
Attends and drives requirement-definition meetings with engineering, security, and product stakeholders
Platform & Infrastructure Architecture
Participates in and leads the architectural review process for cloud and platform initiatives
Defines reference architectures, landing zones, and multi-account / multi-region strategies
Owns architectural decisions around compute, networking, storage, identity, and data services
Platform Design
Designs larger platform enhancements, cross-team / cross-system infrastructure, and developer-experience improvements
Builds internal developer platforms (IDPs), golden paths, and reusable infrastructure modules
Conducts design reviews and provides technical leadership across teams
Development / Engineering
Develops and maintains infrastructure-as-code, platform services, automation, and integrations
Fixes defects; participates in and conducts peer / code reviews
Follows and proposes infrastructure, IaC, and operational standards and processes
Conducts performance analysis, tuning, and optimization of platform components and cloud spend
Quality & Testing
Develops unit, integration, and infrastructure tests; defines test strategies for IaC and platform tooling
Implements automated validation, policy-as-code checks, and pre-deployment gates
Logs, manages, and triages issues; recommends and integrates testing frameworks
DevSecOps
Defines the roadmap for automation, CI/CD, and tooling, and articulates its value to engineering practices
Designs and maintains secure, automated delivery pipelines (build, test, scan, deploy) with GitOps-based workflows
Embeds DevSecOps throughout the lifecycle — “shift-left” security via SAST/DAST, dependency and container image scanning, IaC security scanning, secrets detection, and software supply-chain controls (SBOMs, signed artifacts, provenance)
Implements policy-as-code guardrails (e.g., OPA/Conftest) and automated compliance gates so security and governance are enforced in the pipeline rather than after the fact
Drives infrastructure and pipeline requirements; reviews release planning and deployment lists
Ensures quality, security, and completeness of deployments; champions progressive delivery (canary, blue/green, feature flags) to reduce release risk
Service Ownership & SLO-Driven Operations
Promotes a “you build it, you run it” culture and clear service ownership across engineering teams
Defines and operationalizes SLIs, SLOs, and error budgets; uses error-budget policy to balance feature velocity against reliability
Takes accountability for operational SLAs and the end-to-end health of owned platform services
Establishes on-call, incident management, and blameless post-incident review practices; owns L2/L3 debugging and leads major-incident response
Builds observability and alerting that ties directly to SLOs, reducing alert noise and improving MTTD/MTTR
Drives reliability improvements through capacity planning, chaos / resilience testing, and toil reduction
Infrastructure-as-Code Approach
Treats all infrastructure as code — declarative, version-controlled, peer-reviewed, and deployed through automated pipelines (no manual / console changes)
Establishes reusable, composable IaC modules and standards using Terraform with Terragrunt (for DRY configuration, environment / account scaling, and remote-state orchestration) and AWS CloudFormation where native provisioning is preferred — providing self-service, paved-road infrastructure for engineering teams
Enforces immutable infrastructure, environment parity, and reproducible builds
Integrates automated IaC validation, drift detection, security / policy scanning, and pre-apply checks into the workflow
Maintains clear state management, secrets handling, and change-control practices for infrastructure changes
Reporting
Reports status on platform initiatives and operational health
Defines and drives release management planning
Technical Project Management
Provides level-of-effort (LOE) estimates
Manages assigned platform initiatives against schedule and plan; provides leadership and planning for large enhancements and projects
Qualifications
10–12+ years of software / infrastructure engineering experience, with significant time in cloud platform, infrastructure, DevOps, or SRE roles
Cloud platforms — deep, hands-on expertise with at least one major provider (AWS, Azure, or GCP); working knowledge of a second is a plus
Infrastructure as Code — strong, hands-on proficiency with Terraform and Terragrunt for module composition, DRY configuration, and multi-account / multi-environment management; AWS CloudFormation experience required. Familiarity with Pulumi or Bicep/ARM is a plus
Containers & orchestration — production experience with Docker and Kubernetes (EKS/AKS/GKE) and Helm
CI/CD & GitOps — e.g., GitHub Actions, GitLab CI, Jenkins, Argo CD, Spinnaker
DevSecOps — shift-left security tooling: SAST/DAST, container and dependency scanning, secrets detection, policy-as-code (OPA), and supply-chain security (SBOMs, artifact signing)
Programming / scripting — proficiency in Python and/or Go for tooling and automation; Bash for scripting
Cloud networking — VPC, load balancing, DNS, CDN, ingress/egress design, and service mesh (Istio/Linkerd)
Security & identity — IAM, secrets management (e.g., Vault), least-privilege, and compliance awareness
Observability — Prometheus, Grafana, Datadog, ELK/Splunk, OpenTelemetry
Reliability / SRE — SLI/SLO/SLA definition, error budgets, capacity planning, incident management, and on-call leadership
FinOps — cloud cost optimization and accountability
Technical leadership — proven track record influencing architecture at scale across multiple teams
Education — Bachelor’s in Computer Science, Computer Engineering, or equivalent practical experience
Leadership & Soft-Skill Competencies
Technical leadership & influence — sets technical direction across multiple teams and drives alignment through influence rather than authority
Communication — explains complex platform and architecture concepts clearly to both engineers and non-technical stakeholders; strong written and verbal skills
Mentorship — coaches and grows senior and mid-level engineers; raises the bar through code / design reviews and knowledge sharing
Collaboration & cross-functional partnership — works effectively with product, security, networking, and engineering teams to deliver shared outcomes
Ownership & accountability — takes end-to-end responsibility for platform reliability, security, and cost, including during incidents
Pragmatic decision-making — balances speed, cost, risk, and quality; makes sound trade-offs under ambiguity and explains the reasoning
Stakeholder management — manages expectations, priorities, and competing demands across teams and leadership
Continuous improvement mindset — drives a culture of automation, learning, blameless retrospectives, and operational excellence
Calm under pressure — leads effectively during high-severity incidents and high-stakes delivery timelines
Preferred / Nice-to-Have
Multi-cloud or hybrid-cloud experience
Platform engineering / internal developer platform (IDP) experience (e.g., Backstage)
Experience with event streaming / messaging (Kafka), caching (Redis), and managed data services
Compliance / regulatory exposure (SOC 2, ISO 27001, PCI, HIPAA, FedRAMP)
Relevant certifications (e.g., AWS/Azure/GCP Professional, CKA/CKAD)
Contributions to open-source infrastructure tooling
Equinix is committed to ensuring that our employment process is open to all individuals, including those with a disability. If you are a qualified candidate and need assistance or an accommodation, please let us know by completing this form.
Equinix is an Equal Employment Opportunity and, in the U.S., an Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to unlawful consideration of race, color, religion, creed, national or ethnic origin, ancestry, place of birth, citizenship, sex, pregnancy / childbirth or related medical conditions, sexual orientation, gender identity or expression, marital or domestic partnership status, age, veteran or military status, physical or mental disability, medical condition, genetic information, political / organizational affiliation, status as a victim or family member of a victim of crime or abuse, or any other status protected by applicable law.
We use artificial intelligence in our hiring process.
This posting is for a backfill position, meaning it is to fill an existing vacancy within our organization.