
    Staff Software Engineer - Platform Engineering & SRE

    • JR-149963
    • Hybrid
    • Singapore
    • Information Technology
    • Full time

    Who are we?

    Equinix is the world’s digital infrastructure company®, operating over 260 data centers across the globe. Digital leaders harness Equinix's trusted platform to bring together and interconnect foundational infrastructure at software speed. Equinix enables organizations to access all the right places, partners and possibilities to scale with agility, speed the launch of digital services, deliver world-class experiences and multiply their value, while supporting their sustainability goals. 

     

    Our culture is based on collaboration and the growth and development of our teams.  We hire hardworking people who thrive on solving challenging problems and give them opportunities to hone new skills and try new approaches, as we grow our product portfolio with new software and network architecture solutions. We embrace diversity in thought and contribution and are committed to providing an equitable work environment that is foundational to our core values as a company and is vital to our success.

    Job Summary 

    We are looking for a highly skilled and motivated Platform Engineering & SRE Staff Engineer to join our team. As a Platform Engineering SRE, you will play a critical role in developing, maintaining and improving the reliability, scalability, and performance of our systems, ensuring seamless user experiences. This position blends software engineering and systems engineering expertise to create automated solutions for operational challenges. 

    Responsibilities 

    Reliability and Performance 

    • Ensure the high availability, reliability, and performance of production systems and services 

    • Implement and maintain disaster recovery plans and procedures 

    • Monitor and manage system health using metrics, logs, and tracing to proactively identify and resolve issues

    Automation and Infrastructure 

    • Automate repetitive tasks, including deployment, scaling, monitoring, and remediation of systems

    • Build and maintain infrastructure as code (IaC) using tools like Terraform, CloudFormation, or similar

    Incident Management 

    • Participate in incident response and troubleshooting efforts to minimize downtime and resolve issues quickly

    • Conduct root cause analysis for system failures and implement preventive measures to avoid future incidents

    • Respond to incidents, perform root cause analysis, and implement solutions to prevent recurrence

    • Maintain incident response playbooks and ensure efficient on-call rotations 

    Observability and Monitoring 

    • Design and implement monitoring solutions using tools like Prometheus, Grafana, Datadog, or similar

    Collaboration 

    • Work closely with development, QA, and operations teams to ensure smooth delivery of applications

    • Act as a bridge between software engineering and operations, advocating for DevOps best practices

    • Document system configurations, processes, and procedures to ensure knowledge sharing and maintain system integrity 

    Capacity and Scalability 

    • Conduct capacity planning and optimize system scalability to meet future demands

    • Implement strategies for horizontal and vertical scaling of applications 

    Security and Compliance 

    • Ensure infrastructure security by implementing best practices and addressing vulnerabilities 

    • Collaborate with the security team to meet compliance standards and audits 

    Data Engineering & Automation 

    • Develop and maintain scalable and efficient data pipelines

    • Automate data workflows for ETL/ELT processes, integrating data from various sources into data warehouses and other storage solutions

    • Develop and maintain solutions for data transformation and data modelling, and automate the orchestration of data processing

    Data Warehouse Management 

    • Implement and maintain modern data warehouse architectures, ensuring effective data storage, retrieval, and accessibility

    • Work with cloud-based data warehouses (e.g., BigQuery, Snowflake, Redshift) and optimize data models for analytics and reporting

    • Develop and manage dimensional models, star/snowflake schemas, and data marts for operational and analytical use cases 

    Real-time and Batch Data Processing 

    • Build and manage real-time and batch data pipelines for high-volume data ingestion, processing, and analytics

    • Leverage technologies such as Apache Kafka, Apache Beam, Apache Spark, and Google Cloud Dataflow for streaming and batch processing 

    Qualifications 

    Experience 

    • 5+ years of experience in a data platform role, including Site Reliability Engineering, DevOps, or Systems Engineering

    Technical Skills 

    • Strong programming skills in languages such as Python, Java, or similar

    • Experience in developing data ingestion pipelines, data governance, data quality, and automation

    • Experience in cloud platforms such as Google Cloud / AWS / Azure

    • Hands-on experience with CI/CD pipelines using tools like GitHub Actions or Jenkins

    • Exposure to containerization and orchestration technologies like Docker and Kubernetes

    • Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK Stack)

    Methodologies 

    • Knowledge of Software Engineering, Data Modelling and SDLC

    • Understanding of SRE principles, including SLIs, SLOs, and error budgets

    • Knowledge of incident management frameworks and root cause analysis techniques 

    Soft Skills 

    • Strong analytical and problem-solving skills

    • Excellent communication and collaboration abilities 

    Preferred Qualifications 

    • Familiarity with configuration management tools (e.g., Ansible, Puppet, Chef)

    • Background in performance testing and load testing 

     

    Tools & Technologies 

    • Google Cloud Platform 

    • Python, Java, SQL 

    • Apache Beam/Spark/Google Cloud Dataflow 

    • Apache Airflow 

    • Prometheus, Grafana, ELK Stack, Terraform, Ansible, Puppet, GitHub Actions, Kafka, Docker, Kubernetes

    Equinix is committed to ensuring that our employment process is open to all individuals, including those with a disability.  If you are a qualified candidate and need assistance or an accommodation, please let us know by completing this form.

    Equinix is an Equal Employment Opportunity and, in the U.S., an Affirmative Action employer.  All qualified applicants will receive consideration for employment without regard to unlawful consideration of race, color, religion, creed, national or ethnic origin, ancestry, place of birth, citizenship, sex, pregnancy / childbirth or related medical conditions, sexual orientation, gender identity or expression, marital or domestic partnership status, age, veteran or military status, physical or mental disability, medical condition, genetic information, political / organizational affiliation, status as a victim or family member of a victim of crime or abuse, or any other status protected by applicable law. 
