720 Reliability Engineer Jobs - São Paulo

Reliability Engineer

São Paulo, São Paulo ALSTOM Gruppe

Job Description

Date: 13 Aug 2025

Company: Alstom

At Alstom, we understand transport networks and what moves people. From high-speed trains, metros, monorails, and trams, to turnkey systems, services, infrastructure, signalling, and digital mobility, we offer our diverse customers the broadest portfolio in the industry. Every day, 80,000 colleagues lead the way to greener and smarter mobility worldwide, connecting cities as we reduce carbon and replace cars.

Could you be the Reliability Engineer in São Paulo we’re looking for?

Your future role

Take on a new challenge and apply your **Reliability, Availability, and Maintainability (RAM)** expertise in a dynamic and innovative field. You’ll work alongside **collaborative and forward-thinking** teammates.

You'll play a key role in ensuring the reliability of our products, projects, and tenders, contributing to the optimization of performance and cost-effectiveness. Day-to-day, you’ll work closely with cross-functional teams across the business (engineering, maintenance, and product development), perform technical reliability analyses, support continuous improvement initiatives, and much more.

You’ll specifically take care of **RAM calculations, predictions, and verifications**, but also **develop maintenance plans and recommendations** to meet performance targets.

We’ll look to you for:

Managing RAM activities on projects, tenders, and product development, ensuring effective communication with teams.

Performing technical RAM analyses using established methods, standards, and guidelines.

Carrying out reliability calculations, statistical estimations, and producing detailed plans and reports.

Making design, maintenance, and operational recommendations to meet RAM targets.

Defending technical choices with internal teams and customers.

Implementing RAM monitoring processes and issuing key performance indicators (KPIs).

Supporting continuous improvement through the creation of integrated systems.

All about you

We value passion and attitude over experience. That’s why we don’t expect you to have every single skill. Instead, we’ve listed some that we think will help you succeed and grow in this role:

Bachelor’s degree in Electrical Engineering, Mechanical Engineering, Electronics Engineering, Mechatronics Engineering, or Automation Engineering.

Experience or understanding of electrical/electronic hardware, schematics, analog and digital electronics, and probability/statistical concepts.

Knowledge of troubleshooting at the board level, failure analysis, and root cause diagnosis.

Familiarity with RAMS techniques, reliability calculations, and statistical tools (e.g., ReliaSoft BlockSim, Weibull+, or similar).

A technical certification or equivalent experience in reliability engineering is a plus.

Proficiency in MS-Office Suite, especially advanced skills in MS-Word and MS-Excel.

Strong analytical problem-solving skills and the ability to work in a fast-paced environment.

Things you’ll enjoy

Join us on a life-long transformative journey – the rail industry is here to stay, so you can grow and develop new skills and experiences throughout your career. You’ll also:

Enjoy stability, challenges, and a long-term career

Work with cutting-edge reliability engineering techniques and tools.

Collaborate with transverse teams and helpful colleagues.

Contribute to innovative projects that shape the future of mobility.

Utilise our **flexible and inclusive** working environment.

Steer your career in whatever direction you choose across functions and countries.

Benefit from our investment in your development, through award-winning learning.

Progress towards leadership roles in RAM engineering or other areas of interest.

Benefit from a fair and dynamic reward package that recognises your performance and potential, plus comprehensive and competitive social coverage (life, medical, pension).

You don’t need to be a train enthusiast to thrive with us. We guarantee that when you step onto one of our trains with your friends or family, you’ll be proud. If you’re up for the challenge, we’d love to hear from you!

Important to note

As a global business, we’re an equal-opportunity employer that celebrates diversity across the 63 countries we operate in. We’re committed to creating an inclusive workplace for everyone.

Data Reliability Engineer

São Paulo, São Paulo TELUS Digital Brazil

Posted today

Job Description

Overview

Welcome to TELUS Digital, where innovation meets impact. As an award-winning digital product consultancy, we're shaping the future of digital experiences through cutting-edge technology, agile thinking, and a culture that puts people first. We are the global digital section of TELUS, one of Canada’s largest telecommunications providers. Our global teams deliver transformative digital solutions and customer experiences for industry leaders in consumer electronics, finance, telecommunications, and utilities.

Location and flexibility

This role can be fully remote for candidates based in the states of São Paulo and Rio Grande do Sul, as well as in the cities of Rio de Janeiro and Belo Horizonte, due to team distribution and occasional in-person opportunities. If you are based in São Paulo or Porto Alegre, you are welcome to work from one of our offices on a flexible schedule.

Qualifications

5+ years of hands-on experience in supporting data engineering teams, strongly emphasizing data pipeline enhancement and optimization, and data integration.
Proficient in cloud computing, preferably Google Cloud Platform (GCP), but AWS and Azure are also valid.
Experience with cloud data-related services such as BigQuery, Dataflow, Cloud Composer, Dataproc, Cloud Storage, Pub/Sub, or the correlated services from other providers.
Solid proficiency with Python in terms of data processing.
Knowledge of SQL and experience with relational databases.
Proven experience optimizing data pipelines toward efficiency, reducing operational costs, and reducing the number of issues/failures.
Solid knowledge of monitoring, troubleshooting, and resolving data pipeline issues.
Familiarity with version control systems like Git.
Strong English communication and documentation skills.

Responsibilities

Design and implement scalable data pipeline architectures in collaboration with Data Engineers.
Continuously optimize data pipeline efficiency to reduce operational costs and minimize issues and failures.
Monitor performance and reliability of data pipelines, enhancing reliability through data quality, analysis, and testing.
Build and manage automated alerting systems for data pipeline issues.
Automate repetitive tasks in data processing and management.
Develop and manage disaster recovery and backup plans.
In collaboration with other Data Engineering teams, conduct capacity planning for data storage and processing needs.
Develop and maintain comprehensive documentation for data pipeline systems and processes, and provide knowledge transfer to data-related teams.
Monitor, troubleshoot and resolve production issues in data processing workflows.
Maintain infrastructure reliability for data pipelines, enterprise datahub, HPBI, and MDM systems.
Conduct post-incident reviews and implement improvements for data pipelines.

Why TELUS Digital?

At TELUS Digital, you’ll work with world-class brands like FOX, HBO, PepsiCo, and Domino's, building transformative digital products that impact millions. Our global reach allows you to collaborate with diverse, international teams, solving complex problems and delivering tech-driven solutions that matter.

We thrive on engineering excellence, using the latest technologies in cloud computing, AI, machine learning, DevOps, microservices architecture, and data engineering. Our teams embrace Agile methodologies, CI/CD pipelines, and a DevOps-first mindset to deliver solutions at scale.

In addition to being part of an international and innovative consultancy company, you will have:

A Global Innovation Hub: Be part of an international consultancy at the forefront of technology
Work-Life Harmony: Enjoy flexible hours and autonomy to balance your professional and personal life
Cutting-Edge Tech Playground: Dive into the latest technologies and shape the future of digital solutions
Prestigious Partnerships: Collaborate with world-renowned brands, making a real impact in the market
Growth-Centric Environment: Thrive in our collaborative ecosystem with a clear career development path
Global Exposure: Embrace optional international travel opportunities to broaden your horizons

Equality

At TELUS Digital, we are proud to be an equal opportunity employer and are committed to creating a diverse and inclusive workplace. We are committed to building an inclusive team that represents a variety of backgrounds, perspectives, beliefs, and experiences. Therefore we provide equal employment opportunities to all employees and applicants regardless of race, color, religion, gender identity, sexual orientation, national origin, age, or disability.

We will only use the information you provide to process your application and to produce tracking statistics. Since we do not request personal data deemed sensitive, we ask you to abstain from sharing that information with us.

For more information on how we use your information, see our Privacy Policy.

Seniority level: Mid-Senior level

Employment type: Full-time

Job function: Engineering and Information Technology

Industries: Software Development

Site Reliability Engineer

São Paulo, São Paulo Willis Towers Watson

Posted today

Job Description

Summary:

We’re looking for an experienced Platform/Infrastructure Engineer with a strong Microsoft Azure background and deep knowledge of Kubernetes. You'll play a key role in designing, deploying, and maintaining infrastructure and services that power our products. This role requires hands-on experience with automation, modern IaC practices, CI/CD, and maintaining production-grade environments.

The Role:

Operate, monitor, and improve cloud infrastructure for high-availability services in Azure
Deploy, configure and manage Kubernetes workloads at scale, including the use of Helm, ArgoCD, Flux, or similar GitOps tools
Build and maintain CI/CD pipelines using Azure DevOps or similar tooling
Write and maintain Infrastructure as Code using Terraform or OpenTofu
Develop scripts and automation to support infrastructure and deployment workflows - PowerShell is preferred
Collaborate with engineering teams to support platform reliability and enable delivery
Maintain visibility and awareness through monitoring and logging tools such as Datadog, Azure Monitor, and App Insights
Support incident resolution and participate in an on-call rota to help maintain service uptime

Qualifications

The Requirements:

Essential Experience:

Proven experience in a Platform, Infrastructure, or DevOps engineering role
Hands-on experience operating 24x7 services in a public cloud, ideally Azure
Strong experience managing infrastructure using Terraform or OpenTofu
Experience managing and scaling Kubernetes clusters in production environments
Proficient with CI/CD tooling, preferably Azure DevOps (YAML pipelines)
Strong scripting skills using PowerShell
Experience with monitoring and logging solutions such as Azure Monitor, App Insights, or similar
Clear communicator with the ability to collaborate across cross-functional teams

Nice to Have:

Azure certifications (e.g. Azure Administrator, Azure DevOps Engineer)
Experience with GitOps and tools such as ArgoCD or Flux
Familiarity with Configuration as Code tools like Ansible or Puppet
Exposure to large-scale distributed systems or high-volume web APIs
Awareness of incident response processes and platform reliability best practices

Equal Opportunity Employer

At WTW, we believe difference makes us stronger. We want our workforce to reflect the different and varied markets we operate in and to build a culture of inclusivity that makes colleagues feel welcome, valued and empowered to bring their whole selves to work every day. We are an equal opportunity employer committed to fostering an inclusive work environment throughout our organisation. We embrace all types of diversity.

At WTW, we trust you to know your work and the people, tools and environment you need to be successful. The majority of our colleagues work in a “hybrid” style, with a mix of remote, in-person and in-office interactions dependent on the needs of the team, role and clients. Our flexibility is rooted in trust and “hybrid” is not a one-size-fits-all solution.

Site Reliability Engineer

São Paulo, São Paulo Thales Group

Posted 3 days ago

Job Description

Remote type: On-Site
Locations: São Paulo
Time type: Full time
Posted on: 2 Days Ago
Job requisition ID: R

Thales people architect identity management and data protection solutions at the heart of digital security. Businesses and governments rely on us to bring trust to the billions of digital interactions they have with people. Our technologies and services help banks exchange funds, people cross borders, energy become smarter, and much more. More than 30,000 organizations already rely on us to verify the identities of people and things, grant access to digital services, analyze vast quantities of information, and encrypt data to make the connected world more secure.

**Site Reliability Engineer**

This position follows an **on-site** model in our **Berrini** unit.

**Position Summary**

The candidate will work as a member of the SRE team, helping the organization to continuously ensure the reliability, availability, and performance of large-scale ODC services. The SRE will work closely with development teams to design, build, and maintain scalable and reliable infrastructure, automate processes, monitor system health, and respond to incidents effectively, with a mindset of efficiency in day-to-day activities. The SRE will consistently adopt ITIL and Agile methodologies and processes, coaching and mentoring on best practices, and will take on the whole lifecycle on public cloud, ensuring that external customer SLAs and internal OLAs are met.

**Key Areas of Responsibility**

* Develop and maintain Infrastructure as Code and automation tools.
* Integrate, operate, and support 7x24 mission-critical services with 5x9 availability on public cloud.
* Ensure tier 1 / Platinum SLAs.
* Review technical products and understand customer requirements.
* Perform regular tuning.
* Work with distributed teams worldwide.
* Define the business continuity strategy for operated services on public cloud.
* Motivate the team on a daily basis through Agile ceremonies (daily, refinement, planning).
* Suggest indicators for team monitoring.
* Facilitate exchanges with the many stakeholders.
* Continuously improve the reliability, performance, and security of the services.
* Collaborate with Service Delivery Managers on traffic trends and analyze the impact of mid-term business changes on capacity requirements.
* Participate in capacity management processes and security audits.
* Design and implement changes to the systems.
* Adapt solution parameters to support architecture evolutions.
* Maintain and enhance internal tools to improve service industrialization.
* Participate in the presales, deployment, and integration of the solutions from the Support perspective.
* Define production requirements.

**Minimum Qualifications**

* Bachelor's degree in Information Technology or a related field
* Experience in the design, development, and implementation of applications
* Experience in public cloud (GCP or AWS)
* Minimum C1 English (advanced level)
* Strong experience with Kubernetes (certification)
* Strong experience with Apache HTTP Server
* Strong experience with TLS >= 1.2
* Ability to work with SRE engineers during integration and operation project phases
* Strong experience working in Agile teams
* ITIL/Agile certification
* Experience in embedding agile performance metrics to drive accountability
* Effective verbal and written communication skills
* Strong working experience with at least one scripting language (Shell or Python) is required

If you’re excited about working with Thales, but not meeting the requirements for this position, we encourage you to join our Talent Community!

**What We Offer**

Thales provides an extensive benefits program for all full-time employees working 30 or more hours per week and their eligible dependents, including the following:

* Elective Health and Dental plans.
* Retirement Savings Plan with a company contribution and a match, and without vesting period.
* Company paid holidays, vacation days, and paid sick leave.
* Company provided Life Insurance.

At Thales we provide CAREERS and not only jobs. With Thales employing 80,000 employees in 68 countries, our mobility policy enables thousands of employees each year to develop their careers at home and abroad, in their existing areas of expertise or by branching out into new fields. Together we believe that embracing flexibility is a smarter way of working. Great journeys start here, apply now!

Site Reliability Engineer

São Paulo, São Paulo CloudWalk, Inc.

Posted 3 days ago

Job Description

About CloudWalk:

We are not just another fintech unicorn. We are a pack of dreamers, makers, and tech enthusiasts building the future of payments. With millions of happy customers and a hunger for innovation, we're now expanding our neural network - literally and metaphorically.

The Site Reliability Engineering (SRE) team aims to maximize the engineering velocity of developer teams while keeping products reliable. Working with us you will be responsible for the maintenance of sandbox and staging environments and the automation pipeline to ensure continuous testing.

What You'll Be Doing:

Help to develop and spread the DevOps culture
Create and maintain development sandbox environments
Automate and orchestrate workloads in cloud environments
Assist in the configuration, use, and management of test versions and test data
Integrate automated tests in the delivery pipeline
Horizontally interact with other SRE and Quality Engineers throughout CloudWalk's engineering team

What You Need To Succeed:

Experience with cloud environments (GCP, AWS)
Solid knowledge of Relational Databases, SQL, and ORM technologies
Experience with CI tools
Experience with container technologies and orchestrators
A high bar for quality
Soft skills to master communication and collaboration throughout multiple teams

Join us at CloudWalk, where we’re not just engineering solutions; we’re building a smarter, AI-driven future for payments—together.

By applying for this position, your data will be processed as per CloudWalk's Privacy Policy, which you can read here in Portuguese and here in English.

Data Reliability Engineer

São Paulo, São Paulo TELUS Digital Brazil

Posted 4 days ago

Job Description

Who We Are

Welcome to TELUS Digital, where innovation meets impact. As an award-winning digital product consultancy, we're shaping the future of digital experiences through cutting-edge technology, agile thinking, and a culture that puts people first.

We are the global digital section of TELUS, one of Canada’s largest telecommunications providers. Our global teams deliver transformative digital solutions and customer experiences for industry leaders in consumer electronics, finance, telecommunications, and utilities. With robust multi-shore delivery capabilities, multi-language programs, and secure infrastructure, we ensure exceptional service backed by our multi-billion-dollar parent company.

Qualifications

5+ years of hands-on experience in supporting data engineering teams, strongly emphasizing data pipeline enhancement and optimization, and data integration.
Proficient in cloud computing, preferably Google Cloud Platform (GCP), but AWS and Azure are also valid.
Experience with cloud data-related services such as BigQuery, Dataflow, Cloud Composer, Dataproc, Cloud Storage, Pub/Sub, or the correlated services from other providers.
Solid proficiency with Python in terms of data processing.
Knowledge of SQL and experience with relational databases.
Proven experience optimizing data pipelines toward efficiency, reducing operational costs, and reducing the number of issues/failures.
Solid knowledge of monitoring, troubleshooting, and resolving data pipeline issues.
Familiarity with version control systems like Git.
Strong English communication and documentation skills.

Responsibilities:

Design and implement scalable data pipeline architectures in collaboration with Data Engineers.
Continuously optimize data pipeline efficiency to reduce operational costs and minimize issues and failures.
Monitor performance and reliability of data pipelines, enhancing reliability through data quality, analysis, and testing.
Build and manage automated alerting systems for data pipeline issues.
Automate repetitive tasks in data processing and management.
Develop and manage disaster recovery and backup plans.
In collaboration with other Data Engineering teams, conduct capacity planning for data storage and processing needs.
Develop and maintain comprehensive documentation for data pipeline systems and processes, and provide knowledge transfer to data-related teams.
Monitor, troubleshoot and resolve production issues in data processing workflows.
Maintain infrastructure reliability for data pipelines, enterprise datahub, HPBI, and MDM systems.
Conduct post-incident reviews and implement improvements for data pipelines.

Why TELUS Digital?

We thrive on engineering excellence, using the latest technologies in cloud computing, AI, machine learning, DevOps, microservices architecture, and data engineering. Our teams embrace Agile methodologies, continuous integration and deployment (CI/CD) pipelines, and a DevOps-first mindset to deliver solutions at scale.

In addition to being part of an international and innovative consultancy company, you will have:

A Global Innovation Hub: Be part of an international consultancy at the forefront of technology
Work-Life Harmony: Enjoy flexible hours and autonomy to balance your professional and personal life
Cutting-Edge Tech Playground: Dive into the latest technologies and shape the future of digital solutions
Prestigious Partnerships: Collaborate with world-renowned brands, making a real impact in the market
Growth-Centric Environment: Thrive in our collaborative ecosystem with a clear career development path
Global Exposure: Embrace optional international travel opportunities to broaden your horizons

Some of our benefits:

Health and dental plan
Life insurance
Monthly voucher for meals, culture, education, health and mobility
Child care assistance and more!

Equality

For more information on how we use your information, see our Privacy Policy.

Site Reliability Engineer

São Paulo, São Paulo Volkswagen Financial Services | Brazil

Posted 5 days ago

Job Description

• Ensure the availability, resilience, and scalability of production services.

• Create and maintain monitoring, logging, tracing, and intelligent alerting for critical systems.

• Develop and maintain CI/CD pipelines for agile and secure deliveries.

• Implement Infrastructure as Code (IaC) using Terraform, Ansible, or CloudFormation.

• Work with Kubernetes and container orchestration for distributed environments.

• Define and track SLIs, SLOs, and SLAs together with the engineering teams.

• Lead incident analyses and post-mortems, proposing continuous improvements.

• Work with security, governance, and compliance in cloud environments.

*Desirable requirements*

• Proven experience as an SRE, DevOps, or Infrastructure Engineer.

• Strong command of cloud computing (AWS, GCP, or Azure).

• Strong experience with Kubernetes and Docker.

• Advanced knowledge of observability (Prometheus, Grafana, Datadog, New Relic, etc.).

• Knowledge of automation languages (Python, Go, Shell Script).

• Hands-on practice with SRE principles: SLIs, SLOs, SLAs, and error budgets.

*Differentiators*

• Cloud certifications (AWS Solutions Architect, GCP Professional Cloud Engineer, Azure Expert).

• Experience with cloud migration and application modernization.

• Knowledge of microservices architectures.

Site Reliability Engineer

São Paulo, São Paulo Personetics

Posted 8 days ago

Job Description

Personetics is shaping the Cognitive Banking era, harnessing AI to help banks anticipate customer needs, provide actionable insights, and deliver intelligent financial guidance. Our platform continuously analyzes and leverages real-time transactional data, enabling banks to proactively support customers in managing their finances and reaching their goals. As industry leaders—yes, we really are leaders—we partner with the world’s top financial institutions, empowering over 150 million customers monthly across 35 global markets from offices in New York, London, Singapore, São Paulo, and Tel Aviv.

About the position

We are seeking a Site Reliability Engineer to join our Cloud Operations team in Brazil. In this role, you’ll help design, deploy, and maintain reliable, scalable cloud solutions, support customer onboarding, troubleshoot production issues, and optimize system performance. This is a great opportunity to grow your skills while working with modern cloud, container, and automation technologies in a global, fast-paced environment.

Responsibilities

Install, integrate, and operate end-to-end solutions and features, from design to production.
Manage production systems and oversee CI/CD pipelines.
Support customers during onboarding, including connecting and integrating their data into our system.
Research, diagnose, troubleshoot, and resolve recurring environment issues.
Participate in the on-call rotation and serve as an escalation point for incidents.
Contribute to service design and architecture to proactively prevent system failures.

Requirements

2-5 years of experience in Application Integration, SRE, or Production Operations.
Bachelor’s degree in Computer Science, Software Engineering, or a related field.
Hands-on experience with:

- Linux and Docker

- Kubernetes on AKS or other container orchestration tools

- Terraform or similar IaC tools; experience with GitOps

- CI/CD solutions, preferably Jenkins

- Networking, including configuring WAF rules, IP whitelisting, and troubleshooting

Strong problem-solving skills with the ability to prioritize effectively.
High level of proficiency in English, both written and spoken.

Nice to have

Experience with Maven and Nexus or similar registry solutions
Familiarity with Git version control systems
Knowledge of databases such as MySQL and PostgreSQL
Scripting skills in Python, Bash, or Groovy

Site Reliability Engineer

São Paulo, São Paulo Personetics Ltd

Publicado há 8 dias atrás

Toque novamente para fechar

Descrição Do Trabalho

About the position

Responsibilities

Install, integrate, and operate end-to-end solutions and features, from design to production.
Manage production systems and oversee CI/CD pipelines.
Support customers during onboarding, including connecting and integrating their data into our system.
Research, diagnose, troubleshoot, and resolve recurring environment issues.
Participate in the on-call rotation and serve as an escalation point for incidents.
Contribute to service design and architecture to proactively prevent system failures.

Requirements

2-5 years of experience in Application Integration, SRE, or Production Operations.
Bachelor’s degree in computer science, Software Engineering, or a related field
Hands-on experience with:

- Linux and Docker

- Kubernetes on AKS or other container orchestration tools

- Terraform or similar IaC tools; experience with GitOps

- CI/CD solutions, preferably Jenkins

- Networking, including configuring WAF rules, IP whitelisting, and troubleshooting

Strong problem-solving skills with the ability to prioritize effectively.
High level of proficiency in English, both written and spoken.

Nice to have

Experience with Maven and Nexus or similar registry solutions
Familiarity with Git version control systems
Knowledge of databases such as MySQL and PostgreSQL
Scripting skills in Python, Bash, or Groovy

Site Reliability Engineer

São Paulo, São Paulo Personetics Ltd

Posted 9 days ago

Job Description

About the position

Responsibilities

Install, integrate, and operate end-to-end solutions and features, from design to production.
Manage production systems and oversee CI/CD pipelines.
Support customers during onboarding, including connecting and integrating their data into our system.
Research, diagnose, troubleshoot, and resolve recurring environment issues.
Participate in the on-call rotation and serve as an escalation point for incidents.
Contribute to service design and architecture to proactively prevent system failures.

Requirements

2-5 years of experience in Application Integration, SRE, or Production Operations.
Bachelor’s degree in Computer Science, Software Engineering, or a related field.
Hands-on experience with:

- Linux and Docker

- Kubernetes on AKS or other container orchestration tools

- Terraform or similar IaC tools; experience with GitOps

- CI/CD solutions, preferably Jenkins

- Networking, including configuring WAF rules, IP whitelisting, and troubleshooting

Strong problem-solving skills with the ability to prioritize effectively.
High level of proficiency in English, both written and spoken.

Nice to have

Experience with Maven and Nexus or similar registry solutions
Familiarity with Git version control systems
Knowledge of databases such as MySQL and PostgreSQL
Scripting skills in Python, Bash, or Groovy
Customer-facing experience.

Site Reliability Engineer

São Paulo, São Paulo Sur LATAM

Posted 22 days ago

Job Description

Our US-based client is looking for a mission-driven Site Reliability Engineer to support and scale the infrastructure powering their secure, mission-critical SaaS platform.

You must be confident in operating and debugging both modern infrastructure (cloud-native, containerized services) and classic Windows production environments (IIS, SQL Server AlwaysOn, Service Broker), with the ability to respond to incidents quickly, support ongoing automation, and scale systems reliably.

Responsibilities

Be part of the team that owns the uptime and performance of our core backend infrastructure (Windows + Linux)
Maintain and enhance observability across systems using Kibana, CloudWatch, and custom telemetry
Manage CI/CD pipelines, infrastructure as code (Terraform, Ansible), and deployment automation
Support and maintain production Windows environments:
- .NET Framework/Core apps running in IIS
- SQL Server with AlwaysOn replication and Service Broker-based messaging
Support and operate cloud-native services:
- AWS Lambdas, DynamoDB, Postgres/Aurora, Redshift, Redis, and containerized workloads in Docker
Participate in on-call rotation and incident response
Collaborate closely with engineering teams to improve system reliability and deployment workflows

Requirements

5+ years of SRE, DevOps, or WebOps experience supporting production SaaS systems
Strong experience with Windows Server, IIS, and .NET applications in production
Hands-on experience with SQL Server administration, including AlwaysOn and Service Broker
Proficiency in AWS operations, including Lambda, DynamoDB, CloudWatch, and IAM
Familiarity with Postgres, Redis, Kibana/Elasticsearch, and centralized logging
Experience with Docker, Terraform, and Ansible for infrastructure management
Strong scripting skills (PowerShell, Python)
Experience running and debugging containerized and distributed systems in production
Excellent incident response and debugging skills

Benefits

Salary: $6,000 USD/month + Holidays

Unlimited PTO

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Other
Industries: IT Services and IT Consulting
