405 Reliability Engineer Jobs - São Paulo
Site Reliability Engineer
Yesterday
Job Description
About CloudWalk:
We are not just another fintech unicorn. We are a pack of dreamers, makers, and tech enthusiasts building the future of payments. With millions of happy customers and a hunger for innovation, we're now expanding our neural network - literally and metaphorically.
The Site Reliability Engineering (SRE) team aims to maximize the engineering velocity of developer teams while keeping products reliable. Working with us you will be responsible for the maintenance of sandbox and staging environments and the automation pipeline to ensure continuous testing.
What You'll Be Doing:
- Help to develop and spread the DevOps culture (we love )
- Create and maintain development sandbox environments
- Automate and orchestrate workloads in cloud environments
- Assist in the configuration, use, and management of test versions and test data
- Integrate automated tests in the delivery pipeline
- Horizontally interact with other SRE and Quality Engineers throughout CloudWalk's engineering team
Requirements:
- Experience with cloud environments (GCP, AWS)
- Solid knowledge of Relational Databases, SQL, and ORM technologies
- Experience with CI tools
- Experience with container technologies and orchestrators
- A high bar for quality
- Soft skills to master communication and collaboration throughout multiple teams
Join us at CloudWalk, where we’re not just engineering solutions; we’re building a smarter, AI-driven future for payments—together.
By applying for this position, your data will be processed as per CloudWalk's Privacy Policy, which you can read here in Portuguese and here in English.
Data Reliability Engineer
Posted 2 days ago
Job Description
Welcome to TELUS Digital, where innovation meets impact. As an award-winning digital product consultancy, we're shaping the future of digital experiences through cutting-edge technology, agile thinking, and a culture that puts people first.
We are the global digital section of TELUS, one of Canada’s largest telecommunications providers. Our global teams deliver transformative digital solutions and customer experiences for industry leaders in consumer electronics, finance, telecommunications, and utilities. With robust multi-shore delivery capabilities, multi-language programs, and secure infrastructure, we ensure exceptional service backed by our multi-billion-dollar parent company.
Location and Flexibility
This role can be fully remote for candidates based in the states of São Paulo and Rio Grande do Sul, as well as in the cities of Rio de Janeiro and Belo Horizonte, due to team distribution and occasional in-person opportunities. If you are based in São Paulo or Porto Alegre, you are welcome to work from one of our offices on a flexible schedule.
Qualifications
- 5+ years of hands-on experience supporting data engineering teams, with a strong emphasis on data pipeline enhancement and optimization and on data integration.
- Proficient in cloud computing, preferably Google Cloud Platform (GCP), but AWS and Azure are also valid.
- Experience with cloud data-related services such as BigQuery, Dataflow, Cloud Composer, Dataproc, Cloud Storage, Pub/Sub, or the correlated services from other providers.
- Solid proficiency with Python in terms of data processing.
- Knowledge of SQL and experience with relational databases.
- Proven experience optimizing data pipelines toward efficiency, reducing operational costs, and reducing the number of issues/failures.
- Solid knowledge of monitoring, troubleshooting, and resolving data pipeline issues.
- Familiarity with version control systems like Git.
- Strong English communication and documentation skills.
Responsibilities
- Design and implement scalable data pipeline architectures in collaboration with Data Engineers.
- Continuously optimize data pipeline efficiency to reduce operational costs and minimize issues and failures.
- Monitor performance and reliability of data pipelines, enhancing reliability through data quality, analysis, and testing.
- Build and manage automated alerting systems for data pipeline issues.
- Automate repetitive tasks in data processing and management.
- Develop and manage disaster recovery and backup plans.
- In collaboration with other Data Engineering teams, conduct capacity planning for data storage and processing needs.
- Develop and maintain comprehensive documentation for data pipeline systems and processes, and provide knowledge
Site Reliability Engineer
Posted 3 days ago
Job Description
• Ensure the availability, resilience, and scalability of production services.
• Build and maintain monitoring, logging, tracing, and intelligent alerting for critical systems.
• Develop and maintain CI/CD pipelines for fast, secure delivery.
• Implement Infrastructure as Code (IaC) using Terraform, Ansible, or CloudFormation.
• Work with Kubernetes and container orchestration in distributed environments.
• Define and track SLIs, SLOs, and SLAs together with the engineering teams.
• Lead incident analyses and post-mortems, proposing continuous improvements.
• Work on security, governance, and compliance in cloud environments.
*Desired requirements*
• Proven experience as an SRE, DevOps Engineer, or Infrastructure Engineer.
• Strong command of cloud computing (AWS, GCP, or Azure).
• Strong experience with Kubernetes and Docker.
• Advanced knowledge of observability (Prometheus, Grafana, Datadog, New Relic, etc.).
• Knowledge of automation languages (Python, Go, shell scripting).
• Hands-on practice with SRE principles: SLIs, SLOs, SLAs, and error budgets.
*Nice to have*
• Cloud certifications (AWS Solutions Architect, GCP Professional Cloud Engineer, Azure Expert).
• Experience with cloud migration and application modernization.
• Knowledge of microservice architectures.
Site Reliability Engineer
Posted 6 days ago
Job Description
Personetics is shaping the Cognitive Banking era, harnessing AI to help banks anticipate customer needs, provide actionable insights, and deliver intelligent financial guidance. Our platform continuously analyzes and leverages real-time transactional data, enabling banks to proactively support customers in managing their finances and reaching their goals. As industry leaders—yes, we really are leaders—we partner with the world’s top financial institutions, empowering over 150 million customers monthly across 35 global markets from offices in New York, London, Singapore, São Paulo, and Tel Aviv.
About the position
We are seeking a Site Reliability Engineer to join our Cloud Operations team in Brazil. In this role, you’ll help design, deploy, and maintain reliable, scalable cloud solutions, support customer onboarding, troubleshoot production issues, and optimize system performance. This is a great opportunity to grow your skills while working with modern cloud, container, and automation technologies in a global, fast-paced environment.
Responsibilities
- Install, integrate, and operate end-to-end solutions and features, from design to production.
- Manage production systems and oversee CI/CD pipelines.
- Support customers during onboarding, including connecting and integrating their data into our system.
- Research, diagnose, troubleshoot, and resolve recurring environment issues.
- Participate in the on-call rotation and serve as an escalation point for incidents.
- Contribute to service design and architecture to proactively prevent system failures.
Requirements
- 2-5 years of experience in Application Integration, SRE, or Production Operations.
- Bachelor’s degree in Computer Science, Software Engineering, or a related field.
- Hands-on experience with:
  - Linux and Docker
  - Kubernetes on AKS or other container orchestration tools
  - Terraform or similar IaC tools; experience with GitOps
  - CI/CD solutions, preferably Jenkins
  - Networking, including configuring WAF rules, IP whitelisting, and troubleshooting
- Strong problem-solving skills with the ability to prioritize effectively.
- High level of proficiency in English, both written and spoken.
- Experience with Maven and Nexus or similar registry solutions
- Familiarity with Git version control systems
- Knowledge of databases such as MySQL and PostgreSQL
- Scripting skills in Python, Bash, or Groovy
Site Reliability Engineer
Posted 7 days ago
Job Description
Personetics is shaping the Cognitive Banking era, harnessing AI to help banks anticipate customer needs, provide actionable insights, and deliver intelligent financial guidance. Our platform continuously analyzes and leverages real-time transactional data, enabling banks to proactively support customers in managing their finances and reaching their goals. As industry leaders—yes, we really are leaders—we partner with the world’s top financial institutions, empowering over 150 million customers monthly across 35 global markets from offices in New York, London, Singapore, São Paulo, and Tel Aviv.
About the position
We are seeking a Site Reliability Engineer to join our Cloud Operations team in Brazil. In this role, you’ll help design, deploy, and maintain reliable, scalable cloud solutions, support customer onboarding, troubleshoot production issues, and optimize system performance. This is a great opportunity to grow your skills while working with modern cloud, container, and automation technologies in a global, fast-paced environment.
Responsibilities
- Install, integrate, and operate end-to-end solutions and features, from design to production.
- Manage production systems and oversee CI/CD pipelines.
- Support customers during onboarding, including connecting and integrating their data into our system.
- Research, diagnose, troubleshoot, and resolve recurring environment issues.
- Participate in the on-call rotation and serve as an escalation point for incidents.
- Contribute to service design and architecture to proactively prevent system failures.
Requirements
- 2-5 years of experience in Application Integration, SRE, or Production Operations.
- Bachelor’s degree in Computer Science, Software Engineering, or a related field.
- Hands-on experience with:
  - Linux and Docker
  - Kubernetes on AKS or other container orchestration tools
  - Terraform or similar IaC tools; experience with GitOps
  - CI/CD solutions, preferably Jenkins
  - Networking, including configuring WAF rules, IP whitelisting, and troubleshooting
- Strong problem-solving skills with the ability to prioritize effectively.
- High level of proficiency in English, both written and spoken.
- Experience with Maven and Nexus or similar registry solutions
- Familiarity with Git version control systems
- Knowledge of databases such as MySQL and PostgreSQL
- Scripting skills in Python, Bash, or Groovy
- Customer-facing experience.
Site Reliability Engineer
Posted 20 days ago
Job Description
Our US-based client is looking for a mission-driven Site Reliability Engineer to support and scale the infrastructure powering their secure, mission-critical SaaS platform.
You must be confident in operating and debugging both modern infrastructure (cloud-native, containerized services) and classic Windows production environments (IIS, SQL Server AlwaysOn, Service Broker), with the ability to respond to incidents quickly, support ongoing automation, and scale systems reliably.
Responsibilities
- Be part of the team that owns the uptime and performance of our core backend infrastructure (Windows + Linux)
- Maintain and enhance observability across systems using Kibana, CloudWatch, and custom telemetry
- Manage CI/CD pipelines, infrastructure as code (Terraform, Ansible), and deployment automation
- Support and maintain production Windows environments:
  - .NET Framework/Core apps running in IIS
  - SQL Server with AlwaysOn replication and Service Broker-based messaging
- Support and operate cloud-native services:
  - AWS Lambdas, DynamoDB, Postgres/Aurora, Redshift, Redis, and containerized workloads in Docker
- Participate in on-call rotation and incident response
- Collaborate closely with engineering teams to improve system reliability and deployment workflows
Requirements
- 5+ years of SRE, DevOps, or WebOps experience supporting production SaaS systems
- Strong experience with Windows Server, IIS, and .NET applications in production
- Hands-on experience with SQL Server administration, including AlwaysOn and Service Broker
- Proficiency in AWS operations, including Lambda, DynamoDB, CloudWatch, and IAM
- Familiarity with Postgres, Redis, Kibana/Elasticsearch, and centralized logging
- Experience with Docker, Terraform, and Ansible for infrastructure management
- Strong scripting skills (PowerShell, Python)
- Experience running and debugging containerized and distributed systems in production
- Excellent incident response and debugging skills
Salary: $6,000 USD/month + Holidays
Unlimited PTO
- Seniority level: Mid-Senior level
- Employment type: Full-time
- Job function: Other
- Industries: IT Services and IT Consulting
Database Reliability Engineer
Posted 20 days ago
Job Description
Responsible for supporting development teams on database matters, mainly in cloud environments (AWS). The role covers performance tuning, solutions for high data volumes, proposing and implementing new architectures, process monitoring, automation to streamline operational routines and increase resilience, and proposing, testing, and implementing database innovations.
Daily Responsibilities
- Provide technical database support to development teams;
- Manage cloud environments (AWS);
- Perform troubleshooting and performance optimization;
- Propose and implement new database architectures;
- Monitor processes and automations;
- Interact with application teams;
- Build automations to eliminate operational routines and increase resilience;
- Propose, test, and implement database innovations.
Requirements
- Completed higher-education degree in Engineering, Information Systems, or related fields;
- Ability to work in dynamic environments with multiple projects and technologies;
- Good communication and interaction with various areas;
- Solid experience with AWS services (S3, KMS, RDS, EC2, EBS, and others);
- Knowledge of cloud databases (PostgreSQL, MySQL, DynamoDB, MongoDB);
- Experience with support involving major vendors such as Microsoft and AWS;
- Availability to work in a hybrid model at Faria Lima, São Paulo.
Benefits
- Profit sharing (PLR);
- Food and meal allowance;
- Medical and dental plan;
- Childcare/nanny allowance;
- Transportation voucher;
- WellHub, TotalPass, Employee Assistance Program (EAP);
- Optional private pension plans and life insurance;
- Discounts at pharmacies, nutrition and maternity programs;
- Extended maternity and paternity leave.