2,448 Reliability Engineer Jobs - Brazil

Reliability Engineer

Belém, Pará Kerry

Yesterday

Job Description

About Kerry

Kerry is the world's leading taste and nutrition company for the food, beverage and pharmaceutical industries. Every day we partner with customers to create healthier, tastier and more sustainable products that are consumed by billions of people across the world. Our vision is to be our customers' most valued partner, creating a world of sustainable nutrition. A career with Kerry offers you an opportunity to shape the future of food while providing you opportunities to explore and grow in a truly global environment.

About the role

This position leads the site’s Preventative Maintenance and Mechanical Integrity functions, with a strong focus on building and sustaining a robust Reliability-Centered Maintenance (RCM) program. Key responsibilities include enhancing inspection procedures, continuously reviewing and updating the critical equipment list, and providing training to the maintenance team as needed.

The role works in close partnership with the Maintenance Manager to define, plan, and schedule all maintenance activities—ranging from daily tasks and one-day outages to preventive maintenance and major shutdowns.

Key responsibilities
  • Lead Failure Investigations: Conduct thorough investigations into equipment failures using methodologies such as Failure Modes and Effects Analysis (FMEA) and Root Cause Analysis (RCA). Develop and implement corrective actions to eliminate root causes and prevent recurrence.
  • Optimize Preventive Maintenance Programs: Design and continuously refine preventive maintenance strategies based on equipment criticality and historical failure data. Prioritize tasks to minimize operational impact and maximize asset reliability and lifespan.
  • Implement Predictive Maintenance Technologies: Deploy and manage predictive maintenance tools such as vibration analysis, infrared thermography, oil analysis, and ultrasonic testing. Leverage data insights to proactively schedule maintenance and avoid unplanned downtime.
  • Analyze Equipment Performance: Monitor and interpret equipment performance metrics to identify trends, inefficiencies, and potential risks. Use statistical tools and reliability indicators to drive data-informed decisions and continuous improvement.
  • Foster Cross-Functional Collaboration: Partner with operations, maintenance, and engineering teams to resolve reliability challenges, share best practices, and support plant-wide improvement initiatives. Provide technical leadership to align efforts with maintenance and reliability goals.
  • Train and Mentor Maintenance Personnel: Provide training and mentorship on reliability best practices, preventive and predictive maintenance techniques, and effective troubleshooting. Promote a proactive maintenance culture across the organization.
  • Maintain Documentation and Reporting: Keep detailed records of maintenance activities, equipment performance, and reliability metrics. Prepare and present reports to management, highlighting key findings, progress on initiatives, and recommendations for improvement.
Qualifications and skills
  • Bachelor’s degree in Engineering or completion of a technical school training program.
  • 3–5 years of experience as a Reliability Engineer or in a similar role within a manufacturing environment, preferably in the Food & Beverage industry.
  • 1–3 years of maintenance and supervisory experience preferred.
  • CMRP (Certified Maintenance & Reliability Professional) certification preferred.
  • Strong troubleshooting and problem-solving skills.
  • Experience with TPM (Total Productive Maintenance) and/or Lean Manufacturing initiatives.
  • Proven experience developing and managing predictive and preventive maintenance programs.
  • Proficiency with computer applications, including SAP, CMMS, and other business software.
  • Solid understanding of networked systems and PC-based tools used in maintenance and reliability operations.
Compensation

The typical hiring range for this role is $75,602 to $123,432 annually and is based on several factors including but not limited to education, work experience, certifications, location, etc. Kerry offers benefits such as a comprehensive benefits package, incentive and recognition programs, equity stock purchase and retirement contribution (all benefits and incentives are subject to eligibility requirements).

Equal Employment Opportunity

Kerry is an equal opportunity employer. Employment decisions are made without regard to race, color, religion, national or ethnic origin, sex, sexual orientation, gender identity or expression, age, disability, protected veteran status or other characteristics protected by law. Kerry will only employ those who are legally authorized to work in the United States for this opening. Any offer of employment is conditional upon the successful completion of a background investigation and drug screen. Additional information can be found at: Know Your Rights: Workplace Discrimination is Illegal (dol.gov).

Job details
  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Industries: Food and Beverage Manufacturing
  • Location: Allentown, PA (also Bethlehem, PA as listed in posting)

Reliability Engineer

Flinks

Yesterday

Job Description

Flinks is where financial data moves—with purpose, trust, and impact.

We’re on a mission to simplify access to financial data and help businesses build better, faster, and more secure financial products and experiences. Since 2016, we’ve been bridging the gap between fintechs, financial institutions, and consumers by enabling seamless, secure data connectivity.

From instant account funding to smarter lending, our solutions help power some of the most innovative financial products in North America. We partner with lenders, banks, and fintechs to streamline onboarding, prevent fraud, and fuel real-time decision-making with enriched, reliable data.

As pioneers in Canada’s open banking movement, we’re not waiting for the future—we’re building it. If you’re bold, curious, and ready to help shape the future of finance, we’d love to meet you.

About the Reliability Team

As a Reliability Engineer, you will play a pivotal role in ensuring the stability, performance, and reliability of Flinks' fintech product platforms and their monitoring and alerting systems. You will serve as an expert in both software development and system support, working closely with engineering, operations, and product teams to troubleshoot complex issues, resolve incidents, and continuously improve the technical foundation of our products. This role demands a combination of advanced coding skills, incident management experience, and an understanding of the fintech industry.

What You’ll Do
  • Develop and maintain code to quickly resolve product issues, ensuring fast recovery and long-term system stability
  • Provide live operational support across multiple client applications, monitoring services and alerts to detect and resolve critical failures with minimal downtime
  • Own and troubleshoot complex incidents, conduct root cause analyses, and implement long-term solutions—adhering to SLAs and internal SLOs
  • Build monitoring dashboards and alerting systems to proactively detect and address issues, supporting system scalability and stability
  • Analyze operational metrics and KPIs to identify trends, surface client pain points, and drive improvements
  • Automate tooling and processes to improve efficiency and reduce manual work across LiveOps
  • Collaborate with cross-functional teams to deliver lasting fixes for production issues and contribute to technical analyses of product gaps
  • Lead and mentor reliability engineers, providing guidance and ensuring consistent delivery of high-quality work
  • Participate in post-incident reviews, documenting outcomes and driving preventative action items
  • Support after-hours on-call coverage as part of the LiveOps rotation
Qualifications
  • 5+ years of experience with .NET Framework (C#), ensuring production system stability
  • Strong coding, debugging, and troubleshooting skills, particularly in performance optimization of large-scale applications
  • Operationally focused with expertise in incident management and resolving live production issues
  • Proven experience in building and maintaining reliable monitoring and alerting systems in high-demand environments, with a focus on production support
  • Strong knowledge of Kubernetes, Docker, and cloud platforms (GCP preferred)
  • Proficiency with monitoring tools like Prometheus, Grafana, and Kibana
  • Experience with incident ticketing/documentation tools like FreshDesk and Confluence
  • Critical thinker who can identify system weaknesses and find innovative solutions
  • Strong project management skills with a focus on scalability and system stability
Nice to haves
  • ITIL Service Management certification (or equivalent) is highly desired, such as ITIL v3, ITIL v4, or other equivalent certifications
  • Experience with PowerBI, web scraping, or Golang
The Interview Process
  • Head of People Ops
  • Case Assignment & Presentation
  • Director Interview
Seniority level
  • Mid-Senior level
Employment type
  • Full-time
Job function
  • Engineering and Information Technology
Industries
  • Technology, Information and Internet


Reliability Engineer

Laguna, Santa Catarina Philippine Geothermal Production Company, Inc.

Posted 10 days ago

Job Description

Philippine Geothermal Production Company, Inc. (PGPC) is a Filipino corporation operating the Tiwi geothermal steam field in the province of Albay and the Mak-Ban geothermal steam field in the provinces of Batangas and Laguna. It is owned by the SM Investments Corporation.

Tiwi and Mak-Ban are the result of a successful partnership between Philippine Geothermal’s predecessor and the National Power Corporation that began in 1971, when a 2.5 kW government experiment was transformed into the first commercial geothermal power project in Southeast Asia, and led to the birth of the geothermal industry in the Philippines.

Philippine Geothermal continues its legacy of providing a clean, stable, reliable, and renewable source of energy to meet the country’s growing power requirements. Its vision is to be the leading geothermal energy company, recognized not only for its world-class performance but also for contributing to the improvement in the lives of the people in the communities where it operates.

The successful candidate will provide engineering support to the Assets in ensuring the reliability, operability, availability, and maintainability of the steam fields’ equipment, process/systems, and people, in order to help in optimizing generation, minimizing costs and achieving performance objectives.

The Role
  • Implements Reliability and Integrity Management Process (RIM) programs
  • Leads the review of maintenance philosophies in collaboration with Operations and Maintenance (O&M) personnel on existing and new equipment through Failure Modes & Effects Analysis (FMEA) and/or Reliability-Centered Maintenance (RCM) philosophies and recommends Preventive/Predictive Maintenance (PM/PdM) techniques and other improvements, as necessary
  • Provides reliability data interpretation, analysis, and reporting
  • Develops and calculates reliability metrics as a basis for reliability improvements/programs
  • Coordinates with the Quality Assurance (QA) Group in implementing inspection/reliability programs
  • Supports asset expense and capital projects to ensure quality, timeliness, cost effectiveness
  • Ensures compliance with the OE requirements, engineering and industry codes and standards, QA/QC programs and government regulations in the conduct of activities
  • Coaches and mentors other engineers to develop their skills and competencies
The Individual
  • Bachelor’s degree in Engineering preferably in Mechanical, Chemical or Industrial
  • With at least 5 years of working experience in Reliability and/or Maintenance Engineering
  • Preferably with experience in facilitating Reliability-Centered Maintenance (RCM) and/or Failure Modes & Effect Analysis (FMEA) workshops
  • Well-versed in engineering codes, industry standards and practices
  • Experience in Process Engineering and Project Engineering an advantage

If you are encountering difficulties submitting your application through this website, kindly send your resume and filled out application form directly to .


Site Reliability Engineer

ITeam

Today

Job Description

About the Company

With more than 20 years in the market, ITeam stands out for its commitment to its clients. We base our relationships on solid values and clear objectives, offering IT solutions and services that help our clients achieve their goals. Our mission is to provide IT services aligned with our clients' strategy and processes, always delivered by qualified people.

Senior SRE

Fluent Spanish

Night shift, 9 p.m. to 6 a.m. Brazil time

Remote

About the Role

The professional will be responsible for supporting and resolving level 2 and level 3 tickets, and for monitoring billing cycles and daily payment collection and debt collection flows.



Responsibilities

  • Support and resolve level 2 and level 3 tickets (root cause analysis and routing).
  • Monitor billing cycles to ensure deliverables (invoice issuance, tax and accounting deliverables, etc.).
  • Monitor daily payment collection and debt collection flows to ensure delivery to the business.
  • Helps the tech lead/leadership solve reliability problems and prioritizes them within project activities, given the business challenges and the needs of the solution.
  • Proactively asks for feedback, listens, and improves continuously.
  • Is self-taught; regularly learns new things on their own initiative.
  • Pays attention to what other projects have already done and brings past experience to the current project to minimize errors.
  • Adapts quickly to project changes such as new tasks, reprioritization, and technical support requests.
  • Maintains the quality of the solutions developed regardless of the complexity of the task or process being improved.
  • Keeps their "radar" on: watches for risks and assumptions, and mobilizes to reach the goals set with the team.
  • Strong ability to get complex matters done, through drive, creativity, and past experience.
  • Stays focused on making products reliable.
  • Maps the current state to identify possible improvements and make the platform more resilient.


Qualifications

  • Experience with Linux operating systems (e.g., Debian, Red Hat) in text mode.

Required Skills

  • Scripting in Shell Script or PowerShell
  • Automation with Terraform | CloudFormation | Pulumi *
  • Git
  • CI/CD know-how
  • Experience with Jenkins or GitLab
  • Experience with Docker
  • Basic knowledge of Kubernetes
  • Knowledge of cloud platforms: AWS | Azure or GCP
  • Experience working in agile teams
  • Experience estimating deadlines and participating in backlog planning
  • Able to develop solutions with Docker and docker-compose for microservices, APIs, etc.
  • Automations are efficient and scale to some degree when necessary (adaptability, performance, and reliability).
  • Proficient in creating essential alerts and metrics for systems using tools or services such as Splunk, Prometheus, Grafana, CloudWatch, etc.
  • Shares solutions and lessons learned with the team and the community.
  • Runs and/or supports Chaos Engineering using performance and failure testing tools, etc. (e.g., JMeter, P4All)


Preferred Skills

  • Has technical command of the programming language used to develop solutions, as well as of cloud, security, and performance.
  • Builds automations or resources that are easy to reuse and maintain.
  • Identifies root causes and runs postmortem sessions, reducing complexity when handling future incidents.
  • Shares their technical solutions widely, aiming to make them a reference, especially for other SREs.
  • Implements reliability guidelines in their solutions and gives technical support so the team does the same.
  • Implements metrics and alerts to keep solutions aligned with the business and the customer experience.
  • Automates continuous deployment to avoid repetitive tasks.
  • Experience with cloud platforms: AWS, Azure, or GCP
  • Orchestration with Kubernetes
  • Experience with CI/CD
  • Observability tools
  • Experience with continuous deployment tools (Terraform, Puppet)
  • "Automate everything possible" mindset
  • Experience with Infrastructure as Code: Terraform and CloudFormation
  • Knowledge of a programming language: Java, Kotlin, Go, Python, Ruby, or Rust.
  • Experience dealing with critical or highly scalable environments.
  • Experience providing services to companies in the telecom sector

Deployment Reliability Engineer

HCLTech

Posted 2 days ago

Job Description


Your role and responsibilities:


  • Manage continuous delivery and configuration of SAP Ariba Cloud products using modern deployment tools.
  • Respond quickly to deployment requests and provide technical support for the SAP Ariba suite.
  • Collaborate with engineering subject matter experts to ensure seamless operations.
  • Handle user tickets and change requests within defined SLAs.
  • Automate manual tasks to improve scalability and efficiency.
  • Lead complex deployment projects including new site setups and disaster recovery planning.
  • Manage certificate renewals and troubleshoot related issues.
  • Document SOPs and apply ITIL best practices.


Requirements and Qualifications:


  • Experience in a Unix/Linux environment.
  • Familiar with SAP Ariba Cloud products
  • Proven experience in 24x7 enterprise environments.
  • Hands-on expertise with cloud provisioning (preferably GCP and AWS).
  • Proficiency in Terraform and CI/CD tools like Jenkins, Artifactory, Docker, Vault.
  • Development experience in Python, Go, or Groovy.
  • Strong knowledge of system applications (Apache, DNS, SSH, TCP/IP, NFS).
  • Deep understanding of OS internals and file system structures.
  • Experience with certificate management and scripting (Perl, Python, Shell).
  • Basic knowledge of HANA database administration.
  • Excellent communication, analytical, and multitasking skills.
  • Bachelor’s degree in MIS, CS, or equivalent experience.
  • Advanced English


Please submit resumé in English


Site Reliability Engineer

HCLTech

Posted 2 days ago

Job Description

Your role and responsibilities:


  • Handling major incidents via CIRS (Critical Issue Response System) and providing frequent updates until resolution.
  • Performing deep-dive application troubleshooting and identifying preventive actions.
  • Managing CIRS-related requests including deployments, feature toggles, and data fixes.
  • Following up on major production incidents and coordinating with cross-functional teams.
  • Enhancing monitoring capabilities using tools like Dynatrace, Kibana, and Splunk.
  • Writing and improving monitoring scripts and alerts based on incident learnings.
  • Handling customer escalations and coordinating with Support & Engineering teams.
  • Supporting planned activities and responding to ad-hoc requests from CES teams.


Requirements and Qualifications:


  • Deep experience in DevOps and Production Support.
  • Experience in automation and CI/CD practices.
  • Familiarity with cloud platforms (GCP, AWS, or Azure preferred).
  • Hands-on experience with monitoring tools such as Dynatrace, Kibana, and Splunk.
  • Strong troubleshooting skills and ability to deep dive into application issues.
  • Excellent communication and coordination skills across teams.


Please submit resumé in English.


Site Reliability Engineer

Gauge

Posted 15 days ago

Job Description

We are a Stefanini Group company. Specializing in digital marketing, we use an integrated approach that combines technology, data intelligence, design, and deep knowledge of consumer behavior. Our focus is on boosting our partners' results, offering solutions that range from strategic consulting to project execution and follow-up. With a dedicated and highly qualified team, Gauge stands out for its ability to understand each client's specific needs and deliver high-performance results.

With a strong presence in Latin America and expanding in the United States, we are always at the forefront, applying the latest market trends and keeping a close eye on continuous innovation.




Site Reliability Engineer

Buenos Aires, Pernambuco DEUNA

Today

Job Description

Overview

As a Mid SRE at DEUNA, you’ll ensure the reliability, scalability, and performance of our AWS-based platform by integrating observability, automation, and SRE best practices across the software lifecycle. You will work closely with development teams to improve uptime, provide observability tooling, and ensure we scale efficiently and securely.

Key Responsibilities
  • Design, define, and maintain observability and monitoring for our AWS infrastructure
  • Define and track SLIs, SLOs, and SLAs for critical systems
  • Improve system uptime, latency, and fault tolerance across the platform
  • Provide internal libraries and toolsets to developers for diagnostics and debugging
  • Manage scaling, performance, and resilience efforts related to system reliability
  • Collaborate with technical teams on capacity planning, load testing, and scaling policies
  • Improve production operations by defining and evolving deployment strategies and conducting disaster recovery (DR) testing
Technical Skills
  • Expertise with Prometheus, Grafana, OpenTelemetry, AWS CloudWatch, or other observability tools
  • Experience designing dashboards, alerts, and log aggregation pipelines
  • Deep understanding of AWS services: ECS, Lambda, RDS, CodePipeline
  • Strong proficiency in Go programming language
  • Skilled at defining SLIs, SLOs, error budgets, and improving Mean Time to Recovery (MTTR)
  • Experience conducting failure drills (e.g., Chaos Monkey, Gremlin) to ensure system resilience
Soft Skills
  • Excellent communication and collaboration skills
  • Adaptability to thrive in dynamic, fast-paced environments
  • Strong time management and task prioritization
  • Proficiency in English
What you will find when you join DEUNA
  • A multicultural team distributed throughout LATAM
  • Dynamism, agility and constant innovation
  • Being part of a high-impact solution for an entire region
  • The best tools and technology to operate
  • Being part of the startup culture
  • We are in full expansion!
Benefits
  • Vacations and additional PTO
  • Remote work from anywhere
  • Economic support for health insurance, internet and cell phone line
  • We all own DEUNA, we offer stock options
  • Learning and development platform
  • Multidisciplinary, diverse and dynamic team
  • Growth and career path
  • Be part of a dynamic team that's creating the next generation payments platform
  • Join us at DEUNA
Details
  • Seniority level: Not Applicable
  • Employment type: Full-time
  • Job function: Engineering and Information Technology
  • Industries: Software Development


Data Reliability Engineer

São Paulo, São Paulo TELUS Digital Brazil

Today

Job Description


Overview

Welcome to TELUS Digital, where innovation meets impact. As an award-winning digital product consultancy, we're shaping the future of digital experiences through cutting-edge technology, agile thinking, and a culture that puts people first. We are the global digital section of TELUS, one of Canada’s largest telecommunications providers. Our global teams deliver transformative digital solutions and customer experiences for industry leaders in consumer electronics, finance, telecommunications, and utilities.

Location and flexibility

This role can be fully remote for candidates based in the states of São Paulo and Rio Grande do Sul, as well as in the cities of Rio de Janeiro and Belo Horizonte, due to team distribution and occasional in-person opportunities. If you are based in São Paulo or Porto Alegre, you are welcome to work from one of our offices on a flexible schedule.

Qualifications
  • 5+ years of hands-on experience in supporting data engineering teams, strongly emphasizing data pipeline enhancement and optimization, and data integration.
  • Proficient in cloud computing, preferably Google Cloud Platform (GCP), but AWS and Azure are also valid.
  • Experience with cloud data-related services such as BigQuery, Dataflow, Cloud Composer, Dataproc, Cloud Storage, Pub/Sub, or the correlated services from other providers.
  • Solid proficiency with Python in terms of data processing.
  • Knowledge of SQL and experience with relational databases.
  • Proven experience optimizing data pipelines toward efficiency, reducing operational costs, and reducing the number of issues/failures.
  • Solid knowledge of monitoring, troubleshooting, and resolving data pipeline issues.
  • Familiarity with version control systems like Git.
  • Strong English communication and documentation skills.
Responsibilities
  • Design and implement scalable data pipeline architectures in collaboration with Data Engineers.
  • Continuously optimize data pipeline efficiency to reduce operational costs and minimize issues and failures.
  • Monitor performance and reliability of data pipelines, enhancing reliability through data quality, analysis, and testing.
  • Build and manage automated alerting systems for data pipeline issues.
  • Automate repetitive tasks in data processing and management.
  • Develop and manage disaster recovery and backup plans.
  • In collaboration with other Data Engineering teams, conduct capacity planning for data storage and processing needs.
  • Develop and maintain comprehensive documentation for data pipeline systems and processes, and provide knowledge transfer to data-related teams.
  • Monitor, troubleshoot and resolve production issues in data processing workflows.
  • Maintain infrastructure reliability for data pipelines, enterprise datahub, HPBI, and MDM systems.
  • Conduct post-incident reviews and implement improvements for data pipelines.
Why TELUS Digital?

At TELUS Digital, you’ll work with world-class brands like FOX, HBO, PepsiCo, and Domino's, building transformative digital products that impact millions. Our global reach allows you to collaborate with diverse, international teams, solving complex problems and delivering tech-driven solutions that matter.

We thrive on engineering excellence, using the latest technologies in cloud computing, AI, machine learning, DevOps, microservices architecture, and data engineering. Our teams embrace Agile methodologies, CI/CD pipelines, and a DevOps-first mindset to deliver solutions at scale.

In addition to being part of an international and innovative consultancy company, you will have:

  • A Global Innovation Hub: Be part of an international consultancy at the forefront of technology
  • Work-Life Harmony: Enjoy flexible hours and autonomy to balance your professional and personal life
  • Cutting-Edge Tech Playground: Dive into the latest technologies and shape the future of digital solutions
  • Prestigious Partnerships: Collaborate with world-renowned brands, making a real impact in the market
  • Growth-Centric Environment: Thrive in our collaborative ecosystem with a clear career development path
  • Global Exposure: Embrace optional international travel opportunities to broaden your horizons
Equality

At TELUS Digital, we are proud to be an equal opportunity employer and are committed to creating a diverse and inclusive workplace. We are committed to building an inclusive team that represents a variety of backgrounds, perspectives, beliefs, and experiences. Therefore we provide equal employment opportunities to all employees and applicants regardless of race, color, religion, gender identity, sexual orientation, national origin, age, or disability.

We will only use the information you provide to process your application and to produce tracking statistics. Since we do not request personal data deemed sensitive, we ask you to abstain from sharing that information with us.

For more information on how we use your information, see our Privacy Policy.

Seniority level: Mid-Senior level

Employment type: Full-time

Job function: Engineering and Information Technology

Industries: Software Development


Site Reliability Engineer

São Paulo, São Paulo Willis Towers Watson

Today

Job Description

Description

Summary:

We’re looking for an experienced Platform/Infrastructure Engineer with a strong Microsoft Azure background and deep knowledge of Kubernetes. You'll play a key role in designing, deploying, and maintaining infrastructure and services that power our products. This role requires hands-on experience with automation, modern IaC practices, CI/CD, and maintaining production-grade environments.

The Role:

  • Operate, monitor, and improve cloud infrastructure for high-availability services in Azure
  • Deploy, configure and manage Kubernetes workloads at scale, including the use of Helm, ArgoCD, Flux, or similar GitOps tools
  • Build and maintain CI/CD pipelines using Azure DevOps or similar tooling
  • Write and maintain Infrastructure as Code using Terraform or OpenTofu
  • Develop scripts and automation to support infrastructure and deployment workflows - PowerShell is preferred
  • Collaborate with engineering teams to support platform reliability and enable delivery
  • Maintain visibility and awareness through monitoring and logging tools such as Datadog, Azure Monitor, App Insights etc.
  • Support incident resolution and participate in an on-call rota to help maintain service uptime
Qualifications

The Requirements:

Essential Experience:

  • Proven experience in a Platform, Infrastructure, or DevOps engineering role
  • Hands-on experience operating 24x7 services in a public cloud, ideally Azure
  • Strong experience managing infrastructure using Terraform or OpenTofu
  • Experience managing and scaling Kubernetes clusters in production environments
  • Proficient with CI/CD tooling, preferably Azure DevOps (YAML pipelines)
  • Strong scripting skills using PowerShell
  • Experience with monitoring and logging solutions such as Azure Monitor, App Insights, or similar
  • Clear communicator with the ability to collaborate across cross-functional teams

Nice to Have:

  • Azure certifications (e.g. Azure Administrator, Azure DevOps Engineer)
  • Experience with GitOps and tools such as ArgoCD or Flux
  • Familiarity with Configuration as Code tools like Ansible or Puppet
  • Exposure to large-scale distributed systems or high-volume web APIs
  • Awareness of incident response processes and platform reliability best practices

Equal Opportunity Employer

At WTW, we believe difference makes us stronger. We want our workforce to reflect the different and varied markets we operate in and to build a culture of inclusivity that makes colleagues feel welcome, valued and empowered to bring their whole selves to work every day. We are an equal opportunity employer committed to fostering an inclusive work environment throughout our organisation. We embrace all types of diversity.

At WTW, we trust you to know your work and the people, tools and environment you need to be successful. The majority of our colleagues work in a “hybrid” style, with a mix of remote, in-person and in-office interactions dependent on the needs of the team, role and clients. Our flexibility is rooted in trust and “hybrid” is not a one-size-fits-all solution.
