Description:
Our team also puts a high value on work-life balance, and we understand that striking a healthy balance between your personal and professional life is crucial to your happiness and success here. We offer a flexible hybrid schedule so you can have a more productive and well-balanced life both in and outside of work.
Responsibilities
- Collaborate with a team of site reliability engineers (SREs) to operate SaaS capabilities across multiple regions on the cloud platform
- Design, implement, configure, and use monitoring systems to track the health of SaaS products
- Manage infrastructure used for ArcGIS Velocity and ArcGIS Workflow Manager, respond to alerts, and troubleshoot problems to resolution
- Develop, implement, and maintain automation solutions for repetitive operational tasks, such as deployment pipelines, incident resolution, and scaling processes
- Design and implement the deployment and upgrade of containerized microservice components that, when combined, power Esri’s SaaS offerings
- Create and automate Git workflows to simplify code integration, testing, and infrastructure deployments
- Participate in technical spike efforts, bringing innovative ideas to future versions of our software
- Troubleshoot system incidents and provide root cause analysis reports
- Provide rotational on-call technical support
Requirements
- 5+ years of experience managing Kubernetes (EKS), logging and monitoring (ELK, Prometheus), and container technologies (Docker)
- Proficient in using Terraform for automating infrastructure provisioning and management
- Ability to design and automate Git workflows for streamlined code integration, testing, and infrastructure deployment
- Ability to write scripts to deploy infrastructure and/or applications (Bash, Python, Terraform)
- Expert-level understanding of and experience with cloud computing platforms (AWS or Microsoft Azure)
- Strong knowledge of Linux operating system administration, including troubleshooting, performance tuning, and shell scripting
- Proficient in cloud networking, including VPCs, subnets, security groups, and VPNs in platforms like AWS or Azure
- Skilled in identifying and resolving system and application issues through effective troubleshooting and root cause analysis
- Working knowledge of a source control and issue management system
- Bachelor’s degree in computer science, computer engineering, GIS, or information systems