Site Reliability Engineer 1

WEX, Inc.
life insurance, paid time off, tuition reimbursement
United States, Maine, Portland
1 Hancock Street
Nov 21, 2024
About the Team/Role
The WEX Site Reliability Engineering (SRE) team is looking for a motivated and quick-learning Level 1 Site Reliability Engineer to join our growing team. We are passionate about developing software and solutions for observability, incident response, reliability, performance, operational excellence, and compliance. As a member of the SRE organization, you will support internal stakeholders and Engineering teams, tackling complex challenges and enhancing our engineering teams' and customers' experience. You will have the opportunity to work alongside experienced SREs and gain valuable hands-on experience in a dynamic and supportive environment.

As a Level 1 SRE at WEX, you will:
Learn the fundamentals of SRE: Gain a solid understanding of core SRE principles, including monitoring, incident management, and automation.
Develop basic automation scripts: Use scripting languages like Python or Bash to automate simple tasks and improve operational efficiency.
Triage and resolve incidents: Participate in on-call rotations, assisting with the identification and resolution of incidents under the guidance of senior SREs.
Monitor system health: Utilize monitoring tools to identify and escalate potential issues, ensuring the stability and performance of our systems.
Collaborate with development teams: Work closely with software engineers to understand their systems and provide operational support.
Contribute to documentation: Help maintain and improve internal documentation, including runbooks, knowledge base articles, and playbooks.
Continuously learn and grow: Expand your knowledge of cloud technologies, DevOps practices, and SRE tools through internal and external training opportunities.

How you'll make an impact
Develop a basic understanding of code, networking, operating systems, and storage solutions. You'll be able to identify and troubleshoot common issues related to these areas.
Assist in developing automation and utilizing monitoring tools to ensure system reliability. You'll learn how to use tools to automate tasks and monitor system health.
Participate in incident response and troubleshooting alongside senior SREs. You'll gain experience in identifying, escalating, and resolving incidents.
Participate in 24x7 Site Reliability rotations and escalation workflows with guidance from senior team members. You'll learn how to respond to incidents and escalate issues appropriately.
Learn to identify and address basic performance bottlenecks. This will include understanding code optimization, configuration changes, and infrastructure upgrade recommendations.
Collaborate with development teams to ensure software design meets operational requirements. You'll learn how to communicate effectively with developers and advocate for operational best practices.
Work with development teams to make sure operational needs are met by assisting with support requests from other engineering teams. You'll gain experience in providing support and collaborating with different teams.
Contribute to the continuous improvement of processes and procedures to increase system reliability and efficiency. You'll participate in team discussions and contribute ideas for improvement.
Stay up-to-date with the latest industry trends and technologies. You'll be encouraged to learn new technologies and share your knowledge with the team.

Experience you'll bring
Basic understanding of at least one major programming language (C#, Java, Go, or Python): You should be able to read and understand code, and write scripts.
Familiarity with a cloud computing platform (AWS, Azure, or GCP): You should have a basic understanding of cloud concepts and services.
Strong communication and collaboration skills: You'll be working closely with different teams, so effective communication is essential.
BA/BS degree in Computer Science or a related technical field, or equivalent job experience: A strong foundation in computer science principles is important.

Nice to have
Basic understanding of infrastructure as code, preferably Terraform: Familiarity with IaC concepts and tools is a plus.
Working knowledge of RESTful APIs: Understanding how APIs work is beneficial.
Exposure to observability and logging technologies: Any experience with monitoring and logging tools is helpful.
Experience with at least one major RDBMS and NoSQL data store: Familiarity with databases is a plus.
Exposure to containerization technologies such as Docker or Kubernetes: Basic knowledge of containers and orchestration is beneficial.
Familiarity with GitOps: Understanding of GitOps principles is helpful.

The base pay range represents the anticipated low and high end of the pay range for this position. Actual pay rates will vary and will be based on various factors, such as your qualifications, skills, competencies, and proficiency for the role. Base pay is one component of WEX's total compensation package. Most sales positions are eligible for commission under the terms of an applicable plan. Non-sales roles are typically eligible for a quarterly or annual bonus based on their role and applicable plan.

WEX's comprehensive and market-competitive benefits are designed to support your personal and professional well-being. Benefits include health, dental, and vision insurance, a retirement savings plan, paid time off, a health savings account, flexible spending accounts, life insurance, disability insurance, tuition reimbursement, and more. For more information, check out the "About Us" section.

Pay Range: $66,000.00 - $87,000.00