What You'll Do:
The Lead Associate Principal, Linux Server Administration is responsible for overseeing the design, implementation, and optimization of enterprise-wide Linux server infrastructure with a focus on automation and containerization platforms across on-premises and cloud environments. This role provides technical leadership and strategic direction for Linux systems architecture, Ansible Automation Platform, Red Hat Satellite, OpenShift, and AWS cloud infrastructure while mentoring team members and ensuring high availability, security, and performance across all Linux systems. The position serves as the primary technical authority for complex Linux server challenges and drives innovation in infrastructure automation, cloud-native development, hybrid cloud integration, and enterprise disaster recovery solutions.
Primary Duties and Responsibilities:
To perform this job successfully, an individual must be able to perform each primary duty satisfactorily.
- Lead the design, deployment, and maintenance of enterprise Linux server environments (RHEL, CentOS, Ubuntu, SUSE, Amazon Linux) with hands-on configuration and troubleshooting across on-premises and AWS cloud infrastructure
- Plan, execute, and manage enterprise-wide Linux patching strategies including security patches, kernel updates, and critical vulnerability remediation across thousands of servers
- Develop and maintain comprehensive disaster recovery (DR) plans for Linux infrastructure including RPO/RTO targets, failover procedures, and recovery testing schedules
- Implement and enforce CIS (Center for Internet Security) benchmarks and security baselines across all Linux systems including automated compliance scanning, remediation, and reporting
- Plan, execute, and manage RHEL operating system upgrades across enterprise environments including in-place upgrades (Leapp), migration strategies, and rollback procedures
- Develop and implement infrastructure automation strategies using Ansible Automation Platform (AAP) including playbook development, workflow orchestration, and automation controller management
- Manage and optimize Red Hat Satellite infrastructure for system provisioning, patch management, and content lifecycle management across the enterprise
- Implement and manage automated patching workflows using Red Hat Satellite, Ansible, and AWS Systems Manager for both on-premises and cloud environments
- Design, deploy, and manage AWS Linux EC2 instances including instance configuration, auto-scaling, and integration with AWS services
- Create, maintain, and manage AMI (Amazon Machine Image) lifecycle including image hardening, patching, golden image development, and automated AMI pipeline creation
- Implement AMI versioning strategies, testing procedures, and distribution processes across multiple AWS accounts and regions
- Design and implement disaster recovery solutions including backup strategies, replication technologies, failover automation, and multi-region/multi-site architectures
- Design and maintain NFS storage solutions and distributed file systems for enterprise applications
- Architect, deploy, and manage OpenShift container platforms and Kubernetes environments in hybrid cloud configurations
- Implement and support Red Hat Dev Spaces for cloud-native development workflows
- Conduct regular DR drills and testing to validate backup and recovery procedures
- Develop and maintain security hardening standards based on CIS benchmarks, STIG requirements, and organizational security policies
- Manage incidents, requests, and change management processes using ITSM tools such as ServiceNow including ticket resolution, escalations, and SLA compliance
- Maintain technical documentation, knowledge base articles, runbooks, and operational procedures in Confluence
- Establish and enforce Linux server security standards, hardening procedures, and compliance protocols across on-premises and cloud environments
- Oversee system performance monitoring, capacity planning, and optimization initiatives across all platforms
- Provide escalation support for complex technical issues and lead incident response efforts
- Collaborate with cross-functional teams including networking, storage, security, and application development
- Drive continuous improvement initiatives and evaluate emerging Red Hat, AWS, and cloud-native technologies
- Create and maintain comprehensive technical documentation, runbooks, and standard operating procedures
- Participate in on-call rotation and provide 24/7 support for critical systems as needed
- Lead vendor management activities and coordinate with Red Hat and AWS support
Supervisory Responsibilities:
- Provide technical mentorship and guidance to Linux administrators and junior team members
- Lead technical training sessions and knowledge transfer initiatives on Ansible, Satellite, OpenShift, AWS, patching, and DR procedures
Qualifications:
The requirements listed are representative of the knowledge, skill, and/or ability required. Reasonable accommodations may be made to enable individuals with disabilities to perform the primary functions.
- 10+ years of progressive hands-on experience in Linux/Unix system administration
- 5+ years in a technical leadership or senior engineering role
- Strong hands-on experience with Ansible Automation Platform (AAP) including automation controller, execution environments, and workflow development
- Proven expertise in Red Hat Satellite for system lifecycle management and content management
- Extensive experience planning and executing enterprise-scale Linux patching programs including change management, patch testing, and emergency patching procedures
- Demonstrated experience designing and implementing disaster recovery solutions for Linux infrastructure including backup/restore, replication, and failover strategies
- Demonstrated experience planning and executing RHEL OS upgrades across major versions (e.g., RHEL 7 to 8, RHEL 8 to 9) using Leapp and other upgrade methodologies
- Extensive hands-on experience with AWS Linux EC2 instances, including Amazon Linux and RHEL on AWS
- Demonstrated experience in AMI creation, customization, hardening, and lifecycle management
- Proven track record of building automated AMI pipelines using tools such as Packer, Ansible, or AWS Image Builder
- Demonstrated experience with AWS cloud services and hybrid cloud architectures
- Extensive hands-on experience with OpenShift container platform and Kubernetes orchestration
- Demonstrated experience implementing and managing NFS and distributed storage solutions
- Working knowledge of Red Hat Dev Spaces for development environment provisioning
- Proven track record of designing and implementing large-scale automated Linux infrastructure in hybrid environments
- Strong understanding of DevOps principles and CI/CD methodologies
- Excellent problem-solving abilities and analytical thinking skills
- Outstanding communication skills with ability to explain technical concepts to non-technical stakeholders
- Strong project management capabilities and ability to manage multiple priorities
- Red Hat certifications (RHCE, RHCA) and/or AWS certifications (Solutions Architect, SysOps Administrator) highly preferred
Technical Skills:
Required Core Skills:
- Advanced hands-on proficiency in Red Hat Enterprise Linux administration and troubleshooting
- Extensive experience with Linux patching and patch management including:
  - Enterprise-scale patch deployment using Red Hat Satellite and Ansible
  - Patch testing and validation in non-production environments
  - Emergency and zero-day vulnerability patching procedures
  - Kernel patching strategies including live patching (kpatch)
  - Patch rollback and recovery procedures
  - Compliance reporting and audit trail maintenance
  - Patch scheduling and maintenance window coordination
  - AWS Systems Manager Patch Manager for cloud-based patching
- Expert-level disaster recovery and business continuity experience including:
  - Backup and restore strategies (Bacula, Veeam, AWS Backup, snapshots)
  - Replication technologies (rsync, DRBD, storage-level replication)
  - Multi-site and multi-region DR architectures
  - RPO/RTO analysis and optimization
  - Failover and failback automation
  - DR testing and validation procedures
  - Disaster recovery documentation and runbooks
  - Cloud-based DR solutions (AWS disaster recovery services)
- Extensive experience with RHEL OS upgrade processes including:
  - In-place upgrades using Leapp utility (RHEL 7 to 8, RHEL 8 to 9)
  - Pre-upgrade assessment and compatibility testing
  - Application compatibility validation and remediation
  - Upgrade automation using Ansible and Satellite
  - Rollback and disaster recovery planning for upgrade failures
  - Post-upgrade validation and system optimization
  - Managing kernel and package dependencies during upgrades
- Expert-level experience with Ansible Automation Platform (AAP) including playbook development, roles, collections, automation controller, and execution environments
- Strong expertise in Red Hat Satellite for provisioning, patch management, configuration management, and content views
- Extensive hands-on experience with AWS Linux EC2 including instance management, auto-scaling groups, launch templates, and Amazon Linux/RHEL optimization
- Expert-level experience with AMI creation and management including:
  - Building custom AMIs using Packer, AWS Image Builder, or manual processes
  - AMI hardening and security baseline implementation
  - Automated AMI patching and update workflows
  - AMI versioning, tagging, and lifecycle policies
  - Golden image development and maintenance
  - Cross-account and cross-region AMI sharing and distribution
  - AMI testing and validation procedures
- Proficiency with AWS services including VPC, security groups, IAM, EBS, EFS, S3, CloudWatch, Systems Manager, AWS Backup, and AWS CLI
- Extensive hands-on experience with OpenShift/Kubernetes including deployment, cluster management, CI/CD pipelines, and troubleshooting
- Proficiency with NFS configuration, performance tuning, and high-availability implementations
- Experience with Red Hat Dev Spaces for containerized development environments
Additional Technical Skills:
- Expert-level scripting capabilities (Bash, Python, Perl)
- Experience with infrastructure as code tools (Terraform, CloudFormation, AWS CDK, Packer)
- Experience with additional configuration management tools (Puppet, Chef, SaltStack)
- Deep knowledge of containerization technologies (Docker, Podman)
- Strong networking knowledge (TCP/IP, DNS, DHCP, routing, firewalls, AWS networking)
- Proficiency with monitoring and logging solutions (LogicMonitor, Splunk, Nagios, Prometheus, Grafana, ELK Stack, CloudWatch)
- Experience with storage technologies (SAN, LVM, GlusterFS, Ceph, EBS, EFS, filesystems)
- Experience with backup solutions (Bacula, Veeam, CommVault, AWS Backup, snapshots)
- Knowledge of web services (Apache, Nginx) and databases (MySQL, PostgreSQL, MongoDB, RDS)
- Familiarity with version control systems (Git, GitLab, GitHub, CodeCommit)
- Understanding of security frameworks and compliance standards (PCI-DSS, HIPAA, SOC 2, AWS Well-Architected Framework, CIS benchmarks)
- Experience with GitOps practices and CI/CD pipelines
- Knowledge of virtualization technologies (VMware, KVM, Xen)
- Experience with high availability clustering (Pacemaker, Corosync)
- Knowledge of package management (RPM, YUM, DNF) and repository management
- Experience with API integrations and automation workflows
Education and/or Experience:
- Minimum: Bachelor's degree in Computer Science, Information Technology, or related field
- Preferred: Master's degree in related field or equivalent combination of education and experience
- 10+ years of relevant hands-on Linux system administration experience required
- 3+ years of hands-on experience with Ansible Automation Platform required
- 3+ years of experience with Red Hat Satellite and OpenShift preferred
- 3+ years of hands-on experience with AWS Linux EC2 and cloud infrastructure required
- 2+ years of hands-on experience with AMI creation, management, and automation required
- Demonstrated experience executing RHEL OS upgrades across multiple major versions required
- Extensive experience with enterprise Linux patching programs and disaster recovery planning required
- Proven experience implementing CIS benchmarks and security hardening across Linux environments required
- Working experience with ITSM tools (ServiceNow, JIRA, Confluence) for ticket management and documentation required
Certificates or Licenses:
- Red Hat certifications (RHCE, RHCA, Red Hat Certified Specialist in Ansible Automation, or OpenShift certifications) strongly preferred
- AWS certifications (AWS Certified Solutions Architect, AWS Certified SysOps Administrator, or AWS Certified DevOps Engineer) strongly preferred
- Security certifications (Security+, CISSP, or CIS certification) a plus
- ITIL Foundation certification a plus
About Us:
The Options Clearing Corporation (OCC) is the world's largest equity derivatives clearing organization. Founded in 1973, OCC is dedicated to promoting stability and market integrity by delivering clearing and settlement services for options, futures and securities lending transactions. As a Systemically Important Financial Market Utility (SIFMU), OCC operates under the jurisdiction of the U.S. Securities and Exchange Commission (SEC), the U.S. Commodity Futures Trading Commission (CFTC), and the Board of Governors of the Federal Reserve System. OCC has more than 100 clearing members and provides central counterparty (CCP) clearing and settlement services to 19 exchanges and trading platforms. More information about OCC is available at www.theocc.com.
Benefits:
A highly collaborative and supportive environment developed to encourage work-life balance and employee wellness. Some of these components include:
- A hybrid work environment, up to 2 days per week of remote work
- Tuition Reimbursement to support your continued education
- Student Loan Repayment Assistance
- Technology Stipend allowing you to use the device of your choice to connect to our network while working remotely
- Generous PTO and Parental leave
- 401k Employer Match
- Competitive health benefits including medical, dental and vision
Visit https://www.theocc.com/careers/thriving-together for more information.
Compensation:
- The salary range listed for any given position is exclusive of fringe benefits and potential bonuses. If hired at OCC, your final base salary compensation will be determined by factors such as skills, experience and/or education.
- In addition, we believe in the importance of pay equity and consider internal equity of our current team members as part of any final offer.
- We typically do not hire at the maximum of the range in order to allow for future and continued salary growth. We also offer a substantial benefits package, as noted on www.theocc.com/careers.
- All employees may be eligible for a discretionary bonus. Discretionary bonuses are based on various factors, including, but not limited to, company and individual performance and are not guaranteed.
Salary Range $135,100.00 - $181,900.00
Incentive Range 8% to 15%
This position is eligible for an annual discretionary incentive compensation award, for which the target range is listed above (see Incentive Range). The amount of such award, if any, will be based on various factors, including without limitation, both individual and company performance.
Step 1: When you find a position you're interested in, click the 'Apply' button. Please complete the application and attach your resume.
Step 2: You will receive an email notification to confirm that we've received your application.
Step 3: If you are called in for an interview, a representative from OCC will contact you to set up a date, time, and location.
For more information about OCC, visit www.theocc.com.
OCC is an Equal Opportunity Employer.