Senior Systems Engineer

Columbia University
United States, New York, New York
Jan 04, 2025

  • Job Type: Officer of Administration
  • Bargaining Unit:
  • Regular/Temporary: Regular
  • End Date if Temporary:
  • Hours Per Week: 35
  • Standard Work Schedule:
  • Building:
  • Salary Range: $120,000-$160,000


The salary of the finalist selected for this role will be set based on a variety of factors, including but not limited to departmental budgets, qualifications, experience, education, licenses, specialty, and training. The above hiring range represents the University's good faith and reasonable estimate of the range of possible compensation at the time of posting.

Position Summary

Columbia University Irving Medical Center (CUIMC), located in New York City, is one of the premier Ivy League schools in the country. Columbia is home to almost 100 faculty and alumni who have won Nobel Prizes, setting a high bar for education and research. The Center for Computational Biology and Bioinformatics (C2B2) manages a secure High-Performance Computing (HPC) facility with thousands of CPU cores, millions of NVIDIA GPU CUDA cores, 50 TB of memory, tens of petabytes of storage, and 100-200 Gbps networking. C2B2 serves dozens of research labs, including the highly regarded Irving Comprehensive Cancer Center, JP Sulzberger Genome Center, Mailman School of Public Health, and AI/ML Radiology Lab.

Reporting to the Director of IT, the Senior Systems Engineer will participate in the University's research mission. The engineer will manage the complex infrastructure described above for highly skilled researchers (PhD students and above). The position requires significant technical and customer service skills, multi-tasking, and resource management across both the on-campus data center and the Cloud, where security of clinical data is critical.

Responsibilities

Specific responsibilities include, but are not limited to:



  • Administer an infrastructure based on Red Hat/Rocky Linux, web servers (Apache, Nginx, uWSGI), database servers (MySQL/MariaDB), and Wiki.
  • Manage the OpenHPC stack, Warewulf, SLURM, Ansible, XDMoD, Open OnDemand, Prometheus, Nagios, Grafana, containers (Singularity/Kubernetes), OpenStack, virtualization (VMware/KVM), versioning (CVS/Git), and data movers (Globus).
  • Manage the 100-200 Gbps IP and IB networking that interconnects all HPC sub-systems, and work with the networking team on campus and external networks.
  • Administer a development environment based on gcc, Python, NumPy, Jupyter, R, Java, MATLAB, AlphaFold, OpenMPI/mpich2, and more as the need arises.
  • Manage storage servers (Weka, Isilon, TrueNAS) and backups, with emphasis on security of large volumes of data containing sensitive information (PHI/PII).
  • Manage security of the whole environment using tools such as strong encryption, SSH, SELinux, VPN, and firewalls. Assist the CISO in conducting periodic PenTests and IT audits and in addressing the gap analysis.
  • Keep abreast of the rapidly changing Research IT landscape, geared towards AI/ML.
  • Promptly handle system alerts and work with the hardware vendors to ensure resolution of issues, prevent data loss, and ensure safety of the environment.
  • Keep text and video systems documentation on the Wiki up to date to help researchers optimize their use of computing resources.
  • Install/move equipment weighing up to 50 lbs.
  • Respond to occasional emergencies outside working hours to ensure the availability of all systems, which run in a 24x7 mode.
  • Additional related responsibilities as assigned by the Director of IT.


Minimum Qualifications



  • Bachelor's degree in an IT-related field.
  • Minimum four years of related experience.
  • Expert-level understanding of Linux clusters and scripting (Bash/Python/PHP).
  • Demonstrated knowledge of managing enterprise-level IT systems.
  • Excellent written and verbal communication skills to collaborate with fellow team members and conduct user training.
  • Experience requirements may be lowered for holders of higher degrees, and candidates with less experience may be considered for a lower-level position.


Preferred Qualifications



  • Master's degree in an IT-related field.
  • Additional IT certifications in DevOps, Security, and Cloud technologies.
  • Knowledge of Microsoft Windows, PowerShell, Active Directory, LDAP & Kerberos.
  • Knowledge of HIPAA and NIST federal regulations.
  • Experience with networking tools such as iPerf/perfSONAR/Zabbix/SolarWinds.
  • Experience managing co-location services in modern data centers.
  • Experience with ServiceNow, iLab, or other enterprise-level issue-tracking systems.
  • Experience in an educational or research environment.


Equal Opportunity Employer / Disability / Veteran

Columbia University is committed to the hiring of qualified local residents.
