Reporting to the Director of IT, the Systems Engineer will help set the technical direction of new IT solutions that support cutting-edge research in computational biology. The candidate will be experienced in system design and implementation and will bring a high level of expertise in areas such as high-performance computing (HPC), enterprise storage systems, and high-speed networking. The Systems Engineer will also advise desktop administrators on security, storage, and system domain management issues to help ensure high-quality desktop support services for a community of approximately 100 scientific researchers.
The ability to adjust to rapidly changing technical requirements is essential, as is a willingness to take initiative. The ability to multi-task, solve complex problems, and work well with various external groups is crucial. Strong interpersonal and communication skills are required. The candidate must be physically able to install rack-mounted servers and able to handle off-hours, on-site support calls. Must possess the following technical skills:
Carry out technical projects such as HPC cluster installation and configuration with tools like Puppet or Ansible, and CentOS upgrades.
Oversee the technical design, maintenance, and support of an HPC cluster with 6,500 CPU cores and 75,000 GPU cores, managed by the Sun Grid Engine (SGE) job scheduler.
Provide a software development environment based on Python, Perl, Java, and C/C++, along with research applications such as MATLAB and R.
Install, update, and troubleshoot enterprise NAS storage such as Isilon, Qumulo, and TrueNAS.
Provide escalated technical guidance to desktop administrators.
Interact with users and vendors, investigate system alarms, and maintain documentation.
As a member of the Systems Management team, must be a team player.
HPC and storage systems run 24x7; must be willing to respond to emergencies during off hours and weekends.
Additional responsibilities as assigned by the Director of IT.
Bachelor's degree or equivalent in education and experience, and at least three years of related experience.
Master's degree in a computer-related area.
Knowledge of other programming languages such as C/C++, Java, and PHP.
Good understanding of modern file systems, like NTFS, Ext4, XFS, ZFS, GPFS, Lustre.
Understanding of Data Center services including power, cooling, humidity and alarms.
Industry certifications in OS, Networking, Storage, and Virtualization.
Experience in an academic or research environment is a plus.
Expert knowledge of Linux (Ubuntu, CentOS) system design and integration.
In-depth knowledge of high-performance computing and parallel processing.
Fluency in Bash and Python scripting.
Good understanding of batch queue management systems like SGE/SLURM.
Good understanding of configuration tools like Puppet/Ansible.
Expert knowledge of TCP/IP and InfiniBand networking.
Good understanding of enterprise storage servers (iSCSI, NFS, CIFS).
Expert knowledge of Active Directory and LDAP in a mixed environment of Linux, Windows, and Mac systems.
Hands-on experience with VMware virtualization and cloud services.
Excellent written and verbal communication skills.
Relevant higher education and longer experience may offset some of the essential requirements listed above.
Equal Opportunity Employer / Disability / Veteran
Columbia University is committed to the hiring of qualified local residents.
Internal Number: 507883
About Columbia University
Columbia University is one of the world's most important centers of research and at the same time a distinctive and distinguished learning environment for undergraduates and graduate students in many scholarly and professional fields. The University recognizes the importance of its location in New York City and seeks to link its research and teaching to the vast resources of a great metropolis. It seeks to attract a diverse and international faculty and student body, to support research and teaching on global issues, and to create academic relationships with many countries and regions. It expects all areas of the university to advance knowledge and learning at the highest level and to convey the products of its efforts to the world.