HPC Software Engineer
Job posting number: #7066389
Posted: May 22, 2020
Application Deadline: Open Until Filled
Job Description
General Summary/Purpose:
The Maryland Advanced Research Computing Center (MARCC) is a state-of-the-art High Performance Computing (HPC) facility that provides resources (HPC, storage and analytics) for researchers at Johns Hopkins University, The University of Maryland at College Park and, eventually, all other schools in the state of Maryland. The software engineer will serve as a technical resource to all users on highly complex code development, architecture, debugging, profiling, optimization, documentation, installation and maintenance of open source scientific applications, as well as data mining and best practices for utilizing HPC resources. The software engineer will be a liaison between the systems group and the application support group. S/He is expected to carry out web programming, script processes, and develop applications that benefit MARCC staff and the research community.
Responsible for the creation, implementation, maintenance, performance, production support and documentation of various departmental and enterprise-wide application systems. This includes but is not limited to the installation, modification, and testing of new and/or upgraded applications (packages or home grown), operating systems, file structures, hardware, communication devices, and productivity tools. Applies analysis techniques and procedures to gather and then translate business requirements into functional/technical specifications and designs. Using functional specifications and designs, produces all or part of the deliverables. Maintains databases and application system code.
Responsible for full life-cycle of medium to large sized complex projects; strong technical skills; strong ability to understand complex business processes. Develops solutions based on extensive technical knowledge, skills and experience; influences client towards innovative/integrated solutions.
The responsibilities listed below are typical examples of the work performed by this position. Not all duties assigned to this position are included, nor is it expected that everyone in this position will be assigned every job responsibility.
ANALYSIS AND REQUIREMENTS GATHERING
Define complex business/clinical/education problems by meeting with clients to observe and understand current processes and the issues related to those processes. Provide written documentation of findings to share with the client and other IT colleagues.
Gather complex system requirements by meeting with clients and researching existing technology to understand the business requirements and possible solutions for new applications.
DESIGN AND DEVELOPMENT
Develop detailed tasks and project plans by analyzing project scope and milestones for complex projects in order to ensure product is delivered in a timely fashion according to software lifecycle standards.
Write functional/technical specifications from the complex system requirements, putting them into functional and technical descriptions for use by programmers and business analysts to develop technical solutions.
Develop/change data input, files/database structures, data transformation, algorithms, and data output by using appropriate computer language/tools to provide technical solutions for complex application development tasks.
Document code and associated processes by adhering to development methodologies, adding code comments and appropriate documentation to various knowledge-base system(s) to simplify code maintenance and to improve support.
Provide monitoring and guidance in application design and development to more junior staff.
Provide thought leadership in designing and developing innovative integrated solutions.
TESTING AND DOCUMENTATION
Create and document complex test scenarios using the appropriate testing tools to validate and verify application functionality.
Test all changes by using the appropriate complex test scenarios to ensure all delivered solutions work as expected and errors are handled in a meaningful way.
Author and maintain documentation by writing audience-appropriate materials to serve as technical and/or end-user references.
Mentor junior staff in testing tools and technologies by reviewing their work.
IMPLEMENTATION AND MAINTENANCE
Implement changes by adhering to the change management policies and procedures for any given project to communicate to all parties the nature, significance, and risk factors of the solution.
Monitor changes and resolve complex problems by responding as they occur, by reviewing all processing and output of the newly implemented solution, and by proactively ensuring the solution works successfully in order to satisfy the customer requirements and to provide a smooth transition to the new solution.
Provide support by investigating and resolving issues, including complex issues, to ensure prompt, effective service.
Describe the Position’s Roles & Interactions:
Identify and debug problems with scientific applications.
Install and maintain scientific applications.
Collaborate with research groups in application development and optimization.
Develop common tools that benefit application optimization and performance.
Provide software architecture expertise to procure external funding.
Ensure solutions released to the community are stable and usable.
Ensure resources meet the community’s needs and are highly available to the group with limited interruption.
Perform thorough and complex programming including designing architectural protocols to address research needs of faculty and students in a comprehensive manner.
Provide general HPC support.
Extensively document processes so that users can easily find useful information and other IT staff can perform routine tasks and provide backup.
Conduct extensive research to resolve HPC challenges.
Work closely with the facility’s director, systems and application groups to successfully implement policies and procedures.
Continuously evaluate new tools and technologies for use in existing and future clusters.
Recommend solutions and new technologies.
Provide required facility activity data for University and government reports.
Contribute to the development of materials and workshops describing best practices in application development.
Attend department and University-sponsored training to increase knowledge, improve skills, and learn new skills. May substitute University training for supervisor-approved, commercial, job-related course offerings.
Describe the specific systems, applications, projects for which the position is responsible:
Familiarity with scientific application management packages such as Lua modules (Lmod), Environment Modules, and Spack.
Familiarity with queuing systems such as SLURM, PBS, and Torque.
Excellent scripting skills (Python, Perl, shell).
Knowledge of scientific software applications in academic supercomputing environments is desired.
Experience in database programming (MySQL, MariaDB).
Proficiency in scientific programming languages (C, C++, or Fortran).
Familiarity with parallel programming (MPI and/or OpenMP).
Advanced knowledge of Linux and PHP/Python/Perl technology/toolkits. Proficiency with scientific applications such as MATLAB, R, and others per discipline.