Cloud System Administrator 1

Annapolis Junction, MD · Information Technology

The team is searching for a Cloud System Administrator 1 to work on a cloud platform built with Java on Free and Open Source Software products, including Kubernetes, Hadoop, and Accumulo, that enables the execution of data-intensive analytics on a managed infrastructure. This position is on the Operations Team, which ensures day-to-day operational stability and provides customer support along with technical troubleshooting and repair expertise. The ideal candidate is self-motivated, thrives in a fast-paced team environment, and proactively completes tasks with strong attention to detail. The candidate will be exposed to a variety of technologies depending on customer requirements. The candidate should have a strong background in troubleshooting operational issues in a Linux environment. Additional knowledge of Docker, Kubernetes, and Hadoop, as well as scripting experience in languages such as Python and Bash, is beneficial. Other experience that could benefit the candidate: Prometheus, JIRA, Hadoop Distributed File System (HDFS), virtualization, Salt, Grafana, OpenStack, and AWS.

The Cloud System Administrator 1 provides support for the implementation, troubleshooting, and maintenance of large clusters.

  • Shall have at least one (1) year of experience performing system administration and monitoring of a large distributed system consisting of:

Multiple clusters;
Clustering implemented across at least 3 racks of equipment;
Minimum of 60 nodes per site.

  • Shall have experience diagnosing and troubleshooting large-scale cloud computing systems, including familiarity with distributed systems for storage and retrieval of data, e.g. Hadoop, Cassandra, Scality, Swift, Gluster, Lustre, GPFS, Amazon S3, or another comparable technology for big data management or high-performance computing.

  • Shall have demonstrated ability to work within a predefined, mission-focused team structure, follow SOPs, communicate effectively, accept constructive feedback, and receive technical guidance and advice from senior-level technical resources.

  • Shall have demonstrated a willingness to learn new technologies and to leverage senior-level resources within the team structure to expand their current technical foundation.

  • Demonstrated ability to work independently on complex tasks and a willingness to educate and train more junior technical resources.

  • Demonstrated ability to plan, communicate, lead, and oversee complex technical tasks requiring interaction with multiple groups.

  • Shall have three (3) years of experience writing software scripts using scripting languages including Bash, Perl, or Python.

  • Shall have five (5) years of experience demonstrating a fundamental understanding and working knowledge of core components of the Linux operating system, including the management of user and group accounts in LDAP and the configuration of DHCP, DNS, and TFTP.

  • Shall have demonstrated experience with configuration management tools including Puppet and Salt.

  • Understanding of the end-to-end Linux PXE/network provisioning process, including familiarity with Anaconda Kickstart configurations, RAID controller utilities, TFTP images, and disk detect scripts.

  • Experience accessing and troubleshooting systems via remote utilities to perform hardware diagnosis and repair, including VNC, serial-over-LAN interfaces, IPMI, and BIOS-level configuration.

  • Understanding of overall corporate architecture as well as familiarity with OpenSSL and Java keystore manipulation.

  • Experience troubleshooting commodity hardware platforms, including previous experience with SGI/HP hardware such as SGI’s J series.

One (1) year of experience is required.
A Bachelor’s Degree in Engineering, Systems Engineering, Computer Science, or Mathematics is highly desired and will be considered equivalent to two (2) years of experience.
A Hadoop/Cloud System Administrator Certification or comparable Cloud System/Service Certification is required.

  • Advanced knowledge of SSH tunneling and protocols, including the implementation of dynamic SOCKS proxies, as well as other SSH-based utilities including rsync, pdsh, pdcp, and WinSCP.

  • Basic understanding of low-level network concepts including VLANs, port channel bonding, and layer 2/layer 3 switch interactions.

  • Familiarity with software load balancers for large-scale web service implementations, including HAProxy and NGINX.

  • Experience with Kubernetes orchestration services and Docker images.

  • Experience with log aggregation and search tools including Elasticsearch, Logstash, Filebeat, Grafana, and rsyslog.
