Site Reliability Engineer 3 w/ 10 years experience

Annapolis Junction, MD · Information Technology
TO BE CONSIDERED FOR THIS POSITION YOU MUST HAVE AN ACTIVE TS/SCI W/ POLYGRAPH SECURITY CLEARANCE (U.S. CITIZENSHIP REQUIRED)
 

This is a position within an open source Accumulo product development team. The candidate will focus primarily on supporting all aspects of agile software development/engineering, including requirements analysis, software development, installation, integration, evaluation, enhancement, maintenance, testing, and problem diagnosis/resolution for the open source Accumulo product that is integrated on large-scale compute clusters.

 Core Competencies and Skills:
•    Willingness to be a committer/contributor to open source applications
•    Java programming for distributed systems, with experience in networking and multi-threading
•    Apache Hadoop
•    Apache Accumulo
•    Apache NiFi
•    Agile development experience
•    Well-grounded in Linux fundamentals and knowledge of at least one scripting language (e.g., Python, Ruby, Perl)
•    Experience with source code management practices and tools
•    Enabling tools: Git, Maven, Jira
•    Continuous Integration / Continuous Testing: Bamboo, Jenkins, GitLab CI/Pipelines
•    Continuous Monitoring: ELK Stack (Elasticsearch, Logstash, and Kibana), Nagios
•    Familiarity with microservices software development techniques and container orchestration (e.g., Kubernetes)

The candidate will be expected to perform requirements analysis, software development, installation, integration, evaluation, enhancement, maintenance, testing, and problem diagnosis/resolution at a high level of proficiency and independence. Duties may also include communicating directions and providing guidance to more junior programmers/analysts, as required. Additionally, Software Engineers may be responsible for evaluating project needs, determining tasks and durations, and generating and reviewing designs for technical accuracy and completeness.

One of the following certifications is required:
AWS Certified Solutions Architect - Professional
AWS DevOps Engineer Professional
Elastic Certified Observability Engineer


The Site Reliability Engineer provides support in software development/engineering, including requirements analysis, software development, installation, integration, evaluation, enhancement, maintenance, testing, and problem diagnosis/resolution. Provides support for highly distributed, massively parallel computation needs such as HBase, Hadoop, Accumulo, Bigtable, Cassandra, Scality, et cetera.
Cloud Systems Administrator or Developer Certification.
A Bachelor's degree in Computer Science or a related technical field is highly desired and will be considered equivalent to two (2) years of experience. A Master's degree in a technical field will be considered equivalent to four (4) years of experience. NOTE: A degree in Mathematics, Information Systems, Engineering, or a similar field will be considered a degree in a technical field.
·      Ten (10) years of demonstrated experience developing software for UNIX or Linux operating systems.
·      Knowledge and experience with developing distributed storage routing and querying algorithms.
·      Experience in developing documentation required to support a program’s technical issues and training situations.
·      Ten (10) years of experience developing software systems using object-oriented programming languages (e.g., Java, Python, et cetera).
·      Experience developing solutions integrating and extending COTS products.
·      Demonstrated knowledge of analytical needs and requirements, query syntax, data flows, and traffic manipulation.
·      Ten (10) years of experience in developing system performance, availability, scalability, manageability, and security requirements for mid-to-large scale programs.
·      Experience designing, developing, testing, evaluating, and integrating information systems into a service-oriented environment.
·      Experience optimizing storage, retrieval, backup, and retention strategies across globally distributed, high throughput, text and multimedia storage within clustered or cloud environments.
·      Experience operating in a multi-threaded environment.
·      Experience debugging and troubleshooting complex software in a cloud environment.
·      Familiarity with Configuration Management and monitoring tools.
·      Familiarity with Agile software methodologies and practices.
Significant experience provisioning and sustaining network infrastructures, and experience developing, operating, and managing networks required to operate in a secure PKI-, IPsec-, or VPN-enabled environment.
 
  • Shall have fourteen (14) years of experience in software development/engineering, including requirements analysis, software development, installation, integration, evaluation, enhancement, maintenance, testing, and problem diagnosis/resolution.
  • Shall have ten (10) years of experience in system engineering/architecture.
  • Shall have ten (10) years of experience working with products that support highly distributed, massively parallel computation needs such as HBase, Hadoop, Accumulo, Bigtable, Cassandra, Scality, et cetera.
  • At least ten (10) years of experience writing software scripts using scripting languages such as Perl, Python, or Ruby for software automation.
  • At least four (4) years of experience managing and monitoring large cloud systems (>1,000 nodes).
  • Experience in performing and providing technical direction for the development, engineering, interfacing, integration, and testing of complete hardware/software systems, including monitoring the technical health of a system, improving organizational processes, and implementing postmortem (failure) analysis and incident management.
