Advance how our customers operate while you advance your career. Join GDIT as an Observability Systems Engineer and build an impactful career in enterprise IT, collaborating with people who are driven and resourceful like you.
MEANINGFUL WORK AND PERSONAL IMPACT: As an Observability Systems Engineer, the work you'll do at GDIT will directly support the mission of USCENTCOM. You will play a crucial role in ensuring the performance, reliability, and visibility of mission-critical applications, networks, and infrastructure. You will design, implement, and maintain observability solutions that deliver real-time insights across complex distributed systems, enabling rapid issue detection, improved operational readiness, and enhanced mission success.
WHAT YOU WILL DO:
Application Performance Monitoring (APM)
Design and implement APM solutions to monitor and optimize application performance across CENTCOM systems.
Analyze application behavior, identify bottlenecks, and provide actionable recommendations to improve performance and reliability.
Develop and maintain dashboards, alerts, and reports to track key performance indicators (KPIs).
End-User Experience Monitoring
Deploy tools and methodologies to measure end-user interactions with applications and services.
Analyze user experience metrics including response times, error rates, and service availability.
Collaborate with Development, Network, Cyber, and System Operations teams to enhance user experience and resolve mission-impacting issues.
Network Performance Monitoring
Implement and manage network performance monitoring platforms to ensure optimal network health.
Monitor traffic, latency, and throughput to identify and resolve performance issues.
Provide insights into network behavior and recommend improvements to enhance reliability and scalability.
Wire Traffic Monitoring
Deploy and maintain wire-level monitoring solutions to capture and analyze network packets.
Identify anomalies, troubleshoot issues, and ensure secure, efficient data transmission.
Leverage packet-level data to support incident response and root cause analysis.
Observability Tools & Integration
Configure, maintain, and optimize observability platforms including Dynatrace, AppDynamics, Riverbed Alluvio Suite, Splunk ITSI, and SolarWinds.
Support stakeholders in defining observability requirements and integrating monitoring tools into existing workflows.
Develop custom scripts, plugins, and integrations to extend monitoring capabilities.
Tools & Technologies
Dynatrace, AppDynamics, and Riverbed Alluvio Suite for full-stack application and network performance monitoring.
Splunk Enterprise / Splunk ITSI for log analytics, event correlation, and service health monitoring.
SolarWinds and NetScout for network performance, device monitoring, and packet-level visibility.
Prometheus and Grafana for metrics collection and visualization in containerized or DevSecOps environments.
Zeek, Suricata, and Wireshark for wire-data analysis, packet inspection, and network anomaly detection.
Proactive Monitoring & Incident Response
Establish proactive monitoring practices to detect and address issues before they impact mission operations.
Work with cross-functional teams to investigate and resolve incidents with minimal downtime.
Deliver detailed post-incident analysis and recommendations for future prevention.
Documentation & Knowledge Sharing
Create and maintain documentation for observability tools, processes, and best practices.
Train and mentor team members on observability methodologies and toolsets.
WHAT YOU'LL NEED TO SUCCEED: Bring your technology expertise and drive for innovation to GDIT. The Observability Engineer/Systems Engineer Senior must have:
Certification: Security+ CE or higher (DoW 8140 compliant)
Experience: 5+ years of related work experience
Required Technical Skills:
Strong back-end engineering capabilities, including building, testing, and validating systems in controlled environments prior to deployment on production networks.
Strong experience with observability platforms such as Dynatrace, AppDynamics, Riverbed Alluvio Suite, Splunk ITSI, and SolarWinds.
Proficiency in APM, end-user experience monitoring, network performance monitoring, and wire traffic analysis.
Hands-on experience with network protocols, packet analysis, and traffic monitoring tools.
Familiarity with scripting and automation (Python, PowerShell, Bash).
Strong analytical and troubleshooting skills across application, network, and infrastructure layers.
Excellent communication skills with the ability to convey technical insights to diverse audiences.
Demonstrated ability to collaborate with developers, network engineers, cybersecurity teams, and operations personnel.