Join us at Sparksoft, where we're not just another tech company—we're a catalyst for change. Our mission isn't just to offer IT solutions; it's to revolutionize the way you work. Here, passion isn't just a buzzword; it's the fuel behind groundbreaking ideas and transformative technologies. We serve a wide range of government clients, delivering impact that's felt across the nation.
Our true strength lies in our people. They're the problem-solvers and innovators consistently delivering extraordinary outcomes. With Sparksoft, you're not stepping into a routine job; you're joining a team committed to innovation and excellence. Our innovation extends beyond just delivering projects. Through our specialized Innovation Centers, we continuously refine our methods, ensuring we remain industry leaders. We are Sparksoft!
ROLE & RESPONSIBILITIES:
- Monitoring: Oversee monitoring of production and lower environments to ensure smooth operation of applications built on Java, Serverless, and containers. This includes 24/7 monitoring of high-transaction systems.
- Troubleshooting: Troubleshoot application and infrastructure problems, perform root cause analysis, and work towards their resolution.
- Escalations: Handle escalations promptly and efficiently, ensuring minimal disruption to operations.
- Documentation: Document all relevant incidents, actions, and solutions, maintaining a comprehensive database of queries and resolutions.
- Training: Train team members and other staff on operational procedures and troubleshooting techniques.
- Dashboard Building: Build dashboards on platforms such as Splunk, New Relic, Dynatrace, Prometheus, and CloudWatch. Solid hands-on experience is required.
- Alerts and On-Call Support: Set up and work with alerts, and provide on-call support via PagerDuty or similar tools.
- Relationship Building: Build and maintain professional relationships with customers and stakeholders.
- Escalation and Status Updates: Ensure timely escalation of incidents and problems that cannot be resolved within the expected time frame. Provide regular status updates to stakeholders, keeping them informed about the progress and resolution of issues.
REQUIRED EXPERIENCE:
- 5+ years of experience overseeing the monitoring of production and lower environments to ensure smooth operation of applications built on Java, Serverless, and containers.
- 3+ years of experience with 24/7 monitoring of high-transaction systems.
- 2+ years of experience troubleshooting application and infrastructure problems and working towards their resolution.
- 2+ years of experience managing incidents, actions, and solutions, maintaining a comprehensive database of queries and resolutions.
- 2+ years of experience in training team members and other staff on operational procedures and troubleshooting techniques.
- 2+ years of experience building dashboards on platforms such as Splunk, New Relic, Dynatrace, Prometheus, and CloudWatch.
- 1+ years of experience in setting up alerts and providing on-call support via PagerDuty or similar tools.
- Candidates must be able to obtain and maintain a Public Trust clearance.
- Candidates must have lived in the United States for at least 3 of the past 5 years.
PREFERRED EXPERIENCE:
- Current AWS Certification(s)
EDUCATION & CERTIFICATIONS:
- Bachelor’s degree in Computer Science, Information Technology, or a related field.
If you need an accommodation while seeking employment with Sparksoft Corporation, please email Sparksoft.Accommodations@sparksoftcorp.com or call 410-424-7700. Accommodations are made on a case-by-case basis.
At Sparksoft Corporation, we take the security and protection of personal information very seriously. We will never ask you to send private personal information over email. If you receive a suspicious request, please contact our security team immediately at abuse@sparksoftcorp.com.