Senior Site Reliability Engineer

Sandy Springs, GA · Information Technology
This is an exciting opportunity for a Senior Site Reliability Engineer on the Consumer SRE Team in the IMT division to provide secure, resilient, scalable, and maintainable services for mortgage borrowers and lenders. IMT is a division of our client, based in Atlanta, which operates numerous financial and commodity marketplaces and exchanges, including the New York Stock Exchange (NYSE).
 
Automation is a big part of what we do: we use infrastructure-as-code within our hybrid cloud to bring stability and scalability to Windows, Linux, Docker, and serverless applications in AWS, Azure, and on-prem environments. We reduce toil through scripting and automation of repetitive tasks. You will collaborate with developers to deliver robust services, build actionable alerts that detect or prevent incidents and surface performance bottlenecks, and build automation to remediate issues.

Responsibilities
  • Employ deep troubleshooting skills to improve the availability, performance, and security of Ellie Mae Services
  • Ensure services are designed for 24/7 availability, operational readiness, and rigor
  • Implement proactive monitoring, alerting, trend analysis and self-healing systems
  • Define and measure KPIs and SLOs
  • Build automated deployments, automated tests, and operational tools
  • Participate in on-call rotation for Production support
  • Collaborate with Product and Support teams to plan and deploy product releases
  • Partner with other SREs and lead by example
Knowledge and Experience
  • 10+ years of application/systems engineering in 24/7 production services environments
  • BS in Computer Science, Computer Engineering, Math, or equivalent professional experience
  • Excellent troubleshooter, utilizing a systematic problem-solving approach
  • Demonstrated ability to lead incident response and root cause analysis (RCA)
  • Fluency in one or more current-generation scripting languages used by SRE/DevOps professionals (PowerShell, Python, Perl, PHP, Ruby), plus Java/.NET development experience
  • Experience running a SaaS application in a public cloud, on-prem or hybrid cloud environment
 
Additional credit for:
  • Proficiency in Windows and on-prem environments
  • Experience with Continuous Integration and Continuous Delivery concepts
  • Automation in Rundeck or Jenkins
  • Infrastructure-as-code or configuration management, using tools like Terraform, CloudFormation, or Chef/SaltStack/Puppet/DSC
  • Containers/Docker/microservices

 
