Site Reliability Engineer Job at Softworld, a Kelly Company, Atlanta, GA

  • Softworld, a Kelly Company
  • Atlanta, GA

Job Description

The Cloud Site Reliability Engineer (SRE) works closely with the cloud development team, the IT operations team, and business partners to streamline and implement enhanced monitoring and alerting capabilities across the infrastructure and application layers. By leveraging automation tools, SREs address and resolve issues, minimizing manual workload and enhancing system scalability and reliability. Their core focus is standardization and automation to build and run fault-tolerant systems. Typically, SREs have a background in software engineering, systems engineering, or system administration, coupled with substantial IT operations experience. SREs oversee availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning.

  • Write and develop code to automate processes such as analyzing logs, testing production environments, and responding to issues.
  • Collaborate with agile teams and business partners to develop specifications that resolve problems and address enhancement needs, with a focus on monitoring and metrics for operational readiness.
  • Identify bottlenecks in development and deployment processes and design automation solutions to mitigate them.
  • Develop new capabilities for displaying, monitoring, and alerting on key performance indicators by tracking business transactions in real time.
  • Maintain and grow knowledge of platform configuration management, monitoring of established metrics, and troubleshooting.
  • Provide continuous feedback to development teams on system stability, defect analysis, and system enhancements.
  • Design and develop alert escalation and incident response automation.
  • Provide production support for cloud service outages and incidents, and work on both tactical and strategic plans for outage prevention.
  • Provide feedback on the resiliency and maintainability of solutions to Cloud and App architects.
  • Conduct disaster recovery scenario generation and testing.
  • Implement sustainable, audit-ready processes that support information technology controls, including deployment execution, access management, audits, incident management and related requirements.

Must-have technical skills:

  • At least 3 years’ experience as a site reliability engineer on a cross-functional agile team working in Azure.
  • Working knowledge of agile development methodologies (Scrum, sprints, Kanban, etc.) and tools (Azure DevOps, etc.).
  • At least 3 years of hands-on experience using the IaC tools Terraform, GitHub, Ansible, and Packer.
  • Proven experience across testing, integration, source code management, deployment, and containerization.
  • Sound problem-solving skills, with the ability to quickly process complex information and present it clearly and simply.
  • Experience with cloud technologies and services, including those for Compute, Storage, Databases, and API Management.
  • On-premises to cloud migration experience.
