IBM Cloud Platform is seeking to hire a Cloud Site Reliability Engineer (SRE) Technical Leader. This is a great opportunity to be part of a team working with IBM’s leading technology. The role joins a highly collaborative, cross-organization environment with a “whatever-it-takes” leadership team. The position requires teamwork to improve the integration and consistency of implementation and operational tools and processes across the Bluemix platform operations, development, support, and SRE teams. The SRE Technical Leader focuses on Availability, Performance, Automation, Efficiency, and Change Management across the entire IBM Cloud Platform, driving stability and reliability concepts into every SaaS Development Team.
This role requires first-hand experience and achievement within a large-scale organization that has an established Site Reliability Engineering practice. The successful candidate will have instituted SRE concepts within a Cloud technology organization and will have worked in a Site Reliability Engineering team in the Cloud industry.
The SRE Technical Leader will be responsible for the technology decisions and operations architecture needed to provide a platform with market-leading availability levels across all delivery models: Public, Dedicated, and Local. Responsibility spans the Bluemix platform and all core components (Cloud Foundry, Docker, OpenStack, Cloud Foundation Services, BSS, and UI/CLI). It also encompasses the services available via the platform, including middleware, data services, analytics, and other offerings. The platform will deliver platform capabilities, IBM services, and runtimes, and will offer third-party service integration. The continuous delivery aspect of the role involves managing the continuous maintenance of the environment, in which updates to the entire platform and all services are performed while maintaining availability of all workloads.
SRE Technical Leader Qualifications:
Experience delivering solutions to improve operational effectiveness and increase scalability of an SRE Team
Demonstrated proficiency in developing strategy and vision and in implementing technical solutions, using operational best practices and leading technology, to improve stability and reliability across all aspects of the platform and Cloud Services
Proven ability to provide technical expertise and closely manage technical interactions between SoftLayer and Bluemix local clients across Infrastructure as a Service (IaaS), Bluemix Platform as a Service (PaaS), and IBM’s Software as a Service (SaaS)
Ability to act as an Incident Commander during critical technical outages
Ability to provide direction to other IBM Business Units or Original Equipment Manufacturers (OEMs) in implementing new requirements for the strategic design of network, security, and capacity on a global basis
Strong conflict resolution skills and customer relationship acumen