We Hire America Jobs

WeHireAmerica.jobs is a service of HR Policy Foundation and DirectEmployers Association. These two non-profit organizations are providing this free resource to help educators, policy makers and job seekers understand the great employment opportunities available here in the U.S. at some of America's biggest and best companies.

Job Information

T. Rowe Price Principal Site Reliability Engineer, Cloud Infrastructure in Owings Mills, Maryland

A career at T. Rowe Price says you want to contribute and make a difference at a leading global investment management firm where success results from the dedication our associates have in building success for our clients. We are a growing organization of associates from diverse backgrounds, experiences, and perspectives. We take a long-term view on associates and their careers. Our associates do phenomenal work with purpose, and as a result, we provide growth opportunities through in-person and online training, management development programs, and career development on the job. If you are seeking a meaningful work experience along with a workplace culture that thrives on teamwork, we invite you to explore the opportunity to join us.

Overview

In this role as Principal Site Reliability Engineer, Cloud Infrastructure, you will form, develop, and lead a team of Site Reliability Engineers (SREs) focused on the observability, sustainability, scalability, measurability, and recoverability of T. Rowe Price’s innovative cloud and on-prem solutions by leveraging automation and best-of-breed tools. The successful candidate will have a strong operations and engineering background, be hands-on when needed, and have expertise in cloud environments (public and private), infrastructure operations, DevOps practices, CI/CD toolchain and systems, code build and deployment, incident response, and 24x7 monitoring and support.

The candidate will also have extensive experience in building and running an SRE function within a complex, distributed environment. They will have a demonstrated ability to work horizontally and vertically within an organization with diverse partners and sponsor groups.

Role summary and job responsibilities

  • Possesses extensive knowledge in own area of expertise and extensive in-depth knowledge of the broader portfolio for comprehensive understanding of up/downstream impacts across technology infrastructure

  • Overall responsibility for the design of technology solutions to prevent or minimize service disruptions

  • Prevents technology service disruptions through technology solution recommendations and automations

  • Fosters a culture of deep learning through blameless post-mortems to improve the shared goal of reliability across services

  • Transforms operations teams by facilitating internal change to adopt SRE standard methodologies across the organization and by driving strategic growth in this area within Global Technology

  • Analyzes incidents impacting technology availability for high-level trends across the broad portfolio

  • Drives initiatives to reduce or prevent technology failures in a complex, distributed technology environment

  • Pulls together information from disconnected systems into cohesive views of the technology portfolio for identifying trends, redundancies, and risk

  • Overall responsibility for creation and execution of road maps for applications and technology platforms

  • Demonstrates outstanding awareness of the complexities of the tech and asset management industries

  • May lead initiatives of varying degrees of complexity that span multi-functional areas

  • Contributes to definition of target state architecture and design of the technology environment

Requirements

  • 10+ years of relevant technology experience

  • 5+ years building and supporting solutions in Amazon Web Services (AWS)

  • 5+ years of experience building and running a DevOps and/or SRE function

  • Experience with implementation and operation of the chaos model at scale

  • Strategic and program-level implementation experience

  • Demonstrable experience implementing new technology, tools, and platforms

  • System administration and scripting experience

  • Demonstrable experience leveraging automation to proactively prevent or quickly remediate incidents

  • Fluent in multiple programming languages (e.g., Python, Java, Go, Node.js, .NET Core)

  • Proficiency with database development (SQL Server, PostgreSQL, MySQL, etc.)

  • Proficiency with defining, right-sizing, tracking, and reporting on Service Level Objectives (SLOs), Service Level Indicators (SLIs), system availability, and the progress and outcomes related to reliability

  • Experience with implementing and managing Error Budgets

  • Proficiency with understanding and explaining incident situations and their recovery plans to prevent recurrence

  • Knowledge/experience driving dashboard standardization across the ecosystem for observability, APM and infrastructure monitoring, and application-specific logging

  • Knowledge/experience with observability tools such as New Relic, Elastic Stack, Prometheus, Grafana, Splunk, and cloud native tools is desirable

  • Knowledge/experience with cloud management tools such as Ansible, Terraform, Vault, and Vagrant

  • Works independently, with guidance in only the most complex situations

  • Makes sound decisions with limited facts or resources

  • Balances strategic and pragmatic concerns when solving problems

  • Adjusts communication style and materials to suit a given audience

  • Able to clearly articulate operational principles, practices, and policies

  • Stays abreast of industry trends and technologies

  • Accountable for work of self and others; sets standards around which others will operate

  • Maintains a broad internal professional network and knows when to engage/activate it

  • Develops or mentors diverse talent on the team

  • Ability to be on-call and/or work during off-hours

Job Family: Infrastructure Operations

Track: Knowledge Management (KM)

Level: 5

T. Rowe Price is an equal opportunity employer and values diversity of thought, gender, and race. We believe our continued success depends upon the equal treatment of all associates and applicants for employment without discrimination on the basis of race, religion, creed, colour, national origin, sex, gender, age, mental or physical disability, marital status, sexual orientation, gender identity or expression, citizenship status, military or veteran status, pregnancy, or any other classification protected by country, federal, state, or local law.

T. Rowe Price is an asset management firm focused on delivering global investment management excellence and retirement services that investors can rely on, now and over the long term.
