Job Title: Principal Engineering Lead, Digital Operations Reliability
FLSA Exemption Status:
State/Business Line Specific: IT
Reports to: VP of Product, Digital Backbone
Supervises: Individual Contributor
InnovAge’s Program of All-inclusive Care for the Elderly (PACE) is an alternative to nursing facilities. Seniors receive customized healthcare and social support at a nearby PACE center. Each PACE participant has a team of medical experts dedicated to providing personalized healthcare and support to help them age at home.
The InnovAge Digital Backbone team is positioned to be proactively customer focused, to be the trusted thought partner to the business, to provide transformative experiences, platforms and services, and to unlock new business models in the Digital economy. We serve our workforce, our Participants and their Care Givers to deliver the best services and care in the industry.
It is a compelling opportunity to be part of a passionate, purposeful team that values Transparency, Inclusion, Diversity, Change Agency & Trust, with a meaningful mission to serve our Seniors, make health equitable & empower them to live life on their terms.
The vision of a Digitally connected enterprise is predicated on the need to foster digital dexterity & the highest quality of service and operational health of Cloud and OnPrem assets.
The need to establish a culture of always-on systems with high availability & resiliency, coupled with faster, predictable response times for business continuity and customer delight, is paramount.
Operations & Resiliency is a key pillar of the InnovAge Digital Backbone’s ability to consistently deliver on the promise of Digital transformation & customer-centric experiences.
Under the supervision of the VP of Product, Digital Backbone, the Principal Engineering Lead, Digital Operations Reliability focuses on driving Operational health & performance through predictive & proactive practices. As a leader in this space, this role will establish & grow the practice of Observability & Resiliency by Design, and modernize & rethink IT Operations for the Digital era.
Essential Functions and Work Responsibilities
Functional Category: Management
Estimated Percent of time Spent – 100%
- Strong expertise with Microsoft Skype for Business, Exchange Online and O365 infrastructure in a distributed environment.
- Build a culture of reliability engineering: Develop an E2E resiliency framework and toolkit, develop an Observability & Chaos testing strategy, assess existing systems for vulnerabilities, proactively build reliability into new development initiatives, review the E2E pipeline and define reliability gates. Modernize IT Ops by automating and leveraging ML Ops for predictive monitoring & support.
- Monitors, maintains, and supports the InnovAge storage environment.
- Standardizes instrumentation and monitoring of IT Applications such as Skype, SharePoint, Teams and others with unified telemetry to aid predictive problem detection & resolution.
- Collaborates with vendors to ensure adherence and adoption of standards.
- Generates technical documentation for standards, processes and procedures following ITIL methodologies.
- Adheres to all InnovAge compliance and information security policies, practices and procedures, which include the handling of systems and data.
- Ensures that the InnovAge compute environment is current within industry guidelines.
- Leads the design and implementation of new InnovAge network topologies and infrastructure as needed in order to support maximum system availability of all InnovAge applications.
- Ensures that backup and recovery are in place including requirements for Disaster Recovery.
- Monitors Telephony (Voice) and network performance and proactively reports on system status and utilization.
- Supports technology projects as required, while providing minimal disruption to operational systems.
- Ensures that leadership is aware of all known and identified risks to system availability, performance, reliability etc.
- Branch Refresh & Maintenance: Engage with stakeholders & sponsors to own & drive this key initiative across all Centers to ensure frictionless operations. Define a modernization roadmap that aligns with the Cloud-first, asset-lite & automation strategy for operational efficiency.
- Establish, expand and implement our Observability strategy, a modern IT Ops Center and an SRE Center of Excellence (COE) to deliver predictable customer experiences.
- Automate, X-as-Code: Identify and build automation to dramatically improve productivity, efficiency & predictability with an X-as-Code approach. Identify APIs and other approaches to manage cloud infrastructure resources, change controls & code deployments. Build closed-loop automation through the production environment for Dev, Deploy, Test & Share.
- Security, Reliability & Resiliency first Design: Design reliability into our ecosystem by defining & implementing practices in Resiliency Engineering, Automation, Observability & Chaos Testing while also ingraining a proactive Reliability Culture.
- Establish standards & mature IT Ops with automation & intelligence: Define & own a centralized One InnovAge Observability Platform as a Service with SLIs, SLOs & SLAs.
- Collaborate with Application owners, Architects & Data owners to design, instrument & implement for an operationally healthy digital ecosystem that extends beyond DevSecOps.
- Establish standards and benchmarks for response time metrics for digital assets in the Cloud and on-prem, and monitor performance against them.
- You are a thought leader in devising ways to proactively monitor our systems’ health & rapidly diagnose and recover when incidents do occur.
- Weave in with DevSecOps: Mature the practice to shift-left earlier in the DevOps cycle to enable testing and validation for quality & increased testing coverage.
- Establish KPIs and mechanics to achieve – Application Performance Monitoring, Network Performance Monitoring, Infrastructure Monitoring, Digital Experience Monitoring, Log Management, Security Monitoring – across InnovAge systems and digital assets.
- Collaborate with Product & Engineering teams to establish shared services & SDKs and drive adoption across teams.
- Establish Operations & Release readiness standards & change management processes.
- Modernize IT Ops & own the scope, outcomes & delivery
This role can be staffed anywhere in the US; the company is headquartered in Denver, CO, and the expectation is that the person filling the role will be available to work CO hours (8am – 5pm MT). Occasional travel (no more than 10%) will be required to visit InnovAge’s facilities.
To perform this job successfully, an individual must be able to perform each essential duty satisfactorily. Requirements listed below are representative of the knowledge, skill, and/or ability required. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.
REQUIRED
- Degree in Computer Science, Engineering or related technical discipline with a lean startup mindset.
- 10+ years of Operational leadership experience with a focus on customer experience improvement, predictive monitoring & running an IT Ops Center.
- Analytical, quantitative, and statistical skills to build business cases with forecasted growth.
- Experience envisioning, designing and pitching business cases to secure funding & collaborating with Engineering to iteratively evolve from concept to launch & growth.
- Proven track record of delivering results on cross-functional initiatives while managing multiple priorities.
- Effective communication skills (verbal and written) and proven ability to influence leadership.
- Demonstrated ability to understand and discuss deep technical concepts, schedule tradeoffs and opportunistic new ideas.
- Excellent interpersonal skills, with ability to work successfully in a matrixed organization across disciplines.
Other Knowledge Skills and Abilities Required
Computer Skills
- Must be computer proficient and possess experience with Microsoft Word, Excel, and Outlook.
- Must be able to quickly learn specific software and new applications.
Mathematical/Financial Skills
- Ability to apply concepts such as fractions, percentages, ratios, and proportions to practical situations.
- Able to analyze data and statistics and draw reasonable conclusions and compile accurate reports.
- Experience with P/L and developing and managing budgets
Language Skills
- Ability to read, analyze and interpret regulations and other documents.
- Strong interpersonal skills and ability to effectively and tactfully present information to, and communicate with, co-workers, employees, and others.
- Possess exceptional English written and verbal communication skills, including accurate grammar and business correspondence knowledge.
- Ability to read and write memos, reports, and correspondence that conform to prescribed style and format.
Reasoning Ability
- Ability to define problems, collect data, establish facts, and draw valid conclusions.
Other Skills and Abilities:
- Able to establish and maintain cooperative and positive working relationships.
- Organized, detail-oriented, diplomatic, proactive, self-motivated, dependable, and driven by excellence.
- Even-tempered and able to balance multiple tasks in accordance with changing deadlines and priorities in a fast-paced environment.
- Ability to work sensitively and effectively with individuals of diverse ethnic and cultural backgrounds.
InnovAge Service Standards Requirements
Safety
- Safety – Maintains a safe workplace. Reports all unsafe work conditions to supervisor and/or Safety & Loss Control Manager and works in conjunction with supervisor, Safety & Loss Control Manager, and staff to correct unsafe work conditions. Follows and enforces all safety policies.
Accountability
- Commitment – Commits to his/her job and to the success of the company. Continuously puts forth the effort to achieve goals and continuous quality improvement. Degree to which employee goes the extra step to ensure job/task completion. Takes initiative to offer ideas to improve processes or results.
- Cooperativeness – Consistently supports management decisions as demonstrated by his/her actions. Demonstrates a “can do” attitude by responding positively to instructions. Follows instructions and works harmoniously with others to complete the job or task.
- Attendance – Meets or exceeds punctuality and attendance expectations/requirements. Faithfully reports to work and conforms to scheduled work hours. When necessitated, follows call-in procedures and informs others of absences.
Caring
- Customer Service – Embraces the organization’s commitment to internal and external customer service and demonstrates a customer-centric approach when interacting with co-workers, participants, clients, and all other business contacts.
- Confidentiality – Maintains confidentiality of employee, participant, and client data/information, and any other sensitive organization information as appropriate.
Integrity
- Adherence to Company Policy – Follows and enforces guidelines as established by policies. Conforms to company and job standards and requirements. Shows respect for others. Acts in the best interests of the company at all times. Serves as an example for others. Conducts business in an ethical fashion.
- Reliability – Completes responsibilities with minimal direct supervision. Follows through with assigned jobs and tasks all the way through completion. Puts forth the effort to achieve goals and objectives under varying circumstances.
- Alignment with Company Goals & Objectives – Supports the organization’s mission, vision, and values, holding self accountable for applying these principles daily and personally living them when working with co-workers, participants, clients, and all other business contacts.
Quality
- Quantity of Work / Productivity – Produces at a high volume. Always puts forth the effort to maximize productivity. Meets or exceeds established work deadlines. Engages in a productive work effort whenever possible. Meets goals and objectives.
- Quality of Work – Produces work that is accurate and reliable. Accomplishes work quickly and efficiently. Works in a thorough and organized manner while minimizing down time. Results are consistently within acceptable quality standards.
- Job Knowledge – Demonstrates a thorough understanding of his/her job processes and procedures. Integrates knowledge to efficiently accomplish job requirements. Efficiently uses resources (including staff and management) to obtain additional knowledge.
- Communication – Exhibits good interpersonal skills. Develops and fosters professional relationships with co-workers, participants, clients, and vendors. Keeps others informed as directed by operational demands and need-to-know. Keeps self informed of announcements made via established company venues.
Enter physical requirements/Work Environment based on location of position