- Critical Infrastructure Engineer (Fully Remote with Travel)
Description
This role requires 10–15+ years in multi-site colocation data center operations; hyperscale-only experience is not sufficient.
Type: Contract-to-hire
Compensation: $150-$200 / hour + travel
Location: Remote with travel to nationwide colocation sites
99 Mission Critical is a critical environment performance & risk audit platform designed specifically for multi-tenant colocation operators. This is an opportunity for a senior mission-critical engineer to take a technical leadership role in shaping how this service is delivered, refined, and scaled.
This is a hands-on, high-autonomy role for someone who wants to apply deep operational expertise in an early-stage, fast-moving environment, where their actions are directly tied to customer outcomes and company growth.
You will be expected to help define and iterate on service scope on a customer-by-customer basis, as well as share and defend your opinions.
Main Responsibilities:
- Lead comprehensive, on-site Critical Environment Performance & Risk Audits within live multi-tenant colocation facilities.
- Serve as Technical Lead for the audit service, exercising discretion over methodology refinement, scope prioritization, instrumentation standards, and service evolution based on field experience.
- Evaluate mechanical plant performance including chilled water systems, economizers, pump staging logic, VFD optimization, partial-load efficiency behavior, and redundancy mode impacts on efficiency.
- Conduct hands-on airflow and white space diagnostics including plenum pressure mapping, containment validation, bypass airflow identification, rack-level ΔT verification, and thermal imaging analysis.
- Identify overcooling, economizer underutilization, fan overspeed, low ΔT syndrome, and inefficiencies tied to redundancy staging.
- Design and deploy short-term instrumentation strategies to validate system performance and quantify inefficiencies.
- Quantify defensible kW reduction and PUE improvement ranges from operational adjustments.
- Assess operational risk exposure including mechanical redundancy gaps, UPS/battery lifecycle risk indicators, firmware obsolescence awareness, open corrective maintenance trends, and asset health concerns.
- Evaluate telemetry reliability including sensor drift, alarm threshold validity, and monitoring blind spots.
- Identify gaps between documented preventative maintenance practices and observable field execution.
- Prioritize findings across energy, reliability, and lifecycle risk dimensions.
- Translate technical observations into structured audit reports, risk heat maps, and executive-ready summaries.
- Participate in client-facing sales discussions, site tours, executive briefings, and presentation of findings.
- Contribute to refinement and evolution of the audit framework based on real-world field experience.
- Advise leadership on service roadmap, expansion opportunities, and technical positioning based on real-world audit findings.
Requirements
Qualifications:
- 10–15+ years of mission-critical data center operations experience.
- Minimum of 5 years in multi-tenant colocation facilities (hyperscale-only experience is insufficient).
- Proven hands-on airflow diagnostics experience in active critical environments.
- Experience conducting structured or semi-structured facility performance or risk assessments.
- Strong systems-level understanding across mechanical and electrical infrastructure.
- Working knowledge of UPS efficiency behavior, redundancy staging tradeoffs, battery lifecycle considerations, and infrastructure inefficiencies at partial IT load.
- Experience identifying asset health, lifecycle, or maintenance execution risks in operational environments.
- Demonstrated ability to quantify estimated kW and PUE impact from operational changes.
- Comfortable presenting technical findings to facility leadership and executive stakeholders.
- Structured thinker capable of prioritizing issues across performance, risk, and feasibility dimensions.
- High professional credibility in front of experienced operators.
