Description:
We are looking for a hands-on Head of Engineering to own the full technical stack from day one. This is not a management-track role. This is not an oversight position. This role exists to architect, build, and evolve a production AI platform where reliability, speed, and intelligence compound directly into user trust every single day. You will work closely with the Founder and CPO on system design, technical direction, and long-term platform architecture.
This role carries real ownership. The systems you build will directly determine how well WYLE executes, scales, and delivers on its promise to users. There is no downstream team to inherit your decisions, no architecture committee to defer to, and no "we'll fix it in v2." If the platform is wrong, you are the person who makes it right.
You Will Own
- The full technical architecture end-to-end: AI pipelines, data layer, real-time infrastructure, APIs, and cloud infrastructure
- Multi-model AI orchestration across providers: routing, fallback, tool use, and structured output at scale
- Agentic and multi-agent systems: intent classification, task decomposition, planning, and execution at production reliability
- Retrieval and memory systems: context representation, knowledge retrieval, and hybrid search at scale
- On-device intelligence: low-latency, privacy-preserving inference for real-time user interactions
- Voice AI: real-time audio processing pipelines with sub-200ms response latency
- Platform integrations: email, calendar, messaging, and third-party services
- Cloud infrastructure: scalable, cost-efficient, production-grade from day one
- Engineering standards, sprint execution, and the technical hiring roadmap
- Applied R&D that ships directly into production, gets measured, and improves over time
You Have
- 10+ years of hands-on software engineering, with complex systems shipped to real users
- At least 3 years in a technical leadership role where you stayed deeply in the code
- Deep production experience with large language model APIs under real reliability requirements, not demos or prototypes
- Hands-on experience building retrieval and memory systems at scale, including graph-based and sparse retrieval approaches
- Experience deploying small language models and quantized inference in production environments
- Strong backend engineering across high-performance, low-latency service architectures
- Real-time systems experience: streaming data, event pipelines, and push infrastructure
- Hands-on experience with agentic AI system design and multi-agent coordination in production
- Cloud infrastructure ownership at scale: containerisation, orchestration, CI/CD, and monitoring
- Comfort operating in ambiguity, early-stage environments, and high-ownership cultures
- Currently operating as a Head of Engineering, Principal Engineer, Staff Engineer, or equivalent