Senior Machine Learning Performance Engineer
Wayve
At Wayve we're committed to creating a diverse, fair and respectful culture that is inclusive of everyone based on their unique skills and perspectives, and regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, veteran status, pregnancy or related condition (including breastfeeding) or any other basis as protected by applicable law.
About us
Founded in 2017, Wayve is the leading developer of Embodied AI technology. Our advanced AI software and foundation models enable vehicles to perceive, understand, and navigate any complex environment, enhancing the usability and safety of automated driving systems.
Our vision is to create autonomy that propels the world forward. Our intelligent, mapless, and hardware-agnostic AI products are designed for automakers, accelerating the transition from assisted to automated driving. Join our world-class team as we tackle today's most complex challenges and pave the way for a smarter, safer future.
At Wayve, your contributions matter. We value diversity, embrace new perspectives, and foster an inclusive work environment. Make Wayve the experience that defines your career!
The Role
We are seeking skilled engineers to join our Machine Learning Platform team and work on optimising large-scale training jobs as we aim to scale our models through the next order of magnitude. The Machine Learning Platform team owns our GPU training infrastructure and the software abstractions around it, and you will have a specific focus on improving training efficiency.
Challenges you will own
- Maximising the model FLOPs utilisation (MFU) of our large-scale training jobs.
- Profiling and identifying bottlenecks in training code.
- Implementing GPU kernels to improve training throughput.
- Working closely with Research teams to integrate and test training efficiency improvements.
- Owning and improving our GPU training clusters.
About You
Essential:
- 5+ years of experience in performance optimisation or ML engineering.
- Experience optimising large-scale training jobs on GPU compute clusters.
- Experience working in platform teams and collaborating with research teams.
- Experience benchmarking performance, tracking it over time, and reporting the results in an open and accessible way.
- Ability to write high-quality, well-structured and tested Python code.
- BS or MS in Machine Learning, Computer Science, Engineering, or a related technical discipline, or equivalent experience.
Desirable:
- Solid experience working with concurrent, parallel and distributed computing.
- Experience using NVIDIA Nsight Systems.
- Experience implementing GPU kernels.
- Knowledge of computing fundamentals: what makes code fast, secure and reliable.
We understand that everyone has a unique set of skills and experiences and that not everyone will meet all of the requirements listed above. If you’re passionate about self-driving cars and think you have what it takes to make a positive impact on the world, we encourage you to apply.
This is a full-time role based in our office in Mountain View, California. At Wayve we want the best of all worlds so we operate a hybrid working policy that combines time together in our offices and workshops to fuel innovation, culture, relationships and learning, and time spent working from home. We operate core working hours so you can determine the schedule that works best for you and your team.
For more information visit Careers at Wayve.
DISCLAIMER: We will not ask about marriage or pregnancy, care responsibilities or disabilities in any of our job adverts or interviews. However, we do look to capture information about care responsibilities and disabilities, among other diversity information, as part of an optional DEI Monitoring form to help us identify areas of improvement in our hiring process and ensure that the process is inclusive and non-discriminatory.