Seeking experienced Machine Learning Engineers and Applied ML Researchers to design, evaluate, and solve complex machine learning challenges while helping improve next-generation AI systems.
Machine Learning Evaluation Specialist
Job description
Alignerr is seeking machine learning domain experts to help design advanced evaluation challenges for frontier AI systems.
This role focuses on building expert-level machine learning evaluation tasks that test the limits of state-of-the-art AI models using deep scientific and technical expertise.
Contributors will help shape how next-generation AI systems are:
- Benchmarked
- Evaluated
- Stress-tested
- Improved
The work directly supports AI evaluation and safety research at the frontier of the field.
Key Responsibilities
Evaluation & Benchmark Design
- Design original, expert-level machine learning problems grounded in specialized research domains
- Create evaluation tasks requiring advanced expertise beyond standard ML pipelines
- Define problem statements, evaluation criteria, and gold-standard solutions
AI Assessment
Evaluate AI-generated machine learning solutions for:
- Correctness
- Methodological rigor
- Creativity
- Technical quality
Document:
- Expected failure modes
- Problem difficulty
- Required domain expertise
Research-Driven Contributions
- Use personal research and domain expertise to develop highly challenging AI evaluation tasks
- Identify areas where general ML knowledge fails and specialized domain reasoning becomes necessary
Collaboration
- Work asynchronously with global researchers and engineers on advanced AI evaluation initiatives
Required Skills & Qualifications
Graduate-level expertise in a scientific or technical field intersecting with machine learning
- MS preferred
- PhD strongly preferred
Strong understanding of:
- Machine learning pipelines
- Model selection
- Feature engineering
- Evaluation metrics
- ML experimentation
Deep familiarity with active research challenges in a specialized domain
Strong analytical and research-oriented thinking
Excellent written communication skills
Ability to work independently on complex technical problems
Preferred Backgrounds & Domains
Examples include:
- Computational biology
- Bioinformatics
- Climate science
- Environmental modeling
- Medical imaging
- Healthcare AI
- Materials science
- Computational chemistry
- Astrophysics
- Signal processing
- NLP
- Robotics
- Reinforcement learning
- Financial modeling
- Quantitative analysis
Why Join
Frontier AI Research
Contribute directly to advanced AI evaluation and safety initiatives.
High-Impact Work
Your expertise helps define how cutting-edge AI systems are tested and improved.
Flexible Remote Work
- Fully remote
- Flexible hours
- Independent schedule
Collaboration
Work alongside researchers and engineers contributing to frontier AI systems globally.
Long-Term Opportunities
Strong contributors may receive:
- Contract extensions
- Expanded research involvement
- Additional advanced projects
Work Details
- Organization: Alignerr
- Employment Type: Freelance Contract
- Commitment: 10–40 hours/week
- Location: Remote
Additional Information
This role is designed for highly specialized technical and scientific experts interested in applying deep machine learning and research expertise toward AI benchmarking, evaluation, and frontier model improvement.