
Generalist – English & Hindi (LLM Evaluator)

Mercor

Linguistics & Translation | Full-time
United States, India | $12.19/hr | February 24, 2026

Job description

Why This Role Exists

Mercor partners with leading AI teams to improve the quality, usefulness, and reliability of general-purpose conversational AI systems.

This project focuses on evaluating and improving general chat behavior in large language models (LLMs). You will assess AI-generated responses across diverse topics and provide structured human feedback to ensure outputs are accurate, well-reasoned, and aligned with human expectations.

What You’ll Do

  • Evaluate LLM-generated responses for clarity, correctness, and completeness.
  • Conduct fact-checking using trusted public sources and verification tools.
  • Annotate strengths, weaknesses, and factual inaccuracies.
  • Assess reasoning quality, tone, and conversational alignment.
  • Ensure outputs comply with system guidelines and expected behavior.
  • Apply consistent annotations using structured taxonomies and evaluation rubrics.

Who You Are

  • Bachelor’s degree holder.
  • Native Hindi speaker, or Hindi proficiency at ILR 5 / CEFR C2 level.
  • Fluent in English.
  • Experienced user of large language models (LLMs).
  • Strong writing skills, with the ability to provide nuanced feedback.
  • Highly detail-oriented and analytical.
  • Comfortable working across diverse topics and domains.
  • Strong college-level mathematics skills.

Nice-to-Have

  • Experience with RLHF, model evaluation, or annotation workflows.
  • Experience comparing multiple outputs and making fine-grained qualitative judgments.
  • Familiarity with evaluation rubrics and benchmarking systems.
  • Background in research, analytics, linguistics, or engineering.

What Success Looks Like

  • You consistently identify factual inaccuracies and reasoning gaps.
  • Your evaluation artifacts are clear, consistent, and reproducible.
  • Your feedback leads to measurable improvements in AI response quality.
  • Your evaluations help AI systems improve before public deployment.

Contract & Payment

  • Independent contractor engagement.
  • Fully remote with flexible schedule.
  • Weekly payments via Stripe or Wise.
  • Restricted to India and the United States.
  • $12.19 per hour.

About Mercor

Mercor partners with leading AI labs and enterprises to train frontier models using human expertise. Contributors collaborate with researchers to improve advanced AI systems used globally.
