
Junior AI/ML Engineer – LLM-Based Content Moderation

TrustLab · Palo Alto, California, United States · Onsite
This job is no longer open

About TrustLab
Online misinformation, hate speech, child endangerment, and extreme violence are some of the world's most critical and complex problems. TrustLab is a fast-growing, VC-backed startup, founded by ex-Google, TikTok and Reddit executives determined to use software engineering, ML, and data science to tackle these challenges and make the internet healthier and safer for everyone. If you’re interested in working with the world’s largest social media companies and online platforms, and building technologies to mitigate these issues, you’ve come to the right place. 

About the role

We are seeking an AI/ML Engineer with expertise in Large Language Models (LLMs) to enhance the precision and recall of classification systems detecting content abuse, including hate speech, sexual content, misinformation, and other policy-violating material. You will work with cutting-edge AI models to refine detection mechanisms, improve accuracy, and minimize false positives/negatives.

Responsibilities


  • Design, develop, and optimize AI models for content moderation, focusing on precision and recall improvements.
  • Fine-tune LLMs for classification tasks related to abuse detection, leveraging supervised and reinforcement learning techniques.
  • Develop scalable pipelines for dataset collection, annotation, and training with diverse and representative content samples.
  • Implement adversarial testing and red-teaming approaches to identify model vulnerabilities and biases.
  • Optimize model performance through advanced techniques such as active learning, self-supervision, and domain adaptation.
  • Deploy and monitor content moderation models in production, iterating based on real-world performance metrics and feedback loops.
  • Stay up-to-date with advancements in NLP, LLM architectures, and AI safety to ensure best-in-class content moderation capabilities.
  • Collaborate with policy, trust & safety, and engineering teams to align AI models with customer needs.

Minimum Qualifications


  • Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Machine Learning, or a related field.
  • 1+ years of experience in AI/ML, with a focus on NLP, deep learning, and LLMs.
  • Proficiency in Python and deep learning frameworks such as TensorFlow, PyTorch, or JAX.
  • Experience in fine-tuning and deploying transformer-based models like GPT, BERT, T5, or similar.
  • Familiarity with evaluation metrics for classification tasks (e.g., F1-score, precision-recall curves) and best practices for handling imbalanced datasets.
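As a concrete illustration of the evaluation metrics named above, here is a minimal sketch of scoring a binary abuse classifier with scikit-learn. The labels and predictions are hypothetical placeholder data, not anything from TrustLab's systems; on imbalanced moderation datasets, per-class precision/recall and F1 are far more informative than raw accuracy.

```python
from sklearn.metrics import precision_recall_fscore_support

# Hypothetical labels: 1 = policy-violating, 0 = benign.
# Real moderation data is typically far more imbalanced than this toy sample.
y_true = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]

# Score only the positive (violating) class, which is usually the rare one.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary", pos_label=1
)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here precision and recall both come out to 0.75 (3 true positives, 1 false positive, 1 false negative); sweeping the decision threshold and plotting the resulting precision-recall curve is the usual next step when tuning for a target false-positive rate.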

Preferred skills


  • Experience working with large-scale, real-world content moderation datasets.
  • Knowledge of regulatory frameworks related to content moderation (e.g., GDPR, DSA, Section 230).
  • Familiarity with knowledge distillation and model compression techniques for efficient deployment.
  • Experience with reinforcement learning (e.g., RLHF) for AI safety applications.

Opportunities and perks


  • Work on cutting-edge AI technologies shaping the future of online safety.
  • Collaborate with a multidisciplinary team tackling some of the most challenging problems in content moderation.
  • Competitive compensation, comprehensive benefits, and opportunities for professional growth.

Life at TrustLab

Our mission is a healthy and safe internet, with empowered users and trusted platforms. Our team develops solutions to detect and mitigate online safety threats and works with social media companies, advocacy groups, and other interested parties to implement them. Watch this space to find out more, or email founders@trustlab.com. We are hiring.
Thrive Here & What We Value

  • Competitive salary + stock options at a rapidly growing Series A startup
  • Comprehensive health insurance packages
  • Work-from-home office setup support
  • Individual wellness stipend
  • Professional development and mentorship opportunities
  • Influence new product direction from idea to commercialization
  • Help develop critical tech to solve one of the 21st century's most pressing societal challenges

