26.05.2026

Python Infrastructure Engineer - Model Evaluation

Engineer

Location: Denver

USD 50 - 75

Remote FlexTime Contract

Skills:

Months of experience: 36

OCaml, Python, Iteration

Full Job Description

Python Infrastructure Engineer - Model Evaluation (AI Training)

About The Role

What if your Python expertise could directly shape how the world's most advanced AI models are built, tested, and improved? We're looking for a senior Python engineer to design and build the data pipelines, evaluation harnesses, and annotation tooling that sit at the heart of cutting-edge AI development.

This is a fully remote, flexible contract role working alongside leading AI research labs on real production systems. If you're a strong Python engineer who wants to do meaningful, high-impact work at the frontier of AI, this is the role for you.

Organization: Alignerr
Type: Hourly Contract
Location: Remote
Commitment: 20-40 hours/week

What You'll Do

Design, build, and optimize high-performance Python systems supporting AI data pipelines and model evaluation workflows
Develop full-stack tooling and backend services for large-scale data annotation, validation, and quality control
Build and maintain evaluation harnesses that integrate with ML inference frameworks
Improve reliability, performance, and safety across existing Python codebases
Instrument systems with observability and metrics collection to monitor reliability and model performance
Identify bottlenecks and edge cases in data and system behavior, and implement scalable fixes
Collaborate with data, research, and engineering teams to support model training and evaluation workflows
Participate in synchronous design reviews to iterate on architecture and implementation decisions

Who You Are

Native or fluent English speaker with clear written and verbal communication skills
Full-stack developer with a strong systems programming background
3-5+ years of professional experience writing production-grade Python
Experienced in building evaluation harnesses for ML models and integrating them with inference frameworks
Solid background in observability, metrics collection, and monitoring for production systems
Self-motivated and reliable, able to commit 20-40 hours per week

Nice to Have

Prior experience with data annotation, data quality, or evaluation systems
Familiarity with AI/ML workflows, model training, or benchmarking pipelines
Experience with distributed systems or developer tooling
Background in MLOps or AI infrastructure

Why Join Us

Work directly on cutting-edge AI projects alongside leading research labs
Fully remote and flexible: structure your work week around your life
Freelance autonomy with the depth and consistency of meaningful, long-term technical work
Make a tangible impact on how next-generation AI models are evaluated and improved
Potential for ongoing work and contract extension as new projects launch
