
Senior Software Engineer — AI Evaluation & Benchmarks (Python)

Company

G2i Inc.

Location

Argentina

Type

Full-time

This role is open to contractors in accepted locations only. Please confirm your country is on the list of accepted countries and locations before applying; we're unable to process applications from unlisted locations.

For US applicants: This is a 1099 independent contractor role. It is not compatible with F-1 OPT, STEM OPT, or any visa status that requires W-2 employment, guaranteed hours, or employer sponsorship. We are unable to provide offer letters or employment verification for this role.

What You'll Be Doing

Design and build the coding benchmarks and evaluation pipelines used to test frontier AI models on real software engineering work:

  • Design coding benchmarks that evaluate frontier models on real-world programming tasks — reasoning, debugging, and production-quality code
  • Build and maintain scalable data pipelines for evaluation workflows
  • Analyze model-generated code for correctness, reliability, and...
