AI Research Engineer – Model Compression & Quantization (Remote)
tether · Roma
Job description
About the role
Join Tether’s AI model team to drive innovation in model serving and inference architectures. You will focus on optimizing model deployment and inference strategies to deliver responsive, efficient, and scalable performance across diverse applications.
Key responsibilities
- Design and optimize model serving pipelines for both resource‑efficient and complex multi‑modal models.
- Develop inference frameworks that handle text, image, and audio data at scale.
- Collaborate with cross‑functional teams to integrate optimized models into production systems.
- Research and implement state‑of‑the‑art compression and quantization techniques.
Required profile
- Strong background in advanced model architectures and AI research.
- Proven experience designing scalable model serving and inference solutions.
- Excellent English communication skills and ability to work remotely with a global team.
Required skills
Published 10 hours ago
Expires in 1 month