We are a stealth-mode, AI-powered, cloud-native Health-Tech company. Our business domain revolves around providing better care for US patients; in short, it's about treating patients well.
This is not vaporware: our platform supports US Physician Networks (IPAs) by enabling smarter, risk-adjusted, and more predictive care that improves real patient outcomes.
Compensation and Benefits:
Paid Time Off
The company has an Unlimited PTO Policy, with paid New Year holidays on top of that. Misuse of the policy is not welcome, but it is definitely possible to take at least two weeks of fully paid vacation, or more.
Corporate Hardware
The company provides corporate hardware for employees who have completed their probationary period and proven their value.
Means of Communication
We use Google Workspace, Slack, Zoom, and similar collaboration tools for messaging, meetings, and document sharing across the organization. We don't use Microsoft Outlook, Microsoft Teams, or similar software.
Cloud-Native
Most solutions can genuinely be built in a cloud-native manner with third-party integrations, only rarely spinning up something self-hosted (e.g., over GKE). A fully serverless approach based on Cloud Functions and Cloud Run, however, is definitely not welcome.
Relocation
Relocation assistance to a desired country may be provided after the probationary period, based on business needs and demonstrated performance.
About the Team
You will work jointly with other ML engineers as part of a full-fledged AI team that includes data and medical analysts, data engineers, and data scientists.
Team Building
The company partially covers team-building events when multiple teammates are located in nearby countries.
Location and timezone:
We are focused on hiring in time zones overlapping with the US (Portugal, Spain) or Western Europe
We are only able to engage contractors in countries where international payroll and compliance can be reliably supported
We are open to considering additional locations where time zone overlap and payroll compliance can be reliably supported, including certain Eastern European and Middle Eastern countries (Bulgaria, UAE, etc.).
Languages:
Responsibilities:
Build and operate ML solutions for multimodal healthcare data processing and real-time AI agent serving.
Key Focus Areas
Clinical document analysis: extract, classify, and structure medical data
Medical text processing: summarize clinical notes, provide clinical decision support
Conversational AI: develop voice and chat agents for healthcare workflows
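As a loose illustration of the "extract, classify, and structure" focus area (not part of the role's actual codebase), the sketch below pulls medication orders out of a free-text note into structured records. All names here are invented for the example; real systems in this space would rely on transformer-based NER or LLM extraction rather than regexes.

```python
import re

# Toy pattern for "DrugName <dose> <unit>" mentions in free text.
MED_LINE = re.compile(
    r"(?P<drug>[A-Z][a-z]+)\s+(?P<dose>\d+(?:\.\d+)?)\s*(?P<unit>mg|mcg|ml)",
    re.IGNORECASE,
)

def extract_medications(note: str) -> list[dict]:
    """Return structured medication records found in a clinical note."""
    return [m.groupdict() for m in MED_LINE.finditer(note)]

note = "Continue Lisinopril 10 mg daily. Start Metformin 500 mg twice daily."
print(extract_medications(note))
# → [{'drug': 'Lisinopril', 'dose': '10', 'unit': 'mg'},
#    {'drug': 'Metformin', 'dose': '500', 'unit': 'mg'}]
```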
What You’ll Own:
Architect, implement and optimize ML infrastructure using GCP tools (Vertex AI, BigQuery, Cloud Composer, GKE, Pub/Sub)
Design and develop DAG-based batch pipelines and real-time API endpoints for ML model and multimodal agent serving
Deploy and maintain self-hosted LLM solutions with custom fine-tuning and production optimization
Build monitoring and observability for production systems: track system health, model performance, inference latency and operational costs
Drive technical decisions and collaborate with cross-functional teams to integrate ML solutions into healthcare business processes
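To make "DAG-based batch pipelines" concrete: in this role the DAG would live in Cloud Composer (managed Airflow), but the ordering semantics can be sketched with the stdlib's `graphlib` alone. Task names and bodies below are invented placeholders, not the actual pipeline.

```python
from graphlib import TopologicalSorter

# Placeholder batch tasks; in production each would be an Airflow operator.
def ingest():    return "raw"
def featurize(): return "features"
def score():     return "predictions"
def publish():   return "published"

TASKS = {"ingest": ingest, "featurize": featurize,
         "score": score, "publish": publish}
DAG = {                       # task -> set of upstream dependencies
    "featurize": {"ingest"},
    "score": {"featurize"},
    "publish": {"score"},
}

def run_pipeline() -> list[str]:
    """Execute tasks in a dependency-respecting order; return that order."""
    order = list(TopologicalSorter(DAG).static_order())
    for name in order:
        TASKS[name]()
    return order

print(run_pipeline())  # → ['ingest', 'featurize', 'score', 'publish']
```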
Required Experience
ML Engineering
5+ years as an ML Engineer building, deploying and running ML solutions in high-load, real-world business processes
Proven track record of taking high-load ML solutions into production, from R&D and prototyping through deployment and productization
Proven track record of building monitoring systems for ML solutions
Cloud-Native & GCP
Cloud-native experience with GCP – 3+ years using its ML/data products (Vertex AI, BigQuery, Cloud Composer, GKE, Pub/Sub); candidates with strong GCP skills take priority over those coming from Azure, AWS, or other cloud platforms
We do not consider legacy-only big-data experts at all: knowing only outdated technologies such as Teradata, Hadoop, or Spark over Hadoop is not enough.
This role requires working in a Unix-like Development Environment (e.g., macOS, Linux). We do not use Windows-based workstations for Engineering or AI-related tasks. Virtualization (e.g., WSL) isn’t enough.
Engineering and Fundamentals
We expect strong productization skills — the ability to take ML and LLM solutions beyond prototypes and deliver production-ready systems with a focus on latency, reliability, observability, cost efficiency, and continuous operation in healthcare environments.
We prioritize Cloud-Native architecture on GCP, leveraging managed and scalable services such as Vertex AI, BigQuery, Cloud Composer, GKE, and Pub/Sub. We avoid fully serverless designs when they limit control, visibility, or performance tuning for ML/LLM workloads.
Experience with high-performance LLM inference engines such as vLLM (or similar) is required.
Strong Python expertise is required for production-grade ML/NLP/LLM development, including pandas, asyncio-based services, FastAPI, and Pydantic.
A solid understanding of data architecture and database internals is important — schema design, DDL, partitioning, indexing, storage formats, and cost/performance optimization at scale.
We also require strong SQL skills for working with large BigQuery datasets, including clustering, partitioning, and query optimization.
Deep experience with modern ML and NLP is essential — PyTorch, Transformers, the Hugging Face ecosystem, multimodal models and retrieval-augmented methods.
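As a rough sketch of the "asyncio-based services" skill named above: fan out many inference requests while bounding in-flight concurrency with a semaphore. `fake_model_call` is a stand-in for a real endpoint (e.g., a vLLM server); the pattern, not the names, is the point.

```python
import asyncio

MAX_CONCURRENCY = 4  # cap on simultaneous in-flight requests

async def fake_model_call(prompt: str) -> str:
    await asyncio.sleep(0.01)          # simulate network/inference latency
    return f"summary of: {prompt}"

async def bounded_infer(sem: asyncio.Semaphore, prompt: str) -> str:
    async with sem:                    # at most MAX_CONCURRENCY at once
        return await fake_model_call(prompt)

async def serve_batch(prompts: list[str]) -> list[str]:
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    # gather preserves input order in its result list
    return await asyncio.gather(*(bounded_infer(sem, p) for p in prompts))

results = asyncio.run(serve_batch([f"note {i}" for i in range(8)]))
print(results[0])  # → summary of: note 0
```

In a FastAPI service the same semaphore-bounded coroutine would sit behind an async route handler, with Pydantic models validating request and response payloads.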
A company building AI-powered cloud technologies for healthcare is looking for a talented Senior ML Engineer for long-term cooperation.
You will join an international team of first-class professionals who are enthusiastic about building products that improve the quality of healthcare services.