Belgrade
We're looking for a Senior LLMOps Engineer to own the full lifecycle of production-grade LLM systems, from research prototype to deployment.
The company is a Tokyo-based deep-tech startup building a next-generation reasoning engine that empowers AI to solve complex, multi-step problems – not merely generate text.
Their proprietary graph-neuro-symbolic architecture translates natural language into a unified formal representation, orchestrates cognitive workflows across specialized solvers, and delivers verified, interpretable solutions. Designed for enterprise B2B clients in safety-critical domains, their on-premise system ensures transparency, auditability, and control. With a 30-person international team, they are transitioning from technical demo to pilot deployments with enterprise partners in 2026.
Key Skills & Experience:
Experience
5+ years of experience in ML or Applied AI roles.
Hands-on experience deploying ML models to production environments.
Practical experience with LLMs (open-source or proprietary) in real-world applications.
Experience operating ML systems under production constraints (latency, cost, observability, reliability).
Experience working in fast-moving startup or R&D-driven environments is a strong plus.
Technical skills
Strong Python proficiency; experience with ML frameworks.
Solid understanding of modern LLM stacks: fine-tuning, inference optimization, RAG, prompt/agent orchestration.
Experience with MLOps / LLMOps tooling: experiment tracking, evaluation pipelines, monitoring, CI/CD for models.
Familiarity with containerization and deployment (Docker, Kubernetes or equivalents).
Experience with cloud or on-prem GPU environments.
Understanding of distributed systems concepts is a plus.
Educational background:
Bachelor’s or Master’s degree in Computer Science, Machine Learning, Engineering, or a related field.
Responsibilities:
Design, build, and maintain production-grade LLM systems.
Own the end-to-end LLMOps lifecycle: data prep, fine-tuning, evaluation, deployment, monitoring, and iteration.
Build evaluation frameworks to measure LLM quality, robustness, and regressions.
Partner with researchers, engineers, and infrastructure teams to productionize research prototypes.
Shape architecture for agentic systems, RAG pipelines, and hybrid ML-symbolic solutions.
What the company offers:
Target salary range is 16–22M JPY (~$105k–145k) gross per year.
High autonomy: colleagues trust independent, professional teammates, but collaboration is key.
Low bureaucracy: results over process, though metrics and documentation still matter.
Support for English and Japanese language training.
Compensation for overseas health insurance and additional medical costs. Full Japanese social security: health and employment insurance, maternity/childcare leave. Annual health checkup.
Paid holiday: up to 25 days per year.
Full Visa Sponsorship.
Monthly commute expense.