Senior AI & Cloud Operations Engineer
About The Role: We are reimagining what it means to run Data and AI platforms. At Datatonic, Managed Services isn't about "maintenance"; it’s about Continuous Engineering. We are looking for a foundational Senior Engineer to join us in Croatia to bridge the gap between high-level architecture and production excellence.
This is an engineering-first role. You won't be sitting in a traditional support queue; you will be designing the AI-driven systems that monitor, optimize, and evolve our clients’ GCP environments. You will play a central role in shaping our "XOps" (FinOps, MLOps, AIOps) strategy, building autonomous agents that ensure some of the world’s most sophisticated AI platforms stay reliable and cost-effective.
Key Responsibilities:
- Technical Ownership: Design and maintain the backend services and automation layers that power our managed platforms.
- AI-Augmented Operations: Build and deploy AI Agents and evaluation frameworks to automate anomaly detection, cost-optimization, and model health checks.
- Platform Evolution: Lead "Minor Feature" development, implementing high-impact improvements and architectural tweaks to keep client platforms modern.
- Strategic Advisor: Collaborate with architects and client stakeholders to identify "Day 2" risks and provide technical solutions before they impact the business.
- Engineering Excellence: Apply the "80/20" rule, dedicating a portion of your time to internal R&D and building the automation tools that eliminate manual toil.
What We’re Looking For:
- Experience: 5+ years of software engineering experience, with a proven track record of building and running production systems.
- GCP Expertise: Deep technical knowledge of Google Cloud and its tools (e.g., BigQuery, Dataflow, Dataproc, Dataplex, Composer, Vertex AI, Looker). You understand how to architect for performance and cost.
- Tech Stack: Proficiency in Python and a solid grasp of modern CI/CD, IaC, and containerization.
- AI/ML Literacy: A strong interest in MLOps and the challenges of deploying and monitoring Large Language Models (LLMs).
- Mindset: You are a versatile engineer who takes extreme ownership. You enjoy the "puzzle" of finding why a system is sub-optimal and building the code to fix it permanently.