Cloud Operations Engineer

Remote
Full Time
Mid Level

Company Overview:
iTmethods is a 20+ year-old firm specializing in managed services, enterprise AI, DevOps, and cloud integration. As an AWS MSP and SOC 2 Type II certified company, we're radically transforming from pre-AI capabilities to post-AI dominance through our iTmethodsONE platform—a central hub for AI/ML, DevOps, Cloud, Compliance, Security, and FinOps. Key offerings include:

  • AI-Assisted Development (enterprise-grade coding assistants, MLOps integration for productivity boosts)
  • Agentic AI Automation (autonomous agents for workflows, powered by partnerships for any enterprise use case, expertly managed by iTmethods)
  • Modular DevOps & AI Ops (LEGO-like secure CI/CD workflows, flexible automation, scalable pipelines with MLOps integration)
  • Self-Hosted ISV and Open Source Tools Platform (expertly managed by iTmethods for regulated industries, with hybrid/on-premise patterns)
  • Secure Deployments & Optimization (24/7 monitoring, DevSecOps, FinOps for up to 50% cost reduction, GDPR-aligned protection)
  • FinOps Capabilities (resource optimization, CapEx-to-OpEx conversion with up to 50% cost reduction, real-time visibility for ruthless efficiency)

With 50+ integrations (e.g., Coder, Sonar, GitHub, CI/CD, JFrog, AWS, Azure), we deliver up to 67% development cycle reductions and drive results for enterprises across the US, Canada, and Europe. Profitable and founder-led, we operate with a warrior mindset (urgency, speed, and purpose) to seize $85.2B+ opportunities in self-hosted/Agentic AI and FinOps via strategic partnerships. Our customer-obsessed culture is shifting to a post-AI era of ruthless efficiency, elite performance, and extraordinary value creation, with no spectators, only warriors.

The Opportunity

We are adding a Cloud Operations Engineer to the team, reporting to the Director of Platform Engineering & Operations. In this full-time position, you will play a critical role in eliminating vulnerabilities and optimizing uptime for our enterprise clients. You will implement resilient, robust solutions that use automation to detect faults, recover automatically when a fault is detected, and continuously deploy changes that prevent previously identified faults from recurring.

Responsibilities

  • Address root causes and identify solutions. You will focus on problem-solving, automation, and ensuring a sustained focus on engineering. You will:
    • Architect and implement monitoring alarms and logging solutions.
    • Identify issues proactively, and mitigate them to improve the customer experience.
    • Audit, test, and review solutions to ensure we deliver a resilient, monitored, highly secure, and complete solution.
  • Demand forecasting and capacity planning. You will create and maintain good visibility of the demand for AWS resources, planning, and usage. You will plan and execute efficient use of resources.
  • Optimize through automation. You will eliminate manual, repetitive, tactical solutions with no enduring value. You will implement automation to make solutions, services, and processes more sustainable and scalable, contributing to the continuous improvement of all operations so that deployments are managed and maintained efficiently.
  • Plan and deploy upgrades, configuration changes, and security patches.
  • Drive continuous improvement. You will research and implement best practices in DevOps. You will explore and evaluate new and emerging software tools and technologies.
  • Be on call outside of business hours, on a weekly rotation basis.
  • Take ownership of not only your deliverables but also the platform, and drive platform resiliency with an SRE mindset.
  • Be hands-on. You will:
    • Assist in the configuration and support of customer environments.
    • Code deployments, optimization, and various tools.
    • Troubleshoot and resolve escalated software and infrastructure-related issues and challenges, acting as a customer-facing escalation point.
    • Review new tools and software prior to implementation.

Qualifications

  • Education, qualifications, and experience. You have a Bachelor's Degree in Computer Science, Computer Engineering, or Software Engineering, preferably with certification in AWS, Ansible, and Terraform, and you have worked with the DevOps tools Git and Jenkins.
  • Exceptional communication and problem-solving skills.
  • Experience with collaboration tools like Jira, Confluence, and Jira Service Desk.
  • You’re driven, collaborative and motivated. You thrive on developing solutions to open-ended business problems.
  • Ability to write code.
  • Passion for automation and efficiency.
  • Ability to work autonomously and as part of a team.
  • Proficient in English.
  • 40 hours per week, with a weekly on-call rotation.

Our Commitment to You

  • Flexible work environment
  • Competitive compensation (OTE: Base Salary plus bonus)
  • Benefits package
  • Learning and Development
  • Career Progression
  • Culture – One team environment founded on respect and collaboration where we do not take shortcuts and are customer obsessed

Join us
Apply here or learn more on our website, Medium, or LinkedIn.
iTmethods is committed to fostering an inclusive and accessible environment where employees feel valued and respected, and where every employee has the opportunity to realize their potential. We are committed to providing reasonable accommodations, if required, and will work with you to meet your needs.