[Remote] Senior Cloud Engineer

Remote · USA · Full-time

Note: This is a remote position open to candidates in the USA. Onyx Visual Effects is a company specializing in visual effects and cloud infrastructure. It is seeking a Senior Cloud Engineer to manage AWS services, optimize cloud resources for VFX workloads, and ensure compliance with security standards while collaborating with global teams.

Responsibilities

  • Proficiency in AWS core services, including EC2 for compute, EFS/S3/EBS for storage, VPC networking, Security Groups, NACLs, Route 53, and Direct Connect for low-latency remote access
  • Includes managing instance failures during long-running renders, handling multi-AZ outages with failover, optimizing for global teams, and integrating with on-premises legacy hardware
  • Specialization in MPA compliance and security-first engineering, including AWS KMS encryption, access logging, Trusted Partner Network assessments, and zero-trust models
  • Includes adapting to evolving MPA guidelines, managing sensitive IP with external studios, handling data sovereignty requirements, and responding to vulnerabilities in media workflows
  • Experience with AWS VFX solutions like Thinkbox Deadline, Deadline Cloud, Nimble Studio, and EC2 Spot/GPU instances for cost-effective rendering
  • Includes scaling farms for 8K+ projects, recovering from spot interruptions, troubleshooting custom VFX plugins, and optimizing hybrid CPU/GPU workloads
  • Identity and Access Management with role-based controls, MFA, and integration with directory services
  • Includes onboarding/offboarding remote users, federated logins from third-party IDPs, managing privilege escalation risks, and auditing access logs for anomalous behavior
  • Cost optimization using AWS Cost Explorer, Savings Plans, Reserved Instances, and auto-scaling groups for variable VFX workloads
  • Includes forecasting burst render costs, mitigating overspending from misconfigured scaling, and tracking costs across multiple projects
  • Data transfer tools like AWS Snowball and DataSync for asset migrations, plus multi-tier storage strategies such as S3 Intelligent-Tiering
  • Includes large-scale transfers, partial sync recovery, encryption integrity, and cold storage retrieval planning
  • AWS certifications such as Solutions Architect or SysOps Administrator, with the ability to apply certification knowledge to custom VFX scenarios, real-time collaboration setups, renewals, and edge deployments such as AWS Outposts
  • Expertise in Rocky Linux, Red Hat-based operating systems, Windows, and macOS command-line and general administration, including cross-platform scripting with Bash and PowerShell
  • Includes troubleshooting Linux kernel issues, macOS driver conflicts, Windows updates, and mixed-OS fleets
  • Infrastructure as Code with Terraform, AWS CloudFormation, or Ansible for provisioning and automation
  • Includes idempotent deployments, rolling back failed IaC changes during live productions, version control collaboration, and provider quirks
  • Monitoring and logging with AWS CloudWatch, X-Ray, and integrations like ELK Stack for metrics, alarms, and proactive issue resolution
  • Includes custom alarms for GPU utilization, tracing distributed render jobs, filtering high-volume logs, and SIEM integration
  • Backup and disaster recovery using AWS Backup, S3 versioning, and multi-region replication
  • Includes testing restores for corrupted VFX assets, managing RTO/RPO in outages, automating failover drills, and handling version conflicts
  • Networking and security operations, including VPN, firewalls, AWS GuardDuty, and high-performance network-attached storage
  • Includes mobile artist VPN access, detecting network attacks, optimizing NAS for 4K/8K streaming, and securing third-party integrations
  • Virtual machine management and containerization with Docker, ECS, or Kubernetes for portable VFX applications
  • Includes bursty simulations, pod evictions during resource contention, GPU passthrough, and network policy debugging
  • Proficiency with core VFX software like Nuke, ZBrush, Maya, V-Ray, Houdini, Redshift, Arnold, RenderMan, and Octane
  • Includes optimizing for non-standard hardware, troubleshooting batch-mode plugin crashes, integrating emerging AI tools, and handling license server failures
  • Render farm management using AWS Deadline Cloud, PipelineFX Qube, or custom scripts for job distribution and optimization
  • Includes prioritizing jobs during overlapping deadlines, recovering orphaned tasks, scaling to thousands of nodes, and integrating hybrid cloud/off-cloud farms
  • Pipeline tools including asset management systems such as ShotGrid or ftrack, version control with Perforce or Git, and CI/CD for artist workflows
  • Includes merging conflicting asset versions, handling large binary files, automating plugin testing, and securing pipelines against IP leaks
  • Performance tuning for GPU/CPU workloads, memory management in simulations, and benchmarking to reduce render times
  • Includes managing OOM errors in Houdini sims, comparing instance types, and optimizing cost/performance trade-offs
  • Troubleshooting application issues, OS problems, and providing deskside, phone, and ticket support to VFX artists and production teams
  • Includes remote debugging, VPN-disrupted sessions, vendor escalation, and documenting repeatable fixes
  • Experience with HP Connect Anywhere, PCoIP desktop environments, NICE DCV, and AWS AppStream for low-latency streaming and multi-monitor support
  • Includes high-DPI displays, transcontinental latency, session security, and VR/AR review workflows
  • NVIDIA CUDA drivers, GRID/AMDGPU management in EC2 instances, and virtual workstations for color-accurate VFX work
  • Includes driver updates, CUDA version mismatches, color calibration over compressed streams, and experimental AMD setups
  • Secure file sharing via AWS Transfer Family and real-time collaboration tools such as Frame.io integrations
  • Includes enforcing upload quotas, recovering interrupted transfers, auditing shares, and custom encryption for sensitive dailies
  • WEKA Storage Solutions integration with AWS for high-I/O VFX tasks such as 4K/8K footage
  • Includes scaling IOPS for parallel artist access, handling filesystem issues, optimizing mixed read/write patterns, and migrating from legacy storage
  • Advanced storage strategies, including lifecycle policies for archiving and handling large media files
  • Includes tier transitions, retention policies, legal holds, accidental deletion recovery, snapshots, and cost optimization for growing project data
  • Scripting and programming in Python, Bash, or similar for automation, system tasks, and DevOps practices
  • Includes resilient scripts for flaky APIs, exception handling in long-running automations, VFX-specific libraries, and secure handling of user input
  • Configuration management, deployment tools, and CI/CD pipeline building
  • Includes managing config drift, zero-downtime deployments, troubleshooting branched pipeline failures, and securing secrets in CI/CD environments
  • Strong problem-solving, critical thinking, and root cause analysis for render failures and remote issues
  • Includes diagnosing cascading failures, intermittent bugs, post-mortems with non-technical stakeholders, and adapting solutions to evolving tech stacks
  • Excellent communication, teamwork, and ability to consult, train, and build relationships with remote artists, producers, and vendors
  • Includes bridging time zones, supporting high-stress deadlines, training via screen share, and negotiating SLAs during outages
  • Self-motivated, proactive, and committed to continuous learning, including AWS trends and VFX innovations like AI-assisted rendering
  • Includes self-teaching during rapid tech shifts, identifying bottlenecks before escalation, and testing beta features in sandboxes
  • Experience in vendor management and shift work flexibility for global remote operations
  • Includes managing multi-vendor ecosystems, adapting to 24/7 on-call needs, negotiating custom integrations, and handling critical vendor escalations

Company Overview

  • Founded in 2021, OnyxVFX is one of the first fully virtual visual effects studios, providing end-to-end content creation entirely in the cloud. This allows Onyx to be creative, efficient, and agile while maintaining full control over data security. It is headquartered in Sherman Oaks, CA, US, with a workforce of 11-50 employees. Its website is https://www.onyxvfx.com.