[Remote] Senior AI Engineer (US)
Note: This is a remote position open to candidates in the USA. Assail builds autonomous offensive security solutions and is seeking a Senior AI Engineer for its Ares platform. The role involves developing the AI agents and models that strengthen the platform's security capabilities across a range of applications.
Responsibilities
- Design, implement, and continuously improve the behavior and prompting of Ares' named agents, including orchestration patterns, hand-offs, planning loops, tool use, and shared memory
- Contribute to the model powering Ares across data curation, SFT, preference optimization (DPO/GRPO-style), and evaluation. Own pieces of the training pipeline from dataset construction through eval
- Extend the co-evolutionary self-training system that lets Ares learn from its own engagements and improve over time
- Build false-positive detection, tiered skill learning (suppression rules, agent directives, code-patch proposals), and the infrastructure that routes proposed changes through human approval and back into the platform
- Design rigorous, security-specific evaluations covering OWASP Top 10 coverage, exploit chaining, finding accuracy, and agent reliability. Track performance across every model and agent change
- Contribute to vision capabilities, mobile (iOS/Android) coverage, and BYOK support shipping in Sidewinder and beyond
- Own latency, cost, observability, and failure-mode analysis for agents running in customer engagements. Partner with the platform team on Kubernetes-based deployment
- Contribute to the live accuracy gauge and other surfaces where model and agent quality is exposed to customers
Skills
- 5+ years building production ML/AI systems, with at least 2 years working directly on LLMs or LLM-powered agents
- Deep Python; strong, production-grade engineering practices (testing, code review, observability)
- Hands-on fine-tuning experience: SFT, preference optimization (DPO, GRPO, RLHF/RLAIF), data curation, and synthetic data generation
- Strong grasp of transformer architectures and the modern training stack (PyTorch, Hugging Face, DeepSpeed or FSDP, accelerate)
- Experience designing and shipping multi-agent or tool-using LLM systems in production — not just demos
- Rigorous eval design: building harnesses, tracking experiments, and making model/agent decisions based on data rather than vibes
- Inference optimization experience: vLLM or TensorRT-LLM, quantization, throughput/latency tradeoffs
- Comfort with retrieval pipelines, vector stores, and structured memory for agents
- Kubernetes and containerized deployment fluency
- Genuine interest in offensive security and the ability to ramp quickly on OWASP Top 10, API security, web app pentesting, and mobile pentesting concepts. Direct offensive security background is a strong plus but not required
Nice to Have
- Offensive security background: OSCP/OSWE/OSWA, CTF, bug bounty, or prior red team work
- Research publications at NeurIPS, ICML, ICLR, USENIX Security, IEEE S&P, Black Hat, or DEFCON
- Open source contributions to agent frameworks or LLM tooling
- Experience with adversarial ML or red-teaming AI systems
- Familiarity with mobile app reverse engineering or binary analysis
Company Overview