Henry Cullen

Dublin

Summary

Site Reliability Engineer operating in a zero-downtime, latency-sensitive trading environment. Experienced in designing, administering, and automating hybrid infrastructure across AWS cloud environments, on-premise virtualised infrastructure, and dedicated bare-metal systems. Owns high-availability observability, scheduling, and automation platforms supporting 12,000+ hosts across production, staging, and development environments. Strong Linux systems administrator with deep automation expertise and production incident leadership experience. Comfortable operating across infrastructure architecture, cluster design, workload orchestration, automation engineering, and secure multi-tenant environments.

Overview

3 years of professional experience

Work History

Site Reliability Engineer – Monitoring, Management & Automation Systems

Susquehanna International Group (SIG)
01.2023 - 02.2026
  • Owned and operated 16 high-availability enterprise platforms within a regulated trading environment supporting 12,000+ hosts.
  • Designed, provisioned, and maintained infrastructure across AWS cloud environments, on-premise virtualised systems, and dedicated bare-metal servers.
  • Supported monitoring, scheduling, and automation platforms deployed across mixed infrastructure models (cloud + physical).
  • Provisioned new physical hardware and virtual machines to support platform expansion and performance scaling.
  • Worked with Kubernetes environments, provisioning pods and supporting platform-level deployments for observability and automation workloads.
  • Configured and maintained load balancers, reverse proxies, SSL termination, and firewall segmentation to secure multi-tenant environments.
  • Managed LDAP integration and enterprise access controls across production systems.
  • Ensured performance optimisation in latency-sensitive trading infrastructure, minimising overhead from logging agents and monitoring collectors.
  • Architected and administered enterprise-scale ELK and Splunk clusters (Indexers, Search Heads, Heavy Forwarders, Deployment Servers).
  • Managed log ingestion from 12,000+ hosts across production, staging, and development environments.
  • Designed retention, index, and shard allocation strategies balancing performance and cost.
  • Owned Checkmk HA clusters with custom monitoring checks and alerting frameworks.
  • Designed Prometheus + Thanos architecture for resilient, long-term time-series storage.
  • Built executive and trader-facing dashboards (P&L analytics, operational KPIs, risk visibility).
  • Architected and maintained highly available TIDAL Enterprise Scheduler clusters (Fault Monitor, Primary and Secondary nodes).
  • Guided engineering teams in designing resilient job workflows and dependency chains.
  • Ensured reliability and recoverability of business-critical automated processes.
  • Automated infrastructure provisioning, scaling, and configuration using Ansible and AWX.
  • Developed modular playbooks for monitoring platform lifecycle management and remediation workflows.
  • Integrated automation platforms into CI/CD pipelines to standardise deployments.
  • Administered Octopus Deploy clusters and enterprise Artifactory repositories.
  • Supported GitLab and Bitbucket service administration.
  • Served as senior escalation point for infrastructure and platform incidents.
  • Led triage and resolution during live trading production incidents.
  • Executed sensitive upgrades and migrations with zero unplanned downtime.
  • Reduced recurring operational support tickets by 15–30 per week through automation initiatives.
  • Balanced product ownership responsibilities with hands-on systems administration.
  • 16 enterprise platforms under ownership
  • 12,000+ monitored hosts
  • Production / Staging / Development environments
  • Multi-tenant architecture (per desk/team clusters)
  • Hybrid infrastructure (AWS + VMs + Bare Metal)
  • High-availability across all core systems
  • Active production on-call responsibility

Education

Bachelor of Science (BS), Business Information Systems

University College Cork
Ireland
2023

Skills

  • Hybrid Infrastructure
  • AWS / Bare Metal
  • High-Availability Cluster Architecture
  • Linux Production Systems Administration
  • Virtualised Environments & Host Provisioning
  • Kubernetes Platform Support & Pod Provisioning
  • Enterprise Monitoring & Observability Platforms
  • Infrastructure Automation
  • Enterprise Scheduling
  • CI/CD & Deployment Platforms
  • Secure Access Control
  • Incident Response & Production Change Management

Custom Section

  • Hybrid Infrastructure: AWS + On-Prem VMs + Bare Metal
  • High-Availability Cluster Architecture
  • Linux Production Systems Administration
  • Virtualised Environments & Host Provisioning
  • Kubernetes Platform Support & Pod Provisioning
  • Enterprise Monitoring & Observability Platforms
  • Infrastructure Automation (Ansible, AWX/Tower, IaC Principles)
  • Enterprise Scheduling (TIDAL HA Clusters)
  • CI/CD & Deployment Platforms
  • Secure Access Control (LDAP, Network Segmentation, SSL, Reverse Proxies)
  • Incident Response & Production Change Management

Websites, Portfolios and Profiles

linkedin.com/in/henry-cullen-web
