Role
You will design, deploy, and operate the network infrastructure underpinning Sesterce's GPU AI factories across Europe, owning the full stack from physical cabling to BGP policies and RDMA fabric tuning.
What you will do
- Design and deploy InfiniBand (NDR 400G / HDR) and high-speed Ethernet fabrics for GPU clusters of 1,000+ nodes
- Configure and operate Arista, Juniper, and Mellanox/NVIDIA equipment; manage BGP, OSPF, and VXLAN overlays
- Tune RoCE and InfiniBand transport for collective communication workloads (NCCL, UCX)
- Maintain network automation pipelines (Ansible, NetBox, Nautobot) across all sites
- Troubleshoot performance regressions, packet loss, and congestion during live AI training runs
What we are looking for
- 4+ years of datacenter networking experience, including InfiniBand or 400G/800G Ethernet at scale
- Deep familiarity with RDMA, RoCE v2, and GPU training cluster communication patterns
- Solid command of Linux networking internals and of fabric QoS and congestion-control mechanisms (DSCP, ECN, PFC, adaptive routing)
- Experience with network-as-code tooling (Ansible, Terraform, NetBox) and CI/CD pipelines
- Ability to interpret low-level diagnostics (tcpdump, perftest tools such as ib_write_bw) and correlate them with application performance