R&D
Tel Aviv
Job Summary
We are seeking a talented and passionate Software Engineer to join our NIC team, working at the forefront of AI datacenter infrastructure. In this role, you will develop and optimize the software stack powering high-performance NICs deployed in large-scale AI training and inference clusters. You will collaborate closely with hardware, networking, and AI framework teams to ensure our NIC stack delivers maximum throughput, minimal latency, and seamless integration with GPU computing frameworks and collective communication libraries.
Key Responsibilities
● Design, implement, and optimize L2 and RDMA features for third-party NIC hardware
● Design and develop tools for real-time NIC performance monitoring, profiling, and telemetry in AI datacenter environments. Analyze end-to-end distributed training and inference bottlenecks.
● Develop, maintain, and extend NIC drivers and firmware interfaces within the Linux kernel network subsystem
● Collaborate with Hardware, Networking, QA, and AI Framework Integration teams on the evaluation and integration of upstream kernel, vendor, and open-source NIC features
● Implement, integrate, and tune RoCEv2 / lossless Ethernet transport for AI workloads that require ultra-low latency and use GPU programming frameworks (CUDA, ROCm)
● Write production-grade, well-tested, and thoroughly documented code following team engineering standards and participate actively in code reviews, architecture discussions, and technical design sessions
Required Qualifications
● 3+ years of experience in systems programming (C/C++, Python, Rust/Go)
● Expert-level C/C++ and Python programming with a strong focus on performance, memory management, and low-level optimization
● Deep understanding of the TCP/IP stack, Ethernet networking protocols (L2/L3), and the Linux kernel network subsystem
● Solid knowledge of NIC architecture, PCIe, and hardware/software interfaces (e.g., DPDK, RDMA verbs, ibverbs)
● Hands-on experience with network performance profiling and debugging tools (perf, ethtool, tcpdump, Wireshark, iperf3, netperf)
● Strong analytical and problem-solving skills with a systematic approach to debugging complex system issues
● Experience with version control (Git), ticket tracking (Jira/GitHub Issues), and collaborative development workflows
● Excellent written and verbal communication skills; able to document technical designs clearly
● Team player with high motivation, attention to detail, and the ability to work effectively in a fast-paced environment
Preferred Qualifications
● Experience with RoCEv2 and lossless Ethernet transport in production environments
● Experience with DPU/SmartNIC programming and firmware development (e.g., NVIDIA BlueField, Marvell Octeon, AMD Pensando)
● Knowledge of TLS/SSL offload and in-network security features for NIC hardware
● Experience with GPU programming frameworks (CUDA, ROCm) and GPUDirect
● Familiarity with containerization and orchestration platforms (Docker, Kubernetes, Helm) in AI/HPC environments
● Experience working in hyperscale AI datacenter or cloud networking environments