eBPF / XDP firewall and packet filtering
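iptables has been the typical tool for creating host firewalls on Linux. At Facebook we have used it to set up host firewalls on our servers across different tiers. In this proposal we introduce an eBPF/XDP-based firewall solution for packet filtering that has parity with our iptables implementation. We discuss several aspects of this work. Below is a brief summary of those aspects, which we cover in more detail in the paper/presentation.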
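Design and implementation:
We use BPF tables (maps, LPM tries, and arrays) to match the relevant packet header contents.
The core of the firewall is an eBPF filter that parses a packet and looks up all relevant maps, collecting the matched values. A set of logical rules is then applied to these collected values. This logic reads much like a human-readable, high-level firewall policy. With iptables rules, such a policy-level representation is hard to infer, because the detailed match conditions are inlined in every rule.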
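Performance benefits and comparison with iptables
iptables matches a packet against each rule linearly until it finds a match. In our proposal, we use BPF tables (maps) keyed on the values from all rules, which makes packet matching very efficient. We then apply the policy to the collected results, which yields a considerable speedup over iptables.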
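The difference in lookup cost can be illustrated with a small user-space model. This is only a sketch: the rule set (1000 rules, one destination port each) and the function names are hypothetical, and a Python dict stands in for a BPF map.

```python
# Hypothetical rule set: rule i accepts TCP destination port 1000 + i.
RULES = [{"dport": 1000 + i, "verdict": "ACCEPT"} for i in range(1000)]


def linear_match(dport):
    """iptables-style: walk the chain in order; return (verdict, comparisons)."""
    comparisons = 0
    for rule in RULES:
        comparisons += 1
        if rule["dport"] == dport:
            return rule["verdict"], comparisons
    # Default policy is reached only after every rule has been tried.
    return "DROP", comparisons


# Map-style: the port is the key, so the cost does not depend on rule position.
PORT_MAP = {rule["dport"]: rule["verdict"] for rule in RULES}


def map_match(dport):
    """Keyed lookup; a miss falls through to the default policy."""
    return PORT_MAP.get(dport, "DROP"), 1
```

In this model a packet for the last rule, or one that hits the default policy, costs 1000 comparisons in the linear chain and a single lookup in the map.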
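Ease of policy/configuration updates and maintenance
Network administrators own the firewall, while application developers typically need ports opened for their applications to work. With the eBPF filter approach, we create a logical separation between the filter that enforces the policy and the contents of the associated maps, which represent the specific ports and prefixes to be filtered. The policy is owned by the network administrators (e.g., ports open to the Internet, ports open from specific prefixes, drop everything else). The data (port numbers, prefixes, etc.) can now live in a separate logical section that gives application developers a predetermined place to update their data (e.g., a file containing the ports open to internal subnets). This reduces friction between the two functions and reduces human error.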
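Deployment experience:
We deployed this solution in our edge infrastructure to implement our firewall policy.
We update the configuration and reload the filter together with the contents of the various maps that hold the keys and values used for filtering.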
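BPF program array
We use BPF program arrays to chain different programs such as a rate limiter, a firewall, and a load balancer. These are the building blocks of a rich, high-performance networking solution.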
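The chaining idea can be modelled in user space as follows. This is a rough sketch, not BPF code: `prog_array` stands in for a `BPF_MAP_TYPE_PROG_ARRAY`, `tail_call` imitates `bpf_tail_call` (including falling through when the slot is empty), and the slot numbers, `OPEN_PORTS`, `BACKENDS`, and the packet fields are hypothetical.

```python
XDP_DROP, XDP_PASS, XDP_TX = "XDP_DROP", "XDP_PASS", "XDP_TX"

# Hypothetical slot assignments in the program array.
RATE_LIMITER, FIREWALL, LOAD_BALANCER = 0, 1, 2
prog_array = {}

OPEN_PORTS = {443, 8080}            # hypothetical firewall data
BACKENDS = ["backend-a", "backend-b"]  # hypothetical load balancer data


def tail_call(index, pkt):
    """Jump to the program in slot `index`; None means the slot was empty."""
    prog = prog_array.get(index)
    return prog(pkt) if prog else None


def rate_limiter(pkt):
    if pkt.get("over_limit"):
        return XDP_DROP
    verdict = tail_call(FIREWALL, pkt)
    return verdict if verdict is not None else XDP_PASS


def firewall(pkt):
    if pkt["dport"] not in OPEN_PORTS:
        return XDP_DROP
    # PASS: hand the packet to the next program in the chain.
    verdict = tail_call(LOAD_BALANCER, pkt)
    return verdict if verdict is not None else XDP_PASS


def load_balancer(pkt):
    pkt["backend"] = BACKENDS[pkt["dport"] % len(BACKENDS)]
    return XDP_TX


prog_array.update({RATE_LIMITER: rate_limiter, FIREWALL: firewall,
                   LOAD_BALANCER: load_balancer})
```

Each stage decides only its own concern and then jumps to the next slot, so a stage can be replaced by updating one entry of the array.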
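Proposing a fully generic firewall solution that migrates existing iptables rules to eBPF/XDP-based filtering
We present a proposal for translating existing iptables rules into a better-performing eBPF program, with most of the processing and validation done in user space.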
1 .INFRASTRUCTURE Confidential Use Only – Do Not Share
2 .eBPF / XDP firewall and packet filtering
Anant Deepak
Software Engineer, Facebook Infrastructure
Nov 2018
3 .Agenda
• iptables
  • Implementation
  • Policy and Network
• Firewall in bpf / XDP
  • Benefits
  • Deployment @Facebook
  • Performance
• Prototype : iptables in bpf
4 .iptables
5 .iptables Implementation
6 .iptables INPUT Chain
7 .iptables
Rules lower in the chain suffer in performance
Match traffic at high pps
8 .iptables Policy and Network
Bootstrap / Stateless
Public dest ports
Domain dest ports
Network dest ports
Public source ports
Deny / Log
9 .Firewall in BPF / XDP @ Facebook
10 .BPF Firewall Benefits
• Performance
  • Lower CPU Utilization for filtering
  • DDoS attacks on closed ports
• Networking in XDP : BPF_MAP_TYPE_PROG_ARRAY
  • Load Balancer (katran)
  • Other filters or Rate-limiters
• Filter past tunnel headers
• More flexibility in match criteria (pcap style bytes @ offset)
• Manageability
  • Policy - Enforce a policy with logical mapping of packet attributes
  • Network - Decouple network topology internals (networks, ports, prefixes,..)
11 .Revisiting : iptables Policy and Network
Bootstrap / Stateless
Public dest ports
Domain dest ports
Network dest ports
Public source ports
Deny / Log
NETWORK
POLICY
12 .Now in bpf .. Policy and Network
Bootstrap / Stateless
Public dest ports
Domain dest ports
Network dest ports
Public source ports
Deny / Log
NETWORK
POLICY
13 .Now in bpf ..
C Program for policy. Maps for network topology.

BPF Array - TCP ports
Key      Value
443      TCP_PUBLIC
8080     TCP_SAMENET
<...>    TCP_PROD
<...>    TCP_SAMENET | TCP_PROD

BPF LPM - IP6 Prefix
Key                         Value
dead:beef::/32              IP6_PRODNET
dead:beef:dead:beef::/64    IP6_PRODNET | IP6_SAMENET

NETWORK
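The split between policy (code) and network data (maps) on this slide can be sketched in user space as below. This is a model under stated assumptions rather than the deployed C program: the flag names come from the slide, but their numeric values, the port 9000 entry, the meaning given to each flag in `policy`, and the helper names are hypothetical. A dict scan with the standard `ipaddress` module stands in for a BPF LPM trie.

```python
import ipaddress

# Flag names are from the slide; the bit values are assumptions.
TCP_PUBLIC, TCP_SAMENET, TCP_PROD = 1, 2, 4
IP6_PRODNET, IP6_SAMENET = 1, 2

# "Network" data: stands in for the BPF array and the LPM trie.
TCP_PORTS = {443: TCP_PUBLIC, 8080: TCP_SAMENET, 9000: TCP_PROD}  # 9000 is made up
IP6_PREFIXES = {
    ipaddress.ip_network("dead:beef::/32"): IP6_PRODNET,
    ipaddress.ip_network("dead:beef:dead:beef::/64"): IP6_PRODNET | IP6_SAMENET,
}


def lpm_lookup(addr):
    """Return the flags of the longest prefix containing addr, or 0 on a miss."""
    ip = ipaddress.ip_address(addr)
    best = None
    for net in IP6_PREFIXES:
        if ip in net and (best is None or net.prefixlen > best.prefixlen):
            best = net
    return IP6_PREFIXES[best] if best is not None else 0


def policy(src_addr, dport):
    """Assumed policy: collect flags from both maps, then apply logical rules."""
    port_flags = TCP_PORTS.get(dport, 0)
    src_flags = lpm_lookup(src_addr)
    if port_flags & TCP_PUBLIC:
        return "PASS"  # open to everyone
    if port_flags & TCP_SAMENET and src_flags & IP6_SAMENET:
        return "PASS"  # open only to sources in the same network
    if port_flags & TCP_PROD and src_flags & IP6_PRODNET:
        return "PASS"  # open only to sources in the production network
    return "DROP"
```

Opening a new port means adding one entry to `TCP_PORTS`; the `policy` function itself does not change, which mirrors the slide's point that policy and network data have different owners.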
14 .BPF Firewall Deployment @ Facebook
• Stateless
• Deployed on our Edge infrastructure
• Runs before our L4 Load balancer (katran)
  • In fact: PASS => bpf_tail_call -> katran
• Tooling for loading and verifying bpf Map contents
• C program (policy) rarely changes
• Atomic Swap : Loaders load new maps and reattach a new program
• BCC Helpers
15 .BPF Firewall Deployment @ Facebook
• Prefix Matching
  • Loader handles overlapping prefixes
• Sampling : BPF_PERF_OUTPUT
• Stats : PERCPU_ARRAY
• Look beyond tunneled headers

BPF LPM - IP6 Prefix
Key                         Value
dead:beef::/32              IP6_PRODNET
dead:beef:dead:beef::/64    IP6_PRODNET | IP6_SAMENET
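The slide does not say how the loader handles overlapping prefixes. One plausible reading, consistent with the table above (the /64 entry carries the /32 entry's flag as well as its own), is that the loader ORs the flags of every covering prefix into each more specific entry, so a single longest-prefix lookup returns everything that applies. The sketch below shows that interpretation only; `flatten_prefixes`, the input format, and the flag values are hypothetical.

```python
import ipaddress

IP6_PRODNET, IP6_SAMENET = 1, 2  # flag names from the slide, values assumed


def flatten_prefixes(raw):
    """raw maps prefix strings to flags as written in the configuration.

    Returns {network: flags}, where each entry also carries the flags of
    every less specific prefix that covers it.
    """
    nets = {ipaddress.ip_network(prefix): flags for prefix, flags in raw.items()}
    flat = {}
    for net, flags in nets.items():
        merged = flags
        for other, other_flags in nets.items():
            if other.prefixlen < net.prefixlen and net.subnet_of(other):
                merged |= other_flags
        flat[net] = merged
    return flat


# Configuration lists each prefix with only its own flag.
config = {
    "dead:beef::/32": IP6_PRODNET,
    "dead:beef:dead:beef::/64": IP6_SAMENET,
}
loaded = flatten_prefixes(config)
```

With this input, the /64 entry is loaded as `IP6_PRODNET | IP6_SAMENET`, as in the slide's table, while the /32 entry keeps `IP6_PRODNET` alone.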
16 .BPF Firewall Deployment @ Facebook
• No per-rule stats granularity like in iptables
  • But we have stats per aggregate policy (e.g., PASS same network traffic)
• Keeping it simple : ACCEPT and DROP
  • REJECTS, custom chains possible via bpf_tail_call
17 .BPF Firewall Performance
• iptables has a linearly increasing cpu-util as packets hit lower rules
  • Best when packets match earlier rules
  • Worst for default policy (match attempted for each rule)
• Our BPF firewall performance remains practically constant irrespective of rule being matched or default drop
  • Packet tuple lookup is efficient w/ only 1 BPF map per tuple
  • Location of matching rule now only matters to the extent of few branching instructions
18 .Prototype : iptables in bpf
19 .iptables in bpf Motivation
• A drop-in replacement for iptables in kernel
• User space helper to translate rules and load bpf maps
• bpf program to do minimal work
• Same functionality with:
  • Better performance
  • Customization for attach points (containers ?)
20 .Revisiting : firewall in bpf .. Policy and Network

BPF Array - TCP ports
Key      Value
443      TCP_PUBLIC
8080     TCP_SAMENET
<...>    TCP_PROD
<...>    TCP_SAMENET | TCP_PROD

BPF LPM - IP6 Prefix
Key                         Value
dead:beef::/32              IP6_PRODNET
dead:beef:dead:beef::/64    IP6_PRODNET | IP6_SAMENET

NETWORK
21 .Maps hold matching rules ids
1 Map per attribute : 5 tuples

BPF LPM - src IP6 Prefix
Key                         Value
dead:beef::/32              1100111111.................111111
dead:beef:dead:beef::/64    1110111111.................111111
abcd:abcd::/32              1001111111.................111111
* (key not found)           1000111111.................111111

BPF Array - dest ports
Key      Value
443      1001111111.....1110
8080     0101111111.....1110
<...>    0011111111.....1110
<...>    0000111111.....1111

Algorithm
• Value: 1 bit per rule (1K rules = 16 * u64 / 128Bytes)
• wild-card attribute for a rule: mark as "1"
• Logically AND values for each attribute
• First bit set, is the highest order rule matched
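The algorithm on this slide can be sketched in user space as follows. This is a simplified model, not the prototype's code: it covers only two of the five attributes (source prefix and destination port), the four rules are hypothetical, and a Python integer stands in for the array of u64 words, with bit i representing rule i so that the lowest set bit is the first (highest priority) matching rule.

```python
import ipaddress

# Hypothetical rules, highest priority first. None means wildcard.
RULES = [
    {"src": "dead:beef:dead:beef::/64", "dport": None, "verdict": "ACCEPT"},
    {"src": "dead:beef::/32", "dport": 8080, "verdict": "ACCEPT"},
    {"src": None, "dport": 443, "verdict": "ACCEPT"},
    {"src": None, "dport": None, "verdict": "DROP"},
]


def wildcard_bits(rules, attr):
    """Bitmap of the rules that do not constrain attr (bit i = rule i)."""
    return sum(1 << i for i, r in enumerate(rules) if r[attr] is None)


def build_port_map(rules):
    """Exact-match map: port -> rules naming that port, plus wildcard rules."""
    any_bits = wildcard_bits(rules, "dport")
    table = {}
    for i, r in enumerate(rules):
        if r["dport"] is not None:
            table[r["dport"]] = table.get(r["dport"], any_bits) | (1 << i)
    return table, any_bits


def build_src_map(rules):
    """LPM map: prefix -> rules whose prefix covers it, plus wildcard rules."""
    any_bits = wildcard_bits(rules, "src")
    nets = {i: ipaddress.ip_network(r["src"])
            for i, r in enumerate(rules) if r["src"] is not None}
    table = {}
    for net in set(nets.values()):
        bits = any_bits
        for i, rule_net in nets.items():
            if net.subnet_of(rule_net):  # rule i covers every address in net
                bits |= 1 << i
        table[net] = bits
    return table, any_bits


PORT_MAP, PORT_ANY = build_port_map(RULES)
SRC_MAP, SRC_ANY = build_src_map(RULES)


def lpm_lookup(addr):
    """Longest-prefix match; a miss returns the wildcard-only bitmap."""
    ip = ipaddress.ip_address(addr)
    hits = [net for net in SRC_MAP if ip in net]
    if not hits:
        return SRC_ANY
    return SRC_MAP[max(hits, key=lambda net: net.prefixlen)]


def match(src, dport):
    """AND the per-attribute bitmaps; return the first matching rule index."""
    bits = lpm_lookup(src) & PORT_MAP.get(dport, PORT_ANY)
    if bits == 0:
        return None
    return (bits & -bits).bit_length() - 1  # index of the lowest set bit
```

For example, `RULES[match("dead:beef:1::1", 8080)]` is the second rule: the source falls in the /32 but not the /64, so rule 0 is masked out by the source bitmap and rule 1 is the first bit left after the AND.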
22 .BPF program
23 .BPF Firewall prototype Performance
• Similar to our implementation with custom BPF program
• Performance remains practically constant irrespective of packet stream
  • Packet tuple lookup is efficient w/ only 1 BPF map per tuple
  • Location of matching rule now only matters to the extent of finding the first bit set in an array of integers.
24 .BPF Firewall prototype Implementation
• iptables-save output used as input by python loader w/ BCC
• Simple parser for INPUT chain
  • 5 Tuples and TCP Flags
  • Reject rules not yet supported
• Stateless firewall
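A minimal sketch of what a "simple parser for INPUT chain" could look like is shown below. It is not the prototype's loader: the sample rules, the `OPTIONS` table, and `parse_input_chain` are hypothetical, the BCC map-loading step is left out, and only a handful of `iptables-save` options are recognised.

```python
import shlex

# Hypothetical sample in iptables-save format.
SAMPLE = """\
*filter
:INPUT DROP [0:0]
-A INPUT -p tcp -m tcp --dport 443 -j ACCEPT
-A INPUT -s 10.0.0.0/8 -p tcp -m tcp --dport 8080 -j ACCEPT
-A INPUT -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -j DROP
-A FORWARD -j DROP
COMMIT
"""

# Options that take exactly one argument, mapped to rule field names.
OPTIONS = {"-s": "src", "-d": "dst", "-p": "proto",
           "--sport": "sport", "--dport": "dport", "-j": "target"}


def parse_input_chain(text):
    """Return a list of dicts, one per '-A INPUT' rule, in chain order."""
    rules = []
    for line in text.splitlines():
        if not line.startswith("-A INPUT "):
            continue  # table headers, other chains, COMMIT, comments
        tokens = shlex.split(line)
        rule = {}
        i = 2  # skip "-A INPUT"
        while i < len(tokens):
            tok = tokens[i]
            if tok in OPTIONS:
                rule[OPTIONS[tok]] = tokens[i + 1]
                i += 2
            elif tok == "--tcp-flags":
                # Two arguments: the mask, then the flags that must be set.
                rule["tcp_flags"] = (tokens[i + 1], tokens[i + 2])
                i += 3
            elif tok == "-m":
                i += 2  # match-module name; not needed for these fields
            else:
                raise ValueError(f"unsupported option: {tok}")
        rules.append(rule)
    return rules
```

Anything the sketch does not understand (negation with `!`, other match modules) raises an error instead of being silently ignored, which seems the safer default for a firewall loader.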
25 .bpfilter Lessons learnt
• Drive adoption by retaining iptables as control interface
  • `bpfilter` option to iptables commands
  • iptables-save for configuration
  • iptables-restore : Allow XDP / tc mode attach point
  • iptables : Display rules as loaded in bpfilter
  • stats read and reset support
• Pitch : An optional high performant firewall implementation with some configuration restrictions
26 .References
• bpfilter : https://lwn.net/Articles/747551
• katran : https://github.com/facebookincubator/katran
• droplet : https://netdevconf.org/2.1/session.html?zhou
• SIGCOMM 2018 - Accelerating Linux Security with eBPF iptables : https://dl.acm.org/citation.cfm?id=3234228
27 .Questions
28 .Thank you