Mar 23, 2018 - Network Telemetry. Are we reinventing the wheel? Self-driving Networks. High reliability. Easy-to-manage. Software-defined Networks.
An Evolution of Network Telemetry
Tal Mizrahi, Marvell
Telemetry BoF
Network Telemetry
Self-driving Networks
Easy-to-troubleshoot
High reliability
Software-defined Networks
High speed
Easy-to-manage
Network Telemetry Are we reinventing the wheel? 3/23/2018
Ping / Traceroute
Reinventing the wheel?
Counters
Per port
Queue State
Per flow
Latency
Per queue
…
…
Old-School Passive Monitoring
Carrier Network OAM
Higher Layers IP OAM
IETF ICMPv4
IETF ICMPv6
IETF IPPM
Layer 3 MPLS / PWE3 OAM
ITU-T Y.1711 MPLS OAM
ITU-T G.8113.1 MPLS-TP OAM
IETF MPLS-TP OAM
IETF LSP-Ping MPLS OAM
ITU-T Y.1731
Layer 2 Ethernet OAM
IETF PWE3 VCCV
IETF BFD
IEEE 802.1ag IEEE 802.3ah
Layer 1
OAM Protocols
Active measurement / monitoring: Control Message
Piggybacked Measurement
Telemetry is piggybacked onto data packets
IOAM / INT
AM-PM
Piggybacked Metadata – IOAM / INT
Analytics Server Telemetry Info
IOAM / INT Domain
Switches push local metadata into header: delay, queue state, …
IOAM (IETF) INT (P4) +SAI TAM
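The per-hop behavior described on this slide can be sketched as follows. This is only an illustration of the piggybacking idea, not the IOAM or INT wire format: `HopMetadata`, `Packet`, `transit` and `export_at_edge` are hypothetical names, and the three fields are assumed examples of "delay, queue state, …".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HopMetadata:
    # Hypothetical per-hop record; real IOAM/INT field sets differ.
    switch_id: int
    delay_ns: int
    queue_depth: int


@dataclass
class Packet:
    payload: bytes
    telemetry: List[HopMetadata] = field(default_factory=list)


def transit(packet: Packet, switch_id: int, delay_ns: int, queue_depth: int) -> Packet:
    """A switch in the domain pushes its local metadata onto the data packet."""
    packet.telemetry.append(HopMetadata(switch_id, delay_ns, queue_depth))
    return packet


def export_at_edge(packet: Packet) -> List[HopMetadata]:
    """Assumed edge behavior: strip the telemetry and hand it to the analytics server."""
    records, packet.telemetry = packet.telemetry, []
    return records


pkt = Packet(payload=b"data")
for sid, delay, depth in [(1, 800, 3), (2, 1500, 12), (3, 700, 0)]:
    transit(pkt, sid, delay, depth)

path = export_at_edge(pkt)
total_delay_ns = sum(hop.delay_ns for hop in path)
```

Because every hop appends a record, the header grows with path length, which is the overhead the later slides on selective exporting try to limit.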
AM-PM: Alternate Marking – Performance Measurement
(RFC 8321)
Counts number of packets sent
Counts number of packets received
Data Center Network Servers
Servers
Packets Sent: 10,000
Packets Received: 9,500
Packets Lost: 500
Analytics Server
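The loss figure on this slide is the difference between two counters taken over the same marked block of packets. A minimal sketch of that arithmetic, assuming a single flow and a two-color marking scheme, is shown here; `ColorCounters` and the simulated 1-in-20 drop pattern are hypothetical and chosen only to reproduce the slide's numbers.

```python
class ColorCounters:
    """Per-flow packet counters, one per marking color (0 or 1)."""

    def __init__(self) -> None:
        self.counts = [0, 0]

    def observe(self, color: int) -> None:
        self.counts[color] += 1

    def read_and_reset(self, color: int) -> int:
        # Read the counter of a block that has finished, then clear it for reuse.
        value, self.counts[color] = self.counts[color], 0
        return value


sender, receiver = ColorCounters(), ColorCounters()
color = 0  # all packets of this block carry the same marking

for seq in range(10_000):
    sender.observe(color)
    if seq % 20 != 0:  # simulate the network dropping 1 packet in 20
        receiver.observe(color)

sent = sender.read_and_reset(color)
received = receiver.read_and_reset(color)
lost = sent - received
```

Counting per color lets both measurement points agree on which packets belong to a block without exchanging sequence numbers.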
Network Telemetry: An Evolution
Reinventing the wheel?
NO!
Ping / Traceroute
Passive Monitoring
Carrier OAM
IOAM / INT
AM-PM
Marvell’s Network Telemetry Solutions
Active Telemetry
Selective Probing
Timestamping
Metadata Grab-’n-Go
Passive Telemetry
Counters
Flow-based Probes
IEEE 1588
Per-hop Telemetry
Metadata: Grab-’n-Go
Packet Metadata
Incoming Packet
Fixed Programmable Logic Programmable
Modified Packet
Packet Processing
Marvell Prestera
Timestamping Everything
Packet Metadata
Incoming Packet
Packet Processing
Outgoing Packet
IEEE 1588 Synchronized Clock
Marvell Prestera
✔ Periodic probing ✔ AM-PM ✔ TimeFlip
AM-PM using TimeFlip
Marvell Prestera Switch
[Figure: TCAM entry over header / timestamp metadata. Columns: field2, field3, field4, …, Time.Sec, Time.Frac, action. Labels: wildcard patterns (*…*), the value 1, a periodic range of 1 second over the timestamp value.]
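One reading of this figure is that a single ternary (TCAM-style) entry on the timestamp can select every other one-second interval, which is what alternate marking needs. The sketch below assumes that reading: it matches only the least significant bit of Time.Sec and wildcards everything else. The function names and the `period_bit` parameter are hypothetical and do not describe the Prestera implementation.

```python
from typing import Tuple


def ternary_match(value: int, pattern: int, mask: int) -> bool:
    """TCAM-style match: only the bits set in `mask` are compared."""
    return (value & mask) == (pattern & mask)


def periodic_entry(period_bit: int) -> Tuple[int, int]:
    """Return (pattern, mask) matching when bit `period_bit` of Time.Sec is 1."""
    mask = 1 << period_bit
    return mask, mask


def marking_color(time_sec: int, period_bit: int = 0) -> int:
    """Color 1 while the entry matches, color 0 otherwise."""
    pattern, mask = periodic_entry(period_bit)
    return 1 if ternary_match(time_sec, pattern, mask) else 0


# With period_bit=0 the color flips every second.
colors_1s = [marking_color(sec) for sec in range(6)]
# With period_bit=1 the color flips every two seconds.
colors_2s = [marking_color(sec, period_bit=1) for sec in range(8)]
```

Because the range is expressed as a bit mask, the interval length is limited to powers of two of the timestamp's unit in this sketch.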
Per-Flow Congestion Detection using AM-PM
Loss and delay → congestion is detected
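A per-flow check along these lines could look like the sketch below. The thresholds, the `BlockStats` fields, and the choice to require both loss and delay before declaring congestion are assumptions made for illustration; the slide does not specify them.

```python
from dataclasses import dataclass

# Hypothetical thresholds; real deployments would tune these per flow or per class.
LOSS_RATIO_THRESHOLD = 0.01       # 1% of the block lost
DELAY_THRESHOLD_NS = 1_000_000    # 1 ms


@dataclass
class BlockStats:
    """AM-PM measurements for one marked block of one flow."""
    sent: int
    received: int
    tx_timestamp_ns: int  # timestamp taken at the first measurement point
    rx_timestamp_ns: int  # timestamp taken at the second measurement point


def is_congested(stats: BlockStats) -> bool:
    """Flag congestion when both loss and delay exceed their thresholds."""
    loss_ratio = (stats.sent - stats.received) / stats.sent if stats.sent else 0.0
    delay_ns = stats.rx_timestamp_ns - stats.tx_timestamp_ns
    return loss_ratio > LOSS_RATIO_THRESHOLD and delay_ns > DELAY_THRESHOLD_NS


lossy_and_slow = BlockStats(sent=10_000, received=9_500,
                            tx_timestamp_ns=0, rx_timestamp_ns=2_000_000)
slow_only = BlockStats(sent=10_000, received=10_000,
                       tx_timestamp_ns=0, rx_timestamp_ns=2_000_000)
```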
Selective Exporting
Analytics Server
Telemetry Info
✔ Drop ✔ Congestion ✔ 1 of N ✔ Periodic
Marvell Prestera
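The four export criteria on this slide (drop, congestion, 1 of N, periodic) can be combined into a single decision function. The sketch below is a software illustration under assumed names (`PacketEvent`, `SelectiveExporter`) and an assumed priority order; it does not describe how the switch evaluates these criteria.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PacketEvent:
    seq: int
    timestamp_ms: int
    dropped: bool = False
    congested: bool = False


class SelectiveExporter:
    def __init__(self, sample_n: int, period_ms: int) -> None:
        self.sample_n = sample_n
        self.period_ms = period_ms
        self._last_periodic_ms: Optional[int] = None

    def export_reason(self, ev: PacketEvent) -> Optional[str]:
        """Return why telemetry should be exported for this packet, or None."""
        if ev.dropped:
            return "drop"
        if ev.congested:
            return "congestion"
        if ev.seq % self.sample_n == 0:
            return "1-of-N"
        due = (self._last_periodic_ms is None
               or ev.timestamp_ms - self._last_periodic_ms >= self.period_ms)
        if due:
            self._last_periodic_ms = ev.timestamp_ms
            return "periodic"
        return None


exporter = SelectiveExporter(sample_n=100, period_ms=1000)
reasons = [
    exporter.export_reason(PacketEvent(seq=1, timestamp_ms=0, dropped=True)),
    exporter.export_reason(PacketEvent(seq=2, timestamp_ms=1, congested=True)),
    exporter.export_reason(PacketEvent(seq=200, timestamp_ms=2)),
    exporter.export_reason(PacketEvent(seq=3, timestamp_ms=5)),
    exporter.export_reason(PacketEvent(seq=4, timestamp_ms=10)),
    exporter.export_reason(PacketEvent(seq=5, timestamp_ms=1005)),
]
```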
IOAM/INT: Periodic Exporting
Analytics Server
Export one packet per second
Telemetry Info
Per flow / per port
Export last ms of every second
IOAM / INT Domain
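Both periodic policies on this slide reduce to simple tests on the packet timestamp. The sketch below assumes nanosecond timestamps and a per-flow key; `in_last_ms` and `OnePerSecond` are hypothetical names used only for illustration.

```python
from typing import Dict, Hashable

NS_PER_S = 1_000_000_000
NS_PER_MS = 1_000_000


def in_last_ms(timestamp_ns: int) -> bool:
    """True when the timestamp falls in the last millisecond of its second."""
    return timestamp_ns % NS_PER_S >= NS_PER_S - NS_PER_MS


class OnePerSecond:
    """Export the first packet seen for each flow (or port) in every second."""

    def __init__(self) -> None:
        self._last_second: Dict[Hashable, int] = {}

    def should_export(self, key: Hashable, timestamp_ns: int) -> bool:
        second = timestamp_ns // NS_PER_S
        if self._last_second.get(key) == second:
            return False
        self._last_second[key] = second
        return True


policy = OnePerSecond()
decisions = [
    policy.should_export("flow-a", 100),
    policy.should_export("flow-a", 200),
    policy.should_export("flow-a", NS_PER_S + 100),
    policy.should_export("flow-b", 300),
]
```

The first policy bounds the export rate per flow; the second exports a short burst, which keeps the relative timing of packets within that window.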
IOAM/INT: Adaptive Exporting
Analytics Server
Event-driven: ✔ On drop ✔ On congestion ✔ On high rate
IOAM / INT Domain
Combining IOAM/INT with AM-PM
Analytics Server
Challenge: export telemetry info without the expensive overhead of IOAM/INT.
AM-PM as a trigger for exporting telemetry info.
Per-hop Telemetry
IOAM / INT Domain
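The trigger idea can be sketched as a small state machine: AM-PM runs on every flow, and per-hop telemetry is exported only for flows whose latest block showed loss. `TriggeredTelemetry`, its methods, and the loss threshold are hypothetical; the slide does not define the trigger condition.

```python
from typing import Hashable, Set


class TriggeredTelemetry:
    """Export per-hop telemetry only for flows flagged by AM-PM."""

    def __init__(self, loss_threshold: int = 0) -> None:
        self.loss_threshold = loss_threshold
        self.flagged: Set[Hashable] = set()

    def report_block(self, flow_id: Hashable, sent: int, received: int) -> None:
        """Update the trigger from the AM-PM counters of one finished block."""
        if sent - received > self.loss_threshold:
            self.flagged.add(flow_id)
        else:
            self.flagged.discard(flow_id)

    def should_export(self, flow_id: Hashable) -> bool:
        return flow_id in self.flagged


trigger = TriggeredTelemetry()
trigger.report_block("flow-a", sent=10_000, received=9_500)
trigger.report_block("flow-b", sent=10_000, received=10_000)
export_a = trigger.should_export("flow-a")
export_b = trigger.should_export("flow-b")

# A later clean block clears the trigger for flow-a.
trigger.report_block("flow-a", sent=10_000, received=10_000)
export_a_after = trigger.should_export("flow-a")
```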
The Network Telemetry Toolset
Selective Probing
Counters
Timestamping
Metadata Grab-’n-Go
Flow-based Probes
IEEE 1588
Marvell Prestera
References
[1] Mizrahi, T., Vovnoboy, V., Nisim, M., Navon, G., and A. Soffer, “Network Telemetry Solutions for Data Center and Enterprise Networks”, Marvell white paper, 2018.
[2] Brockners, F., Bhandari, S., Pignataro, C., Gredler, H., Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov, P., Chang, R., and D. Bernier, “Data Fields for In-situ OAM”, draft-ietf-ippm-ioam-data-00, work in progress, 2017.
[3] Kim, C., et al., “In-band network telemetry (INT)”, P4 consortium, 2015.
[4] Fioccola, G., Capello, A., Cociglio, M., Castaldelli, L., Chen, M., Zheng, L., Mirsky, G., and T. Mizrahi, “Alternate Marking method for passive and hybrid performance monitoring”, RFC 8321, 2018.
[5] Mizrahi, T., Rottenstreich, O., and Y. Moses, “TimeFlip: Scheduling Network Updates with Timestamp-based TCAM Ranges”, IEEE INFOCOM, 2015.