You could
1) Filter the broken message
2) Drop the broken message
3) Ignore the broken attributes but pass them on
4) Break with the broken attributes
To me, only 4 (Arista) is the really unacceptable behaviour. 3 (Juniper) isn't desirable but it's not a devastating behaviour.
EDIT: Actually rereading it, Arista did 2 rather than 4. I think it just closed the connection as being invalid rather than completely crash. That's arguably acceptable, but not great for the users.
One, if everyone is doing something different from the spec, it is hard to figure out what they are really doing and what they mean. If everyone follows the spec, you have long-term confidence things will continue to work even when someone else writes their own implementation, which might otherwise also deviate from the spec.
Two, it is easier to modify the spec as more features are dreamed up if you have confidence that the spec is boss, meaning someone else didn't already use that field for something different (which you may not have heard about yet).
Three, if you agree to a spec you can audit it (think security), if nobody even knows what the spec is that is much harder.
Following the spec is harder in the early days. You have to put more effort into the spec because you can't discover a problem and just patch it in code. However the internet is far past those days. We need a spec that is the rule that everyone follows exactly.
You're imagining a world where things get specified and implemented completely correctly. Which does not exist and probably can't!
There's a difference between unknown extensions following a known format, and data that's simply broken (e.g. offset pointer past end of data).
The internet is ossified because middleboxes stick their noses where they shouldn't. If they just route IP packets, we could have had nice things like SCTP...
It's not a concession you want to make unless you really have to.
And that's because the end-user is at the mercy of, but not party to, an over the air interface between the producer and consumer that you can't verify ahead of time.
So if you're consuming a stream of supposed xhtml `<p>foo<p>bar</p>`, you have to decide if you want to screw the user for the producer's mistake for a single fuck up in the website's footer.
A BGP Update message is mostly just a container of Type-Length-Value attributes. As long as the TLV structure is intact, you should be able to just pass on those TLVs without problems to any peers that the route is destined for.
The problem fundamentally is three things:
1. The original BGP RFC suggests tearing down the connection upon receiving an erroneous message. This is a terrible idea, especially for transitive attributes: you'll just reconnect and your peer will resend you the same message, flapping over and over, and the attribute is likely to not even be your peer's fault. The modern recommendation is Treat As Withdraw, i.e. remove any matching routes from the same peer from your routing table.
2. A lack of fuzz testing and similar by BGP implementers (Arista in this case)
3. Even for vendors which have done such testing, a number of them have decided (IMO stupidly) to require you to turn on these robustness features explicitly.
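A minimal sketch of what that attribute-level robustness can look like (illustrative only, not any vendor's implementation): parse the TLV block of an UPDATE, and on any structural error signal treat-as-withdraw instead of tearing down the session.

```python
# Hypothetical sketch of RFC 7606 "treat-as-withdraw" handling for BGP
# path attributes. The TLV layout follows RFC 4271, but this parser is
# illustrative, not a real implementation.
import struct

def parse_path_attributes(data: bytes):
    """Parse the TLV attribute block of an UPDATE. Returns (attrs, ok).
    On any structural error, returns ([], False): the caller should treat
    the whole UPDATE as a withdraw instead of dropping the session."""
    attrs, i = [], 0
    while i < len(data):
        if i + 2 > len(data):
            return [], False                  # truncated flags/type
        flags, attr_type = data[i], data[i + 1]
        extended = bool(flags & 0x10)         # Extended Length flag bit
        hdr = 4 if extended else 3
        if i + hdr > len(data):
            return [], False                  # truncated length field
        if extended:
            length = struct.unpack_from(">H", data, i + 2)[0]
        else:
            length = data[i + 2]
        if i + hdr + length > len(data):
            return [], False                  # length points past end of data
        attrs.append((attr_type, data[i + hdr:i + hdr + length]))
        i += hdr + length
    return attrs, True
```

The key point is the return convention: a malformed attribute never raises or resets anything, it just tells the caller to withdraw the affected routes.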
In practice there are many ways to allow a protocol to evolve, and being liberal in what you accept is just about the worst way to achieve that. The most obvious alternative is to version the protocol, and have each node support multiple versions.
Old nodes will simply not receive messages for a version of the protocol they do not speak. The subset of nodes supporting a new version can translate messages into older versions of the protocol where it makes sense, and they can do this because they speak the new protocol, so can make an intelligent decision. This allows the network to function as a single entity even when only a subset is able to communicate on the newer protocol.
With strict versioning and compliance to specification, reference validators can be built and fitted as barriers between subnetworks so that problems in one are less likely to spread to others. It becomes trivial for anyone to quickly detect problems in the network.
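A tiny sketch of that model (the message fields "new_field" and "requires_v2" are invented for illustration): nodes negotiate the highest common version, and newer nodes translate messages down for older peers where it makes sense.

```python
# Illustrative version negotiation + downward translation, as described
# above. Not modeled on any real protocol's wire format.

def negotiate(mine, theirs):
    """Speak the highest protocol version both sides support."""
    common = mine & theirs
    if not common:
        raise ValueError("no common protocol version")
    return max(common)

def translate_v2_to_v1(msg):
    """A v2-aware node downgrades a v2 message for a v1-only peer,
    dropping fields v1 can't express; returns None if untranslatable."""
    if msg.get("version") != 2:
        return msg
    if msg.get("requires_v2"):
        return None                  # can't express this in v1: don't forward
    out = {k: v for k, v in msg.items() if k != "new_field"}
    out["version"] = 1
    return out
```

Because the translating node actually speaks v2, it can make an intelligent decision (drop, rewrite, or refuse to forward) rather than blindly passing opaque bytes along.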
Now we're experiencing the downside of this "feature".
In fact it's a bit strange just how lenient Juniper's software was here. If a session is configured as IBGP on one end and EBGP on the other end, it should never get past the initial message. Juniper not only let it get past the connection establishment but forwarded obviously wrong routes.
Rather than the inverse where you only forward things explicitly and by default do not forward.
You can't configure it like that; most of the BGP implementations I'm familiar with automatically treat a same-AS neighbor as iBGP and a different-AS neighbor as eBGP.
Juniper explicitly has 'internal' and 'external' neighbors, but you can't configure a different peer AS than your own on an internal neighbor or the same peer AS on an external neighbor.
BGP sessions also have the AS of the neighbor specified in the local config, and will not bring up the session if it's not what's configured.
IMHO, just drop the broken attributes in the message and log them, and pass on the valid data if there's any left. If not, pretend you did not receive an UPDATE message from that particular peer.
Monitoring will catch the offending originator and people can deal with this without having to deal with any network instability.
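A sketch of that "attribute discard" behaviour (the validate() callback is a stand-in for per-attribute checks; this is a suggestion made concrete, not any router's actual code): strip attributes that fail validation, log them, and forward the remainder; if nothing valid is left, act as if no UPDATE arrived.

```python
# Illustrative attribute-discard filter: drop and log malformed
# attributes, pass on whatever valid data is left.
import logging

log = logging.getLogger("bgp.update")

def filter_attributes(attrs, validate):
    """attrs: list of (type, payload) tuples.
    Returns the surviving attributes, or None if nothing valid remains
    (i.e. pretend the UPDATE was never received)."""
    kept = []
    for attr_type, payload in attrs:
        if validate(attr_type, payload):
            kept.append((attr_type, payload))
        else:
            log.warning("dropping malformed attribute type=%d len=%d",
                        attr_type, len(payload))
    return kept or None
```

The log line is what gives monitoring something to alarm on, so the offending originator gets found without any session churn.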
(I haven't downvoted your comment, but I can see why others would — you're making very simple and definite statements about very complicated problems, and you don't seem to be aware of the complications involved. Hence: your calibration is a bit off.)
My point is that:
- if you drop a connection, especially one through which you announce the full routing table, it is going to create a lot of churn for your downstreams. Depending on the kind of routers they use, it can create some network instability for quite a while. And if you drop it again when you receive that malformed route again, the instability continues
- removing only the malformed attribute maybe changes the way you treat traffic but you still route it. OK, you send it to maybe another interface, but no biggie
- if you’re using a DFZ setup, dropping that single route could blackhole traffic to that destination if you’re the only upstream to another router
Yes, but then again since you have logs of why it was dropped (like I suggested in my first post, to log everything dropped), you can easily troubleshoot the problem. A much better outcome than flapping a BGP session for no good reason and creating route churn and network instability.
And I'm TSC emeritus and >10 year maintainer on FRRouting, and active at IETF. Yet I hugely respect the other people there, all of whom have areas of expertise where they far outrank my own.
I have very strong opinions about some subjects, one of them being BGP.
I believe sessions should not be torn down just because you receive malformed data. You should be able to remove just the corrupt data, or treat it as a withdraw, as one of the RFCs recommends.
I for one would like knobs to match on any attribute and value and remove/rewrite them at will. Imagine something akin to a very smart HTTP proxy.
At a glance this “feature” seems like an incredibly bad idea, as it allows possibly unknown information to propagate blindly through systems that do not understand the impact of what they are forwarding. However this feature has also allowed widespread deployment of things like Large Communities to happen faster, and has arguably made deployment of new BGP features possible at all.
The most common approach is 'treat-as-withdraw', i.e. handle the update (announcement of a route) as if it were a withdraw (removal of a previously announced route). You should not just drop the broken message, as that would lead to keeping old, no-longer-valid state.
What you're paraphrasing here is the so-called "robustness principle", also known as "Postel's law". It is an idea from the ancient history of the 1980s and 90s Internet. Today, it's widely understood that it is a misguided idea that has led to protocol ossification and countless security issues.
I'd only expect security issues to result from being overly liberal but 1. I wouldn't expect it to be very common and 2. I'm not at all convinced that's a compelling argument to reduce the robustness of an implementation.
open source something?
For example for networking you can have packets sent using TCP or UDP, but actually there could be any number of protocols used. But for decades it was literally only ever those two. Then when QUIC came about, they couldn't implement it at the layer it was meant to be because all the routers and software were not built to accept anything other than TCP or UDP.
There's been a bunch of thought into how to stop this stuff, like making sure anything that can change regularly does, or using encryption to hide everything from routers and software that might want to inspect and tamper with it.
It's not related to open source software. The seemingly matching prefix is coincidence :-)
Something like "this page is best viewed in Internet Explorer" as applied to HTML.
Suppose you get a message that violates the standard. It has a length field for a subsection that would extend beyond the length of the entire message. Should you accept this message? No, burn it with fire. It explicitly violates the standard and is presumably malicious or a result of data corruption.
Now suppose you get a message you don't fully understand. It's a DNS request for a SRV record but your DNS cache was written before SRV records existed. Should you accept this message? Yes. The protocol specifies how to handle arbitrary record types. The length field is standard regardless of the record type and you treat the record contents as opaque binary data. You can forward it upstream and even cache the result that comes back without any knowledge of the record format. If you reject this request because the record type is unknown, you're the baddies.
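To make the contrast concrete, here's a toy parser (the record layout is invented for illustration, not real DNS wire format) that rejects the structurally broken case but passes unknown record types through opaquely:

```python
# Illustrative "strict on structure, liberal on unknown types" handler.
# Record layout (hypothetical): 2-byte type, 2-byte length, then payload.
import struct

KNOWN_TYPES = {1: "A", 28: "AAAA"}   # SRV (33) deliberately not listed

def handle_record(msg: bytes):
    """Returns ('reject', None), ('ok', payload), or ('opaque', payload)."""
    if len(msg) < 4:
        return ("reject", None)
    rtype, rlen = struct.unpack_from(">HH", msg, 0)
    if 4 + rlen > len(msg):
        return ("reject", None)      # explicit violation: burn it with fire
    payload = msg[4:4 + rlen]
    if rtype in KNOWN_TYPES:
        return ("ok", payload)
    return ("opaque", payload)       # unknown type: forward, don't reject
```

The structural check (length vs. actual message size) applies uniformly; only after the message is proven well-formed does "unknown type" become a legitimate, forwardable case.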
There are many cases where the RFC is not at all ambiguous about what you're supposed to do, and then some implementation doesn't do it. What should you do in response to this?
If you accept their garbage bytes, things might seem less broken in the short term, but then every implementation is stuck working around some fool's inability to follow directions forever, and the protocol now contains an artificial ambiguity because the bytes they put there now mean both what they're supposed to mean, and also what that implementation erroneously uses them to mean, and it might not always be detectable which case it is. Which breaks things later.
Whereas if you hard reject explicit violations of the standard then things break now and the people doing the breaking are subject to complaints and required to be the ones who stop doing that, rather than having their horkage silently and permanently lower the signal to noise ratio by another increment for everyone else.
One of the main problems here is that people want to be on the side of the debate that allows them to be lazy. If the standard requires you to send X and someone doesn't want to do the work to be able to send X then they say the other side should be liberal in what they accept. If the standard requires someone to receive X and they don't want to do the work to be able to process X then they say implementations should be strict in what they accept and tack on some security rationalization to justify not implementing something mandatory and thereby break the internet for people who aren't them.
But you're correct that there is no IETF court, which is why we need something in the way of an enforcement mechanism. And what that looks like is to willingly cause trouble for the people who violate standards, instead of the other side covering for their bad code.
And, if your project is on GitHub, gets your Issues page absolutely clowned on because you're choosing to do the right thing technically and the leeching whiners shitting up the Issues don't want to contribute a goddamn thing other than complaints, and they definitely don't want to go to the authors of the thing that doesn't work with your stuff and try and get that fixed either.
Widely claimed by some but certainly not "widely understood" because such phrasing implies a lack of controversy regarding the claim that follows it.
The sane approach is to be strict and provide great error messages.
A good HTML might not even look like HTML.
Remember, what this really means is "drop the message I *think* is broken", not inherently "drop the broken message": it's entirely plausible that the message is fine but you have a bug which makes you THINK it's a broken message.
There is also a huge difference between considering it a broken message and a broken session, which is what Arista did.
I can "play" with TCP/IP at home in dummy projects and learn more about it... but I have no idea how to "play" with BGP. In that regard, how does one learn about it at home?
Another fun thing is to log into publicly available looking glass servers. Most ISPs (including very, very, very large ones) operate routers that have their full view of the BGP routing tables. They either run web interfaces that let you query those tables (more common) or make public ssh or telnet credentials to log in with roles that have very limited access to the available commands, but have read rights to those tables.
https://www.routeviews.org/routeviews/about/ https://stat.ripe.net/docs/02.data-api/
Or did you mean multipoint?
Border Gateway Protocol (BGP) has the primary purpose of sharing routes between routers managed by different organizations. It can be used within an organization too. It has a lot more control over how and which routes it sends and receives.
Now it seems that teams are far more specialised and there's less cross-specialist learning.
Don't let my ignorance color your opinion of the youth of today.
It depends on what you study.
I did more of a sysadmin track, you (probably?) did pure comp sci/dev and would not encounter OSPF in a dev job (probably).
Check out this blog (not me, I just remember it from years back): https://blog.thelifeofkenneth.com/2017/11/creating-autonomou...
If you like guided tutorials, https://blog.ipspace.net/2023/08/bgp-labs-basic-setup/ is rather good and has been extended to somewhat advanced topics. Everything needed to follow along is free software.
It lets you set up multiple containers with direct connections between them in whatever topology you want. It allows you to run both Linux containers (with FRR, for example) and emulated versions of popular router platforms (some of the ones mentioned in the article).
DESCRIPTION
bgpd is a Border Gateway Protocol (BGP) daemon which manages the network
routing tables. Its main purpose is to exchange information concerning
"network reachability" with other BGP systems. bgpd uses the Border
Gateway Protocol, Version 4, as described in RFC 4271.
To experiment with BGP you could use a network simulator like the author of this blog did. In my class we used something called gini[1], which I think my prof's grad student wrote, but the author apparently used GNS3, which despite the name is a separate graphical network emulator rather than a Cisco-specific ns3 variant. I used ns3 once and found it had a steep learning curve. The gini simulator has a more basic user interface but is probably less powerful.
[1] https://citelab.github.io/gini5/ [2] https://docs.gns3.com/docs/
GNS3 is probably the easiest way to get hands on experience with any networking technologies.
BGP runs the internet routing "in the background" and you only need to know it if you're an internet service provider or work in a large org managing the network. If you didn't learn network routing, you aren't going to learn BGP.
Put two or three VMs (OpenBSD has the OpenBGPD daemon) onto a shared virtual switch with addresses in 172.31.255.0/24, and connect the VMs. Each VM should also have at least one other interface on a unique virtual switch with its own network (172.31.1.0/24, 172.31.2.0/24, etc).
Then set up BGP to redistribute connected routes.
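A sketch of what one VM's config could look like (addresses and AS numbers invented to match the lab above; OpenBGPD bgpd.conf syntax, so double-check against your bgpd.conf(5) version):

```
# Hypothetical /etc/bgpd.conf for the first lab VM
AS 65001
router-id 172.31.255.1

network 172.31.1.0/24            # announce this VM's stub network

neighbor 172.31.255.2 {
        remote-as 65002
        descr "lab-vm2"
}
```

Give each VM its own AS number and you get eBGP sessions across the shared switch; use the same AS everywhere for iBGP instead.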
One way to play with it is something like this: https://www.eve-ng.net/
The other is to make a couple of virtual machines with a couple of network interfaces each, make some sort of network between them, and then use a BGP routing daemon, e.g.:
https://www.nongnu.org/quagga/
etc.
The first widespread incident I found was from 1997 [1], but I didn't look too hard.
I don't think there's really a satisfying way to play with BGP as a small network. Traffic engineering is where I think the fun would be, but you've got to have lots of traffic and many connections for it to be worthwhile. Then you'd be trying to use your announcements to coax the rest of the internet to push packets through the connections you want. As well as perhaps adjusting your outgoing packets to travel the connections you prefer when possible. Sadly, nobody lets me play with their setup.
One of the ways to get a sense of emergent routing behavior is if you have hosting in many places, you'll likely see a lot of differences in routes when you start going to far off countries. If you run traceroutes (or mtr) from your home internet and your cell phone and various hosting, and if you can trace back... you'll likely see a diversity of routes. Sometimes you'll see things like going from west coast US to Brazil, where one ISP will send your packets to florida, then Brazil, and one ISP will send your packets to Spain, then Brazil, with a lot more latency.
/Live in Ericsson lands
A lab won't ever reflect the complexity of a carrier environment.
That said, just bang a couple of mikrotiks together if you want to play with it.
Cisco offers some simulator tooling. It basically virtualizes a lot of networking devices and allows you to play LEGO/SimCity with them: Cisco Packet Tracer
https://www.netacad.com/learning-collections/cisco-packet-tr...
Now, we built toy networks from scratch while I was working toward my certification. Surely larger-scale simulation files could be loaded into Packet Tracer. And perhaps, vendors have simulators on a larger scale than the free downloads?
https://developer.cisco.com/modeling-labs/
When I worked at a regional ISP, my supervisor was the BGP wizard. He referred to exterior routing as "a black art". Even more, the telcos were deploying their own technologies like Frame Relay and SMDS, which are Layer 1/Layer 2 protocols beyond the standard "point-to-point" leased lines.
We once experienced a fiber cut on our T-3 backbone (construction workers didn't dial 811). So my supervisor arranged the BGP routes to send everything over a 56k line, IIRC. He gloated about it. The packet loss rate was absurd, but our customers had connectivity!
Yep this seems like a very common experience. I tend to find most environments have one guy making BGP changes outside of project work.
>We once experienced a fiber cut on our T-3 backbone (construction workers didn't dial 811). So my supervisor arranged the BGP routes to send everything over a 56k line, IIRC. He gloated about it. The packet loss rate was absurd, but our customers had connectivity!
The modern version of this: At a small national ISP, we had our intercarrier lines cut. Megaport has this billing model where you only pay for the capacity you use, so our backup intercapital was a 1Mb Megaport service. The intercapital goes down, everyone kicks over to the Megaport, and we just log on to the Megaport portal and raise the bandwidth to a few gig temporarily. It cost almost nothing to keep it sitting there ready for use. And yeah, the engineer responsible was extremely and deservedly smug.
>And perhaps, vendors have simulators on a larger scale than the free downloads?
My experience is that you need both the exact hardware/firmware AND the exact config to perfectly simulate some of the weird and wonderful stuff, largely because so many of the protocol's issues, as the OP suggests, come down to individual vendor implementations of the protocol.
For instance, I used to consult for a small ISP that had a very unreliable peer. That peer would send them routes for everything, but occasionally their PE's routing plane would collapse and stop forwarding traffic to/from their other peers.
We still received enough packets to not trip any failover, and routes were still being advertised. So until they realised and rebooted their hardware, we had to withdraw our routes.
This is the specific behaviour between (IIRC) Cisco IOS-XR on our end, their predominantly mikrotik environment, and their other peers who I believe were mostly juniper.
I can't imagine simulating that without the relevant hardware and configs.
It depends on what you study.
I did more of a sysadmin track, you (probably?) did comp sci/dev and would not encounter BGP in a dev job (probably).
I still think BGP is too complex, and people keep adding new features while vendors keep implementing them based on RFC standards or drafts.
And it seems BGP will never be deprecated, so these sorts of bugs will continue to be found again and again...
Makes me wonder if the BGP protocol is properly fuzzed. Perhaps it's one of those things that everyone is scared to try to knock over, given it's so important.
I suppose it would be easy to write a fuzzer for bgp but very hard to diagnose crashes?
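A toy sketch of the idea (helper names invented; real fuzzing would drive a live BGP speaker and watch for session resets or crashes, as the post linked below the parent comment does):

```python
# Toy mutation fuzzer for BGP UPDATE messages: build a valid message,
# then flip random bytes in the body and feed the result to the target.
import random
import struct

MARKER = b"\xff" * 16

def bgp_update(attrs: bytes) -> bytes:
    """Wrap a path-attribute blob in a minimal UPDATE (RFC 4271 framing)."""
    body = struct.pack(">H", 0)                    # no withdrawn routes
    body += struct.pack(">H", len(attrs)) + attrs
    total = 19 + len(body)                         # 16B marker + 2B len + 1B type
    return MARKER + struct.pack(">HB", total, 2) + body

def mutate(msg: bytes, rng: random.Random) -> bytes:
    """Flip a few random bytes past the 19-byte header."""
    out = bytearray(msg)
    for _ in range(rng.randint(1, 4)):
        i = rng.randrange(19, len(out))
        out[i] = rng.randrange(256)
    return bytes(out)
```

The diagnosis problem is real though: a crash found this way tells you which bytes killed the target, not which of its parsing assumptions was wrong, so you still end up bisecting the mutation by hand.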
Yes, this is exactly what I did in the post I linked to: https://blog.benjojo.co.uk/post/bgp-path-attributes-grave-er...
This is great research!
May 20th was a Tuesday, just sayin'
Fix that, so that only those actually interested in academia per se and not just because they need a checkbox to tick remain, and the problem with cheating in academia will collapse.
CVE-2023-4481 (Juniper)
CVE-2023-38802 (FRR)
CVE-2023-38283 (OpenBGPd)
CVE-2023-40457 (EXOS)
Arista was not affected then.
I guess the author of the article here has written a fuzzer with some coverage, and has come across similar issues before. Astonishing that the vendors don't pick up on this work hungrily.
Is the real issue that each vendor wants lock-in, so won't standardise?
DISCLAIMER: My understanding of BGP is hollow and shallow, I am not an expert.
mgaunard•1d ago
I thought BGP was only for private networks.
FL410•1d ago
Maybe you are thinking of iBGP or something like OSPF?
bc569a80a344f9c•1d ago
BGP also doesn't use multicast, you may be thinking of OSPF on multiaccess networks. BGP uses tcp/179 unicast to the IP addresses of its configured peers.
That said, multicast works just fine over the Internet. It's not commonly used, certainly not by home users and not very often by enterprise users, and was phased out on Internet2 by 2021 (I think?), but there's absolutely nothing in principle that would make it not work.
bc569a80a344f9c•1d ago
Unicasts, multicasts, and broadcasts all actually work differently underneath and require specific handling by network equipment. Anycast is just a special case of unicast and generally speaking network equipment is completely unaware of it.
toast0•1d ago
In principle, no. In practice, I don't think many ISPs have equipment configured to forward multicast, except for those using multicast for TV and those probably don't interconnect with others.
ta1243•1d ago
https://www.bbc.co.uk/multicast/tv/channels.shtml
Brandon Butterworth's note about "why"
https://support.bbc.co.uk/multicast/why.html
Shows the growth of the backbone and CDNs:
> The Olympic audience is expected to be around 50K streams, delivering 10Gbit+ is on the limit of sensible unicast delivery.
In 2020 the BBC's internal CDN was delivering 100 times that [0] for 250k users, and 5 years later I suspect it's another order of magnitude given that iplayer does 5 million concurrent live views quite frequently [1,2]
[0] https://medium.com/bbc-product-technology/bbc-online-2020-in...
[1] https://www.bbc.co.uk/mediacentre/2024/audiences-flock-to-bb...
[2] https://www.bbc.co.uk/mediacentre/2022/england-v-iran-bbc-li...
By 2035 and TV turnoff there's no reason to believe that the infrastructure won't have been able to scale another 100 fold and handle 500 million concurrent live streams. Makes no sense to multicast out 30 different formats, to people on phones and tablets and TVs hanging off wifi. It's a very different consumer experience than a PC wired into an ISP like it was in 2007.
AStonesThrow•3h ago
https://en.wikipedia.org/wiki/Multicast_address
There were more than a few people who spotted how disused this range had become after mbone experiments, and sometimes suggested reclaiming the range as IPv4 address space was being exhausted.
Interestingly, there are reserved multicast addresses (yes, addresses, not ports) not only for OSPF, but for many other interior routing protocols, as well as mDNS, LLMNR, and NTP. Conspicuously absent is any reservation for BGP.
tonetegeatinst•1d ago
BGP is the only way I know of that autonomous Systems can talk to each other and negotiate.
Hikikomori•1d ago
There was something called the mbone back in the day. Nowadays you can't really send random multicast, but it's very much in use by ISPs for IPTV.