Facebook has announced that Fabric Aggregator, the company’s data centre interconnect solution, will be made available through the Open Compute Project (OCP).
The company made the announcement at this week’s OCP Summit in San Jose, where the Fabric Aggregator design and methodology were discussed in detail.
Fabric Aggregator is a distributed, disaggregated network system made up of Facebook-designed Wedge100S switches running the Facebook Open Switching System (FBOSS). The team designed single-rack and multi-rack versions of the assembly, with four cabling configurations.
The Facebook Fabric Aggregator was created to manage traffic for data-intensive applications, including artificial intelligence and machine learning, and has been in use at Facebook for approximately nine months. Sree Sankar, technical product manager at Facebook, said that more than a year ago the company realized it was rapidly running out of capacity, and that scaling to meet demand would require a 300% increase in ports. Simply adding more capacity was unworkable because of the need for greater energy efficiency.
Ms. Sankar noted that while data processing occurs throughout the network, most of the workload is handled in the data centre. The interaction between compute and servers, and between multiple buildings in a region, required “a fabric aggregation layer, comprised of several aggregation nodes.”
The Fabric Aggregator provides scalable aggregation-layer capacity while improving energy efficiency and network resiliency. It uses Wedge100S switches as building blocks in a two-layer cross-connect architecture. Downstream traffic moves within a region and upstream traffic moves between regions, with each type of traffic carried on its own aggregation layer. Subswitches can therefore be added or removed as traffic patterns change, making for a flexible, scalable data centre interconnect solution.
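As a rough illustration of the two-layer arrangement, the following Python sketch models one aggregation node whose downstream and upstream layers are sized independently. It is a toy model, not Facebook's implementation: the class name, the method names, the subswitch labels and the figure of 32 ports per subswitch are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Assumed for illustration only; the article does not give a port count.
PORTS_PER_SUBSWITCH = 32


@dataclass
class AggregationNode:
    """Toy model of one node: two layers built from identical subswitches."""

    layers: dict = field(
        default_factory=lambda: {"downstream": [], "upstream": []}
    )

    def add(self, layer: str, name: str) -> None:
        """Grow one layer by a single subswitch."""
        self.layers[layer].append(name)

    def remove(self, layer: str, name: str) -> None:
        """Shrink one layer by a single subswitch."""
        self.layers[layer].remove(name)

    def ports(self, layer: str) -> int:
        """Total ports offered by a layer."""
        return len(self.layers[layer]) * PORTS_PER_SUBSWITCH


node = AggregationNode()
for i in range(4):
    node.add("downstream", f"ds{i}")
for i in range(2):
    node.add("upstream", f"us{i}")

# Inter-region traffic grows: add two upstream subswitches.
node.add("upstream", "us2")
node.add("upstream", "us3")
# Intra-region traffic shrinks: retire one downstream subswitch.
node.remove("downstream", "ds3")

print(node.ports("downstream"), node.ports("upstream"))  # 96 128
```

Because each layer is only a pool of identical units, changing capacity in one direction does not require touching the other.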
The Wedge100S switches run Border Gateway Protocol (BGP) between subswitches, allowing for a distributed design with no central controller. Subswitches are independent and can be added or removed without affecting the viability of the overall system. They are also interchangeable between upstream and downstream roles, and the system is redundant, so several nodes can be removed at once without affecting overall network performance.
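The controller-free behaviour can be sketched with a greatly simplified path-vector exchange, in which each subswitch learns routes only from its direct peers. This is not real BGP and not FBOSS code: the topology (every downstream subswitch connected to every upstream subswitch), the node names and both helper functions are assumptions made for the example.

```python
def build(downstream, upstream):
    """Return an adjacency list for an assumed two-layer full mesh."""
    links = {name: [] for name in ["fabric", "backbone", *downstream, *upstream]}

    def connect(a, b):
        links[a].append(b)
        links[b].append(a)

    for d in downstream:
        connect("fabric", d)
        for u in upstream:
            connect(d, u)
    for u in upstream:
        connect(u, "backbone")
    return links


def converge(links, origin):
    """Propagate one announcement peer to peer; no node sees the whole graph."""
    best = {origin: (origin,)}
    frontier = [origin]
    while frontier:
        next_frontier = []
        for node in frontier:
            for peer in links[node]:
                if peer in best[node]:
                    continue  # loop prevention: peer is already on this path
                path = best[node] + (peer,)
                if peer not in best or len(path) < len(best[peer]):
                    best[peer] = path  # peer keeps the shortest path it hears
                    next_frontier.append(peer)
        frontier = next_frontier
    return best


# Three subswitches per layer, all healthy.
before = converge(build(["d0", "d1", "d2"], ["u0", "u1", "u2"]), "backbone")
print(" -> ".join(before["fabric"]))  # backbone -> u0 -> d0 -> fabric

# Remove d1, u0 and u1 at once; the survivors re-converge on their own.
after = converge(build(["d0", "d2"], ["u2"]), "backbone")
print(" -> ".join(after["fabric"]))  # backbone -> u2 -> d0 -> fabric
```

The route length is the same before and after the removals, which is the property the redundant design is meant to preserve; actual throughput would of course depend on how much capacity remains.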