Software-Defined Infrastructure at Uber

  • Software-Defined Infrastructure

    “The magic of the Uber app today is powered by a highly distributed software architecture,” says Uber’s Justin Dustzadeh.

    The only way for Uber to deliver the required level of network performance and availability is through software and automation, said Justin Dustzadeh, the head of global network and software platform at Uber. The ride-sharing company relies heavily on software to automate its infrastructure and thoroughly tests not only its software but also the test environment itself, Dustzadeh said, speaking at the recent Open Networking Summit.

    “Our approach … is to create a test environment that can not only provide the capabilities needed to do the traditional software test cycles (such as feature testing, regression testing, integration testing) but also enables us to deploy and use the tested software to provision, monitor and configure the test environment itself,” he told the audience.

    To give an idea of just how vast a network Uber has, the company, which started in 2010, logged over five billion trips in 2016, and about 15 million rides occur every day in over 600 cities and in 78 countries, he said.

    Software architecture

    “The magic of the Uber app today is powered by a highly distributed software architecture that relies on a fault-tolerant and highly available infrastructure,” Dustzadeh said. To fully achieve the benefits of software-based automation, he said they always strive to use open standards-based technologies and avoid dependency on a single vendor across the entire infrastructure stack.

    At Uber, a key enabler is building real-time or near-real-time visibility into the infrastructure state, then augmenting that information with additional insights from analytics and machine learning, Dustzadeh said. IT can then push the desired state of the infrastructure through programmatic interfaces.
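
The visibility-then-push pattern described here is often implemented as a reconcile step: compare the desired state with the observed state and push only the differences. The sketch below is not Uber's tooling; the function name `pending_changes` and the one-`key=value`-per-line file format are assumptions made purely for illustration.

```shell
#!/bin/sh
# Illustrative sketch only: the file format and function name are hypothetical.

# Print every desired setting that the observed state does not already match.
# Both arguments are files with one key=value pair per line.
pending_changes() {
    desired_file=$1
    observed_file=$2
    # -F fixed strings, -x whole-line match, -v invert: keep the desired lines
    # that have no identical line in the observed state.
    # grep exits non-zero when nothing differs, so force a success status.
    grep -Fxv -f "$observed_file" "$desired_file" || true
}
```

With `desired.txt` containing `mtu=9000` and `observed.txt` containing `mtu=1500`, `pending_changes desired.txt observed.txt` prints `mtu=9000`, the one setting a controller would then push through its programmatic interface.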

    As real-life examples, he said they use software to automate many areas, from delivering forecasting models to capacity planning, provisioning infrastructure, and managing all the changes that IT performs. Software is also used to automate incident detection, as well as mitigation and remediation when things fail.


    “For provisioning across our server and network environments we leverage a number of homegrown software platforms to automate and orchestrate the entire provisioning process,” in areas like auto discovery, Dustzadeh said. On the network side, for example, IT pushes intelligence to the devices to enable a distributed self-discovery model and enable zero-touch provisioning, he noted. This includes auto validation of the state of the hardware, for example, to prevent bad devices from going into production, he added.

    Uber’s IT group utilizes a distributed and highly available platform for auto-detection, he said. On the network side, they do both active and passive monitoring, leveraging streaming telemetry. This gives officials near real-time visibility into the state of the network, including network reachability, network latency, packet losses, and link utilization, he said.

    Auto-mitigation and auto-remediation are other areas where Uber heavily leverages software to improve its operational efficiencies, he said. “So when hardware fails, not only do we have to ensure that the issue is mitigated quickly before it becomes a service impacting incident, we also automate the back-end workflows to automatically generate troubleshooting and/or RMA tickets.”

    If necessary, he said, they can also do auto-diagnostic tests, auto-remediation tests and perform failure prediction functions, for example, by monitoring specific metrics or by running specific playbooks.


    Uber views its network as a key enabler of its business, Dustzadeh said. “Such network resiliency with the focus on deterministic failure behavior is one of our top design principles. Operational efficiency is also a key objective, meaning that the network has to be simple to build and also be flexible and cost effective.”

    On the backbone side and in the WAN space, Uber is moving away from static and long-term contract models toward a more flexible approach, preferably SDN-controlled, on-demand spectrum-as-a-service, he said. “We are also exploring ideas and future models where regional and long-haul bandwidth could be more on demand and usage based like cloud services where carriers would serve as spectrum brokers.”

    On the data center side, in addition to the software-defined capabilities Dustzadeh outlined, the company is also looking into server OEMs and a modular rack design to support multiple server types, for example, across compute, storage, and AI and machine learning with GPUs and FPGAs, he said. They are also looking at network disaggregation in the data center.

    “There is a great opportunity, especially in the data center space, to look into the disaggregated model to separate network hardware and network software,” he said. This could enable a much faster pace of innovation and faster development of new features, he noted.

    The post Software-Defined Infrastructure at Uber appeared first on The Linux Foundation.

Tmux Commands

screen and tmux

A comparison of the features (or, more accurately, a table of notes for accessing some of those features) of GNU screen and BSD-licensed tmux.

The formatting here is simple enough to understand (I would hope). ^ means ctrl+, so ^x is ctrl+x. M- means meta (generally left-alt or escape)+, so M-x is left-alt+x.

Note that this is nowhere near the full feature set of either program. This, being a cheat-sheet, just points out the most basic features to get you on the road.

Trust the developers and manpage writers more than me. This document is originally from 2009 when tmux was still new - since then both of these programs have had many updates and features added (not all of which have been dutifully noted here).

Action                                                            tmux                                        screen
start a new session                                               tmux OR tmux new OR tmux new-session
re-attach a detached session                                      tmux attach OR tmux attach-session
re-attach an attached session (detaching it from elsewhere)       tmux attach -d OR tmux attach-session -d    screen -dr
re-attach an attached session (keeping it attached elsewhere)     tmux attach OR tmux attach-session          screen -x
detach from currently attached session                            ^b d OR ^b :detach                          ^a ^d OR ^a :detach
rename-window to newname                                          ^b , <newname> OR ^b :rename-window <newn>  ^a A <newname>
list windows                                                      ^b w                                        ^a w
list windows in choosable menu                                                                                ^a "
go to window #                                                    ^b #                                        ^a #
go to last-active window                                          ^b l                                        ^a ^a
go to next window                                                 ^b n                                        ^a n
go to previous window                                             ^b p                                        ^a p
see keybindings                                                   ^b ?                                        ^a ?
list sessions                                                     ^b s OR tmux ls OR tmux list-sessions       screen -ls
toggle visual bell                                                                                            ^a ^g
create another window                                             ^b c                                        ^a c
exit current shell/window                                         ^d                                          ^d
split window/pane horizontally                                    ^b "                                        ^a S
split window/pane vertically                                      ^b %                                        ^a |
switch to other pane                                              ^b o                                        ^a <tab>
kill the current pane                                             ^b x OR (logout/^D)
collapse the current pane/split (but leave processes running)                                                 ^a X
cycle location of panes                                           ^b ^o
swap current pane with previous                                   ^b {
swap current pane with next                                       ^b }
show time                                                         ^b t
show numeric values of panes                                      ^b q
toggle zoom-state of current pane (maximize/return current pane)  ^b z
break the current pane out of its window (to form new window)     ^b !
re-arrange current panes within same window (different layouts)   ^b [space]
kill the current window (and all panes within)                    ^b & OR ^b :killw [-t target-window]
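
The session commands in the table can be combined in a small wrapper that re-attaches to a named session if it exists and creates it otherwise. This is only a minimal sketch: the function name `attach_or_create` and the default session name `work` are assumptions, while `has-session`, `attach-session -d`, and `new-session -s` are standard tmux subcommands.

```shell
#!/bin/sh
# Hypothetical helper; the function name and default session name are illustrative.

attach_or_create() {
    session=${1:-work}
    # has-session exits 0 when the target session exists.
    if tmux has-session -t "$session" 2>/dev/null; then
        # -d detaches any other client currently attached to the session.
        tmux attach-session -d -t "$session"
    else
        tmux new-session -s "$session"
    fi
}
```

Calling `attach_or_create build` attaches to `build` (detaching it from elsewhere) or starts it if it is not running yet.
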
  • Use the same script for updating/upgrading

    Make sure to change the versions to the latest releases:

    #!/bin/bash
    set -e
    bpcver=4.2.1
    bpcxsver=0.57
    rsyncbpcver=

    Scroll through the script, know what you are doing.

    Uncomment the upgrade section(s) and comment out the install section(s)

  • Again running smartctl after all is said and done:

    smartctl --all /dev/sda

    Although it is an old drive, I run the wheels off my drives and monitor them regularly, as anyone should.
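
Regular monitoring of this kind is easy to script. The sketch below pulls the raw reallocated-sector count out of `smartctl -A` output. It assumes the usual ATA attribute table layout, where the attribute name is the second column and the raw value the tenth, and the function name `reallocated_sectors` is made up for this example.

```shell
#!/bin/sh
# Illustrative sketch: assumes the standard ATA attribute table printed by `smartctl -A`.

# Read `smartctl -A` output on stdin and print the raw Reallocated_Sector_Ct value.
reallocated_sectors() {
    awk '$2 == "Reallocated_Sector_Ct" { print $10 }'
}
```

Typical use would be `smartctl -A /dev/sda | reallocated_sectors`; a count that keeps growing between runs is a common sign that a drive is wearing out.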
