
Build a network emulator using Libvirt and KVM


In this post, I demonstrate how to create a network emulation scenario using Libvirt, the Qemu/KVM hypervisor, and Linux bridges to create and manage interconnected virtual machines on a host system. As I do so, I will share what I have learned about network virtualization on a Linux system.

Libvirt provides a command-line interface that hides the low-level virtualization and networking details, so you can easily create and manage virtual networking scenarios. It is already used as a basis for some existing network emulators and other applications and tools, and it is available in almost every Linux distribution.

The network emulation scenario

As you work through the examples in this post, you will create a very simple network topology. It is intended to demonstrate how Libvirt and other virtualization tools can be used to build a network emulator, not to emulate a real-world network. Once you understand how it works, however, you may use Libvirt to create large, complex network topologies that do emulate real-world network scenarios.

The example I created for this post consists of three virtual machines serving as routers connected to each other in a ring topology. On each side of this emulated network, you will create virtual machines acting as a user and as a server, so you can test the emulated network’s operations. Each node in the virtual network is also connected to an “out of band” management network so you can configure and manage it. See the network diagram below.

A simple network emulation scenario

Conventions used in this post

I log into, and run commands on, the host system and multiple virtual machines as I work through the examples in this post. To make it clear which machine I am using at any time, I show the bash prompt in most of the command-line listings below.

My host system is named T420 so when I am logged into it, the prompt is:

brian@T420:~$ 

When I am logged into the virtual machines, each will show its unique hostname in the bash prompt. For example, if I am logged into router r01, the prompt is:

sim@r01:~$ 

Please carefully note which node you are supposed to be working with for each command or output shown in the examples, below.

Prepare the host system

The host system could be your personal computer, such as a laptop, or it could be a cloud instance that supports nested virtualization. In this example, the host system is running Ubuntu 18.04. To prepare your system to support network emulation using Libvirt, do the following:

  • Install virtualization and guest tools software
  • Add your userid to the correct groups
  • Fix the Linux kernel file permissions
  • Create a directory structure in which you will store your disk images and other files
    • And, add the directory to the libvirt group
  • Enable Libvirt’s NSS plugin so you can use SSH to connect to Libvirt VMs using their hostnames

Install virtualization software

Verify your computer, or cloud instance, can support accelerated virtualization. Enter the following commands in your computer’s terminal:

brian@T420:~$ grep -cw vmx /proc/cpuinfo

It should return a value equal to the number of CPU threads available on the computer or cloud instance. If it returns 0, something is wrong. In my case, I am using an old Lenovo ThinkPad T420 laptop with a dual-core Intel i5 processor that supports hyper-threading, so I see the value 4 when I run the above command: the processor has two physical CPU cores, each of which supports two “virtual CPU” threads.
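Note that the vmx flag applies to Intel processors; AMD processors report the svm flag instead. If you are not sure which processor type you have, a variant like the following (my own addition, not part of the original procedure) counts threads that report either flag:

brian@T420:~$ egrep -cw 'vmx|svm' /proc/cpuinfo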

Install the virtualization software:

brian@T420:~$ sudo apt-get update
brian@T420:~$ sudo apt install qemu-kvm libvirt-clients \
          libvirt-daemon-system virt-manager bridge-utils \
          libguestfs-tools libnss-libvirt

Add your userid to groups

In addition to the libvirt group, which is configured by the installer, you should also add your username to the other Libvirt groups and to the kvm group.

brian@T420:~$ sudo adduser `id -un` libvirt-qemu
brian@T420:~$ sudo adduser `id -un` kvm
brian@T420:~$ sudo adduser `id -un` libvirt-dnsmasq

Logout and log back in to activate the group ownership changes for your user, or restart your system:

brian@T420:~$ logout

After logging back in, check that the libvirtd systemd service is installed and running:

brian@T420:~$ systemctl status libvirtd

You should see that the libvirtd service is in the active (running) state.

Also verify your userid is part of the libvirt and libvirt-qemu groups using the groups command.

brian@T420:~$ groups
brian adm cdrom sudo dip plugdev lpadmin sambashare kvm libvirt libvirt-dnsmasq libvirt-qemu

Fix the Linux kernel file permissions

Many of the virtualization tools used in this example require read access to the host’s Linux kernel file but, in Ubuntu 18.04, the Canonical developers decided to make the Linux kernel readable only by the root user or by users running sudo. Canonical says they did this to improve security but others strongly disagree with them. The libguestfs tools install guide refutes Canonical on this point1 so I set the Linux kernel file to be readable by all users on my host system, instead of running virtualization tools with root privileges. I suggest you do the same.

brian@T420:~$ sudo chmod 0644 /boot/vmlinuz*

Libvirt configuration files

There are two configuration files that affect how Libvirt functions with KVM. Here, we are mainly concerned that the socket permissions are set correctly. Usually, all the default settings are OK.

Check the file /etc/libvirt/libvirtd.conf and verify that the libvirt group is granted read and write permissions on the Libvirt socket:

brian@T420:~$ nano /etc/libvirt/libvirtd.conf

Scroll down through the file and look for the two lines below. Ensure that the group is set to libvirt and the permissions are set to 0770 or 0777.

unix_sock_group = "libvirt"
unix_sock_rw_perms = "0770"

Check the file /etc/libvirt/qemu.conf and verify it is set to all default values. That is, everything should be commented out.

Create a directory for emulation scenarios

You need a directory in which to store your disk images, XML files, and scripts. I chose to name mine simulator and placed it at the root of the filesystem, outside my $HOME directory, because the default permissions on $HOME are typically too restrictive to allow other users and groups to access files in it.

brian@T420:~$ sudo mkdir /simulator

Configure the directory so that it is owned by your userid and the libvirt group. Set the permissions so that the group may write to the directory and other users may enter it:

brian@T420:~$ sudo chown -R brian:libvirt /simulator
brian@T420:~$ chmod g+w /simulator
brian@T420:~$ chmod o+x /simulator

I used Linux Access Control Lists so that disk images created by Libvirt, which are owned by the root user and group, are also accessible to the libvirt group and tools like virt-sysprep can write to them. Linux ACLs let you add a second layer of permissions so you do not have to keep changing the group ownership of Libvirt disk files to the libvirt group.

brian@T420:~$ sudo setfacl -m g:libvirt:rw /simulator
brian@T420:~$ sudo setfacl -dm g:libvirt:rw /simulator

For your information, you can check the ACL setting on a file or directory with the getfacl command. For example:

brian@T420:~$ getfacl /simulator

Create a sub-directory for your network emulation scenario. I called mine sim01. Set the same owners and permissions as the parent directory:

brian@T420:~$ cd /simulator
brian@T420:~$ mkdir sim01
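The default ACL you set on /simulator is inherited by the new sub-directory, but if you want to set the same ownership, group write permission, and ACL entries on it explicitly, you can repeat the earlier commands against the sub-directory:

brian@T420:~$ sudo chown -R brian:libvirt /simulator/sim01
brian@T420:~$ chmod g+w /simulator/sim01
brian@T420:~$ chmod o+x /simulator/sim01
brian@T420:~$ sudo setfacl -m g:libvirt:rw /simulator/sim01
brian@T420:~$ sudo setfacl -dm g:libvirt:rw /simulator/sim01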

Enable Libvirt’s NSS plugin

When you start a virtual machine (VM) for the first time, the Libvirt default network will assign it an IP network configuration using DHCP. However, you cannot address the virtual machine by its hostname because the DHCP server on a Libvirt NAT network does not share any information with the host system. There are multiple ways to solve this issue2 and I suggest you use the Libvirt NSS module, which plugs the Libvirt network information into the information sources consulted by your system’s Name Service Switch.

You already installed the libnss-libvirt package, so you just need to edit the host system’s NSS configuration file to make it consult the Libvirt NSS modules, libvirt and libvirt_guest, to get the IP address of any running virtual machine attached to a Libvirt-managed NAT network.

Edit the file /etc/nsswitch.conf:

brian@T420:~$ sudo nano /etc/nsswitch.conf

In the file’s hosts: line, add the libvirt and libvirt_guest modules, in the order in which you want the system to consult its available name services. The file should look like the listing below when you are done:

passwd:         compat systemd
group:          compat systemd
shadow:         compat
gshadow:        files

hosts:          files libvirt libvirt_guest mdns4_minimal [NOTFOUND=return] dns
networks:       files

protocols:      db files
services:       db files
ethers:         db files
rpc:            db files

netgroup:       nis

When your system looks up a hostname, the above configuration causes it to consult the available name services in the following order: the /etc/hosts file, the static IP addresses configured in Libvirt networks’ XML files, the names of Libvirt-managed VMs that received addresses from the Libvirt DHCP server, multicast DNS, and finally regular DNS. It uses the first match it finds.

Save the file.
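Later, once you have a virtual machine running on the Libvirt default network, you can confirm that the NSS plugin is working by resolving the VM’s hostname with getent, which consults the same hosts line you just edited. For example, assuming a running VM named server:

brian@T420:~$ getent hosts server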

Now, you can use SSH to login to a virtual machine without knowing its IP address. For example, to login to the sim account on a virtual machine named server, run the following command:

brian@T420:~$ ssh sim@server
sim@server:~$ 

Plan your network emulation topology

Create a network plan that you can use to guide your configurations and to help you troubleshoot problems. Draw a detailed diagram of the virtual network you want to create. This will help you visualize how all the bridges, nodes, and ports connect to each other.

In the detailed network diagram below, you should see that there are five VM-to-VM connections. You will implement each connection using a Linux bridge3 connected to virtual interfaces on the two virtual machines at either end of the connection.

libvirt virtual network emulation

From the diagram, you see which ports on each virtual machine are connected to which bridge. Convert the information in the drawing into a table that will help you map ports to bridges when you are building your virtual network. On each line in the table, add the MAC addresses and IP addresses you chose for each port.

For this example, I created the following network planning table.

VM name   VM port   MAC address           IP address       Bridge name
user      1         assigned by Libvirt   DHCP             virbr0
user      2         02:00:aa:0a:01:02     10.10.100.1/24   br_user_r1
r01       1         assigned by Libvirt   DHCP             virbr0
r01       2         02:00:aa:01:0a:02     10.10.100.2/24   br_user_r1
r01       3         02:00:aa:01:02:03     10.10.12.1/24    br_r1_r2
r01       4         02:00:aa:01:03:04     10.10.13.1/24    br_r1_r3
r02       1         assigned by Libvirt   DHCP             virbr0
r02       2         02:00:aa:02:03:02     10.10.23.1/24    br_r2_r3
r02       3         02:00:aa:02:01:03     10.10.12.2/24    br_r1_r2
r03       1         assigned by Libvirt   DHCP             virbr0
r03       2         02:00:aa:03:01:02     10.10.13.2/24    br_r1_r3
r03       3         02:00:aa:03:02:03     10.10.23.2/24    br_r2_r3
r03       4         02:00:aa:03:0b:04     10.10.200.2/24   br_r3_serv
server    1         assigned by Libvirt   DHCP             virbr0
server    2         02:00:aa:0b:03:02     10.10.200.1/24   br_r3_serv

I chose arbitrary MAC addresses, using a convention that helps me remember which MAC address I assigned to which node and port. I also chose arbitrary IP addresses from private IP address space, again following a convention that helps me quickly determine which node and port are associated with each address.

As you can imagine, this can get very complex when you add more nodes and links. I suggest you plan carefully, build your virtual network a little bit at a time, and test connectivity on new links as you build them.

Create the base VMs for the PC nodes and router nodes

Create a base, or template, virtual machine for each type of node you will deploy in your network emulation scenario. This enables you to quickly create new network nodes by cloning new virtual machines from one of your base VMs. Your new virtual machines will come with all the default software and configurations you staged on the base VMs.

In this network emulation scenario, you have two node types: a PC and a router. To create base VMs for each of them, install Ubuntu Server 18.04 on a new VM and configure it as a base PC VM. Clone the base PC VM to create a second VM, which you will configure and use as the base router VM.

Create the PC base virtual machine

Get a Linux disk image from the appropriate distribution and use it to build the base PC VM. I used the Ubuntu Server distribution to build the network nodes. Find an Ubuntu mirror closest to you.

Enable serial console access by inserting extra arguments into the virtual machine’s boot process so you can use the command-line-interface installer in your terminal. You must use the virt-install command’s location option instead of the cdrom option because the cdrom option does not allow you to pass extra arguments to the VM at boot time.4 The location option does not support an ISO file as the install media; it requires access to a repository directory, so find a mirror that offers the Ubuntu repository directory.

brian@T420:~$ virt-install \
  --name pc-base \
  --virt-type=kvm --hvm --ram 1024 \
  --disk path=/simulator/sim01/pc-base.qcow2,size=4 \
  --vcpus 1 --os-type linux --os-variant ubuntu18.04 \
  --network bridge=virbr0 \
  --graphics none \
  --location 'http://mirror.math.princeton.edu/pub/ubuntu/dists/bionic/main/installer-amd64/' \
  --extra-args='console=ttyS0'

I configured the system with userid sim, and password sim. When asked to select software, I chose OpenSSH Server and Basic Ubuntu Server options.

Fix the serial interface

When the installation process ends, the new virtual machine will reset. Your local terminal will show a blank screen because the extra arguments you passed during installation were not permanently saved in the VM’s boot configuration, so you cannot access the new VM’s serial interface.

To fix this, stop the virtual machine and use the guestmount utility to configure a serial interface on its disk.

Press Ctrl-] to get back to your host system’s terminal.

Shutdown the guest VM, as follows:

brian@T420:~$ virsh shutdown pc-base

Mount the guest’s disk and enable a serial console (by manually enabling a getty service on ttyS0) using the following commands:

brian@T420:~$ sudo mkdir /mnt/pc
brian@T420:~$ sudo guestmount --domain pc-base \
    --inspector /mnt/pc
brian@T420:~$ sudo ln -s \
    /mnt/pc/lib/systemd/system/getty@.service \
    /mnt/pc/etc/systemd/system/getty.target.wants/getty@ttyS0.service
brian@T420:~$ sudo umount /mnt/pc

Start the virtual machine again:

brian@T420:~$ virsh start pc-base 

After the virtual machine starts, test that the console works. Try to access the pc-base VM from your host system using the virsh console command:

brian@T420:~$ virsh console pc-base

You should see a login prompt. Login to the virtual machine and configure it.

Configure the pc-base VM

The first virtual machine will be used as the template from which you will clone other “host” VMs in your network emulation scenario. Stage it with the basic configurations that you want any other machine cloned from it to have. For example, configure some basic network tools:

sim@pc-base:~$ sudo apt update
sim@pc-base:~$ sudo apt install -y traceroute tcpdump nmap

Stop the pc-base VM

So that you can clone it, shut down the virtual machine you created and configured. Exit its console with the CTRL-] key combination and stop it:

sim@pc-base:~$ 
CTRL-]
brian@T420:~$ virsh shutdown pc-base

If you want to see how Libvirt defines the virtual machine in an XML file, run the dumpxml command. The dumpxml command may also be used in scripts when you want to check some of the VM’s attributes, which I will demonstrate later.

brian@T420:~$ virsh dumpxml pc-base
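For example, one way to use dumpxml in a script (a sketch of the technique, not a step in this post’s workflow) is to extract the MAC addresses assigned to a VM’s interfaces by filtering the XML output:

brian@T420:~$ virsh dumpxml pc-base | grep 'mac address'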

If you need to tweak the virtual machine’s Libvirt settings in the future, you can edit its Libvirt XML file with the command:

brian@T420:~$ virsh edit pc-base

Fix permissions on VM disk image files

Use libguestfs-tools to manipulate the VM’s disk images. First, though, fix the file permissions of the disk images you created. Libvirt creates the disk images so that they can only be accessed by the root user and group. Fix them so that users in the libvirt group may also access them.

Remember, you previously added your userid to the libvirt group and configured Linux ACLs so files created in the /simulator/ directory and its subdirectories are also accessible to the libvirt group. However, Libvirt creates disk images with very restrictive permission bits, so you still need to relax the group permissions before members of the libvirt group can access the disk files.

brian@T420:~$ sudo chmod g+rw /simulator/sim01/*

The libguestfs developers recommend that you do not use root privileges when you run libguestfs tools, like virt-sysprep, on your VM disk images. That means you will need to run the chmod command, above, every time you clone a new VM5.

Minimizing disk storage size

I am keeping things simple in this example, so I created an independent disk image for each virtual machine. This uses a lot of disk space. If you are creating many virtual machines, you may wish to investigate using copy-on-write images with a QCOW2 image as the backing file, or master disk image. This will greatly reduce the storage space consumed by disk images.
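For reference, a copy-on-write clone is created with qemu-img by pointing a new, thin image at a read-only backing file. The command below is only a sketch with hypothetical file names, not part of this post’s workflow; newer versions of qemu-img also expect the backing format to be stated with -F:

brian@T420:~$ qemu-img create -f qcow2 -F qcow2 \
    -b /simulator/sim01/pc-base.qcow2 \
    /simulator/sim01/r01-cow.qcow2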

In this case, one way to improve disk usage is to sparsify the VM disk image. The virt-sparsify tool converts free space inside the VM’s disk image back to free space on the host’s filesystem.

brian@T420:~$ cd /simulator/sim01
brian@T420:~$ virt-sparsify pc-base.qcow2 pc-base-sparse.qcow2
brian@T420:~$ mv pc-base.qcow2 pc-base-fixed.qcow2
brian@T420:~$ mv pc-base-sparse.qcow2 pc-base.qcow2
brian@T420:~$ sudo chmod g+rw /simulator/sim01/*

You will see that the virtual machine’s disk image, which used to (apparently) take up 4.1 GB on the host system’s filesystem, now consumes only 1.9 GB.6
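If you want to compare the image’s apparent size with the space it actually consumes on the host filesystem, compare the output of ls and du:

brian@T420:~$ ls -lh /simulator/sim01/pc-base.qcow2
brian@T420:~$ du -sh /simulator/sim01/pc-base.qcow2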

Clone the pc-base VM to create the router-base VM

Three of the nodes in this network emulation scenario are routers so you need to create a base router VM from which you can clone other router VMs. Start by cloning the pc-base VM to create the router-base VM. Log in to the router-base VM and configure it to operate as a router.

brian@T420:~$ virt-clone --original pc-base \
    --name router-base \
    --file /simulator/sim01/router-base.qcow2
brian@T420:~$ sudo chmod g+rw /simulator/sim01/*

Individualize the cloned router-base VM

When you clone a VM, you copy all configurations from the source VM to the new cloned VM. This creates problems because every node on a network is expected to have a unique hostname and machine-id.7

One way to individualize each cloned VM is to start each one, log into it, change the required settings, and shut it down. However, a better way is to directly manipulate the cloned VM’s disk image using libguestfs tools like virt-sysprep.

Use the virt-sysprep command to configure the router-base VM with a unique hostname and machine ID, and to clear the DHCP client state:

brian@T420:~$ virt-sysprep --domain router-base \
    --enable customize,dhcp-client-state,machine-id \
    --hostname 'router-base'

Configure the router-base VM

Start the router-base VM and connect to it via SSH or via its console. Remember that the VMs’ userid and password are both set to “sim”:

brian@T420:~$ virsh start router-base
brian@T420:~$ ssh sim@router-base
sim@router-base:~$ 

Install the Free Range Routing protocol suite. This involves many steps, as shown below. The following instructions show how to install the FRR packages on Ubuntu 18.04.

Get the latest version of FRR from the FRR Releases page at: https://github.com/FRRouting/frr/releases. Be sure to check for the latest release; the version used in the examples below may not be the latest one.

sim@router-base:~$ mkdir ~/software
sim@router-base:~$ cd ~/software
sim@router-base:~$ wget https://github.com/FRRouting/frr/releases/download/frr-6.0.2/frr_6.0.2-0.ubuntu18.04.1_amd64.deb
sim@router-base:~$ wget https://github.com/FRRouting/frr/releases/download/frr-6.0.2/frr-pythontools_6.0.2-0.ubuntu18.04.1_all.deb
sim@router-base:~$ wget https://github.com/FRRouting/frr/releases/download/frr-6.0.2/frr-doc_6.0.2-0.ubuntu18.04.1_all.deb

Install the FRR packages from the downloaded packages:

sim@router-base:~$ sudo apt -y install ./frr_6.0.2-0.ubuntu18.04.1_amd64.deb
sim@router-base:~$ sudo apt -y install ./frr-doc_6.0.2-0.ubuntu18.04.1_all.deb
sim@router-base:~$ sudo apt -y install ./frr-pythontools_6.0.2-0.ubuntu18.04.1_all.deb

Get the latest Libyang packages at: https://ci1.netdef.org/browse/LIBYANG-YANGRELEASE/latestSuccessful/artifact, and install them:

sim@router-base:~$ wget https://ci1.netdef.org/artifact/LIBYANG-YANGRELEASE/shared/build-1/Ubuntu-18.04-x86_64-Packages/libyang-dev_0.16.46_amd64.deb
sim@router-base:~$ wget https://ci1.netdef.org/artifact/LIBYANG-YANGRELEASE/shared/build-1/Ubuntu-18.04-x86_64-Packages/libyang_0.16.46_amd64.deb
sim@router-base:~$ sudo apt -y install ./libyang_0.16.46_amd64.deb
sim@router-base:~$ sudo apt -y install ./libyang-dev_0.16.46_amd64.deb

To enable IPv4 and IPv6 forwarding, edit the /etc/sysctl.conf file:

sim@router-base:~$ sudo nano /etc/sysctl.conf

Uncomment the following lines (ignore the other settings):

net.ipv4.ip_forward=1
net.ipv6.conf.all.forwarding=1

Save the file.

To enable MPLS on the router, edit the /etc/modules-load.d/modules.conf file:

sim@router-base:~$ sudo nano /etc/modules-load.d/modules.conf

Add the following lines to /etc/modules-load.d/modules.conf:

# Load MPLS Kernel Modules
mpls_router
mpls_iptunnel

Save the file. Then run sysctl -p to apply the forwarding settings from /etc/sysctl.conf to the running system.

sim@router-base:~$ sudo sysctl -p
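The sysctl settings take effect immediately, but the MPLS modules listed in /etc/modules-load.d/modules.conf are only loaded at boot. If you want to load them into the running kernel right away instead of rebooting, you can do so with modprobe and confirm with lsmod:

sim@router-base:~$ sudo modprobe mpls_router
sim@router-base:~$ sudo modprobe mpls_iptunnel
sim@router-base:~$ lsmod | grep mpls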

To enable the protocol daemons, edit the /etc/frr/daemons file:

sim@router-base:~$ sudo nano /etc/frr/daemons

Change each daemon’s value from “no” to “yes” if you want it to start when the VM starts. For example, I suggest you start OSPF on every router to create a simple, single-area IGP domain:

bgpd=no
ospfd=yes
ospf6d=no
ripd=no
ripngd=no
isisd=no
pimd=no
ldpd=no
nhrpd=no
eigrpd=no
babeld=no
sharpd=no
pbrd=no
bfdd=no

Save the file.

You can’t enable MPLS forwarding on specific interfaces yet because, at this point, no data-plane interfaces exist. You will be able to enable MPLS forwarding on each router instance after you add interfaces to it, that is, after you plug it into the network emulation scenario.
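For reference, once a router instance does have its data-plane interfaces, MPLS packet handling on Linux is enabled per interface, and a label table size is set, with sysctls like the ones below. The interface name ens6 is just an example; this is not a step in this post’s workflow, which uses plain OSPF:

sim@r01:~$ sudo sysctl -w net.mpls.conf.ens6.input=1
sim@r01:~$ sudo sysctl -w net.mpls.platform_labels=100000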

Stop the router-base VM

Like the pc-base VM earlier, you need to shut down the router-base VM so you can clone router instances from it. Exit the VM’s console with the CTRL-] key combination and stop the VM.

sim@router-base:~$ logout
brian@T420:~$ virsh shutdown router-base

Create the VMs for the network emulation scenario

In the previous section, you created the pc-base and router-base VMs, which will serve as “golden master” images. Clone these virtual machines to create the virtual machines that run in your network emulation scenarios.

Clone the pc-base VM to create the user and server VMs

In this example, you are creating a network of five nodes, connected together on the same virtual network. Two of those nodes emulate a user and a server on the network. Create those VMs from clones of the pc-base VM:

brian@T420:~$ virt-clone --original pc-base \
    --name user \
    --file /simulator/sim01/user.qcow2
brian@T420:~$ sudo chmod g+rw /simulator/sim01/*
brian@T420:~$ virt-sysprep --domain user \
    --enable customize,dhcp-client-state,machine-id \
    --hostname 'user'
brian@T420:~$ virt-clone --original pc-base \
    --name server \
    --file /simulator/sim01/server.qcow2
brian@T420:~$ sudo chmod g+rw /simulator/sim01/*
brian@T420:~$ virt-sysprep --domain server \
    --enable customize,dhcp-client-state,machine-id \
    --hostname 'server'

Clone the router-base VM to create router instances

In this example, you are creating a network of three routers, connected together. Create the router VMs by cloning the router-base VM, and individualizing each instance:

brian@T420:~$ virt-clone --original router-base \
    --name r01 \
    --file /simulator/sim01/r01.qcow2
brian@T420:~$ sudo chmod g+rw /simulator/sim01/*
brian@T420:~$ virt-sysprep --domain r01 \
    --enable customize,dhcp-client-state,machine-id \
    --hostname 'r01'
brian@T420:~$ virt-clone --original router-base \
    --name r02 \
    --file /simulator/sim01/r02.qcow2
brian@T420:~$ sudo chmod g+rw /simulator/sim01/*
brian@T420:~$ virt-sysprep --domain r02 \
    --enable customize,dhcp-client-state,machine-id \
    --hostname 'r02'
brian@T420:~$ virt-clone --original router-base \
    --name r03 \
    --file /simulator/sim01/r03.qcow2
brian@T420:~$ sudo chmod g+rw /simulator/sim01/*
brian@T420:~$ virt-sysprep --domain r03 \
    --enable customize,dhcp-client-state,machine-id \
    --hostname 'r03'

Check Libvirt definitions

Verify the virtual machines are defined in Libvirt:

brian@T420:~$ virsh list --all

You should see the following output:

 Id    Name                           State
----------------------------------------------------
 -     pc-base                        shut off
 -     r01                            shut off
 -     r02                            shut off
 -     r03                            shut off
 -     router-base                    shut off
 -     server                         shut off
 -     user                           shut off

Start the VMs in the network emulation scenario

Start the VMs you created for the planned network emulation scenario so you can build network links one-by-one on live VMs and test them as you go. Run the following commands:

brian@T420:~$ virsh start r01
brian@T420:~$ virsh start r02
brian@T420:~$ virsh start r03
brian@T420:~$ virsh start user
brian@T420:~$ virsh start server

Verify that all the VMs are running:

brian@T420:~$ virsh list
 Id    Name                           State
----------------------------------------------------
 3     r01                            running
 4     r02                            running
 5     r03                            running
 6     user                           running
 7     server                         running

Create the first Libvirt network

So you can understand how Libvirt creates and manages networks and connections to virtual machines, I will walk through a detailed example in which you will create your first network that connects the user VM to the r01 VM.

Refer to the network diagram above. Note the network bridge and the VM interfaces that must be added to create the network link between the user VM and the r01 VM.

From the original network planning table, I pulled out the specific connections and configurations you need to create for the first Libvirt network, and listed them below:

VM name   VM port   MAC address           IP address       Bridge name
user      2         02:00:aa:0a:01:02     10.10.100.1/24   br_user_r1
r01       2         02:00:aa:01:0a:02     10.10.100.2/24   br_user_r1

To create a Libvirt-managed network, first create an XML file that defines the network attributes so you can import that file into Libvirt. Create and edit the file /tmp/r1user.xml:

brian@T420:~$ nano /tmp/r1user.xml

Follow the guidelines in the Libvirt XML format documentation. Configure only the minimum information needed to define the bridge and the network. Libvirt will fill in all other details with default values.

Enter the following XML code into the file:

<network>
  <name>net_user_r1</name>
  <bridge name="br_user_r1" stp='off' macTableManager="libvirt"/> 
  <mtu size="9216"/>
</network>

I gave the network the name net_user_r1 and the bridge it creates the name br_user_r1, which helps you understand their place in the planned network topology. I wanted to create a point-to-point “virtual wire” and did not want the network bridge to interact with the devices connected to it, so I turned off the Spanning Tree Protocol, set Libvirt to manage the MAC forwarding table, and set the MTU to the jumbo-frame size.8


Save the file. Import the network XML file into Libvirt with the following command. It will cause Libvirt to define the network described in the XML file, which in this case is the network named net_user_r1.

brian@T420:~$ virsh net-define /tmp/r1user.xml

Now, the network net_user_r1 is managed by Libvirt. Verify this with the following command:

brian@T420:~$ virsh net-list --all
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 default              active     yes           yes
 net_user_r1          inactive   no            yes

The network is defined but not yet started, so the bridge does not exist yet. Start the network to create the bridge:

brian@T420:~$ virsh net-start net_user_r1

Now, the network is started and you can verify the bridge exists with the brctl command:

brian@T420:~$ brctl show br_user_r1
bridge name  bridge id          STP enabled  interfaces
br_user_r1   8000.525400e1f59a  no           br_user_r1-nic
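If you want to confirm that the MTU and STP settings from the XML file were applied to the bridge, you can inspect it with the ip command; the detailed (-d) output includes the bridge’s MTU and stp_state:

brian@T420:~$ ip -d link show br_user_r1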

Connect the user and r01 VMs to the new network

Connect the VMs r01 and user to the new bridge br_user_r1 by using Libvirt to create new interfaces on each virtual machine and connect those interfaces to the bridge. Libvirt makes this easy.

Get information about the existing interfaces on r01 and user.

brian@T420:~$ virsh domiflist user
Interface  Type     Source    Model    MAC
-------------------------------------------------
vnet3      bridge   virbr0    virtio   52:54:00:9b:4e:16

brian@T420:~$ virsh domiflist r01
Interface  Type     Source    Model    MAC
-------------------------------------------------
vnet0      bridge   virbr0    virtio   52:54:00:be:5c:3a

You see that PC user‘s and router r01‘s management interfaces, vnet3 and vnet0 respectively, are attached to the management bridge virbr0.

Add a new interface on user and connect it to the new network net_user_r1, which, in practice, connects it to the bridge br_user_r1. Use the MAC address value listed in the network planning table, above.

brian@T420:~$ virsh attach-interface \
          --domain user \
          --type network \
          --source net_user_r1 \
          --model virtio \
          --mac 02:00:aa:0a:01:02 \
          --config --live

Do the same on the other side of the “wire”. Create a new interface on VM r01 and connect it to the same network net_user_r1.

brian@T420:~$ virsh attach-interface \
          --domain r01 \
          --type network \
          --source net_user_r1 \
          --model virtio \
          --mac 02:00:aa:01:0a:02 \
          --config --live

Check the bridges again. See that the new interfaces, named vnet5 and vnet6, have been added to the bridge br_user_r1:

brian@T420:~$ brctl show br_user_r1
bridge name  bridge id          STP enabled  interfaces
br_user_r1   8000.525400e1f59a  no           br_user_r1-nic
                                             vnet5
                                             vnet6

Also, check the VM interfaces. See the interface vnet5 on VM user and the interface vnet6 on VM r01:

brian@T420:~$ virsh domiflist user
Interface  Type     Source      Model    MAC
-------------------------------------------------------
vnet3      bridge   virbr0      virtio   52:54:00:9b:4e:16
vnet5      network  net_user_r1 virtio   02:00:aa:0a:01:02

brian@T420:~$ virsh domiflist r01
Interface  Type     Source      Model    MAC
-----------------------------------------------------
vnet0      bridge   virbr0      virtio   52:54:00:be:5c:3a
vnet6      network  net_user_r1 virtio   02:00:aa:01:0a:02

Test the connection

Test the new connection between VMs user and r01. Log into each of the virtual machines and look for the new interface. Configure the new interfaces with IP addresses and test that you can ping from one node to the other.

Log into VM user:

brian@T420:~$ virsh console user
sim@user:~$ sudo su
sim@user:~# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:9b:4e:16 brd ff:ff:ff:ff:ff:ff
3: ens6: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 02:00:aa:0a:01:02 brd ff:ff:ff:ff:ff:ff

A new interface, ens6, is the second interface (not including the loopback interface) on the VM. Interface ens6 is “port 2”, as shown in the network diagram and planning table. Configure this interface with the commands:

sim@user:~# ip addr add 10.10.100.1/24 dev ens6
sim@user:~# ip link set dev ens6 up
sim@user:~# 
Ctrl-]
brian@T420:~$

Configure the corresponding interface on r01:

brian@T420:~$ ssh sim@r01
sim@r01:~$ sudo su
sim@r01:~# ip addr add 10.10.100.2/24 dev ens6
sim@r01:~# ip link set dev ens6 up

Test the connection to the VM user with the commands:

sim@r01:~# ping -c 1 10.10.100.1
PING 10.10.100.1 (10.10.100.1) 56(84) bytes of data.
64 bytes from 10.10.100.1: icmp_seq=1 ttl=64 time=1.82 ms

--- 10.10.100.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.820/1.820/1.820/0.000 ms
sim@r01:~# exit
sim@r01:~$ exit
brian@T420:~$

You have demonstrated that the link you created between VM user and VM r01 is working.

Create the remaining networks

Connect the remaining VMs to each other by creating an XML file that defines each network according to your network plan. After you save each XML file, run the virsh net-define command to import it into Libvirt, then run the virsh net-start command to start the network.

You have already set up five virtual machines and one network named net_user_r1 between two of those nodes. The following commands will create the rest of the virtual networks, according to the plan:

brian@T420:~$
cat >> /tmp/r1r2.xml << EOF
<network>
  <name>net_r1_r2</name>
  <bridge name="br_r1_r2" stp='off' macTableManager="libvirt"/> 
  <mtu size="9216"/> 
</network>
EOF
cat >> /tmp/r2r3.xml << EOF
<network>
  <name>net_r2_r3</name>
  <bridge name="br_r2_r3" stp='off' macTableManager="libvirt"/> 
  <mtu size="9216"/> 
</network>
EOF
cat >> /tmp/r1r3.xml << EOF
<network>
  <name>net_r1_r3</name>
  <bridge name="br_r1_r3" stp='off' macTableManager="libvirt"/> 
  <mtu size="9216"/> 
</network>
EOF
cat >> /tmp/r3serv.xml << EOF
<network>
  <name>net_r3_serv</name>
  <bridge name="br_r3_serv" stp='off' macTableManager="libvirt"/> 
  <mtu size="9216"/> 
</network>
EOF
virsh net-define /tmp/r1r2.xml
virsh net-define /tmp/r2r3.xml
virsh net-define /tmp/r1r3.xml
virsh net-define /tmp/r3serv.xml
virsh net-start net_r1_r2
virsh net-start net_r2_r3
virsh net-start net_r1_r3
virsh net-start net_r3_serv

Connect the VMs to the networks

Use the virsh attach-interface command to create interfaces on the VMs and connect them to each network according to the plan. Use the MAC address information you listed in your plan. Test each connection after you create it, to ensure you are following the plan.

Be careful about the order in which you run the attach-interface commands. Interfaces are created on virtual machines in the order in which you attach them. For example, if you want “port 4” on r03 connected to bridge br_r3_serv, ensure that you make that connection the third time you use the attach-interface command on r03 (since “port 1” already exists).

Let’s walk through the process, interface-by-interface.

You previously connected user and r01 to bridge br_user_r1. So you know that port ens6 (which is “port 2” in the diagram) on user and port ens6 (also “port 2”) on r01 have already been assigned by Libvirt. So when you run the following command, you know it will attach “port 3” (known as ens7 on the router) on r01 to bridge br_r1_r2.

Connect “port 3” (ens7) on r01 to bridge net_r1_r2.

virsh attach-interface --domain r01 \
    --type network --source net_r1_r2 \
    --model virtio --mac 02:00:aa:01:02:03 \
    --config --live

Connect “port 4” (ens8) on r01 to bridge net_r1_r3.

virsh attach-interface --domain r01 \
--type network --source net_r1_r3 \
--model virtio --mac 02:00:aa:01:03:04 \
--config --live

Connect “port 2” (ens6) on r02 to bridge net_r2_r3.

virsh attach-interface --domain r02 \
--type network --source net_r2_r3 \
--model virtio --mac 02:00:aa:02:03:02 \
--config --live

Connect “port 3” (ens7) on r02 to bridge net_r1_r2.

virsh attach-interface --domain r02 \
--type network --source net_r1_r2 \
--model virtio --mac 02:00:aa:02:01:03 \
--config --live

Connect “port 2” (ens6) on r03 to bridge net_r1_r3.

virsh attach-interface --domain r03 \
--type network --source net_r1_r3 \
--model virtio --mac 02:00:aa:03:01:02 \
--config --live

Connect “port 3” (ens7) on r03 to bridge net_r2_r3.

virsh attach-interface --domain r03 \
--type network --source net_r2_r3 \
--model virtio --mac 02:00:aa:03:02:03 \
--config --live

Connect “port 4” (ens8) on r03 to bridge net_r3_serv.

virsh attach-interface --domain r03 \
--type network --source net_r3_serv \
--model virtio --mac 02:00:aa:03:0b:04 \
--config --live

Connect “port 2” (ens6) on server to bridge net_r3_serv.

virsh attach-interface --domain server \
--type network --source net_r3_serv \
--model virtio --mac 02:00:aa:0b:03:02 \
--config --live
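After running the commands above, it is worth double-checking that every interface landed on the intended network, and in the intended order, before you configure addresses. Compare the output of virsh domiflist against your planning table. For example:

brian@T420:~$ virsh domiflist r02
brian@T420:~$ virsh domiflist r03
brian@T420:~$ virsh domiflist server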

Configure each network node

At this point, a researcher would install product software and configure networking on each virtual machine in the virtual network and begin testing the network emulation scenario’s behaviour.

The following commands, run on each network node, will create a minimal network configuration that uses the OSPF routing protocol to distribute node reachability information and enables each node to ping any other node in the network.

Run the following commands:

VM user:

Log into VM user:

brian@T420:~$ ssh sim@user
sim@user:~$ sudo su
sim@user:~#

Copy and paste the following text into the user VM’s terminal:

bash <<EOF2
rm /etc/netplan/01-netcfg.yaml
cat >> /etc/netplan/01-netcfg.yaml << EOF
network:
  version: 2
  renderer: networkd
  ethernets:
    ens2:
      dhcp4: yes
    ens6:
      addresses:
        - 10.10.100.1/24
      #gateway4:    
      routes:
        - to: 10.10.0.0/16
          via: 10.10.100.2
          metric: 100
EOF
chmod 644 /etc/netplan/01-netcfg.yaml
netplan apply
EOF2

Exit the VM:

sim@user:~# exit
sim@user:~$ exit
brian@T420:~$

VM r01:

Log into VM r01.

brian@T420:~$ ssh sim@r01
sim@r01:~$ sudo su
sim@r01:~#

Copy and paste the following text into the r01 VM’s terminal:

bash <<EOF2
cat >> /etc/frr/frr.conf << EOF
frr version 6.0.2
frr defaults traditional
hostname r01
service integrated-vtysh-config
!
interface ens6
 ip address 10.10.100.2/24
!
interface ens7
 ip address 10.10.12.1/24
!
interface ens8
 ip address 10.10.13.1/24
!
router ospf
 ospf router-id 1.1.1.1
 redistribute connected
 passive-interface ens6
 network 10.10.12.0/24 area 0
 network 10.10.13.0/24 area 0
 network 10.10.100.0/24 area 0
!
line vty
!
EOF
systemctl reload frr
EOF2

Exit the VM:

sim@r01:~# exit
sim@r01:~$ exit
brian@T420:~$

VM r02:

Log into VM r02.

brian@T420:~$ ssh sim@r02
sim@r02:~$ sudo su
sim@r02:~# 

Copy and paste the following text into the r02 VM’s terminal:

bash <<EOF2
cat >> /etc/frr/frr.conf << EOF
frr version 6.0.2
frr defaults traditional
hostname r02
service integrated-vtysh-config
!
interface ens6
 ip address 10.10.23.1/24
!
interface ens7
 ip address 10.10.12.2/24
!
router ospf
 ospf router-id 2.2.2.2
 redistribute connected
 network 10.10.23.0/24 area 0
 network 10.10.12.0/24 area 0
!
line vty
!
EOF
systemctl reload frr
EOF2

Exit the VM:

sim@r02:~# exit
sim@r02:~$ exit
brian@T420:~$

VM r03:

Log into VM r03.

brian@T420:~$ ssh sim@r03
sim@r03:~$ sudo su 
sim@r03:~# 

Copy and paste the following text into the r03 VM’s terminal:

bash <<EOF2
cat >> /etc/frr/frr.conf << EOF
frr version 6.0.2
frr defaults traditional
hostname r03
service integrated-vtysh-config
!
interface ens6
 ip address 10.10.13.2/24
!
interface ens7
 ip address 10.10.23.2/24
!
interface ens8
 ip address 10.10.200.2/24
!
router ospf
 ospf router-id 3.3.3.3
 redistribute connected
 passive-interface ens8
 network 10.10.13.0/24 area 0
 network 10.10.23.0/24 area 0
 network 10.10.200.0/24 area 0
!
line vty
!
EOF
systemctl reload frr
EOF2

Exit the VM:

sim@r03:~# exit
sim@r03:~$ exit
brian@T420:~$

VM server

Log into VM server.

brian@T420:~$ ssh sim@server
sim@server:~$ sudo su
sim@server:~#

Copy and paste the following text into the server VM’s terminal:

bash <<EOF2
rm /etc/netplan/01-netcfg.yaml
cat >> /etc/netplan/01-netcfg.yaml << EOF
network:
  version: 2
  renderer: networkd
  ethernets:
    ens2:
      dhcp4: yes
    ens6:
      addresses:
        - 10.10.200.1/24
      #gateway4:    
      routes:
        - to: 10.10.0.0/16
          via: 10.10.200.2
          metric: 100
EOF
chmod 644 /etc/netplan/01-netcfg.yaml
netplan apply
EOF2

Exit the VM:

sim@server:~# exit
sim@server:~$ exit
brian@T420:~$

The above commands complete the minimum configuration. Because it is stored in each VM’s configuration files, the network emulation scenario will start in this state whenever the VMs and networks are started.
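Once OSPF has converged, which usually takes well under a minute, you can verify the emulated network end-to-end. For example, ping and trace the route from the user VM to the server VM’s emulated-network address, and check the routes FRR has learned on one of the routers:

brian@T420:~$ ssh sim@user
sim@user:~$ ping -c 1 10.10.200.1
sim@user:~$ traceroute 10.10.200.1
sim@user:~$ logout
brian@T420:~$ ssh sim@r01
sim@r01:~$ sudo vtysh -c "show ip ospf neighbor"
sim@r01:~$ sudo vtysh -c "show ip route ospf"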

Network Emulation scripts

You may simplify the setup and tear-down of your network emulation scenarios by creating a few simple scripts. Libvirt handles all the complexity of creating the networks and virtual machines, so you need to create just two simple scripts: startlab.sh and stoplab.sh.

startlab.sh

Create a new file named startlab.sh by pasting the following text into your terminal:

brian@T420:~$
cat >> startlab.sh << EOF
#!/usr/bin/env bash
set -e
set -x
virsh net-start net_r1_r2
virsh net-start net_r1_r3
virsh net-start net_r2_r3
virsh net-start net_r3_serv
virsh net-start net_user_r1
virsh start r01
virsh start r02
virsh start r03
virsh start server
virsh start user
EOF

stoplab.sh

Create a new file named stoplab.sh by pasting the following text into your terminal:

brian@T420:~$
cat >> stoplab.sh << EOF
#!/usr/bin/env bash
set -e
set -x
virsh destroy r01
virsh destroy r02
virsh destroy r03
virsh destroy server
virsh destroy user
virsh net-destroy net_r1_r2
virsh net-destroy net_r1_r3
virsh net-destroy net_r2_r3
virsh net-destroy net_r3_serv
virsh net-destroy net_user_r1
EOF

Set the files so they are executable:

brian@T420:~$ chmod +x *.sh

In the future, when you want to start a lab, navigate to the lab’s directory and run the startlab.sh script. Stop the lab by running the stoplab.sh script.

Conclusion

I showed you that you can use Libvirt and libguestfs tools to create a simple network emulation scenario consisting of virtual machines and virtual networks.

I performed many of the operations that created the network nodes and the virtual networks by running commands on the host system, but I configured each node by logging into it and making local changes. According to the libguestfs documentation, it should be possible to configure every guest VM in the network emulation scenario using libguestfs tools running on the host system. This would eliminate the need to log in to each VM to configure it, and would enable scripts to build and configure a complex network scenario.

I think that manually building complex network emulation scenarios with Libvirt’s command-line interface is difficult and the risk of making a configuration error at some point is high. However, one could write a program that uses the Libvirt API to create complex network emulation scenarios based on data read in from some sort of topology file, like a dot file.
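As a rough illustration of that idea, here is a minimal sketch in bash that drives the virsh command-line client rather than the Libvirt API. It reads one link per line from a hypothetical links.txt file containing two VM names, creates a Libvirt network for each link using the same XML template used earlier in this post, and attaches both VMs to it. The file name, its format, and the naming conventions are my own assumptions; the script also assumes the VMs already exist and are running, and lets Libvirt generate the MAC addresses instead of using a planned addressing scheme:

#!/usr/bin/env bash
# Sketch only: create one Libvirt network per link listed in links.txt
# (hypothetical format: "<vm-a> <vm-b>" per line) and attach both VMs to it.
# Note: Linux limits interface/bridge names to 15 characters.
set -e
while read -r a b; do
  net="net_${a}_${b}"
  cat > "/tmp/${net}.xml" << EOF
<network>
  <name>${net}</name>
  <bridge name="br_${a}_${b}" stp='off' macTableManager="libvirt"/>
  <mtu size="9216"/>
</network>
EOF
  virsh net-define "/tmp/${net}.xml"
  virsh net-start "${net}"
  # Omitting --mac lets Libvirt generate a MAC address for each interface.
  virsh attach-interface --domain "$a" --type network \
      --source "${net}" --model virtio --config --live
  virsh attach-interface --domain "$b" --type network \
      --source "${net}" --model virtio --config --live
done < links.txt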

Appendix A: Why not use Libvirt storage pools?

Libvirt users may store virtual machine disk images in a storage pool managed by Libvirt. Users can define a directory as the storage pool and store disk images in that directory, or they can use a wide variety of different file system technologies or remote repositories.

Storage pools may be useful in more complex Libvirt use-cases, and they are a building block for applications built on top of Libvirt. However, a network emulation scenario is relatively simple (as a virtualization use-case), and some of the other virtualization tools I use, such as virt-clone and other libguestfs tools, do not support Libvirt storage pools. For this reason, I did not define a Libvirt storage pool and instead used the --file or --disk options to point to disk images in a directory.

Appendix B: Other distributions for guests

I used Ubuntu Server to create the guest VMs in this network emulation scenario. You may wish to use different Linux distributions to create the VMs. Be aware, however, that some Linux distributions — especially network appliances, and distributions tuned to be as small as possible or to offer a high degree of security — may not be compatible with some of the tools I used in this post, such as virt-sysprep.


  1. To see what the libguestfs development team really thinks, search for “completely stupid” in the libguestfs frequently-asked-questions web page 

  2. See the following links for information about configuring the default Libvirt network to assign static IP addresses, keeping the /etc/hosts file in sync with virtual machines’ static IP addresses, and automating DHCP and Libvirt domains 

  3. Create a new bridge for each “wire” that will connect nodes together, not counting management connections, which all go to the same management bridge. 

  4. See the LOCATION section in the virt-install man pages for more information. 

  5. I am looking for a permanent fix for the virt-clone disk file ownership and permissions issue. If you have one, please post it in the comments, below. Thanks! 

  6. Note that, if you run the du -sh pc-base-fixed.qcow2 command, you will see that the original VM disk image really only consumed 2.8 GB on the host filesystem. Still, this is a 30% reduction in disk usage. 

  7. The machine-id identifies the VM as a unique node for DHCP network interface configuration. If you see strange DHCP behaviour on your management network, verify that all your VMs have a unique machine-id. 

  8. This is good enough for most scenarios. If you need fully transparent flooding across the bridge (hub emulation), you need to use a more complex setup with Open vSwitch or macvtap with Libvirt. If you want to continue using Linux bridges, you may enable LLDP frame forwarding by tweaking the system settings and enable LACP and STP frame forwarding by patching and re-compiling the Linux kernel. 


Vrnetlab: Emulate networks using KVM and Docker


Vrnetlab, or VR Network Lab, is an open-source network emulator that runs virtual routers using KVM and Docker. It supports developers and network engineers who use continuous-integration processes for testing network provisioning changes. Researchers and engineers may also use the vrnetlab command line interface to create and modify network emulation labs in an interactive way. In this post, I review vrnetlab’s main features and show how to use it to create a simple network emulation scenario using open-source routers.

Vrnetlab implementation

Vrnetlab users create Docker images for each type of router that will run in their network. They package the router’s disk image together with KVM software, Python scripts, and any other resources required by the router into the Docker image. Vrnetlab uses KVM to create and run VMs based on router software images, and uses Docker to manage the networking between the network nodes.

Virtual nodes

Vrnetlab users create Docker images that incorporate the router’s qemu disk image, along with software packages such as qemu-kvm, and the other resources needed by the router, such as a launch script and license files. The new Docker image represents a “virtual router” that comes with all the software and scripts needed to start the router and connect to the virtual network.

For example, a container created from an OpenWRT Docker image could logically be represented as shown below, if you do not modify the launch script that comes with it:

The router VM receives some “bootstrap” configurations from the launch script and is connected to the rest of the network via the Docker container’s open TCP ports. We’ll discuss in the next section, below, how the interfaces use TCP ports to implement network links.

The launch script is unique to each router type. For example, the launch script bundled with the OpenWRT Docker image will poll the router VM until it completes its startup, then it will log in to the router, change the password, and configure the LAN port. Users may need to modify the launch script if they have special requirements.

Vrnetlab simplifies network emulation for complex commercial routers, especially in the cases where a commercial router requires multiple VMs that implement different parts of the virtual router’s functionality, such as control or forwarding functions. Packaging these VMs into a single container, along with a launch script that defines how they are interconnected, hides their complexity from the rest of the virtual network. Each virtual router appears to the rest of the network as a single node, regardless of how many VMs are needed to implement it and regardless of how complex the networking requirements are between the router’s internal VMs.

You can see in the figure below how using Docker as a package format and as the interconnection layer greatly simplifies the user’s view of the network emulation scenario using a complex commercial virtual router. The developer will spend effort creating a Docker image and writing the launch script, but the user only needs to know which ports map to which interfaces on the virtual router.

The vrnetlab GitHub repository does not include any commercial router images so vrnetlab users must provide qemu disk images that they have obtained themselves.

Virtual network connections

Vrnetlab uses a cross-connect program named vr-xcon to define connections between node interfaces and to collect and transport data packets between those interfaces. All traffic passes through the standard docker0 management bridge, but the cross-connect program creates an overlay network of point-to-point TCP sessions on top of the management bridge. If the user stops the cross-connect script, the network connections between virtual nodes stop transporting packets.

The vr-xcon Python script runs in a Docker container, so you need to download the vr-xcon Docker image from the vrnetlab repository on Docker Hub, or build it locally. When you create a container from the image and run it, the Python script starts collecting and forwarding TCP packets between TCP ports on each node’s Docker container. Vrnetlab uses TCP port numbers to create the point-to-point connections between interfaces.

The vr-xcon script can take in a list of all point-to-point connections in the network and handle forwarding for all of them. If you set up your virtual network this way, then all connections will stop if you stop the script. You may also run many different instances of the script — each in its own Docker container — to create links one-by-one or in smaller groups. This way, you can “disconnect” and “reconnect” individual links by stopping and starting the container that runs the script for each link.

Helper scripts

Users run all vrnetlab operations using Docker commands. Some operations require the user to create complex commands combining Docker and Linux commands. Fortunately, the vrnetlab author created a set of shell functions that perform the most common vrnetlab operations. The functions are contained in a single shell script named vrnetlab.sh and are loaded into your Bash shell using the source command.

Open-source routers

Vrnetlab supports many commercial routers but currently supports only one open-source router, OpenWRT. OpenWRT supports a limited number of use-cases, mostly related to serving as a gateway between a LAN and a WAN, so the scenarios you can create with just OpenWRT are very limited. You need more node types that can represent core routers and end users running open-source software.

It is possible to extend vrnetlab and add more open-source router types. Maybe I’ll cover that in another post, in the future. For now, this post will cover using OpenWRT to create a two-node network.

Prepare the system

Vrnetlab is designed to run on an Ubuntu or Debian Linux system. I tested vrnetlab on a system running Ubuntu 18.04 and it worked well. (See the documentation about vrnetlab system requirements and how vrnetlab works on other operating systems that support Docker.)

Before you install vrnetlab on an Ubuntu 18.04 LTS system, you must install some prerequisite software packages, such as Docker, git, Beautiful Soup, and sshpass. You may install them using the commands shown below:

T420:~$ sudo apt update
T420:~$ sudo apt -y install python3-bs4 sshpass make
T420:~$ sudo apt -y install git
T420:~$ sudo apt install -y \
    apt-transport-https ca-certificates \
    curl gnupg-agent software-properties-common
T420:~$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
T420:~$ sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"
T420:~$ sudo apt update
T420:~$ sudo apt install -y docker-ce docker-ce-cli containerd.io

Install vrnetlab

To install vrnetlab, clone the vrnetlab repository from GitHub to your system. In this example, I cloned the repository to my home directory, as follows:

T420:~$ cd ~
T420:~$ git clone https://github.com/plajjan/vrnetlab.git

Go to the vrnetlab directory:

T420:~$ cd ~/vrnetlab

Now, you see the vrnetlab scripts and directories. Notice that there is a separate directory for each router type vrnetlab supports:

T420:~$ ls
CODE_OF_CONDUCT.md  config-engine-lite        openwrt           vr-bgp
CONTRIBUTING.md     csr                       routeros          vr-xcon
LICENSE             git-lfs-repo.sh           sros              vrnetlab.sh
Makefile            makefile-install.include  topology-machine  vrp
README.md           makefile-sanity.include   veos              vsr1000
ci-builder-image    makefile.include          vmx               xrv
common              nxos                      vqfx              xrv9k

Create a router image

Every router supported by vrnetlab has unique configuration and setup procedures. For the OpenWRT router, the vrnetlab author created a makefile that will download the latest version of OpenWRT to the vrnetlab/openwrt directory and then build an OpenWRT Docker image.

However, the download script fails. It seems that the directory structure on the OpenWRT downloads web site has changed since the script was written. The workaround is easy: use the wget command to download the latest version of OpenWRT to the ~/vrnetlab/openwrt directory, then run the sudo make build command, as follows:

T420:~$ cd ~/vrnetlab/openwrt
T420:~$ wget https://downloads.openwrt.org/releases/18.06.2/targets/x86/64/openwrt-18.06.2-x86-64-combined-ext4.img.gz
T420:~$ sudo make build

This provides a lot of output. At the end of the output, you see text similar to the following lines:

Successfully built 0c0eef5fb556
Successfully tagged vrnetlab/vr-openwrt:18.06.2
make[2]: Leaving directory '/home/brian/vrnetlab/openwrt'
make[1]: Leaving directory '/home/brian/vrnetlab/openwrt'

You can see that the Docker image is named vrnetlab/vr-openwrt:18.06.2. List all Docker images with the docker images command:

T420:~$ sudo docker images
REPOSITORY            TAG       IMAGE ID       CREATED          SIZE
vrnetlab/vr-openwrt   18.06.2   0c0eef5fb556   20 seconds ago   545MB
debian                stable    c04b519eaefa   8 days ago       101MB

Install the cross-connect program

Vrnetlab has two programs for building connections between virtual routers: vr-xcon and topo-machine:

  • vr-xcon is a cross-connect program that adds point-to-point links between nodes. It is suitable for adding links one-by-one, or for building small topologies. I recommend using vr-xcon if you want to be able to “disconnect” and “reconnect” individual links in the network. We will use the vrbridge shell function in this post, which uses vr-xcon to build links between nodes.
  • topo-machine creates virtual network nodes and links between nodes, where the nodes and links are described in a JSON file. I may write about topo-machine in the future but I do not discuss it in this post. Topo-machine is suitable for building complex topologies and would be especially useful to developers and testers who want to manage the network topology in a CI pipeline and/or source control repository.

vr-xcon point-to-point cross-connect program.

The cross-connect program, vr-xcon is packaged in a Docker image. To install it, pull the vr-xcon image from the vrnetlab Docker repository.

First, you need to login to Docker Hub as follows:

T420:~$ sudo docker login

Enter your Docker Hub userid and password. If you do not have one yet, go to https://hub.docker.com/ and sign up. It’s free.

Pull the vr-xcon image using the following commands:

T420:~$ cd ~/vrnetlab
T420:~$ sudo docker pull vrnetlab/vr-xcon

This will download and install the vr-xcon image in your local Docker system.

Tag images

Tag your Docker images to simplify using them in Docker container commands. You must tag the image vrnetlab/vr-xcon:latest as, simply, vr-xcon so the helper shell functions in vrnetlab.sh will work. They expect that an image named vr-xcon exists in your local repository.

T420:~$ sudo docker tag vrnetlab/vr-xcon:latest vr-xcon

You may also choose to tag the image vrnetlab/vr-openwrt:18.06.2 with a shorter name like openwrt:

T420:~$ sudo docker tag vrnetlab/vr-openwrt:18.06.2 openwrt

Check the docker images and verify that the shorter tags have been added to the Docker repository:

T420:~$ sudo docker images
REPOSITORY            TAG       IMAGE ID       CREATED          SIZE
openwrt               latest    0c0eef5fb556   18 minutes ago   545MB
vrnetlab/vr-openwrt   18.06.2   0c0eef5fb556   18 minutes ago   545MB
debian                stable    c04b519eaefa   8 days ago       101MB
vr-xcon               latest    0843f237b02a   2 months ago     153MB
vrnetlab/vr-xcon      latest    0843f237b02a   2 months ago     153MB

Install the vrnetlab.sh shell functions

The vrnetlab.sh script loads bash shell functions into the current shell that you can use to manage your virtual routers.

Change to the root user so you can source the shell script as root. You want to source it as the root user because all the Docker commands launched by the script must run as root.

T420:~$ sudo su
T420:~#

Then source the vrnetlab.sh script:

T420:~# cd /home/brian/vrnetlab
T420:~# source vrnetlab.sh

You need to stay as the root user to use the helper commands; you cannot go back to your normal user and use sudo. Also, you need to source the script again every time you switch back to the root user or log in again.
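If you find that tedious, you could add a line to the root user’s .bashrc so the functions load automatically whenever you become root. This is just an optional convenience (a sketch, assuming you cloned vrnetlab to /home/brian/vrnetlab as shown above and that the script does not depend on the current working directory):

T420:~# echo "source /home/brian/vrnetlab/vrnetlab.sh" >> /root/.bashrc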

Plan the network topology

Because vrnetlab does not support any open-source routers except OpenWRT, a vrnetlab network consisting only of open-source routers will necessarily be very small. We will connect two OpenWRT routers together via their WAN ports and then ping from one WAN interface to the other. The figure below shows the network topology.
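In text form, the topology looks like the sketch below, using the WAN addresses configured later in this post (each router also keeps its LAN/management connection to its own Docker container):

openwrt1                                              openwrt2
  eth1 (WAN) 10.10.10.1 ------------- 10.10.10.2 (WAN) eth1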

Start the openwrt containers

Start two new containers from the openwrt Docker image. You must use the --privileged option because we are starting a KVM VM in the container and KVM requires elevated privileges. Each container is a different router. Name the routers openwrt1 and openwrt2:

T420:~# docker run -d --privileged --name openwrt1 openwrt
T420:~# docker run -d --privileged --name openwrt2 openwrt

Get information about the running containers:

T420:~# docker container ls
CONTAINER ID  IMAGE    COMMAND       CREATED             STATUS                       PORTS                                                         NAMES
6695d10206a2  openwrt  "/launch.py"  About a minute ago  Up About a minute (healthy)  22/tcp, 80/tcp, 830/tcp, 5000/tcp, 10000-10099/tcp, 161/udp  openwrt2
2edcf17b07dd  openwrt  "/launch.py"  About a minute ago  Up About a minute (healthy)  22/tcp, 80/tcp, 830/tcp, 5000/tcp, 10000-10099/tcp, 161/udp  openwrt1

To check the logs output by the container’s bootstrap script, use the docker logs command as shown below. I removed a lot of the output to make the listing shorter but you can see the logs show the commands that were run as the router was started and configured by the bootstrap script.

T420:~# docker logs openwrt2
2019-03-13 22:45:07,899: vrnetlab   DEBUG    Creating overlay disk image
2019-03-13 22:45:07,917: vrnetlab   DEBUG    Starting vrnetlab OpenWRT
    ...cut text...
2019-03-13 22:45:23,522: vrnetlab   DEBUG    writing to serial console: mkdir -p /home/vrnetlab
2019-03-13 22:45:23,566: vrnetlab   DEBUG    writing to serial console: chown vrnetlab /home/vrnetlab
2019-03-13 22:45:23,566: launch     INFO     completed bootstrap configuration
2019-03-13 22:45:23,566: launch     INFO     Startup complete in: 0:00:15.642478

Configure the routers

The bootstrap script configured each OpenWRT router so users can login to it via its LAN/management interface using SSH. To create a network we can test, we need to add more configuration to each node in the network.

configure the router openwrt1

Run the vrcons command (from the vrnetlab.sh script) to use Telnet to log into the console port of the router represented by container openwrt1. Run the command as follows:

T420:~# vrcons openwrt1
Trying 172.17.0.2...
Connected to 172.17.0.2.
Escape character is '^]'.

root@OpenWrt:/#

Check the active configuration of the LAN/management interface. We know from the OpenWRT documentation that the LAN interface is implemented on a bridge named br-lan.

root@OpenWrt:/# ifconfig br-lan
br-lan  Link encap:Ethernet  HWaddr 52:54:00:9C:BF:00
        inet addr:10.0.0.15  Bcast:10.0.0.255  Mask:255.255.255.0
        inet6 addr: fd1a:531:2061::1/60 Scope:Global
        inet6 addr: fe80::5054:ff:fe9c:bf00/64 Scope:Link
        UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
        RX packets:47 errors:0 dropped:0 overruns:0 frame:0
        TX packets:57 errors:0 dropped:0 overruns:0 carrier:0
        collisions:0 txqueuelen:1000
        RX bytes:5472 (5.3 KiB)  TX bytes:9886 (9.6 KiB)

The LAN interface’s IP address is 10.0.0.15, which is the address the router needs in order to reach the management port on its container. However, the persistent configuration is found using the uci command (or by listing the file /etc/config/network) as follows:

root@OpenWrt:/# uci show network.lan

The command lists the following persistent configuration for the LAN interface

network.lan=interface
network.lan.type='bridge'
network.lan.ifname='eth0'
network.lan.proto='static'
network.lan.ipaddr='192.168.1.1'
network.lan.netmask='255.255.255.0'
network.lan.ip6assign='60'

The LAN interface’s persistent configuration does not match the active configuration. If we restart the router VM or restart the networking service, the LAN interface’s IP address will revert to the persistent value of 192.168.1.1, which will break the router VM’s connection to the Docker container’s management port.

Fix the problem by setting the IP address using the uci utility:

root@OpenWrt:/# uci set network.lan.ipaddr='10.0.0.15'

Also, configure the WAN interface with a static IP address. Use the IP address 10.10.10.1. First, check the existing WAN interface configuration:

root@OpenWrt:/# uci show network.wan

This lists the configuration below:

network.wan=interface
network.wan.ifname='eth1'
network.wan.proto='dhcp'

Change the WAN interface configuration with the following uci set commands:

root@OpenWrt:/# uci set network.wan.proto='static'
root@OpenWrt:/# uci set network.wan.ipaddr='10.10.10.1'
root@OpenWrt:/# uci set network.wan.netmask='255.255.255.0'

Finally, commit the configuration changes so they are saved on the router’s filesystem:

root@OpenWrt:/# uci commit network

Activate the changes by restarting the network service:

root@OpenWrt:/# service network restart

Verify that the WAN interface eth1 has an IP address:

root@OpenWrt:/# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 52:54:00:41:C3:01
          inet addr:10.10.10.1  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::5054:ff:fe41:c301/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5968 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:2001262 (1.9 MiB)

Exit the router’s VM using the Ctrl-] key combination. Then, quit the telnet connection to the container:

CTRL-]
telnet> quit
Connection closed.
T420:~# 

configure the router openwrt2

Configure the virtual router openwrt2 the same way as shown above so its WAN interface has IP address 10.10.10.2/24. Login to the router’s serial port as follows:

T420:~# vrcons openwrt2
root@OpenWrt:/#

Configure the router’s LAN and WAN interfaces:

root@OpenWrt:/# uci set network.lan.ipaddr='10.0.0.15'
root@OpenWrt:/# uci set network.wan.proto='static'
root@OpenWrt:/# uci set network.wan.ipaddr='10.10.10.2'
root@OpenWrt:/# uci set network.wan.netmask='255.255.255.0'
root@OpenWrt:/# uci commit network
root@OpenWrt:/# service network restart

Exit the router’s VM using the Ctrl-] key combination. Then, quit the container’s telnet connection:

CTRL-]
telnet> quit
Connection closed.
T420:~# 

Connect routers together

Run the vrbridge command, which is a shell function from the vrnetlab.sh script. Connect interface 1 on openwrt1 to interface 1 on openwrt2:

T420:~# vrbridge openwrt1 1 openwrt2 1

Remember, the vrbridge command is a shell function that takes the parameters you give it and builds a command that runs a vr-xcon container. The vr-xcon command is sent to the host system to be executed. For example, the vr-xcon command created by the vrbridge function we ran above is shown below.

T420:~# docker run -d --privileged --name bridge-openwrt1-1-openwrt2-1 --link openwrt1 --link openwrt2 vr-xcon --p2p openwrt1/1--openwrt2/1

You can see the container running using the docker ps command. The vrbridge function uses the router names and port numbers to create a name for the new container, bridge-openwrt1-1-openwrt2-1.

T420:~# docker ps
CONTAINER ID  IMAGE    COMMAND                 CREATED      STATUS                PORTS                                                        NAMES
2e1e0298ff66  vr-xcon  "/xcon.py --p2p open…"  1 min ago    Up About a minute                                                                  bridge-openwrt1-1-openwrt2-1
6695d10206a2  openwrt  "/launch.py"            2 hours ago  Up 2 hours (healthy)  22/tcp, 80/tcp, 830/tcp, 5000/tcp, 10000-10099/tcp, 161/udp  openwrt2
2edcf17b07dd  openwrt  "/launch.py"            2 hours ago  Up 2 hours (healthy)  22/tcp, 80/tcp, 830/tcp, 5000/tcp, 10000-10099/tcp, 161/udp  openwrt1

Test the connection by logging into openwrt1 and pinging openwrt2:

# vrcons openwrt1
root@OpenWrt:/# ping 10.10.10.2
PING 10.10.10.2 (10.10.10.2): 56 data bytes
64 bytes from 10.10.10.2: seq=0 ttl=64 time=1.255 ms
64 bytes from 10.10.10.2: seq=1 ttl=64 time=0.860 ms
64 bytes from 10.10.10.2: seq=2 ttl=64 time=1.234 ms
^C
--- 10.10.10.2 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.860/1.116/1.255 ms
root@OpenWrt:/#

Exit the router:

CTRL-]
telnet> quit
Connection closed.
T420:~# 

You may add the --debug option to the vr-xcon command when you run it. The vrbridge shell function you previously ran does not include the --debug option so, to demonstrate it, start another container running vr-xcon.

First, stop and delete the existing container bridge-openwrt1-1-openwrt2-1 as follows:

T420:~# docker rm -f bridge-openwrt1-1-openwrt2-1

Then use the following docker run command to start the new container with the --debug option enabled:

T420:~# docker run -d --privileged --name vr-xcon-1 --link openwrt1 --link openwrt2 vr-xcon --p2p openwrt1/1--openwrt2/1 --debug

This time, we named the container vr-xcon-1, just to make the command shorter. If you are building links one-by-one, you will create many containers running vr-xcon: one per link. In that case, I suggest you use more meaningful names like bridge-openwrt1-1-openwrt2-1 for each container running vr-xcon.

Run the ping command again from openwrt1 to openwrt2 and check the logs on the vr-xcon-1 container:

T420:~# docker logs vr-xcon-1
2019-03-14 05:19:23,446: xcon  DEBUG  00172 bytes openwrt2/1 -> openwrt1/1
2019-03-14 05:19:23,884: xcon  DEBUG  00102 bytes openwrt1/1 -> openwrt2/1
2019-03-14 05:19:23,884: xcon  DEBUG  00102 bytes openwrt2/1 -> openwrt1/1
2019-03-14 05:19:24,884: xcon  DEBUG  00102 bytes openwrt1/1 -> openwrt2/1
2019-03-14 05:19:24,885: xcon  DEBUG  00102 bytes openwrt2/1 -> openwrt1/1
2019-03-14 05:19:25,884: xcon  DEBUG  00102 bytes openwrt1/1 -> openwrt2/1
2019-03-14 05:19:25,885: xcon  DEBUG  00102 bytes openwrt2/1 -> openwrt1/1

You see that the instance of vr-xcon running in container vr-xcon-1 posts a log entry for each packet it handles. The --debug option and the docker logs command are useful for basic debugging, such as when you want to verify that the vr-xcon process is working properly.

Stop the network emulation

When you are done, you may stop all running containers.

T420:~# docker stop $(docker ps -a -q)

If you wish to delete the network emulation scenario, including all changes to configuration files on the router VMs, use the prune command to delete all stopped containers and unused networks.

T420:~# docker system prune

Data persistence

Vrnetlab VMs save changes made in the router configuration files or to data files on their disks. These changes will persist in the qemu disk images after the container is stopped. For example, when you want to work on something else, you may stop the containers in your network emulation scenario and turn off your server. Then, when you are ready to start work again, you can start your server and start all the containers associated with your network emulation scenario, including all vr-xcon containers. Your configuration changes will still exist on the network nodes.
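For example, assuming you stopped the containers with the docker stop command shown above, the following command starts every stopped container on the host again (a sketch; if you run other, unrelated containers on the same host, start your containers by name instead, and start the router containers before the vr-xcon bridge containers):

T420:~# docker start $(docker ps -a -q)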

However, the state saved in a node’s disk is lost when you delete the container. If you want to re-run the network emulation scenario, new containers start from the original Docker images.

To create a network emulation scenario that starts up in a fully configured state every time, you would need to write a complex launch script that pulls in configuration files and applies them to each node in the network when that node’s container is started.
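A simpler middle ground is a small shell script that re-creates the scenario from scratch using the same commands we ran earlier. It does not restore the router configurations, but it rebuilds the topology in one step. Below is a minimal sketch; the script name start-lab.sh and the 60-second boot delay are my own choices, and it assumes the openwrt and vr-xcon images described in this post:

#!/bin/bash
# start-lab.sh -- rebuild the two-router OpenWRT scenario (run as root)
docker run -d --privileged --name openwrt1 openwrt
docker run -d --privileged --name openwrt2 openwrt
# give the router VMs time to boot before cross-connecting them
sleep 60
docker run -d --privileged --name bridge-openwrt1-1-openwrt2-1 \
    --link openwrt1 --link openwrt2 \
    vr-xcon --p2p openwrt1/1--openwrt2/1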

Conclusion

While vrnetlab is positioned mainly as a tool to support developers working with commercial routers, I think it is also usable by researchers who will create labs interactively, using vrnetlab’s command-line interface.

I want to create more complex network emulation scenarios using open-source routers in vrnetlab. It seems possible to extend vrnetlab and add in support for a generic Linux VM running FRR, or some other routing software. I plan to try that in the future, when I have the time.

The Wistar network emulator

Wistar is an open-source network emulator originally developed by Juniper Networks and released under the Apache license. It simplifies the presentation of Juniper products on its graphical user interface by making the multiple VMs that make up each JunOS virtual router appear as one node in the network topology.

Wistar also supports Linux virtual machines and, interestingly, uses cloud-init to configure Linux routers from the Wistar user interface. Wistar also supports generic virtual appliances, in a basic way. In this post, I will install Wistar and use it to work through two examples using open source routers.

Wistar Documentation

The Wistar installation procedure is documented on the Wistar GitHub page. The Wistar user guide is available at the Read the Docs website and some unpublished chapters are available on GitHub. Juniper published a presentation about using Wistar. In addition, there are a few other blog posts available about using Wistar and comparing Wistar to other network emulators.

Wistar documentation is good enough to get started, but seems to be incomplete.

Install Wistar

I installed Wistar on my laptop computer running Ubuntu 18.04 LTS. I modified the Wistar installation procedure a little bit to make it work on my system; mainly, I incorporated Python virtual environments into the installation process.

You must first install Python 2 on Ubuntu 18.04, because Wistar is written in Python 2. (Python 2 will reach end of life on January 1st, 2020. I asked on the Wistar Slack channel about plans to upgrade the code base to Python 3 but received no response by the time I published this post.)

$ sudo su
# apt update
# apt -y upgrade
# apt -y install python-pip python-dev 

Install development tools:

# apt -y install build-essential libz-dev \
    libxml2-dev libxslt1-dev libffi-dev \
    libssl-dev libxml2-dev libxslt1-dev git

Install KVM and Libvirt:

# apt -y install qemu-kvm libvirt-clients \
    libvirt-daemon-system virt-manager \
    bridge-utils libguestfs-tools \
    libnss-libvirt libvirt-bin socat \
    genisoimage dosfstools unzip libvirt-dev

Create the directory structure for Wistar:

# mkdir -p /opt/wistar/user_images/instances
# mkdir -p /opt/wistar/seeds
# mkdir -p /opt/wistar/media
# cd /opt/wistar

Create a Python2 virtual environment and install additional Python modules in it.

# pip install virtualenv
# virtualenv .
# source /opt/wistar/bin/activate
(wistar) #
(wistar) # pip install pexpect libvirt-python netaddr markupsafe mtools
(wistar) # pip install pyvbox junos-eznc pyYAML Django==1.9.9
(wistar) # pip install cryptography websocket-client

Libvirt already creates an external bridge named virbr0 so I skipped the step in the installation guide that asks you to install a new bridge named br0.

Finally, download the Wistar Python program and install it in the virtual environment you previously created.

(wistar) # git clone https://github.com/juniper/wistar.git wistar
(wistar) # cd wistar/
(wistar) # ./manage.py migrate

The migrate command, above, sets up databases and other Wistar infrastructure.

Start Wistar web server

Start a basic webserver, built into the Django framework used by Wistar:

(wistar) # cd /opt/wistar/wistar
(wistar) # ./manage.py runserver 0.0.0.0:8080

Optionally, you may consult the Wistar installation directions to set up a more robust web server, like Apache.

Now, a basic web server is running. This terminal window will run the web server and I like to keep it available so I can see the requests it is serving.

Open a new terminal window for other commands that need to be run on the Wistar host PC.

Wistar GUI

Point your browser to the IP address on which the Wistar web server is running. Use TCP port number 8080. The IP is the IP address of the VM you created to run Wistar or, if you installed Wistar on your local machine, it is http://localhost:8080.

http://localhost:8080

In the browser window, the Wistar GUI starts at the Topologies page, as shown below. If this is your first time using Wistar, it will be blank.

Configure Wistar

Edit the /opt/wistar/wistar/wistar/configuration.py file to include the username, SSH key, and default instance password you wish to configure in the Ubuntu Cloud instances, or any other node type that can be initialized using cloud-init. All cloud-init-enabled images will have the following settings automatically applied on boot up.

# vi /opt/wistar/wistar/wistar/configuration.py

Set default userid and password

Note the default userid and password. You can use these or change them to a value you prefer.

default_instance_password = 'Clouds123'
ssh_user = "wistar"

Add your public SSH key

Change the public key to one you will use to login to the nodes you start in Wistar. (Note that the private key needs to be installed in the Wistar system’s /root/.ssh folder, since you need to run Wistar with elevated permissions).

ssh_key = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC6ZPy25MpxGnThisIsFakekeyGbKhClud5BAdrp8mE5aMYzif3g+XNRG1+KhoThisIsFakekeyGTb27mDmur37vS7JKeBThisIsFakekeyoInk1yS9CNb8GeOS2+iB/4w86SBU1IhThisIsFakekeySZ6mLpctzNFWveqcdXrL+E5UBX28hNUDPl+/JgwPB5ai8z root@ubuntu"

Change the default bridge

Change the default external bridge from br0 to virbr0. This will make Wistar automatically connect each new VM to the default Libvirt bridge named virbr0, which is a simple way to ensure every VM can access the Internet to download packages and files.

kvm_external_bridge = "virbr0"

Add images

Get some images that you will use to demonstrate a Wistar simulation scenario. In this example, use the Ubuntu Cloud Image, which supports cloud-init and can be configured directly by Wistar, and Brezular’s TinyCore router image, which comes ready to use as a router and can be used to demonstrate the “other” category of devices in Wistar.

Ubuntu Cloud image

Get the Ubuntu Cloud Image from the Ubuntu web site. Run the following commands in a terminal window on the Wistar host computer:

$ sudo su
# cd /opt/wistar/user_images/
# wget https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img

Brezular’s TinyCore router image

Brezular’s TinyCore Linux router with FRR is stored on Google’s Drive storage service. Download it from Brezular’s Linux Core Appliances page, or download it directly from Google Drive using the Python gdown script, as shown below:

# apt install python3-pip qemu-utils
# pip install gdown
# cd /opt/wistar/user_images/
# gdown https://drive.google.com/uc?id=1VGr3JGwDVGJVxoGZoisBtoANozKVvRnM

After you download Brezular’s TinyCore router image, convert it to QCOW2 image format:

# qemu-img convert -f vmdk -O qcow2 corepure64-9.0-ffr5.0.1.vmdk coreos-9.0.1.qcow2

Brezular userids and passwords

The Brezular router comes with a user already set up. The username is tc, with no password. The system’s root password is tc.

Wistar Image Types

When you import or upload an image into Wistar you must define the image type. Wistar lists many Juniper image types in the drop-down menu but there are only three types that support Linux VMs: “Other”, “Ubuntu 16”, and “Linux”.

Other

The Other type supports QCOW2 disk images. It tells Wistar to use the traditional interface naming scheme (eth0, eth1) when referring to ports on instances created from the image, and it assumes the image does not support cloud-init.

Instances created from the Other image type can support many interfaces. You must use the Other image type for open-source routers that will need more than two interfaces.

Ubuntu 16

The Ubuntu 16 image type is compatible with Ubuntu 18.04 Cloud images. The Ubuntu 16 type tells Wistar to use the predictable interface naming scheme (ens3, ens4) for instances created from that image, and also enables Wistar to perform basic configuration from the user interface. I suspect the Ubuntu 16 type will work for any Linux distribution that supports configuration via cloud-init.

Instances created from the Ubuntu 16 image type only support two interfaces: one management interface and one interface that connects to a router in the topology. This is an unfortunate restriction, because any Linux node can be configured as a router.

Linux

The Linux type is used for Linux distributions that use the traditional interface naming scheme, and that support cloud-init. You may also define SSH, Netconf, or console scripts that run when an instance is created from a Linux image. See the Templates menu for more options for scripts.

Instances created from the Linux image type only support two interfaces: one management interface and one interface that connects to a router in the topology.

Import images into Wistar

In the Wistar user interface, click on the Images link.

If you are running Wistar on your local Linux PC and you already copied image files to your local disk, as I have done, then click on Define Local Image. If you are working on a VM or with a remote server, you may use the Upload Image menu command to copy image files from your local PC to the VM or server.

Brezular router image (Other type)

Import the Brezular TinyCore router image into Wistar. Click on the Images link, then click on Define Local Image.

In the New Local Image screen, give the image a name, choose the type, and enter the location of the image file. In the example below, you import the Brezular router image as type Other:

Click on Submit Query. Wistar registers the local image in its database.

Ubuntu 18.04 Cloud Image (Ubuntu16 type)

Repeat the Define Local Image process for the Ubuntu 18.04 Cloud Image. This time, choose the type Ubuntu 16.

In the Images screen, you should see all the images you imported into Wistar:

Create a network topology

Create a network topology. Click on Topologies and then choose Create Topology from the menu.

A blank canvas appears. You may add virtual machine instances or bridges to the network and connect them together. You may also annotate the network topology by adding labels and shapes.

Add a VM

Add the first node to the topology by clicking on the Add VM link.

In the VM configuration panel, name the VM instance and choose its base image — which is one of the two images you previously imported. In the example below, we chose the Ubuntu Cloud Image.

After you click on the Add and Close button, you will see a new server icon on the canvas.

Bridges

Because I configured the default external bridge to be virbr0 in Wistar’s configuration.py file, Wistar will connect each new VM created to the bridge virbr0. So, I do not need to add a bridge to the topology to enable external connectivity.

I might add a private bridge later, to test how they work.

Save and run the topology

Click on the Save link to save the topology. In the dialog box, give the topology a name and description, then click Save.

The next screen that appears is full of options. Frankly, the user interface is a bit of a mess, with buttons all over the place and, if your browser window is too small, you will not see some of them, like the Re-Deploy button on the bottom right of the screen.

To define the VMs in the host’s hypervisor, click on the Deploy to Hypervisor button. This will define the Ubuntu VM you created in Libvirt.

The screen changes to show more information and options. For example, you can now see the KVM status of each element in the topology. You may start individual VMs by clicking on the buttons next to them in the KVM Deployment Status section, or you may start all VMs in the topology by clicking on the Start Topology button, as shown below:

You will be asked to enter the number of seconds to wait between starting domains. Change it to a low number, like 10 seconds, when working with a small number of open-source routers.

Next, it asks you if you are sure you want to start all routers at the same time. This warning must be hard-coded into Wistar. I just click OK:

You can now see that the VM has been defined in Libvirt and is running.

You can confirm this by opening a terminal window on the Wistar host PC and running the virsh list command:

# virsh list
 Id    Name                State
----------------------------------------
 13    t5_ubuntu1          running

Login to network nodes

Wistar offers many ways to send commands to running instances. In the Wistar user interface, you may click on an icon next to any node to open a VNC console. You may also run commands on nodes directly from the Wistar GUI. In addition, you may access nodes from a terminal on the Wistar host PC using either SSH or the Libvirt console. I’ll describe all the access methods below.

Access methods differ depending on image type

Instances created from the Ubuntu 16 and Other image types support VNC access, terminal access, and the Linux CLI Automation function. The Linux image type supports two additional tools to run CLI commands from the Wistar GUI: Execute CLI and Instance Tools. I think these additional options are missing from the Ubuntu 16 and Other types due to a bug in Wistar. This is a bit of a shame, since they make it easier to quickly interact with nodes in the topology.

VNC Console

To log into an instance using the VNC console, click on the VNC icon next to the instance name in the Wistar GUI, as shown below:

This will open up a pop-up window in your browser that displays the VNC console of the node:

NOTE: VNC console does not work if you run Wistar in a VM or on a remote server. It seems that Wistar does not allow for firewall or NAT issues.

SSH

Wistar does not support SSH directly from its user interface. You must open a terminal window on the Wistar host PC to use SSH. I prefer to use SSH to connect to the VMs created by Wistar because it allows me to use copy-and-paste.

To use SSH, you must first determine the management IP address assigned to the VM. You should see the IP address of the VM displayed in the Wistar topology canvas. In this example, you should see that the IP address of the node ubuntu1 is 192.168.122.12.

We already know the ubuntu1 node’s userid is wistar and the password is Clouds123, so we log in with the command:

# ssh wistar@192.168.122.12

Libvirt console

There is no built-in console function in Wistar but, since Wistar uses Libvirt to implement the instances in the topology, you can open a terminal window and use the virsh console command. If SSH will not connect to a node (for example, the Brezular router needs some extra configuration to enable an OpenSSH server), then the virsh console command will work, as long as the network appliance has a serial interface configured.

First, find the VM name created by Wistar:

# virsh list
 Id    Name                 State
------------------------------------------
 1     t5_ubuntu1           running

We see Wistar named our VM t5_ubuntu1. I suspect Wistar prepended the prefix t5_ to the VM name because this is the fifth topology I created using this installation of Wistar. Then connect to the VM’s console:

# virsh console t5_ubuntu1

Login to ubuntu1 using userid wistar and password Clouds123. Then you should see the instance’s command line prompt:

ubuntu1:~$

Linux CLI Automation – all nodes

Another way to run commands on Linux instances managed by Wistar is to click on the Linux CLI Automation button, as shown below:

This opens up a dialog box that allows you to enter a command, or a string of commands separated by semicolons, that will run on all Linux VMs in the topology. I can see this being useful for checking the status of VMs or for starting a script on every node at the same time, but it would not be suitable for configuration.
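For instance, a command string like the following (my own example, not from the Wistar documentation) would report basic status from every node; the commands run in sequence on each VM:

hostname; uptime; ip addr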

To run the command, enter it in the command window at the top of the dialog box, then click the Execute button. See an example, below, where we see output from the one running instance, ubuntu1:

Access methods for the Linux image type

The following access methods work only for instances created using the Linux image type. Again, I think that’s a bug. If you create nodes using the Linux image type, you will see the following access methods available in the Wistar GUI:

Instance Tools

The Instance Tools appear when you click on a Linux node. You may need to scroll down to see them. The Execute Script tool allows you to upload a script from your local PC and execute it on the selected node. This may be useful for pushing configuration or test scripts to the selected node and running them.

Execute CLI

You may run Linux commands on instances managed by Wistar using the Execute CLI feature in the Wistar GUI. Click on the node to select it. You should see the option in the GUI change and you will have options related to the node you selected.

Under the Execute CLI section, enter the command you wish to run and click on the Go button. Wistar runs this command on the node and returns the result in the window below.

This is useful for simple configurations but we need a real interactive terminal to perform most actions on a Linux node.

Testing the node’s Internet connection

Once logged into the VM instance, test its connectivity to the Internet. I tried pinging the Google DNS server and running the apt update command. Both worked.

ubuntu1:~$ ping -c 1 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=51 time=8.64 ms
... output deleted ...
ubuntu1:~$ sudo apt update
[sudo] password for wistar:
Get:1 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Hit:2 http://archive.ubuntu.com/ubuntu bionic InRelease
... output deleted ...

Also, I checked the bridge and virtual connections by running the following commands on the Wistar host PC:

# virsh domiflist t5_ubuntu1
Interface  Type       Source     Model       MAC
-------------------------------------------------------
vnet0      bridge     virbr0     virtio      52:54:00:05:00:00
# brctl show virbr0
bridge name  bridge id          STP enabled  interfaces
virbr0       8000.52540067f842  yes          virbr0-nic
                                             vnet0

From the results above, I see Wistar connected the node to the virbr0 bridge.

You may also verify the IP address assigned to the VM by the DHCP server running on Libvirt’s default network, which manages the bridge virbr0. Use the virsh net-dhcp-leases command:

# virsh net-dhcp-leases default

This will provide output similar to the following:

# virsh net-dhcp-leases default
 Expiry Time          MAC address        Protocol  IP address         Hostname    Client ID or DUID
--------------------------------------------------------------------------------------------------------
 2019-04-23 17:55:15  52:54:00:05:00:00  ipv4      192.168.122.12/24  t5_ubuntu1  ff:b5:5e:67:ff:00:02:00:00:ab:11:69:67:57:54:ae:d3:12:a2

You can see that the IP address of VM t5_ubuntu1 is 192.168.122.12.

Add a node to the topology

One node all by itself is not very interesting. Wistar allows you to add nodes to an existing topology so let’s add a router and connect it to our ubuntu1 VM.

Click on the Add Instance button at the top of the canvas.

In the Add VM dialog box, choose the TinyCore-Router image and name it router1.

The router icon appears on the canvas. Move it beside the ubuntu1 node. Then, create a link between nodes by clicking on the circle in the middle of one node and dragging the circle to the middle of the other node.

To save the new node and link in the topology and to define them in the hypervisor, click on the Re-Deploy button in the lower right corner of the Wistar user interface.

You will see a warning that tells you the VMs must be restarted. Wistar will not restart the running VMs for you. The warning box is telling you that you need to restart the VMs, yourself.

It’s a bit confusing, but the warning box is really a call to action. You must destroy and start any running VM that has a link added to it or the interface will not be set up properly on that VM.

We added a link to the running VM ubuntu1. We must stop it, then start it again. Just logging into it and rebooting is not sufficient.

To stop the VM, find it in the KVM Deployment Status and click on the green button. This will power off the VM.

Wait a couple of minutes for the ubuntu1 VM to power down. Then, start the topology by clicking on the Start Topology button. This will start both nodes and, when they start, the link will be enabled at each end.

Next, we need to configure the interfaces on each VM. Click on the link between VMs to get the interface information. See the link details appear on the left side of the Wistar screen.

In this case, we see that ubuntu1 is connected on interface ens4 and router1 is connected on interface eth0. Interface ens4 refers to the second interface on ubuntu1 and eth0 refers to the first interface on router1.

The Wistar GUI option to configure IP addresses on links between source and target devices only works if they were created from the Linux or Ubuntu 16 image type. Since one of the nodes in our example was created from Other image type, we cannot set up IP addresses this way. Also, configurations made from the Wistar GUI are not permanent and will disappear if the nodes are rebooted. It is better to make configurations manually by logging into each node and editing configuration files.

Check the names of each VM. On the Wistar host PC, run the command:

# virsh list

You should see two VMs running:

 Id    Name                           State
----------------------------------------------------
 15    t5_ubuntu1                     running
 16    t5_router1                     running

Log into ubuntu1.

# virsh console t5_ubuntu1

Login with userid “wistar” and password “Clouds123”. Configure the second interface on ubuntu1, named ens4:

ubuntu1~$ sudo su
ubuntu1~# ip addr add 10.1.1.2/24 dev ens4
ubuntu1~# ip link set dev ens4 up

Exit from the console using the CTRL] key combination. Then, log into router1:

# virsh console t5_router1

Login with userid “tc”. No password is required. Then configure the first interface on router1, named eth0:

router1~$ sudo su
router1~# ip addr add 10.1.1.3/24 dev eth0
router1~# ip link set dev eth0 up

Then test the connection to ubuntu1 by pinging the far-end IP address:

router1~# ping 10.1.1.2
PING 10.1.1.2 (10.1.1.2): 56 data bytes
64 bytes from 10.1.1.2: seq=0 ttl=64 time=0.684 ms
64 bytes from 10.1.1.2: seq=1 ttl=64 time=0.640 ms

Exit from the console using the CTRL] key combination.

You verified that the connection is set up and passes traffic between the two nodes in the Wistar network emulator topology.

Stopping an emulation

To stop the nodes running in a topology, click on the Pause Topology button or shut them down one at a time by clicking on the Status icon next to each instance in the KVM Deployment Status section.

After a while, the nodes will be stopped. The KVM Deployment status will change to the color red.

Stopping Wistar

The only part of Wistar that is running as a daemon is the web server. You can stop the web server by returning to the terminal window in which it is running and pressing the CTRLC key combination.

To exit the Wistar virtual environment, run the deactivate command in the same terminal window, as shown below:

(wistar) # deactivate

Create more complex topologies

Let’s see what happens if we create a complex topology. Since we observed that adding nodes to a saved topology does not work well, we’ll build a new topology that includes six nodes: three routers and three hosts.

Start Wistar

First, start Wistar again (assuming you deactivated it in the previous section). On the Wistar host PC, open a terminal window and run the following commands:

# source /opt/wistar/bin/activate
(wistar) #
(wistar) # cd /opt/wistar/wistar
(wistar) # ./manage.py runserver 0.0.0.0:8080

Then, open your web browser and enter the URL, http://localhost:8080.

Create new topology

You will see the Topology page and the list of available topologies. Click on Create Topology in the Topologies menu:

Then, click on Add VM. Choose the Ubuntu18 image and name the VM ubuntu1. Next, click on the Add Another Instance button. This will create a VM named ubuntu1 and then increment the form so a new VM named ubuntu2 is ready to be created. Click the Add Another Instance button again and see the form increment to ubuntu3. Finally, click on the Add and Close button to create the third instance and close the Add VM dialog box.

Repeat the process for the router1 VM and choose the TinyCore-Router image. Again, click the Add Another Instance button twice and finish by clicking on the Add and Close button. When you are done, you should see six nodes on the canvas, as shown below.

Reposition the nodes by clicking on them and dragging them around. Connect the nodes together by clicking and dragging the circles in the center of each node to the other nodes. I created a simple network of routers with an ubuntu VM connected to each router, as shown below. Click on the Save link.

Give the topology a name and description in the dialog box that appears.

Record port numbers

At this point, it will be helpful to understand which interfaces on each node are connected to which link. Click on each link in the topology and write down which interfaces are connected to the links between nodes.

I made a table I will use later when I need to configure the nodes:

node      interface   to   node      interface
ubuntu1   ens4             router1   eth2
ubuntu2   ens4             router2   eth2
ubuntu3   ens4             router3   eth2
router1   eth0             router2   eth0
router2   eth1             router3   eth0
router3   eth1             router1   eth1

Start the network emulation

Define the nodes and networks that comprise the network emulation scenario by clicking on the Deploy to Hypervisor button. Then, start the network emulation by clicking on the Start Topology button.

Again, choose 10 seconds as the time to wait between starting nodes.

Configure the VMs

We’ll be using the console command to log into each VM and configure it. First, get a list of the names for each VM, according to Libvirt. On the Wistar host PC, run the virsh list command:

# virsh list
 Id    Name                 State
-------------------------------------
 18    t6_router1           running
 19    t6_router2           running
 20    t6_router3           running
 21    t6_ubuntu1           running
 22    t6_ubuntu2           running
 23    t6_ubuntu3           running

Since this is the sixth topology I created, Wistar added the prefix t6_ to each node in the topology.

Network plan

I want to create configuration changes that will persist after a node is rebooted. As I’ve discussed in other posts where I use the same three-router topology, you need to plan your IP addresses and interfaces before you start configuring nodes.

The network planning table I created for this topology is shown below.

VM name   Interface name   IP address
ubuntu1   ens3             DHCP
ubuntu1   ens4             10.1.1.2/24
ubuntu2   ens3             DHCP
ubuntu2   ens4             10.2.2.2/24
ubuntu3   ens3             DHCP
ubuntu3   ens4             10.3.3.2/24
router1   eth0             10.10.10.11/24
router1   eth1             10.30.30.11/24
router1   eth2             10.1.1.1/24
router1   eth3             DHCP
router2   eth0             10.10.10.12/24
router2   eth1             10.20.20.11/24
router2   eth2             10.2.2.1/24
router2   eth3             DHCP
router3   eth0             10.20.20.12/24
router3   eth1             10.30.30.12/24
router3   eth2             10.3.3.1/24
router3   eth3             DHCP

Node configurations

From the network plan, configure each node’s interfaces with the appropriate IP addresses. On the router nodes, enable a routing protocol.

Ubuntu1 configuration

On each Ubuntu node, we will install test software, configure the interface connected to the router, and modify the SSH configuration so we can log in to the wistar account without using a password.

Login to the ubuntu1 VM using the command:

# virsh console t6_ubuntu1

Login with userid “wistar” and password “Clouds123”.

ubuntu1~$ sudo su
ubuntu1~# apt update
ubuntu1~# apt install -y traceroute tcpdump nmap

Copy and paste the following text into the ubuntu1 VM’s console:

bash <<EOF2
hostname ubuntu1
rm /etc/hostname
echo "ubuntu1" > /etc/hostname
sed -i 's/PasswordAuthentication yes/PasswordAuthentication no/g' /etc/ssh/sshd_config
sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config
systemctl restart sshd
rm /etc/netplan/01-netcfg.yaml
cat >> /etc/netplan/01-netcfg.yaml << EOF
network:
  version: 2
  renderer: networkd
  ethernets:
    ens3:
      dhcp4: yes
    ens4:
      addresses:
        - 10.1.1.2/24
      #gateway4:    
      routes:
        - to: 10.0.0.0/8
          via: 10.1.1.1
          metric: 100
EOF
chmod 644 /etc/netplan/01-netcfg.yaml
netplan apply
EOF2

Then exit the console with the CTRL] key combination.

Ubuntu2 configuration

Login to the ubuntu2 VM using the command:

# virsh console t6_ubuntu2

Login with userid “wistar” and password “Clouds123”.

ubuntu2~$ sudo su
ubuntu2~# apt update
ubuntu2~# apt install -y traceroute tcpdump nmap

Copy and paste the following text into the ubuntu2 VM’s console:

bash <<EOF2
hostname ubuntu2
rm /etc/hostname
echo "ubuntu2" > /etc/hostname
sed -i 's/PasswordAuthentication yes/PasswordAuthentication no/g' /etc/ssh/sshd_config
sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config
systemctl restart sshd
rm /etc/netplan/01-netcfg.yaml
cat >> /etc/netplan/01-netcfg.yaml << EOF
network:
  version: 2
  renderer: networkd
  ethernets:
    ens3:
      dhcp4: yes
    ens4:
      addresses:
        - 10.2.2.2/24
      #gateway4:    
      routes:
        - to: 10.0.0.0/8
          via: 10.2.2.1
          metric: 100
EOF
chmod 644 /etc/netplan/01-netcfg.yaml
netplan apply
EOF2

Then exit the console with the CTRL] key combination.

Ubuntu3 configuration

Login to the ubuntu3 VM using the command:

# virsh console t6_ubuntu3

Login with userid “wistar” and password “Clouds123”.

ubuntu3~$ sudo su
ubuntu3~# apt update
ubuntu3~# apt install -y traceroute tcpdump nmap

Copy and paste the following text into the ubuntu3 VM’s console:

bash <<EOF2
hostname ubuntu3
rm /etc/hostname
echo "ubuntu3" > /etc/hostname
sed -i 's/PasswordAuthentication yes/PasswordAuthentication no/g' /etc/ssh/sshd_config
sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config
systemctl restart sshd
rm /etc/netplan/01-netcfg.yaml
cat >> /etc/netplan/01-netcfg.yaml << EOF
network:
  version: 2
  renderer: networkd
  ethernets:
    ens3:
      dhcp4: yes
    ens4:
      addresses:
        - 10.3.3.2/24
      #gateway4:    
      routes:
        - to: 10.0.0.0/8
          via: 10.3.3.1
          metric: 100
EOF
chmod 644 /etc/netplan/01-netcfg.yaml
netplan apply
EOF2

Then exit the console with the CTRL] key combination.

router1 configuration

The router nodes are based on TinyCore, so the configuration steps are different from those on a typical Linux distribution. For more information about how configuration on TinyCore works, see my post about persistent configuration changes in TinyCore Linux.

For each router, set up SSH and use FRR to configure networking and routing. The Wistar network emulator did not set up the SSH keys on the router nodes because they are the “Other” node type and are not compatible with cloud-init. In the listing below, replace the SSH public key with your own public key or SSH will not work for you.
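If you are not sure what your public key is, you can usually print it on the Wistar host PC and copy it from there (a sketch; the exact filename depends on the type of key you generated):

# cat /root/.ssh/id_rsa.pub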

Log into router1:

# virsh console t6_router1

Login with userid “tc” and switch to elevated privileges:

router1~$ sudo su
router1~#

Then, paste the following script into the terminal window:

sh <<EOF2
sed -i "2a echo 'router1' > /etc/hostname" /opt/bootlocal.sh
sed -i "3a hostname -F /etc/hostname" /opt/bootlocal.sh
sed -i "4a sed -i 's/box/router1/g' /etc/hosts" /opt/bootlocal.sh
ssh-keygen -A
cp /usr/local/etc/ssh/sshd_config.orig /usr/local/etc/ssh/sshd_config
sed -i 's/#ChallengeResponseAuthentication yes/ChallengeResponseAuthentication no/g' /usr/local/etc/ssh/sshd_config
sed -i 's/#PasswordAuthentication yes/PasswordAuthentication no/g' /usr/local/etc/ssh/sshd_config
sed -i 's/#UsePAM no/UsePAM no/g' /usr/local/etc/ssh/sshd_config
echo "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC6ZPy25MpxGnThisIsFakekeyGbKhClud5BAdrp8mE5aMYzif3g+XNRG1+KhoThisIsFakekeyGTb27mDmur37vS7JKeBThisIsFakekeyoInk1yS9CNb8GeOS2+iB/4w86SBU1IhThisIsFakekeySZ6mLpctzNFWveqcdXrL+E5UBX28hNUDPl+/JgwPB5ai8z root@ubuntu" > /home/tc/.ssh/authorized_keys
chown tc /home/tc/.ssh/authorized_keys
chmod 600 /home/tc/.ssh/authorized_keys
/usr/local/sbin/sshd
rm /usr/local/etc/frr/babeld.conf
touch /usr/local/etc/frr/babeld.conf
rm /usr/local/etc/frr/bgpd.conf
touch /usr/local/etc/frr/bgpd.conf
rm /usr/local/etc/frr/eigrpd.conf
touch /usr/local/etc/frr/eigrpd.conf
rm /usr/local/etc/frr/isisd.conf
touch /usr/local/etc/frr/isisd.conf
rm /usr/local/etc/frr/ldpd.conf
touch /usr/local/etc/frr/ldpd.conf
rm /usr/local/etc/frr/ospf6d.conf
touch /usr/local/etc/frr/ospf6d.conf
rm /usr/local/etc/frr/ospfd.conf
cat >> /usr/local/etc/frr/ospfd.conf << EOF
frr version 5.0.1
frr defaults traditional
log stdout
router ospf
 ospf router-id 1.1.1.1
 passive-interface eth2
 network 10.1.1.0/24 area 0
 network 10.10.10.0/24 area 0
 network 10.30.30.0/24 area 0
line vty
EOF
rm /usr/local/etc/frr/pimd.conf
touch /usr/local/etc/frr/pimd.conf
rm /usr/local/etc/frr/ripd.conf
touch /usr/local/etc/frr/ripd.conf
rm /usr/local/etc/frr/ripngd.conf
touch /usr/local/etc/frr/ripngd.conf
rm /usr/local/etc/frr/vtysh.conf
touch /usr/local/etc/frr/vtysh.conf
rm /usr/local/etc/frr/zebra.conf
cat >> /usr/local/etc/frr/zebra.conf << EOF
frr version 5.0.1
frr defaults traditional
hostname router1
password zebra
enable password zebra
interface eth0
 ip address 10.10.10.11/24
interface eth1
 ip address 10.30.30.11/24
interface eth2
 ip address 10.1.1.1/24
line vty
EOF
filetool.sh -b
EOF2

Then reboot the router.

# sudo reboot

Exit the VM using the CTRL] key combination.

router2 configuration

Log into router2:

# virsh console t6_router2

Login with userid “tc” and switch to elevated privileges:

router2~$ sudo su
router2~#

Then, paste the following script into the terminal window:

sh <<EOF2
sed -i "2a echo 'router2' > /etc/hostname" /opt/bootlocal.sh
sed -i "3a hostname -F /etc/hostname" /opt/bootlocal.sh
sed -i "4a sed -i 's/box/router2/g' /etc/hosts" /opt/bootlocal.sh
ssh-keygen -A
cp /usr/local/etc/ssh/sshd_config.orig /usr/local/etc/ssh/sshd_config
sed -i 's/#ChallengeResponseAuthentication yes/ChallengeResponseAuthentication no/g' /usr/local/etc/ssh/sshd_config
sed -i 's/#PasswordAuthentication yes/PasswordAuthentication no/g' /usr/local/etc/ssh/sshd_config
sed -i 's/#UsePAM no/UsePAM no/g' /usr/local/etc/ssh/sshd_config
echo "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC6ZPy25MpxGnThisIsFakekeyGbKhClud5BAdrp8mE5aMYzif3g+XNRG1+KhoThisIsFakekeyGTb27mDmur37vS7JKeBThisIsFakekeyoInk1yS9CNb8GeOS2+iB/4w86SBU1IhThisIsFakekeySZ6mLpctzNFWveqcdXrL+E5UBX28hNUDPl+/JgwPB5ai8z root@ubuntu" > /home/tc/.ssh/authorized_keys
chown tc /home/tc/.ssh/authorized_keys
chmod 600 /home/tc/.ssh/authorized_keys
/usr/local/sbin/sshd
rm /usr/local/etc/frr/babeld.conf
touch /usr/local/etc/frr/babeld.conf
rm /usr/local/etc/frr/bgpd.conf
touch /usr/local/etc/frr/bgpd.conf
rm /usr/local/etc/frr/eigrpd.conf
touch /usr/local/etc/frr/eigrpd.conf
rm /usr/local/etc/frr/isisd.conf
touch /usr/local/etc/frr/isisd.conf
rm /usr/local/etc/frr/ldpd.conf
touch /usr/local/etc/frr/ldpd.conf
rm /usr/local/etc/frr/ospf6d.conf
touch /usr/local/etc/frr/ospf6d.conf
rm /usr/local/etc/frr/ospfd.conf
cat >> /usr/local/etc/frr/ospfd.conf << EOF
frr version 5.0.1
frr defaults traditional
log stdout
router ospf
 ospf router-id 2.2.2.2
 passive-interface eth2
 network 10.2.2.0/24 area 0
 network 10.20.20.0/24 area 0
 network 10.30.30.0/24 area 0
line vty
EOF
rm /usr/local/etc/frr/pimd.conf
touch /usr/local/etc/frr/pimd.conf
rm /usr/local/etc/frr/ripd.conf
touch /usr/local/etc/frr/ripd.conf
rm /usr/local/etc/frr/ripngd.conf
touch /usr/local/etc/frr/ripngd.conf
rm /usr/local/etc/frr/vtysh.conf
touch /usr/local/etc/frr/vtysh.conf
rm /usr/local/etc/frr/zebra.conf
cat >> /usr/local/etc/frr/zebra.conf << EOF
frr version 5.0.1
frr defaults traditional
hostname router2
password zebra
enable password zebra
interface eth0
 ip address 10.10.10.12/24
interface eth1
 ip address 10.20.20.11/24
interface eth2
 ip address 10.2.2.1/24
line vty
EOF
filetool.sh -b
EOF2

Then reboot the router.

# sudo reboot

Exit the VM using the CTRL] key combination.

router3 configuration

Log into router3:

# virsh console t6_router3

Login with userid “tc” and switch to elevated privileges:

router3~$ sudo su
router3~#

Then, paste the following script into the terminal window:

sh <<EOF2
sed -i "2a echo 'router3' > /etc/hostname" /opt/bootlocal.sh
sed -i "3a hostname -F /etc/hostname" /opt/bootlocal.sh
sed -i "4a sed -i 's/box/router3/g' /etc/hosts" /opt/bootlocal.sh
ssh-keygen -A
cp /usr/local/etc/ssh/sshd_config.orig /usr/local/etc/ssh/sshd_config
sed -i 's/#ChallengeResponseAuthentication yes/ChallengeResponseAuthentication no/g' /usr/local/etc/ssh/sshd_config
sed -i 's/#PasswordAuthentication yes/PasswordAuthentication no/g' /usr/local/etc/ssh/sshd_config
sed -i 's/#UsePAM no/UsePAM no/g' /usr/local/etc/ssh/sshd_config
echo "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC6ZPy25MpxGnThisIsFakekeyGbKhClud5BAdrp8mE5aMYzif3g+XNRG1+KhoThisIsFakekeyGTb27mDmur37vS7JKeBThisIsFakekeyoInk1yS9CNb8GeOS2+iB/4w86SBU1IhThisIsFakekeySZ6mLpctzNFWveqcdXrL+E5UBX28hNUDPl+/JgwPB5ai8z root@ubuntu" > /home/tc/.ssh/authorized_keys
chown tc /home/tc/.ssh/authorized_keys
chmod 600 /home/tc/.ssh/authorized_keys
/usr/local/sbin/sshd
rm /usr/local/etc/frr/babeld.conf
touch /usr/local/etc/frr/babeld.conf
rm /usr/local/etc/frr/bgpd.conf
touch /usr/local/etc/frr/bgpd.conf
rm /usr/local/etc/frr/eigrpd.conf
touch /usr/local/etc/frr/eigrpd.conf
rm /usr/local/etc/frr/isisd.conf
touch /usr/local/etc/frr/isisd.conf
rm /usr/local/etc/frr/ldpd.conf
touch /usr/local/etc/frr/ldpd.conf
rm /usr/local/etc/frr/ospf6d.conf
touch /usr/local/etc/frr/ospf6d.conf
rm /usr/local/etc/frr/ospfd.conf
cat >> /usr/local/etc/frr/ospfd.conf << EOF
frr version 5.0.1
frr defaults traditional
log stdout
router ospf
 ospf router-id 3.3.3.3
 passive-interface eth2
 network 10.3.3.0/24 area 0
 network 10.20.20.0/24 area 0
 network 10.30.30.0/24 area 0
line vty
EOF
rm /usr/local/etc/frr/pimd.conf
touch /usr/local/etc/frr/pimd.conf
rm /usr/local/etc/frr/ripd.conf
touch /usr/local/etc/frr/ripd.conf
rm /usr/local/etc/frr/ripngd.conf
touch /usr/local/etc/frr/ripngd.conf
rm /usr/local/etc/frr/vtysh.conf
touch /usr/local/etc/frr/vtysh.conf
rm /usr/local/etc/frr/zebra.conf
cat >> /usr/local/etc/frr/zebra.conf << EOF
frr version 5.0.1
frr defaults traditional
hostname router3
password zebra
enable password zebra
interface eth0
 ip address 10.20.20.12/24
interface eth1
 ip address 10.30.30.12/24
interface eth2
 ip address 10.3.3.1/24
line vty
EOF
filetool.sh -b
EOF2

Then reboot the router.

# sudo reboot

Test the network

Now you can run some simple network tests to ensure your setup is correct. List the route table on the routers. Ping from ubuntu1 to ubuntu2 and ubuntu3. Then, you may configure more complex network scenarios.

You may also test that SSH is working by connecting to nodes using ssh.

Test failure scenarios by stopping one of the routers and using the traceroute command to see that the remaining routers converge on a new topology.
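For example, you might check each router’s OSPF state and routing table from its console, then trace the path between hosts before and after stopping a router. The sketch below shows the kinds of commands involved; it assumes FRR’s vtysh is on the routers’ PATH, and your domain names may differ:

# virsh console t6_router1
router1~$ sudo vtysh -c 'show ip ospf neighbor'
router1~$ sudo vtysh -c 'show ip route'

# virsh console t6_ubuntu1
ubuntu1~$ ping -c 3 10.2.2.2
ubuntu1~$ traceroute 10.3.3.2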

You might emulate a broken link by stopping the related bridge, but the Wistar GUI does not show you which bridge is related to which link. You would have to perform some checking at the command line. You can see which bridge is connected to a node using the virsh domiflist command. For example, if you see a bridge is connected to both router2 and router3, then you know it is the bridge related to the link between them.

Once you map out the bridges, you can use the Wistar user interface to stop and restart them.

For example, run the following commands on the Wistar host PC:

# virsh domiflist t6_router2
Interface Type    Source  Model   MAC
---------------------------------------------------
vnet4     bridge  t6_br1  virtio  52:54:00:06:00:04
vnet5     bridge  t6_br2  virtio  52:54:00:06:00:06
vnet6     bridge  t6_br5  virtio  52:54:00:06:00:10
vnet7     bridge  virbr0  virtio  52:54:00:06:00:16

# virsh domiflist t6_router3
Interface  Type   Source  Model   MAC
---------------------------------------------------
vnet8     bridge  t6_br2  virtio  52:54:00:06:00:07
vnet9     bridge  t6_br3  virtio  52:54:00:06:00:0a
vnet10    bridge  t6_br6  virtio  52:54:00:06:00:13
vnet11    bridge  virbr0  virtio  52:54:00:06:00:17

You see that the bridge t6_br2 is the common bridge, so it is the bridge for the link between router2 and router3.

We can verify that with the brctl show command:

# brctl show t6_br2
bridge name  bridge id          STP enabled  interfaces
t6_br2       8000.525400060008  yes          t6_br2-nic
                                             vnet5
                                              vnet8

Power off the bridge in Wistar and see that you can no longer send traffic on that link.

Libvirt detaches the bridge’s virtual interfaces when you stop it and Wistar does not re-connect them to the bridge when you start the bridge again.

The brctl show command shows us that the virtual interfaces are no longer attached to the bridge:

# brctl show t6_br2
bridge name  bridge id           STP enabled  interfaces
t6_br2       8000.525400060008   yes          t6_br2-nic

You will need to manually re-attach the virtual interfaces. We know, from the data we collected above, that bridge t6_br2 was connected to vnet5 on t6_router2 and to vnet8 on t6_router3. We can reconnect them with the commands:

# brctl addif t6_br2 vnet5
# brctl addif t6_br2 vnet8

Or, you may use Libvirt commands:

# virsh detach-interface t6_router3 bridge --mac 52:54:00:06:00:07
# virsh detach-interface t6_router2 bridge --mac 52:54:00:06:00:06
# virsh attach-interface t6_router3 bridge t6_br2 --mac 52:54:00:06:00:07 --model virtio
# virsh attach-interface t6_router2 bridge t6_br2 --mac 52:54:00:06:00:06 --model virtio

This makes me wonder why Wistar provides any control of Libvirt networks in the user interface. It seems kind of useless to me.

Wistar and switching software

Wistar uses the KVM and the Linux kernel to build virtual machines and emulate network links between them. The Linux kernel blocks some switching protocols, like LLDP. This does not affect most users but, for those who are investigating Layer 2 protocols, Robin Gilijamse, a network architect and blogger working in the Netherlands, wrote a couple of posts that describe how to modify the Linux kernel to allow LLDP and other layer-2 signaling protocols to pass between VMs over a Linux bridge. I list both of his relevant blog posts, below:

Saving node configurations

Wistar creates a separate disk image for each instance in a topology and stores them in the /opt/wistar/user-images/instances directory. If you make configuration changes to nodes in your topology, they are saved on the instance disk images.

After you stop a topology and exit Wistar, the topology disk images are preserved. When you start the topology again, it starts the nodes from their saved disk images in the instances directory.

Exporting configurations

Commercial routers like Juniper’s may be configured by importing a text file containing configuration commands. Wistar supports exporting this file from Juniper images in the topology. Click on the Create button under Saved Config Sets in the lower left corner of the Wistar user interface.

However, Wistar does not provide any way to export Linux VM or Linux router configurations in a text format. This makes it difficult to share a topology consisting of open-source routers with colleagues working on other systems.

Conclusion

I showed you how to use Wistar to emulate a network of open-source routers. I demonstrated Wistar features and showed how it integrates Libvirt as its virtualization back end.

While using Wistar, I found myself using the Libvirt command line and other terminal commands fairly often. Wistar is probably more impressive when used with Juniper router images. It works well with open-source router images, but it has some rough edges.

Install the Antidote (NRE Labs) network emulator on a Linux system


Antidote is a network emulator combined with a presentation framework designed to create and deliver networking technology training. Its user interface operates in a web browser, including the terminals that students use to run commands on emulated network devices and servers.

Antidote is the engine that runs the Network Reliability Labs web site. Antidote is an open-source project, released under the Apache license. A standalone version of Antidote may be installed and run on your personal computer using the selfmedicate script. In this post, I will install Antidote on my Linux laptop and make a few changes that improve Antidote performance on my Linux system.

Antidote documentation

The Antidote documentation is being expanded regularly but, at the time I am writing this, the most helpful information is in the NRE Labs blog and in the videos produced by the developers. Most of these are accessible from the NRE Labs Community Resources page.

Also, Antidote is in active development and it is changing quickly as the developers create new features and content. Keep that in mind when following this blog post. Some things may already have changed about the way Antidote installs or operates.

Install prerequisite software

Antidote requires that you install some prerequisite software before you try to run Antidote. I am changing the documented installation procedure a bit and installing KVM instead of VirtualBox so Antidote can use nested virtualization on my laptop. Then, I install Minikube and Kubectl so I can mimic a Kubernetes system on my laptop. Don’t worry; you do not need to know Kubernetes to use Antidote.

Check for virtualization support

Verify your computer has hardware support for virtualization. Enter the following command in your computer’s terminal.

grep -cw vmx /proc/cpuinfo

It should return a value equal to the number of virtual cores on your processor. If it returns 0, then something is wrong.

We need to use nested virtualization because we are running network nodes inside a VM created by Antidote. Check the nested virtualization settings on the host computer:

cat /sys/module/kvm_intel/parameters/nested

The output should be “Y”. If it is “N”, run the following commands to enable nested virtualization:

sudo modprobe -r kvm_intel
sudo modprobe kvm_intel nested=1

To make the change persistent after a reboot, write the option into the file /etc/modprobe.d/kvm.conf:

echo "options kvm_intel nested=1" | sudo tee /etc/modprobe.d/kvm.conf

Install virtualization software

Install KVM and Libvirt on your Linux system:

sudo apt update
sudo apt -y install libvirt-clients libvirt-daemon-system qemu-kvm
sudo systemctl enable libvirtd.service
sudo systemctl start libvirtd.service
sudo usermod -a -G libvirt $USER
newgrp libvirt

Docker KVM2 Driver

Install the KVM2 driver for Minikube, docker-machine-driver-kvm2:

curl -LO https://storage.googleapis.com/minikube/releases/latest/docker-machine-driver-kvm2
sudo install docker-machine-driver-kvm2 /usr/local/bin/
rm docker-machine-driver-kvm2

Docker (Optional)

I need Docker so I can build my own images for Antidote. If you do not plan to create new images, you do not need to install Docker on your PC so the following section is optional.

Install Docker using the following commands:

sudo apt -y install docker.io
sudo usermod -aG docker $USER
newgrp docker

Kubernetes

Antidote uses Kubernetes to orchestrate the containers in its network emulation labs. You can run Kubernetes on a laptop computer using Minikube and Kubectl.

Install Kubectl. The Antidote-selfmedicate instructions require version v1.13:

curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.13.0/bin/linux/amd64/kubectl
chmod +x ./kubectl
sudo cp ./kubectl /usr/local/bin
rm kubectl

Install Minikube. The Antidote-selfmedicate instructions require version v0.34.1:

curl -Lo minikube https://storage.googleapis.com/minikube/releases/v0.34.1/minikube-linux-amd64
chmod +x minikube
sudo cp minikube /usr/local/bin
rm minikube
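
You may want to confirm both tools are installed and on your path before continuing:

kubectl version --client
minikube version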

Download and install Antidote

Antidote requires you to create a lessons directory before you install it. Download the nrelabs-curriculum directory from the NRE Labs Github repository.

cd ~
git clone https://github.com/nre-learning/nrelabs-curriculum

Download the antidote-selfmedicate directory. You must install this directory at the same point in your filesystem as the nrelabs-curriculum directory. That is, each of these should be sub-directories of the same directory.

cd ~
git clone https://github.com/nre-learning/antidote-selfmedicate
cd antidote-selfmedicate

Change Selfmedicate hypervisor to KVM

Antidote-selfmedicate uses Minikube’s default hypervisor, VirtualBox. However, the latest version of VirtualBox supports nested virtualization only on AMD processors, and my laptop uses an Intel processor. On a Linux system, the KVM hypervisor supports nested virtualization on both Intel and AMD processors.

I will use the KVM hypervisor, which supports nested virtual network nodes inside the Antidote VM on my Linux laptop. When using nested KVM virtual machines, you should expect a performance penalty of less than ten percent compared to non-nested virtual machines.

To use KVM, I changed the Minikube start options in the selfmedicate.sh script using the command shown below:

sed -i 's/minikube start/minikube start --vm-driver kvm2/g' selfmedicate.sh
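
You can confirm the change by searching the script for the Minikube start options:

grep -n 'minikube start' selfmedicate.sh

The matching line should now include the --vm-driver kvm2 option.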

Optional step: Modify the size of the Antidote VM

I am using my 9-year old Lenovo T420 laptop, which only has two hyper-threaded CPU cores and 8 GB memory. It is not powerful enough to run the standard Antidote VM so I edit the selfmedicate.sh script to make the VM smaller. If you have a modern laptop with 16 GB of memory, you may skip this section.

After some experiments1, I found that running an idle Antidote VM with 4 GB of RAM, alongside the other tools I use, results in the general performance of my laptop staying in an acceptable range.

I also need to respect the KVM Performance Limits for virtual CPU cores. I reduced the VM requirements from 4 CPU and 8 GB of memory to 2 CPU and 4 GB of memory:

sed -i 's/--cpus 4 --memory 8192/--cpus 2 --memory 4096/g' selfmedicate.sh

This reduced configuration will not run the VQFX images that come pre-installed with Antidote, so I will not be able to experiment with the labs that require them. This is OK because I am not interested in working with commercial router images. Instead, I am planning to build and install my own images into Antidote and run open-source routers. Open-source routers require fewer resources.

Start Antidote

After I modified it to meet my requirements, I ran the selfmedicate.sh script to start Antidote:

$ cd ~/antidote-selfmedicate
$ ./selfmedicate.sh start

This takes a long time because it downloads some large files. Be patient. The output will look like the screenshot below:

After it is running, you may access the Antidote web interface running on your PC by opening a web browser and entering the following URL:

http://antidote-local:30001
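
If the antidote-local hostname does not resolve on your system, you may need an /etc/hosts entry that points it at the Minikube VM’s IP address. This is an assumption about your local setup; the selfmedicate.sh output tells you the exact address to use. For example:

echo "$(minikube ip)  antidote-local" | sudo tee -a /etc/hosts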

You should see a web page that looks and operates like the NRE Labs web page:

At this point, you can experiment with the existing NRE Labs lessons on your laptop. Remember, if you modified the size of the Minikube VM in the selfmedicate.sh script, you may not be able to run the lessons that use the VQFX routers.

Loading lessons

The selfmedicate.sh script does not load the lesson content into the Antidote VM when it starts. You must run the reload command to load in the lessons for the first time.

$ cd ~/antidote-selfmedicate
$ ./selfmedicate.sh reload

When you update lessons or add new lessons, you must run the reload command, again. Then, refresh your browser window to see the new lessons.

Stopping and restarting Antidote

You should not need to run the start option, again. From now on, only use the stop and resume options.

To stop the Antidote environment, run:

./selfmedicate.sh stop

To restart Antidote, run:

./selfmedicate.sh resume

If you use the start option, the selfmedicate.sh script will delete your existing Antidote VM and re-install everything. This is a good way to recover from a problem, but it is not something you need to do every time.

Creating labs for Antidote

In a future post, I’ll work through the process of creating a new lesson, using open-source router images instead of commercial router images. I still need to spend some time figuring this out.

At the time I wrote this post, the Antidote documentation does not cover much information about how to create new Antidote labs. The NRE Labs blog and videos on the NRE Labs Community Resources page provide some good information about building labs for Antidote. For example, read the post on the NRE Labs blog about how NRE Labs implements curriculum as code. You can also reverse-engineer existing lab lessons.

I found some useful information about making Docker images for Antidote on the OpenJNPR-Container-vMX project on GitHub and there’s also a VQFX image to reverse-engineer on the nre-learning GitHub page.

Conclusion

Antidote is a promising new platform upon which users can create classroom-style labs that are combined with documented procedures. It provides an easy-to-use web-based interface that makes it easy for students to log into the lab nodes using only a web browser. It is available as a free web service and can also be run on your laptop, as shown in this post.

Interestingly, Antidote is focused on the tools used to manage and support a network and uses the network nodes only as targets for management. So, it does not offer “traditional” tools that allow manipulation or inspection of the network, such as a network graph tool or Wireshark. You are expected to define the network nodes and their connections in a YAML file and then use a Utility VM to run various network automation tools.

Antidote seeks to make it easy for users to create “next generation” network lab scenarios, where the focus is more on automation and management, instead of on protocols and configuration.


  1. When I increased the Antidote VM size to 4 GB, 95% of my system’s memory was consumed because I also needed a web browser and a text editor running at the same time. I was using Firefox and VScode on Ubuntu 18.04 desktop. Maybe if I used lighter-weight tools like Midori and vim on an XFCE desktop, I could squeeze out enough space to run a larger Antidote VM. 

Create lab lessons for the NRE Labs Antidote network emulator


The Antidote network emulator, part of the Network Reliability Engineering project, offers a web interface that presents network emulation scenarios to users as documented lessons. Each lesson is presented in a window running Jupyter Notebooks and contains commands that the user can click to run on the virtual nodes in the network emulation scenario.

nrelabs lessons

The NRE Labs developers intend for Antidote to be used as an educational tool. Its lesson-focused user interface supports students’ learning progress. This post is a tutorial showing how to create and test two simple, but different, Antidote lessons.

Lab documentation

At the time I wrote this post, the Antidote documentation does not provide enough practical information about how to create new Antidote labs. However, useful information is spread around in a few different locations, which I list below:

Warning: Fast moving project!

Antidote is under active development. Its configuration procedures and operation may change significantly from one week to the next. If the procedures in this post don’t work, it is likely that the developers have changed some functionality.

For example, as I write this post, the developers are working on a change to the way nodes are defined in the lesson definition file. A longer-term view of the development plan is available in the Antidote development roadmap, as presented at Interop 2019.

Resume Antidote

I assume you previously installed and started Antidote, as described in my previous blog post about installing the Network Reliability Engineering Labs Antidote network emulator. Resume Antidote with the command shown below:

$ cd ~/antidote-selfmedicate
$ ./selfmedicate.sh resume

Open a web browser and go to the URL http://antidote-local:30001. You will see the Antidote web interface. Click on the Find Lesson Content button, as shown below:

If you followed the procedure from my previous post and also downloaded the nrelabs-curriculum files from the NRE Labs GitHub repository, you should see three categories in the lessons catalog, with lessons listed under each one.

Clicking on a lesson starts it. I showed an example of running an existing lesson in my previous post.

Explore the curriculum directory

NRE Labs creates network emulation scenarios — called lessons — based on text files and images published in a lesson directory. Each lesson is in its own directory.

Have a look at how the NRE Labs lessons are organized in the nrelabs-curriculum/lessons directory. List its contents and see that there are three subdirectories.

$ cd ~/nrelabs-curriculum/lessons
$ ls
fundamentals  tools     workflows

As you can see, the NRE Labs authors have categorized the lessons as fundamentals, tools, or workflows. These match the categories displayed on the Antidote web interface. When creating new lessons, you’ll have to use one of these directories. For now, explore an existing lesson.

Go to the fundamentals directory and explore the file structure of the lesson directories.

$ cd fundamentals
$ ls
lesson-14-yaml   lesson-17-git       lesson-22-python
lesson-16-jinja  lesson-19-restapis  lesson-23-linux
$ cd lesson-17-git
$ ls
lesson.meta.yaml stage1

See that the file that defines the lab topology for each lesson is named lesson.meta.yaml. You should also see one or more stage subdirectories, each of which contains a Markdown file named guide.md and may also include other subdirectories and files, such as node configuration files. Each stage changes the configuration of devices in the lab and builds a new web page that helps students learn about and interact with those devices.

For example, you can see the file structure of the Git lesson in the directory lesson-17-git:

$ tree lesson-17-git/
lesson-17-git/
├── lesson.meta.yaml
└── stage1
    └── guide.md

1 directory, 2 files

Create a simple lesson

We’ll start by creating a very simple lesson that starts just one node — a container running Linux — and helps the student to run a few commands on it.

Choose the category directory in which you will create the lesson and navigate to it. In my case, I will create a new lesson directory in the fundamentals directory.

$ cd ~/nrelabs-curriculum/lessons/fundamentals
$ mkdir lesson-100-frr
$ cd lesson-100-frr

Remember that every lesson has, at a minimum, a lesson definition file named lesson.meta.yaml and one or more lesson stage directories, each containing a Markdown file named guide.md.

Lesson definition file

Create and edit a lesson.meta.yaml file for the lesson.

$ nano lesson.meta.yaml

In this file, specify the network topology and the stages that compose the lesson. Use the YAML format when editing this file. The simplest lesson file you could create is something like below:

---
lessonName: Introduction to Lessons
lessonId: 100
category: fundamentals
tier: local
description: Demonstrate how to build a lesson in Network Reliability Labs Antidote Selfmedicate
slug: FRR

utilities:
- name: server1
  image: antidotelabs/utility

stages:
  - id: 1
    description: Create sample lab definition file

Pay attention to the indentation of the YAML file text. Information relating to a node must be indented further than its parent node.

The Antidote documentation lists some of the data elements that make up a lesson definition. The mandatory information that must be included in a lesson definition file is:

  • Lesson name
  • Lesson number (ID)
  • Tier. Must be one of three values:
    • local, ptr, or prod
  • Category. Must be one of three values:
    • fundamentals, tools, or workflows
  • Description
  • Slug
  • At least one node. May be one or more of the following types, with name and image (in Docker Hub format):
    • utilities, devices, or blackboxes
  • At least one stage, with a stage number and description

More elements may be added, such as tags, connections, lesson diagrams, videos, prerequisites, and more. I will cover some of these additional elements in the second lesson example, further below.

Save the lesson definition file and close the editor.

Lesson stages

Next, create one or more stages for the lab. Create a subdirectory for each stage of the lab:

$ mkdir stage1
$ cd stage1

Then, create a guide.md file in the stage directory that will display some instructions for the lab.

$ nano guide.md

The file is in Markdown format, with some embedded HTML tags for the run buttons. In this case, I’ll keep the guide very simple, as shown below.

# Introduction to Lessons
## Part 1 - The basic lessons file

Welcome to "Introduction to Lessons". This is a Markdown file so it is easy to create and edit.

## Part 2 - Running commands in devices

You can run commands on the devices you set up in the lab. Set up the commands (separated by line feeds) between the code ticks, as shown below. The HTML below the code runs it in the specified lab node.

```
hostname
ls
pwd
```
<button type="button" class="btn btn-primary btn-sm" onclick="runSnippetInTab('server1', this)">Run this snippet</button>

### Part 2.1

We can have subsections in each part. See in the code above where we run it in the node named "server1"?

As you can see above, the Antidote lesson allows you to run code listed in the guide.md file on the network nodes running in the lab. The file includes snippets of HTML that create run buttons that, when clicked, tell Antidote to run the commands (or code) on a specified node in the lab.

You can see how Antidote makes it easy to create lessons and to separate the lessons into multiple stages. If you want to re-use the images that already come bundled with Antidote, such as the Utility image and the VQFX image, you can create lessons just by creating new directories and editing a few text files.

Lesson validation

Run the syrctl validate command to validate the lesson definition and report any errors. See the Antidote Syringe documentation for more details about syrctl.

Currently, the NRE Labs developers suggest you pull the latest Syringe container from the antidotelabs repository on Docker Hub and run the syrctl validate command in that container.1 I installed the nrelabs-curriculum directory in my home directory so I will use the path ~/nrelabs-curriculum in the command below.

$ docker run \
    -v ~/nrelabs-curriculum:/antidote \
    antidotelabs/syringe \
    syrctl validate /antidote

You will see a line of output for every lesson in the nrelabs-curriculum directory. The line corresponding with the lesson we created looks like below:

time="2019-05-16T23:40:18Z" level=info msg="Successfully imported lesson 100: Introduction to Lessons --- BLACKBOX: 0, IFR: 0, UTILITY: 1, DEVICE: 0, CONNECTIONS: 0"

You can see the lesson passed. It has lesson ID 100 and it has one Utility node. If you see an error, the message should provide enough information for you to fix it.

Reload the curriculum

Run the reload command to copy the contents of the nrelabs-curriculum directory into the appropriate directory on the Minikube VM, as shown below:

$ cd ~/antidote-selfmedicate
$ ./selfmedicate.sh reload

Go back to the Antidote web interface at http://antidote-local:30001/advisor/index.html and refresh the page. You should see a new lesson named Introduction to Lessons in the Fundamentals section, as shown below:

nrelabs lessons

Add more stages

If you plan to submit lessons to NRE Labs, the developers recommend you split lessons up into stages so each stage takes less than five minutes to complete. To create another stage for our simple lesson, update the lesson definition file, lesson.meta.yaml, to add another stage. Then create a directory named stage2 and add a new guide.md file in it.

For example, I update the Introduction to Lessons lesson definition file to add a new stage:

$ cd ~/nrelabs-curriculum/lessons/fundamentals/lesson-100-frr
$ nano lesson.meta.yaml

Add a second stage so the stages section of the file looks like below:

stages:
  - id: 1
    description: Create sample lab definition file
  - id: 2
    description: Get the IP address of the node's interfaces

Save the file. Then, create a new stage directory:

$ mkdir stage2
$ cd stage2

Then, create a guide.md file in the stage directory that will display some instructions for the lab.

$ nano guide.md

The file is in Markdown format, with some embedded HTML tags for the run buttons. In this case, I’ll keep the guide very simple, as shown below.

# Adding another stage
## Part 3 - Get the IP addresses

Get the IP addresses of the management interface on *server1*:

```
ip addr show eth0
```
<button type="button" class="btn btn-primary btn-sm" onclick="runSnippetInTab('server1', this)">Run this snippet</button>

You should see text that looks like the following:

```
23: eth0@if24: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1376 qdisc noqueue state UP group default
     link/ether 2e:5a:4a:a3:15:80 brd ff:ff:ff:ff:ff:ff link-netnsid 0
     inet 10.32.0.7/12 brd 10.47.255.255 scope global eth0
         valid_lft forever preferred_lft forever
```

Save the file. Then, validate and reload the curriculum into the Antidote VM:

$ docker run \
    -v ~/nrelabs-curriculum:/antidote \
    antidotelabs/syringe \
    syrctl validate /antidote
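
Then reload the curriculum into the Antidote VM, using the same reload command as before:

$ cd ~/antidote-selfmedicate
$ ./selfmedicate.sh reload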

Go back to the lesson catalog in the web interface and refresh the page. Click on Introduction to Lessons and then select the second stage from the drop-down menu at the top left of the page.

nrelabs antidote

You can see the menu uses the stage description you entered in the lesson definition file. You now have two stages.

You may continue to add stages in this way, building the lesson one piece at a time, until you have a complete lesson.

Create a complex network lab lesson

Previously, I showed how to create the lesson definition file and the student guide for each stage of a simple lab that uses one node. Next, I will show how use the connections and devices lab endpoints to build a more complex lab topology consisting of three nodes connected together.

Currently, Antidote checks to see if we have a Device lab endpoint defined in the lab definition file. Antidote will only make connections from a Device endpoint to other endpoints. For example, at the time I wrote this post, Antidote can connect a Device endpoint to another Device endpoint or to other endpoint types, like the Utility endpoint, but it cannot connect two Utility endpoints together.

The Device endpoint requires a valid configuration file and, currently, Antidote only supports NAPALM for configuration. Until this changes, Device endpoints can only be Cisco, Arista, or Juniper VMs because other routers, including open-source routers, do not support NAPALM.

The Antidote developers intend to change this in a future release to give more flexibility to lesson developers. For now, to give you an example of how you can create a lab with multiple interconnected nodes, I will use the built-in VQFX container that comes with Antidote.

Create a lesson directory

As we did in the simple example, above, create a new directory for the lesson. In this case, I will create the new lesson in the fundamentals folder.

$ cd ~/nrelabs-curriculum/lessons/fundamentals/
$ mkdir lesson-200-cons

Plan the lab topology

First, decide on the lab topology. The only limit is the amount of CPU power and memory on the system running Antidote. In my case, I have a low-powered laptop, so I will create a topology with just one VQFX node and two Utility nodes. It will look like the diagram below.

This is a good opportunity to test how the Lab Diagram feature works in Antidote. Create a lab diagram using your favorite image tool and save it as a PNG file in the lesson directory.

Antidote will not read the lesson diagram from a local directory on your PC. You must place your lesson diagram image on a web server. I chose to store the lesson diagram as a GitHub Gist at the following URL:

https://gist.githubusercontent.com/blinklet/85611bf2b2ce09699f08848924c56b4e/raw/56f621885d018b8e29f954e337cf855f458a8396/lessondiagram.png

If you are planning to contribute a lesson to NRE Labs, then you probably are working from a fork of the nrelabs-curriculum repo so you can store your image in the lesson folder in the repo and use its raw GitHub URL.

Reusing existing labs

Since lesson files are just text files, it’s easy to “repurpose” the content from an existing lesson to build a new lesson. In this case, since I don’t know Juniper VQFX switches or NAPALM, I am just going to take the configurations from the existing ToDD lesson, which has a network topology similar to my planned lesson. I will change the router’s interface names and IP addresses to make the lab work.

Create the lab definition file

Create a new lesson definition file in the lesson directory:

$ cd lesson-200-cons
$ nano lesson.meta.yaml 

Enter the file contents as shown below. You can see we define the VQFX router node under the devices endpoint and we show how the nodes are connected under the connections endpoint.

---
lessonName: Lab connections
lessonId: 200
category: fundamentals
lessonDiagram: https://gist.githubusercontent.com/blinklet/85611bf2b2ce09699f08848924c56b4e/raw/56f621885d018b8e29f954e337cf855f458a8396/lessondiagram.png
tier: local
description: Connect three endpoints together.
slug: Networking

utilities:
- name: user1
  image: antidotelabs/utility
- name: server1
  image: antidotelabs/utility

devices:
- name: vqfx1
  image: antidotelabs/vqfx:snap1

connections:
- a: user1
  b: vqfx1
- a: vqfx1
  b: server1

stages:
  - id: 1
    description: Test the link

When each connection is created, it uses the next available interface on each connected node, taking the connections in the order they are listed in the lesson definition file. For example, in this lesson, the first available interface on the VQFX router, em3, will connect to the net1 interface on node user1. The VQFX’s second interface, em4, connects to the net1 interface on server1.

You should also see we added the lesson diagram URL to the lesson definition file.

Create the first stage of the lesson

Create the first stage directory, stage1. In that directory, create the guide.md file:

$ mkdir stage1
$ cd stage1
$ nano guide.md

Edit the guide.md file and add the following contents:

# Three nodes connected together

## Test network links

The interfaces on the *vqfx1* node are configured according to the config file you created. The interfaces on *user1* and *server1* are configured by Antidote, and Antidote seems to choose addresses from the `10.10.0.0/16` subnet for each interface.

Test that we have an actual connection between *user1* and *vqfx1* by running the ping command:

```
ping -c 1 10.10.0.100
```
<button type="button" class="btn btn-primary btn-sm" onclick="runSnippetInTab('user1', this)">Run on user1</button>

## Check ARP caches

Did we really ping along the connection we created between *user1* and *vqfx1*? Check the ARP cache on *user1* to be sure:

```
ip neigh
```
<button type="button" class="btn btn-primary btn-sm" onclick="runSnippetInTab('user1', this)">Run on user1</button>

You should see that the IP address 10.10.0.100 is reachable from interface *net1*.

## Test the other link

Now, test the connection between *server1* and *vqfx1*. Use the ping command, again:

```
ping -c 1 10.10.0.101
```
<button type="button" class="btn btn-primary btn-sm" onclick="runSnippetInTab('server1', this)">Run on server1</button>

## Check the router's ARP cache

It should also succeed. If it fails, it is probably because Antidote assigned the *same* IP address to the *net1* interface on both *user1* and *server1*. You can check for that by inspecting the ARP cache on the *vqfx1* router:

```
show arp
```
<button type="button" class="btn btn-primary btn-sm" onclick="runSnippetInTab('vqfx1', this)">Run on vqfx1</button>

Hopefully, you will not see duplicate IP addresses but, if you do, you know why the ping command failed from *server1*.

The Device configuration files

The device configurations are stored in a folder named configs in each stage directory. In this case, we have one device so, we’ll create only one configuration file.

The configuration file should have the same name as the device it configures and end with the .txt extension. In this example, the device is named vqfx1 so the file name will be vqfx1.txt.

Create the configs directory and the vqfx1.txt NAPALM configuration file:

$ mkdir configs
$ cd configs
$ nano vqfx1.txt

Enter the XML data for the router configuration into the file. I list the entire contents of the file, copied from the ToDD lesson configuration file with some modifications, in Appendix A at the end of this post.

Files and scripts in nodes

If you want to make some scripts or other files available on the nodes in your lesson lab, just put the files in your lesson directory. The entire curriculum is mapped as a volume into every container that runs in a lesson, so, just by having those files in that directory, you’ll have access to them from any node.

To test this, we’ll add a file named test.txt to the lesson directory and see if we can access it when we run the lesson:

$ cd ~/nrelabs-curriculum/lessons/fundamentals/lesson-200-cons
$ echo "this is a test" > test.txt

When you reload the curriculum, the files in the lesson folder are copied to the /antidote folder on each Utility endpoint in the lab.

Validate the lesson

Verify that the lesson is correctly defined. Run the syrctl validate command:

$ docker run \
    -v ~/nrelabs-curriculum:/antidote \
    antidotelabs/syringe \
    syrctl validate /antidote

In the output, the line for Lesson 200 looks like:

time="2019-05-28T01:25:53Z" level=info msg="Successfully imported lesson 200: Lab connections --- BLACKBOX: 0, IFR: 0, UTILITY: 2, DEVICE: 1, CONNECTIONS: 2"

Test the lesson

Run the lesson on Antidote. First, reload Antidote-selfmedicate:

$ cd ~/antidote-selfmedicate
$ ./selfmedicate.sh reload

Open the Antidote web interface at http://antidote-local:30001 and navigate to the lessons page. You should see the lesson named Lab connections at the bottom of the fundamentals section.

nrelabs lessons new

Click on the Lab connections lesson and wait about a minute for it to start up. You should see three nodes (tabs) in the terminal.

nrelabs

Click on the Lesson Diagram button and see if it displays the lesson diagram image correctly.

nrelabs lesson diagram

Click the Close diagram button to get back to the lesson lab.

Execute each of the code snippets in the lab’s Stage 1 and test that the lab can actually pass test traffic from user1 to vqfx1, along the connection you created between them.

You may encounter a problem if Antidote assigned the same IP address to the network interface on both user1 and server1. It seems to do that, sometimes. Also, Antidote assigns IP addresses on the Utility nodes from the same subnet, which means we cannot test end-to-end connectivity from user1 to server1. Unfortunately, we cannot manually configure the Utility node IP addresses because the Utility node’s antidote user does not have the privileges to configure IP interfaces.

Finally, see that the file we added to the lesson is available. On any lab node, go to the /antidote directory and list the contents. You should see the test.txt file there. For example:

antidote@server1:~$ cd /antidote
antidote@server1:/antidote$ ls
labdiagram.png  lesson.meta.yaml  stage1  test.txt

Observations about the complex lab

The lab example above is very limited because of the way Antidote handles lab endpoints. As I mentioned before, the Antidote developers plan to change the way labs work in Antidote to give lab developers more flexibility, so this situation should improve.

I observed the following things about the lab setup above:

  • I confirmed that, if you allocate only 4096 MB of RAM to the Antidote VM, the VQFX nodes will not work
  • I tried the same setup on a larger machine and discovered that the Utility nodes do not allow users to configure IP addresses. The antidote user cannot be given sudo privileges.
  • The interfaces created on user1 and server1 already have IP addresses configured, probably by Antidote, and sometimes they get the same IP address, which causes a problem.

For now, if you want to create a complex lab in Antidote, it looks like you’ll need to use the standard three-node VQFX topology used in lessons like Network Unit Testing with JSNAPY, and then just create lab stages that run commands on the Linux utility node.

Creating device images

Antidote allows you to create your own Docker images for lab endpoints such as devices and utilities. However, as mentioned above, Antidote does not currently support open-source devices as Device endpoints. I cannot create open-source routers to test in Antidote until the developers change the way Antidote endpoints work.

In addition, no documentation is available for creating images, yet.2 To learn how to build new images for Antidote, I found some useful information about making Docker images for Antidote at the OpenJNPR-Container-vMX project and you can try to reverse-engineer the NRE Labs VQFX container.

Conclusion

I’ve gone as far as I can with Antidote until the developers add support for configuring open-source devices and until I can create my own containers that enable sudo users. I also need to learn more about Docker before I can attempt to reverse-engineer how images are created for NRE Labs and Antidote.

I showed how to install Antidote on your own PC in my previous post, and in this post I showed how to create simple labs using Antidote. I hope you see that it is fairly easy to build lessons in Antidote, especially if you use its built-in images.

I did not cover how to contribute labs to the Antidote project and to NRE Labs. You can see the Antidote documentation for information about contributing lessons to Antidote.

Appendix A: VQFX NAPALM config file

The full contents of the vqfx1.txt file for the lab we created in our second example, above. This is a copy of the VQFX config in the ToDD lab. In fact, vqfx1 seems to be configured the same way in every NRE Labs lesson.

I modified the interface names and the IP addresses to match what Antidote automatically configures on the Utility nodes.

<configuration operation="replace">
        <version>15.1X53-D60.4</version>
        <system>
            <host-name>vqfx1</host-name>
            <root-authentication>
                <encrypted-password>$1$mlo32jo6$BOMVhmtORai2Kr24wRCCv1</encrypted-password>
                <ssh-rsa>
                    <name>ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA6NF8iallvQVp22WDkTkyrtvp9eWW6A8YVr+kz4TjGYe7gHzIw+niNltGEFHzD8+v1I2YJ6oXevct1YeS0o9HZyN1Q9qgCgzUFtdOKLv6IedplqoPkcmF0aYet2PkEDo3MlTBckFXPITAMzF8dJSIFo9D8HfdOV0IAdx4O7PtixWKn5y2hMNG0zQPyUecp4pzC6kivAIhyfHilFR61RGL+GPXQ2MWZWFYbAGjyiYJnAmCP3NOTd0jMZEnDkbUvxhMmBYSdETk1rRgm+R4LOzFUGaHqHDLKLX+FIPKcF96hrucXzcWyLbIbEgE98OHlnVYCzRdK8jlqm8tehUc9c9WhQ== vagrant insecure public key</name>
                </ssh-rsa>
            </root-authentication>
            <login>
                <user>
                    <name>antidote</name>
                    <class>super-user</class>
                    <authentication>
                        <encrypted-password>$1$iH4TNedH$3RKJbtDRO.N4Ua8B6LL/v/</encrypted-password>
                    </authentication>
                </user>
                <password>
                    <change-type>set-transitions</change-type>
                    <minimum-changes>0</minimum-changes>
                </password>
            </login>
            <services>
                <ssh>
                    <root-login>allow</root-login>
                </ssh>
                <netconf>
                    <ssh>
                    </ssh>
                    <rfc-compliant/>
                </netconf>
                <rest>
                    <http>
                        <port>8080</port>
                    </http>
                    <enable-explorer/>
                </rest>
            </services>
            <syslog>
                <user>
                    <name>*</name>
                    <contents>
                        <name>any</name>
                        <emergency/>
                    </contents>
                </user>
                <file>
                    <name>messages</name>
                    <contents>
                        <name>any</name>
                        <notice/>
                    </contents>
                    <contents>
                        <name>authorization</name>
                        <info/>
                    </contents>
                </file>
                <file>
                    <name>interactive-commands</name>
                    <contents>
                        <name>interactive-commands</name>
                        <any/>
                    </contents>
                </file>
            </syslog>
            <extensions>
                <providers>
                    <name>juniper</name>
                    <license-type>
                        <name>juniper</name>
                        <deployment-scope>commercial</deployment-scope>
                    </license-type>
                </providers>
                <providers>
                    <name>chef</name>
                    <license-type>
                        <name>juniper</name>
                        <deployment-scope>commercial</deployment-scope>
                    </license-type>
                </providers>
            </extensions>
        </system>
        <interfaces operation="replace">
            <interface>
                <name>em0</name>
                <unit>
                    <name>0</name>
                    <family>
                        <inet>
                            <address>
                                <name>{{ mgmt_addr }}</name>
                            </address>
                        </inet>
                    </family>
                </unit>
            </interface>
            <interface>
                <name>em3</name>
                <unit>
                    <name>0</name>
                    <family>
                        <inet>
                            <address>
                                <name>10.10.0.100/16</name>
                            </address>
                        </inet>
                    </family>
                </unit>
            </interface>
            <interface>
                <name>em4</name>
                <unit>
                    <name>0</name>
                    <family>
                        <inet>
                            <address>
                                <name>10.10.0.101/16</name>
                            </address>
                        </inet>
                    </family>
                </unit>
            </interface>
        </interfaces>
        <forwarding-options>
            <storm-control-profiles>
                <name>default</name>
                <all>
                </all>
            </storm-control-profiles>
        </forwarding-options>
        <routing-options>
            <autonomous-system>
                <as-number>64001</as-number>
            </autonomous-system>
        </routing-options>
        <protocols>
            <igmp-snooping>
                <vlan>
                    <name>default</name>
                </vlan>
            </igmp-snooping>

        </protocols>
        <vlans>
            <vlan>
                <name>default</name>
                <vlan-id>1</vlan-id>
            </vlan>
        </vlans>
</configuration>

  1. When NRE Labs development activity stabilizes, you would install the syrctl binary on your Linux system. You would just download the syringe-darwin-amd64.tar.gz archive from the NRE Labs Syringe repository on GitHub, extract the syrctl binary from the archive, make it executable, and copy it to a folder on your path, such as /usr/local/bin/. However, currently, the NRE Labs development team is actively updating the Antidote project, curriculum files, and the Syringe software. They do not re-compile a Syringe binary until they prepare an official release. Because we are cloning the Antidote and curriculum files from the NRE Lab projects master branch, we need to get the latest syrctl version, to be compatible. The latest version is available in the Syringe container. 

  2. If you find a docker image in a public repository that you want to try (and if it is compatible with Antidote), then you can create a new Dockerfile and start it with a FROM statement and use the Docker image you want to build off of. The main change you need to make is to set the userid and password. Any Docker image you create must have a userid of antidote and the password must be antidotepassword. These credentials are hard-coded into Antidote. See the Utility image dockerfile for an example of setting userid and password. 

Video chat about NRE Labs


Yesterday, I participated in a screen-cast with Derick Winkworth, aka @CloudToad, to discuss my blog posts about installing NRE Labs Antidote network emulator on your PC and creating lessons for NRE Labs. We also covered some general points like contributing to communities, how to get started blogging about technical topics, and more. Check it out, below:

This video, and other NRE Labs videos are available on YouTube. Also, the NRE Labs team runs a live screen-cast every Monday at 1:00 PM using the Discord app. Join the NRE Labs Discord channel and engage in the discussion.

Run a script on virtual machines when the host is shut down


I want to show you how to configure a host server so, when it is shut down, it executes a script that runs commands on any running virtual machines before the host tries to stop them. I will configure the host server to wait until the script completes configuring the virtual machines before continuing with the shutdown process, shutting down the virtual machines, and eventually powering off.

I had to learn how Systemd service unit configuration files work and some more details about how Libvirt is configured in different Linux distributions. Read on to see the solution, plus some details about how to test the solution in Ubuntu and CentOS.

Solution Summary

Install Libvirt and libguestfs-tools. Ensure that the libvirt-guests service is already started and enabled, and is configured appropriately. Create a new Systemd service named graceful-shutdown that runs a script when the host system shuts down, but before Libvirt shuts down any virtual machines.
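
Before creating the new service, you can quickly check that libvirt-guests is enabled and running:

# systemctl is-enabled libvirt-guests
# systemctl status libvirt-guests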

The graceful-shutdown.service unit configuration file

Create a new Systemd unit configuration file named graceful-shutdown.service and save it in the /etc/systemd/system directory, the recommended location for custom unit configuration files.

For example:

# vi /etc/systemd/system/graceful-shutdown.service

Enter the following text into the file, then save it:

[Unit]
Description=Graceful Shutdown Service
DefaultDependencies=no
Requires=libvirt-guests.service
After=libvirt-guests.service

[Service]
Type=oneshot
RemainAfterExit=true
ExecStop=/sbin/stoplab.sh

[Install]
WantedBy=multi-user.target

Then, enable and start the service with the commands:

# systemctl enable graceful-shutdown
# systemctl start graceful-shutdown

The graceful-shutdown service will cause Systemd to run a Linux shell script, /sbin/stoplab.sh before Systemd stops the libvirt-guests service. Because the service type is oneshot, Systemd will wait until the script exits before it continues with its standard shutdown process.

If you are not familiar with the way Systemd works when starting up a Linux system and when shutting it down, you may think that the order of execution in the unit file is backwards. However, it is correct. I will explain how to read service unit files when I test the solution later in this post.
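
You can confirm that Systemd recorded the dependency and ordering you specified by querying the unit’s properties:

# systemctl show graceful-shutdown -p Requires -p After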

The stoplab.sh script

The file you created will cause Systemd to run a Linux shell script, /sbin/stoplab.sh. You need to create that script yourself so it meets your own needs. In my case, the stoplab script runs commands that prepare the lab virtual machines before they are shut down. I will show an example of a simple stoplab script when I test the solution.

Naturally, the script must be executable:

# chmod +x /sbin/stoplab.sh

The libvirt-guests service

The graceful-shutdown service relies on the libvirt-guests service.

If the libvirt-guests service is not running, the graceful-shutdown service will not trigger the stoplab script, so the virtual machines will not be gracefully shut down. Without the libvirt-guests service, Libvirt will simply power off the virtual machines, which may corrupt processes or databases running on those virtual machines.

Ubuntu

Ubuntu will enable and start the libvirt-guests service when you install Libvirt. The libvirt-guests default configuration on Ubuntu works well for my needs. It runs the shutdown command on any running virtual machines when the host is shut down and it does not restart them when the host is started again.
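
If you want to confirm how libvirt-guests is configured on your Ubuntu system, inspect the relevant settings in its configuration file. On Ubuntu the file is /etc/default/libvirt-guests; lines that are commented out mean the built-in defaults apply:

# grep -E 'ON_BOOT|ON_SHUTDOWN|SHUTDOWN_TIMEOUT' /etc/default/libvirt-guests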

CentOS and Fedora

CentOS does not enable or start the libvirt-guests service when you install the Virtualization Host group package; you must enable and start the service yourself. In addition, the libvirt-guests default configuration on CentOS suspends running VMs instead of shutting them down, and automatically resumes them when the host starts again. This is not the behaviour I want, so I recommend editing the libvirt-guests configuration file, as shown in the sketch below.
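
A minimal sketch of the required steps, using the same edit and commands that appear in the CentOS test setup later in this post:

# sed -i 's/#ON_SHUTDOWN=suspend/ON_SHUTDOWN=shutdown/g' /etc/sysconfig/libvirt-guests
# systemctl enable libvirt-guests
# systemctl start libvirt-guests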

Verify the solution is working

To verify that the solution is working, review the Systemd log files after you shut down the host server and restart it. Use the journalctl command to read system logs. You can list all logs related to the graceful-shutdown service with the following command:

# journalctl -u graceful-shutdown

If you want to see only the logs from the previous boot session (the one that ended when you shut down the system and generated the logs you want to view), use the following command:

# journalctl -b -1 -u graceful-shutdown

Persistent logs in CentOS

The above commands work in Ubuntu. However, CentOS does not enable persistent logs by default so you will lose the shutdown logs unless you set up a persistent log file.

In CentOS, run the following commands to enable a persistent log file.

# mkdir /var/log/journal
# systemd-tmpfiles --create --prefix /var/log/journal
# systemctl restart systemd-journald

Then shut down the host, restart it, and run the journalctl command to view the logs related to the graceful-shutdown service.

Summary conclusion

I covered the basics of the solution for those who may already be familiar with Systemd and with Libvirt. If you would like to know more about why I created this solution and how it works, please read on.

Graceful shutdown use case

I build open-source network emulation labs using cloud computing service providers, like Google Cloud or Microsoft Azure, that support nested virtualization on cloud instances. Each lab consists of interconnected nested virtual machines running on a single instance. Network emulation labs require a large amount of resources and can be costly to run.

To keep costs low, I ask lab users to stop their cloud instances when they are not actively using them. I want users, some of whom are beginners, to be able to stop the instances simply, by pushing a button on a web portal, or I configure the cloud instances to shut down at a scheduled time. To avoid corrupting virtual machines or losing virtual network node configurations when the instance stops, I created scripts that start when the host VM begins shutting down and that run commands on the nested virtual machines before they are shut down.

The graceful shutdown service I describe and test in this post enables me to save virtual network node configurations, avoid database corruption on virtual network appliances, and to more quickly shut down virtualized commercial network nodes that do not respond to Libvirt’s normal shutdown commands.

Save configurations

Some open-source and commercial networking operating systems allow users to make changes to a running configuration in memory without saving the changes to a persistent storage, like the system’s disk. Users must take an extra step to save the configuration. The stoplab script can run commands that log into each running virtual network node and run the appropriate commands to save running configurations to disk. This will ensure the configurations are recovered when the system starts again.
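
As a sketch of what such a command might look like for a node running the FRR routing software (the address and credentials are hypothetical; the sshpass approach matches the example stoplab.sh script shown later in this post):

sshpass -proot \
    ssh -o StrictHostKeyChecking=no \
    root@192.168.122.101 \
    "vtysh -c 'write memory'"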

Stop databases

Before shutting down a virtual network appliance that is running a database, you should first stop the database to ensure it does not become corrupted during the shut down process. Most of the network appliances I work with come with a set of procedures required to shut down the system safely. If you have similar systems running in your virtual lab, read their documentation and integrate those procedures into your stoplab script.

Shut down virtualized commercial network appliances

The libvirt-guests service uses the virsh shutdown command, which requires that the guest operating system is configured to handle ACPI shut down requests. Commercial networking equipment operating systems do not always support ACPI shutdown requests, or may require additional configuration to accept ACPI shut down requests. Instead of allowing the libvirt-guests service to shut down the virtual network node running a commercial networking operating system, you may need to run the virsh destroy command in the stoplab script to stop the virtual network node. This will help your server to shut down more quickly.
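
For example, the stoplab script could force off a node that ignores ACPI shutdown requests (the domain name vmx01 is hypothetical):

# virsh destroy vmx01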

Testing the solution

To test the graceful-shutdown service, you must first set up a Linux system with Libvirt and KVM, plus some additional virtualization tools. I describe how to set up a network emulation lab using Libvirt in a previous post. Below, I show how to set up a simple test using one nested virtual machine on both Ubuntu and CentOS.

Create a guest VM in Ubuntu

If you are using an Ubuntu system, run the following commands to install the virtualization software and set up a virtual machine.

$ sudo apt install -y qemu-kvm \
    libvirt-daemon libvirt-daemon-system \
    bridge-utils libguestfs-tools \
    virt-manager libnss-libvirt virt-top \
    libosinfo-bin sshpass expect

Set up groups so you can use Libvirt without root permissions:

$ newgrp libvirt

Then, to allow libguestfs-tools to work, change the permissions on the Linux kernel so other users can read the kernel file1.

$ sudo chmod 0644 /boot/vmlinuz*

On your Ubuntu host server, run the following commands to quickly set up a simple virtual machine that we will use to test the graceful-shutdown service.

$ mkdir ~/images
$ virt-builder centos-7.6 \
    --output ~/images/testvm01.qcow2 \
    --arch x86_64 \
    --format qcow2 \
    --size 6G \
    --hostname testvm01 \
    --root-password password:root
$ virt-install --import --name testvm01 \
  --virt-type=kvm \
  --ram 2048 \
  --vcpus 2 \
  --disk path=~/images/testvm01.qcow2,bus=virtio,format=qcow2 \
  --os-type linux \
  --os-variant rhel7.6\
  --network bridge=virbr0 \
  --graphics none

This creates and starts a new VM named testvm01 and opens a console to the VM. Exit the console by pressing the Ctrl-] key combination.

testvm01 login:
CTRL - ]
Domain creation completed.
$

Set the new VM to autostart when the host system starts:

$ virsh autostart testvm01

Create a guest VM in CentOS

When using CentOS, I’ve found that VMs must be created by the root user in order for this solution to work.

Only Libvirt-managed VMs visible to the root user can be shut down by the libvirt-guests service during system shutdown. This may be an issue with the way CentOS handles Libvirt URIs.2 VMs created by a normal user could not be seen by root and, for some reason, were shut down immediately when the host shut down, instead of waiting for the graceful-shutdown and libvirt-guests services to stop, in that order. The root user is unable to run libguestfs-tools on VMs unless they are in the default Libvirt images folder, /var/lib/libvirt/images, due to a bug in CentOS that looks like it will never be fixed.

Install the virtualization software in CentOS.

$ sudo su
# yum -y install epel-release
# yum -y update
# yum groups mark convert
# yum -y group install "Virtualization Host"
# yum -y group install "Virtualization client"
# yum -y install libguestfs-tools
# systemctl enable libvirtd
# systemctl start libvirtd
# yum -y install expect openssl sshpass

Edit the /etc/sysconfig/libvirt-guests file to configure it to shut down guest VMs when the host shuts down.

# sed -i 's/#ON_SHUTDOWN=suspend/ON_SHUTDOWN=shutdown/g' \
    /etc/sysconfig/libvirt-guests
# sed -i 's/#SHUTDOWN_TIMEOUT=300/SHUTDOWN_TIMEOUT=240/g' \
    /etc/sysconfig/libvirt-guests
# sed -i 's/#ON_BOOT=start/ON_BOOT=ignore/g' \
    /etc/sysconfig/libvirt-guests

We set the timeout to less than five minutes because most cloud services enforce a shutdown timeout of about five minutes; all scripts must complete before then or the cloud service will force the cloud instance to power off anyway.

Then, start the libvirt-guests service:

# systemctl enable libvirt-guests
# systemctl start libvirt-guests

Since we will use Libvirt as root, we do not need to modify the groups for any users.

On your CentOS host server, run the following commands to quickly set up a simple virtual machine that we will use to test the graceful-shutdown service.

# virt-builder centos-7.6 \
    --output /var/lib/libvirt/images/testvm01.qcow2 \
    --arch x86_64 \
    --format qcow2 \
    --size 6G \
    --hostname testvm01 \
    --root-password password:root
# virt-install --import --name testvm01 \
  --virt-type=kvm \
  --ram 2048 \
  --vcpus 2 \
  --disk path=/var/lib/libvirt/images/testvm01.qcow2,bus=virtio,format=qcow2 \
  --os-type linux \
  --os-variant rhel7.6\
  --network bridge=virbr0 \
  --graphics none

This creates and starts a new VM named testvm01 and opens a console to the VM. Exit the console by pressing the Ctrl-] key combination.

testvm01 login:
CTRL - ]
Domain creation completed.
#

Set the new VM to autostart when the host system starts:

# virsh autostart testvm01

The stoplab.sh script

To test the graceful-shutdown service, create a stoplab.sh script that connects to the VM and runs some commands on it when the host system shuts down. I chose to name the script stoplab.sh and place it in the /sbin directory.

$ sudo vi /sbin/stoplab.sh

An example of the script is shown below. It shows two ways to send commands to a guest VM from a script running on the host. In each case, it writes the time to a file, waits 40 seconds, and writes the time again. This will show that Systemd waits until all scripts complete on the VM before continuing to shut down the host.

#!/usr/bin/env bash
# Gracefully shut down lab VM

rm -f /root/.ssh/known_hosts
echo "Starting lab server shutdown"
date

# One way to send commands to the VM

expect -c "
    set timeout 120
    spawn ssh -o StrictHostKeyChecking=no root@192.168.122.81
    expect \"\*\?assword\" { send \"root\r\" }
    expect \"# \" { send \"echo 'some commands' \>\> /root/example.txt\r\" }
    expect \"# \" { send \"date \>\> /root/example.txt\r\" }
    expect \"# \" { send \"sleep 40\r\" }
    expect \"# \" { send \"date \>\> /root/example.txt\r\" }
    expect \"# \" { send \"exit\r\" }
    expect eof
"

# Another way to send commands to the VM

sshpass -proot \
    ssh -o StrictHostKeyChecking=no \
    root@192.168.122.81 \
    "echo 'some more commands' >> /root/example.txt"
sshpass -proot \
    ssh -o StrictHostKeyChecking=no \
    root@192.168.122.81 \
    "date >> /root/example.txt"
sleep 40
sshpass -proot \
    ssh -o StrictHostKeyChecking=no \
    root@192.168.122.81 \
    "date >> /root/example.txt"
sshpass -proot \
    ssh -o StrictHostKeyChecking=no \
    root@192.168.122.81 \
    "shutdown -h now"

# Shut down the VM

virsh shutdown testvm01

# Finish and clean up

echo "Finished lab server shutdown"
date
rm -f /root/.ssh/known_hosts
exit 0

Remember to make it executable:

$ sudo chmod +x /sbin/stoplab.sh

The Systemd unit configuration file

Systemd is the initialization system used by most Linux distributions. Immediately after a Linux system boots, Systemd begins the process of starting all the programs and servers that create a running Linux system. Systemd follows instructions in the Systemd unit configuration files. Each Linux distribution may arrange these files a bit differently and administrators may modify them or create their own unit configuration files to define custom services.

We previously created a custom service by creating the file /etc/systemd/system/graceful-shutdown.service. Here, I will walk through the information in the file to show how Systemd interprets it.

The Unit section

The first section of the file is the Unit section. It tells Systemd which other services the graceful-shutdown service depends on and the order in which this service will start (or stop) in relation to other Systemd services or targets.

[Unit]
Description=Graceful Shutdown Service
DefaultDependencies=no
Requires=libvirt-guests.service
After=libvirt-guests.service

In the Unit section shown above, we see that the graceful-shutdown service requires the libvirt-guests service to be running before it will start, and that it will start after the libvirt-guests service starts.

In our case, we are more interested in what happens during the system shutdown process, when Systemd stops services. When services are being stopped, Systemd goes through them in reverse order. This means that, when we tell Systemd to start the graceful-shutdown service after the libvirt-guests service, Systemd will do the reverse during shutdown: it will ensure the graceful-shutdown service executes its stop commands and exits before the libvirt-guests service executes its stop commands.

Note that we need both an After line and a Requires line in the unit file: the Requires line creates the dependency and the After line sets the ordering.
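
If you want to double-check how Systemd interpreted these directives, you can ask it to print the dependency and ordering properties of the service. This is only a sanity check and assumes the graceful-shutdown service has already been created and loaded.

$ systemctl show graceful-shutdown.service -p Requires -p After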

The Service section

The second section of the file is the Service section. This contains options that define the type of service and the commands that will run to start the programs or perform the actions related to the service. The file I created has the following options:

[Service]
Type=oneshot
RemainAfterExit=true
ExecStop=/sbin/stoplab.sh

In this example, the service type is oneshot, which will run a command when it starts and another command when it stops. In our example, we do not run any command when the service starts and we want the service to remain active so it will run a command when it is stopped during the shutdown process. So, we add in the option, RemainAfterExit=true. Finally, we define the command that will run when the service stops in the ExecStop option.

The last section of the unit file is the Install section, which tells Systemd what to do with the service when you enable it.

[Install]
WantedBy=multi-user.target

In this case, when you enable the service, Systemd will install a symbolic link in the /etc/systemd/system/multi-user.target.wants directory, which will point to the unit file associated with the service. This adds the graceful-shutdown service into the chain of files that will be parsed by Systemd when the system starts up or shuts down.
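
You can see this symbolic link for yourself after you enable the service. For example:

$ sudo systemctl enable graceful-shutdown.service
$ ls -l /etc/systemd/system/multi-user.target.wants/graceful-shutdown.service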

Test the graceful-shutdown service

We created the graceful-shutdown service, enabled it, and started it. We created the /sbin/stoplab.sh script and made it executable. We installed the virtualization software and ensured that the libvirt-guests service was configured and enabled.

If you are using CentOS, remember to enable persistent system logs:

$ sudo mkdir /var/log/journal
$ sudo systemd-tmpfiles --create --prefix /var/log/journal
$ sudo systemctl restart systemd-journald

Now, we can shut down the host system and see how the graceful-shutdown service works. If you created your Linux system on a cloud service, go to the user portal and stop the cloud instance. After it finishes stopping, start it again.

If you are testing on a local PC, use the operating system’s restart command.

$ sudo reboot

After the system finishes restarting, login and check the system logs:

$ sudo journalctl -b -1 -u graceful-shutdown
-- Logs begin at Wed 2019-09-25 22:43:51 UTC, end at Fri 2019-09-27 16:43:24 UTC. --
Sep 27 15:10:17 ubuntu systemd[1]: Started Gracefully stop running labs.
Sep 27 15:19:55 ubuntu systemd[1]: Stopping Gracefully stop running labs...
Sep 27 15:19:55 ubuntu stoplab.sh[2738]: Starting lab server shutdown
Sep 27 15:19:55 ubuntu stoplab.sh[2738]: Fri Sep 27 15:19:55 UTC 2019
Sep 27 15:19:55 ubuntu stoplab.sh[2738]: spawn ssh -o StrictHostKeyChecking=no root@192.168.122.81
Sep 27 15:19:56 ubuntu stoplab.sh[2738]: Warning: Permanently added '192.168.122.81' (ECDSA) to the list of known
Sep 27 15:19:56 ubuntu stoplab.sh[2738]: root@192.168.122.81's password:
Sep 27 15:19:57 ubuntu stoplab.sh[2738]: [root@testvm01 ~]# echo 'some commands' >> /root/example.txt
Sep 27 15:19:57 ubuntu stoplab.sh[2738]: [root@testvm01 ~]# date >> /root/example.txt
Sep 27 15:19:57 ubuntu stoplab.sh[2738]: [root@testvm01 ~]# sleep 40
Sep 27 15:20:37 ubuntu stoplab.sh[2738]: [root@testvm01 ~]# date >> /root/example.txt
Sep 27 15:20:37 ubuntu stoplab.sh[2738]: [root@testvm01 ~]# exit
Sep 27 15:20:37 ubuntu stoplab.sh[2738]: logout
Sep 27 15:20:37 ubuntu stoplab.sh[2738]: Connection to 192.168.122.81 closed.
Sep 27 15:21:19 ubuntu stoplab.sh[2738]: Connection to 192.168.122.81 closed by remote host.
Sep 27 15:21:19 ubuntu stoplab.sh[2738]: Domain testvm01 is being shutdown
Sep 27 15:21:19 ubuntu stoplab.sh[2738]: Finished lab server shutdown
Sep 27 15:21:19 ubuntu stoplab.sh[2738]: Fri Sep 27 15:21:19 UTC 2019
Sep 27 15:21:19 ubuntu systemd[1]: graceful-shutdown.service: Succeeded.
Sep 27 15:21:19 ubuntu systemd[1]: Stopped Gracefully stop running labs.

You can see that the graceful-shutdown service ran the stoplab.sh script and sent the output from the script to the system logs.

The testvm01 VM should be running. Log into the testvm01 VM and verify that the script successfully wrote to a file on the VM before it was shut down.

$ ssh root@192.168.122.81
root@192.168.122.81's password:
[root@testvm01 ~]# cat example.txt
some commands
some more commands
some commands
Fri Sep 27 10:32:53 EDT 2019
Fri Sep 27 10:33:33 EDT 2019
some more commands
Fri Sep 27 10:33:34 EDT 2019
Fri Sep 27 10:34:14 EDT 2019
[root@testvm01 ~]#

You can see that the script waited 40 seconds each time before it wrote the dates to the file. This proves the graceful-shutdown service made Systemd wait until the script completed before continuing with the shutdown process.

Conclusion

I showed how to create a custom Systemd service that, when the host server shuts down, will ensure any running VMs are first gracefully shut down by a custom shutdown script.


  1. The libguestfs-tools used in this example require read access to the host's Linux kernel file, but the Ubuntu developers decided to make the Linux kernel readable only by the root user. Canonical says they did this to improve security (https://bugs.launchpad.net/fuel/+bug/1467579), but others strongly disagree with them (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/759725). The libguestfs tools install guide refutes Canonical on this point (http://manpages.ubuntu.com/manpages/bionic/man1/guestfs-faq.1.html), so I set the Linux kernel file to be readable by all users on my host system, instead of running virtualization tools with root privileges. I suggest you do the same.

  2. Also see "First few steps with libvirt" in How to get started with libvirt on Linux (http://rabexc.org/posts/how-to-get-started-with-libvirt-on?tag=Technology) by @rabexc (https://twitter.com/rabexc).

Install Azure CLI on your Android Phone


I installed the Azure CLI in the Termux app on my Android phone. This post describes all the steps required to successfully run Azure CLI on most Android phones.

Installing Azure CLI on Termux on your Android phone is an alternative to using Azure Cloud Shell on Chrome or Firefox, or to using the Cloud Shell feature on the Azure mobile app. It’s also a cool thing to try.

This post is based on the excellent work done by Matthew Emes, who wrote a blog post about installing Azure CLI on a Chromebook. Matthew’s procedure got me started, but I had to modify it to make Azure CLI work in Termux on my Android phone. Also, Azure CLI has changed since Matthew wrote about it and some of his steps, while they still work, are no longer necessary.

Termux

Install Termux on your Android phone. Termux is a terminal emulator and Linux environment that runs on most Android devices with no rooting or setup required. You can use Termux as a terminal emulator to manage remote systems and it will run a large number of Linux utilities and programming languages directly on your phone. Install it from the Google Play store.

Install required packages

The Azure CLI requires the Python language, a C compiler, the make utility, and some encryption libraries to be installed. In addition, I recommend installing OpenSSH and the nano text editor.

Open Termux on your phone. Install the following packages in Termux using the built-in package manager, pkg. You may also use apt, if you prefer.

$ pkg update
$ pkg install openssl libffi python clang make
$ pkg install openssh nano

Use the Python virtual environments functionality to create a separate Python instance and directory in which Azure CLI will run, avoiding locked-down directories in Android. Install Python’s virtual environment package.

$ pip install --user virtualenv

You will see a warning stating that the virtualenv package is not in the path. Add it to the path with the following commands:

$ PATH=$PATH:~/.local/bin
$ export PATH

Make the path change permanent by creating a .bashrc file in your home directory:

$ nano ~/.bashrc

Add the path commands to the .bashrc file:

PATH=$PATH:~/.local/bin
export PATH

Save the file and exit. Read the .bashrc file:

$ source ~/.bashrc

Create a Python virtual environment for the Azure CLI scripts. This prevents Python from trying to install Azure CLI in protected directories, and failing to install.

$ virtualenv ~/.local/lib/azure-cli
$ cd ~/.local/lib/azure-cli
$ source ./bin/activate
(azure-cli) $

Install Azure CLI in the virtual environment:

(azure-cli) $ pip install cffi
(azure-cli) $ pip install azure-cli

Be patient. It takes a very long time to install the azure-cli package because it is compiling some of the dependencies. It took about twenty minutes to install on my phone.

Optionally, you may create a requirements.txt file to support upgrading Azure CLI in the future:

(azure-cli) $ pip freeze > requirements.txt
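
If you create the requirements.txt file, one way to use it later is to upgrade every package it lists while the virtual environment is active. This is optional; the upgrade procedure at the end of this post works without it.

(azure-cli) $ pip install --upgrade -r requirements.txt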

Test Azure CLI in Termux

Test the Azure CLI. Log in to your Azure subscription:

(azure-cli) $ az login

Follow the instructions displayed on the terminal. You will probably have to open the Chrome web browser on your phone, go to https://microsoft.com/devicelogin, and enter the code shown on the terminal screen. After the login succeeds in the browser, return to Termux.

If the login page hangs after you select your account, check which browser you are using. At the time I wrote this post, the device login page did not work in Firefox, so you may need to use Chrome.

After the login process is complete, run a test command. For example, list your resources in Azure. In the example below, I list resources in a resource group named test-group.

(azure-cli) $ az resource list -g test-group -o table

Bash wrapper for Azure CLI

You need to run Azure CLI from the terminal and in shell scripts. Create a bash script that runs Azure CLI using the Python instance you installed in the virtual environment.

Get the path of the Python that runs in the virtual environment, so you can use it later:

(azure-cli) $ which python
/data/data/com.termux/files/home/.local/lib/azure-cli/bin/python

Copy the full path to the clipboard.

Quit the virtual environment:

(azure-cli) $ deactivate

Create a shell script named az, which will serve as an alias for running the Azure CLI in the Python virtual environment you previously created.

$ nano ~/.local/bin/az

Enter the following into the file. The last line includes the Python path you previously copied, which you can paste in.

#!/usr/bin/env bash
/data/data/com.termux/files/home/.local/lib/azure-cli/bin/python -m azure.cli "$@"

Make the command executable:

$ chmod +x ~/.local/bin/az

Test the command:

$ cd ~
$ az account list -o table

You should see a list of your available Azure subscriptions.

Now, logout:

$ az logout

Enable Bash command completion for Azure CLI

It is convenient to enable bash command completion for Azure CLI commands. This is an optional configuration and you may skip it, if you wish.

The Python argcomplete package is already installed as part of Azure CLI. However, in its default configuration, it will not work for Azure CLI on Android because you do not have write access to the /etc/bash_completion.d directory. Activate Azure CLI bash command completion for your user by running the following command:

$ eval "$(~/.local/lib/azure-cli/bin/register-python-argcomplete az)"

To make it work all the time, add the command to your .bashrc file:

$ nano ~/.bashrc

In the file, add the eval command. See the last line I added in the script, below:

PATH=$PATH:~/.local/bin
export PATH
eval "$(~/.local/lib/azure-cli/bin/register-python-argcomplete az)"

Save the file and read the .bashrc file again:

$ source ~/.bashrc

Upgrading Azure CLI

You may occasionally want to upgrade Azure CLI. To upgrade, activate the virtual environment and use pip:

$ cd ~/.local/lib/azure-cli
$ source ./bin/activate
(azure-cli) $ pip install --upgrade azure-cli
(azure-cli) $ deactivate
$

Conclusion

You now have the ability to run Azure CLI commands from your Android phone. This can be useful if you build some shell scripts based on Azure CLI to accomplish repetitive tasks in Azure, or to accomplish a task that is not supported by the Azure web app.
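
For example, the short script sketched below uses Azure CLI to deallocate every running VM in one resource group, which is the kind of repetitive task that is handy to automate from a phone. The resource group name my-lab-rg is only a placeholder for this illustration.

#!/usr/bin/env bash
# Sketch: deallocate all VMs in one resource group.
# "my-lab-rg" is a placeholder; replace it with your own resource group name.
RESOURCE_GROUP="my-lab-rg"

# List the VM names in the resource group, then deallocate each one
for VM in $(az vm list -g "$RESOURCE_GROUP" --query "[].name" -o tsv); do
    echo "Deallocating ${VM} ..."
    az vm deallocate -g "$RESOURCE_GROUP" -n "$VM"
done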


Run the Antidote network emulator on KVM for better performance


Antidote is the network emulator that runs the labs on the Network Reliability Labs web site. You may install a standalone version of Antidote on your personal computer using the Vagrant virtual environment provisioning tool.

In this post, I show you how to run Antidote on a Linux system with KVM, instead of VirtualBox, on your local PC to achieve better performance — especially on older hardware.

Why use KVM instead of VirtualBox?

Antidote runs emulated network nodes inside a host virtual machine. If these emulated nodes must also run on a hypervisor, as most commercial router images require, then they are running as nested virtual machines inside the host virtual machine. Unless you can pass through your computer’s hardware support for virtualization to the nested virtual machines, they will run slowly.

VirtualBox offers only limited support for nested virtualization. If you are using a Linux system, you can get better performance if you use Libvirt and KVM, which provide native support for nested virtualization.

When to use VirtualBox

If you plan to run Antidote on a Mac or a PC, you should use Antidote’s standard installation with VirtualBox1. Vagrant and VirtualBox are both cross-platform, open-source tools.

The VirtualBox developers plan to improve support for nested virtualization. See the VirtualBox 6.0 changelog and the VirtualBox 6.1 beta release notes for more information. However, the planned VirtualBox nested virtualization features require a 5th generation Core i5 or i7 Broadwell processor, or newer, or an AMD processor. Older hardware, like my Lenovo T420 laptop, which uses a 2nd generation Intel Core i5 Sandy Bridge processor, will not benefit from those improvements.

Check your PC’s virtualization support

Before you start using KVM and Libvirt, verify that your Linux computer has hardware support for virtualization. Enter the following command in your computer's terminal.

$ grep -cw vmx /proc/cpuinfo

It should return a value equal to the number of virtual cores on your processor. If it returns 0, then something is wrong. You may need to change your PC’s BIOS or EFI settings.

Check the nested virtualization settings on the host computer:

$ cat /sys/module/kvm_intel/parameters/nested

The output should be “Y”. If it is “N”, run the following commands to enable nested virtualization:

$ sudo modprobe -r kvm_intel
$ sudo modprobe kvm_intel nested=1

To make the changes persistent after a reboot, add a line with the text, options kvm_intel nested=1, to the file /etc/modprobe.d/kvm.conf, as shown below.

$ echo "options kvm_intel nested=1" | sudo tee /etc/modprobe.d/kvm.conf

Install software

On a Linux system running Ubuntu 18.04, install the software that Antidote relies upon. If you are using a different Linux distribution or version, this same procedure should still work. You may need to modify it a bit if you are using different package managers and different software repositories, but you should be able to find the necessary adaptations with a few Internet searches.

Install the KVM hypervisor, an NFS server, Libvirt, and Vagrant. Vagrant does not natively support Libvirt, so you must also install the vagrant-libvirt plugin.

Libvirt

Install libvirt using the following commands:

$ sudo apt update
$ sudo apt -y install libvirt-clients libvirt-daemon-system qemu-kvm
$ newgrp libvirt

Utilities

Install additional utilities, if they are not already installed:

$ sudo apt install -y git curl

Vagrant

Install the latest version of Vagrant. Go to the Vagrant web site and download the Vagrant package, or use the curl command below. Replace the file name in the below example with the latest version. Then, install it.

$ cd ~/Downloads
$ curl -O https://releases.hashicorp.com/vagrant/2.2.6/vagrant_2.2.6_x86_64.deb
$ sudo dpkg -i ~/Downloads/vagrant_2.2.6_x86_64.deb

NFS server

The Antidote project uses NFS to transfer files from the host computer to the Antidote VM. Install an NFS server, if it is not already installed.

$ sudo apt install nfs-kernel-server

Vagrant-libvirt plugin

Enable the source code repositories so you can build the vagrant-libvirt plugin. Edit the file /etc/apt/sources.list and uncomment all the lines that start with deb-src, as shown below:

$ sudo sed -i 's/# deb-src/deb-src/g' /etc/apt/sources.list
$ sudo apt update

Install the build dependencies for the vagrant-libvirt plugin:

$ sudo apt build-dep -y vagrant ruby-libvirt
$ sudo apt install -y qemu libvirt-bin ebtables dnsmasq-base
$ sudo apt install -y libxslt-dev libxml2-dev libvirt-dev zlib1g-dev ruby-dev

Install the plugin. The installation process will build it from the source code, so it may take a few minutes to complete:

$ vagrant plugin install vagrant-libvirt

Host updater plugin

Install the other Vagrant plugins. The Antidote Vagrant install guide also requires the vagrant-hostsupdater plugin. Note that you should not install the other plugins mentioned in the Antidote documentation, because those plugins support VirtualBox, which we are not using.

$ vagrant plugin install vagrant-hostsupdater

Login to Vagrantcloud

Finally, create an account on vagrantcloud.com. You need access to vagrantcloud.com to download the Vagrant box specified in Antidote’s Vagrantfile.

Vagrant cloud web site login page

After you create an account, log in to Vagrant with the following command:

$ vagrant cloud auth login

Enter your Vagrantcloud userid and password.

Download and install Antidote

Antidote requires you to create a lessons directory before you install it. Download the nrelabs-curriculum directory from the NRE Labs GitHub repository.

$ mkdir ~/antidote-local
$ cd ~/antidote-local
$ git clone https://github.com/nre-learning/nrelabs-curriculum

Download the antidote-selfmedicate directory. Unless you specify a new location in the antidote-config.yml configuration file (see below), you must install this directory at the same point in your filesystem as the nrelabs-curriculum directory. That is, each of these should be sub-directories of the same directory.

$ git clone https://github.com/nre-learning/antidote-selfmedicate

Configure Antidote

Antidote has some default values pre-configured that tell it which Vagrant provider to use and how many resources the Minikube VM should consume. You may modify any of these values by editing the antidote-config.yml file in the antidote-selfmedicate directory.

$ cd antidote-selfmedicate
$ nano antidote-config.yml

Edit the file and enter in values appropriate for your own situation. For my system I chose the following configurations:

vm_config:
  memory: 4096
  cores: 2
  provider: libvirt

Save the file. Below, I describe each line in the file.

Memory

As I mentioned earlier, I am using my 9-year-old Lenovo T420 laptop, which has only two hyper-threaded CPU cores and 8 GB of memory. It is not powerful enough to run the Antidote VM with the default configuration. After some experiments, I found that running an idle Antidote VM with 4 GB of RAM, alongside the other tools I use, keeps my laptop's general performance in an acceptable range and, when using Libvirt instead of VirtualBox, supports the lessons currently available in the curriculum, if only just barely.

Cores

I found performance is much better when I respect the KVM Performance Limits for virtual CPU cores, so I set the VM vCPU requirements to 2 CPU, which is appropriate for my laptop.

Provider

Change the provider to libvirt.

Run the Vagrant provisioner

Run Vagrant using the up command. Vagrant will read the Vagrantfile in the antidote-selfmedicate directory and create the environment specified in the file:

$ cd ~/antidote-local/antidote-selfmedicate
$ vagrant up

It may take a long time to get started because it is downloading large files from the NRE Labs repositories.

Testing performance

I compared Antidote’s performance on my Linux PC with both Libvirt and VirtualBox. I ran Antidote with Libvirt on my Ubuntu Linux 18.04 laptop. Then, I reinstalled Ubuntu Linux 18.04 and ran Antidote with VirtualBox.

I ran lessons that launched QEMU/KVM nested virtual machines as network nodes. Each nested VM supported a Juniper VQFX router image. I ran lessons that started one, two, and three VMs. I measured the time it took for the lesson to start. I did not measure how the nodes performed after they started.

The lessons I tested were:

  • 1 VM: Junos PyEz lesson 24
  • 2 VMs: Device Specific Template Generation lesson 35
  • 3 VMs: Junos Terraform lesson 31

Between each test run, I ran the vagrant reload --provision command to restart the Minikube VM and refresh all configurations. I ran each test 7 times, discarded the highest and lowest time, then averaged the remaining five times. I display the test results in the table below.

Start time, in seconds    1 VM    2 VM    3 VM
NRE labs web site           43      38      50
Libvirt                    108     139     315
VirtualBox                 100     175     560*

I was surprised to see the performance was so close for the lessons with one VM. The Libvirt provider performed a bit better in the two-VM tests. However, the three-VM test showed a large improvement in performance for the Libvirt provider compared to the VirtualBox provider. The VirtualBox three-VMs test failed five out of seven times and the two successful test times were over nine minutes.

I suspect that, because I set the Minikube VM size to 4 GB, which is well below the minimum recommended VM size of 8 GB, the three-VM test is at the very edge of what can be supported in that configuration.

See the chart, below, for another view of the results:

Chart showing test results. Libvirt starts labs faster.

For comparison, I also included timing results from the NRE labs web site. On that site, Antidote is running on bare metal on a modern server, so it is much faster.

Conclusion

Running the Antidote network emulator on a Linux system using KVM and Libvirt, instead of VirtualBox, results in measurably better performance, especially when using an older computer, or when you have limited memory.


  1. Note that Hyper-V and VMware also support nested virtualization, but they are not open-source tools and are platform-specific.

Fixing a Thinkpad T420 battery problem on Linux


I upgraded my T420 to Ubuntu MATE 19.10 because it now supports the Nvidia Optimus drivers and includes a utility that lets me switch between the Intel and Nvidia graphics cards. However, the upgrade seemed to break the power management on my laptop. When running on the battery, the laptop would suddenly lose power after only 10 minutes, even when the battery still showed a ninety percent charge.

I installed TLP, Linux Advanced Power Management. TLP solved my problem. Also, for good measure, I upgraded the BIOS because, while troubleshooting this issue, I discovered it was very out of date.

In this post, I describe how to install and configure TLP and how to upgrade the BIOS on a Lenovo Thinkpad T420.

Install TLP

The MATE Power Management utility, part of the MATE desktop environment, provides only basic power management settings. I don't know exactly why installing TLP solved my battery problem. I can only suggest that, if you are seeing a similar problem with your battery, you try installing TLP before you spend money on a new battery.

TLP is in the Ubuntu repositories. Install TLP using the following command:

$ sudo apt update
$ sudo apt install tlp tlp-rdw
$ sudo tlp start

After restarting my PC and testing it on battery power, I found that simply installing TLP solved my problem. My battery life returned to normal.

Managing TLP

You can view and edit TLP status and settings from the command line.

The tlp-stat command will show you the TLP configuration and the status of your battery:

$ sudo tlp-stat

You may specify a parameter to display only a specific portion of the tlp-stat output. The most commonly used would be -c to see the current configuration and -b to see the battery status.

$ sudo tlp-stat --help
Usage: tlp-stat [ -b | --battery   | -c | --config    |
                  -d | --disk      | -e | --pcie      |
                  -g | --graphics  | -p | --processor |
                  -r | --rfkill    | -s | --system    |
                  -t | --temp      | -u | --usb       |
                  -w | --warn      | -v | --verbose   |
                  -P | --pev       |    | --psup      |
                  -T | --trace ]

TLP is managed by editing a configuration file: /etc/default/tlp. It is a large file with many options. The default values work well and you probably do not need to change anything. Instead of editing the file directly, I installed an optional graphical user interface that helps me change the TLP configuration.

Optional TLP modules

The TLP utility supports some features specific to Thinkpads, such as the ability to set battery charging thresholds. These features are optional and require you to install additional kernel modules. Install the additional modules with the following commands:

$ sudo apt-get install acpitool tp-smapi-dkms acpi-call
$ sudo apt-get install smartmontools

TLP Graphical User Interface

The TLP GUI is a set of Python scripts that help users to change TLP configuration files easily. It protects users from setting invalid configurations.

To install TLP GUI, clone the repository to your local drive:

$ git clone https://github.com/d4nj1/TLPUI

To run TLP GUI, navigate to the TLPUI directory and run the Python program:

$ cd TLPUI
$ python3 -m tlpui 

The GUI shows you most of the configuration options available for TLP.

The TLP GUI modifies the configuration file when you click the Save button.

Update BIOS

I also decided to upgrade the BIOS on my Thinkpad T420. To check the current version of the BIOS, run the following command:

$ sudo dmidecode | grep -A 3 "BIOS Information"

I see I have version 1.37, as shown in the output below:

BIOS Information
        Vendor: LENOVO
        Version: 83ET67WW (1.37 )
        Release Date: 11/28/2011

According to the Lenovo support web site, the latest version is 1.52, released 27 Jun 2018, so my BIOS is very out of date.

To upgrade the Lenovo Thinkpad T420 BIOS while using Linux, you must download a bootable disk image containing the BIOS upgrade utility from the Lenovo Support site.

NOTE: At the time I wrote this post, the T420 was still supported by Lenovo. However, it looks like it will soon be moved to Lenovo’s unsupported product list. If you do not find the T420 downloads page when you search for T420 on the Lenovo support page, check the Lenovo End of Life product page, instead.

In my case, the file I downloaded was named 83uj33us.iso. After downloading the ISO disk image, extract the bootable El Torito image from it using the following commands:

$ cd ~/Downloads
$ geteltorito -o bios.img 83uj33us.iso

The command creates a new, converted image named bios.img.

Booting catalog starts at sector: 20 
Manufacturer of CD: NERO BURNING ROM
Image architecture: x86
Boot media type is: harddisk
El Torito image starts at sector 27 and has 75776 sector(s) of 512 Bytes

Image has been written to file "bios.img".

Insert a USB stick into the Thinkpad. Find the device name so you can write to it. Be sure to find the correct name or you risk overwriting the wrong disk!

Run the following command:

$ dmesg | tail

In the output, look for something like: Attached SCSI removable disk

[16698.230209] usbcore: registered new interface driver usb-storage
[16698.242852] usbcore: registered new interface driver uas
[16699.245775] scsi 6:0:0:0: Direct-Access     Kingston DataTraveler 3.0      PQ: 0 ANSI: 6
[16699.252329] sd 6:0:0:0: Attached scsi generic sg2 type 0
[16699.255635] sd 6:0:0:0: [sdb] 30218842 512-byte logical blocks: (15.5 GB/14.4 GiB)
[16699.256610] sd 6:0:0:0: [sdb] Write Protect is off
[16699.256618] sd 6:0:0:0: [sdb] Mode Sense: 4f 00 00 00
[16699.258499] sd 6:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[16699.265314]  sdb: sdb1
[16699.269369] sd 6:0:0:0: [sdb] Attached SCSI removable disk

Here we see that the device sdb was recently attached as a removable disk. This is our USB key.

Assuming the USB stick is mounted, you can double-check using the df -lh command, which will list all filesystems on your local PC. You should see that the device sdb is the same size as your USB key:

$ df -lh
Filesystem      Size  Used Avail Use% Mounted on
udev            3.8G     0  3.8G   0% /dev
tmpfs           785M  1.5M  783M   1% /run
/dev/sda1       110G   33G   72G  32% /
...
tmpfs           785M   64K  785M   1% /run/user/1000
/dev/sdb1        15G  8.0K   15G   1% /media/brian/xfer

Next, copy the converted disk image, bios.img to the USB stick. The following command will erase the entire stick so, again, be sure you picked the correct device. In this example, it is sdb. In your case, it may be different.

$ sudo dd if=bios.img of=/dev/sdb bs=1M

Now, boot your Thinkpad from the USB stick. Restart the computer and, as soon as you see the Thinkpad logo on the screen, press the F12 key to select the boot device. Choose the USB stick as the boot disk.

The system should boot into the BIOS update utility. Follow the prompts and choose the Update system program option.

When the utility completes the upgrade, press the Enter key at the information prompt.

The Thinkpad will reboot again. Let it boot into your normal Linux system.

After reboot, open a terminal and run the following command to see BIOS version:

$ sudo dmidecode | grep -A 3 "BIOS Information"

You should see the BIOS has been updated to the latest version, which is 1.52 in this case.

BIOS Information
    Vendor: LENOVO
    Version: 83ET82WW (1.52 )
    Release Date: 06/04/2018

It looks like the upgrade was successful.

Conclusion

Installing TLP solved my battery problem. I did not need to install any optional modules or upgrade the BIOS to solve the battery issue, but I did those things anyway and they seemed to cause no other problems.

Python: The Minimum You Need to Know


Many network engineers and other professionals are transitioning their skills set to include programming and automation. Commonly, their previous programming experience comes from a few programming courses they attended in university a long time ago. I am one of those professionals and I created this Python programming guide for people like you and me.

In this guide, I explain the absolute minimum amount you need to learn about Python in order to create useful programs. Follow this guide to get a very short, but functional, overview of Python programming in less than one hour.

I omit many topics from this text that you do not need to know when you begin using Python; you can learn them later, when you need them. I don’t want you to have to unlearn misconceptions later, when you become more experienced, so I do include some Python concepts that other beginner guides might skip, such as the Python object model. This guide is “simple” but it is also “mostly correct”.

Getting Started

In this guide, I will explore the seven fundamental topics you need to know to create useful programs almost immediately. These topics are:

  • The Python object model simplified
  • Defining objects
  • Core types
  • Statements
  • Simple programs
  • Modules
  • User input

Of course, there is much more to learn. This guide will get you started quickly and you can build more skills as you gain experience writing Python programs that perform useful tasks.

There is no substitute for learning by doing. I recommend you also start a terminal window and run the Python interactive shell so you can type in commands as you follow this guide.

Install python

This guide is targeted at Linux users but is still applicable to any operating system. You can find instructions to install Python on any operating system in the Python documentation.

Python is probably already installed in your Linux system.

Python Interactive Prompt

There are many ways to start and run Python programs in Linux. While you are learning about Python’s basic building blocks, you will use the Python Interactive Prompt to run Python statements and explore the results. Later, you will run Python programs using the Python interpreter. In both cases, you will launch Python from the Linux bash shell.

Open a new Terminal window. To start the Python interactive prompt, type python or python3 at the command prompt.

> python

You will see the interactive prompt, >>>.

>>>

To quit interactive mode, type exit() or quit(), or press the CTRL-D key combination.

>>> exit()

You will find that the Python interactive prompt is a great tool for experimenting with Python concepts. It is useful for learning the basics but it is also useful for trying out complicated ideas when you get more experienced. You will use the Python interactive prompt often in your programming career.

The Python object model simplified

Everything in Python is an object.

Python is an object-oriented programming language, but you do not need to use its object-oriented features to write useful programs. You may start using Python as a procedural programming language, which is familiar to most people who have a little programming knowledge. While I focus on procedural programming methodologies, I will still use some terminology related to objects so that you have a good base from which you may expand your Python skills.

In Python, an object is just a thing stored in your computer’s memory. Objects are created by Python statements. After objects are created, Python keeps track of them until they are deleted. An object can be something simple like an integer, a sequence of values such as a string or list, or even executable code. There are many types of Python objects.

Python creates some objects by default when it starts up, such as its built-in functions. Python keeps track of these objects and of any objects created by the programmer.

Python objects

When you start Python, it creates a number of objects in memory that you may list using the Python dir() function. For example:

> python
>>> dir()

This will return a list of the Python objects currently available in memory, which are:

['__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__']

Note that this is returned as a Python list, as indicated by the square brackets (more about lists later).

Create a new object. Define an integer object by writing a Python statement that creates an integer object, assigns the value of 10 to it, and points to it with the variable name a:

>>> a = 10

Call the integer object named by a. Python will return the result in the interactive prompt:

>>> a
10

List all objects available in memory, again. Look for the integer object a:

>>> dir()
['__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__','a']

See that the object a is added to the end of the list of Python objects. It will remain until you quit the Python interactive session. Python automatically deletes objects when they are no longer used by your program, in a process called garbage collection. Beginners don't need to know anything about that.

Getting help

You may use the help() function to see the built-in Python documentation about each object type. Call the name of the object (or the type, if you know it) and the Python help function will print the documentation. For example:

>>> help(a)

You asked for help about object a. Python knows object a is an integer so it showed you the help information for a Python int, or integer, object type. You would get the same output if you had called the help function using the object type int.

>>> help(int)

As you work with Python in the interactive prompt, you can use the dir() and help() functions to better understand Python objects.

Defining objects

In Python, statements define an object simply by assigning it to a variable or using it in an expression.

One of the fundamental concepts in Python is that you do not need to declare the type of an object before you create it. Python infers the object type from the syntax you use to define it.

In the example below, a defines an integer object, b defines a floating-point object, c defines a string object, and d defines a list object; in this example, each element of the list is a string object.

>>> a = 10                  # An integer 
>>> b = 10.0                # A floating point 
>>> c = 'text'              # A string 
>>> d = ['t','e','x','t']   # A list (of strings)

See how the syntax defines the object type: different objects are created if a decimal point is used, if quotes are used, if brackets are used, and depending on the type of brackets used. I will explain each of the Python object types a little bit later in this guide.

Comments

Note also that the syntax for comments in Python is the hash character, #. Other ways to comment and document Python programs are available but, for the sake of simplicity, I omit them from this guide.

Variables point to objects

In each of the four examples above, you created an object and then pointed a variable to that object. This is fundamentally different from more traditional programming languages. The variable does not contain the value, the object does. The variable is just a name pointing to the object, so you can use the object in your program.

A variable may be re-assigned to another object, even if the object is a different type. You are not changing the value of the variable or the type of the variable, because the variable has no value or type. Only the object has a value and an object type. The variable is just a name you use to point to any object. So, the following code will work in Python:

>>> a = 10
>>> a
10
>>> a = 'text'
>>> a
'text'

See that you can assign an integer object to variable a and, later, assign a string object to variable a. The original integer object that had a value of 10 is erased from memory after you reassign the variable a to a string object that has a value of ‘text’.

When you begin working with Python, I suggest you write your code to avoid mixing up object types with the same variable names, but you may see this behavior if you are working with code someone else has written.

Object methods

Each instance of a Python object has a value, but it also inherits functionality from the core object type. Python’s creators built methods into each of the Python core object types and you, the programmer, access this built-in functionality using object methods. Object methods may evaluate or manipulate the value stored in the object and allow the object to interact with other objects or create new objects.

For example, number objects have mathematical methods built into them that support arithmetic and other numerical operations; string objects have methods to split, concatenate, or index items in the string.

The syntax for calling methods is object.method(arguments), adding the name of the method, separated by a period, after the object name and ending with closed parenthesis containing arguments.

For example, one (not recommended) way to add two integers together is to use the integer object’s __add__ method:

>>> a = 8
>>> a.__add__(2)
10

Above, you created an integer object with a value of 8 and pointed the variable a to it. Then you called the integer object pointed to by variable a and used its __add__ method to return a new object that has a value of 10. Note that you do not normally do addition this way in Python but the Python integer object’s __add__ method is the underlying code used by Python’s addition operator, +, and the sum() function when using them with integer objects.

Here is another example: create an integer object with a value of 100 and assign it to a variable named c.

>>> c = 100
>>> c
100

Then look at all the methods and objects associated with the integer object by using the dir() function:

>>> dir(c)

You get a long list of object methods. These were all defined by the creators of Python and are “built in” to the integer object. Other Python functions may use some of these methods to perform their tasks, but you don’t need to know all the details of how Python works “under the hood”. From this list, you see that one of the methods associated with the integer object c, is bit_length. Use help() to get more information about what this method does:

>>> help(c.bit_length)

See it returns the minimum number of bits required to represent the number in binary. For example, the number 100 is binary 1100100, which is seven bits. Verify this using the bit_length method that is built into the integer object c:

>>> c.bit_length()
7

In summary: every Python object also comes with built-in methods that are available when the object is created. You can see the methods and learn more about them using the dir() and help() functions.

Core object types

As I mentioned previously, everything in Python is an object. You need to learn about a few basic object types to get started with Python. There are more object types than those listed below but we’ll start with this list of the object types that network engineers will use most often.

  • Integer objects
  • Floating point objects
  • String objects
  • File objects
  • List objects
  • Program Unit objects

Numbers object types

Numbers objects are usually defined as integers or floating point numbers. Python also supports complex numbers and special types that allow users to define fractions with numerators and denominators, and fixed-precision decimal numbers. The following code creates two integers and adds them together:

>>> a = 10
>>> b = 20
>>> a + b
30
>>> c = a + b
>>> c
30

String object types

Strings objects may be text strings or byte strings. The main difference is that text strings will be automatically encoded and decoded into readable text by Python, and binary strings will be left in their raw, machine readable, form. Byte strings are usually used to store media such as pictures or sounds.

Readable text strings are created with quotes as follows:

>>> z = 'text'
>>> z
'text'
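
Text string objects also come with built-in methods, like the split and concatenation operations mentioned earlier. A few quick examples at the interactive prompt:

>>> z = 'some text'
>>> z.upper()
'SOME TEXT'
>>> z.split()
['some', 'text']
>>> z + ' here'
'some text here'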

File object types

Files are objects created by Python’s built-in open() function. Type help(open) at the interactive prompt for more information. Whether opening an existing file, or creating a new one, the open() function returns a file object which is assigned to a variable name, so you can reference it later in your program. For example:

>>> myfile = open('myfile.txt', 'x')
>>> myfile
<_io.TextIOWrapper name='myfile.txt' mode='x' encoding='cp1252'>

Remember, you can see all the methods available for the file object you created by typing dir(myfile), and you may get help about the open command and all its options by typing help(open).

You may close a file using the file object’s close method.

>>> myfile.close()
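
For example, the short session below writes a line of text to a new file, closes it, then opens the file again and reads the text back. The file name notes.txt is just an example.

>>> myfile = open('notes.txt', 'w')
>>> myfile.write('hello file\n')
11
>>> myfile.close()
>>> myfile = open('notes.txt', 'r')
>>> myfile.read()
'hello file\n'
>>> myfile.close()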

List object types

When you were in school, you may have taken a course about data structures. Or, if you have experience working with computer languages like C or C++, you had to create your own data structures to manage data in your programs. You probably implemented a data structure called a linked list, which contained a series of elements in computer memory linked by pointers. You probably wrote code to create functions that allowed you to insert items in the list, remove items, find items by index, and more.

Well, forget all that because Python has done it for you. Python has built-in data structure objects like lists, dictionaries, tuples, and sets. The list and the dictionary are the most commonly used data structures. I cover lists in this guide. You can read about the other data structures in the Python help() function or in the Python documentation.

You create a list object in Python using square brackets around a list of objects separated by commas. For example:

>>> k = [1,3,5,7,9]

Above, you created a list of five integer objects.

Python lists are very flexible and may contain a mixture of object types. For example:

>>> k = [1, "fun", 3.14]

Above, the list object contains three objects: an integer object, a string object, and a floating-point object. Lists can also contain other list objects, which is known as nesting lists. For example:

>>> k = [[1,2,3],['a','b','c'],[7.15,8.26,9.33]]

Above, you created a list of three objects, each of which is a list of three other objects.

Individual items in a list can be evaluated using index numbers. For example:

>>> k
[[1, 2, 3], ['a', 'b', 'c'], [7.15, 8.26, 9.33]]
>>> k[0]
[1, 2, 3]
>>> k[1]
['a', 'b', 'c']
>>> k[1][0]
'a'

Lists can be iterated over, concatenated, split, and manipulated in other ways using the list object’s built-in methods or Python’s functions and operators. Lists are a useful “general purpose” data structure and, in most programs, you will use lists to gather, organize, and manipulate sequential data. Lists are often used as iterators in for loops.
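
For example, a for loop steps through each item in a list, one at a time:

>>> k = [1, 3, 5]
>>> total = 0
>>> for item in k:
...     total = total + item
...
>>> total
9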

Increase the size of a list by adding elements to the end of the list with the list object’s append method. For example:

>>> list = []
>>> list.append("one")
>>> list.append("two")
>>> list
["one", "two"]

Insert an item at some indexed point in the list using the list object's insert method. For example:

>>> list.insert(1,"three")
>>> list
['one', 'three', 'two']

Pop an item from the list data structure at any index location or, by default, at the end, using the pop method. For example:

>>> a = list.pop(1)
>>> a
'three'
>>> list
['one', 'two']
>>> list.pop()
'two'
>>> list
['one']

There are many more list methods for manipulating the sequence of items stored in the list data structure. See the Python documentation for more details.

Program Unit Types

Like any programming language, Python has programming statements and syntax used to build programs. In addition to that, Python defines some object types used as building blocks to create Python programs. These program unit object types are:

  • Operations
  • Functions
  • Modules
  • Classes

Operations

Operations are symbols used to modify other objects according to the methods supported by each object. Python contains operators to assign values, do arithmetic, make comparisons, and do logic. There are also operators that perform bitwise operations (for binary values), identity operations, and membership operations.

Below is a list of common operation types. Many more exist; check the Python documentation for more information.

  • assignment operators include = and +=
  • arithmetic operators include +, -, *, and /
  • comparison operators include >, >=, ==, and !=
  • logic operators include "and", "or", and "not"
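
A few of these operators, tried at the interactive prompt:

>>> a = 10
>>> a += 5
>>> a
15
>>> a > 12 and a != 20
True
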
Functions

Functions are containers for blocks of code, referenced by a name, commonly used in procedural programming. They are a universal programming concept used by most programming languages and may also be called subroutines or procedures. Use functions in your programs to reduce redundancy and to organize your program code, so it is easier for others to maintain.

Some functions are already built into Python, like the sum(), dir() and help() functions. Other functions may be created by programmers like you and included in programs.

The Python def statement defines function objects. The def statement syntax is: def function_name(argument1, argument2, etc): followed by statements that make up the function.

Here, I get ahead of myself a little bit because, to define a function, you need to show Python statements and syntax. For now, just know that Python uses leading spaces to group code into statements. Define a simple function in the Python interactive prompt:

>>> def fun(input):
...     x = input + ' is fun!'
...     return x
...

Note that the interactive prompt changes from >>> to ... when Python understands that you will enter multi-line statements. This behavior is activated by the syntax of the statement and the indentation you use after that (see Python statements and syntax, below). Press return on an empty line to finish defining the function.

Run the dir() function. You can see that the object fun has been added to the list of objects Python is tracking:

>>> dir()
['__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'fun']

Call the fun() function and input a string as an argument.

>>> fun('skiing')
'skiing is fun!'

You will see the object addressed by the variable name fun in the list of objects returned by the dir() function. If you pass the function object into the dir() function, you will see all the methods associated with function objects, in general.

>>> dir(fun)
['__annotations__', '__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__get__', '__getattribute__', '__globals__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__kwdefaults__', '__le__', '__lt__', '__module__', '__name__', '__ne__', '__new__', '__qualname__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']

You can do a lot with functions and, until you get to advanced topics like object-oriented programming, functions will be the primary way you organize code in Python.

Modules

I will cover modules in more detail when I discuss running our Python program from saved files. A Python module is a file containing Python code that you can import into another program when you run it. Modules allow you to organize large projects into multiple files and also allow you to re-use modules created by other programmers. For example: Python’s built-in modules.

When writing complex programs, Python developers usually create module files that each contain multiple related functions. Any text file whose filename ends with the .py extension may be imported as a module into another program.
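
For example, suppose you save the small function below in a file named helpers.py; the file and function names here are only illustrations.

# helpers.py -- a tiny example module
def fun(activity):
    return activity + ' is fun!'

Then, start the Python interactive prompt in the same directory, import the module, and call its function:

>>> import helpers
>>> helpers.fun('skiing')
'skiing is fun!'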

Classes

Classes are objects used in object-oriented programming. Use classes to create new objects or to customize existing objects. I ignore Python classes in this guide. But, you will need to learn about classes and object-oriented programming if you want to work with more complex frameworks, or if you are collaborating with other coders on the same project.

Mutability of objects

When programming in Python, you may find documents that talk about how some object types are mutable and other object types are immutable. When working with only the minimum sub-set of Python object types you need to know, you usually do not need to worry about whether objects are mutable or immutable. In larger projects, where you will work with more object types and will need to know how objects are handled when they are passed into and out of functions as arguments and results, you will need to understand this concept.

Remember that the Python variables do not contain values; they simply point to exiting objects. Some object types are immutable and cannot be modified directly. That is, the value of the object cannot change as a result of some operation. But a new object could be created as a result of an expression that involves another object. Other object types are mutable and can be directly changed as the result of an operation. This can lead to some confusing behavior if you are not familiar with this concept.

It is pretty obvious that numbers are immutable, but this concept becomes important when using certain “immutable” objects like strings. You cannot directly assign a new value into some part of an already-created string object. For example:

>>> string = "test"
>>> string[2]
's'
>>> string[2]='r'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment

Lists are mutable. They are data structures that can be directly manipulated. Think of each element in a list as a variable that references another object. Each element of the list can be re-assigned to another object in the same way variables can be reassigned. Objects may be removed from a list using one of the list objects methods such as pop. Objects may be inserted anywhere in the list. This makes the list very useful as a store for data in a program.

For example:

>>> list = ['t','e','s','t']
>>> list[2]
's'
>>> list[2] = 'r'
>>> list
['t', 'e', 'r', 't']
>>> list.pop()
't'
>>> list
['t', 'e', 'r']

In the above example, a novice programmer might not expect that, when popping the last item from a list, she is not simply "reading" that item; she is also removing it in the same operation.

But, what if I assign a new value to an integer like a? Isn’t that modifying the immutable integer object? No. You are creating and assigning a new integer object to the variable named a. Remember the relationship between variables and objects.

If I add 20 to the integer object referenced by the variable named a, the object itself does not change. For example:

>>> a = 100
>>> a + 20
120
>>> a
100
>>> a = 3000
>>> a
3000

See that the original integer object referenced by variable a is not changed by the addition operation but the variable a can be reassigned to a new integer object with the value of 3000.

Python statements

A Python program is composed of statements. Each statement contains expressions that create or modify objects. Python groups statements, especially control statements like if statements or for statements, into blocks by indenting lines with blanks or tabs.

Python statements are grouped into the following categories:

  • Assignment statements such as a = 100
  • Call statements that call objects and object methods. For example: fun('skiing') or a.bit_length()
  • Selecting statements such as if, else, and elif
  • Iteration statements such as for
  • Loop statements such as while, break, and continue
  • Function statements such as def

The list above is a good starting point for building Python programs.

Statement syntax

Python uses white space indents to group statements into code blocks. Other languages might use brackets or semicolons to separate statements, but Python uses only blanks or tabs (Pick a side! Fight!) and newlines.

For example, a Python if statement would look like this in the interactive prompt:

>>> a = 10
>>> b = 20
>>> if a > b:
...     print('A is bigger')
... else:
...     print('A is NOT bigger')
...
A is NOT bigger 

White space is used to define which code blocks are inside iterators, loops, functions, or selector statements. If you nest statements, you will see how the indenting using white space helps you identify the groups of expressions in each statement. For example:

>>> a = 10
>>> b = 20
>>> c = 3
>>> if a > b:
...     print(a)
...     for i in range(c):
...         a = a + 1
...     print(a)
... else:
...     print(b)
...     for i in range(c):
...         b = b + 2
...     print(b)
...
20
26

See how the if and else statements contain blocks of code that contain for statements that also contain blocks of code. The indents make it easy for you to read the code, but inconsistent indenting can also cause hard-to-find errors, so be careful to indent your code consistently.

Assignment statement syntax

Assignment statements create objects and name variables that point to the objects. They consist of the variable name, the = operator, and the value of the object to be created, written in syntax that identifies the type of object. For example:

a = 100
b = 3.14
c = 'stretch'
d = [3, 4, 'pine']

Call statement syntax

Call functions or object methods using call statements. The syntax consists of the function or object method name, followed by parenthesis that enclose the arguments to be passed to the function. For example:

fun('skiing')
a.bit_length()

Selecting statements syntax

Selecting statements allow the programmer to define operations that occur depending on the value of specific objects. The syntax involves colons, spaces, and newlines. Start with the if statement and the expression to be tested, followed by a colon. On the next line, indent the text (I use 4 spaces) and add the statement to run if the condition is true. Back out one indent (or 4 spaces) if you will add elif statements or an else statement. The else and elif statements are followed by a colon, and the code to run in each of these statements is indented. For example:

if a == b:
    print('A is equal to B')
elif a > b: 
    print('A is greater than B')
elif a < b:
    print('A is less than B')
else:
    print('all other cases')

Iteration statement syntax

Iteration statements such as for require an iterable object, such as a list, through which they can iterate. The for statement ends with a colon, and the code that will execute on each iteration is indented below it. For example:

fruit = ['berries','apples','bananas', 'oranges']
for i in fruit:
    print(i)

The above statements would print out the following:

berries
apples
bananas
oranges

You can use the for statement to create loops by incorporating the range() function, as follows:

a = 0   
for i in range(100):
    a = a + 1
    print(a)

The above code prints a series of numbers from 1 to 100.

Technically, the for statement is not a loop, it is an iterator. In the last example above, it iterates through a sequence of 100 integers with values from 0 to 99, produced by the range(100) function. Each iteration updates the value pointed to by the i variable until the statement reaches the end of the sequence.
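If you want to see the values such a sequence contains, you can convert it to a list at the interactive prompt. The short example below uses range(5) just to keep the output small:

>>> list(range(5))
[0, 1, 2, 3, 4]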

Loop statement syntax

Loop statements control how many times a section of code will run in a loop. The while statement ends with a colon and the next lines are indented. For example:

kk = 1
while kk <= 100:
    print(kk)
    kk += 1

The above code prints a series of numbers from 1 to 100.

Additional control statements work inside loops: break exits the loop when a condition is met, and continue skips the rest of the current iteration, as shown in the short sketch below.
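The following small sketch shows both statements working together. It skips even numbers with continue and exits the loop with break once the counter passes 7, so it prints 1, 3, 5, and 7:

kk = 0
while True:
    kk += 1
    if kk % 2 == 0:
        continue      # skip the rest of this iteration for even numbers
    if kk > 7:
        break         # exit the loop completely
    print(kk)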

Function statement syntax

The def statement creates a function object in Python. The function object may be called using a call statement. The syntax of the def statement is: def followed by the name of the function, followed by the arguments expected by the function in parenthesis, followed by a colon. The code in the function is indented starting on the next line. You already saw examples of creating and calling a function above, but here is another example:

def test_func(number, string):
    x = number * string
    return x

Functions may also return an object when they run. If you insert the return statement into a function’s code, the function will terminate at that point and return the result to the calling program. The return statement is the main way you will send the results of functions back to the calling program or function.

The above function returns a string that repeats the input string a number of times equal to the input number. You can test the function by calling it with input arguments, as shown below:

>>> test_func(3,'go')
'gogogo'

Simple Python Programs

Now you can stop the interactive prompt and start writing programs. A Python program is just a text file that contains Python statements. The program file name ends with the .py extension.

For example, use your favorite text editor to create a file called program.py on your Linux PC. The contents of the file should be:

a = 'Hello World'
print(a)

The simplest way to run a Python program is to run it using Python. For example, open a Terminal window, and type the following:

> python program.py
Hello World

The above text will run the file program.py in the Python interpreter.

In Linux, you may also just call the program file at the command prompt. Start the program file with the following text:

#!/usr/bin/python3  

This shebang line is used at the start of most interpreted program files. Linux uses it to determine which programming language interpreter it needs to start to run the program. To run the program file directly, you also need to make it executable, for example with chmod +x program.py, and then call it as ./program.py. You should make a habit of including the shebang line, regardless of which operating system you use.

Python Modules

You can create simple, or very complex, Python programs all in one file. But, as you get more experience using Python in network engineering, you will start breaking your programs up into separate files that can be maintained and tested separately.

To bring code from another file into your Python program at run time, use the import statement. Everything you import to your program is called a module, even though, on its own, it just looks like any other Python program. You will usually have one main program file with the basic logic of your program, and you may create other files, now called modules, that contain definitions for functions and other objects that your main program will call.

Python also comes with many built-in modules you can import into your program to access more functionality. Look at the Python socket module, for example. Also, many third-party developers create modules that you can install in Python and then import into your own programs. Some of these are especially useful to network engineers. For example, look at the requests, napalm, and paramiko modules.
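For a quick taste of a built-in module, the short interactive session below imports the socket module and uses one of its functions to look up the IP address of localhost (on most systems this returns the loopback address shown here):

>>> import socket
>>> socket.gethostbyname('localhost')
'127.0.0.1'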

Let’s experiment with creating a module. This module will simply define five objects using an example you used previously.

Open a text editor and create a Python program called mod01.py. Add the following text:

#!/usr/bin/python3
a = 10
b = 10.0
c = 'text'
d = ['t','e','x','t']
def fun(input):
    print(input + ' is fun!')

Save the file mod01.py.

Now, open the Python interactive prompt:

> python

Check the objects Python tracks in memory:

>>> dir()
['__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__']

Now import the module you created:

>>> import mod01

Now check the objects tracked by Python again:

>>> dir()
['__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'mod01']

See that a new object has been created, called mod01. This object has attributes, which are the objects you created in the mod01.py program. View them by running the dir() function on it:

>>> dir(mod01)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'a', 'b', 'c', 'd', 'fun']

See that the module mod01 contains the usual Python objects, plus the five objects you created. These objects were created because, when Python read the import statement, it ran the file mod01.py and the statements in that file created the objects.

To access these specific objects in the main program, reference each one by name using the module-name-dot-object-name syntax, the same syntax used for calling object methods. For example:

>>> mod01.a
10
>>> mod01.d
['t', 'e', 'x', 't']
>>> mod01.fun('wrestling')
wrestling is fun!
>>>

Of course, you need to know what each of the module's attributes and functions does so you can use them properly. If you are using a built-in Python module or a third-party module, consult the module's documentation to learn how to use everything it provides.

Importing large modules can use up a lot of memory and you may only use a few specific methods from a module. There are ways to be more efficient but, for now, just import modules and don’t worry about memory usage. I am keeping this guide simple so I will not discuss importing specific objects from modules, or the concepts and issues related to Python namespaces. Just remember those are things you will want to learn later in your learning journey.

If __name__ == ‘__main__’:

If you are reading code that someone else wrote, you will probably see a code block near the end of the file that starts with a statement like the following :

if __name__ == '__main__':    

This code tests if this file is being run in Python as the main file. It is usually found in Python module files so those files can be tested, or run, by themselves. The code in this if block will not run if the file is imported as a module. This allows Python developers to create modules that, when run by themselves, can test their own code.

The if __name__ == '__main__': code block will contain statements that run the functions defined in the module file (or imported from other modules). By convention, Python programmers use this text in every file that also contains function definitions, even the main program file.

If you see this text in the main program file in a Python project, it will contain the code that starts the program.
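For example, you could add a block like the sketch below to the end of the mod01.py module file. The fun function runs when you execute mod01.py directly, but not when another program imports it:

if __name__ == '__main__':
    # This code runs only when mod01.py is executed directly,
    # for example with: python mod01.py
    fun('testing this module')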

Get user input

Typically, your Python program will require some input from a user. This input can be passed as arguments at the command line when the Python program starts, gathered by the program at run time by asking the user for input, or even read in from a file.

To input arguments at the command line, you would need to explore some topics like the Python sys and argparse modules, how to parse arguments, how to test arguments before using them, and more. I’m choosing not to discuss that in this simple guide, but you can find some good information about parsing Python program command line arguments in the Python documentation.

You will have to learn to read input from a file and write output to a file in the near future. I skip that topic in this guide. Information about using Python to read from and write to files is in the Python documentation.

I suggest that, while you are still learning the basics, use Python’s input() function to request and receive user input. This lets you prompt the user for input and then reads the first line the user types in response. It reads the input as a string, so you may need to convert it to another object type if that is what you require. For example, try the following at the Python interactive prompt:

>>> age = input('How old are you? ')
How old are you? 51
>>> age
'51'
>>> x = int(age)
>>> newage = x + 10
>>> print('you will be ' + str(newage) + ' in ten years')
you will be 61 in ten years
>>>

Final example

Bring most of the concepts I discussed above together into one final example. Create two Python files using a text editor. One will be the main Python script and the other will be a module containing some function definitions.

The script will gather three numbers from the user, check that each is valid, and then output the English name of each number.

The first file will be a Python module containing all our functions. Save it with the filename functions.py. The text in the file is:

#!/usr/bin/python3

ones = ["one","two","three","four","five","six","seven","eight","nine"]
teens = ["eleven","twelve","thirteen","fourteen","fifteen","sixteen","seventeen","eightteen","nineteen"]
tens = ["","twenty","thirty","forty","fifty","sixty","seventy","eighty","ninty"]
hundred = "hundred"

def input_ok(input):
    if input >= 1000:
        return False
    elif input <= 0:
        return False
    else:
        return True

def convert_to_text(number):
    string = str(number)
    number_length = len(string)
    if number_length == 1:
        print(ones[number-1])
    elif number_length == 2:
        low_digit = int(string[1])
        mid_digit = int(string[0])
        if mid_digit == 1:
            # numbers from 10 to 19 use the teens list, indexed by the low digit
            print(teens[low_digit])
        elif low_digit == 0:
            print(tens[mid_digit-1])
        else:
            x = tens[mid_digit-1] + " " + ones[low_digit-1]
            print(x)
    elif number_length == 3:
        low_digit = int(string[2])
        mid_digit = int(string[1])
        high_digit = int(string[0])
        if mid_digit == 1:
            b = teens[low_digit]
        elif mid_digit == 0 and low_digit == 0:
            b = ""
        elif mid_digit == 0:
            b = ones[low_digit-1]
        elif low_digit == 0:
            b = tens[mid_digit-1]
        else:
            b = tens[mid_digit-1] + " " + ones[low_digit-1]
        # strip() removes the trailing space for even hundreds like 200
        print((ones[high_digit-1] + " " + hundred + " " + b).strip())
    else:
        print("Error: bad input not caught")

The second file will contain the main program logic. Save it as numtext.py. The text in the file is:

#!/usr/bin/python3

import functions

number_list = []
i = 0
while i != 3:
    numstr = input("Enter a number: ")
    numint = int(numstr)
    if functions.input_ok(numint):
        number_list.append(numint)
        i += 1
    else:
        print("Input must be less than one thousand and greater than zero")

for j in number_list:
    functions.convert_to_text(j)

See how you imported the functions module? When Python encountered the import statement, it ran the functions.py file, which created the list objects and functions in memory. These functions were addressed using the module name in the code.

There are many ways this simple program can be improved. For example, you could improve the input_ok function so it also checks for non-numeric characters; you could improve the logic of the convert_to_text function so it is more concise and elegant.
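As one possible sketch of the first improvement, you could handle non-numeric input in the numtext.py loop, because that is where the int() conversion happens and where a non-numeric string currently causes an error. The check could instead live in the input_ok function; this is just one way to do it:

while i != 3:
    numstr = input("Enter a number: ")
    try:
        numint = int(numstr)   # raises ValueError for non-numeric input
    except ValueError:
        print("Input must be a whole number")
        continue
    if functions.input_ok(numint):
        number_list.append(numint)
        i += 1
    else:
        print("Input must be less than one thousand and greater than zero")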

Now, run the program and see the results. Make the main script executable with chmod +x numtext.py, then enter its name at the command prompt to run it.

> ./numtext.py
Enter a number: 51
Enter a number: 13
Enter a number: 78
fifty one
thirteen
seventy eight

Try again with numbers outside the acceptable range of 1 to 999:

> ./numtext.py
Enter a number: -34
Input must be less than one thousand and greater than zero
Enter a number: 0
Input must be less than one thousand and greater than zero
Enter a number: 5
Enter a number: 6
Enter a number: 30000
Input must be less than one thousand and greater than zero
Enter a number: 51
five
six
fifty one

Now you have introduced yourself to the basic building blocks of the Python programming language. You can build programs that gather user input, perform evaluations and calculations, and output results to the terminal window.

Conclusion

I covered the elements of Python programming that represent the minimum knowledge a network engineer should have to get started writing useful scripts in Python. There is still more to know, such as learning about Python’s built-in networking modules and modules created by third parties, learning about the application programming interfaces supported by networking equipment from various vendors, and learning about network automation best practices.

After reading this guide, I hope you feel you are ready to start learning about these other topics, while using the Python programming language to interact with those technologies. You will learn more about Python as you experiment with it to develop network automation programs.

Resources

Sites that provide information about Python programming are listed below:

A Python learning path for network engineers


Python programming is now a required skill for network engineers. I recorded videos of myself as I learned and practiced Python programming. I think these videos, along with the links to learning resources associated with each video’s topic, serve as a good learning guide for network engineers getting started with Python programming.

This post collects links to all ten videos I created. Over the course of these videos, I wrote a program called Usermapper that reads a configuration file and builds an XML authentication file for the Guacamole web proxy. I also used the Git version control system and posted the code in my Usermapper GitHub repository.

Topics I need to learn

I learned some programming during my Electrical Engineering degree program many years ago. After I graduated, except for some basic scripting, I’ve not had to do any programming.

These videos do not cover the basics of Python. I strongly suggest you read a book about Python, or watch some video training (see suggestions below) before you start working through these videos. Before I started recording this first video, I read the O’Reilly book, Learning Python, and wrote a blog post about what I learned in the first part of the book.

This video refers to the following resources:

My first Python program

In this video, I write the first part of a program that will build a user authentication file that is compatible with the Apache Guacamole web proxy server. The output file will eventually be in XML format but this first version creates a Python dictionary populated with all the required information.

As I wrote this code, I learned more about the modules and packages available in Python’s standard library. I also got some good practice with the Python object model as it relates to copying objects and references to objects.

I mention the following resources in this video:

Writing an XML file

In my previous video, I created a nested dictionary containing the raw data that must be written, in XML format, to the user mapping file. In this video, I add the code that writes an XML file based on that dictionary’s contents. I solve problems like parsing through several layers of a nested dictionary to modify values.

I mention the following resources in this video:

Reorganizing my program

In this video, I’ll explore a bit more about Python by organizing my current project into functions and modules to make it easier to maintain.

My project currently consists of one file and the data used to build the XML output file is hard-coded into the program. This is not very flexible and makes it hard for others to use the program. First I will divide the project into module files and then turn some of the program logic into functions.

I mention the following resources in this video:
* The Guacamole default authentication file format (https://guacamole.apache.org/doc/gug/configuring-guacamole.html#basic-auth)

This is the fourth video in a series about learning enough to build and maintain a Python application.

This video series is a personal project. My opinions are my own and do not reflect the opinions of my employer.

Git and GitHub

In this video, I use the Git version control system to manage changes to my code and also create a remote repository on Github.

Before watching this video, I recommend you review the following:

Other resources I mention in this video are:

  • The git documentation
  • Git cheat sheet: https://training.github.com/downloads/github-git-cheat-sheet.pdf
  • GitHub First Contributions repo: https://github.com/firstcontributions/first-contributions
  • Another good guide to using git: https://alan-turing-institute.github.io/rsd-engineeringcourse/ch02git/02Solo.html

This video series is a personal project. My opinions are my own and do not reflect the opinions of my employer.

Python virtual environments

In this video, I start making my first program more flexible. I define the input in a configuration file. I learn the YAML file format. I introduce Python virtual environments. Also, I get more practice using Git as I make my code changes.

Before watching this video, I recommend you review the following:

Other resources I mention in this video are:

Rewriting mapperdata.py

In this video, I continue making my first program more flexible. I use a Python virtual environment. I install the PyYAML package using pip. I rewrite my mapperdata.py module to read the YAML config file and build a data structure based on its contents. I make a few classic mistakes while iterating through nested dictionaries and I introduce the useful git restore command.

Before watching this video, I recommend you review the following:

  • PyYAML documentation: https://pyyaml.org/wiki/PyYAMLDocumentation

Other resources I mention in this video are:

Using VS Code

In this video, I finish rewriting my mapperdata.py module to read the YAML config file and build a data structure based on its contents. I also introduce the Visual Studio Code text editor, which I’ll be using from now on.

Before watching this video, I recommend you review the following short videos:

Other resources I mention in this video are:

Requirements.txt and using GitHub Issues

In this video, I create a requirements.txt file so others can easily deploy the Usermapper application. I also fix a bug I found in the program. I use GitHub Issues to save notes about improvements I would like to make to the application, if I find time in the future. Finally, I discuss what I think it will take to learn enough about the Flask framework so I can move on to the next step.

Resources I mention in this video are:

Packaging and command-line arguments

In this video, I organize my Python modules into a package that others can download and install. I also modify the program so users can specify the input file and output file locations and filenames in command line arguments. The final result is at: https://github.com/blinklet/usermapper.git

Before watching this video, I recommend you read the following article:

Other resources I mention in this video are:

  • Python module search paths (the -m option)
    https://docs.python.org/3/using/cmdline.html
  • The package setup.py file: https://docs.python.org/3/distutils/setupscript.html
  • Command line arguments for Python programs: https://realpython.com/python-command-line-arguments/
  • Why we need a module named __main__ in the package directory: https://docs.python.org/3/using/cmdline.html
  • Install Python packages hosted on Github: https://pip.pypa.io/en/stable/reference/pip_install/#vcs-support
  • References in GitHub: https://docs.github.com/en/free-pro-team@latest/github/writing-on-github/autolinked-references-and-urls

Conclusion

Over the course of a month, I spent about one hour per evening learning and practicing Python. I found that choosing a specific project to implement in Python helped me learn. If you administer a Guacamole web proxy server, have a look at my Usermapper program! I am now looking forward to learning the Flask framework and building a web site using Python.

Flask web app tutorial for network engineers


Most network engineers don’t need to create web sites but they may, like me, want to convert their existing Python command-line programs into web apps so others can use them more easily. This tutorial presents the minimum you need to know about Python, Flask, and the Bootstrap CSS framework to create a practical web app that looks professional.

This tutorial covers a different type of use-case than is usually demonstrated in Flask tutorials aimed at beginners. It shows you how to create a web app that “wraps up” another Python program’s functionality.

I will show you how to use the Flask framework to build a web app that re-uses code from my Usermapper program and enables users to run it on a website, instead of installing and running it locally on their PC. You will create a “usermapper-as-a-service” application, served as a responsive web app that looks good on computer screens, tablets, and mobile phones.

I wrote this tutorial while I was learning Flask and developing my usermapper-web Flask application. It was written by a beginner, for other beginners. It walks through topics in the order in which I learned them. I hope you find this approach to be readable and informative.

Flask overview

Flask is a Python framework, or code library, that makes it easier for developers to build web applications.

I think it’s helpful to think about Flask as a server that you may configure with Python statements and functions. To use Flask, you write a Python program that configures the Flask server so that it “routes” users to “view functions” based on the address information in the URL the user entered in a web browser. The Flask server has a “user interface” that is managed by Python tools like decorators.

Prerequisite learning

I previously wrote a blog post describing The Minimum You Need to Know About Python and created a YouTube playlist about building Usermapper, my first useful Python program.

Those efforts treated Python like a simple scripting language. They focused on Python syntax and basic logic, and built programs in a procedural way. To appreciate the Flask framework, you need to learn more about Python’s object-oriented programming features and how they are used. In my case, I re-read the second half of the Learning Python book which covers both functional programming and object-oriented programming in Python, and covers Decorators.

Learning about Flask

Next, I watched a video tutorial about using Flask. There are many great videos on YouTube that introduce Flask. I looked at a few and I most enjoyed the Web Programming video from the Harvard CS50 course. It covers Flask in a two-hour-long video and it gave me confidence I could get started. Later versions of this course have been expanded so, if you want more information about Flask and web programming, go to the latest version of the CS50 course.

Finally, I browsed through the Flask documentation. I did not deep-dive into the docs. I browsed through them and learned just enough to get started.

Before you start Flask programming

Before you go further, you should review the object-oriented features in Python, read about decorators and how they are used in Python, and watch the CS50 Flask video mentioned above or a similar introduction-to-Flask video. You should have already created one or more simple command-line programs using Python and should be comfortable using the Git version control system.

No database (yet)

As you will see later, even a simple Flask app must store data somewhere so it can be used by the Flask “views” in the application. Most beginner Flask tutorials show you how to build a web app that registers user names or stores objects like photos in a database, but I disagree with forcing beginners to use databases in their first Flask applications.

I want to focus only on learning just enough Flask, Python, and Bootstrap to make a professional-looking web app. Databases are a separate subject that requires, in my opinion, more study than is usually provided in the database section of other Flask tutorials aimed at beginners. Those tutorials treat databases like some kind of “magic” and, while they show the commands required to set up a database that supports their example application, do not really teach the reader about databases. I will learn database technology later.

Most network engineers who just want to “wrap” their command-line tools in a Flask app do not need to use a database because their command-line programs, like my Usermapper program, already use files for data storage instead of a database. So, this tutorial uses the host server’s filesystem for data storage.

Eventually, you will need to learn about databases. Writing data to and reading from a database is much faster than writing to and reading from the filesystem. Using a database also allows you to deploy your web app in a more flexible environment, as you will see later when you deploy this web app to a Python platform-as-a-service. If, at some point in the future, you find your web app is used by more than a few people, you should consider incorporating a database.

Set up the programming environment

The first step in any programming project is to set up your environment. Create directories for source code, then create a Git repository, a remote Git repository, and a Python virtual environment.

Project directories

Create a directory in which you will build the Flask app and, eventually, in which you will clone the Usermapper source code so you can import its functions into your web app.

In my case, I will put all my code in a directory named “~/Projects”

$ mkdir ~/Projects
$ cd ~/Projects

Next, create a new folder for the Flask application in the ~/Projects directory. For example, I chose the directory name, usermapper-web.

$ mkdir usermapper-web
$ cd usermapper-web

In VScode, open the usermapper-web folder.

Git repository

Initialize a Git repository for the usermapper-web directory.

$ git init

Create a .gitignore file for the project. Copy the standard .gitignore file for Flask projects found at: https://github.com/pallets/flask/blob/master/.gitignore. Place the file in the usermapper-web directory.

Commit the file to your local Git repository.

$ git add .
$ git commit -m 'Added .gitignore file for Flask project'

Then change the branch name to main.

$ git branch -M main

Create a remote Git repository

Go to GitHub and create a new repository named usermapper-web. Get the URL of the repository and copy it to the clipboard. In my example, the GitHub URL is: https://github.com/blinklet/usermapper-web.git.

Then, on the local machine, connect the local Git repository to the remote GitHub repository and push all the changes you made to the remote repository:

$ git remote add origin https://github.com/blinklet/usermapper-web.git
$ git push --set-upstream origin main

Python virtual environment

Create and start a Python virtual environment in the usermapper-web directory:

$ python3 -m venv env
$ source env/bin/activate
(env) $ pip install wheel
(env) $

Now, install Flask in the usermapper-web virtual environment:

(env) $ pip install flask

Now we’re almost ready to get started.

Flask “Hello, World!”

Test that Flask is working by pasting in the classic Flask “Hello, World!” app into a file and running it.

Create a file named application.py in the usermapper-web directory. Copy and paste the following code in the file, then save it.

from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "Hello, world!"

The code shown above configures the Flask server. It does not create all the logic that builds a web app; Flask does that.

In the first line, the program imports the Flask class from the flask package. The second line creates an instance of the Flask class, which inherits the functions and classes in the application file (referenced by the __name__ variable). The third line is the Flask “route” decorator that “registers” a Flask “view” function with the address “/”, the root level of the web app. The Flask app will run the view function registered with this URL address when a user enters it in their browser’s search bar. The last two lines are the view function. The view function returns the text, “Hello, world!”, to the Flask server. The Flask server (technically, the WSGI server that supports the Flask server) presents the text to the user in their browser window.

Run the Flask app using the Flask command-line interface. The flask command reads the Flask environment variables to learn the name of the application file and any other settings you may wish to set at run-time.

At a minimum, we need to tell Flask the application module name.

(env) $ export FLASK_APP=application

Then, run Flask:

(env) $ flask run

You should see some text appear in the terminal console that tells you the IP address and port from which the Flask app is being served. In my case, it is 127.0.0.1, which is my PC’s localhost address. In the browser, go to localhost:5000 and see the text, “Hello, world!”

Flask templates, HTML & CSS

See the data rendered by the browser by using the developer tools. Enter the CTRL-SHIFT-I key combination to see the browser’s Developer Tools or CTRL-U to see the source code on the page. In this example, the source code consists only of simple text with no HTML markup.

Flask is not magic. You can’t write a few lines of Python code and get a fully functioning web page. You need to create your own HTML pages and use the Flask render_template function to grab those HTML pages and serve them up to the browser. You may also use the built-in Jinja template library to create placeholders in HTML pages that can be dynamically replaced during run-time.

By default, Flask expects to find Jinja templates in a directory named templates. Create a new folder named templates. Go to the new folder:

(env) $ mkdir templates
(env) $ cd templates

In the templates directory, create a new file named index.html with an <H1> tag, paragraph, and a form box.

If you are using VScode, you can generate a simple HTML page snippet by pressing the CTRL-space key combination, then select “HTML”.
Delete the CSS and JS links from the snippet because we do not need them, yet. Add the web page title between the title tags and the web page content between the body tags.

<!DOCTYPE html>
<html>
<head>
    <title>Hello World</title>
    <meta name='viewport' content='width=device-width, initial-scale=1.0'>
</head>
    <body>
        <h1>Hello, World!</h1>
        <p>Hello, World!</p>
    </body>
</html>

Change the Flask application so it will render the HTML template you prepared, instead of just sending plain text to the browser. Go back to the usermapper-web directory and edit the application.py file.

Import Flask’s render_template function. Modify the first line of the application.py file as shown below:

from flask import Flask, render_template

Change the object returned by the index function. Change the last line of the application.py file as shown below

    return render_template("index.html")

Instead of returning a simple string, it will now return the results of the render_template function, which takes the index.html file as an argument. Then Flask will display the result, which is simply the contents of the index.html file, in the browser.

The application.py file should now look like the code listing below:

from flask import Flask, render_template

app = Flask(__name__)

@app.route("/")
def index():
    return render_template("index.html")

To make Flask read these changes, restart it. Enter Ctrl-C in the terminal, then flask run. Refresh the browser to see the rendered contents of the index.html page.

Look at the web page’s source code in the browser development tools: CTRL-U in the browser. You should see that the text displayed in the browser is formatted, and you should see HTML code in the browser’s developer tools.

This example is not very interesting because it just serves up a web page named index.html, like any other web server. Soon, you will use Flask and Jinja templates to render web pages that include data generated by the Flask program at run time and allow the user to send data to the Flask application.

Flask development mode

To avoid restarting Flask when you modify your application code, set another environment variable to tell Flask to operate in a development environment. Flask will then automatically reload any changed code and will give you helpful error debug traces in the browser window, instead of in the console.

CTRL-C
(env) $ export FLASK_ENV=development
(env) $ flask run

Get user input using Flask forms

Enough about the basics. Now, you may begin developing the real Flask application. The application used as an example in this tutorial needs to accept input from the user. HTML web pages use forms to gather and submit user input to the Flask application.

To create a basic form in HTML, modify the index.html template as shown below and add an HTML form. Also, change the header and add paragraph text in the page so it starts to look a bit like the application you want to create. The listing below shows my first attempt at creating a form that accepts a text string:

<!DOCTYPE html>
<html>
<head>
    <meta name="viewport" content="width-device-width, initial-scale=1.0">
    <title>Guacamole User Mapper</title>
</head>
<body>
    <h1>Generate usermapping.xml</h1>
    <p>Upload your configuration file</p>
    <form>
        <label for="fname">upload file:</label><br>
        <input type="text" id="fname" name="fname"><br>
        <input type="submit" value="Submit">
    </form> 
</body>
</html>

In the browser, go to localhost:5000. Your web page should look similar to the screenshot below.

The form looks OK but it does not do anything. You need to change the code so the form submits data to the Flask application.

Flask form extensions

As always, it’s best to use tools others have created to make your programming easier. Use the Python WTForms package and the Flask-WTF Flask extension to handle forms in your application. Even with these helper libraries, you still need to know the basic HTML code for HTML forms.

Install Flask-WTF, which also installs WTForms for you:

(env) $ pip install Flask-WTF

To get some experience with Flask forms, go to the Flask-WTF Quickstart page and copy the example code. Replace the code in the application.py file with the sample code shown below:

from flask import Flask, render_template
from flask_wtf import FlaskForm
from wtforms import StringField

app = Flask(__name__)

app.config['SECRET_KEY'] = 'fix this later'

class MyForm(FlaskForm):
    filename = StringField('Filename: ')

@app.route("/", methods=('GET','POST'))
def index():
    form = MyForm()
    return render_template("index.html", form=form)

In the application.py file, above, you added the app SECRET_KEY configuration so Flask extensions and features can use it when needed. Flask-WTF uses the secret key to support Cross Site Request Forgery (CSRF) protection. Normally, you would not include the secret key value in your source code, which is why it currently has the placeholder value, “fix this later”, to remind you to clean this up before you deploy your application on a publicly-accessible web site.
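One common way to handle that later, shown here only as a sketch and not applied to this tutorial’s code listings, is to read the secret key from an environment variable so it never appears in your source code:

import os

# Read the secret key from an environment variable when deploying;
# the second argument is only a fallback for local development.
app.config['SECRET_KEY'] = os.environ.get('SECRET_KEY', 'fix this later')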

You defined a new class called MyForm that inherits all the attributes and functions from the FlaskForm class and adds an instance of the StringField class, called filename.

In the index view function, you created an instance of the MyForm class and named it form. Then you returned the template, index.html, and passed the form object instance into it as an argument.

Now, add the form object to the index.html template. Again, use the example code from the Flask-WTF Quickstart Guide. Modify the templates/index.html file as follows:

<!DOCTYPE html>
<html>
<head>
    <meta name="viewport" content="width-device-width, initial-scale=1.0">
    <title>Guacamole User Mapper</title>
</head>
<body>
    <h1>Generate usermapping.xml</h1>
    <p>Upload your configuration file</p>
    <form method="POST">
        {{ form.csrf_token }}
        {{ form.filename.label }} {{ form.filename(size=20) }}
        <input type="submit" value="Upload">
    </form>
</body>
</html>

In the index.html template, above, you used Jinja template syntax to indicate where the form object should insert its HTML code. It places code generated by its csrf_token method and by its filename method into the Jinja placeholder text defined inside the HTML form tags.

Save the file and refresh the browser. Look at the page source code in the browser by pressing the CTRL-U key combination. You can see the HTML form code that Flask and WTForms created for you. It will be similar to the code snippet below:

<form method="POST">
    <input id="csrf_token" name="csrf_token" type="hidden" value="ImE5MmQ2YmE4YzIyYzIxM2NmNWYwODgyMTA2MzYwOTEyNWMzNWQyMDki.X9p5ow.Z_-HwhCD94kA1KR7Ui7BeHRiZYQ">
    <label for="filename">Filename: </label> <input id="filename" name="filename" size="20" type="text" value="">
    <input type="submit" value="Upload">
</form>

Adding input validation to Flask forms

Validate that the submitted form has data in it. Modify the application so it will show the text entered by the user after the form is submitted.

In the application.py file, import the validator classes you need from the validators module in the WTForms library:

from wtforms.validators import DataRequired

Change the form object to use the validators:

class MyForm(FlaskForm):
    filename = StringField('Filename: ', validators=[DataRequired()])

Add the following validation check to the index function. If the validation passes, get the submitted form data, which is in the filename.data attribute of the form instance. Pass the submitted data to the index.html template by adding an extra argument when you call the render_template function.

@app.route('/', methods=('GET','POST'))
def index():
    data=None
    form = MyForm()
    if form.validate_on_submit():
        data = form.filename.data
    return render_template('index.html', form=form, data=data)

Then, modify the templates/index.html template so it will display the contents of the data variable after the form. Add the following after the <form></form> stanza, before the closing </body> tag:

        <p>{{ data }}</p>

Refresh the browser to see the results. The browser should display “None”. Check that the form will not let you submit it when the input field is empty. If you do enter some text, then you can submit the form and Flask will display the information you submitted in the form.

Change the template so it does not show “None” when you use the application for the first time. It displays “None” because the data variable is empty until you submit data in the form.

Jinja templates can include conditional statements. Replace the data variable with the following Jinja statement:

    {% if data != None %}
        <p>{{ data }}</p>
    {% endif %}

Save the file and refresh the browser. See how the page renders with no values below the form, then shows values when they are entered in the form.

Uploading files

In application.py, import the SubmitField class from the wtforms module. Import the FileField, FileRequired, and FileAllowed classes from the flask_wtf.file module. Also, import the os module so you can get system information, like the Flask project directory, when saving the file to the server. You no longer need the StringField and DataRequired classes from wtforms, so you can delete those imports.

The new import lines in the application.py file will be:

from flask import Flask, render_template
from flask_wtf import FlaskForm
from flask_wtf.file import FileField, FileRequired, FileAllowed
from wtforms import SubmitField
from werkzeug.utils import secure_filename
import os

Modify the MyForm class to handle a file upload form. Add a submit field to the class so that wtforms will handle creating the correct HTML for the submit button. It’s better to let the framework do the work for you, where possible.

class MyForm(FlaskForm):
    filename = FileField('Filename: ', 
        validators=[FileRequired(), FileAllowed(['yaml'])])
    submit = SubmitField('Upload')

Notice how you are using the new validators and are only allowing files with the .yaml extension to be uploaded.

Change the index view function to save the file that is uploaded.

@app.route("/", methods=('GET','POST'))
def index():
    form = MyForm()
    filename = None
    if form.validate_on_submit():
        f = form.filename.data
        basedir = os.path.join(
            os.path.abspath(os.path.dirname(__file__)), 
            'uploads')
        filename = os.path.join(
            basedir, secure_filename(f.filename))
        f.save(filename)
    return render_template('index.html', form=form, data=filename)

The file contents are in the form object’s filename.data attribute. We use the os.path module to get the directory in which the application.py file is located (since this may be different when we deploy to a server). We use the secure_filename function from the werkzeug module to ensure a user cannot enter a malicious file name.
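To see what secure_filename does, the short sketch below tries it at the Python interactive prompt. The example inputs and outputs are the ones used in the Werkzeug documentation:

>>> from werkzeug.utils import secure_filename
>>> secure_filename('../../../etc/passwd')
'etc_passwd'
>>> secure_filename('My cool movie.mov')
'My_cool_movie.mov'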

The form submit input type in templates/index.html will not work for file uploads. Replace it with the following Jinja template text so Flask-WTF can insert the submit tag HTML generated by the form.submit field.

Replace the text in templates/index.html:

        <form method="POST">
            {{ form.csrf_token }}
            {{ form.filename.label }} {{ form.filename(size=20) }}
            <input type="submit" value="Go">
        </form>

with the following text:

        <form method="POST" enctype="multipart/form-data">
            {{ form.csrf_token }}
            {{ form.filename.label }} {{ form.filename(size=20) }}
            {{ form.submit }}
        </form>

Notice you added an encoding type to the form. Since you are now using the FileField class, you must change the form tag shown above so it tells the browser that the POST data will be encoded as multipart data.

If the form validation fails, you need to send an error message to the user. Insert the following code, which will display errors raised by the form.filename object, after the {{ form.submit }} Jinja placeholder:

    {% for error in form.filename.errors %}
        <p style="color: red;">{{ error }}</p>
    {% endfor %}

You must create an uploads directory in the application’s folder because you hard-coded your index view function in application.py to save files in a folder named “uploads”.

(env) $ cd ~/Projects/usermapper-web
(env) $ mkdir uploads

Refresh the browser. Notice that the form looks different. Now, it contains a Browse button that will open the file explorer on your PC so you can find the file to upload.

Now, you can upload a YAML file using its original filename in the relative directory, ./uploads. When you upload a file, the Flask app saves it in the uploads directory and displays the file’s path on the screen. The application screen should look like the screenshot below:

Saving temporary files

Saving an uploaded file to a single location on disk could cause problems for web apps used by multiple users. Multiple users may overwrite each others’ configuration files.

Solve this problem by creating unique temporary files in randomly-named directories. Use the tempfile.mkdtemp function from the Python standard library to create a temporary directory that is unique for each user session.

NOTE: The temporary storage issue can also be solved by eventually incorporating a database and giving each user a unique id saved in their session variable — but that’s all for a later project.

In the application.py file, import the tempfile module:

import os, tempfile

Change the logic that defines the filename in the index view function. After the basedir variable is defined, add a line that defines the tempdir variable and change the filename variable so it now incorporates the tempdir variable as part of its path:

        tempdir = tempfile.mkdtemp(dir=basedir)

        filename = os.path.join( 
            tempdir, secure_filename(f.filename))

Save the file and refresh the browser. Upload a config file. Check the filesystem for the temporary directory name, then go to it. You should see a random directory name containing the file you uploaded. For example:

(env) $ ls ./uploads/ 
test.yaml  tmpcrlrrmwa  tmppe646x2r

Limit the upload file size

As an additional check, limit the allowed size of the uploaded file. A malicious user could use up all your disk space or memory if they submit a very large file.

I could not find a form validator in Flask or Flask-WTF that lets you limit upload file size at the form level. You could also check the file size on the client, using JavaScript (maybe as a future task).

Instead, implement a basic workaround using Flask configuration values. In the application.py file, under the secret key configuration, add a new configuration value that limits file upload sizes to 1 MB.

app.config['SECRET_KEY'] = 'fix this later'
app.config['MAX_CONTENT_LENGTH'] = 1024 * 1024

Now, any file that exceeds one megabyte in size will fail to upload. According to the docs, Flask will raise an exception called RequestEntityTooLarge so, if you want, you can catch that exception and produce a nicer error message (also a future task), as sketched below.
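If you do want to handle it, one minimal sketch is to register a Flask error handler for that exception and return a friendlier message. This is only an illustration and is not included in the application files listed in the next section:

from werkzeug.exceptions import RequestEntityTooLarge

@app.errorhandler(RequestEntityTooLarge)
def file_too_large(error):
    # Return a plain-text message and the 413 status code
    # instead of Flask's default error page.
    return 'The uploaded file is larger than the 1 MB limit', 413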

Application files

The two files should now look like the two listings below:

application.py
from flask import Flask, render_template
from flask_wtf import FlaskForm
from flask_wtf.file import FileField, FileRequired, FileAllowed
from wtforms import SubmitField
from werkzeug.utils import secure_filename
import os, tempfile

app = Flask(__name__)

app.config['SECRET_KEY'] = 'fix this later'
app.config['MAX_CONTENT_LENGTH'] = 1024 * 1024

class MyForm(FlaskForm):
    filename = FileField('Filename: ', 
        validators=[FileRequired(), FileAllowed(['yaml'])])
    submit = SubmitField('Upload')

@app.route("/", methods=('GET','POST'))
def index():
    form = MyForm()
    filename = None
    if form.validate_on_submit():
        f = form.filename.data
        basedir = os.path.join(
            os.path.abspath(os.path.dirname(__file__)), 
            'uploads')
        tempdir = tempfile.mkdtemp(dir=basedir)
        filename = os.path.join( 
            tempdir,secure_filename(f.filename))
        f.save(filename)
    return render_template('index.html', form=form, data=filename)

templates/index.html
<!DOCTYPE html>
<html>
<head>
    <meta name="viewport" content="width-device-width, initial-scale=1.0">
    <title>Guacamole User Mapper</title>
</head>
<body>
    <h1>Generate usermapping.xml</h1>
    <p>Upload your configuration file</p>
    <form method="POST" enctype="multipart/form-data">
        {{ form.csrf_token }}
        {{ form.filename.label }} {{ form.filename(size=20) }}
        {{ form.submit }}
        {% for error in form.filename.errors %}
        <p style="color: red;">{{ error }}</p>
        {% endfor %}
    </form>
    {% if data != None %}
        <p>{{ data }}</p>
    {% endif %}
</body>
</html>

Downloading a file from a Flask app

You will eventually want users to be able to download the XML file generated by the Usermapper package. To experiment with this functionality, write some code that will download the file you recently uploaded.

Modify the application.py file. Import the send_from_directory and url_for modules from the Flask package:

from flask import Flask, render_template, send_from_directory, url_for

Modify the index view function to add a download link called download_url, which points to the temporary file created when you previously uploaded a file, and send the download_url variable to the index.html template as another argument.

The download_url should contain the route address, download/, and the relative path of the file that had previously been uploaded.

Define the download_url variable near the start of the view function.

    download_url = ""

Replace the return statement at the end of the index view function with the following lines. The download_url creates a Dynamic URL composed of the /download route address, the name of the temporary folder that was created in the uploads folder in the filesystem, and the name of the file that was previously uploaded. Flask Dynamic URLs allow us to pass simple arguments from one view function to another view function.

        tempfolder = os.path.split(tempdir)[1]
        download_url = os.path.join(
            '/download', tempfolder, secure_filename(f.filename))

    return render_template(
        'index.html', form=form, 
        data=filename, download_url=download_url)

Create a new route and view function for downloading the file from the temporary directory. The route address, “/download/<tempfolder>/<filename>”, in the example below uses Flask’s Dynamic URL feature to send information encoded in the route URL to the view function.

@app.route("/download/<tempfolder>/<filename>", methods=('GET','POST'))
def download(tempfolder,filename):
    basedir = os.path.join(
        os.path.abspath(os.path.dirname(__file__)), 'uploads')
    temp_dir = os.path.join(basedir,tempfolder)
    return send_from_directory(
        temp_dir, filename, as_attachment=True)

Then, add a link to the templates/index.html template so the user can download the file. Display the download link in the browser only if the data variable is not empty, so it only appears if a file was previously uploaded.

Replace the following text to templates/index.html, after the HTML form tags:

    {% if data != None %}
        <p>{{ data }}</p>
    {% endif %}

with the following text:

    {% if data != None %}
        <p></p>
        <p><a href="{{ download_url }}">Download the file you recently uploaded</a></p>
        <p></p>
        <h2>File path:</h2>
        <p></p>
        {{ data }}
    {% endif %}

Save the template file. Refresh the browser.

Upload a file. Then, see the download link. Verify you can download the file when you click on the download link. After downloading the test file, your browser should look similar to the screenshot below:

One issue is that the temporary folders do not get automatically cleaned up. That’s a problem you will address later in this tutorial.

Now, a web app user can upload any file and then download the same file. By working through the previous steps, you have learned how Flask and Jinja templates can create a functional web page that displays different options depending on the values stored in program variables, and enables users to upload and download a file. You also learned how to pass simple bits of information from one view to another with Dynamic URLs.

You are ready to convert an existing Python command-line application to a Flask web app, or to build your own original Flask web app.

Wrapping an existing program in a Flask web app

In this tutorial, you will create a Flask app that will upload and read the contents of a YAML configuration file so the Usermapper program I previously wrote can read the uploaded configuration file and generate the XML file. Then, the Flask application will allow the user to download the generated XML file.

To “wrap” my Usermapper command-line program in a Flask web app, you need to import functions from my Usermapper package and reuse them. To get access to these functions, you must install the Usermapper package in your Python virtual environment.

Install the CLI package you plan to convert

Clone the Usermapper source code to the ~/Projects folder.

(env) $ cd ~/Projects
(env) $ git clone https://github.com/blinklet/usermapper.git

This creates a folder named usermapper and downloads the package files into it.

Have a look at the source code.

(env) $ tree usermapper
usermapper
├── config.yaml
├── example_config.yaml
├── example-xml.xml
├── LICENSE
├── README.md
├── requirements.txt
├── setup.py
├── test.py
└── usermapper
    ├── __init__.py
    ├── __main__.py
    ├── mapperdata.py
    └── usermapper.py

The source code consists of some helper files and a package directory named usermapper that contains the modules mapperdata.py and usermapper.py.

Install the usermapper package in the usermapper-web virtual environment in editable mode, so any changes you make to the source code in the ~/Projects/usermapper/usermapper package directory are automatically applied to the installed instance in the usermapper-web virtual environment:

(env) $ pip install --editable ~/Projects/usermapper

By re-using packaged code in this way, any changes I make to my original usermapper package will be available to users of the command-line usermapper application, as well as to users of the web app.

Keeping the code for the two applications separated like this enables developers to work individually on their projects, as long as the interfaces are agreed between projects. This avoids maintaining the same code in two different projects.

Using Usermapper functions

Import the usermapper package’s functions into the application.py Flask program. Also, import the yaml module from the Python standard library. The application.py file’s imports should change as follows.

Add yaml to the module imports line:

import os, tempfile, yaml

Add functions from the usermapper package:

from usermapper.usermapper import xmlwriter
from usermapper.mapperdata import get_users

I also removed the werkzeug.utils import line because we no longer need the secure_filename function.

Instead of saving the uploaded configuration file on the server’s filesystem, process it immediately and save the generated XML file. The uploaded config file is stored in memory as the file object named f. Also, since you know the location you want to use for the temp folders, you will not put the entire relative path in the URL.

Delete the uploads directory and create a new directory named “downloads”.

(env) $ rm -rf uploads
(env) $ mkdir downloads

In the application.py file, change the index view function to the following. Replace the following text near the end of the view function:

        f = form.filename.data
        basedir = os.path.join(
            os.path.abspath(os.path.dirname(__file__)), 'uploads')
        tempdir = tempfile.mkdtemp(dir=basedir)
        filename = os.path.join( 
            tempdir, secure_filename(f.filename))
        f.save(filename)
        tempfolder = os.path.split(tempdir)[1]
        download_url = os.path.join(
            '/download', tempfolder, secure_filename(f.filename))

with the following new text:

        f = form.filename.data
        basedir = os.path.join(
            os.path.relpath(os.path.dirname(__file__)), 'downloads')
        tempdir = tempfile.mkdtemp(dir=basedir)

        filename = os.path.join(tempdir, 'user-mapping.xml')

        configuration = yaml.safe_load(f.read())
        structure = get_users(configuration)
        xmlwriter(structure,filename)

        tempfolder = os.path.split(tempdir)[1]
        download_url = os.path.join(
            '/download',tempfolder,'user-mapping.xml')

You made a lot of changes in the index view function. You built the basedir variable using the os.path.relpath function instead of os.path.abspath and pointed it to the new downloads directory. You changed the filename to a hard-coded value, user-mapping.xml. You no longer just save an uploaded file. You used the functions you imported from the Usermapper package to read the uploaded configuration file, process it and save the results to a temporary directory. Then, you set the download_url variable, which will create the download link in the Jinja template, to the file path of the saved user-mapping.xml file. You split statements into multiple lines to make the code a bit more readable.

Also, change the basedir variable in the download view function so it also points to the new downloads folder:

@app.route("/download/<tempfolder>/<filename>", methods=('GET','POST'))
def download(tempfolder,filename):
    basedir = os.path.join(
        os.path.abspath(os.path.dirname(__file__)), 
        'downloads')
    temp_dir = os.path.join(basedir,tempfolder)
    return send_from_directory(
        temp_dir,filename,as_attachment=True)

Reload the browser and upload the config file again. This time use a real configuration file. I suggest you use the file: ~/Projects/usermapper/example_config.yaml.

See that a new file named user-mapping.xml has been created in a randomly-named directory in the usermapper-web/downloads/ directory.

Create a file preview in the web app

To provide some feedback to the user so they know the file generation worked, add some code that previews the contents of the XML file on the web page.

Modify the index view function. Change the section starting with configuration = yaml.safe_load(f.read()) to:

        configuration = yaml.safe_load(f.read())
        structure = get_users(configuration)
        xmlwriter(structure,filename)

        preview = open(filename, 'r')
        data = preview.readlines()
        preview.close()

        temp_folder = os.path.split(tempdir)[1]
        download_url = os.path.join('/download',temp_folder)

    return render_template('index.html', 
        form=form, data=data, download_url=download_url)

This is a bit kludgey [2]. You write the user-mapping.xml file to disk, then re-open and read it to get its contents to display on the web page. But, sometimes, “good enough” is good enough.

Next, fix the templates/index.html template. It currently displays the XML preview as a big blob of text all on one line. Fix this by changing the Jinja template.

Change the templates/index.html template so it uses a Jinja for loop to iterate through the data object line by line. Use minus signs to manually strip whitespace from the HTML code that Jinja generates in the for loop block. Also, change some of the text in the download link generated by the index.html template.

In templates/index.html, change the if block from:

    {% if data != None %}
        <p></p>
        <p><a href="{{ download_url }}">Download the file you recently uploaded</a></p>
        <p></p>
        <h2>File path:</h2>
        <p></p>
        {{ data }}
    {% endif %}

to:

    {% if data != None %}
        <p></p>
        <p><a href="{{ download_url }}">Download the <em>user-mapping.xml</em> file</a></p>
        <p></p>
        <h2>File path:</h2>
        <p></p>
        <pre><code>
            {%- for item in data -%}
                {{ item }}
            {%- endfor %}
        </code></pre>
    {% endif %}

This displays the contents of user-mapping.xml in the browser. The file text needs to be formatted better and you can clean that up later with some CSS or Bootstrap classes.

It would also be helpful to add a button that will copy the text from the user-mapping.xml file. To do that, you would need to include some JavaScript to enable a copy field. So, that’s a topic for another tutorial.

Cleaning up temporary files

Delete temporary files after the user has downloaded them so they do not eventually fill up your disk.

You should give the users at least a few minutes to download their files after they are generated. The temporary file should persist for, maybe, 20 minutes and then be deleted.

I think the easiest way is to run a cron job that runs every twenty minutes and deletes temporary files older than twenty minutes.

Create a crontab entry:

$ crontab -e

add the following line:

*/20 * * * * find /home/brian/Projects/usermapper-web/downloads/tmp* -maxdepth 0 -mmin +20 -exec rm -fr {} +;

Check if it is working (after twenty minutes):

$ grep CRON /var/log/syslog

NOTE: Hard-coding the temporary file location like this is not ideal. You will eventually deploy this program to a remote server or to a serverless platform which may handle temporary files differently. As a future improvement, you may define the temporary file location using an environment variable so you can configure it to the appropriate value on any service or server where you deploy this app.
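
For example, a minimal sketch of that idea is shown below. It assumes a hypothetical DOWNLOAD_DIR environment variable, which is not part of this tutorial's code; when the variable is not defined, the app falls back to the local downloads folder used during development.

import os

# Hypothetical DOWNLOAD_DIR environment variable; fall back to the local
# "downloads" folder used in this tutorial when it is not defined.
basedir = os.environ.get(
    'DOWNLOAD_DIR',
    os.path.join(os.path.relpath(os.path.dirname(__file__)), 'downloads'))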

Create a separate download page

After uploading a config file and generating a user-mapping file, you will find that refreshing the browser generates a new temporary directory containing a new user-mapping.xml file. If you keep refreshing the browser, you generate more and more temporary files. Someone could create a minor denial of service attack and fill up your downloads directory with temporary files just by holding down the CTRL-R key combination in their browser!

This problem exists because the upload and download services are both on the same page. The browser sees the user is still on the same page so the browser stores the state of the last request. If you refresh the page at this point, the browser re-submits the cached form object, triggering the upload again and generating a new user-mapping file in a new temporary directory.

The solution to the refresh problem is to create a new route and view function that returns a new template after successfully uploading a configuration file and saving the generated XML file. The new template will display the file preview and the XML file download link, and will include a link back to the index page.

Create a new template named templates/download.html. Copy the contents of the index.html template, paste them into the new template, and remove the form. Add a link back to the index view. Change the displayed text so the instructions are clear. The templates/download.html template should look like the following:

<!DOCTYPE html>
<html>
<head>
    <meta name="viewport" content="width-device-width, initial-scale=1.0">
    <title>Guacamole User Mapper</title>
</head>
<body>
    <h1>Download usermapping.xml</h1>
    <p>Download your configuration file</p>

    {% if data != None %}
        <p></p>
        <p><a href="{{ download_url }}">Download user-mapping.xml</a></p>
        <p></p>
        <p><a href="{{ url_for('index') }}">Create another user-mapping File</a></p>
        <p></p>
        <h2>File preview:</h2>
        <pre><code>
            {%- for item in data -%}
                {{ item }}
            {%- endfor %}
        </code></pre>
    {% endif %}
</body>
</html>

Then, change the index.html template and remove the download link and file preview, as shown below:

<!DOCTYPE html>
<html>
<head>
    <meta name="viewport" content="width-device-width, initial-scale=1.0">
    <title>Guacamole User Mapper</title>
</head>
<body>
    <h1>Generate usermapping.xml</h1>
    <p>Upload your configuration file</p>
    <form method="POST" enctype="multipart/form-data">
        {{ form.csrf_token }}
        {{ form.filename.label }} {{ form.filename(size=20) }}
        {{ form.submit }}
        {% for error in form.filename.errors %}
            <p style="color: red;">{{ error }}</p>
        {% endfor %}
    </form>
</body>
</html>

Modify application.py to support the separate index and download templates.

Add redirect to the list of imports from the Flask package, as shown below.

from flask import Flask, render_template, send_from_directory, url_for, redirect

The index view function now only needs to handle the configuration file upload and conversion to the XML user mapping file. Add a redirect to a new route named download_page and move the logic that builds the download_url from the index view function to the download_page view function. Pass the xml file’s temporary directory name to the new route, using the Flask url_for function to build a Dynamic URL.

Delete the initial download_url definition statement in the index view function. Delete the following text:

    download_url = ""

Also, delete code that reads the generated user-mapping.xml file. Delete the following text:

        preview = open(filename, 'r')
        data = preview.readlines()
        preview.close()

Delete the second download_url definition statement at the end of the if form.validate_on_submit(): block in the index view function:

        download_url = os.path.join('/download',temp_folder)

Add a redirect statement at the end of the if form.validate_on_submit(): block in the index view function so that, when the form is submitted and the user-mapping.xml file is generated, the web app redirects to the download page:

        return redirect (url_for('download_page', temp_folder=temp_folder))

Delete the data and download_url variables from the index view function’s return statement. The statement should now look like the line shown below:

    return render_template('index.html', form=form)

The index view function should now look like the source code below:

@app.route("/", methods=('GET','POST'))
def index():
    form = MyForm()
    filename = ""
    if form.validate_on_submit():
        f = form.filename.data
        basedir = os.path.join(
            os.path.relpath(os.path.dirname(__file__)), 
            'downloads'
        )
        tempdir = tempfile.mkdtemp(dir=basedir)
        filename = os.path.join(tempdir, 'user-mapping.xml')

        configuration = yaml.safe_load(f.read())
        structure = get_users(configuration)
        xmlwriter(structure, filename)

        temp_folder = os.path.split(tempdir)[1]
        return redirect (url_for('download_page', temp_folder=temp_folder))

    return render_template('index.html', form=form)

Create a new view function called download_page. It receives the temporary directory name in a dynamic URL. Add into it the preview file logic you deleted from the index view function. When adding back in the file preview code, change the file open statement to a with statement, which is more “Pythonic”, results in fewer lines of code, and automatically closes the file when it is no longer needed.

@app.route('/download_page/<temp_folder>', methods=('GET','POST'))
def download_page(temp_folder):
    filename = os.path.join(
        os.path.relpath(os.path.dirname(__file__)), 
        'downloads',temp_folder,'user-mapping.xml')

    with open(filename) as preview:
       data = preview.readlines()

    download_url = url_for('download', 
        tempfolder=temp_folder, filename='user-mapping.xml')

    return render_template('download.html', 
        data=data, download_url=download_url)

Refresh the browser to test the application. After uploading a configuration file, you should end up with a screen that looks like the screenshot below.

You should see that, after you upload a file, you are redirected to a page that previews the generated XML file and provides a link to download it. Refreshing the browser no longer regenerates the download file.

Commit your code to Git, and record TO-DOs

Now your program is fully functional. Commit the new code to Git and push it to the remote repository.

(env) $ cd ~/Projects/usermapper-web
(env) $ git add .
(env) $ git commit -m 'First Flask program'
(env) $ git push

Next, make a record of any improvements you would like to make so, if you have time, you can implement those improvements in the future.

Go to the GitHub repository on the GitHub web site. In my case, it is https://github.com/blinklet/usermapper. Click on the Issues link and record any ideas you have for improving the code, so you do not forget about them.

There are a number of issues I want to record for later implementation in both the Usermapper command-line app repository and the Usermapper-web web app repository.

In the Usermapper project repository, I added the following Issues:

  1. To improve efficiency, create the contents of the user-mapping.xml as a list in memory and return it to the flask app. The Flask app will save it to temporary storage. This decouples the usermapper.mapperdata module from the filesystem. (See the sketch after this list.)
  2. Add more error checking code on the loaded configuration file (example: so we do not create crazy-large xml files if someone says there are one million students). Some ideas for config file restrictions:

  • Only two user types allowed (trainer and student)
    • Maybe three, for flexibility
  • Up to 4 trainers allowed
  • Up to 12 of any other type allowed
  • Up to 10 device types allowed
  • Up to 10 devices per type
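
To illustrate the first issue above, here is a rough sketch of how the Flask side could change if the usermapper package returned the XML content as a string instead of writing it to disk. The xml_text argument and the save_and_preview function name are hypothetical, not part of the existing packages.

# Hypothetical helper for application.py, assuming a refactored usermapper
# package that returns the generated XML as a single string (xml_text).
def save_and_preview(xml_text, filename):
    # Save the generated file so the download view can serve it later
    with open(filename, 'w') as xml_file:
        xml_file.write(xml_text)
    # Build the preview list without re-reading the file from disk
    return xml_text.splitlines(keepends=True)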

In the Usermapper-web project repository, I added the following issues:

  1. Improve the user interface appearance with CSS and Bootstrap?
  2. Add more input checking, such as for configuration file size, on the client side using JavaScript.

  3. New user interface: Create a set of dynamic forms that allow the user to build the configuration in the browser and submit it, instead of preparing a YAML configuration file in advance.

  4. Use session cookies instead of passing variables between routes using dynamic URLs. It is more secure and more flexible. See: Flask-Session. A rough sketch of the idea appears after this list.
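
The sketch below uses Flask's built-in signed session cookie, which already works because the app sets SECRET_KEY; the Flask-Session extension mentioned above adds server-side session storage on top of the same interface. The route names here are hypothetical and are not part of this tutorial's code.

from flask import Flask, session, redirect, url_for

app = Flask(__name__)
app.config['SECRET_KEY'] = 'replace-with-a-real-secret-key'

@app.route('/generate')
def generate():
    # Store the temporary folder name in the signed session cookie
    # instead of encoding it in a dynamic URL.
    session['temp_folder'] = 'tmpexample123'
    return redirect(url_for('download_page'))

@app.route('/download_page')
def download_page():
    # Read the value back in another view.
    return session.get('temp_folder', 'no folder stored')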

A quick break

Congratulations on making it this far through the tutorial. You successfully converted a Python command-line application into a web app using Flask and Flask extensions. At this point, you have a fully-functioning web app that runs in the development environment on your PC. If you only plan to use the application by yourself, you could stop here.

If you wish to share this application with others, continue reading. The second half of this tutorial shows you how to make the web app look more professional with the Bootstrap CSS library, and how to deploy the web app to a production environment running on a cloud service so everyone in the world can use it.

Style your web app with Bootstrap

Currently, your web app works but it looks terrible. I imagine you want to make the web app look more professional but you don’t want to spend an extra week learning CSS and JavaScript. You will achieve faster results if you use the Bootstrap library, which provides a set of HTML classes you can use to style and structure a web page.

Bootstrap-Flask

To keep things simple, I will use the Bootstrap-Flask helper library instead of manually importing Bootstrap and working with classes. Hopefully, the library developer will keep it up to date because Bootstrap 5 is coming out soon.

Install Bootstrap-Flask in your environment:

(env) $ pip install bootstrap-flask

Modify the application.py program to include Bootstrap-Flask. Import the Bootstrap class to the program, as shown below:

from flask_bootstrap import Bootstrap

Register Bootstrap with the application by creating an instance of the Bootstrap class, named bootstrap, and passing it the original Flask application instance, named app:

app = Flask(__name__)
bootstrap = Bootstrap(app)

Bootstrap-Flask provides some Jinja macros that make developing templates a bit easier — especially for more complex elements like tables and forms. I am using it as a quick way to style my web app forms without learning a lot about Bootstrap, itself. However, Bootstrap-Flask covers only a small amount of Bootstrap functionality so, if you need it, the normal Bootstrap 4 classes are all still available.

Jinja Template hierarchy and design

Now is a good time to start using the block rendering features in Jinja templates because you will have common elements, like a header or navigation bar, on each new page you create. To learn more about Jinja templates and template inheritance, see the Jinja template tutorials and videos from Pythonise.

Create a template file named templates/base.html and copy the Bootstrap-Flask starter template from the Bootstrap-Flask web site into the base template. Also, change the title to a block placeholder:

The base.html template will look like:

<!doctype html>
<html lang="en">
<head>
    {% block head %}
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">

    {% block styles %}
        <!-- Bootstrap CSS -->
        {{ bootstrap.load_css() }}
    {% endblock %}

    <title>{% block title %}Your page title{% endblock %}</title>
    {% endblock %}
</head>
<body>
    <!-- Your page content -->
    {% block content %}{% endblock %}

    {% block scripts %}
        <!-- Optional JavaScript -->
        {{ bootstrap.load_js() }}
    {% endblock %}
</body>
</html>

Then, change the templates/index.html and templates/download.html templates so they inherit from the base template and each extends the title and content blocks with unique data.

The index.html template will look like:

{% extends "base.html" %}

{% block title %}Guacamole User Mapper{% endblock %}

{% block content %}
    <h1>Generate usermapping.xml</h1>
    <p>Upload your configuration file</p>
    <form method="POST" enctype="multipart/form-data">
        {{ form.csrf_token }}
        {{ form.filename.label }} {{ form.filename(size=20) }}
        {{ form.submit }}
        {% for error in form.filename.errors %}
            <p style="color: red;">{{ error }}</p>
        {% endfor %}
    </form>
{% endblock %}

The download.html template will look like:

{% extends "base.html" %}

{% block title %}Guacamole User Mapper{% endblock %}

{%block content %}

    <h1>Download usermapping.xml</h1>

    {% if data != None %}
    <p></p>
    <p><a href="{{ download_url }}">Download user-mapping.xml</a></p>
    <p></p>
    <p><a href="{{ url_for('index') }}">Create another user-mapping File</a></p>
    <p></p>
    <h2>File preview:</h2>
        <pre><code>
            {%- for item in data -%}
                {{ item }}
            {%- endfor %}
        </code></pre>
    {% endif %}

{% endblock %}

Reload the web page and see that the fonts have changed. This gives us some indication that Bootstrap is working properly. That’s how simple it is to add Bootstrap to the page.

Adding Bootstrap styles

Now we need to dig through the Bootstrap and Bootstrap-Flask documentation. We’ll be using div classes and other tag classes to style the elements on the web page. Because I do not have the time to become an expert in CSS, I’ll use only the classes that Bootstrap and Bootstrap-Flask provide.

Add some style to the form on the index page.

Replace all the Jinja form placeholders with just one line, which uses the render_form macro from Bootstrap-Flask.

You need to import the render_form macro into the template. Add the following line after the extends block at the top of the index.html file:

{% from 'bootstrap/form.html' import render_form, render_field %}

Delete the following text from index.html:

    <form method="POST" enctype="multipart/form-data">
        {{ form.csrf_token }}
        {{ form.filename.label }} {{ form.filename(size=20) }}
        {{ form.submit }}
        {% for error in form.filename.errors %}
            <p style="color: red;">{{ error }}</p>
        {% endfor %}
    </form>

and replace it with the following text:

        {{ render_form(form) }}

The index.html template now looks like:

{% extends "base.html" %}
{% from 'bootstrap/form.html' import render_form, render_field %}

{% block title %}Guacamole User Mapper{% endblock %}

{% block content %}
    <h1>Generate usermapping.xml</h1>
    <p>Upload your configuration file</p>

    {{ render_form(form) }}

{% endblock %}

This is a good example that shows how Flask extensions can make things simpler. Bootstrap-Flask’s render_form macro takes the form object that was passed into the template from application.py’s index view and renders all the form fields defined in the form object.

Refresh the browser to see the changes. The form looks different because it is rendered by Bootstrap-Flask using CSS style classes provided by Bootstrap. Using Bootstrap-Flask macros makes development easier, but it forces you to give up some control over appearance.

To make things look a bit better, modify the application.py file and add a message in the FileAllowed validator and a description in the FileField object. Use the same text in both messages so it looks like the message turns red if the validation fails.

The MyForm class in the application.py file should now look like:

class MyForm(FlaskForm):
    filename = FileField('Select configuration file: ', 
        validators=[FileRequired(), FileAllowed(['yaml'], 
        message='Only YAML files accepted')], 
        description="Only YAML files accepted")
    submit = SubmitField('Upload')

Save the file and refresh the browser. Your web page should now look similar to the below screenshot:

Using the Bootstrap grid

Next, use Bootstrap’s grid system to arrange elements on the index web page. Create one row with two columns: one containing the form and another containing some information for the user.

Add div tags with Bootstrap’s container, row, and column-size classes to the index.html template. The content block in the index.html template should now look like:

{% block content %}

<div class='container'>
    <div class='row'>
        <div class='col-sm'>
            {{ render_form(form) }}
        </div>
        <div class='col-sm'>
            <h1>Generate usermapping.xml</h1>
            <p>Upload your configuration file</p>
        </div>
    </div>
</div>

{% endblock %}

Refresh the browser and see that the page is rendered in two columns and the layout is responsive. It should look similar to the screenshot below:

Similarly, add a grid layout to the download.html template. The content block in the download.html template should now look like:

{%block content %}

<div class = 'container'>
    <div class = 'row'>
        <div class='col'>
            <h1>Download user-mapping.xml</h1>

            {% if data != None %}
                <p></p>
                <p><a href="{{ download_url }}">Download user-mapping.xml</a></p>
                <p></p>
                <p><a href="{{ url_for('index') }}">Create another user-mapping File</a></p>
                <p></p>
                <h2>File preview:</h2>
                <pre><code>
                    {%- for item in data -%}
                        {{ item }}
                    {%- endfor %}
                </code></pre>
            {% endif %}

        </div>
    </div>
</div>

{% endblock %}

Jinja filters

Previously, in the download.html template, you used HTML preformatted text tags to present the user-mapping.xml file preview. This is OK, but could be better. You have limited style options in the preformatted text and the displayed lines are spaced a bit too far apart.

Now that you’ve learned more about Jinja templates, you can code a better solution using Jinja filters.

In the download.html template, delete the preformatted text tags and use jinja filters to preserve the preview indenting. Replace the text:

            <pre><code>
                {%- for item in data -%}
                    {{ item }}
                {%- endfor %}
            </code></pre>

With the following text:

        <p style="font-size: small; line-height: 1.25; font-family: 'Courier New', Courier, monospace;">
            {% for item in data %}
                {{ item|replace(' ','&nbsp;'|safe )}}<br/>
            {% endfor %}
        </p>

Instead of the preformatted text tag, you now use a paragraph tag and specify the style that will be rendered in the browser. But the blank spaces you use to indent the XML code will not be rendered by the browser, so you need to replace each space character with the HTML code for a non-breaking space, &nbsp;.

As the for loop iterates through the item placeholder, the Jinja replace filter swaps spaces for HTML non-breaking-space codes and uses the safe filter to prevent Jinja from automatically escaping the non-breaking-space HTML codes.

The final download.html template should look like:

{% extends "base.html" %}

{% block title %}Guacamole User Mapper{% endblock %}

{%block content %}

<div class = 'container'>
    <div class = 'row'>
        <div class='col'>
            <h1>Download user-mapping.xml</h1>

            {% if data != None %}
                <p></p>
                <p><a href="{{ download_url }}">Download user-mapping.xml</a></p>
                <p></p>
                <p><a href="{{ url_for('index') }}">Create another user-mapping File</a></p>
                <p></p>
                <h2>File preview:</h2>
                <p style="font-size: small; line-height: 1.25; font-family: 'Courier New', Courier, monospace;">
                    {% for item in data %}
                        {{ item|replace(' ','&nbsp;'|safe )}}<br/>
                    {% endfor %}
                </p>
            {% endif %}

        </div>
    </div>
</div>

{% endblock %}

Save the file and refresh the browser. After you upload a configuration file, the download page will look similar to the screenshot below:

More styling and content

Make more changes to the templates. At this point, you can use your personal taste to design your web page. To learn a bit more about Bootstrap classes, watch the Bootstrap course on Scrimba. The course consists of ten videos covering everything you need to know to produce a page similar to what I have created, below. Each video is only a few minutes long.

In the final templates, listed below, I spent more time refining the positioning of responsive elements in the Bootstrap grid system and I added a Bootstrap navigation bar to the web site.

I also added additional text that explains how to use the program.

The base.html template

The final version of the base.html template is shown below:

<!doctype html>
<html lang="en">
<head>
    {% block head %}
    <!-- Required meta tags -->
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">

    {% block styles %}
        <!-- Bootstrap CSS -->
        {{ bootstrap.load_css() }}
    {% endblock %}
    <title>{% block title %}Your page title{% endblock %}</title>
    {% endblock %}
</head>
<body>
    <nav class="navbar navbar-light bg-light navbar-expand-sm">
        <a href="{{ url_for('index') }}" class="navbar-brand">UserMapper</a>
        <button class="navbar-toggler" data-toggle="collapse" data-target="#navbarCollapse">
            <span class="navbar-toggler-icon"></span>
        </button>
        <div class="collapse navbar-collapse" id="navbarCollapse">
            <ul class="navbar-nav ml-auto">
                <li class="navbar-item">
                    <a href="http://www.brianlinkletter.com" target="_blank" class="nav-link" rel="noopener">Network Simulation Blog</a>
                </li>
            </ul>
        </div>
    </nav>

    <div class="container-fluid mt-3">

    <!-- Your page content -->
    {% block content %}{% endblock %}

    </div>

    {% block scripts %}
        <!-- Optional JavaScript -->
        {{ bootstrap.load_js() }}
    {% endblock %}
</body>
</html>

The index.html template

The final version of the index.html template is shown below:

{% extends "base.html" %}
{% from 'bootstrap/form.html' import render_form, render_field, render_form_row %}

{% block title %}Guacamole User Mapper{% endblock %}

{% block content %}

    <div class='row'>

        <div class="col-md-5 bg-light ml-3">
            <form method='POST' enctype="multipart/form-data" class="p-3">
                {{ render_form(form, button_style="primary", form_type="basic") }}
            </form>
        </div>

        <div class="col-md-6 float-right">
            <div class="ml-3 mt-3">
                <p>UserMapper builds <a href="https://guacamole.apache.org/" target="_blank" rel="noopener">Apache Guacamole remote desktop gateway</a> basic authentication files (user-mapping.xml) for small network emulation labs used by one or more trainers and students. Each user will be given access to the same lab devices.</p>
            </div>
        </div>

    </div>
    <div class="row ml-1">

        <div class='col'>
            <hr />
            <h5>Instructions:</h5>
            <p>Upload a YAML configuration file. UserMapper will build a Guacamole basic authentication file based on your configuration. The file name must end with the ".yaml" extension.</p>
            <p>Create a file similar to the example configuration file listed below. You must have at least one user type and one device type. You may add more user types and device types as necessary. You may also add additional device parameters from the <a href="https://guacamole.apache.org/doc/gug/configuring-guacamole.html" target="_blank" rel="noopener">list of Guacamole configuration parameters</a>.</p>
            <p>The <em>username_suffix</em>, device <em>name_suffix</em>, and device <em>hostname_suffix</em> must be a number with or without leading zeros, enclosed in quotes. We generate names by combining the corresponding name prefix and a different name suffix with a length equal to the length of the suffix string and starting at the number specified in the suffix.</p>
            <p>If a user type's <em>password</em> is "random", each user will be assigned a unique random password. If you specify a specific user password, each user in the same user type will have the same password.</p>

            <h5>Example config.yaml file:</h5>

            <p style="font-size: small; line-height: 1.25; font-family: 'Courier New', Courier, monospace;">
            users:<br />
            {{('&nbsp;' * 4)|safe}}trainers:<br />
            {{('&nbsp;' * 8)|safe}}quantity: 2<br />
            {{('&nbsp;' * 8)|safe}}username_prefix: trainer<br />
            {{('&nbsp;' * 8)|safe}}username_suffix: '01'<br />
            {{('&nbsp;' * 8)|safe}}password: random<br /> 
            {{('&nbsp;' * 4)|safe}}students:<br />
            {{('&nbsp;' * 8)|safe}}quantity: 10<br />
            {{('&nbsp;' * 8)|safe}}username_prefix: student<br />
            {{('&nbsp;' * 8)|safe}}username_suffix: '01'<br />
            {{('&nbsp;' * 8)|safe}}password: random<br />
            devices:<br />
            {{('&nbsp;' * 4)|safe}}servers:<br />
            {{('&nbsp;' * 8)|safe}}quantity: 4<br />
            {{('&nbsp;' * 8)|safe}}name_prefix: PC<br />
            {{('&nbsp;' * 8)|safe}}name_suffix: '09'<br />
            {{('&nbsp;' * 8)|safe}}hostname_prefix: '10.0.10.'<br />
            {{('&nbsp;' * 8)|safe}}hostname_suffix: '109'<br />
            {{('&nbsp;' * 8)|safe}}parameters:<br />
            {{('&nbsp;' * 12)|safe}}protocol: rdp<br />
            {{('&nbsp;' * 12)|safe}}hostname: ~<br />
            {{('&nbsp;' * 12)|safe}}port: 3389<br />
            {{('&nbsp;' * 12)|safe}}username: root<br />
            {{('&nbsp;' * 12)|safe}}password: root<br />
            {{('&nbsp;' * 4)|safe}}routers:<br />
            {{('&nbsp;' * 8)|safe}}quantity: 4<br />
            {{('&nbsp;' * 8)|safe}}name_prefix: R<br />
            {{('&nbsp;' * 8)|safe}}name_suffix: '01'<br />
            {{('&nbsp;' * 8)|safe}}hostname_prefix: '10.0.10.'<br />
            {{('&nbsp;' * 8)|safe}}hostname_suffix: '1'<br />
            {{('&nbsp;' * 8)|safe}}parameters:<br />
            {{('&nbsp;' * 12)|safe}}protocol: ssh<br />
            {{('&nbsp;' * 12)|safe}}hostname: ~<br />
            {{('&nbsp;' * 12)|safe}}port: 22<br />
            {{('&nbsp;' * 12)|safe}}username: root<br />
            {{('&nbsp;' * 12)|safe}}password: root<br />
            </p>
        </div>
    </div>

{% endblock %}

Refresh the browser and see the results. The index page should look like the screenshot below:

The download.html template

The final version of the download.html template is shown below:

{% extends "base.html" %}

{% block title %}Guacamole User Mapper{% endblock %}

{%block content %}

<div class = "row">
    <div class="col">
        <p>Thank you for using UserMapper. Your user-mapping.xml file is ready to download. You may check the file preview below to see if everything is correct. If needed, you may generate a new file from another, or updated, configuration file.</p>
    </div>
</div>

<div class = "row">
    <div class="col">
        <a href="{{ download_url }}" class="btn btn-primary col-md-5 mb-1">Download user-mapping.xml</a>
        <a href="{{ url_for('index') }}" class="btn btn-secondary col-md-6 float-right mb-1">Create a new user-mapping.xml file</a>
        <hr />
    </div>
</div>

<div class="row">
    <div class="col">
        <h5>File preview:</h5>
        <p style="font-size: small; line-height: 1.25; font-family: 'Courier New', Courier, monospace;">
        {% for item in data %}
        {{ item|replace(' ','&nbsp;'|safe )}}<br>
        {% endfor %}
        </p>
    </div>
</div>

{% endblock %}

Refresh the browser and see the results. After you upload a configuration file to the web site, the download page should look like the screenshot below:

Preparing to deploy your Flask application

Currently, you are running your Flask application on your local PC in a development environment. All your environment variables are either hard-coded in the source code, or manually configured in the Linux shell in which your application runs. Your application’s secret key, which must be kept secret, is visible for all to see in GitHub because it is part of the source code in the application.py file.

Before you deploy your application to a public server, you must find a way to protect your application’s configuration information from hackers who may scrape GitHub for application configuration information and secret keys. Of course, you could choose not to post your code in a public GitHub repository in order to protect your secret keys. However, you would then lose the benefits of collaborating with a community of open-source developers. In any case, tracking files that contain secret keys in any Git repository — even a private one — is bad practice.

In addition, the application configuration information may be different depending on where the application is running. For example, the FLASK_ENV variable should be set to development on your local PC and to production on a public server.

This section of the tutorial shows you how to set up a configuration file that sets up environment variables for your development environment. You can then configure Git to ignore the configuration file. Depending on the platform you use to deploy your application to a public-facing web site, you may have a separate configuration file on the remote server.

Environment variables

You need to store your environment variables in a separate file that we can set Git to ignore, so it will never be uploaded to your public GitHub repository. That file is typically named .env and is referred to as a “dot-env” file.

You must especially protect the SECRET_KEY environment variable. Up until now, you’ve been using a dummy secret key. You need to generate a secure secret key. Use the following Python command to generate a secret key you can use.

$ python3 -c 'import secrets; print(secrets.token_urlsafe(32))'

Copy the output to the clipboard so you can paste it into the .env file.

Create a new file named .env in the usermapper-web directory. Define the following environment variables in the file:

FLASK_APP=application
FLASK_ENV=development
SECRET_KEY=b8rD0UJDkrr6MrdP8RQ1GpLPEA_SYsrrIfMuTjfw5AI

Many other environment variables affect both Flask and Bootstrap. You can modify the operation and appearance of your program, to some degree, just by defining additional environment variables in the .env file.
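
For example, you could also move the upload size limit into the .env file. This is just an optional illustration, not something used in the rest of this tutorial; MAX_CONTENT_LENGTH is a standard Flask configuration key, and the 1 MB fallback value below is an assumption.

# In the .env file (value in bytes):
#     MAX_CONTENT_LENGTH=1048576

# In application.py, after load_dotenv() has run:
app.config['MAX_CONTENT_LENGTH'] = int(
    os.environ.get('MAX_CONTENT_LENGTH', 1024 * 1024))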

Add the .env file to .gitignore

To prevent yourself from accidentally uploading the secret key to GitHub, add the .env file to .gitignore:

(env) $ cd ~/Projects/usermapper-web
(env) $ echo '.env' >> .gitignore

NOTE: If you are using the standard Flask .gitignore file from the Flask web site, you already have a line in the file that ignores the .env file.

Other programmers who clone your project’s GitHub repository will be missing the .env file so the program will not work for them until they build their own .env file. They can infer which variables need to be defined in the file by looking at the source code in application.py. Most open-source Python projects have documentation for developers that tells them which environment variables they need to define. That’s another item for my to-do list.

Install python-dotenv

To enable Python programs to read the contents of the .env file, you must install the python-dotenv package in your Python virtual environment.

(env) $ pip install python-dotenv

Modify application.py

Edit the application.py file. Import the load_dotenv module from the dotenv package.

from dotenv import load_dotenv

Delete the secret key and content length configuration lines in application.py:

app.config['SECRET_KEY'] = 'fix this later'
app.config['MAX_CONTENT_LENGTH'] = 1024 * 1024

And replace them with the following configuration, which first finds the .env file and then configures the Flask app using the variables defined in the .env file:

basedir = os.path.abspath(os.path.dirname(__file__))
load_dotenv(os.path.join(basedir, '.env'))
app.config['SECRET_KEY'] = os.environ.get('SECRET_KEY')
app.config['FLASK_APP'] = os.environ.get('FLASK_APP')
app.config['FLASK_ENV'] = os.environ.get('FLASK_ENV')
app.config['MAX_CONTENT_LENGTH'] = 1024 * 1024

In the above code, you build the path to the .env file and load the environment variables from that file. Then you get each environment variable and use it to configure the Flask app.

Save the file. Refresh the browser. Everything should work the same as before, except that now, when you commit the changes and push them to GitHub, you have a secret key that remains on your local PC but does not appear anywhere in the public GitHub repository.

Saving the requirements.txt file

To simplify installing the Flask application on a remote server, create a requirements.txt file for the Flask application.

If you run the pip freeze command, you will see a lot of packages in the file, but you only need a few of them. Also, you need to configure the requirements file so the Usermapper package is installed from my GitHub repository.

Create a file named requirements.txt in the usermapper-web directory. You previously installed flask, Flask-WTF, python-dotenv, and bootstrap-flask so add them to the file. Add the wheel package because it is needed to install the others. Also, install the usermapper package from its Git repository. The Usermapper setup script installs pyyaml so you don’t need to list pyyaml in your requirements.txt file.

Add the following lines to the requirements.txt file.

wheel
flask
Flask-WTF
python-dotenv
bootstrap-flask
git+https://github.com/blinklet/usermapper.git@v0.3#egg=usermapper

Test the requirements.txt file by deactivating the current Python virtual environment and creating a new environment named newenv in the usermapper-web directory:

(env) $ deactivate
$ python3 -m venv newenv
$ source newenv/bin/activate
(newenv) $ pip install -r requirements.txt
(newenv) $ flask run

Refresh the browser. The app should work as expected.

Then delete the test environment and switch back to the original.

(env) $ deactivate
$ rm -rf newenv
$ source env/bin/activate
(env) $

application.py listing

The application.py source code should now look like the listing below:

from flask import Flask, render_template, send_from_directory, url_for, redirect
from flask_wtf import FlaskForm
from flask_wtf.file import FileField, FileRequired, FileAllowed
from wtforms import SubmitField
import os, tempfile
import yaml
from usermapper.usermapper import xmlwriter
from usermapper.mapperdata import get_users
from flask_bootstrap import Bootstrap
from dotenv import load_dotenv

app = Flask(__name__)
bootstrap = Bootstrap(app)

basedir = os.path.abspath(os.path.dirname(__file__))
load_dotenv(os.path.join(basedir, '.env'))
app.config['SECRET_KEY'] = os.environ.get('SECRET_KEY')
app.config['FLASK_APP'] = os.environ.get('FLASK_APP')
app.config['FLASK_ENV'] = os.environ.get('FLASK_ENV')
app.config['MAX_CONTENT_LENGTH'] = 1024 * 1024

class MyForm(FlaskForm):
    filename = FileField('Select configuration file: ', 
        validators=[FileRequired(), FileAllowed(['yaml'], 
        message='Only YAML files accepted')], 
        description="Only YAML files accepted")
    submit = SubmitField('Upload')

@app.route("/", methods=('GET','POST'))
def index():
    form = MyForm()
    filename = None
    if form.validate_on_submit():

        f = form.filename.data
        basedir = os.path.join(
            os.path.relpath(os.path.dirname(__file__)), 
            'downloads')
        tempdir = tempfile.mkdtemp(dir=basedir)

        filename = os.path.join(tempdir,'user-mapping.xml')

        configuration = yaml.safe_load(f.read())
        structure = get_users(configuration)
        xmlwriter(structure,filename)

        temp_folder = os.path.split(tempdir)[1]
        return redirect (url_for('download_page', temp_folder=temp_folder))

    return render_template('index.html', form=form)

@app.route('/download_page/<temp_folder>', methods=('GET','POST'))
def download_page(temp_folder):
    filename = os.path.join(
        os.path.relpath(os.path.dirname(__file__)), 
        'downloads',temp_folder,'user-mapping.xml')

    with open(filename) as preview:
       data = preview.readlines()

    download_url = url_for('download', tempfolder=temp_folder, filename='user-mapping.xml')

    return render_template('download.html', 
        data=data, download_url=download_url)

@app.route("/download/<tempfolder>/<filename>", methods=('GET','POST'))
def download(tempfolder,filename):
    basedir = os.path.join(
        os.path.abspath(os.path.dirname(__file__)), 
        'downloads')
    temp_dir = os.path.join(basedir,tempfolder)
    return send_from_directory(
        temp_dir, filename, as_attachment=True)

Commit changes to Git

Commit these changes to git and push them to GitHub.

(env) $ git add .
(env) $ git commit -m 'added environment variables in env file'
(env) $ git push
(env) $

Deploying a Flask application to Microsoft Azure

Now you are ready to deploy the web application to a remote server. You have two different ways to deploy the applications.

You may purchase a remote server and install the application on it, in which case you will follow the same procedures you used to run the application on your local PC but you will need to install a production-grade WSGI server and do some extra work to ensure your server is secure. Many companies provide virtual private servers that you can configure and use according to your needs. Some companies I am familiar with are Linode, DigitalOcean, Microsoft Azure Virtual Machines, Amazon AWS EC2, and Google Compute.

Alternatively, you may deploy your Python program to a Python web-app platform-as-a-service, in which case you do not need to create and secure a remote server, but you will need to learn the specific features and functions of the web-app service you choose and may not have access to all the functions you normally use on your own server. There are many Python web app services you may use, such as Heroku Cloud Application Platform, Microsoft Azure App Service, Google App Engine, Amazon AWS CodeStar, PythonAnywhere, Platform.sh, DigitalOcean App Platform, and more. To get started, look for one that offers a free tier of service for small applications with low usage.

In this tutorial, I chose to use a web app platform because I did not want to spend more time studying WSGI servers and web server security. The platform-as-a-service I chose will have dozens of engineers working to keep my application’s environment secure. All I need to do is follow the service’s instructions to deploy my app.

This tutorial uses the Microsoft Azure App Service because Azure offers a permanently-free app-service tier.

Azure Portal

If you do not already have an Azure account, create one. The Azure Portal web interface is available at: https://portal.azure.com.

Follow the Azure quickstart documentation about deploying a Python web app. The Azure Web App Quick-Start Guide, which uses the Azure CLI, is the easiest way to deploy your web-app to Azure.

Azure CLI

Install the Azure command-line interface (CLI) on your Linux PC. Run the following command:

(env) $ curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash

Login to your Azure account:

(env) $ az login

A browser window will open and prompt you to login. Follow the instructions in the browser. After you login, you may close the browser window or tab.

Deploy your web app using Git

You already have a usermapper-web Git repository on your PC. Use the Azure CLI to deploy your web app to an Azure Web App by following the steps outlined below.

Deploy the usermapper-web web app with the command:

(env) $ cd ~/Projects/usermapper-web
(env) $ az webapp up --sku F1 --name usermapper 

The F1 web app size is the free tier.

Look at the output generated by the command, listed below. The Azure CLI automatically creates a lot of resources for you.

The webapp 'usermapper' doesn't exist
Creating Resource group 'mail_rg_Linux_centralus' ...
Resource group creation complete
Creating AppServicePlan 'mail_asp_Linux_centralus_0' ...
Creating webapp 'usermapper' ...
Configuring default logging for the app, if not already enabled
Creating zip with contents of dir /home/brian/usermapper-web ...
Getting scm site credentials for zip deployment
Starting zip deployment. This operation can take a while to complete ...
Deployment endpoint responded with status code 202
You can launch the app at http://usermapper.azurewebsites.net
{
  "URL": "http://usermapper.azurewebsites.net",
  "appserviceplan": "mail_asp_Linux_centralus_0",
  "location": "centralus",
  "name": "usermapper",
  "os": "Linux",
  "resourcegroup": "mail_rg_Linux_centralus",
  "runtime_version": "python|3.7",
  "runtime_version_detected": "-",
  "sku": "FREE",
  "src_path": "//home//brian//usermapper-web"
}

See the web app information in the command’s output. Go to the URL listed in the deployment response: http://usermapper.azurewebsites.net. You should see a server error. How do you debug this?

Check web app logs

To investigate the error, look at web app logs in the Azure portal. Or, run the following Azure CLI command:

(env) $ az webapp log tail --name usermapper

If you see a lot of logs, and no obvious errors, you may need to search for the “error” keyword:

(env) $ az webapp log tail --name usermapper | grep -i error
2020-12-11T21:26:52.422108240Z     raise RuntimeError(message)
2020-12-11T21:26:52.422113040Z RuntimeError: A secret key is required to use CSRF.

It looks like you do not have a secret key configured. This is because you did not configure the environment variables for the web app. The remote web app’s environment is a production environment, so the FLASK_ENV variable must be set to production. The SECRET_KEY variable also needs to be configured in the remote web app’s environment and it can either be the same as, or different from, the secret key you configured in your local .env file.

Quit the command with CTRL-C.

Configure web app environment variables

The Azure Portal offers an intuitive user interface for changing the Azure web application configuration settings, but it is easier to show the command-line interface in a blog post like this, so use the Azure CLI to configure the web app. In your Linux PC’s terminal window, enter the Azure CLI command shown below; note that your resource group name and web app name will be different:

(env) $ az webapp config appsettings set \
        --name usermapper \
        --resource-group mail_rg_Linux_centralus \
        --settings FLASK_ENV="production" \
        FLASK_APP="application" \
        SECRET_KEY="b8rD0UJDkrr6MrdP8RQ1GpLPEA_SYsrrIfMuTjfw5AI"

You configured the FLASK_APP, FLASK_ENV, and SECRET_KEY environment variables. Now go to the web app URL: http://usermapper.azurewebsites.net.

The application looks like it works. Upload a config file. Then download the user-mapping.xml file. It seems to work OK.

The web app’s filesystem

Remember that the usermapper web app saves the generated files in temporary directories so users can download them. Have a look at the web app’s filesystem and see those files on the remote web app service.

Azure offers an SSH console connection to the container running the web app. Log into the Azure web app container by doing the following:

  • Go to “App Services” in the Azure portal
  • Click on the “usermapper” web app
  • Click on “SSH”

Finally, click on the “go” link in the SSH panel. A new browser tab will open running an SSH session connected to the web app’s container.

In the browser’s SSH tab, run the following commands:

# cd downloads
# ls 
tmphqspiosn  tmpnwjs1vmj

See one or more temporary directories have already been created. Each one should contain a user-mapping.xml file.

Problems cleaning up files on a web app

Unfortunately, you cannot delete these temporary files on a scheduled basis using the same method you used when you were developing the web app.

On your local PC, you used a cron job to delete temporary files every 20 minutes. I tried installing cron in the web app container and editing the crontab file, the same way I did when I was testing on my local PC. Installation and configuration worked OK. However, the web app container pauses itself when it is not being actively used so, given that it is very rarely used right now, it is almost always paused. Unless the container is actively running when its system clock ticks past a 20-minute mark on its clock, the cron service will not delete any temporary files.

This is a case where using a database would solve the problem because a managed database service can be configured to delete old data.

For now, because you want to use the free service provided by Microsoft Azure, you need to occasionally log into the web app’s SSH console and manually delete old temporary files so your web app’s disk space does not fill up.
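
If you prefer not to type a long find command in the SSH console every time, a small throwaway Python snippet like the one below can do the same manual cleanup. This is just a sketch that assumes the downloads directory layout used in this tutorial; run it from the web app's root directory.

import os, shutil, time

downloads = 'downloads'
cutoff = time.time() - 20 * 60   # anything older than 20 minutes

for name in os.listdir(downloads):
    path = os.path.join(downloads, name)
    # Delete only the temporary directories created by tempfile.mkdtemp()
    if (name.startswith('tmp') and os.path.isdir(path)
            and os.path.getmtime(path) < cutoff):
        shutil.rmtree(path)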

Azure offers a platform service called WebJobs that runs a script on a scheduled basis, which could clean up files for you. Regrettably, the WebJobs service is not available for Python apps. Maybe it will be available for Python apps in the future.

Custom domain name

Currently, your web app is a subdomain in the azurewebsites.net domain.

If you want to map a custom domain name to your new web app, you must upgrade to a paid Azure Web Services tier. I chose not to upgrade to a paid service tier at this time.

Paid options for web app deployment

If you already paid for a custom domain name registration, then you are probably willing to spend some money on hosting your web app. If that is the case, you could upgrade to a Basic Azure App Service tier, which costs at least $14 per month. Then, you could refactor your application to use a database and connect one of Microsoft Azure’s managed database services to your web app. For a small application like this, the database cost would be very low. The Azure App Service already includes value-added services like load balancers and content delivery networks (CDNs).

If you want to keep costs low, purchase a cheap virtual private server (VPS) from any cloud infrastructure provider, including Azure, for around $5 per month. You take on more system administration responsibilities when you deploy a web app on a remote VPS. However, you gain more control over the system so you could use cron or any other method you prefer to clean up old files. You can also configure a custom domain to point to a VPS for free, after paying for the domain, and use SSL encryption for free.

Conclusion

This tutorial showed you how to convert an existing Python command-line program into a web app so users can more easily access it. You learned how to use Flask to upload and download files, how to get user input using HTML forms, and how to use Bootstrap to make your application look professional while learning just the minimum you need to know about HTML and CSS. You also learned how to deploy a web app to a Python platform-as-a-service that costs nothing.

While working on this tutorial, I found a web app that helps developers create cron expressions, based on information they enter in the user interface. This is both a great tool and a good example of how the tools you develop may be made available to others.


  1. See many useful .gitignore files at: https://github.com/github/gitignore 

  2. In the future, I should re-write the usermapper package so it builds the user-mapping.xml file contents as a list in memory and returns that list to application.py. Then, the program can print the preview on the web page and save it to disk in application.py at the same time. 

azruntime: Manage Azure Infrastructure with Python


I wrote a new Python script called azruntime. It helps me manage my Azure VMs. The script is open-source and should work for anyone who also uses the Azure CLI. azruntime is available on my azure-scripts GitHub repository.

Table of Azure VM information

I learned a lot about the Azure Python SDK while working on the azruntime project. In this post, I share what I learned and highlight the more interesting topics like how to find information faster in the Azure Python SDK documentation, Azure authorization, and sorting nested lists by key.

Learning the Azure Python SDK and API

Microsoft offers excellent documentation of all its Azure services, including detailed documentation for the Azure Python SDK. The problem may be that there is so much documentation it is hard to know where to start.

In my opinion, the best place to start is to look at the Azure sample scripts available at the following URL:

Search by keyword or category. When you find a script that appears to display some of the functionality you want to implement, use a search engine to search for the Azure Python SDK classes and functions you see used in the sample scripts.

This is a faster way to find the information you need about the Azure Python SDK.

Azure authentication for Python scripts

Azure’s documentation assumes you are writing apps that run on servers and need to authenticate using their own managed identity and permissions. The examples you find when you are searching the documentation will usually describe complex scenarios. If you are writing scripts that just augment what you can do with Azure CLI, you can use a simpler authentication method: CLI-based authentication.

Use CLI-based authentication

Azure CLI-based authentication is easier to use than creating a user-assigned managed identity and then giving it permissions to read information about each individual resource I manage. After I run the az login command to log in to Azure CLI, any scripts I run that use CLI-based authentication will operate with my roles and permissions.

CLI-based authentication is suitable for use in simple fact-finding scripts that help Azure users manage resources in their subscriptions. Use CLI-based authentication when you write scripts that automate operations you might normally perform with Azure CLI. My azruntime script is one such script.

Risks

Microsoft recommends that CLI-based authentication be used only for development. Using CLI-based authentication can be dangerous if you have write access to Azure resources because the script runs with the same authorizations as your user account. Different users may have different roles and permissions in Azure so a script that uses CLI-based authentication might work differently, or not at all, for other users.

To mitigate these risks, I use CLI-based authentication in my production scripts where the scripts meet all of the following criteria:

  • I wrote the script myself or, if I am using a script someone else wrote, I have read the source code and understood it.
  • Users need to manage only the infrastructure or other resources to which they personally have access.
  • The script performs read actions, only.

Azure Identity Classes

Use either the DefaultAzureCredential or AzureCliCredential class from the Azure Identity client library to implement CLI-based authentication in a Python script.

I do not use the DefaultAzureCredential class because it raises a lot of errors as it searches for Azure authentication credentials on the system upon which it is installed. It works, but its output is messy. It also requires that you install additional dependencies, like the PyGObject library, on your system.

I think it is clearer to just use the AzureCliCredential class. It is simpler to implement and it does not raise any error messages, as long as the user is logged into Azure CLI.

Implement CLI-based authentication

In the examples below, I show how authentication and authorization work for Azure Python applications.

First, log in to Azure CLI.

$ az login

As always, set up a virtual environment for development. This protects you from package conflicts that may occur with the Azure CLI packages installed on your system.

$ mkdir azruntime
$ cd azruntime
$ python3 -m venv env
$ source env/bin/activate
(env) $ pip install wheel
(env) $

Next, install the Azure Identity library:

(env) $ pip install azure-identity

To test your script’s authorization code, you need to perform actions on Azure resources or services. According to the Azure Python SDK documentation, the azure.mgmt.resource library module contains the classes that manage subscriptions and resource groups: SubscriptionClient and ResourceManagementClient. See the Azure Python Management sample code for ideas about how to search for and manage resources.

(env) $ pip install azure-mgmt-resource

Write a simple script that gets your Azure subscription information.

For example, I have two subscriptions and I want to write a script that prints the subscription information. Use the Python interactive prompt and the following code:

(env) $ python
>>> from azure.identity import AzureCliCredential
>>> from azure.mgmt.resource import SubscriptionClient
>>> cred = AzureCliCredential()

At this point, cred is an instance of the AzureCliCredential() class that contains the authentication token also used by Azure CLI. To work with subscription information, we pass the credential to the SubscriptionClient class.

>>> sub_client = SubscriptionClient(cred)

sub_client is an instance of the SubscriptionClient class and it represents a connection to the subscriptions that you have permission to use in Azure.

NOTE: If you forgot to log in to Azure CLI, you still would have gotten this far without any errors because the Azure Python SDK does not try to authorize an action until you actually use the resource client.

Try printing the list of subscriptions:

>>> print(sub_client.subscriptions.list())
<iterator object azure.core.paging.ItemPaged at 0x7fe3298a1ee0>

The sub_client object queries Azure for your subscription information. At this point, the request is authorized or rejected based on the permissions assigned to the user’s Azure CLI user ID.

We see that the sub_client object returns an iterable. This type of object cannot be indexed, so you cannot get just one item by index. You need to iterate through it to see each subscription, or you may unpack it as arguments in a function.

Pull data out of the iterable using a list comprehension, as shown below.

>>> [[sub.display_name, sub.subscription_id] for sub in sub_client.subscriptions.list()]
[['BL-Dev','fd5a54e1-e6d6-94a1-9e02-112ec20d499e'],['BL-Prod','97dd7d07-ec4e-ed45-454a-1e629f6d5691']]

Or, create a generator if you want to iterate through the set of subscriptions in another way. (This is an excuse to write my first Python generator expression!)

>>> subs = ([sub.display_name, sub.subscription_id] for sub in sub_client.subscriptions.list())
>>> type(subs)
<class 'generator'>
>>> next(subs)
['BL-Dev', 'fd5a54e1-e6d6-94a1-9e02-112ec20d499e']
>>> next(subs)
['BL-Prod', '97dd7d07-ec4e-ed45-454a-1e629f6d5691']
>>> next(subs)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Or, dump all subscription information by unpacking the iterable in a print function.

>>> print(*sub_client.subscriptions.list())
{'additional_properties': {}, 'id': '/subscriptions/fd5a54e1-e6d6-94a1-9e02-112ec20d499e', 'subscription_id': 'fd5a54e1-e6d6-94a1-9e02-112ec20d499e', 'display_name': 'BL-Dev', 'tenant_id': '9d991563-8576-3e6a-09bc-90f49f943111', 'state': 'Enabled', 'subscription_policies': <azure.mgmt.resource.subscriptions.v2019_11_01.models._models_py3.SubscriptionPolicies object at 0x7fe3297ff040>, 'authorization_source': 'RoleBased', 'managed_by_tenants': [<azure.mgmt.resource.subscriptions.v2019_11_01.models._models_py3.ManagedByTenant object at 0x7fe3297ff130>], 'tags': None} {'additional_properties': {}, 'id': '/subscriptions/97dd7d07-ec4e-ed45-454a-1e629f6d5691', 'subscription_id': '97dd7d07-ec4e-ed45-454a-1e629f6d5691', 'display_name': 'BL-Prod', 'tenant_id': '9d991563-8576-3e6a-09bc-90f49f943111', 'state': 'Enabled', 'subscription_policies': <azure.mgmt.resource.subscriptions.v2019_11_01.models._models_py3.SubscriptionPolicies object at 0x7fe3297ff0d0>, 'authorization_source': 'RoleBased', 'managed_by_tenants': [], 'tags': None}

Quit the interactive Python prompt:

>>> quit()
(env) $

Next, use the ResourceManagementClient class to list all resource groups in a subscription.

Things are getting more complex so open a text editor and create a Python program that lists all resource groups in your subscriptions. It should look similar to the one shown below:

from azure.identity import AzureCliCredential
from azure.mgmt.resource import SubscriptionClient
from azure.mgmt.resource import ResourceManagementClient

cred = AzureCliCredential()
sub_client = SubscriptionClient(cred)
for sub in sub_client.subscriptions.list():
    sub_id = sub.subscription_id
    sub_name = sub.display_name
    resource_client = ResourceManagementClient(cred, sub_id)
    for group in resource_client.resource_groups.list():
        print(sub_name, group.name)

Save the file as test1.py and run it. You should see output similar to below:

(env) $ python test1.py
BL-Dev vpn2021
BL-Dev routerlab
BL-Dev labtest
BL-Dev optical
BL-Dev vpnsec
BL-Prod lab02
BL-Prod lab01
BL-Prod app-frontend
BL-Prod app-backend
BL-Prod applab  

You can imagine how you could keep expanding this script to list all VMs in each resource group in each subscription, get the activity logs for each VM, and so on. That’s how I built my azruntime script.

To manage other resources in Azure, you can use other libraries. To experiment with this, install the following libraries:

(env) $ pip install azure-mgmt-resource
(env) $ pip install azure-mgmt-compute
(env) $ pip install azure-mgmt-monitor
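
For example, you could extend test1.py with a sketch like the one below. This is my own illustration of the idea, not the actual azruntime code, and it assumes the ComputeManagementClient class from the azure-mgmt-compute package. It prints every VM in every resource group in every subscription:

from azure.identity import AzureCliCredential
from azure.mgmt.resource import SubscriptionClient, ResourceManagementClient
from azure.mgmt.compute import ComputeManagementClient

cred = AzureCliCredential()
sub_client = SubscriptionClient(cred)

for sub in sub_client.subscriptions.list():
    resource_client = ResourceManagementClient(cred, sub.subscription_id)
    compute_client = ComputeManagementClient(cred, sub.subscription_id)
    for group in resource_client.resource_groups.list():
        # List the VMs in each resource group in the subscription
        for vm in compute_client.virtual_machines.list(group.name):
            print(sub.display_name, group.name, vm.name)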

Sorting lists of lists by key

My azruntime script builds a table containing Azure VM information. Each row contains information about each VM. Each column is a specific piece of data like VM name, subscription name, location, or running time.

In memory, I represent this table as a list of nested lists. Each nested list is a row in the table.
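
For example, a small version of the table, with hypothetical data, might look like the following sketch:

vm_table = [
    ['VMname', 'Subscription', 'Location', 'VMstatus'],  # header row
    ['web01', 'BL-Dev', 'eastus', 'VM running'],
    ['db01', 'BL-Prod', 'westus', 'VM deallocated'],
]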

Sort by key using the itemgetter function

The way most people sort a list of lists is to use the key keyword argument in either the sorted function or in the list object’s sort method. While I was figuring out how to implement this, I learned about the operator.itemgetter function, which is easier than using lambda functions in the sort() function’s key argument.

I used the operator.itemgetter function to pick an item by index from each nested list and use it as the sort key.

First, import the operator.itemgetter function:

from operator import itemgetter

Then use the sorted function to return a new list, sorted by the items indexed in each nested list.

def sort_by_column(input_list, column_index):
    return(sorted(input_list, key=itemgetter(column_index)))

If I want to sort a table by the third column, I use column_index = 2 when I call the function. For example:

sort_by_column(vm_table, 2)

The function needs to be more flexible. The table has a header row that contains column names and I want to sort by column name instead of index number. So, assuming the first nested list is a list of column names and that the column_name parameter will be a string with a value like “VMname”, I update the function as follows:

def sort_by_column(input_list, column_name):
    header = input_list[0]
    rows = input_list[1:]
    column_index = header.index(column_name)
    rows.sort(key=itemgetter(column_index))
    rows.insert(0, header)
    return(rows)

To sort a list of lists named “vm_table” by the column name “Location”, I call the function with the following statement:

sort_by_column(vm_table, 'Location')

Argument packing and unpacking

I also learned about argument packing and unpacking, which enables you to write functions that accept a variable number of arguments and also lets you unpack iterables as arguments when you call a function. This is a Python feature that you may not appreciate when you first read about it (I forgot about it after reading the Learning Python book1) but, when you need it, it is very useful.
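
As a quick illustration of the general idea, separate from the azruntime code: packing collects positional arguments into a tuple, and unpacking spreads an iterable back out into separate arguments.

def show(*args):      # packing: all positional arguments land in a tuple named args
    print(args)

show(1, 2, 3)         # prints (1, 2, 3)

columns = ['a', 'b', 'c']
show(*columns)        # unpacking: prints ('a', 'b', 'c')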

I want to sort the table by more than one column name. For example, I want the table organized by “VMstatus”, then by “Subscription”, then by “VMsize”. From the Python documentation, I know the operator.itemgetter function will return a tuple of items from a nested list if you give it more than one integer as a parameter. For example:

rows.sort(key=itemgetter(2, 4, 0))

The expression above will sort a list of lists by the third column, then by the fifth column, then by the first column. It’s an easy way to sort by multiple columns. But, how do I pass multiple column names to the function? The solution is to use Python’s argument packing and unpacking feature, using the asterisk operator.

The new function looks like the following:

def sort_by_column(input_list, *args):
    header = input_list[0]
    rows = input_list[1:]
    column_indexes = [header.index(column) for column in args]
    rows.sort(key=itemgetter(*column_indexes))
    rows.insert(0, header)
    return(rows)

Using the asterisk operator before the args parameter in the function header means that any number of positional arguments may be entered and they are all collected in a tuple named args. Inside the function, we build a list of integers representing column indexes by iterating through args. Then, we unpack that list into the operator.itemgetter function’s arguments using the asterisk operator again.

I can call the function using one or more column names as parameters. It will sort the nested lists by each column name, in order. For example:

sort_by_column(vm_table,'Subscription','Location','Vmsize')

This is a simple way to sort a table by multiple column names using Python.

Conclusion

The rest of the source code for the azruntime script is available on my azure-scripts GitHub repository.

I used more Azure Python SDK classes to get activity logs for each VM in my subscriptions, did some math using the datetime module to find the most recent “VM start” log entry, and calculated the uptime of each VM. Then I created a table and pretty-printed it using the Tabulate package.
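
The uptime calculation itself is essentially a datetime subtraction. A minimal sketch, assuming start_time is a hypothetical timezone-aware timestamp taken from the most recent “VM start” log entry, looks like this:

from datetime import datetime, timezone

# Hypothetical timestamp from the most recent "VM start" activity log entry
start_time = datetime(2021, 3, 1, 14, 30, tzinfo=timezone.utc)

uptime = datetime.now(timezone.utc) - start_time
print(f"Running for {uptime.days} days and {uptime.seconds // 3600} hours")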


  1. Learning Python 5th Edition by Mark Lutz, Chapter 18, pages 549-550 and 555-556 

Install azruntime as a CLI program using pipx

azruntime, the Python program I wrote to manage virtual machines in my Azure subscriptions, is more convenient to use when run as a command from the Linux prompt instead of as a Python program in its virtual environment. You can install Python packages as command-line-programs using pipx.

To make azruntime work after using pipx to install it, I had to organize the project into a proper Python package folder structure, add an entry point in the setup.py file, and change the authentication class used by azruntime.

This post describes what I learned about pipx and Python packaging to enable me to install azruntime as a CLI application.

Changing the package directory structure

I originally structured the azruntime package so all its files were in one folder. I know this is not the standard way that packages are organized but I thought it was simpler and it worked with pip. However, pipx requires the correct package folder structure.

Below, I show the new folder structure I created.

azruntime/
├── LICENSE
├── README.md
├── azruntime
│   ├── __init__.py
│   ├── __main__.py
│   └── azruntime.py
├── requirements.txt
└── setup.py

At the top level, I have a project folder named “azruntime”. This can have any name and I could have called it “azruntime-project” to make it clearer. The top level project folder name is not relevant to packaging.

In the project folder, I have the Python package folder, named “azruntime” and the setup.py file. I also have other project files like the LICENSE and README files, and the requirements file.

The __main__.py file runs when you run the package using the python -m azruntime command. It’s not needed for users who will install the package as a command-line tool using pipx but it’s helpful to have during development. The function import statement in __main__.py contains the same expression you will use when adding an entry point in the setup.py file.

The contents of __main__.py are listed below:

from azruntime.azruntime import main
main()

As you can see, it just imports and runs the main() function from the azruntime.py module in the azruntime package folder.

Entry Point in setup.py file

Pipx sets up a CLI command that runs a function in a Python module. The command passes arguments to the function using the normal Python argument passing methods. You need to tell pipx which function to use by defining an entry point in the setup.py file. In this case, I want pipx to set up a command that runs the same main() function that the __main__.py uses.

NOTE: The __main__.py file and the entry point in the setup.py file do not need to point to the same functions. For example, you may use __main__.py for testing purposes and have it run a different function.

I added the following line to my setup.py file:

entry_points = {'console_scripts': ['azruntime=azruntime.azruntime:main'],},

I have one console script listed as an entry point. You could create multiple command-line programs using different functions from the same package just by adding them as additional console scripts in the entry_points line in the setup.py file.
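
For example, if the package also contained a hypothetical report() function in a module named azreport.py, the entry_points value could define two commands:

entry_points = {
    'console_scripts': [
        'azruntime=azruntime.azruntime:main',
        'azreport=azruntime.azreport:report',  # hypothetical second command
    ],
},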

I list the new setup.py file below:

from setuptools import setup

setup(
    name='AzRuntime',
    url='https://github.com/blinklet/azure-scripts/azruntime',
    packages=['azruntime'],
    install_requires=[
        'wheel',
        'azure-identity',
        'azure-mgmt-resource',
        'azure-mgmt-compute',
        'azure-mgmt-monitor',
        'azure-cli-core',
        'tabulate'
    ],
    version='0.4',
    license='GPLv3',
    description='Print a list of all running VMs in your subscriptions.',
    long_description=open('README.md').read(),
    entry_points = {
        'console_scripts': ['azruntime=azruntime.azruntime:main'],
    },
)

Azure CLI authentication and pipx

After installing azruntime with pipx, I got an error when I ran the azruntime command. It seems that the AzureCliCredential class cannot see the user’s existing Azure CLI credentials. When I install the azruntime package in its own virtual environment using pip, everything works. But when I try to install the package on my system using pipx, it does not work.

I decided to use the DefaultAzureCredential class, with arguments that stop it from running the other authentication methods. I then enable the method that allows the user to start an interactive Azure login.

In the azruntime.py module, I replaced the line:

credentials = AzureCliCredential()

with the following statement:

credentials = DefaultAzureCredential(
    exclude_environment_credential = True,
    exclude_managed_identity_credential = True,
    exclude_shared_token_cache_credential = True,
    exclude_visual_studio_code_credential = True,
    exclude_interactive_browser_credential = False
)

The new version of the azruntime script will check for existing Azure CLI credentials (if it is installed as a Python package using pip) and, if that is not working, it will start a web browser and allow the user to log in interactively. So, if you install it using pipx, you will have to authenticate using a web browser every time you run the azruntime command.

Help requested: I cannot find the reason why the AzureCliCredential class does not work if the package is installed using pipx. If you know, please post something in the comments below.

Using pipx

Now I can create a system-level command that runs the azruntime program in its own virtual environment, but I do not need to activate the virtual environment myself. Pipx makes it easier to distribute and use Python programs.

Pipx relies on pip and venv so you may need to install them:

sudo apt install python3-venv
sudo apt install python3-pip

Do not install pipx using your Linux system’s package manager like dnf or apt. You’ll get an old version that does not work. Instead, install pipx from PyPI as follows:

python3 -m pip install pipx
python3 -m pipx ensurepath

Then, install azruntime with the following command:

pipx install "git+https://github.com/blinklet/azure-scripts.git#egg=azruntime&subdirectory=azruntime"

Now, you can run the azruntime Python program from your Linux command line just by typing the command:

$ azruntime

Conclusion

I changed the way users can install azruntime so they can install it as a command-line utility on their Linux systems. The same procedure should work for Windows and Mac systems — with some differences in the way pip, venv and pipx are installed.


Using the Python Rich library to display status indicators

I recently added a status indicator to my azruntime application. If users have a lot of VMs in their subscriptions, the azruntime application can take a long time to run. Users will appreciate seeing the status so they know the program is still running and is not hung up.

I used the Rich library to implement a status indicator. I had to learn more about Python context managers to understand how the Rich library’s progress bar and status indicators work. The Rich library’s documentation is aimed at intermediate-to-advanced programmers and the Rich tutorials I found on the web did not cover using the Rich library’s status update features.

In this post, I will share what I learned while adding a status indicator to my program and show you how to implement the same in your projects.

Rich library overview

The Rich library makes it easy to add color and style to terminal output. Rich can also render pretty tables, progress bars, markdown, syntax highlighted source code, tracebacks, and more.1

This post focuses only on creating a status indicator. To learn more about what Rich can do for you, I encourage you to read one of the excellent Rich overviews available on the Internet. I list a few below:

Another way to see how Rich can improve text output in your Python console applications is to run the sample code provided by the Rich package. Run the following to see most of what Rich can do:

(env) $ pip install rich
(env) $ python -m rich

This will output a sample screen, as shown below:

Learning from Rich sample code

The Rich project provides sample code for all its features. I found that the sample code was the best way to understand how to use each feature.

First, I run the sample code to see what the output looks like. Then I open the file and look at the code. The Rich library modules are listed in the Rich library documentation’s Reference section.

Run the sample code by running the module. For example, I was looking for a way to create a status update. I saw the module named rich.status and decided to try it. I ran the command:

(env) $ python -m rich.status

I saw that the output looks like the kind of status updates I wanted. See the output below:

The console displayed a spinner icon next to some text that changes as the program runs. Next, I clicked on the rich.status module on the Rich documentation’s References page. I saw the module documentation. To see the examples in the source code, I clicked on the source code link, as shown below:

In the module’s source code, I scrolled to the bottom to find the test code in the if __name__ == “__main__”: block. As shown below, I can compare the code with the results I saw when I previously ran the module.

After looking at the rich.status module’s output, its reference page, and its source code, I now see how I can implement a “spinner”-style status indicator for my azruntime application.

I need to first create a console object from the Rich Console class. Then, I create a context manager using the console object’s status method and set an initial status message in it. Finally, each time I want to change the status message in the running context, I call the status object’s update() method.

Python context managers

I am using Rich to add functionality to an existing program and, until I started using the Rich library, I had never used context managers or the with statement. I re-read the Exceptions chapter in the Learning Python book2 and looked at some online tutorials. Now, I can explain more about Python context managers.

Context managers are created by the with statement. They are an advanced Python topic but we use them all the time. Most beginner Python programmers have seen the with statement in examples and in tutorials. It is commonly used when working with files.

A typical example is shown below:

with open('example.txt','r') as reader:
    print(reader.read())

In the above example, the with statement calls the open() function and assigns the returned object, which is a file object, to a variable named reader. The next line prints everything returned by the file object’s read() method. The context manager code built into the file object closes the example.txt file as soon as the last statement in the code block, which in this case is the print() function, completes.

If you do not use the with statement, as shown below, Python will keep the example.txt file open until the programmer tells it to do otherwise.

reader = open('example.txt','r')
print(reader.read())

In the above example, the file object returned by the open() function is assigned to a variable named reader. The next line prints everything returned by the file object’s read() method. In this case, the programmer must remember to explicitly close the file using the file object’s close() method, as shown below.

reader.close()

If the programmer does not close the file, it remains open until either all remaining code in the script finishes running or the reader variable is assigned to another object. Python’s garbage collection feature will free up the memory used by the file object and close the file.

The programmer needs to consider what might happen if an error occurs before they close a file. They may need to check for errors and close the file using try/finally statements. The with statement also ensures resources are closed when errors occur and results in easier-to-read code.
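
For example, a try/finally version of the file-reading code above ensures the file gets closed even if an error occurs while reading:

reader = open('example.txt','r')
try:
    print(reader.read())
finally:
    # runs whether or not read() or print() raised an exception
    reader.close()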

Using the with statement is the “Pythonic” way to open files or other shared resources like network connections.

Network engineers who are learning Python will usually use the with statement when calling a function that opens a file, a network connection, or a database connection. More advanced programmers may use the context management protocol to create new functions or classes that perform specific actions when a context is closed.
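
For example, a minimal custom context manager, unrelated to Rich, might implement the __enter__ and __exit__ methods to time the code in its with block:

from time import perf_counter

class Timer:
    def __enter__(self):
        self.start = perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # runs when the with block ends, even if an exception occurred
        print(f"Elapsed: {perf_counter() - self.start:.2f} seconds")
        return False  # let any exception propagate

with Timer():
    sum(range(1_000_000))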

How Rich uses context managers

Some features of the Rich library, such as the rich.status module, must be implemented using the with statement, which creates a context in which the output on the console screen is created and updated.

Create the following sample program to demonstrate how the rich.status module works. To test how rich.status will work in a loop, create and run a Python file containing the following code, or run it in the Python REPL.

from rich.console import Console
from time import sleep

status_list = ["First status","Second status","Third status"]
console = Console()
with console.status("Initial status") as status:
    sleep(3)  # or do some work
    for stat in status_list:
        status.update(stat)
        sleep(3)  # or do some more work

The program creates an object named console from the Console() class. Then, it creates a context object named status from the console object’s status method and initializes it with a status of “Initial status”. Then, it iterates through the status_list and updates the status object every three seconds. When the program runs, you will see a “spinner” icon that gives the user feedback that the program is still running.

The above program follows the example shown in the rich.status module’s example code. But, why not use the rich.status.Status() class, instead of the rich.console.Console() class? In fact, you are already using it. The Console() class’s status method imports and uses the rich.status.Status() class. The Rich developers showed an example using the Console() class because developers may have multiple things happening in the same console or may use multiple consoles. Using a console object makes it clear where the status object’s output is to be displayed.

You may use the rich.status.Status() class directly, if you want. And, if you are concerned with which console to use, you may specify which console object the rich.status.Status() class uses when you create an object with it. It will use the default console if you do not specify one.

Below is an example that accomplishes the same status update display as the previous program, but I use the rich.status.Status() class directly:

from rich.status import Status
from time import sleep

status_list = ["First status","Second status","Third status"]
with Status("Initial status") as status:
    sleep(3)  # or do some work
    for stat in status_list:
        status.update(stat)
        sleep(3)  # or do some more work

The program creates a context object named status from the rich.status.Status class, initialized with a status of “Initial status”. Then it iterates through the list of statuses and displays each one on the screen for three seconds.

Either of the two methods shown above will work. You can also implement status updates using the rich.live.Live() or rich.progress.SpinnerColumn() classes.

Adding Rich to azruntime

Showing how I added a status indicator to an existing program will help you better understand how to implement Rich library features.

I will use the rich.console.Console() class because it is the way the Rich developers implemented status updates in the rich.status module. Because the rich.console.Console() class must be implemented using a context manager, I need to implement it inside a function and I cannot pass a context manager to other functions.

My azruntime module defines a function named build_vm_list, which uses nested for loops to iterate through generators that yield information about each virtual machine, per resource group, per subscription. The build_vm_list function returns a nested list containing the run-time information about all VMs in my subscriptions. That list is rendered as a table on the console by another function.

I opened my azruntime program in a text editor and made the following changes.

First, I import the Rich modules I will use:

from rich.console import Console

In the build_vm_list() function, I created a new Rich console and used the with statement to create a console.status context manager named status before the first for loop:

console = Console()
with console.status("[green4]Getting subscriptions[/green4]") as status:

I indented all the nested loop code below the with statement so Python knows it is part of the status context.

Finally, in the deepest nested loop, I update the status context with a status message containing the subscription name, resource group name, and VM name available at that point in time.

status.update(
    "[grey74]Subscription: [green4]" +
    subscription_name +
    "[/green4]  Resource Group: [green4]" +
    resource_group +
    "[/green4]  VM: [green4]" +
    vm_name +
    "[/green4][/grey74]"
)

The new function looks like the following. I removed some code to make the example shorter. The entire build_vm_list() function code is available in the azruntime repository on GitHub.

def build_vm_list(credentials):

    headers = [
        'VM name', 'Subscription', 'ResourceGroup',
        'Size', 'Location', 'Status',
        'TimeInState', 'style'
    ]

    returned_list = list()
    returned_list.append(headers)

    subscription_client = SubClient(credentials)
    subscriptions = sublist(subscription_client)

    console = Console()
    with console.status("[green]Getting subscriptions[/green]") as status:

        for subscription_id, subscription_name in subscriptions:

            resource_client = ResourceClient(credentials, subscription_id)
            compute_client = ComputeClient(credentials, subscription_id)
            monitor_client = MonitorClient(credentials, subscription_id)
            resource_groups = grouplist(resource_client)

            for resource_group in resource_groups:
                vms = vmlist(compute_client, resource_group)

                for vm_name, vm_id in vms:

                    status.update(
                        status="[grey74]Subscription: [/grey74][green4]" +
                        subscription_name +
                        "[/green4][grey74]  Resource Group: [/grey74][green4]" +
                        resource_group +
                        "[/green4][grey74]  VM: [/grey74][green4]" +
                        vm_name + "[/green4]"
                    )

                    """...other code that builds list of VM information..."""

        return returned_list

Conclusion

I used the Python Rich library to implement a status display for my azruntime CLI application, using just a few lines of code.

The Rich library contains many more classes and functions in addition to the rich.status module. For example, I also used the rich.table module to render the VM table output by the azruntime program.

If you add color to your output, the color tags Rich uses are listed in the Rich documentation Appendix.


  1. From the Rich GitHub README page, accessed March 2, 2021 

  2. Learning Python, 5th edition, Chapter 33, pp 1152-1156 

Use Containerlab to emulate open-source routers

Containerlab is a new open-source network emulator that quickly builds network test environments in a devops-style workflow. It provides a command-line-interface for orchestrating and managing container-based networking labs and supports containerized router images available from the major networking vendors.

More interestingly, Containerlab supports any open-source network operating system that is published as a container image, such as the Free Range Routing (FRR) router. This post will review how Containerlab works with the FRR open-source router.

While working through this example, you will learn about most of Containerlab’s container-based features. Containerlab also supports VM-based network devices so users may run commercial router disk images in network emulation scenarios. I’ll write about building and running VM-based labs in a future post.

While it was initially developed by Nokia engineers, Containerlab is intended to be a vendor-neutral network emulator and, since its first release, the project has accepted contributions from other individuals and companies.

The Containerlab project provides excellent documentation so I don’t need to write a tutorial. But, Containerlab does not yet document all the steps required to build an open-source router lab that starts in a pre-defined state. This post will cover that scenario so I hope it adds something of value.

Install Containerlab

You may install Containerlab using your distribution’s package manager or you may download and run an install script. Users may also manually install Containerlab because it’s a Go application, so users just need to copy the application binary to a directory in their system’s path and copy some configuration files to /etc/containerlab.

Prerequisites:

Containerlab runs best on Linux. It works on both Debian and RHEL-based distributions, and can even run in Windows Subsystem for Linux (WSL2). Its main dependency is Docker, so you must first install Docker. I am running an Ubuntu 20.04 system.

sudo apt install apt-transport-https ca-certificates
sudo apt install -y curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"
sudo apt update
apt-cache policy docker-ce
sudo apt install -y docker-ce

Install Containerlab

To install Containerlab from its repository, run the Containerlab install script:

$ bash -c "$(curl -sL https://get-clab.srlinux.dev)"

See the Containerlab installation documentation for other ways to install Containerlab, including manual installation for distributions that do not use Debian or RHEL-based packaging tools.

Containerlab files

The Containerlab installation script copies the Containerlab executable file to /usr/bin and copies lab-example and template files to /etc/containerlab. The latter directory is the most interesting because it contains the lab examples that users can use as models for lab development.

Build a lab using FRR

Containerlab supports commercial containerized router appliances such as Nokia’s SR Linux and Arista’s CEOS. In each case, Containerlab takes into account the specific requirements of each device. If you wish to use commercial containerized network operating systems that are not listed among the supported device types, you may need to communicate with the Containerlab developers and request that support for your device be added or, better yet, offer to contribute to the project.

However, Containerlab should be able to use any open-source network operating system, such as Free Range Routing (FRR), that runs on a Linux container. In this example, I will use the network-multitool container and the FRR container from DockerHub to create nodes in my network emulation scenario.

To build a lab, first create a new directory. In that directory, create a Containerlab topology file. You may optionally add any configuration files you wish to mount in the container and, as you will see below, you may need to write some simple shell scripts to ensure all the nodes in the lab start in a predefined state.

Create a Topology file

Containerlab defines lab topologies in topology definition files that use a simple YAML syntax. Look at the topology file examples in the /etc/containerlab/lab-examples directory for inspiration.

Create a directory for the network emulation scenario’s files:

$ mkdir -p ~/Documents/frrlab 
$ cd ~/Documents/frrlab

The lab in this example will consist of three routers connected in a ring topology and each router will have one PC connected to it. You must plan the topology and determine which ports will connect to each other.

Use your favorite text editor to create a file named frrlab.yml and add the following text to it:

name: frrlab

topology:
  nodes:
    router1:
      kind: linux
      image: frrouting/frr:v7.5.1
      binds:
        - router1/daemons:/etc/frr/daemons
    router2:
      kind: linux
      image: frrouting/frr:v7.5.1
      binds:
        - router2/daemons:/etc/frr/daemons
    router3:
      kind: linux
      image: frrouting/frr:v7.5.1
      binds:
        - router3/daemons:/etc/frr/daemons
    PC1:
      kind: linux
      image: praqma/network-multitool:latest
    PC2:
      kind: linux
      image: praqma/network-multitool:latest
    PC3:
      kind: linux
      image: praqma/network-multitool:latest

  links:
    - endpoints: ["router1:eth1", "router2:eth1"]
    - endpoints: ["router1:eth2", "router3:eth1"]
    - endpoints: ["router2:eth2", "router3:eth2"]
    - endpoints: ["PC1:eth1", "router1:eth3"]
    - endpoints: ["PC2:eth1", "router2:eth3"]
    - endpoints: ["PC3:eth1", "router3:eth3"]

The Containerlab topology file format is self-explanatory. The file starts with the name of the lab, followed by the lab topology, which defines each node and then the links between the nodes. If you wish to run more than one lab at the same time, you must ensure each lab has a different name in its topology file. You can also see that the file mounts a daemons configuration file into each router. We will create those files, next.

Add configuration files

The FRR network operating system must have a copy of the daemons file in its /etc/frr directory or FRR will not start. As you saw above, Containerlab makes it easy to specify which files to mount into each container.

Each router needs its own copies of the configuration files. Make separate directories for each router:

$ mkdir router1
$ mkdir router2
$ mkdir router3

Copy the standard FRR daemons config file from the FRR documentation to the frrlab/router1 directory. Edit the file:

$ vi router1/daemons

Change zebra, ospfd, and ldpd to “yes”. The new frrlab/router1/daemons file will look like the listing below:

zebra=yes
bgpd=no
ospfd=yes
ospf6d=no
ripd=no
ripngd=no
isisd=no
pimd=no
ldpd=yes
nhrpd=no
eigrpd=no
babeld=no
sharpd=no
staticd=no
pbrd=no
bfdd=no
fabricd=no

vtysh_enable=yes
zebra_options=" -s 90000000 --daemon -A 127.0.0.1"
bgpd_options="   --daemon -A 127.0.0.1"
ospfd_options="  --daemon -A 127.0.0.1"
ospf6d_options=" --daemon -A ::1"
ripd_options="   --daemon -A 127.0.0.1"
ripngd_options=" --daemon -A ::1"
isisd_options="  --daemon -A 127.0.0.1"
pimd_options="  --daemon -A 127.0.0.1"
ldpd_options="  --daemon -A 127.0.0.1"
nhrpd_options="  --daemon -A 127.0.0.1"
eigrpd_options="  --daemon -A 127.0.0.1"
babeld_options="  --daemon -A 127.0.0.1"
sharpd_options="  --daemon -A 127.0.0.1"
staticd_options="  --daemon -A 127.0.0.1"
pbrd_options="  --daemon -A 127.0.0.1"
bfdd_options="  --daemon -A 127.0.0.1"
fabricd_options="  --daemon -A 127.0.0.1"

Save the file and copy it to the other router folders so each router has its own copy.

$ cp router1/daemons router2/daemons
$ cp router1/daemons router3/daemons

Start the lab

To start a Containerlab network emulation, run the clab deploy command with the new frrlab topology file. Containerlab will download the docker images used to create the PCs and routers, start containers based on the images and connect them together.

Since we are using containers from Docker Hub, we need to first log in to Docker.

$ sudo docker login

Enter your Docker userid and password.

Now, run the Containerlab command:

$ sudo clab deploy --topo frrlab.yml

Containerlab outputs logs to the terminal while it sets up the lab. If you have any errors in your configuration file, Containerlab outputs descriptive error messages. The listing below shows a normal lab setup, based on the frrlab topology.

INFO[0000] Parsing & checking topology file: frrlab.yml 
INFO[0000] Pulling docker.io/praqma/network-multitool:latest Docker image 
INFO[0009] Done pulling docker.io/praqma/network-multitool:latest 
INFO[0009] Pulling docker.io/frrouting/frr:v7.5.1 Docker image 
INFO[0032] Done pulling docker.io/frrouting/frr:v7.5.1  
INFO[0032] Creating lab directory: /home/brian/Documents/frrlab/clab-frrlab 
INFO[0032] Creating docker network: Name='clab', IPv4Subnet='172.20.20.0/24', IPv6Subnet='2001:172:20:20::/64', MTU='1500'
INFO[0000] Creating container: router2                  
INFO[0000] Creating container: router1                  
INFO[0000] Creating container: Router3                 
INFO[0000] Creating container: PC1                      
INFO[0000] Creating container: PC2                      
INFO[0000] Creating container: PC3                      
INFO[0006] Creating virtual wire: router1:eth2 <--> router3:eth1 
INFO[0006] Creating virtual wire: router2:eth2 <--> router3:eth2 
INFO[0006] Creating virtual wire: PC1:eth1 <--> router1:eth3 
INFO[0006] Creating virtual wire: router1:eth1 <--> router2:eth1 
INFO[0006] Creating virtual wire: PC2:eth1 <--> router2:eth3 
INFO[0006] Creating virtual wire: PC3:eth1 <--> router3:eth3 
INFO[0006] Writing /etc/hosts file                      
+---+---------------------+--------------+---------------------------------+-------+-------+---------+----------------+----------------------+
| # |        Name         | Container ID |              Image              | Kind  | Group |  State  |  IPv4 Address  |     IPv6 Address     |
+---+---------------------+--------------+---------------------------------+-------+-------+---------+----------------+----------------------+
| 1 | clab-frrlab-PC1     | 3be7d5136a58 | praqma/network-multitool:latest | linux |       | running | 172.20.20.4/24 | 2001:172:20:20::4/64 |
| 2 | clab-frrlab-PC2     | 447d4a3fd09d | praqma/network-multitool:latest | linux |       | running | 172.20.20.5/24 | 2001:172:20:20::5/64 |
| 3 | clab-frrlab-PC3     | 146915d85bfe | praqma/network-multitool:latest | linux |       | running | 172.20.20.6/24 | 2001:172:20:20::6/64 |
| 4 | clab-frrlab-router1 | fa4beabef9e4 | frrouting/frr:v7.5.1            | linux |       | running | 172.20.20.2/24 | 2001:172:20:20::2/64 |
| 5 | clab-frrlab-router2 | c65b32cc2b46 | frrouting/frr:v7.5.1            | linux |       | running | 172.20.20.7/24 | 2001:172:20:20::7/64 |
| 6 | clab-frrlab-router3 | c992143448f7 | frrouting/frr:v7.5.1            | linux |       | running | 172.20.20.3/24 | 2001:172:20:20::3/64 |
+---+---------------------+--------------+---------------------------------+-------+-------+---------+----------------+----------------------+

Containerlab outputs a table containing information about the running lab. You can get the same information table later by running the sudo clab inspect --name frrlab command.

In the table, you see each node has an IPv4 address on the management network. If your network nodes run an SSH server, you would be able to connect to them via SSH. However, the containers I am using in this example are both based on Alpine Linux and do not have openssh-server installed, so we will connect to each node using Docker. If you want lab users to have a more realistic experience, you could build new containers, based on the frrouting and network-multitool containers, that also include openssh-server.

Configure network nodes

Currently, the nodes are running but the network is not configured. To configure the network, log into each node and run its native configuration commands, either in the shell (the ash shell in Alpine Linux), or in its router CLI (vtysh in FRR).

To configure PC1, run Docker to execute a new shell on the container, clab-frrlab-PC1.

$ sudo docker exec -it clab-frrlab-PC1 /bin/ash

Based on the network plan we created when we designed this network, configure PC1’s eth1 interface with an IP address and static routes to the external data networks.

/ # ip addr add 192.168.11.2/24 dev eth1
/ # ip route add 192.168.0.0/16 via 192.168.11.1 dev eth1
/ # ip route add 10.10.10.0/24 via 192.168.11.1 dev eth1
/ # exit

Configure PC2 in a similar way:

$ sudo docker exec -it clab-frrlab-PC2 /bin/ash
/ # ip addr add 192.168.12.2/24 dev eth1
/ # ip route add 192.168.0.0/16 via 192.168.12.1 dev eth1
/ # ip route add 10.10.10.0/24 via 192.168.12.1 dev eth1
/ # exit

Configure PC3:

$ sudo docker exec -it clab-frrlab-PC3 /bin/ash
/ # ip addr add 192.168.13.2/24 dev eth1
/ # ip route add 192.168.0.0/16 via 192.168.13.1 dev eth1
/ # ip route add 10.10.10.0/24 via 192.168.13.1 dev eth1
/ # exit

Configure Router1 by running vtysh in the Docker container clab-frrlab-router1.

$ sudo docker exec -it clab-frrlab-router1 vtysh

Enter the following FRR CLI commands to configure interfaces eth1, eth2, and eth3 with IP addresses that match the network design.

configure terminal 
service integrated-vtysh-config
interface eth1
 ip address 192.168.1.1/24
 exit
interface eth2
 ip address 192.168.2.1/24
 exit
interface eth3
 ip address 192.168.11.1/24
 exit
interface lo
 ip address 10.10.10.1/32
 exit
exit
write
exit

Configure Router2 in a similar way:

$ sudo docker exec -it clab-frrlab-router2 vtysh
configure terminal 
service integrated-vtysh-config
interface eth1
 ip address 192.168.1.2/24
 exit
interface eth2
 ip address 192.168.3.1/24
 exit
interface eth3
 ip address 192.168.12.1/24
 exit
interface lo
 ip address 10.10.10.2/32
 exit
exit
write
exit

Configure Router3:

$ sudo docker exec -it clab-frrlab-router3 vtysh
configure terminal 
service integrated-vtysh-config
interface eth1
 ip address 192.168.2.2/24
 exit
interface eth2
 ip address 192.168.3.2/24
 exit
interface eth3
 ip address 192.168.13.1/24
 exit
interface lo
 ip address 10.10.10.3/32
 exit
exit
write
exit

Some quick tests

After configuring the interfaces on each node, you should be able to ping from PC1 to any IP address configured on Router1, but not to interfaces on other nodes.

$ sudo docker exec -it clab-frrlab-PC1 /bin/ash
/ # ping -c1 192.168.11.1
PING 192.168.11.1 (192.168.11.1) 56(84) bytes of data.
64 bytes from 192.168.11.1: icmp_seq=1 ttl=64 time=0.066 ms

--- 192.168.11.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.066/0.066/0.066/0.000 ms
/ #
/ # ping -c1 192.168.13.2
PING 192.168.13.2 (192.168.13.2) 56(84) bytes of data.

--- 192.168.13.2 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

/ # 
/ # exit

Add OSPF

So that we can reach all networks in this example, set up a dynamic routing protocol on the FRR routers. In this example, we will set up a simple OSPF area for all networks connected to the routers.

Connect to vtysh on Router1:

$ sudo docker exec -it clab-frrlab-router1 vtysh

Add a simple OSPF configuration to Router1:

configure terminal 
router ospf
 passive-interface eth3
 passive-interface lo
 network 192.168.1.0/24 area 0.0.0.0
 network 192.168.2.0/24 area 0.0.0.0
 network 192.168.11.0/24 area 0.0.0.0
 exit
exit
write
exit

Configure Router2 in a similar way.

Connect to vtysh on Router2:

$ sudo docker exec -it clab-frrlab-router2 vtysh

Configure OSPF:

configure terminal 
router ospf
 passive-interface eth3
 network 192.168.1.0/24 area 0.0.0.0
 network 192.168.3.0/24 area 0.0.0.0
 network 192.168.12.0/24 area 0.0.0.0
 exit
exit
write
exit

Connect to vtysh on Router3:

$ sudo docker exec -it clab-frrlab-router3 vtysh

Configure OSPF:

configure terminal 
router ospf
 passive-interface eth3
 network 192.168.2.0/24 area 0.0.0.0
 network 192.168.3.0/24 area 0.0.0.0
 network 192.168.13.0/24 area 0.0.0.0
 exit
exit
write
exit

OSPF testing

Now, PC1 should be able to ping any interface on any network node. Run the ping command on PC1 to try to reach PC3 over the network.

$ sudo docker exec clab-frrlab-PC1 ping -c1 192.168.13.2
PING 192.168.13.2 (192.168.13.2) 56(84) bytes of data.
64 bytes from 192.168.13.2: icmp_seq=1 ttl=62 time=0.127 ms

--- 192.168.13.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.127/0.127/0.127/0.000 ms

A traceroute shows that the packets pass from PC1 to Router1, then to Router3, then to PC3:

$ sudo docker exec clab-frrlab-PC1 traceroute 192.168.13.2
traceroute to 192.168.13.2 (192.168.13.2), 30 hops max, 46 byte packets
 1  192.168.11.1 (192.168.11.1)  0.004 ms  0.005 ms  0.004 ms
 2  192.168.2.2 (192.168.2.2)  0.004 ms  0.005 ms  0.005 ms
 3  192.168.13.2 (192.168.13.2)  0.004 ms  0.007 ms  0.004 ms

This shows that the OSPF protocol successfully set up the routing tables on the routers so that all nodes on this network can reach each other.

Network defect introduction

To further demonstrate that the network configuration is correct, see what happens if the link between Router1 and Router3 goes down. If everything works correctly, the OSPF protocol will detect that the link has failed and reroute traffic going from PC1 to PC3 so that it passes through Router2 instead of over the direct link between Router1 and Router3.

But, there is no function in Containerlab that allows the user to control the network connections between nodes. So you cannot disable a link or introduce link errors using Containerlab commands. In addition, Docker does not manage the Containerlab links between nodes so we cannot use Docker network commands to disable a link.

Containerlab links are composed of pairs of veth interfaces which are managed in each node’s network namespace. We need to use Docker to run network commands on each container, or use native Linux networking commands to manage the links in each node’s network namespace.

One simple way to interrupt a network link is to run the ip command on a node to shut down a link on the node. For example, to shut off eth2 on Router1:

$ sudo docker exec -d clab-frrlab-router1 ip link set dev eth2 down

Then, run the traceroute command on PC1 and see how the path to PC3 changes:

$ sudo docker exec clab-frrlab-PC1 traceroute 192.168.13.2
traceroute to 192.168.13.2 (192.168.13.2), 30 hops max, 46 byte packets
 1  192.168.11.1 (192.168.11.1)  0.005 ms  0.004 ms  0.004 ms
 2  192.168.1.2 (192.168.1.2)  0.005 ms  0.004 ms  0.002 ms
 3  192.168.3.2 (192.168.3.2)  0.002 ms  0.005 ms  0.002 ms
 4  192.168.13.2 (192.168.13.2)  0.002 ms  0.007 ms  0.011 ms

We see that the packets now travel from PC1 to PC3 via Router1, Router2, and Router3.

Restore the link on Router1:

$ sudo docker exec clab-frrlab-router1 ip link set dev eth2 up

And see that the traceroute between PC1 and PC3 goes back to its original path.

$ sudo docker exec clab-frrlab-PC1 traceroute 192.168.13.2
traceroute to 192.168.13.2 (192.168.13.2), 30 hops max, 46 byte packets
 1  192.168.11.1 (192.168.11.1)  0.004 ms  0.005 ms  0.003 ms
 2  192.168.2.2 (192.168.2.2)  0.004 ms  0.004 ms  0.002 ms
 3  192.168.13.2 (192.168.13.2)  0.002 ms  0.005 ms  0.003 ms

Links can also be managed by ip commands executed on the host system. Each node is in its own network namespace, which is named the same as its container name. To bring down a link on Router1, we first list all the links in its namespace, clab-frrlab-router1:

$ sudo ip netns exec clab-frrlab-router1 ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
91: eth0@if92: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default 
    link/ether 02:42:ac:14:14:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
106: eth2@if105: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue state UP mode DEFAULT group default 
    link/ether 16:36:c6:ca:4e:77 brd ff:ff:ff:ff:ff:ff link-netns clab-frrlab-router3
107: eth3@if108: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue state UP mode DEFAULT group default 
    link/ether f2:4e:6d:f5:e9:01 brd ff:ff:ff:ff:ff:ff link-netns clab-frrlab-PC1
114: eth1@if113: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue state UP mode DEFAULT group default 
    link/ether 42:ca:0d:5c:15:3c brd ff:ff:ff:ff:ff:ff link-netns clab-frrlab-router2
$

We see device eth2 is attached to the network namespace clab-frrlab-router1. To bring down the device eth2 in clab-frrlab-router1 run the following command:

$ sudo ip netns exec clab-frrlab-router1 ip link set dev eth2 down

We see the traceroute from PC1 to PC3 again passes through Router1, Router2, and Router3, just like it did when we disabled Router1‘s eth2 link from inside the container.

$ sudo docker exec clab-frrlab-PC1 traceroute 192.168.13.2
traceroute to 192.168.13.2 (192.168.13.2), 30 hops max, 46 byte packets
 1  192.168.11.1 (192.168.11.1)  0.007 ms  0.006 ms  0.005 ms
 2  192.168.1.2 (192.168.1.2)  0.006 ms  0.009 ms  0.006 ms
 3  192.168.3.2 (192.168.3.2)  0.005 ms  0.008 ms  0.004 ms
 4  192.168.13.2 (192.168.13.2)  0.004 ms  0.007 ms  0.004 ms

Then, bring the device back up:

$ sudo ip netns exec clab-frrlab-router1 ip link set dev eth2 up

Then, see that the traceroute from PC1 to PC3 goes back to the normal route, passing through Router1 and Router3.

$ sudo docker exec clab-frrlab-PC1 traceroute 192.168.13.2
traceroute to 192.168.13.2 (192.168.13.2), 30 hops max, 46 byte packets
 1  192.168.11.1 (192.168.11.1)  0.008 ms  0.006 ms  0.003 ms
 2  192.168.3.2 (192.168.3.2)  0.005 ms  0.008 ms  0.005 ms
 3  192.168.13.2 (192.168.13.2)  0.005 ms  0.006 ms  0.005 ms

So, we see we can impact network behavior using ip commands on the host system.

Stop the network emulation

To stop a Containerlab network, run the clab destroy command using the same topology file you used to deploy the network:

$ sudo clab destroy --topo frrlab.yml

Persistent configuration

Containerlab will import and save configuration files for some kinds of nodes, such as the Nokia SR Linux kind. However, Linux containers only have access to standard Docker tools like volume mounting, although Containerlab facilitates mounting volumes by allowing users to specify bind mounts in the lab topology file.

Persistent configuration for FRR routers

The routers in this example are based on FRR, which uses the configuration files /etc/frr/daemons and /etc/frr/frr.conf.

Create an frr.conf file for each router and save each file in its lab folder’s router directory.

Router1:

Create the configuration file for Router1 and save it in router1/frr.conf.

frr version 7.5.1_git
frr defaults traditional
hostname router1
no ipv6 forwarding
!
interface eth1
 ip address 192.168.1.1/24
!
interface eth2
 ip address 192.168.2.1/24
!
interface eth3
 ip address 192.168.11.1/24
!
interface lo
 ip address 10.10.10.1/32
!
router ospf
 passive-interface eth3
 network 192.168.1.0/24 area 0.0.0.0
 network 192.168.2.0/24 area 0.0.0.0
 network 192.168.11.0/24 area 0.0.0.0
!
line vty
!

Router2:

Create the configuration file for Router2 and save it in router2/frr.conf.

frr version 7.5.1_git
frr defaults traditional
hostname router2
no ipv6 forwarding
!
interface eth1
 ip address 192.168.1.2/24
!
interface eth2
 ip address 192.168.3.1/24
!
interface eth3
 ip address 192.168.12.1/24
!
interface lo
 ip address 10.10.10.2/32
!
router ospf
 passive-interface eth3
 network 192.168.1.0/24 area 0.0.0.0
 network 192.168.3.0/24 area 0.0.0.0
 network 192.168.12.0/24 area 0.0.0.0
!
line vty
!

Router3:

Create the configuration file for Router3 and save it in router3/frr.conf.

frr version 7.5.1_git
frr defaults traditional
hostname router3
no ipv6 forwarding
!
interface eth1
 ip address 192.168.2.2/24
!
interface eth2
 ip address 192.168.3.2/24
!
interface eth3
 ip address 192.168.13.1/24
!
interface lo
 ip address 10.10.10.3/32
!
router ospf
 passive-interface eth3
 network 192.168.2.0/24 area 0.0.0.0
 network 192.168.3.0/24 area 0.0.0.0
 network 192.168.13.0/24 area 0.0.0.0
!
line vty
!

Modify the topology file

Edit the frrlab.yml file and add the mounts for the frr.conf files for each router:

name: frrlab

topology:
  nodes:
    router1:
      kind: linux
      image: frrouting/frr:v7.5.1
      binds:
        - router1/daemons:/etc/frr/daemons
        - router1/frr.conf:/etc/frr/frr.conf
    router2:
      kind: linux
      image: frrouting/frr:v7.5.1
      binds:
        - router2/daemons:/etc/frr/daemons
        - router2/frr.conf:/etc/frr/frr.conf
    router3:
      kind: linux
      image: frrouting/frr:v7.5.1
      binds:
        - router3/daemons:/etc/frr/daemons
        - router3/frr.conf:/etc/frr/frr.conf
    PC1:
      kind: linux
      image: praqma/network-multitool:latest
    PC2:
      kind: linux
      image: praqma/network-multitool:latest
    PC3:
      kind: linux
      image: praqma/network-multitool:latest

  links:
    - endpoints: ["router1:eth1", "router2:eth1"]
    - endpoints: ["router1:eth2", "router3:eth1"]
    - endpoints: ["router2:eth2", "router3:eth2"]
    - endpoints: ["PC1:eth1", "router1:eth3"]
    - endpoints: ["PC2:eth1", "router2:eth3"]
    - endpoints: ["PC3:eth1", "router3:eth3"]

Persistent configuration for PC network interfaces

To permanently configure network settings on an Alpine Linux system, one would normally save an interfaces configuration file on each PC in the /etc/network directory, or save a startup script in one of the network hook directories such as /etc/network/if-up.d.

However, Docker containers do not have permission to manage their own networking with initialization scripts. The user must connect to the container’s shell and run ip commands or must configure the container’s network namespace. I think it is easier to work with each container using Docker commands.

To create a consistent initial network state for each PC container, create a script that runs on the host that will configure the PCs’ eth1 interface and set up some static routes.

Create a file named PC-interfaces.sh and save it in the lab directory. The file contents are shown below:

#!/bin/sh
sudo docker exec clab-frrlab-PC1 ip link set eth1 up
sudo docker exec clab-frrlab-PC1 ip addr add 192.168.11.2/24 dev eth1
sudo docker exec clab-frrlab-PC1 ip route add 192.168.0.0/16 via 192.168.11.1 dev eth1
sudo docker exec clab-frrlab-PC1 ip route add 10.10.10.0/24 via 192.168.11.1 dev eth1

sudo docker exec clab-frrlab-PC2 ip link set eth1 up
sudo docker exec clab-frrlab-PC2 ip addr add 192.168.12.2/24 dev eth1
sudo docker exec clab-frrlab-PC2 ip route add 192.168.0.0/16 via 192.168.12.1 dev eth1
sudo docker exec clab-frrlab-PC2 ip route add 10.10.10.0/24 via 192.168.12.1 dev eth1

sudo docker exec clab-frrlab-PC3 ip link set eth1 up
sudo docker exec clab-frrlab-PC3 ip addr add 192.168.13.2/24 dev eth1
sudo docker exec clab-frrlab-PC3 ip route add 192.168.0.0/16 via 192.168.13.1 dev eth1
sudo docker exec clab-frrlab-PC3 ip route add 10.10.10.0/24 via 192.168.13.1 dev eth1

Make the file executable:

$ chmod +x PC-interfaces.sh

After you start this lab using the Containerlab topology file, run the PC-interfaces.sh script to configure the PCs. The routers will get their initial configuration from each one’s mounted frr.conf file.

Create a small script that starts everything. For example, I created an executable script named lab.sh and saved it in the lab directory. The script is shown below:

#!/bin/sh
clab deploy --topo frrlab.yml
./PC-interfaces.sh

Now, when I want to start the FRR lab in a known state, I run the command:

$ sudo ./lab.sh

Get lab information

You can get some information about the lab using the inspect and graph functions.

$ sudo containerlab inspect --name frrlab
+---+-----------------+----------+---------------------+--------------+---------------------------------+-------+-------+---------+----------------+----------------------+
| # |    Topo Path    | Lab Name |        Name         | Container ID |              Image              | Kind  | Group |  State  |  IPv4 Address  |     IPv6 Address     |
+---+-----------------+----------+---------------------+--------------+---------------------------------+-------+-------+---------+----------------+----------------------+
| 1 | frrlab.clab.yml | frrlab   | clab-frrlab-PC1     | 02eea96ab0f0 | praqma/network-multitool:latest | linux |       | running | 172.20.20.4/24 | 2001:172:20:20::4/64 |
| 2 |                 |          | clab-frrlab-PC2     | 9987d5ac6bd9 | praqma/network-multitool:latest | linux |       | running | 172.20.20.6/24 | 2001:172:20:20::6/64 |
| 3 |                 |          | clab-frrlab-PC3     | 66c24d270c1a | praqma/network-multitool:latest | linux |       | running | 172.20.20.7/24 | 2001:172:20:20::7/64 |
| 4 |                 |          | clab-frrlab-router1 | 4936f56d28b2 | frrouting/frr:v7.5.1            | linux |       | running | 172.20.20.2/24 | 2001:172:20:20::2/64 |
| 5 |                 |          | clab-frrlab-router2 | 610563b7052a | frrouting/frr:v7.5.1            | linux |       | running | 172.20.20.3/24 | 2001:172:20:20::3/64 |
| 6 |                 |          | clab-frrlab-router3 | 9f501e040a65 | frrouting/frr:v7.5.1            | linux |       | running | 172.20.20.5/24 | 2001:172:20:20::5/64 |
+---+-----------------+----------+---------------------+--------------+---------------------------------+-------+-------+---------+----------------+----------------------+

The graph function provides a web-based view of the lab topology.

Run:

$ sudo containerlab graph --topo frrlab.yml

Open a web browser to URL: `https://localhost:50080`. You will see a web page with a network diagram and a table with management information.

For small networks, this is not very useful because it does not show the port names on each node. I think it would be more useful in large network emulation scenarios with dozens or hundreds of nodes.

Packet capture on lab interfaces

To capture network traffic on one of the containerlab network connections, one must again access interfaces in the network namespaces for each container.

For example, we know that traffic from PC1 to PC3 will, when all links are up, pass via the link between Router1 and Router3. Let’s monitor the traffic on one of the interfaces that make up that connection.

We know, from our topology file, that interface eth2 on Router1 is connected to eth1 on Router3. So, let’s look at the traffic on Router3 eth1.

Router3’s network namespace has the same name as the container that runs Router3: clab-frrlab-router3. Follow the directions from the Containerlab documentation and run the following command to execute tcpdump and forward the tcpdump output to Wireshark:

$ sudo ip netns exec clab-frrlab-router3 tcpdump -U -n -i eth1 -w - | wireshark -k -i -

In the above command, tcpdump will send an unbuffered stream (the -U option) of packets read on interface eth1 (the -i eth1 option), without converting addresses to names (the -n option), to standard output (the -w - option), which is piped to Wireshark, which reads from standard input (the -i - option) and starts reading packets immediately (the -k option).

You should see a Wireshark window open on your desktop, displaying packets captured from Router3’s eth1 interface.

Stop the capture and Wireshark with the Ctrl-C key combination in the terminal window.
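
If you prefer to analyze the traffic later, you could instead write the packets to a capture file and open that file in Wireshark afterwards. For example, the command below, where the output file name is just an example, saves the capture in the current directory:

$ sudo ip netns exec clab-frrlab-router3 tcpdump -n -i eth1 -w router3-eth1.pcap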

Stopping a network emulation

To stop a Containerlab network, run the clab destroy command using the same topology file you used to deploy the network:

$ sudo clab destroy --topo frrlab.yml

Contributing to Containerlab

If you create an interesting network emulation scenario, you may wish to contribute it to the lab examples in the Containerlab project.

In my case, I opened pull request #417 on Containerlab’s GitHub project page to offer them the files that create this example and hope it will be accepted.

Conclusion

Containerlab is a new network emulation tool that can create large, complex network emulation scenarios from a simple topology file. It leverages the strengths of Docker and Linux networking to build a lightweight infrastructure in which the emulated nodes run. The Containerlab developers provide strong integration with the SR Linux network operating system and basic, built-in support for other commercial network operating systems.

Containerlab would be most interesting to network engineers who need to automate the setup of test networks as part of a development pipeline for network changes. The topology file for the test network can be included with the network configurations that need to be tested.

Containerlab does not abstract away all the complexity, however. Users may still need to have intermediate-level knowledge of Linux networking commands and Docker to emulate network failures and to capture network traffic for analysis and verification.

2021 IT Blog Awards finalist!

I was surprised but very honoured to learn that my blog was selected as a finalist in the IT Blog Awards. I started this blog to help with my learning during a personal research project and to contribute to the open-source networking community as best I could. I never imagined that someone else might consider it for an honour such as this!

If you have gotten value from reading this blog, please go to the IT Blog Awards voting page and vote for the “Open Source Routing and Network Simulation” blog. Thank you so much!

Learning to use Python classes

This tutorial demonstrates object-oriented programming and Python classes.

I think that most people learn best when working on a practical project, so I will show readers how to build a simple program that they can share with their friends and family. While building the program, I demonstrate the types of problems solved by using Python classes and I use Python classes to build and manage multiple game elements.

NOTE: I realize this is off-topic for my blog. I used the Pyxel game framework as a tool to introduce Python programming to my child. After using Pyxel to build a game, I thought that it provided a good example of using Python classes in an easy-to-understand way.

I assume the reader has already learned the basics of Python programming.

Python Classes

A Python class is a type of Python object used in object-oriented programming. Programmers create new objects by instantiating, or calling, classes. They may then use or modify those instances’ attributes in their programs.

Each instance of a class is a unique object that may contain variables, called data attributes, and functions, called methods.

Each class also contains an initialization method, called a constructor, that runs when a new instance is created. It defines the initial state of the instance, based on the code in the constructor and on any data passed in when the instance is created.
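
For example, a minimal class with a constructor, one data attribute, and one method might look like the sketch below. The Counter class is only an illustration and is not part of the game you will build in this tutorial:

class Counter:
    def __init__(self, start=0):
        # the constructor sets the instance's initial state
        self.count = start        # a data attribute

    def increment(self):          # a method
        self.count += 1

c1 = Counter()      # instantiate the class to create an object
c2 = Counter(10)    # each instance manages its own data
c1.increment()
print(c1.count, c2.count)         # prints: 1 10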

To demonstrate using Python classes, this tutorial will show you how to build a game using Python and the Pyxel framework. You will use Python classes and learn fundamental object-oriented programming concepts such as inheritance.1

The Pyxel Framework

Pyxel is a retro game engine for Python. I chose Pyxel for this tutorial because it takes only a few minutes to learn enough about Pyxel to build a simple game or animation.

Pyxel enables programmers to develop pixel-based games similar to video games from the 1980s and early 1990s. Pyxel provides a set of functions that do most of the work of managing the game loop, displaying graphics, and playing sounds. Pyxel also offers the Pyxel Editor: an all-in-one solution for creating sprites, tiles, tile maps, sounds, and music for Pyxel games.

The Pyxel web page contains everything you need to know about using Pyxel and the Pyxel Editor. It will take about ten minutes to read the documentation.

Please stop here and read the Pyxel documentation. Then, continue with this tutorial.

If you would like to spend more time learning about Pyxel, you may look at the resources listed at the end of this post.

Install Pyxel and create an environment

To work with Pyxel and follow the examples in this post, first create a Python virtual environment and install Pyxel in that environment. Then, install the Pyxel example files so you can re-use some of the assets from the examples in this tutorial. Execute the following commands: 2

$ mkdir learn_pyxel
$ cd learn_pyxel
$ python3 -m venv env
$ source ./env/bin/activate
$ pip install pyxel

Copy the Pyxel example files to the project folder:

$ pyxel copy_examples 

List the contents of the learn_pyxel directory:

$ ls
env  pyxel_examples

Pyxel resource files

Some example game resource files, which contain game assets such as sprites, tiles, tile maps, sounds, and music, are stored in the pyxel_examples/assets directory. Pyxel resource files have the file extension .pyxres.

$ ls -1 pyxel_examples/assets
cat_16x16.png
jump_game.pyxres
noguchi_128x128.png
offscreen.pyxres
platformer.pyxres
pyxel_logo_38x16.png
sample.pyxres
tileset_24x32.png

In the examples below, use the assets in the platformer.pyxres file because it contains a simple set of sprites.

Create a new folder for your first game project and copy the resource file into your project folder:

$ mkdir first_game
$ cp pyxel_examples/assets/platformer.pyxres first_game
$ cd first_game

The Pyxel Editor

You can view the resource file in the Pyxel Editor using the following Pyxel command:

$ pyxel edit platformer.pyxres

You should see a new window appear on your desktop that looks like the image below:

screenshot showing Pyxel Editor

The Pyxel Editor

This is the Pyxel Editor. It is displaying the contents of the platformer.pyxres file. You may use it to view and create sprites, tiles, tile maps, sounds, and music for Pyxel games.

This tutorial focuses mostly on Python programming and using Python classes. It will not cover how to use the Pyxel Editor to create new game assets. In this tutorial, you will use the Pyxel Editor only to find existing game assets in the platformer.pyxres resource file. 3

Quit the editor by pressing the Escape key.

First Pyxel program

First, write the program in the procedural style so you can contrast this version with a program written in the object-oriented style later.

Create a small, procedural Pyxel program that displays an animation of a bird flapping its wings.

The bird sprite

Re-use the bird sprites in the resource file platformer.pyxres. The three bird sprites are on Image 0 and are in (x, y) positions (0, 16), (8, 16), and (16, 16). Each sprite is eight pixels high, eight pixels wide, and shows the bird in a different animated position.

Pyxel window

First, import the pyxel module, initialize the screen size and frame rate (per second), and load the Pyxel resource file platformer.pyxres:

import pyxel

pyxel.init(64, 32, fps=30)
pyxel.load("platformer.pyxres")

Next, create the update and draw functions required by the Pyxel framework and pass them into the pyxel.run function, which manages the game loop:

def update():
    pass

def draw():
    pass

pyxel.run(update, draw)

If you were to save and run this program right now, you would see a new window that is twice as wide as it is tall. The window contains nothing because we did not define anything in the draw function.

Quit the program by pressing the Escape key.

First sprite

To display one of the bird sprites, change the draw function to the following:

def draw():
    pyxel.cls(6)
    pyxel.blt(28, 12, 0, 0, 16, 8, 8)

See the Pyxel Graphics documentation for a description of the pyxel.cls() function, which clears the screen and fills it with a specified color, and the pyxel.blt() function, which copies a bitmap area from the resource file and places it on the Pyxel game screen. If you save and run the program now, you will see the eight-by-eight-pixel bird sprite appear on the screen. This sprite was copied from an eight-by-eight-pixel area starting at x and y coordinates 0 and 16 in the resource file’s Image 0. On the game screen, the upper-left corner of the sprite is placed at x and y coordinates 28 and 12, making the bird appear centered in the screen.

The pyxel.blt() function has an optional parameter that lets you specify a transparent color on the sprite so it looks better on various backgrounds. In this case, the sprite’s background color is color number 2. Add that parameter to the pyxel.blt function, as shown below:

def draw():
    pyxel.cls(6)
    pyxel.blt(28, 12, 0, 0, 16, 8, 8, 2)
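
For reference, here is how the arguments in this pyxel.blt() call map to the sprite, summarized from the Pyxel graphics documentation:

# pyxel.blt(x, y, img, u, v, w, h, colkey)
#   x, y    destination position on the game screen (28, 12)
#   img     image bank in the resource file (0)
#   u, v    upper-left corner of the source area in the image bank (0, 16)
#   w, h    width and height of the area to copy (8, 8)
#   colkey  color number treated as transparent (2)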

If you save and run the program now, you will see a window similar to the one below:

The bird sprite

The bird sprite

Sprite animation

Now, animate the bird by changing which sprite image is displayed in each frame. Since the three bird sprites are all in a line whose top edge is 16 pixels down from the top of Image 0 in the Pyxel resource file, we just need to change the x-position of the sprite passed to the pyxel.blt function.

The usual place to store logic that updates the positions or properties of game elements is the update function.

Change the update function to the following:

def update():
    sprite_u = 8 * (pyxel.frame_count % 3)

The pyxel.frame_count value increments by one each time Pyxel runs through the game loop. The modulo operator returns the remainder of division, so it will result in a value of 0, 1, or 2, depending on the frame count. Multiply that by eight and you get a sprite_u value of 0, 8, or 16, depending on the frame.
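
You can check the cycle yourself by evaluating the expression for a few frame counts in a separate Python session, outside the game program:

for frame_count in range(6):
    print(frame_count, 8 * (frame_count % 3))
# prints:
# 0 0
# 1 8
# 2 16
# 3 0
# 4 8
# 5 16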

Replace the hard-coded sprite x value in the pyxel.blt function in the draw function with the sprite_u variable.

def draw():
    pyxel.cls(6)
    pyxel.blt(28, 12, 0, sprite_u, 16, 8, 8, 2)

When you save and run the program, you see the first problem you need to solve: Python stops the program with an error because the variable sprite_u is local to the update function and is not available in the draw function.

Now, you are at the point where you have to choose between managing global variables in a program, or using classes.

Global variables

One way to make a variable assigned in a function available to other functions, and to the main program, is to explicitly define it as a global variable, using the global keyword. Global variables may be used in the main program’s namespace and in any function that declares them.

Change the update and draw functions as shown below:

def update():
    global sprite_u
    sprite_u = 8 * (pyxel.frame_count % 3)

def draw():
    global sprite_u
    pyxel.cls(6)
    pyxel.blt(28, 12, 0, sprite_u, 16, 8, 8, 2)

As shown above, you declared the sprite_u variable to be a global variable in both functions. This solves the problem for now, but global variables will become difficult to manage as the program gets more complex. Generally, programmers do not want to use global variables to store program state. 4

Now the program runs: the variable sprite_u is assigned in the update function and its value can be read in the draw function.

Changing the speed of sprites

However, the animation is moving too fast. We could reduce the animation speed by lowering the game’s frame rate but that is not the best solution.

Managing the speed of game elements relative to the game frame rate is one of the first problems you need to solve in game development. One solution is to create yet another global variable that tracks the sprite frame index.

Increment the global frame index variable once every ten frames. When the frame index has incremented to 3, reset it to zero so it can continue to be used to calculate the sprite animations. For example:

animation_index = 0

def update():
    global sprite_u
    global animation_index
    if pyxel.frame_count % 10 == 0:
        if animation_index > 2:
            animation_index = 0
        sprite_u = 8 * animation_index
        animation_index += 1

Note that you had to assign a value to the animation_index variable in the main body of the program because you must assign a Python variable before you use it. This is OK because, after the variable is initially assigned, that initialization code does not run again. The Pyxel framework only runs code that is inside the update and draw functions during the game.

After you save and run the program, the sprite_u variable cycles through 0, 8, and 16, changing every ten frames, or once every third of a second.

Bird sprite animation

Bird sprite animation

You will use this algorithm multiple times when you have different sprites moving at different speeds. You can imagine how complex it will get if you have to manage it with global variables.

Pyxel program using classes

In many cases, Python classes make it easier to organize and use data in your program. This is evident when we compare the examples above, written in a procedural style, with the examples below, written in an object-oriented style.

The Pyxel documentation recommends that you wrap pyxel code in a class so developers can avoid using global variables to pass data from the update() function to the draw() function in a Pyxel program. If the Pyxel code is wrapped in a class, one can store data in the object instance created when the class is called. That data can be accessed by the rest of the program.

The Pyxel App class

Refactor the first Pyxel program you previously wrote into a program that places the program logic in a class named App. See the example below:

import pyxel

class App:
    def __init__(self):
        pyxel.init(64, 32, fps=30)
        pyxel.load("platformer.pyxres")
        self.animation_index = 0
        pyxel.run(self.update, self.draw)

    def update(self):
        if pyxel.frame_count % 10 == 0:
            if self.animation_index > 2:
                self.animation_index = 0
            self.sprite_u = 8 * self.animation_index
            self.animation_index += 1

    def draw(self):
        pyxel.cls(6)
        pyxel.blt(28, 12, 0, self.sprite_u, 16, 8, 8, 2)

App()

You defined a class named App. In it, you defined the constructor method, named __init__, which initializes an instance of the App class in a known state. Since the constructor does not accept any parameters other than the self parameter, the initial state will be the same every time the class is called, or instantiated. The program calls the class when it is run.

The Self object

The self object represents the instance of the class that will be created when it is instantiated, or called. This object is passed into every method in the class so that all variables in the class are accessible to all the class’s methods, such as the update and draw methods. This eliminates the need for global variables because all variables are now attributes of the self object.

Multiple sprites

You realize another benefit of using classes when you create multiple instances of a class. For example, you can define a Sprite class, which separates all the logic and data associated with the bird sprites from the main program, and create multiple instances of birds on the screen, each with its own position data.

import pyxel

class Sprite:
    def __init__(self, x, y, index):
        self.sprite_x = x
        self.sprite_y = y
        self.animation_index = index

    def update(self):
        if pyxel.frame_count % 10 == 0:
            if self.animation_index > 2:
                self.animation_index = 0
            self.sprite_u = 8 * self.animation_index
            self.animation_index += 1

    def draw(self):
        pyxel.blt(self.sprite_x, self.sprite_y, 0, self.sprite_u, 16, 8, 8, 2)

We could further extend the Sprite class to include methods that change its position on the screen as time passes and that detect and respond to other game elements. All the information about position, speed, and animation is managed separately by each instance of the Sprite class, so it is possible to manage many birds in the same program.

The App class is now simplified because it does not need to manage the state of each bird sprite. You are beginning to see the benefits of information hiding, which we will discuss more later. When the App class is called, its initialization method instantiates two bird sprite objects by calling the Sprite class twice with different parameters. Then we just call the bird sprite objects’ update and draw methods in the App class during each game loop cycle, or frame.

class App:
    def __init__(self):
        pyxel.init(64, 32, fps=30)
        pyxel.load("platformer.pyxres")
        self.bird1 = Sprite(6,6,0)
        self.bird2 = Sprite(28,12,1)
        pyxel.run(self.update, self.draw)

    def update(self):
        self.bird1.update()
        self.bird2.update()

    def draw(self):
        pyxel.cls(6)
        self.bird1.draw()
        self.bird2.draw()

App()

You see in the example above that each bird object is initialized with data parameters representing its x and y coordinates on the game screen, and with the animation index. When you run the program, you see two birds on the screen in different locations, with each bird seeming to flap its wings at different times because each bird starts its animation sequence at a different frame, set by the animation index.

Many moving sprites

You can easily add yet another bird, with its own position and animation index, with just one line of code in each of the App class’s constructor, update and draw methods. Or, you could add a for loop that creates hundreds of bird sprites and saves them in a list. Then, you could update and draw those sprites by iterating through the sprite list in each of the update and draw methods.

For example, if we change the App class as shown below, we can generate a dozen bird sprites in random locations on the screen:

class App:
    def __init__(self):
        pyxel.init(64, 32, fps=30)
        pyxel.load("platformer.pyxres")
        self.sprite_list = []
        for i in range(12):
            a = pyxel.rndi(0,56)
            b = pyxel.rndi(0,24)
            c = pyxel.rndi(0,2)
            self.sprite_list.append(Sprite(a, b, c))
        pyxel.run(self.update, self.draw)

    def update(self):
        for i in range(12):
            self.sprite_list[i].update()

    def draw(self):
        pyxel.cls(6)
        for i in range(12):
            self.sprite_list[i].draw()

App()

Running the program shows twelve bird sprites in random locations around the screen, all flapping their wings independently.

Many bird sprites

Many bird sprites

The Sprite class can be modified to change the behavior of the bird sprites without changing the rest of the program code. For example, make the bird sprites move:

class Sprite:
    def __init__(self, x, y, index):
        self.sprite_x = x
        self.sprite_y = y
        self.animation_index = index

    def move(self):
        self.sprite_x += pyxel.rndi(-1,1)
        self.sprite_y += pyxel.rndi(-1,1)

    def animate(self):
        if self.animation_index > 2:
            self.animation_index = 0
        self.sprite_u = 8 * self.animation_index
        self.animation_index += 1

    def update(self):
        if pyxel.frame_count % 10 == 0:
            self.animate()
            self.move() 

    def draw(self):
        pyxel.blt(self.sprite_x, self.sprite_y, 0, self.sprite_u, 16, 8, 8, 2)

In this case, you added a move method that randomly changes the bird sprite’s x and y coordinates by minus one, zero, or plus one pixel. Then you moved the sprite animation code from the update method into its own animate method. Finally, you called the animate and move methods in the modified update method.

You did not need to modify the main application class, App, to change the behavior of all the bird sprites. You may be starting to see how Python classes and object-oriented programming enable programmers to build objects that can hide information from each other so that the code in one object does not need to know about all the code and data in another object.

Information hiding

Information-hiding makes it easier for multiple programmers to work together on the same project.

Information hiding is also called encapsulation. It is usually accomplished by breaking a large program up into smaller files, called modules. Programmers who are working together agree on how code in one module can access code in another module. This agreement is called an interface. As long as you do not change a module’s interface, you can add or change the rest of the code to improve the functionality of your module, without negatively impacting the functionality of your colleagues’ code.

For example, you can split your current program into two files, or modules, named sprites.py and game.py. The sprites.py file contains all the code for the Sprite class, and the game.py file contains all the main program code, including the Pyxel App class.

The game.py module

To make it clear how we can limit what the main program needs to know about each sprite, modify the code in each file so that all the logic related to positioning and animating the sprites is in the Sprite class in the sprites.py module.

First, you need to import the Sprite class from the sprites.py module.

import pyxel
from sprites import Sprite

Then, simplify the App class so it no longer needs to know the position and animation index of each sprite. In its constructor, the App object instantiates new sprites simply by calling the Sprite class and appending the returned sprite objects to a list. All the code that randomly assigns position and animation index will be encapsulated inside the Sprite class, and the actual values for those attributes, which are different for each sprite object, will be managed and updated within each sprite object.

class App:
    def __init__(self):
        pyxel.init(64, 32, fps=30)
        pyxel.load("platformer.pyxres")
        self.sprite_list = []
        for _ in range(12):
            self.sprite_list.append(Sprite())
        pyxel.run(self.update, self.draw)

Simplify the main game logic so it just instantiates new sprite objects and calls each sprite’s update and draw methods during the game loop.

    def update(self):
        for i in range(12):
            self.sprite_list[i].update()

    def draw(self):
        pyxel.cls(6)
        for i in range(12):
            self.sprite_list[i].draw()

App()

Now, whoever maintains the game.py file can concentrate on adding and removing sprites. The game.py module developer can add game features like different screens or interesting backgrounds while leaving the work of improving sprite animation and movement to another programmer who maintains the sprites.py module.

The sprites.py module

Modify the Sprite class in the sprites.py module so that it no longer accepts parameters. Add to the Sprite class’s constructor method the code that assigns the initial position and animation index. To make the sprite class more customizable, add separate timers for animation and movement and express the timer values as variables, which become object attributes when the constructor runs, instead of hard-coded numbers.

Also, generalize the animation logic by creating an animation sequence containing sets of x and y coordinates pointing to the upper-left corner of each sprite in the animation. Assign the sprite width and height to variables in the constructor. This will make it possible for programmers who use the Sprite class to customize it in their game program.

import pyxel

class Sprite:
    def __init__(self):
        self.x = pyxel.rndi(0,56)
        self.y = pyxel.rndi(0,24)
        self.w = 8
        self.h = 8
        self.col = 2
        self.animate_interval = 10
        self.move_interval = 25
        self.animation = ( (16, 16), (0, 16), (8, 16), (0, 16) )
        self.animation_index = pyxel.rndi(0,len(self.animation))

    def move(self):
        if pyxel.frame_count % self.move_interval == 0:
            self.x += pyxel.rndi(-1,1)
            self.y += pyxel.rndi(-1,1)

    def animate(self):
        if pyxel.frame_count % self.animate_interval == 0:
            if self.animation_index == len(self.animation):
                self.animation_index = 0
            self.u, self.v = self.animation[self.animation_index]
            self.animation_index += 1

    def update(self):
        self.animate()
        self.move() 

    def draw(self):
        pyxel.blt(self.x, self.y, 0, self.u, self.v, self.w, self.h, self.col)

When you run the game.py program you will see the same result as before: twelve bird sprites animating and moving around on the screen.

Inheritance

Inheritance is an object-oriented programming feature that enables you to add new types of sprites to your game program without modifying any code in the sprites.py file. You can build new classes based on existing classes, where you inherit all the functionality of the base class and then add new code that changes some of the base class’s attributes or methods in the new class.

Find a new sprite

For example, in the game program, add a new type of sprite that looks like a ball that flashes different colors. Open the Pyxel resource file and find the three different-colored ball sprites:

$ pyxel edit platformer.pyxres

See that each ball sprite is six pixels wide and six pixels high, and that the green ball sprite is located at coordinates (1, 9), the red ball at coordinates (9, 9), and the yellow ball at coordinates (17, 9). Quit the Pyxel Editor and use the information you gathered to build a new sprite type.

The Ball class

You do not need to write a whole new class for the ball sprite. Build the new Ball class by inheriting all the attributes and methods from the Sprite class and then just change the sprite width, height, and animation sequence information in the Ball class constructor.

Insert the code below, which creates the new class, before the App class in the game.py file:

class Ball(Sprite):
    def __init__(self):
        super().__init__()
        self.w = 6
        self.h = 6
        self.animation = ( (1, 9), (9, 9), (17, 9) )
        self.animation_index = pyxel.rndi(0,len(self.animation))

The Ball class inherits the Sprite class’s functionality by using the super built-in function to call the parent class’s constructor method in the new class’s constructor.

The super function provides a general-purpose way to call the parent class’s constructor method. It is recommended practice to use the super function instead of “hard coding” the Ball class’s constructor with the statement, Sprite.__init__(self), which explicitly calls the Sprite class’s constructor method.
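
For comparison, a hard-coded version of the same constructor would look like the sketch below. It works, but it ties the Ball class to the Sprite name, which makes later refactoring harder:

class Ball(Sprite):
    def __init__(self):
        Sprite.__init__(self)   # works, but hard-codes the parent class name
        self.w = 6
        self.h = 6
        self.animation = ( (1, 9), (9, 9), (17, 9) )
        self.animation_index = pyxel.rndi(0,len(self.animation))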

Change the sprite list creation loop in the App class constructor to the following, which creates a list with twelve elements: six birds and six balls.

        self.sprite_list = []
        for _ in range(6):
            self.sprite_list.append(Sprite())
            self.sprite_list.append(Ball())

The new game.py file will look like the file below.

import pyxel
from sprites import Sprite

class Ball(Sprite):
    def __init__(self):
        super().__init__()
        self.w = 6
        self.h = 6
        self.animation = ( (1, 9), (9, 9), (17, 9) )
        self.animation_index = pyxel.rndi(0,len(self.animation))

class App:
    def __init__(self):
        pyxel.init(64, 32, fps=30)
        pyxel.load("platformer.pyxres")
        self.sprite_list = []
        for _ in range(6):
            self.sprite_list.append(Sprite())
            self.sprite_list.append(Ball())           
        print(self.sprite_list)
        pyxel.run(self.update, self.draw)

    def update(self):
        for i in range(12):
            self.sprite_list[i].update()

    def draw(self):
        pyxel.cls(6)
        for i in range(12):
            self.sprite_list[i].draw()

App()

Python classes, and the concept of inheritance, enabled you to add a new sprite type with its own position data and its own animation and movement logic by adding just a few lines of code to your game program.

Different sprite types

Different sprite types

You did not need to ask the other programmer who maintains the sprites.py file to make any changes to their file. You can see how classes support code re-use and customization, which makes programs simpler and easier to maintain.

Conclusion

You built an object-oriented program using Python classes and used concepts like information-hiding, encapsulation, and code re-use that help make developing complex programs easier. You got a taste of what it would be like to work on a larger project with other programmers and how the concepts you exercised in this tutorial can help.

You also learned about building games using the Pyxel framework and created a simple game animation. If you are interested, you will find it relatively easy to add more functionality to the game, such as user input and collision detection. For example, see the following link to download and run the source code for a full-featured bird-drop game I created by extending the work already started in this tutorial.

More information about Pyxel

If you would like to spend some more time learning about Pyxel after reading this tutorial, the following resources will help.


  1. I ignore more complex object-oriented concepts such as composition and interfaces. Object inheritance is suitable for simple-to-intermediate complexity programs and is relatively easy to understand, compared to other object-oriented programming topics. It is also the correct way to manage objects in the game created in this tutorial because each subclass created has an “is a” relationship to its parent class. 
  2. I use a PC running Linux in all the examples. If you are using a Mac or a Windows PC, you will use slightly different commands to launch Python or to activate a Python virtual environment on your computer. 
  3. If you want a good introduction to creating new game assets in the Pyxel Editor, see the 2-hour video walking through the basics of Pyxel, referenced in the resources listed at the end of this post. 
  4. From Stack Overflow. A good set of reasons to avoid global variables is: global variables can be altered by any part of the code in the Python module, making it difficult to anticipate problems related to their use; global variables make it difficult to share your code with other developers and make code harder to debug and maintain; and global variables may make it very difficult to use more advanced programming techniques like automated testing or thread-safe programming. 

Twenty-five open-source network emulators and simulators you can use in 2023

I surveyed the current state of the art in open-source network emulation and simulation. I also reviewed the development and support status of all the network emulators and network simulators previously featured in my blog.

Of all the network emulators and network simulators I mentioned in my blog over the years, I found that eighteen of them are still active projects. I also found seven new projects that you can try. See below for a brief update about each tool.

Active projects

Below is a list of the tools previously featured in my blog that are, in my opinion, still actively supported.

Cloonix

Cloonix version 28 was released in January 2023. Cloonix stitches together Linux networking tools to make it easy to emulate complex networks by linking virtual machines and containers. Cloonix has both a command-line-interface and a graphical user interface.

The Cloonix web site now has a new address at clownix.net and the Cloonix project now hosts its code on GitHub. Cloonix has adopted a new release numbering scheme since I reviewed it in 2017, so it is now at “v28”.

CloudSim

CloudSim is still maintained. CloudSim is a network simulator that enables modeling, simulation, and experimentation of emerging Cloud computing infrastructures and application services. It is part of an ecosystem of projects and extensions, such as iFogSim. CloudSim release 6 was delivered in August, 2022.

cnet

The cnet network simulator is actively maintained. It enables development of and experimentation with a variety of data-link layer, network layer, and transport layer networking protocols in networks consisting of any combination of wide-area-networking (WAN), local-area-networking (LAN), or wireless-local-area-networking (WLAN) links 1. The project maintainers say it is open source, but you must provide your name and e-mail address to download the application source code. Version 3.5.3 was released in April 2022.

Containerlab

Containerlab is still very active. Containerlab is an open-source network emulator that quickly builds network test environments in a devops-style workflow. It provides a command-line-interface for orchestrating and managing container-based networking labs and supports containerized router images available from the major networking vendors. The most recent release was 0.36.1, released in January, 2023.

CORE

The Common Open Research Emulator (CORE) is still active. CORE consists of a GUI for drawing topologies of lightweight virtual machines, and Python modules for scripting network emulation 2. The most recent CORE release, 9.0.1, was released in November 2022. The CORE community is very active on the CORE Discord server.

EVE-NG

EVE-NG Community Edition continues to receive updates. It is a network emulator that supports virtualized commercial router images, such as Cisco and Nokia, and open-source routers. The EVE-NG team seems to focus on the commercial EVE-NG product but still supports the open-source EVE-NG Community version. EVE-NG Community Edition v5.0.1-13 was released in August 2022. I also found a new project that creates a Python API for EVE-NG.

While I was refreshing this list, I realized EVE-NG Community Edition is not open-source software. It was originally an open-source project called UNetLab, but the developers turned it into a commercial project and renamed it. I am keeping EVE-NG on this list because the Community Edition is still free to use.

GNS3

GNS3 continues to deliver new versions. GNS3 is a very popular network emulation tool that is primarily used to emulate networks of commercial routers, but it also supports open-source routers. It is often used by professionals studying for certification exams. GNS3 version 2.2.37 was released in January 2023.

IMUNES

IMUNES is stable. It is a network emulator. IMUNES and CORE share the same code heritage and their user interfaces are similar, but they have diverged from each other since 2012. IMUNES has seen less development activity than CORE in the past few years. The IMUNES developer made an update a few months ago to support the Apple M1 processor on Ubuntu 20.04 LTS.

Kathará

Kathará is still being maintained. It is a network emulator that can run either on a single host leveraging Docker or on a cluster using Kubernetes. It can run network emulation scenarios on a variety of operating systems such as Windows, Mac, and Linux, and in other environments such as data centers or the public cloud. It allows configuration and deployment of virtual networks featuring SDN, NFV, and traditional routing protocols, such as BGP and OSPF. Kathará offers Python APIs that allow users to script the creation of network scenarios. Version 3.5.5 was released in January, 2023.

Kathará was created by the original developers of Netkit and is intended to be the next evolution in network emulation. A fork of the original Netkit is still being maintained by another author and has updated documentation.

Labtainers

Labtainers is still being maintained. It is a network emulator that enables researchers and students to explore network security topics. It has many lab scenarios based on security topics. Version 1.3.7 was released in January 2023.

Linux Network Test Stack

The Linux Network Test Stack (LNTS) is still being maintained. It is a Python package that enables developers to build network emulation scenarios using a Python program. You may use LNTS to control a network of hardware nodes or to control an emulated network of containers. LNTS version 15.1 was released in August 2019, but the developer merged pull requests on GitHub as recently as a few weeks ago, so I believe this project is still active.

Mininet

Mininet published its last version, 2.3.0, two years ago but it is still being maintained and remains a popular network emulator. It is designed to support research and education in the field of Software Defined Networking systems. On Mininet’s GitHub repo, I see some minor development activity in recent months. Mininet-WiFi has about the same development activity. Both the Mininet mailing list and the Mininet-WiFi forum are still active. I also found some examples of building Mininet labs using Python and FRR.

Mini-NDN is a fork of Mininet designed for emulating Named Data Networking. Its most recent release was at the end of 2021.

Containernet is a fork of Mininet that allows you to use Docker containers as hosts in emulated network topologies. It is still being maintained. Its last release was in December, 2019, but its GitHub repository has seen a few pull requests merged in 2022.

NEmu

NEmu, the Network Emulator for Mobile Universes, is still being maintained. It creates QEMU VMs to build a dynamic virtual network and does not require root access to your computer. NEmu users write Python scripts to describe the network topology and functionality. Version 0.8.0 was released in January 2023.

Netlab

NetLab is actively maintained. NetLab uses Libvirt and Vagrant to set up a simulated network of configured, ready-to-use devices. It brings DevOps-style infrastructure-as-code and CI/CD concepts to networking labs. Netlab v1.5 was released in February, 2023.

ns-3

ns-3 is actively maintained and supported. It is a free, open-source discrete-event network simulator for Internet systems, targeted primarily for research and educational use. Version 3.37 was released in November 2022. The ns-3 source code is on GitLab.

OMnet++

Omnet++ is in active development. It is a discrete-event network simulator used by many universities for teaching and research. It is published under a license called the Academic Public License, which appears to be unique to the Omnet++ project. Commercial users must pay for a license, but academic or personal use is permitted without payment. Non-commercial developers have rights similar to the GPL. OMNeT++ 6.0.1 was released in September 2022.

OpenConfig-KNE

OpenConfig-KNE, Kubernetes Network Emulation, is actively maintained. It is a network emulator developed by the OpenConfig foundation. It extends basic Kubernetes networking so it can support point-to-point virtual connections between nodes in an arbitrary network topology. Additionally, the OpenConfig organization encourages the major networking equipment vendors, like Nokia, Cisco, and Juniper, to produce standard data models for configuration and standard container implementations for deployment. OpenConfig-KNE also supports standard containers so it can emulate networks comprised of open-source appliances. Version 0.1.7 was released in December 2022.

Shadow

Shadow is still under active development. It is a discrete-event network simulator that directly executes real application code, enabling you to simulate distributed systems with thousands of network-connected processes in realistic and scalable private network experiments using your laptop, desktop, or server running Linux 3. Shadow v2.4.0 was released in January 2023.

VNX

Virtual Networks over Linux (VNX) has been stable since 2020, but new filesystems were added in January 2023, so it is still supported. VNX is an open-source network simulation tool that builds and modifies virtual network test beds automatically from a user-created network description file. The latest version of VNX was released on September 14, 2020.

vrnetlab

vrnetlab has slowed down its development activity. The last commit was in December 2021, which is recent enough. However, on the GitHub repository there are many open pull requests and many issues that have not received a response. For now, I will keep listing vrnetlab on the sidebar because some parts of vrnetlab and the vrnetlab documentation may still be useful to users of Containerlab.

New tools

I surveyed the Internet for information about network emulators and simulators that were created after 2019, which was the last time I did a broad survey of available simulation tools.

I found seven tools that were new to me, and list them all below. Most are related to the emulation of wireless networks and core networks, which is very interesting to me because I could not find emulators for these types of networks back in 2019.

Colosseum

Colosseum provides open-source software for wireless network emulation. The software appears to be based on standard PC hardware and radios. I wonder if one can emulate the radios and build a completely virtual lab, maybe by combining it with ns-O-RAN or GNUradio.

This project looks interesting to me because it seems to have open-source versions of key components in wireless RAN and Core networks. The project is made up of many different sub-projects. srsRAN 22.10 was released in November 2022.

Cooja

The Cooja IoT network emulator is part of the new Contiki-NG project. Cooja enables fine-grained simulation/emulation of IoT networks that use the Contiki-NG IoT operating system. The Contiki-NG forum is very active, with most questions receiving a reply. Cooja has not yet had an official release, but the most recent pull requests were merged in February 2023.

CrowNet

CrowNet is an open-source simulation environment which models pedestrians using wireless communication. It can be used to evaluate pedestrian communication in urban and rural environments. It is based on Omnet++. Development is active. Version 0.9.0 was released in May, 2022.

CupCarbon

CupCarbon simulates wireless networks in cities and integrates data from OpenStreetMap. The code is available on GitHub, but there is no license information and there has been no official release, although some of the recent commits refer to version 5.2.

Meshtasticator

Meshtasticator is an emulator for Meshtastic software. Meshtastic is a project that enables you to use inexpensive LoRa radios as a long range off-grid communication platform in areas without existing or reliable communications infrastructure. This project is 100% community driven and open source! 4 Meshtasticator enables you to emulate the operation of a network of Meshtastic devices communicating with each other over LoRa radio. It is actively being developed. There is no tagged release, but GitHub pull requests have been merged as recently as February 2023.

MimicNet

MimicNet is a network simulator that uses machine learning to estimate the performance of large data centre networks. It was released in July 2019 and has seen few updates since then. MimicNet is the result of a research project and, now that the paper is published, the project appears to be in maintenance mode. Developers still respond to issues, and the last commit was in July 2022.

Tinet

Tinet, or Tiny Network, is another container-based network emulator that has a few good scenarios described in the examples folder in its repository. It is intended to be a simple tool that takes a YAML config file as input and generates a shell script to construct a virtual network. Version 0.0.2 was released in July 2020, but development has continued since then, with GitHub pull requests being merged as recently as January 2023.

Removed from my list

I removed two projects from my list of network emulators and simulators.

Antidote and NRE Labs are retired. See the announcement on the NRE Labs site.

Wistar seems to have been abandoned. There have been no updates in four years and no activity in the Wistar Slack channel.

Conclusion

I refreshed my list of network emulators and simulators. I now have twenty projects on my active list. I found seven new projects that I will look at in the future and determine if any should be added to my list. I removed two projects from my list.


  1. From https://www.csse.uwa.edu.au/cnet/index.php on February 15, 2023 
  2. From https://github.com/coreemu/core#about on February 12, 2023 
  3. From https://shadow.github.io/docs/guide/ on February 12, 2023 
  4. From Meshtastic Introduction: https://meshtastic.org/docs/introduction; February 2023 