Source code study

Version History
Date | Description
Mar 3, 2020 | add FreeBSD, some fpga stuff
Feb 4, 2020 | add io_uring, firecracker
Jan 31, 2020 | Add some good stuff
Jan 18, 2020 | Initial

🐋

Beautiful code is art. Recently I started forking good open source code into my own GitHub account, reading it casually, and taking notes. In general, GNU projects are very hard to read; they have their own coding style, which isn’t for everyone. My personal favorite is the Linux kernel coding style, and many Linux-related projects follow it, e.g., CRIU and rdma-core.

Either way, happy hacking!

Misc

Projects supporting our day-to-day work without us realizing it.

  • glibc: libc, elf, and dynamic linker
    • Some juicy information about GOT/PLT
    • It also explains what happens before main() is called
  • binutils: gas, static linker, and more
    • assembler is amazing
    • static linker: the magic is in its linker script!
  • strace
    • System call tracer at userspace
    • I’ve designed one for LegoOS in kernel space
  • vim
  • tmux
  • git
  • Network
    • iperf3
      • iperf3: A TCP, UDP, and SCTP network bandwidth measurement tool
    • tcpdump
      • the TCPdump network dissector
    • OpenSSH
      • ssh it is.
    • scapy
      • Python-based interactive packet manipulation program & library
  • C for life
    • cJSON
      • A lightweight JSON parser in C.
      • I think iperf3 is using it.
  • Outliers
    • CRIU: Checkpoint and Restore in Userspace
      • I love this repo because it has so many interesting pieces on how to interact with the kernel, save state, and restore it. In addition, it shows how to properly use many lesser-known syscalls.
    • GRUB2: bootloader
      • Learn how a modern bootloader works.
      • Detailed analysis of the Linux boot sequence (how it transitions from real mode to protected mode and finally to 64-bit mode, how to navigate the Linux source code, etc.)
    • io_uring
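
To make the "what happens before main()" point concrete, here is a minimal sketch, assuming GCC or Clang on Linux with glibc. The constructor attribute is a compiler extension; the names early_init, startup_trace, get_startup_trace and the DEMO_MAIN flag are made up for this example. The startup code reached from _start runs such constructors before it calls main(), so the trace is already filled in when main() begins.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Records which startup hooks have run so far (hypothetical helper). */
static char startup_trace[64];

/* GCC/Clang extension: the C runtime startup code calls this
 * function before it transfers control to main(). */
__attribute__((constructor))
static void early_init(void)
{
    strcat(startup_trace, "constructor;");
}

/* Returns the trace collected so far. */
const char *get_startup_trace(void)
{
    return startup_trace;
}

#ifdef DEMO_MAIN /* assumption: build with -DDEMO_MAIN to run standalone */
int main(void)
{
    /* By the time we get here, early_init() has already run. */
    assert(startup_trace[0] != '\0');
    printf("main() entered, trace so far: %s\n", get_startup_trace());
    return 0;
}
#endif
```

Build it with something like cc -DDEMO_MAIN demo.c and step through it in gdb with a breakpoint on early_init; the backtrace shows the glibc startup frames that sit between _start and main().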

Operating Systems

Figure: Unix timeline (Image source: https://commons.wikimedia.org/wiki/File:Unix_timeline.en.svg)

Virtualization

Compilers

Firmware

I’m obsessed with firmware projects, maybe because that’s where I got started. First it’s SeaBIOS, the default one used by QEMU. Then UEFI, something I have never used (!).

  • SeaBIOS: the default BIOS used by QEMU
  • qboot: an alternative and lightweight BIOS for QEMU
    • Those are massive hackers, respect.
    • My experience with the BIOS is calling its services while the kernel (LegoOS) is still running in 16-bit mode. The BIOS is the OS for a just-booted kernel. I remember that the lower 1MB is never cleared, so maybe we could invoke the BIOS from 32-bit or 64-bit mode?
  • UEFI EDK II
    • “EDK II is a firmware development environment for the UEFI and UEFI Platform Initialization (PI) specifications”
    • Part of the TianoCore project, an open-source UEFI platform
    • The Unified Extensible Firmware Interface (UEFI) is a specification that defines a software interface between an operating system and platform firmware. UEFI is designed to replace the Basic Input/Output System (BIOS) firmware interface.
    • OVMF: OVMF is an EDK II based project to enable UEFI support for Virtual Machines. OVMF contains sample UEFI firmware for QEMU and KVM.
  • Microsoft Project Mu, a separate fork of EDK II
    • “Project Mu is a modular adaptation of TianoCore’s edk2 tuned for building modern devices using a scalable, maintainable, and reusable pattern”
    • Its homepage explains the motivation behind it.
  • A book: Beyond BIOS: Developing with the Unified Extensible Firmware Interface.

FPGA

Web Servers

Key Value Stores

Points of interest: 1) is it in-memory, and can it extend to use disk/SSD? 2) persistence support 3) network support

RDMA and More

  • rdma-core
    • Userspace IB verbs library (e.g., libibverbs)
    • Commands such as ibv_devinfo, rc_pingpong
    • Learn how the userspace IB layer communicates with the kernel, and also how it bypasses the kernel. The technique relies on ioctl() and mmap(), which is standard. But the ABI (i.e., the data structures) is quite complex.
    • This is beautiful code
    • Kernel InfiniBand stack
  • DPDK
    • DPDK uses VFIO to directly access the physical device, just like how we directly assign a device to a guest OS in QEMU.
    • Even though both DPDK and RDMA bypass the kernel, their control paths are very different. For DPDK, there is a complete device driver in user space, and this driver communicates with the device via MMIO. After the VFIO ioctls, both the data path and the control path bypass the kernel. For rdma-core, many control-path IB verbs (e.g., create_pd, create_cq) communicate with the kernel via ioctl on the InfiniBand device file. You can see all those uverbs handlers in drivers/infiniband/core/uverbs.c. Those control verbs mmap some pages between user space and the kernel, so all subsequent datapath IB verbs (e.g., post_send) bypass the kernel and talk to the device MMIO directly. rdma-core also has some vendor-specific “drivers”, but these are really different from DPDK’s userspace PCIe driver described above. A userspace rdma-core vendor driver handles the details of talking to the matching kernel-level vendor driver.
    • FWIW, if you are using a Mellanox VPI card in Ethernet mode (e.g., CX3-5), DPDK will use its built-in mlx driver, which in turn uses libibverbs, which in turn relies on the kernel IB stack. So it is not a completely userspace solution. Note that DPDK’s built-in mlx driver uses RAW_PACKET QPs.
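
The "set things up through the kernel once, then bypass it" pattern described above can be sketched without any RDMA hardware. This is only an analogy, not the verbs ABI: the struct fake_queue layout, the function names, and the DEMO_MAIN flag are all invented for the example, and a forked child process stands in for the device. The control path is a single mmap() call that creates a shared page; after that, posting work and polling for completion are plain loads and stores on that page, with no system call in between.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std=c11 */
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Atomics shared between processes must be lock-free to work. */
_Static_assert(ATOMIC_INT_LOCK_FREE == 2, "need lock-free atomic ints");

/* Hypothetical shared-page layout; NOT a real verbs data structure. */
struct fake_queue {
    atomic_uint doorbell;   /* rung by the "application" */
    atomic_uint completion; /* set by the "device" */
    uint32_t    payload;    /* the "work request" */
};

/* Control path: one trip into the kernel to create the shared page. */
static struct fake_queue *setup_queue(void)
{
    void *p = mmap(NULL, sizeof(struct fake_queue), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    struct fake_queue *q = p;
    atomic_init(&q->doorbell, 0);
    atomic_init(&q->completion, 0);
    return q;
}

/* "Device" side: wait for the doorbell, process, signal completion. */
static void device_loop(struct fake_queue *q)
{
    while (atomic_load(&q->doorbell) == 0)
        ; /* busy-poll on shared memory */
    q->payload *= 2;
    atomic_store(&q->completion, 1);
}

/* Data path: only memory accesses on the shared page, no syscalls. */
static uint32_t post_and_wait(struct fake_queue *q, uint32_t value)
{
    q->payload = value;
    atomic_store(&q->doorbell, 1);
    while (atomic_load(&q->completion) == 0)
        ; /* busy-poll for the completion */
    return q->payload;
}

/* Runs the demo; returns value * 2, or 0 if setup fails. */
uint32_t run_demo(uint32_t value)
{
    struct fake_queue *q = setup_queue();
    if (q == NULL)
        return 0;
    pid_t pid = fork();
    if (pid < 0) {
        munmap(q, sizeof(*q));
        return 0;
    }
    if (pid == 0) { /* child plays the device */
        device_loop(q);
        _exit(0);
    }
    uint32_t result = post_and_wait(q, value);
    waitpid(pid, NULL, 0);
    munmap(q, sizeof(*q));
    return result;
}

#ifdef DEMO_MAIN /* assumption: build with -DDEMO_MAIN to run standalone */
int main(void)
{
    uint32_t result = run_demo(21);
    assert(result == 42);
    printf("device returned %u\n", result);
    return 0;
}
#endif
```

Running the standalone build under strace makes the point visible: the mmap and fork (clone) calls show up, but nothing appears between them and the final wait, because post_and_wait() never enters the kernel. In real rdma-core the shared pages come from the device file rather than an anonymous mapping, and the peer is the NIC rather than a child process.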
