Assignment 1 Shell
Due: Tuesday February 6, 2018 11:59PM
Overview
In this assignment, you will sharpen your Rust skills by fixing Rust programs, writing useful utilities and libraries, and writing a simple shell for your Raspberry Pi. You’ll also write generic drivers for GPIO, UART, and the built-in timer. Finally, you’ll write a “bootloader” using your new drivers that loads program binaries over UART using the XMODEM protocol and executes them.
Before starting assignment 1, we strongly recommend that you complete the required course readings:
- Rust Syntax Walkthrough
- TRPL v2 chapters 1 - 11 and chapters 13 - 19
- Excerpts from Rusty Types for Solid Safety
Phase 0: Getting Started
As with assignment 0, ensure that you are using a compatible machine:
- Runs a modern Unix natively: Linux, BSD, or macOS
- Runs a 64-bit variant of the OS
- Has a USB-A port or USB-C to USB-A adapter
And has the following software installed: git
, wget
, tar
, screen
,
make
, and the software from assignment 0.
For assignment 1, you will also need to install the socat
utility.
If you are running Windows and successfully completed assignment 0, it is likely you will have no issue completing assignment 1. As before, please note that we won’t be providing support for or answering questions about Windows configurations.
Getting the Skeleton Code
Clone the assignment 1 skeleton git repository to your development machine:
git clone https://cs140e.sergio.bz/assignments/1-shell/skeleton.git 1-shell
Feel free to explore the contents of the repository.
Questions
This and future assignments include writing questions that you must respond to. Here’s an example of such a question:
assignment0
How do you set other GPIO pins?
In assignment 0, you enabled GPIO pin 16 as an output and then repeatedly set
and cleared it by writing to registers GPFSEL1
, GPSET0
, and GPCLR0
.
Which three registers would you write to to do the same for GPIO pin 27?
Which physical pin on the Raspberry Pi maps to GPIO pin 27?
The badge on the right of the question panel indicates the name of a file
located inside of a questions/
directory relative to the phase in which the
question is being asked. For instance, phase 1’s directory is ferris-wheel/
.
As such, a question named modules
in phase 1 would have its answer recorded in
ferris-wheel/questions/modules
. Note that we have pre-generated empty files
for every question.
Practice responding to questions now by answering the assignment0
question
above.
Phase 1: Ferris Wheel
In this phase, you will modify Rust programs so that they compile, fail to
compile, or pass tests. Answers to questions in this phase should be recorded in
the ferris-wheel/questions/
directory.
In the ferris-wheel/
directory, you will have find four subdirectories:
compile-fail
- contains programs that you should modify so that they do not compilecompile-pass
- contains programs that you should modify so that they compilerun-pass
- contains programs that you should modify so that the assertions passquestions
- directory containing your answers to questions for this phase
You will also find a test.sh
script. This script checks whether you have
properly modified these programs. Here’s what its output should look like on the
first run:
$ ./test.sh
ERROR: compile-pass/borrow-1.rs failed to compile!
ERROR: compile-pass/const.rs failed to compile!
...
0 passes, 25 failures
The test script accepts a -v
flag. With -v
, the script will emit the error
output from the Rust compiler for each failure:
$ ./test.sh -v
ERROR: compile-pass/borrow-1.rs failed to compile!
---------------------- stderr --------------------------
warning: unused variable: `z`
--> /.../ferris-wheel/compile-pass/borrow-1.rs:9:9
|
9 | let z = *y;
| ^
|
= note: #[warn(unused_variables)] on by default
= note: to avoid this warning, consider using `_z` instead
error[E0507]: cannot move out of borrowed content
--> /.../ferris-wheel/compile-pass/borrow-1.rs:9:13
|
9 | let z = *y;
| ^^
| |
| cannot move out of borrowed content
| help: consider using a reference instead: `&*y`
error: aborting due to previous error
...
0 passes, 25 failures
Finally, the script accepts a filter string. When a filter is provided, only
paths ($directory/$filename
) that contain the filter string will be tested:
./test.sh trait
ERROR: compile-pass/trait-namespace.rs failed to compile!
ERROR: run-pass/trait-impl.rs failed to compile!
0 passes, 2 failures
You can combine the two options as well: ./test.sh -v filter
.
Diff Budget
Every file contains a comment indicating the diff budget for that file. The budget is the maximum number of changes that you are allowed to make to fix the program. Solutions that exceed the diff budget will receive no credit.
As an example, consider compile-pass/try.rs
, which contains the comment:
// FIXME: Make me compile. Diff budget: 12 line additions and 2 characters.
This means that you can add at most 12 lines (including new lines) to the file
as well as change (add/remove/modify) at most 2 characters. Use git diff
to
display line changes and git diff --word-diff-regex=.
to display character
changes.
As another example, consider the comment in compile-pass/privacy.rs
:
// FIXME: Make me compile! Diff budget: 1 line.
This means you can change (add/remove/modify) at most one line. Note that new lines count towards your line budget.
The Rules
All of your changes should preserve the intended functionality of the program in
question. For instance, if the body of some function must be modified so that it
compiles, it is not a valid solution to fill the body with unimplemented!()
.
When in doubt, use your best judgement or ask during lab, office hours, or on
Piazza.
Under no circumstances should you:
- Modify any
assert!
s. - Modify any statement, expression, block, or item marked with “do not modify”.
- Violate comments in the source code about acceptable changes.
- Change the name of any file or move files between directories.
- Add any file to the
ferris-wheel
directory.
Modify Away!
Modify each of the 25 programs in the ferris-wheel/
subdirectories so that
they compile, fail to compile, or pass tests. When you have correctly modified
all of the files, test.sh
will report 25 passes, 0 failures
.
Hint: A file’s name may contain clues towards a solution!
*
What was wrong? How did you fix it? Why did that work?
For every program you modify, explain what was wrong with the original
program. Then, explain what changes you made to fix the program and why your
changes fix the program. Record your explanations in the file with the same
name as the program in the questions/
directory. Pithy explanations are
appreciated.
Phase 2: Oxidation
In this phase, you will write two libraries, one command-line utility, and
review one library. You will be working in the stack-vec
, volatile
,
ttywrite
, and xmodem
skeleton subdirectories. Subdirectories may contain a
questions
directory in which you should record your answers to questions
corresponding to the respective subphase.
All projects are being managed with Cargo. You will find the following cargo
commands useful:
cargo build
- build an application or librarycargo test
- test an application or librarycargo run
- run an applicationcargo run -- $flags
- run an application and pass arbitrary flags to it
For more information on using Cargo and how Cargo works, see the Cargo Book.
Subphase A: StackVec
One important facility that operating systems provide is memory allocation. When
a C, Rust, Java, Python, or just about any application calls malloc()
and
malloc()
has run out of memory from the operating system, a system call is
eventually made to request additional memory. The operating system determines if
there is memory available, and if so, fulfills the request for memory.
Memory allocation is a complicated story.
In practice, modern operating systems like Linux have a complicated relationship with memory allocation. For instance, as an optimization, most requests for memory allocation are only “virtually” handled: no physical memory is actually allocated until the application tries to use the newly allocated memory. Nonetheless, most operating systems aim to provide the illusion that they are allocating memory in the simplistic manner we’ve described. Operating systems are master liars (🍰).
Heap-allocated structures like Vec
, String
, and Box
internally call
malloc()
to allocate memory as necessary. This means that these structures
require operating system support to function. In particular, they require the
operating system to support memory allocation. We haven’t yet started writing
our operating system, so clearly there’s no memory allocation support for our
tiny bare-metal programs to make use of. As such, we can’t use heap-allocated
structures like Vec
until our operating system is further along.
This is a real shame because Vec
is a nice abstraction! It allows us to think
about push
ing and pop
ing elements without having to keep track of memory
ourselves. How we can get the benefits of the Vec
abstraction without
supporting memory allocation?
One common technique is to pre-allocate memory and then hand that memory to a
structure to abstract away. Some ways to pre-allocate memory include using
static
declarations to set apart memory in the static section of a binary or
through stack allocations from local variable declarations. In any case, the
allocations is of a fixed, predetermined size.
In this subphase, you will implement the StackVec
structure, a structure that
exposes a Vec
-like API when given pre-allocated memory. You will use the
StackVec
type later in phase 3 when implementing a shell for your Raspberry
Pi. You will work in the stack-vec
skeleton subdirectory. The subdirectory
contains the following files:
Cargo.toml
- configuration file for Cargosrc/lib.rs
- where you will write your codesrc/tests.rs
- tests that will run whencargo test
is calledquestions/
- where you should record answers to questions
The StackVec
Interface
A StackVec<T>
is created by calling StackVec::new()
, passing in a mutable
slice to values of any type T
. The StackVec<T>
type implements many of the
methods that Vec implements and is used in much the same way. Here’s an
example of a StackVec<u8>
being used:
let mut storage = [0u8; 1024];
let mut vec = StackVec::new(&mut storage);
for i in 0..10 {
vec.push(i * i).expect("can push 1024 times");
}
for (i, v) in vec.iter().enumerate() {
assert_eq!(*v, (i * i) as u8);
}
let last_element = vec.pop().expect("has elements");
assert_eq!(last_element, 9 * 9);
We’ve declared the StackVec
structure for you already:
pub struct StackVec<'a, T: 'a> {
storage: &'a mut [T],
len: usize
}
Understanding StackVec
The following questions test your understanding about the StackVec
interface:
push-fails
Why does push
return a Result
?
The push
method from Vec
in the standard libary has no return value, but
the push
method from our StackVec
does: it returns a Result
indicating
that it can fail. Why can StackVec::push()
fail where Vec::push()
does
not?
lifetime
Why is the 'a
bound on T
required?
The compiler would reject the following StackVec
declaration:
struct StackVec<'a, T> { buffer: &'a mut [T], len: usize }
If we add an 'a
bound to T
, however, the compiler is satisifed:
struct StackVec<'a, T: 'a> { buffer: &'a mut [T], len: usize }
Why is the bound required? What could go wrong if that bound wasn’t enforced by Rust?
clone-for-pop
Why does StackVec
require T: Clone
to pop()
?
The pop
method from Vec<T>
in the standard libary is implemented for all
T
, but the pop
method from our StackVec
is only implemented when T
implements the Clone
trait. Why might that be? What goes wrong when the
bound is removed?
Implementing StackVec
Implement all of the unimplemented!()
StackVec
methods in
stack-vec/src/lib.rs
. Each method is documented in the source code. We have
also provided tests in src/tests.rs
that help ensure that your implementations
are correct. You can run these tests with cargo test
. You’ll also need to
implement the Deref
, DerefMut
, and IntoIterator
traits for StackVec
as
well as the IntoIterator
trait for &StackVec
for all of the cargo test
tests to pass. Once you feel confident that you implementation is correct and
have answered this subphase’s questions, proceed to the next subphase.
deref-in-tests
Which tests make use of the Deref
implementations?
Read through the tests we have provided in src/tests.rs
. Which tests would
fail to compile if the Deref
implementation did not exist? What about the
DerefMut
implementation? Why?
Our unit tests are incomplete!
Our unit tests provide a baseline truth, but they are not complete! We will run additional tests when we grade your assignment. You may wish to find the gaps in our tests and add additional tests of your own to fill them.
Hint: You may find your solutions to the lifetimes
challanges from phase 0 helpful.
Subphase B: volatile
In this subphase, you will learn about volatile memory accesses, read the source
code in the volatile
skeleton subdirectory, and answer questions related to
the source code. You won’t be writing any code in this subphase.
Like operating systems, compilers are masters at making things appear as if they’re doing what you think they’re doing when in reality, they’re really doing something entirely different for the sake of optimization. One such optimization is dead-access elimination: compilers remove memory accesses (reads and writes) when they can prove doing so has no observable effect on the program’s execution. For instance, consider the following program:
fn f() {
let mut x = 0;
let y = &mut x;
*y = 10;
}
The compiler can completely eliminate the write to *y
by reasoning that *y
is never read after it’s written. The compiler concludes that as a result, the
write cannot possibly effect the program, and eliminates it in the compiled
binary. For the same reason, it can then proceed to eliminate the declaration
for y
, the declaration for x
, and calls to f()
entirely.
These kinds of optimizations are almost exclusively beneficial: they speed up
our programs without affecting their outcome. But sometimes these optimizations
can have unintended consequences. Say, for example, that y
was pointing to a
write-only memory-mapped register. Then, writes to *y
will have observable
effects without having to read *y
thereafter. If the compiler is not aware
of this, it will optimize away these writes, and our program will not function
correctly.
How can we force the compiler to keep around reads and writes that appear to
have no effects at the source code level? This is where volatile
memory
accesses come in: the compiler promises not to optimize away volatile memory
accesses. So if we want to ensure a read or write occurs at runtime, we must
perform a volatile memory access.
Rusty volatile
In Rust, we use the read_volatile
and write_volatile
methods to perform
volatile reads and writes to a raw pointer.
What’s a raw pointer?
By now you’re familiar with references (&T
and &mut T
). A raw pointer in
Rust (*const T
and *mut T
) is a “reference” that isn’t tracked with
lifetimes by Rust’s borrow checker. Because of this, read or writes to these
pointers may be invalid, just as in C. Rust considers them unsafe
, and code
that reads or writes them must be annotated with unsafe
to indicate this.
You can read more about raw pointers in the rustdocs.
Calling read_volatile
and write_volatile
every time we want to perform a
volatile read or write is error prone and frustrating. Thankfully Rust provides
us the tools to make this easier and safer. Ideally we can simply declare a
pointer as volatile (as in C) and ensure that every read or write thereafter is
volatile. Even better, we should be able declare a pointer as read-only,
write-only (unlike in C), or read/write and ensure only the appropriate memory
accesses can be made.
Introducing Volatile
, ReadVolatile
, WriteVolatile
, and UniqueVolatile
The volatile
crate in the volatile/
skeleton subdirectory implements these
four types that allow us to do just this. Read the documentation for these types
now by running cargo doc --open
inside of the volatile/
directory.
unique-volatile
Why does UniqueVolatile
exist?
Both Volatile
and UniqueVolatile
allow read/write volatile accesses to an
underlying pointer. According to the documentation, what is the difference
between these two types?
Now open the source code in src/lib.rs
. Read through the source code to the
best of your abilities. When you’re ready, answer the following questions. Once
you have answered these questions, you’re ready to move on to the next subphase.
enforcing
How are read-only and write-only accesses enforced?
The ReadVolatile
and WriteVolatile
types make it impossible to write and
read, respectively, the underlying pointer. How do they accomplish this?
traits
What is the benefit of using traits instead of inherent methods?
You’ll notice that each type only implements one inherent method: new
. The
rest of a type’s methods come from implementations of the Readable
,
Writeable
, and ReadableWriteable
traits. What is the benefit to doing
this? Describe at least two benefits to this approach.
safety
Why are read
and write
safe while new
is unsafe?
What must be true of the inputs to new
for read
and write
to be safe?
Would it be safe to instead mark new
as safe and read
/write
as unsafe
?
Hint: You may find the documentation for these methods helpful.
pub-constructor
Why force construction through new
?
If the Volatile
type had been declared as follows instead:
struct Volatile<T>(pub *mut T);
A value of type Volatile
could be constructed with Volatile(ptr)
instead
of having to call new
. What benefit is there to ensuring that all values are
constructed through the new
static method?
Hint: Consider the safety implications.
macros
What do the macros do?
What do the readable!
, writeable!
, and readable_writeable!
macros do?
Subphase C: xmodem
In this subphase, you will implement the XMODEM file transfer protocol in the
xmodem
library in the xmodem/
skeleton subdirectory. You will primarily be
working in xmodem/src/lib.rs
.
XMODEM is a simple file transfer protocol originally developed in 1977. It features packet checksums, cancellation, and automatic retries. It is widely implemented and used for transfers through serial interfaces. Its best feature, however, is its simplicity. For more about its history, see the XMODEM Wikipedia article.
We will use the XMODEM protocol to transfer files to the Raspberry Pi. While we could use existing implementations of the XMODEM protocol to send data to the Pi, we will still need to write our own receiver. So, while we’re at it, we’ll be implementing XMODEM transmission as well.
The Protocol
The XMODEM protocol is described in detail in the Understanding The X-Modem File Transfer Protocol txt file. We describe it again here, for posterity.
Do not base your implementation off of Wikipedia’s explanation!
While Wikipedia’s explanation is helpful at a high level, many of the details presented there are different from the protocol we’ll be implementing here. As such, do not use the article as a reference for this subphase.
XMODEM is a binary protocol: bytes are sent and received in the raw. It is also “half duplex”: at any point in time, either the sender or receiver is sending data, but never both. Finally it is packet-based: data is separated into 128 byte chunks known as packets. The protocol dictates which bytes are sent when, what they mean, and how they’re interpreted.
First, we define a few constants:
const SOH: u8 = 0x01;
const EOT: u8 = 0x04;
const ACK: u8 = 0x06;
const NAK: u8 = 0x15;
const CAN: u8 = 0x18;
To start the file transfer, the receiver sends a NAK
byte while the sender
waits for a NAK
byte. Once the sender has received the NAK
byte, packet
transmission begins. The receiver only sends a NAK
byte to begin the file
transfer, not once for every packet.
Once file transfer has begun, each packet’s transmission and reception is
identical. Packets are numbered in sequential order starting at 1
and wrap
around to 0
after 255
.
To send a packet, the sender:
- Sends an
SOH
byte. - Sends the packet number.
- Sends the 1s complement of the packet number (
255 - $packet_number
). - Sends the packet itself.
- Sends the packet checksum.
- The checksum is the sum of all of the bytes in the packet mod 256.
- Reads a byte from the receiver.
- If the byte is
NAK
, transmission for the same packet is retried up to 10 times. - If the byte is
ACK
, the next packet is sent.
- If the byte is
The receive a packet, the receiver performs the inverse:
- Waits for an
SOH
orEOT
byte from the sender.- If a different byte is received, the receiver cancels the transfer.
- If an
EOT
byte is received, the receiver performs end of transmission.
- Reads the next byte and compares it to the current packet number.
- If the wrong packet number is received, the receiver cancels the transfer.
- Reads the next byte and compares it to the 1s complement of the packet number.
- If the wrong number is received, the receiver cancels the transfer.
- Reads a packet (128 bytes) from the sender.
- Computes the checksum for the packet.
- The checksum is the sum of all of the bytes in the packet mod 256.
- Reads the next byte and compares it to the computed checksum.
- If the checksum differs, sends a
NAK
byte and retries reception for the same packet. - If the checksum is the same, sends an
ACK
byte and receives the next packet.
- If the checksum differs, sends a
To cancel a transfer, a CAN
byte is sent by either the receiver or sender.
When either side receives a CAN
byte, it errors out, aborting the connection.
To end the transmission, the sender:
- Sends an
EOT
byte. - Waits for a
NAK
byte. If a different byte is received, the sender errors out. - Sends a second
EOT
byte. - Waits for an
ACK
byte. If a different byte is received, the sender errors out.
To end the transmission, the receiver performs the following after receiving the
first EOT
:
- Sends a
NAK
byte. - Waits for a second
EOT
byte. If a different byte is received, the receiver cancels the transfer. - Sends an
ACK
byte.
Implementing XMODEM
We have provided an unfinished implementation of the XMODEM protocol in the
xmodem
skeleton subdirectory. Your task is to complete the implementation by
writing the expect_byte
, expect_byte_or_cancel
, read_packet
, and
write_packet
methods in src/lib.rs
. Your implementations should make use of
the internal state of the Xmodem
type: packet
and started
. We recommend
reading over the existing code before starting.
You should begin by implementing the expect_byte
and expect_byte_or_cancel
methods. You should then make use of all four of the helper methods (including
read_byte
and write_byte
) to implement read_packet
and write_packet
. To
see how these methods are used, read the transmit
and receive
implementations which transmit or receive a complete data stream using XMODEM
via these methods. Be mindful of the specifications in the doc-comments. You can
test your implementation using cargo test
. Once you are confident that your
implementation is correct, proceed to the next subphase.
Do not use any additional items from std
.
Your implementation should only use items from std::io
. It should not use
other items from std
or any other libraries.
Hint: Our reference implementations for {read,write}_packet
are roughly 33 lines of code each.
Hint: Use the ?
operator generously.
Hint: The test source code can be a helpful guide.
Subphase D: ttywrite
In this subphase, you will write a command line utility, ttywrite
, that will
allow you to send data to your Raspberry Pi in the raw or via the XMODEM
protocol. You will use your xmodem
library from the previous subphase in your
implementation. You will write your code in ttywrite/src/main.rs
. To test your
ttywrite
implementation, use the provided test.sh
script. You will need to
install the socat
utility to use this script.
What is a serial device?
A serial device is any device that accepts communication one bit at a time. This is known as serial communication. In contrast, in parallel communication multiple bits are being transferred at any point in time in parallel. We will be communicating with our Raspberry Pi via its UART device, a serial communication device.
What is a TTY?
A TTY is a “teletypewriter”. It is a vestigial term that was adopted in computing to describe computer terminals. The term later become more general, coming to describe any device intended to be communicated with over serial. For this reason, your computer calls the device mapping to your Raspberry Pi a TTY.
Command-Line Interface
The skeleton code we have provided for ttywrite
already parses and validates
command-line arguments. To do so, it uses the structopt crate from crates.io
which itself uses clap. You’ll notice that we list it as a dependency in the
Cargo.toml
file. structopt works through code generation. We simply annotate
a structure and its fields with a declaration of our command-line arguments and
structopt generates the code to actually parse the command-line flags.
To see the interface that structopt generates, call the application with
--help
. Remember that you can pass arbitrary flags when using cargo run
:
cargo run -- --help
. Take a look at the interface now. Then, take a look at
the Opt
structure in main.rs
and compare the interface with its definition.
invalid
What happens when a flag’s input is invalid?
Try passing in some invalid values for flags. For instance, it should not be
possible to set -f
to idk
. How does structopt
know to reject invalid
values?
You’ll notice that there are plenty of options. All of these correspond to settings available on a serial device. For now it’s not important to know exactly what these settings do.
Talking to a Serial Device
In main
, you’ll see a call to serial::open. This is calling the open
function from the serial crate, also on crates.io. This open
function
returns a TTYPort which allows you to read and write to the serial device (via
its io::Read
and io::Write
trait implementations) as well as read and set
settings on a serial device (via its SerialDevice
trait implementation).
Writing the Code
Implement the ttywrite
utility. Your implementation should set all of the
appropriate settings passed in via the command-line stored in the opt
variable
in main
. It should read from stdin
if no input file is passed in or from the
input file if one is passed in. It should write the input data to the passed in
serial device. If the -r
flag is set, it should send the data as it is.
Otherwise, you should use your xmodem
implementation from the previous
subphase to send the data using the XMODEM protocol. You should print the number
of bytes sent on a successful transmission.
To transmit using the XMODEM protocol, your code should use either the
Xmodem::transmit
or Xmodem::transmit_with_progress
methods from the xmodem
library. We recommend using transmit_with_progress
so that your utility
indicates progress throughput the transmission. In its simplest form, this might
look as follows:
fn progress_fn(progress: Progress) {
println!("Progress: {:?}", progress);
}
Xmodem::transmit_with_progress(data, to, progress_fn)
You can test the baseline correctness of your implementation using the test.sh
script in the ttywrite
directory. When your implementation is at least
somewhat correct, you will see the following when the script is run:
Opening PTYs...
Running test 1/10.
wrote 333 bytes to input
...
Running test 10/10.
wrote 232 bytes to input
SUCCESS
Hint: You can retrieve a handle to stdin
with io::stdin().
Hint: You may find the io::copy() function useful.
Hint: The main()
function in our reference implementation is roughly 35 lines of code.
Hint: Keep the TTYPort documentation open while writing your code.
bad-tests
Why does the test.sh
script always set -r
?
The test.sh
script that we have provided always uses the -r
flag; it
doesn’t test that your utility uses the XMODEM protocol when it is asked to.
Why might that be? What does the XMODEM protocol expect that sending data in
the raw doesn’t that makes testing its functionality difficult?
Phase 3: Not a Seashell
In this phase, you will be implementing drivers for the built-in timer, GPIO, and UART devices. You’ll use then these drivers to implement a simple shell. In the next phase, you’ll use the same drivers to implement a bootloader.
What’s a driver?
The term driver, or device driver, describes software that directly interacts with and controls a hardware device. Drivers expose a higher-level interface to the hardware they control. Operating systems may interact with device drivers to expose an even higher-level interface. For instance, the Linux kernel exposes ALSA (Advanced Linux Sound Architecture), an audio API, which interacts with device drivers that in-turn interact directly with sound cards.
Subphase A: Getting Started
In the remainder of the assignment, you will be working inside a repository,
os
, that is assignment-independent. This is the beginning of the operating
system that you will complete by the end of the course. Responses to questions
in the remainder of the assignment should be recorded in the questions
directory of the assignment 1 skeleton.
We recommend the following directory structure for your assignments, where
0-blinky
and 1-shell
are the repositories for assignment 0 and 1,
respectively, and os
is the repository you will soon clone:
cs140e
├── 0-blinky
│ ├── Makefile
│ ├── phase3
│ └── phase4
├── 1-shell
│ ├── ferris-wheel
│ ├── getting-started
│ ├── stack-vec
│ ├── ttywrite
│ ├── volatile
│ └── xmodem
└── os
├── Makefile
├── bootloader*
├── kernel
├── pi
├── std
└── volatile
Recreate this directory structure now, which may involve renaming your
assignment 0/1 repositories. You’ll need to clone the os
repository:
git clone https://cs140e.sergio.bz/os.git os
Confirm that your directories are properly laid out by running make
inside the
os/kernel
directory now. If all is well, the command will return successfully.
Project Structure
The os
directory contains the following subdirectories:
pi
- library containing drivers and other low-level code for the OSvolatile
- a second version of thevolatile
library from phase 2 suitable for writing device driversstd
- a minimal, dependency-free version of Rust’sstd
librarybootloader*
- the bootloader you will write in phase 4 (note: not yet available)kernel
- the main operating system kernel
All of your driver code will reside in the pi
library. The pi
library makes
use of the volatile
library. It also optionally depends on the std
library.
bootloader
and kernel
make use of the pi
library to communicate with
hardware. They also depend on std
. The volatile
library has no dependencies.
The diagram below illustrates these relationships:
Firmware
We’ll need to update the firmware on our Raspberry Pi before continuing.
Download the new firmware files by running make fetch
inside of the os
repository. The command will download and extract files to the files/
subdirectory. Copy firmware/bootcode.bin
, firmware/config.txt
, and
firmware/start.elf
to the root of your MicroSD card. You can also copy the
act-led-blink.bin
file, renamed to kernel8.img
, to the root of the MicroSD
card to confirm that the new firmware is loaded properly. When started, you
should see a green LED on the Raspberry Pi blink on and off repeatedly.
Updated volatile
The volatile
library in the os
directory is subtly different than the
volatile
library in the assignment 1 skeleton repository. These changes make
it easier and more convenient to use the volatile
library for writing device
drivers. On the whole, the library is effectively the same, and your intuitions
and answers to questions from phase 2 all carry over. The main differences are:
UniqueVolatile
was replaced withUnique<Volatile>
.- A
Reserved
type with no capabilities was added.
The third major difference is that wrapper types wrap a T
, not a *mut T
.
This allows you to treat a raw address as volatile by casting it instead of
wrapping it: 0x1000 as *mut Volatile<T>
. This implies that we can cast an
address into a structure containing Volatile
wrappers and have writes to its
fields transparently be volatile:
#[repr(C)]
struct Registers {
REG_A: Volatile<u32>,
REG_B: Volatile<u8>
}
// This says that there's a `Registers` structure at address `0x4000`. In this
// case, this means that we expect a `u32` followed by a `u8` at that address.
let x: *mut Registers = 0x4000 as *mut Registers;
// We need to use `unsafe` because we're dereferencing a raw pointer.
unsafe {
// Rust does not automatically dereference raw pointers.
(*x).REG_A.write(434);
let val: u8 = (*x).REG_B.read();
}
What’s with #[repr(C)]
?
The #[repr(C)]
annotation forces Rust to lay out the structure’s fields in
the same way that C would. In general, Rust optimizes the order and padding
between fields of structures in an unspecified way. When we cast a raw address
to a pointer to a structure, we typically have a very specific memory layout
in mind. The #[repr(C)]
annotation lets us confide that Rust will arrange
the structure as we intend it to, not as it wishes.
Kernel
The os/kernel
directory contains the code for the operating system kernel: the
core of your operating system. Calling make
inside this directory builds the
kernel. The build output is stored in the build/
directory. To run the kernel,
copy the build/kernel.bin
file to the root of the MicroSD card as
kernel8.img
. At present, the kernel does absolutely nothing. By the end of
this phase, the kernel will start up a shell which you can communicate with.
The kernel
crate depends on the pi
library: you can see the extern crate
pi
statement declaring this dependency in kernel/src/kmain.rs
. As a result,
you can use all of the types and items from the pi
library in the kernel.
Documentation
While writing your device drivers, you’ll want to keep the BCM2837 ARM Peripherals Manual open.
Subphase B: System Timer
In this subphase, you will write a device driver for the ARM system timer. You
will primarily be working in os/pi/src/timer.rs
and os/kernel/src/kmain.rs
.
The ARM system timer is documented on page 172 (section 12) of the BCM2837 ARM
Peripherals Manual.
Start by looking at the existing code in os/pi/src/timer.rs
. In particular,
note the relationship between the following sections:
const TIMER_REG_BASE: usize = IO_BASE + 0x3000;
#[repr(C)]
struct Registers {
CS: Volatile<u32>,
CLO: ReadVolatile<u32>,
CHI: ReadVolatile<u32>,
COMPARE: [Volatile<u32>; 4]
}
pub struct Timer {
registers: &'static mut Registers
}
impl Timer {
pub fn new() -> Timer {
Timer {
registers: unsafe { &mut *(TIMER_REG_BASE as *mut Registers) },
}
}
}
The one line of unsafe
in this program is very important: it casts the
TIMER_REG_BASE
address to a *mut Registers
and then casts that to an
&'static mut Registers
. We are telling Rust that we have a static reference to
a Registers
structure at address TIMER_REG_BASE
.
What is at the TIMER_REG_BASE
address? On page 172 of the BCM2837 ARM
Peripherals Manual, you’ll find that 0x3000
is the peripheral offset for the
ARM system timer. Thus, TIMER_REG_BASE
is the address at which the ARM system
timer registers start! After this one line of unsafe
, we can use the
registers
field to access the timer’s registers safely. We can read the CLO
register with self.registers.CLO.read()
and write the CS
register with
self.registers.CS.write()
.
restricted-reads
Why can’t you write to CLO or CHI?
The BCM2837 documentation states that the CLO
and CHI
registers are
read-only. Our code enforces this property. How? What prevents us from writing
to CLO
or CHI
?
What exactly is unsafe?
In short, unsafe
is a marker for the Rust compiler that you’re taking
control of memory safety: the compiler won’t protect you from memory issues.
As a result, in unsafe
sections, Rust lets you do anything you can do in C.
In particular, you can cast between types with more freedom, dereference raw
pointers, and fabricate lifetimes.
But note that unsafe
is very unsafe. You must ensure that everything you
do in an unsafe
section is, in fact safe. This is more difficult than it
sounds, especially when Rust’s idea of safe is much stricter than in other
languages. As such, you should try not to use unsafe
at all. For operating
systems, unfortunately, we must use unsafe
so that we can directly speak
to hardware, but we’ll typically limit our use to one line per driver.
If you want to learn more about unsafe
, read Chapter 1 of the Nomicon.
Implement the Driver
Implement the Timer::read()
method and current_time()
, spin_sleep_us()
,
and spin_sleep_ms()
in os/pi/src/timer.rs
. The signatures on these items
indicate their expected functionality. You’ll need to read the timer’s
documentation in the BCM manual to implement Timer::read()
. In particular, you
should understand which registers to read to obtain the timer’s current u64
value. You can build the pi
library with cargo build
. You can also use
cargo check
to type-check the library without actually compiling it.
Testing Your Driver
Let’s test your driver by ensuring that spin_sleep_ms()
is accurate. We’ll
write the code to do this in kernel/src/kmain.rs
.
Copy your LED blinky code from phase 4 of assignment 0 into kmain.rs
. Instead
of the for
loop based sleep function, use your newly written spin_sleep_ms()
function to pause between blinks. Compile the kernel, load it onto the MicroSD
card as kernel8.img
, and then run it on the Raspberry Pi. Ensure that the LED
blinks at the frequency that you intended it to. Try other pause times and
ensure that they all work as expected. Until you write the bootloader in phase
4, you’ll need to keep swapping the MicroSD card between the Pi and your
computer to try out different binaries.
If your timer driver is working as expected, proceed to the next subphase.
Subphase C: GPIO
In this subphase, you will write a generic, pin-independent device driver for
GPIO. You will primarily be working in os/pi/src/gpio.rs
and
os/kernel/src/kmain.rs
. The GPIO subsystem is documented on page 89 (section
6) of the BCM2837 ARM Peripherals Manual.
State Machines
All hardware devices are state machines: they begin at a predetermined state and transition to different states based on explicit or implicit inputs. The device exposes different functionality depending on which state it is in. In other words, only some transitions are valid in some states. Importantly, this implies that some transitions are invalid when the device is in a given state.
Most programming languages make it impossible to faithfully encode the semantics of a state machine in hardware, but not Rust! Rust lets us perfectly encode state machine semantics, and we’ll take advantage of this to implement a safer-than-safe device driver for the GPIO subsystem. Our driver will ensure that a GPIO pin is never misused, and it will do so at compile-time.
This seems a little…researchy.
You got me. This is, in fact, my active area of research! - Sergio
Below is the state diagram for a subset of the GPIO state machine for a single pin:
Our goal is to encode this state machine in Rust. Let’s start by interpreting the diagram:
- The GPIO starts in the
START
state. - From the
START
state it can transition to one of three states:ALT
- no transitions are possible from this stateOUTPUT
- two “self” transitions are possible:SET
andCLEAR
INPUT
- one “self” transition is possible:LEVEL
blinky-states
Which transitions did you follow in your assignment 0 blinky
?
When you implementing the blinky code in phase 4 of assignment 0, you implicitly implemented a subset of this state machine. Which transitions did your code implement?
We’ll use Rust’s type system to ensure that a pin can only be SET
and
CLEAR
ed if it has been transitioned to the OUTPUT
state and the LEVEL
read
if it is in the INPUT
state. Take a look at the declaration for the GPIO
structure in pi/src/gpio.rs
:
pub struct Gpio<State> {
pin: u8,
registers: &'static mut Registers,
_state: PhantomData<State>
}
The structure has one generic argument, State
. Except for PhantomData
,
nothing actually uses this argument. This is what PhantomData is there for:
to convince Rust that the structure somehow uses the generic even though it
otherwise wouldn’t. We’re going to use the State
generic to encode which state
the Gpio
device is in. Unlike other generics, we must control this parameter
and ensure that a client can never fabricate it.
The state!
macro generates types that represent the states a Gpio
can be in:
states! {
Uninitialized, Input, Output, Alt
}
// Each parameter expands to an `enum` that looks like:
enum Input { }
This is also weird; why would we create an enum
with no variants? enum
’s
with no variants have a nice property: they can never be instantiated. In this
way, these types act purely as markers. No one can ever pass us a value of type
Input
because such a value can never be constructed. They exist purely at the
type-level.
We can then implement methods corresponding to valid transitions given that a
Gpio
is in a certain state:
impl Gpio<Output> {
/// Sets (turns on) the pin.
pub fn set(&mut self) { ... }
/// Clears (turns off) the pin.
pub fn clear(&mut self) { ... }
}
impl Gpio<Input> {
/// Reads the pin's value.
pub fn level(&mut self) -> bool { ... }
}
This ensures that a Gpio
can only be set
and clear
ed when it is a
Gpio<Output>
and its level
read when it is a Gpio<Input>
. Perfect! But how
do we actually transition between states? Hello, Gpio::transition()
!
impl<T> Gpio<T> {
fn transition<S>(self) -> Gpio<S> {
Gpio {
pin: self.pin,
registers: self.registers,
_state: PhantomData
}
}
}
This method lets us transition a Gpio
from any state to any other state. Given
a Gpio
in state T
, this method returns a Gpio
in state S
. Note that it
works for all S
and T
. We must be very careful when calling this method.
When called, we are encoding the specification of a transition in the state
diagram. If we get the specification or encoding wrong, our driver is wrong.
To use the transition()
method, we need to tell Rust which type we want as an
output S
in Gpio<S>
. We do this by giving Rust enough information so that it
can infer the S
type. For instance, consider the implementation of the
into_output
method:
pub fn into_output(self) -> Gpio<Output> {
self.into_alt(Function::Output).transition()
}
This method requires its return type to be Gpio<Output>
. When the Rust type
system inspects the call to transition()
, it will search for a
Gpio::transition()
method that returns a Gpio<Output>
to satisfy the
requirement. Since our transition
method returns Gpio<S>
for any S
, Rust
will replace S
with Output
and use that method. The result is that we’ve
transformed our Gpio<Alt>
(from the into_alt()
call) into a Gpio<Output>
.
fake-states
What would go wrong if a client fabricates states?
Consider what would happen if we let the user choose the initial state for a
Gpio
structure. What could go wrong?
Why is this only possible with Rust?
Notice that the into_
transition methods take a Gpio
by move. This means
that once a Gpio
is transitioned into a another state, it can never be
accessed in the previous state. Rust’s move semantics make this possible. As
long as a type doesn’t implement Clone
, Copy
, or some other means of
duplication, there is no coming back from a transition. No other language, not
even C++, affords us this guarantee at compile-time.
Implement the Driver
Implement the unimplemented!()
methods in pi/src/gpio.rs
. The signatures on
these items indicate their expected functionality. You’ll need to read the GPIO
documentation (page 89, section 6 of the BCM2837 ARM Peripherals Manual) to
implement your driver. Remember that you can use cargo check
to type-check the
library without actually compiling it.
Hint: Remember that you can create arbitrary lexical scopes with { .. }
.
Testing Your Driver
We’ll again write code in kernel/src/kmain.rs
to ensure that our driver works
as expected.
Instead of reading/writing to raw memory addresses, use your new GPIO driver to
set and clear GPIO pin 16. Your code should get a lot cleaner. Compile the
kernel, load it onto the MicroSD card as kernel8.img
, run it on the Raspberry
Pi, and ensure your LED blinks as before.
Now, connect more LEDs to your Raspberry Pi. Use GPIO pins 5, 6, 13, 19, and 26. Refer to the pin numbering diagram from assignment 0 to determine their physical location. Have your kernel blink all of the LEDs in a pattern of your choice.
led-pattern
Which pattern did you choose?
What pattern did you have your LEDs blink in? If you haven’t yet decided, one fun idea is to have them imitate a “loading spinner” by arranging the LEDs in a circle and turning them on/off in a sequential, circular pattern.
Once your GPIO driver is working as expected, proceed to the next subphase.
Subphase D: UART
In this subphase, you will write a device driver for the mini UART device on the
Raspberry Pi. You will primarily be working in os/pi/src/uart.rs
and
os/kernel/src/kmain.rs
. The mini UART is documented on page 8 and page 10
(sections 2.1 and 2.2) of the BCM2837 ARM Peripherals Manual.
UART: Universal Asynchronous RX/TX
A UART, or universal asynchronous receiver-transmitter, is a device and serial protocol for communicating over two wires. These are the two wires (rx/tx) that you used in phase 1 of assignment 0 to connect the UART device on the CP2102 USB module to the UART device on the Pi. You can send any kind of data over UART: text, binaries, images, anything! As an example, in the next subphase, you’ll implement a shell by reading from the UART device on the Pi and writing to the UART device on the CP2102 USB module. In phase 4, you’ll read from the UART on the Pi to download a binary being sent via the UART on the CP2102 USB module.
The UART protocol has several configuration parameters, and both the receiver and transmitter need to be configured identically to communicate. These parameters are:
- Data Size: length of a single data frame (8 or 9 bits)
- Parity Bit: whether to send a parity (checksum) bit after the data
- Stop Bits: how many bits to use to signal the end of the data (1 or 2)
- Baud Rate: transmission rate in bits/second
The mini UART on the Pi does not support parity bits and only supports 1 stop bit. As such, only the baud rate and data frame length need to be configured. To learn more about UART, see the Basics of UART Communication article.
Implement the Driver
At this point, you have all of the tools to write a device driver without additional background information (congratulations! 🎉).
Implement the mini UART device driver in pi/src/uart.rs
. You’ll need to
complete the definition of the Registers
structure. Ensure that you use the
Volatile
type with the minimal set of capabilities for each register:
read-only registers should use ReadVolatile
, write-only registers should use
WriteVolatile
, and reserved space should use Reserved
. Then, initialize the
device in new()
by setting the baud rate to 115200
(a divider of 270
) and
data length to 8
bits. Finally, implement the remaining unimplemented!()
methods and the fmt::Write
, io::Read
and io::Write
traits for MiniUart
.
Hint: You’ll need to write to the LCR
, BAUD
, and CNTL
registers in new
.
Hint: Use your GPIO driver from the previous subphase.
Testing Your Driver
Test your driver by writing a simple “echo” program in kernel/src/kmain.rs
:
sit in a hot loop writing out every byte you read in. In pseudocode, this looks
like:
loop {
write_byte(read_byte())
}
Use screen /dev/<your-path> 115200
to communicate over UART. screen
sends
every keypress over the TTY, so if your echo program works correctly, you’ll see
every character you type. It might help to send an extra character or two each
time you receive a byte to convince yourself things are working as you expect:
loop {
write_byte(read_byte())
write_str("<-")
}
Once your driver works as expected, proceed to the next subphase.
Subphase E: The Shell
In this subphase, you’ll use your new UART driver to implement a simple shell
that will be the interface to your operating system. You will be working in
os/kernel/src/console.rs
, os/kernel/src/shell.rs
, and
os/kernel/src/kmain.rs
.
The Console
To write our shell, we’ll need some notion of a global default input and output.
Unix and friends typically refer to this is as stdin
and stdout
; we’ll be
calling it Console
. Console
will allow us to implement the kprint!
and
kprintln!
macros, our kernel-space versions of the familiar print!
and
println!
, and give us a default source for reading user input. We’ll use
Console
and these macros to implement our shell.
Take a peek at os/kernel/src/console.rs
. The file contains an unfinished
implementation of the Console
struct. Console
is a singleton wrapper
around a MiniUart
: only one instance of Console
will ever exist in our
kernel. That instance will be globally available, for use anywhere and by
anything. This will allow us to read and write to the mini UART without
explicitly passing around an instance of MiniUart
or Console
.
Global Mutability
The notion of a globally mutable structure is a scary thought, especially in the
face of Rust. After all, Rust doesn’t allow more than one mutable reference to a
value, so how can we possibly convince it to allow as many as we want? The
trick, of course, relies on unsafe
. The idea is as follows: we’ll tell Rust
that we’re only going to read a value by using an immutable reference, but what
we actually do is use unsafe
to “cast” that immutable reference to a mutable
reference. Because we can create as many immutable references as we want, Rust
will be none the wiser, and we’ll have all of the mutable references we desire!
Such a function might look like this:
// This function must never exist.
fn make_mut<T>(value: &T) -> &mut T {
unsafe { /* magic */ }
}
Your alarm bells should be ringing: what we’ve proposed so far is wildly unsafe.
Recall that we still need to ensure that everything we do in unsafe
upholds
Rust’s rules. What we’ve proposed thus far clearly does not. As it stands, we’re
violating the “at most one mutable reference at a time” rule. The rule states
that at any point in the program, a value should have at most one mutable
reference to it.
The key insight to maintaining this rule while meeting our requirements is as
follows: instead of the compiler checking the rule for us with its borrow and
ownership checker, we will ensure that the rule is upheld dynamically, at
run-time. As a result, we’ll be able to share references to a structure as many
times as we want (via an &
reference) while also being able to safely retrieve
a mutable reference when we need it (via our &T -> &mut T
dynamic borrow
checking function).
There are many concrete implementations of this idea. One such implementation ensures that only one mutable reference is returned at a time using a lock:
fn lock<T>(value: &T) -> Locked<&mut T> {
unsafe { lock(value); cast value to Locked<&mut T> }
}
impl Drop for Locked<&mut T> {
fn drop(&mut self) { unlock(self.value) }
}
This is known as Mutex in the standard library. Another way is to abort the program if more than one mutable reference is about to be created:
fn get_mut<T>(value: &T) -> Mut<&mut T> {
unsafe {
if ref_count(value) != 0 { panic!() }
ref_count(value) += 1;
cast value to Mut<&mut T>
}
}
impl Drop for Mut<&mut T> {
fn drop(&mut self) { ref_count(value) -= 1; }
}
This is RefCell::borrow_mut(). And yet another is to only return a mutable reference if it is known to be exclusive:
fn get_mut<T>(value: &T) -> Option<Mut<&mut T>> {
unsafe {
if ref_count(value) != 0 { None }
else {
ref_count(value) += 1;
Some(cast value to Mut<&mut T>)
}
}
}
impl Drop for Mut<&mut T> {
fn drop(&mut self) { ref_count(value) -= 1; }
}
This is RefCell::try_borrow_mut(). All of these examples implement some form
of “interior mutability”: they allow a value to be mutated through an immutable
reference. For our Console
, we’ll be using Mutex
to accomplish the same
goal. Since the std::Mutex
implementation requires operating system support,
we’ve implemented our own Mutex
in kernel/src/mutex.rs
. Our implementation
is correct for now, but we’ll need to fix it when we introduce caching or
concurrency to continue to uphold Rust’s rules. You don’t need to understand the
Mutex
implementation for now, but you should understand how to use one.
The global singleton is declared as CONSOLE
in kernel/src/console.rs
. The
global variable is used by the kprint!
and kprintln!
macros defined below
below. Once you’ve implemented Console
, you’ll be able to use kprint!
and
kprintln!
to print to the console. You’ll also be able to use CONSOLE
to
globally access the console.
Rust also requires static
globals to be Sync
.
In order to store a value of type T
in a static
global, T
must
implement Sync
. This is because Rust also guarantees data race safety at
compile-time. Because global values can be accessed from any thread, Rust
must ensure that those accesses are thread-safe. The Send
and Sync
traits, along with Rust’s ownership system, ensure data race freedom.
drop-container
Why should we never return an &mut T
directly?
You’ll notice that every example we’ve provided wraps the mutable reference in
a container and then implements Drop
for that container. What would go wrong
if we returned an &mut T
directly instead?
write-fmt
Where does the write_fmt
call go?
The _print
helper function calls write_fmt
on an instance of
MutexGuard<Console>
, the return value from Mutex<Console>::lock()
. Which
type will have its write_fmt
method called, and where does the method
implementation come from?
Implement and Test Console
implement all of the unimplemented!()
methods in kernel/src/console.rs
. once
you’ve implemented everything, use the kprint!
and kprintln!
macros in
kernel/src/kmain.rs
to write to the console when you receive a character. you
can use these macros exactly like print!
and println!
. use screen
/dev/<your-path> 115200
to communicate with your pi and ensure that your kernel
works as expected.
If this were C…
The fact that we get a println!
implementation for free with zero effort is
just another advantage to using Rust. If this were C, we’d need to implement
printf
ourselves. In Rust, the compiler provides a generic, abstracted, and
safe OS-independent implementation. Whew!
Hint: Your Console
implementations should be very short: about a line each.
Implement the Shell
You’re now ready to implement the shell in kernel/src/shell.rs
. We’ve provided
a Command
structure for your use. The Command::parse()
method provides a
simple command-line argument parser, returning a Command
struct. The parse
method splits the passed in string on spaces and stores all of the non-empty
arguments in the args
field as a StackVec
using the passed in buf
as
storage. You must implement Command::path()
yourself.
Use all of your available libraries (Command
, StackVec
, Console
via
CONSOLE
, kprint!
, kprintln!
, and anything else!) to implement a shell in
the shell
function. Your shell should print the prefix
string on each line
it waits for input. In the GIF above, for instance, "> "
is being used as the
prefix. Your shell should then read a line of input from the user, parse the
line into a command, and attempt to execute it. It should do this ad-infinitum.
Since our operating system is only just beginning, we can’t run any interesting
commands just yet. We can, however, build known commands like echo
into the
shell.
To complete your implementation, your shell should…
- implement the
echo
built-in:echo $a $b $c
should print$a $b $c
- accept both
\r
and\n
as “enter”, marking the end of a line - accept both backspace and delete (ASCII
8
and127
) to erase a single character - ring the bell (ASCII
7
) if an unrecognized non-visible character is sent to it - print
unknown command: $command
for an unknown command$command
- disallow backspacing through the prefix
- disallow typing more characters than allowed
- accept commands at most 512 bytes in length
- accept at most 64 arguments per command
- start a new line, without error, with the
prefix
if the user enters an empty command - print
error: too many arguments
if the user passes in too many arguments
Test your implementation by calling your new shell()
function in
kernel/src/kmain.rs
. Minus the “SOS” banner, you should be able to replicate
the GIF above. You should also be able to test all of the requirements we’ve
set. Once your shell works as expected, revel in your accomplishments. Then,
proceed to the next phase.
Hint: A byte literal, b'a'
is the u8
ASCII value for a character 'a'
.
Hint: Use \u{b}
in a string literal to print any character with ASCII byte value b
.
Hint: You must print both \r
and \n
to begin a new line at the line start.
Hint: To erase a character, backspace, print a space, then backspace again.
Hint: Use StackVec
to buffer the user’s input.
Hint: You’ll find the std::str::from_utf8() function useful.
You’re not using the real std
!
Recall that we’re using a custom implementation of std
in our kernel. The
implementation is, necessarily, incomplete, lacking lots of useful
functionality. Run xargo doc --open
in the os/std
directory to see what
it does contain.
shell-lookback
How does your shell tie the many pieces together?
Your shell makes use of much of the code you’ve written. Briefly explain: which pieces does it makes use of and in what way?
Phase 4: Boot ‘em Up
In this phase, you’ll use everything you’ve written thus far to implement a
bootloader for your Raspberry Pi. You’ll be working primarily in
os/bootloader/src/kmain.rs
.
You’ve likely become frustrated with the monotonous motions of swapping MicroSD cards to load a new binary onto your Pi. The bootloader you will write in this phase eliminates that process entirely. You’ll replace the binary on the MicroSD one more time, this time with the bootloader. From then on, you can load new binaries remotely from your computer without ever touching the MicroSD card again.
The bootloader itself is a “kernel” of sorts that accepts XMODEM file transfers
over UART. It writes the data received into memory at a known address and then
executes it. We’ll use our ttywrite
utility to send it binaries. As a result,
the process to load a new binary onto the Pi will be as simple as:
- Resetting the Pi to start the bootloader.
- Running
ttywrite -i my-new-binary.bin /dev/<your-device>
.
Loading Binaries
By default, the Raspberry Pi 3 loads files named kernel8.img
at address
0x80000
. Said another way, the Pi will sequentially copy the contents of
kernel8.img
to 0x80000
and, after some initialization, set the ARM’s program
counter to 0x80000
. As a result, we must ensure that our binary expects to be
loaded at this address. This means that all of the addresses in the binary
should begin at 0x80000
.
Because the linker is what decides the addresses for all symbols in our
binary, we must somehow inform the linker of this desire. To do this, we use a
linker script: a file read by the linker that describes how we want it to
assign addresses to symbols in our binary. Our kernel’s linker script can be
found in os/kernel/ext/layout.ld
. You’ll notice the address 0x80000
on the
second line. Indeed, this line instructs the linker to begin allocating
addresses at 0x80000
.
To maintain compatibility with these defaults, our bootloader will also load
binaries at address 0x80000
. But this raises an issue: if our bootloader’s
binary is at address 0x80000
, loading a different binary at the same address
will result in overwriting our bootloader as we’re executing it! To avoid this
conflict, we must use different start addresses for the bootloader and the
binaries it loads. We’d like to maintain compatibility with the Pi’s defaults,
so we’ll need to change the start address of the bootloader. How?
Making Space
The first step is to choose a new address. As you can see in
os/bootloader/ext/layout.ld
, we’ve chosen 0x4000000
as the start address
for our bootloader. While this fixes the addresses in the binary, the Pi will
continue to load it at 0x80000
. Thankfully, we can ask the Pi to load our
binary at a different address via a kernel_address
parameter in the firmware’s
config.txt
. We’ve included a fixed version of the file in
bootloader/ext/config.txt
that does just this. Ensure you use this
config.txt
when running your bootloader by copying it to the root of the
MicroSD card.
As a result of this change, the memory between 0x80000
and 0x4000000
will be
entirely unused by the bootloader, and we can load binaries up to 0x4000000 -
0x80000
bytes in size without conflict.
small-kernels
Is 63.5MiB really enough?
You might be thinking that the free space we’ve set apart isn’t enough. This is a fair concern. One way to answer the question is to look at the file size of kernels from successful operating systems. Would they fit?
Determine how large the kernel binary is for the operating system you’re
running now. On newer versions of macOS, the binary is
/System/Library/Kernels/kernel
. On older versions of macOS, the binary is
/mach_kernel
. On Linux, the binary is usually located in /boot/
and is
named either vmlinuz
, vmlinux
, or bzImage
. How big is your kernel’s
binary? Would it fit in the 63.5MiB free space we’ve created?
Implement the Bootloader
Implement the bootloader in bootloader/src/kmain.rs
. We’ve declared the
bootloader’s start address, the loaded binary’s start address, and the maximum
binary size in const
declarations at the top of the file. We’ve also provided
a jump_to
function that unconditionally branches to the address addr
. This
has the effect of setting the program counter to that address. Your bootloader
should use these declarations along with your existing code from the pi
and
xmodem
libraries to receive a transmission over UART and write it to the
memory address the binary expects to be loaded at. When the transmission is
complete, your bootloader should execute the new binary.
Be aware that your bootloader should continuously attempt to initiate an XMODEM
reception by setting a low timeout value (say, 750ms) and attempting a new
reception if a timeout occurs. If a reception fails for any other reason, print
an error message and try again. Once you’ve implemented the bootloader, test it
by sending your kernel binary from os/kernel/build/kernel.bin
to your Pi using
your ttywrite
utility. If all is well, you should see your shell when you
screen
into your Pi.
bootloader-timeout
Why is the timeout necessary?
Without the bootloader timing out and retrying a reception, it is possible for the transmitter to stall indefinitely under some conditions. What are those conditions, and why would the transmitter stall indefinitely?
Remember to use the version of config.txt
compatible with bootloader binaries!
Hint: Our reference kmain()
function is 15 lines of code.
Hint: You’ll find the std::slice::from_raw_parts_mut() function useful.
Hint: The &mut [u8]
type implements io::Write
.
Submission
Once you’ve completed the tasks above, you’re done and ready to submit! Congratulations!
From inside of the 1-shell
assignment 1 skeleton directory, you can call make
check
to check if you’ve answered every question and make test
to run the
unit tests for projects in 1-shell
. Note that there are no unit tests for
projects in os
. You’re responsible for ensuring that they work as expected.
When you’re ready, commit your changes. Any uncommitted changes will not be
submitted with your assignment. Then, run make submission
from the 1-shell
directory and proceed to the submission page to upload your submission.