Paul Bodily

Labs

Lab 4: The Bomb Lab

  • Lab Specification
  • Request a bomb here
    • If the site doesn't load, try a few page refreshes—it should eventually work
    • If you have trouble downloading a bomb, try removing spaces from the username (use underscores instead)
  • Check the scoreboard here

Helpful hints

Template for annotating assembly

Feel free to download and use a copy of my code annotation Excel spreadsheet. It shows how I suggest walking through a piece of assembly to annotate and reverse-engineer it. Here is a screenshot:

Code annotation spreadsheet screenshot

Everything in black is how the spreadsheet should look before you start. I have a column for addresses (pasted in). I usually skinny this column down to just see the offsets. Then I add a column where I can annotate labels. Then a column for instructions (pasted in). Then a column for annotations and a column for the reverse-engineered C code. I also have a space for taking notes about the purposes/values of the registers and the stack. I find that for me, just annotating what they contain or represent is sufficient, but you might want to insert and update values in these locations as you walk through an execution of the code by hand.

Everything in blue is an actual annotation. Note that what you see is the final form of my annotations for this code. I often start out with bland labels (e.g., .L1, .L2) or bland variable names (e.g., v, x, y) and then come back to change them to something more descriptive when I start to see what the code is doing.

Everything in red is an extra note I've added just to explain my process.

Assorted Q&A

Q: Is there a good way to determine if the registers are holding arrays or structs vs just data?

A: To the machine, it's all just 0s and 1s. Machine instructions themselves know nothing about data types (at most, an instruction encodes the size of its operands). Data types are an abstraction created by the C language (or any other language), and the C compiler's job is to translate that abstraction away. Since the machine doesn't know data types, they're not directly reflected in the machine code.

Note that a register can't hold an array or a struct, since they're too big; it can only point to one. The way you determine whether a register points to an array or a struct is by observing how the instructions manipulate and reference that register and drawing inferences from what you see.
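
As a rough illustration of the kind of inference described above, here is a minimal C sketch of the two access patterns. The struct and function names are hypothetical and not taken from any bomb, and the addressing forms in the comments are what a compiler for x86-64 might typically emit, not a guarantee.

```c
#include <stddef.h>

/* Hypothetical struct, for illustration only. */
struct node {
    int value;          /* at offset 0 */
    struct node *next;  /* at offset 8 on a typical x86-64 build */
};

/* Array pattern: each access is base + index * element size.
 * In the assembly this often appears as a scaled access such as
 * (%rdi,%rsi,4), or as a pointer that advances by 4 each iteration. */
int sum_array(const int *a, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Struct pattern: accesses use small, fixed offsets from one base
 * register, such as (%rdi) for value and 8(%rdi) for next, and the
 * base register is then reloaded from one of those fields. */
int sum_list(const struct node *p) {
    int total = 0;
    while (p != NULL) {
        total += p->value;
        p = p->next;
    }
    return total;
}
```

If you see a register used with a changing index or a steady stride, suspect an array. If you see it used with a handful of different constant offsets, suspect a struct.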

Q: I'm not sure what information can be ignored, such as the checks the system does. Do we look only in the phases, or everywhere?

A: What we observed today in class will be true for the rest of the phases, namely that the system checks and other ignorable instructions will show up at the beginning and end of the functions.

Still stuck?

Follow the process: 1) make sure you understand the literal interpretation of the instruction, 2) take a step beyond the literal and interpret what that instruction means in the broader context (e.g., setting an argument, assigning a variable value, etc.), 3) after annotating a sufficient number of instructions to see what's happening, reverse-engineer the C code.
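
Here is a small worked instance of those three steps, written as a C comment followed by the reverse-engineered function. The instruction is a hypothetical example I made up for illustration (it is not from any bomb), and the function name is my own.

```c
/*
 * Instruction (hypothetical):  lea (%rdi,%rdi,2),%eax
 *
 * Step 1, literal:    eax = rdi + rdi * 2
 * Step 2, in context: rdi holds the first argument and eax holds the
 *                     return value, so this computes the result as
 *                     three times the first argument.
 * Step 3, C code:     the function below.
 */
int triple(int x) {
    return x * 3;
}
```

Notice that the literal reading (an address computation) and the contextual reading (a multiplication by 3) are quite different, which is why step 2 matters.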

A few suggestions here for getting unstuck on any phase:

  1. Figure out the general format expected as the input for that phase by looking at the arguments passed to the scanf procedure (in particular, the format string).
  2. Look for patterns that are indicative of loops, procedure calls, recursion, arrays, etc. We covered these in previous lectures (go back and review slides/videos), in the homeworks, and they're in the textbook.
  3. Once you have the input format and see the general control flow of the procedure, annotate the control flow by labeling the beginning and end of each loop (i.e., using the label column). This helps you map out the overall flow of the procedure. Take particular note of anything that looks like a loop variable (e.g., a variable that is set to 0 or 1 and incremented each time through a loop). This tells you how many times a loop executes, and then you can start to intuit why it executes that many times. Also take note of variables that are updated by adding 0x4 or 0x8. This usually indicates iteration through an array, with the variable being advanced to point to the next element (0x4 for 4-byte elements such as int, 0x8 for 8-byte elements such as long or pointers).
  4. There really is no substitute for just working through, line by line, and doing detailed annotations and reverse-engineering. This is intentionally built into the lab because the goal is for you to learn this skill.
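
To make suggestions 1 and 3 concrete, here is a minimal sketch of what reverse-engineered C for a made-up phase might look like. Everything in it is hypothetical: the function names, the three-integer format, and the "strictly increasing" check are my own illustration, not the logic of any actual phase. I use sscanf on a string here, assuming the same format-string conventions as the rest of the scanf family.

```c
#include <stdio.h>

/* Suggestion 1: the format string reveals the expected input.
 * "%d %d %d" means three integers; the return value is the number
 * of items successfully parsed. */
int read_three(const char *input, int out[3]) {
    return sscanf(input, "%d %d %d", &out[0], &out[1], &out[2]);
}

/* Suggestion 3: i is the loop variable (starts at 1, incremented
 * each pass), and p advances by one int, which is the "add 0x4"
 * pattern you would see in the assembly.
 * Returns 1 if the values are strictly increasing, 0 otherwise. */
int is_increasing(const int *nums, int n) {
    const int *p = nums;
    for (int i = 1; i < n; i++) {
        if (p[1] <= p[0])
            return 0;
        p++; /* next element: address grows by sizeof(int) */
    }
    return 1;
}
```

For the input "1 5 9", read_three returns 3 and is_increasing returns 1. For the values 3, 2, 7, is_increasing returns 0 on the first pass through the loop.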

Lab 5: The Attack Lab

  • Lab Specification
  • Request a target here
    • If the site doesn't load, try a few page refreshes—it should eventually work
    • There should be no spaces in the username (use underscores instead)
  • Check the scoreboard here
    • Please note the scoreboard does not necessarily reflect successful completion of phases (which will be evaluated on submission of the exploit files via Moodle) but does generally indicate progress towards completion of the lab.

Helpful hints

  • If you're having trouble sshing or scping, be sure you're connected to the VPN.
  • Remember, the exploits will only work if you're running in gdb, not from the command line.
  • Remember: a hex character is 4 bits (half a byte), so it takes 2 hex characters to encode one byte. Your addresses should be 8 bytes, i.e., 16 hex characters.
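
The arithmetic in the last hint can be sketched in C. This is a minimal sketch, not part of the lab tooling: the function name addr_to_hex_bytes and the address 0x401234 are hypothetical, and emitting the least significant byte first assumes a little-endian x86-64 target, which this page does not state.

```c
#include <stdint.h>
#include <stdio.h>

/* Writes the 8 bytes of addr as space-separated hex pairs, least
 * significant byte first (little-endian assumption).
 * out must have room for at least 24 chars:
 * 8 pairs * 2 hex chars + 7 spaces + terminating '\0'. */
void addr_to_hex_bytes(uint64_t addr, char *out) {
    int pos = 0;
    for (int i = 0; i < 8; i++) {
        unsigned byte = (unsigned)((addr >> (8 * i)) & 0xff);
        /* One byte is always exactly two hex characters. */
        pos += sprintf(out + pos, "%02x", byte);
        if (i < 7)
            out[pos++] = ' ';
    }
    out[pos] = '\0';
}
```

With the hypothetical address 0x401234, the buffer holds 34 12 40 00 00 00 00 00: 16 hex characters for 8 bytes. The leading zero bytes are easy to forget when the address looks short.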