Feel free to download and use a copy of my code annotation Excel spreadsheet. It shows how I suggest walking through a piece of assembly to annotate and reverse-engineer it. Here is a screenshot:
Everything in black is how the spreadsheet should look before you start. I have a column for addresses (pasted in). I usually skinny this column down to just see the offsets. Then I add a column where I can annotate labels. Then a column for instructions (pasted in). Then a column for annotations and a column for the reverse-engineered C code. I also have a space for taking notes about the purposes/values of the registers and the stack. I find that for me, just annotating what they contain or represent is sufficient, but you might want to insert and update values in these locations as you walk through an execution of the code by hand.
Everything in blue is an actual annotation. Note that what you see is the final form of my annotations for this code. I often start out with bland labels (e.g., .L1, .L2, etc.) or bland variable names (e.g., v, x, y, etc.) and then come back to change them to something more descriptive when I start to see what the code is doing.
Everything in red is an extra note I've added just to explain my process.
Q: Is there a good way to determine if the registers are holding arrays or structs vs just data?
A: To the machine, it's all just 0's and 1's. Even at the level of machine instructions, the machine doesn't know data types. Data types are an abstraction created by the C language (or any other language), and the C compiler's job is to translate that abstraction away. Since the machine doesn't know data types, they're not reflected in the machine code.
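To make that concrete, here is a minimal sketch showing that the same four bytes can be read as a float or as an integer. The helper name float_bits is my own, and the value in the usage note assumes a 32-bit IEEE 754 float, which is what you'll see on typical x86-64 systems.

```c
#include <stdint.h>
#include <string.h>

/* Copy the raw bytes of a float into a 32-bit unsigned integer.
 * Nothing in memory changes; only our interpretation of the bits does.
 * Assumes float is 32 bits wide (true on typical x86-64 systems). */
uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Under those assumptions, float_bits(1.0f) gives 0x3F800000. The bits are identical either way; only the C type tells the compiler which instructions to generate for them.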
Note that a register can't hold an array or a struct, since they're too big; it can only point to one. The way you determine whether a register points to an array or a struct is by observing how the instructions manipulate and reference that register and drawing inferences from that.
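As a rough sketch of the kind of inference I mean, compare the two accesses below. The struct and function names are hypothetical, and the assembly operands in the comments are what a compiler typically produces on x86-64, not a guarantee.

```c
#include <stddef.h>

/* Hypothetical struct, used only for illustration. */
struct node {
    int value;          /* at offset 0 */
    struct node *next;  /* typically at offset 8 on x86-64 (after padding) */
};

/* Array access: address = base + index * element size.
 * In assembly this often shows up as a scaled-index operand such as
 * (%rdi,%rsi,4), usually with the index register changing in a loop. */
int get_element(const int *a, long i) {
    return a[i];
}

/* Struct access: address = base + a fixed field offset.
 * In assembly this often shows up as a constant displacement such as
 * 8(%rdi), with a different constant for each field. */
const struct node *get_next(const struct node *p) {
    return p->next;
}
```

So a register that gets indexed with a scale factor and a changing counter is probably pointing at an array, while a register that is only ever dereferenced at a few fixed offsets is probably pointing at a struct.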
Q: I'm not sure what information can be ignored, like the system doing checks. Do we look only in the phases or everywhere?
A: What we observed today in class will be true for the rest of the phases, namely that the system checks and other ignorable instructions will show up at the beginning and end of the functions.
Follow the process: 1) make sure you understand the literal interpretation of the instruction, 2) take a step beyond the literal and interpret what that instruction means in the broader context (e.g., setting an argument, assigning a variable value, etc.), 3) after annotating a sufficient number of instructions to see what's happening, reverse-engineer the C code.
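Here is a small sketch of that three-step process on a made-up listing. The assembly is a hypothetical example I wrote for illustration (x86-64, AT&T syntax), not code from any of the phases, and the names sum_array, a, n, i, and result are my own choices.

```c
/*
 * Instruction                    Literal meaning      Broader meaning
 * -----------------------------  -------------------  ------------------
 *   movl  $0, %eax               eax = 0              result = 0
 *   movl  $0, %edx               edx = 0              i = 0
 * .L2:
 *   cmpl  %esi, %edx             compare edx to esi   test i against n
 *   jge   .L4                    jump if edx >= esi   exit loop if i >= n
 *   movslq %edx, %rcx            rcx = (long) edx     widen i for indexing
 *   addl  (%rdi,%rcx,4), %eax    eax += mem[rdi+4*rcx]  result += a[i]
 *   addl  $1, %edx               edx += 1             i++
 *   jmp   .L2                    jump to .L2          repeat loop
 * .L4:
 *   ret                          return eax           return result
 */

/* Step 3: the reverse-engineered C code.
 * %rdi holds the first argument (a), %esi the second (n). */
int sum_array(const int *a, int n) {
    int result = 0;
    for (int i = 0; i < n; i++) {
        result += a[i];
    }
    return result;
}
```

Notice that I didn't need every instruction annotated before the loop structure became obvious; the compare, the conditional jump forward, and the unconditional jump back were enough to suggest a for loop.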
A few suggestions here for getting unstuck on any phase: