<!--
Abstract:

CS 211
Computer Architecture
Lecture 03: Pointers and Memory
-->

# CS 211 - Lecture 03 - Integers in Memory

Bernhard Firner

2026-01-28

---

## Review!

* Memory
  * `argc` and `argv`
  * arrays and strings
  * pointers
* Then use that to examine integers
  * Sections 2.2-2.3 in the book

---

## Looking at argc and argv

```C
/*
 * This is an example program!
 * To compile code:
 * gcc example.c -o example
 */

#include <stdio.h>

int main(int argc, char** argv) {
    // Your program goes here.

    // Say hello.
    printf("Hello world!\n");
    // Print out argc, the count of the program arguments
    printf("There are %d arguments.\n", argc);
    // What's the first argument?
    printf("The first argument is %s.\n", argv[0]);
    // Return something.
    return 123;
}
```

---

## What Are argc and argv?

* The main function has two parameters
* `argc` is the argument count, the number of arguments to your program
* `argv` is the argument vector, the string representations of the arguments
* These are not known at `compile time`, only at `run time`.

---

## What is argv's type?

* argv is an array of strings
  * But that just means that it is a number that refers to some memory
* Pointers and arrays are closely related in C
  * Both are ways to refer to memory
  * In most expressions, an array is converted to a pointer to its first element
* *Every variable* in C has a memory location
  * And the `&` operator (address of) can be used to fetch it

---

## What is a string in C?

* The argument vector is an array of strings
* A string is an array of characters
  * Each string ends with a special, null character
    * Value of 0
    * Written `\0`
* Use `man ascii` on Linux to see all ASCII characters

---

## Array Access

* Pointers are numbers that refer to memory locations
  * e.g. 0x7fff82006545
* Those memory locations are arbitrary
  * The OS chooses an arbitrary range for a program
  * This allows for virtual memory, swap, virtual machines, etc
  * Not an OS course, so we'll leave it at that

---

## Diving into arrays

* argv looks something like this:

<table>
<tr>
<td><b>index</b></td>
<td>0</td>
<td>1</td>
<td>2</td>
<td>...</td>
</tr>
<tr>
<td><b>content</b></td>
<td>pointer1</td>
<td>pointer2</td>
<td>pointer3</td>
<td>...</td>
</tr>
</table>

<br/>

* And `argv[0]` points to something like this:

<table>
<tr>
<td><b>index</b></td>
<td>0</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
<td>5</td>
</tr>
<tr>
<td><b>character</b></td>
<td>a</td>
<td>.</td>
<td>o</td>
<td>u</td>
<td>t</td>
<td>\0</td>
</tr>
</table>

---

## Arrays

* To access elements in an array we specify an index inside of square brackets
  * These things: `[ ]`
* The first index is 0

```C
//Access the first element
argv[0]
//Access the third element
argv[2]
```

---

## Arrays Vs Pointers

* Arrays behave like pointers in most expressions.
  * Associated with a C type
  * That means `char*` is a different type than `int*`
    * Why?

---

## More Pointer Math

* `argv` is a `char**`
  * That means it points to 8-byte values (each one is a `char*`), since this is a 64-bit system
    * Every pointer must be able to refer to anywhere in memory
* Adding 1 to a `char**` increases the memory location by 8
* Subtracting 1 would decrease it by 8

---

## More Pointer Math

* `argv[0]` is a `char*`
  * A `char` is 8 bits
  * Adding 1 will increase the pointer value by 1

---

## Data sizes

* It is easy to check the size of a type or variable
* Use the `sizeof` operator
* `sizeof(argv)` will return 8
* `sizeof(char**)` will return 8
* `sizeof(char*)` will return 8
* `sizeof(char)` will return 1
* [https://cppreference.com/w/c/language/sizeof.html](https://cppreference.com/w/c/language/sizeof.html)

---

## Access Operators

* We've used the array operator `[]`
  * But it's just syntactic sugar for **pointer dereference**
    * Which, confusingly, uses the * operator
  * `argv[x]` is equivalent to `*(argv + x)`
* This is basically the reverse of the `&` operator
* [https://cppreference.com/w/c/language/operator_member_access.html](https://cppreference.com/w/c/language/operator_member_access.html)

```C
int a = 6;
// b points to a
int *b = &a;
// a is now 2
b[0] = 2;
// a is now 4
*b = 4;
```

---

## Casting Tricks

* We can tell the compiler to treat a variable of one type as another type
* This is especially useful with pointers

```C
#include <stdio.h>

/*
 * This function doesn't return anything, so it returns 'void'
 * 'char' is the smallest addressable unit of memory (one byte).
 * size_t is an unsigned type that we use for sizes and array indices.
 * It is large enough to hold the size of any object in memory.
 */
void show_bytes(unsigned char* data, size_t len) {
    for (size_t i = 0; i < len; ++i) {
        printf("Byte %zu is %02x\n", i, data[i]);
    }
    printf("\n");
}

int main(void) {
    // One byte
    char a = 10;
    // Two bytes (also called a word)
    short b = 10;
    // Four bytes on most systems (also called a double word)
    int c = 10;
    // Eight bytes
    long long int d = 10;
    // The sizeof operator returns the number of bytes in a type or variable.
    // char* and unsigned char* are distinct types, so even this needs a cast
    show_bytes((unsigned char*)&a, sizeof(char));
    // For the non-char types we also need to cast to unsigned char*
    show_bytes((unsigned char*)&b, sizeof(short));
    show_bytes((unsigned char*)&c, sizeof(int));
    show_bytes((unsigned char*)&d, sizeof(long long int));
}
```

---

## Examining Numbers

* Let's change that program so that we provide the number to inspect
* But user inputs are strings. We'll have to convert.
* We'll use the `atoi` (ascii to int) function from `stdlib.h`
  * `int atoi(const char* nptr);`
  * [https://cppreference.com/w/c/string/byte/atoi.html](https://cppreference.com/w/c/string/byte/atoi.html)
  * `man atoi`

---

## New Program

```C
#include <stdio.h>
#include <stdlib.h>

// show_bytes() is the function from the Casting Tricks example.
int main(int argc, char* argv[]) {
    // Verify that there are enough user arguments
    if (argc < 2) {
        printf("There aren't enough arguments.\n");
        return 10;
    }
    int user_input = atoi(argv[1]);
    printf("Received user input %i\n", user_input);
    show_bytes((unsigned char*)&user_input, sizeof(int));
}
```

---

## Numeric Representations

* What is 1?
  * The least significant byte is the first one, so this is little-endian

<pre>
$ ./a.out 1
Received user input 1
Byte 0 is 01
Byte 1 is 00
Byte 2 is 00
Byte 3 is 00
</pre>

---

## More Numbers

* What is $2^{31} - 1$?
* This is `INT_MAX`, from limits.h

<pre>
$ ./a.out 2147483647
Received user input 2147483647
Byte 0 is ff
Byte 1 is ff
Byte 2 is ff
Byte 3 is 7f
</pre>

---

## More! Bigger!

<pre>
$ ./a.out 2147483648
Received user input -2147483648
Byte 0 is 00
Byte 1 is 00
Byte 2 is 00
Byte 3 is 80
</pre>

---

## Overflow

* Weird! Why?
* An `int`, using 4 bytes, cannot store 2147483648
  * Range is from $-2^{31}$ to $2^{31}-1$
* We "wrapped around" from the largest positive value to the most negative value
  * But why is 0x80000000 the largest magnitude negative number?

---

## Negative Numbers

<pre>
$ ./a.out -1
Received user input -1
Byte 0 is ff
Byte 1 is ff
Byte 2 is ff
Byte 3 is ff
</pre>

* 0x80000000 is more negative than 0xffffffff
* Integers are stored in something called **two's complement**

---

## Two's Complement

* Two's complement allows for elegant numeric representation and consistent rules
* 2147483647 + 1 cannot be stored in a 4 byte `int`, so the value wraps around to the most negative number
  * 2147483647 is 0x7fffffff
  * 2147483647 + 1 is 0x80000000
    * So the operation looks reasonable in hex, but we've wrapped around from positive to negative

---

## More Two's Complement

* 0x80000000 + 1 is 0x80000001
* Keep adding 1s, and you'll get to 0xffffffff, which is -1
  * Add one more and you get 0x00000000, which is 0

---

## Complements

* Look at -1 and 1: 0xffffffff and 0x00000001
* What happens if you add them together?
  * They sum to 0!
* Subtract one from -1 and add 1 to 1: 0xfffffffe and 0x00000002
  * They sum to 0!
* Add any value to its two's complement and the sum is 0
  * The carry out of the highest bit is discarded

---

## Converting to Two's Complement

* To find a negative number, start with the positive value
  * E.g. 1 = 0x00000001
  * The highest bit represents the sign, limiting the range of values
* Flip every bit
  * 0xfffffffe
* Add 1
  * 0xffffffff

---

## Example Negation

```C
    a = (a ^ 0xffffffff) + 1;
    printf("%i is 0x%x\n", a, a);
```

* `^` is xor, or `exclusive or`
  * It means a or b, but not both
  * 1 ^ 1 = 0
  * 1 ^ 0 = 1
  * 0 ^ 1 = 1
  * 0 ^ 0 = 0
* Xor with 0xffffffff flips every bit, so we could have used the `~` operator for that step: `~a + 1`

-v-

```C
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
    // Verify that there are enough user arguments
    if (argc < 2) {
        printf("There aren't enough arguments.\n");
        return 10;
    }
    int user_input = atoi(argv[1]);
    printf("Received user input %i\n", user_input);
    int compl = (user_input ^ 0xffffffff) + 1;
    printf("Two's complement is %i\n", compl);
    show_bytes((unsigned char*)&user_input, sizeof(int));
}
```

---

## Implications

* Two's complement simplifies hardware
* Easy to check for negative values as well
  * Just check the highest order bit
  * In C, you can use the `binary and` (`&`) operator to check for a bit
  * `a & 0x80000000`
* Addition, subtraction, and multiplication produce the same bits for signed and unsigned numbers of the same size

-v-

```C
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
    // Verify that there are enough user arguments
    if (argc < 2) {
        printf("There aren't enough arguments.\n");
        return 10;
    }
    int user_input = atoi(argv[1]);
    printf("Received user input %i\n", user_input);
    int compl = (user_input ^ 0xffffffff) + 1;
    printf("Two's complement is %i\n", compl);
    if (user_input & 0x80000000) {
        printf("This number is negative!\n");
    }
    show_bytes((unsigned char*)&user_input, sizeof(int));
}
```

---

## Why not simpler?

* Why not just use the most significant bit for the sign and the rest for the magnitude?
* First, it's wasteful
  * There would be two values for 0
* Second, two's complement is conceptually more complicated, but it is simpler to build in hardware
  * Two's complement has been dominant since the 1960s
    * Since IBM System/360 and PDP computers

---

## More details

* What happens when we cast integers?
* And how is 2's complement simpler in hardware?

---

## Truncation and Expansion

* We were working with 4 byte ints
  * What if we go down to a 2 byte short or 1 byte char?
  * Or up to an 8 byte long long int?

---

## Fancier Program

```C
#include <stdio.h>
#include <stdlib.h>

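int main(int argc, char** argv) {
    // Verify that there are enough user arguments
    if (argc < 2) return 10;
    int user_input = atoi(argv[1]);
    printf("Received user input %i\n", user_input);
    if (user_input & 0x80000000) {
        printf("This number is negative!\n");
    }
    char as_char = (char)user_input;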
    long long int as_llint = (long long int)user_input;
    show_bytes((unsigned char*)&user_input, sizeof(int));
    show_bytes((unsigned char*)&as_char, sizeof(char));
    show_bytes((unsigned char*)&as_llint, sizeof(long long int));
}
```

---

## +1000

<pre>
$ ./a.out 1000
Received user input 1000
Byte 0 is e8
Byte 1 is 03
Byte 2 is 00
Byte 3 is 00

Byte 0 is e8

Byte 0 is e8
Byte 1 is 03
Byte 2 is 00
Byte 3 is 00
Byte 4 is 00
Byte 5 is 00
Byte 6 is 00
Byte 7 is 00
</pre>

---

## -1000

<pre>
$ ./a.out -1000
Received user input -1000
This number is negative!
Byte 0 is 18
Byte 1 is fc
Byte 2 is ff
Byte 3 is ff

Byte 0 is 18

Byte 0 is 18
Byte 1 is fc
Byte 2 is ff
Byte 3 is ff
Byte 4 is ff
Byte 5 is ff
Byte 6 is ff
Byte 7 is ff
</pre>

---

## Truncation

* We just saw that casting to a smaller type leads to truncation
  * 0x000003e8 turned into 0xe8
  * That's $14\times16 + 8 = 232$ as an unsigned value, or $232 - 256 = -24$ if `char` is signed
  * Has nothing to do with the original 1000, other than sharing the least significant byte

---

## Extension

* Casting to a larger type does preserve the value
  * 0x000003e8 turns into 0x00000000000003e8
  * 0xfffffc18 turns into 0xfffffffffffffc18
* The positive value is 0-padded, which doesn't change the value
* The negative value is f-padded, which also doesn't change the value
  * Copying the sign bit into the new high bits is called sign extension

---

## Bit Operations

* Since we're talking about similar topics, let's cover this
  * We've seen xor (^ operator)
  * And used binary and (& operator)
* Let's go over the rest of the bit operators

---

## Binary Or

* Binary or (| operator)
* This is often used to combine things or set particular bits
  * 0xaa00 | 0x00bb = 0xaabb

---

## Bit shift

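* The left shift (<<) and right shift (>>) operators are equivalent to pushing a 0 onto the right or left of a number
  * This is actually the same as multiplying or dividing by 2 for each position shifted
  * For negative signed values, a right shift usually copies the sign bit in instead of a 0
* 0x001 << 1 = 0x002
* 0x008 >> 2 = 0x002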

---

## Together

* The bit operations are often used together
* For example, let's toggle the most significant bit of an int

```C
#include <stdio.h>

int main(void) {
    int a = 9834234;
    // Use an unsigned 1: shifting a signed 1 into the sign bit is undefined behavior
    int b = a ^ (1u << 31);
    printf("a is 0x%08x, and b is 0x%08x\n", a, b);
    return 0;
}
```

---

## Output

* Doesn't seem super exciting, but this is very handy sometimes

---

## Last Operator

* There is one more bit operator that does bitwise negation
  * The ~ operator will flip every bit of a value
  * It is equivalent to xor with a value with all bits set

```C
#include <stdio.h>

int main(void) {
    int a = 9834234;
    printf("a is 0x%08x, ~a is 0x%08x\n", a, ~a);
    printf("a is 0x%08x, (a ^ 0xffffffff) is 0x%08x\n", a, (a ^ 0xffffffff));
    return 0;
}
```

---

## Next Time:

* Methods of memory allocation
  * Variable size arrays
  * Dynamic memory management
  * The stack and the heap

<!--
## Compiler Options

Note:
-Wall
-Werror
-fsanitize=address
-std=c99 (do we want to do this?)
-lm
-->