int main(int argc, char** argv) {
if (argc < 2) {
printf("This program expects a number as an argument!\n");
return 1;
}
int a = atoi(argv[1]);
printf("%i is 0x%x\n", a, a);
unsigned char* bytes = (char*)&a;
for (int i = 0; i < sizeof(a); ++i) {
printf("Byte %i is 0x%x\n", i, bytes[i]);
}
return 0;
}
```
---
## Numeric Representations
* What is 1?
$ ./a.out 1
1 is 0x1
Byte 0 is 0x1
Byte 1 is 0x0
Byte 2 is 0x0
Byte 3 is 0x0
---
## Endianness
* The `least significant byte` is first
* Meaning that `1` is stored at index 0
* `bytes[0]`
* A system that stores the least significant byte first is called `little-endian`
* The opposite would be `big-endian`
* `most significant byte` first
---
## More on Endianness
* The internet generally stores things in big-endian format
* This leads to headaches when sending data online
* Some systems also store data in mixed-endian formats
* Thankfully not on the ilab machines
---
## More Numbers
* What is $2^{31} - 1$?
* This is `INT_MAX`, from limits.h
$ ./a.out 2147483647
2147483647 is 0x7fffffff
Byte 0 is 0xff
Byte 1 is 0xff
Byte 2 is 0xff
Byte 3 is 0x7f
---
## More! Bigger!
$ ./a.out 2147483648
-2147483648 is 0x80000000
Byte 0 is 0x0
Byte 1 is 0x0
Byte 2 is 0x0
Byte 3 is 0x80
---
## Overflow
$ ./a.out 2147483648
-2147483648 is 0x80000000
* Why?
* An `int`, using 4 bytes, cannot store 2147483648
* Range is from $-2^{31}$ to $2^{31}-1$
* An operation outside of the range makes it negative
* By why is 0x80000000 the largest negative number?
---
## Negative Numbers
$ ./a.out -1
-1 is 0xffffffff
Byte 0 is 0xff
Byte 1 is 0xff
Byte 2 is 0xff
Byte 3 is 0xff
* Weird!
* 0x80000000 is more negative than 0xffffffff?
---
## Two's Complement
* Remember modulo arithmetic?
* Two's complement allows for elegant numeric representation
* 2147483647+1 does not exist, so it wraps to the end
* 2147483647 is 0xffffff7f
* 2147483647 + 1 is 0x00000080
* Your hardware likes this!
---
## Remember Endianness!
* Adding 1 to 0xffffff7f begins at the left
* Since this is a little endian system
* 0xff + 1 is 0x00 with a bit overflowing to the adjacent byte
* That overlow reaches 0x7f, turning it to 0x80
---
## More Two's Complement
* 0x80000000 + 1 is 0x80000001
* That continues until -1, which is 0xffffffff
* Any number with the highest bit set is negative
* Any negative number plus its positive counterpart has value 0
* 0xffffffff + 0x00000001 = 0x00000000
* So elegant!
---
## Converting to Two's Complement
* Two find a negative number, start with the positive value
* E.g. 1 = 0x00000001
* The highest bit represents the sign, limiting the range of values
* Flip every bit
* 0xfffffffe
* Add 1
* 0xffffffff
---
## Example Negation
```C
a = 0x80000000 + (a ^ 0x7fffffff) + 1;
printf("%i is 0x%x\n", a, a);
```
* `^` is xor, or `exclusive or`
* It means a or b, but not both
* 1 ^ 1 = 0
* 1 ^ 0 = 1
* 0 ^ 0 = 0
---
## Implications
* Two's complement simplifies hardware
* Easy to check for negative values as well
* Just see if the most significat bit is set
* In C, you can use the `binary and` (`&`) operator to check for a bit
* `a & 0x80000000`
* `Addition, subtraction, and multiplication are the same for signed and unsigned numbers`
---
## Why not simpler?
* Why not just use the most significant bit for the sign?
* The rest of the number would stay the same
* First, it's wasteful
* There would be two values for 0
* Second, it requires more complicated hardware
* Two's complement has been dominant since the 1960s
* Since IBM System/360 and PDP computers
---
## Summing Examples
```C
#include
#include
int main(int argc, char** argv) {
if (argc < 3) {
printf("This program expects 2 numbers as arguments!\n");
return 1;
}
// This reads a long long.
// The second argument is used for error checking.
// NULL means we aren't checking for errors and the function ignores it.
// The last argument, 16, says that we are reading values in base 16.
long long a = strtoll(argv[1], NULL, 16);
long long b = strtoll(argv[2], NULL, 16);
printf("0x%llx + 0x%llx = 0x%llx\n", a, b, a+b);
printf("%lli + %lli = %lli\n", a, b, a+b);
return 0;
}
```
---
## Some Sums
$ ./a.out 7fffffffffffffff 1
0x7fffffffffffffff + 0x1 = 0x8000000000000000
9223372036854775807 + 1 = -9223372036854775808
$ ./a.out -1 1
0xffffffffffffffff + 0x1 = 0x0
-1 + 1 = 0
$ ./a.out -ffffffffffffffff -1
0x8000000000000000 + 0xffffffffffffffff = 0x7fffffffffffffff
-9223372036854775808 + -1 = 9223372036854775807
---
## Pause
* Any questions or confusion so far?
---
## Let's Write an Algorithm
* We've covered just about enough topics to do something interesting
* So let's find some prime numbers!
* With an algorithm called the `Sieve of Eratosthenes`
---
## The Sieve of Eratosthenes
* Finds all of the primes up to some number
* Algorithm:
1. Mark all numbers from 2 to N as being possibly prime
2. Start at 2 (the first prime)
3. This number is prime; mark all of its multiples as not prime
4. Go to the next number still marked as possibly prime
5. Repeat steps 3 and 4 until reaching N
---
## Storing information
* We could make variables for each number
```C
bool two_is_prime = false;
bool three_is_prime = false;
bool four_is_prime = false;
```
* That would be silly
* Instead, we should use an array
---
## Initializing arrays
* [https://cppreference.com/w/c/language/array.html](https://cppreference.com/w/c/language/array.html)
* Three types:
* Known constant size
* Variable length
* Unknown size
* Let's just look at the first for now
* Save some fun for next class!
---
## Constant size arrays
```C
#include
#include
void print_array(bool array[10]) {
for (int i = 0; i < 10; ++i) {
printf("%i: %i\n", i, array[i]);
}
}
int main(int argc, char** argv) {
{
// This has a fixed size
bool bool_array[10];
print_array(bool_array);
}
{
// We could initialize with a set of values
bool bool_array[] = {false, false, false, false, false,
true, true, true, true, true};
print_array(bool_array);
}
{
// Or we could just set some values
bool bool_array[10] = {false, false, true};
print_array(bool_array);
}
return 0;
}
```
---
## Always Initialize Arrays
```
0: 0
1: 0
2: 112
3: 84
4: 50
5: 106
6: 252
7: 127
8: 0
9: 0
0: 0
1: 0
2: 0
3: 0
4: 0
5: 1
6: 1
7: 1
8: 1
9: 1
0: 0
1: 0
2: 1
3: 0
4: 0
5: 0
6: 0
7: 0
8: 0
9: 0
```
---
## Initialization
* Any initialization will force C to initialize all elements with a default
* For non-default values, use `memset` from ``
---
## Proof Through Assembly
> $ gcc -g -c example12.c -o example12.o
>
> $ objdump -d -M intel -S example12.o
```
example12.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 :
#include
#include
void print_array(bool array[10]) {
0: f3 0f 1e fa endbr64
4: 55 push rbp
5: 48 89 e5 mov rbp,rsp
8: 48 83 ec 20 sub rsp,0x20
c: 48 89 7d e8 mov QWORD PTR [rbp-0x18],rdi
for (int i = 0; i < 10; ++i) {
10: c7 45 fc 00 00 00 00 mov DWORD PTR [rbp-0x4],0x0
17: eb 30 jmp 49
printf("%i: %i\n", i, array[i]);
19: 8b 45 fc mov eax,DWORD PTR [rbp-0x4]
1c: 48 63 d0 movsxd rdx,eax
1f: 48 8b 45 e8 mov rax,QWORD PTR [rbp-0x18]
23: 48 01 d0 add rax,rdx
26: 0f b6 00 movzx eax,BYTE PTR [rax]
29: 0f b6 d0 movzx edx,al
2c: 8b 45 fc mov eax,DWORD PTR [rbp-0x4]
2f: 89 c6 mov esi,eax
31: 48 8d 05 00 00 00 00 lea rax,[rip+0x0] # 38
38: 48 89 c7 mov rdi,rax
3b: b8 00 00 00 00 mov eax,0x0
40: e8 00 00 00 00 call 45
for (int i = 0; i < 10; ++i) {
45: 83 45 fc 01 add DWORD PTR [rbp-0x4],0x1
49: 83 7d fc 09 cmp DWORD PTR [rbp-0x4],0x9
4d: 7e ca jle 19
}
}
4f: 90 nop
50: 90 nop
51: c9 leave
52: c3 ret
0000000000000053 :
int main(int argc, char** argv) {
53: f3 0f 1e fa endbr64
57: 55 push rbp
58: 48 89 e5 mov rbp,rsp
5b: 48 83 ec 30 sub rsp,0x30
5f: 89 7d dc mov DWORD PTR [rbp-0x24],edi
62: 48 89 75 d0 mov QWORD PTR [rbp-0x30],rsi
66: 64 48 8b 04 25 28 00 mov rax,QWORD PTR fs:0x28
6d: 00 00
6f: 48 89 45 f8 mov QWORD PTR [rbp-0x8],rax
73: 31 c0 xor eax,eax
{
// This has a fixed size
bool bool_array[10];
print_array(bool_array);
75: 48 8d 45 ee lea rax,[rbp-0x12]
79: 48 89 c7 mov rdi,rax
7c: e8 00 00 00 00 call 81
}
{
// We could initialize with a set of values
bool bool_array[] = {false, false, false, false, false,
81: c6 45 ee 00 mov BYTE PTR [rbp-0x12],0x0
85: c6 45 ef 00 mov BYTE PTR [rbp-0x11],0x0
89: c6 45 f0 00 mov BYTE PTR [rbp-0x10],0x0
8d: c6 45 f1 00 mov BYTE PTR [rbp-0xf],0x0
91: c6 45 f2 00 mov BYTE PTR [rbp-0xe],0x0
95: c6 45 f3 01 mov BYTE PTR [rbp-0xd],0x1
99: c6 45 f4 01 mov BYTE PTR [rbp-0xc],0x1
9d: c6 45 f5 01 mov BYTE PTR [rbp-0xb],0x1
a1: c6 45 f6 01 mov BYTE PTR [rbp-0xa],0x1
a5: c6 45 f7 01 mov BYTE PTR [rbp-0x9],0x1
true, true, true, true, true};
print_array(bool_array);
a9: 48 8d 45 ee lea rax,[rbp-0x12]
ad: 48 89 c7 mov rdi,rax
b0: e8 00 00 00 00 call b5
}
{
// Or we could just set some values
bool bool_array[10] = {false, false, true};
b5: 48 c7 45 ee 00 00 00 mov QWORD PTR [rbp-0x12],0x0
bc: 00
bd: 66 c7 45 f6 00 00 mov WORD PTR [rbp-0xa],0x0
c3: c6 45 f0 01 mov BYTE PTR [rbp-0x10],0x1
print_array(bool_array);
c7: 48 8d 45 ee lea rax,[rbp-0x12]
cb: 48 89 c7 mov rdi,rax
ce: e8 00 00 00 00 call d3
}
return 0;
d3: b8 00 00 00 00 mov eax,0x0
}
d8: 48 8b 55 f8 mov rdx,QWORD PTR [rbp-0x8]
dc: 64 48 2b 14 25 28 00 sub rdx,QWORD PTR fs:0x28
e3: 00 00
e5: 74 05 je ec
e7: e8 00 00 00 00 call ec
ec: c9 leave
ed: c3 ret
```
---
## Important Notes
* The compiler reuses the same memory multiple times
* An unitialized array created at the end would see the third arrays values
* Array access is going from a smaller address to a larger address
* rbp-0x12 to rbp-0x9
* Look at instructions starting at 0x81
---
## Extra Notes
* Our `i < 10` is converted to `i <= 9`
* At 0x4d, `jle` is jump if less than equal
* The `for` loop jumps to the comparison before doing anything
* At 0x17 it `jmp` (jumps) to 0x49
* We know this is a waste because `i` begins at `0` and is less than `10`
* Compile again with the `-O1` flag, and the compiler optimizes this away
---
## Sieve to 100
```C
#include
#include
#include
int main(int argc, char** argv) {
// Find and print the primes to 100
bool maybe_prime[101] = {false, false};
// Fill the next 99 values with true
memset(maybe_prime+2, true, 99);
// Loop over all numbers to 100
for (int i = 2; i <= 100; ++i) {
// Loop over all multiples of any prime numbers
if (maybe_prime[i]) {
// Mark all multiples as not prime
for (int j = i+1; j <= 100; ++j) {
// A remainder of 0 means i evenly divides into j
if (j % i == 0) {
maybe_prime[j] = false;
}
}
}
}
// Print the primes
for (int i = 0; i <= 100; ++i) {
if (maybe_prime[i]) {
printf("%i is prime.\n", i);
}
}
return 0;
}
```
---
## Primes
```
2 is prime.
3 is prime.
5 is prime.
7 is prime.
11 is prime.
13 is prime.
17 is prime.
19 is prime.
23 is prime.
29 is prime.
31 is prime.
37 is prime.
41 is prime.
43 is prime.
47 is prime.
53 is prime.
59 is prime.
61 is prime.
67 is prime.
71 is prime.
73 is prime.
79 is prime.
83 is prime.
89 is prime.
97 is prime.
```
---
## Next Time:
* Variable size arrays
* Memory management