<!--
Abstract:

CS 211
Computer Architecture
Lecture 06: File I/O and Structs
-->

# CS 211 - Lecture 07 - Bit Fields, Unions, and Fractional Values

Bernhard Firner

2026-02-11

---

## Struct Review

* It is intuitive to group related data together
  * And indeed, our hardware is optimized for this
* In C, structs are blocks of related memory
  * They do not include member function, inheritance, polymorphism, etc
  * So they are similar, but not the same as class in other languages

---

## Struct Practice

* We can go through a couple of struct examples
  * This will highlight what we've learned about

---

## Memory in Structs

* We can use structs to help us remember array sizes
  * This also cleans up arguments and returns to and from functions
* Recall that only the pointer value is copied when passed, not the memory

```C
typedef struct nicePtr nicePtr;
struct nicePtr {
    unsigned int* memory;
    size_t elements;
};
```

---

## Better names with typedef

* Recall that the actual name of the struct is "struct nicePtr"
* C has a way to make new names, so we can simplify things
  * `typedef <existing name> <new name>`
  * You can do this before making the struct, which is a `forward declaration`

---

## Pointer Passing

* Arguments are passed by their values
* The contents of a struct are copied
* Since pointers are just numbers, we can safely pass structs with pointers
  * Makes it simple to return something like nicePtr
* If you pass via a pointer, you can use **const** to prevent modification

---

## Const example

```C
/*
 * Demonstrate const with struct pointers
 */

#include <stdio.h>

typedef struct bigData bigData;
struct bigData {
    long long numbers[1000];
};

// This function modifies data's members, so it cannot be const.
void modifyData(bigData* data) {
    for (int i = 0; i < sizeof(data->numbers)/sizeof(long long); ++i) {
        data->numbers[i] = i;
    }
}

// This function does not modify data's members, so it can be const.
long long sumData(const bigData* data) {
    long long sum = 0;
    for (int i = 0; i < sizeof(data->numbers)/sizeof(long long); ++i) {
        sum += data->numbers[i];
    }
    return sum;
}

int main(void) {
    bigData data = {.numbers = {}};

// Modify the data
    modifyData(&data);

// Sum the data
    printf("Sum is %lli\n", sumData(&data));

return 0;
}
```

---

## New Example

* Let's simplify a bit, then mix in what we've learned
* Let's read in the contents of a file
  * Make everything upper case
  * Reprint
* We can go one character at a time with `fgetc` and `putchar`

---

## Reading a File

* New functions to work with files:
  * `FILE *fopen(const char *restrict pathname, const char *restrict mode);`
    * Open a file. Mode is 'r' for read, 'w' for write.
  * `int fclose(FILE *stream);`
    * Close the file.

---

## Getting and Putting

* In stdio.h
  * `int fgetc(FILE *stream);`
    * Get a single character
  * `int putchar(int c);`
    * Print a single character

---

## Example file

<pre>
"Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."
</pre>

---

## Making something uppercase

* We could use the `toupper` function from `ctype.h`
* I want us to do a little math, though

---

## man ascii

```
       For convenience, below are more compact tables in hex and decimal.

2 3 4 5 6 7       30 40 50 60 70 80 90 100 110 120
        -------------      ---------------------------------
       0:   0 @ P ` p     0:    (  2  <  F  P  Z  d   n   x
       1: ! 1 A Q a q     1:    )  3  =  G  Q  [  e   o   y
       2: " 2 B R b r     2:    *  4  >  H  R  \  f   p   z
       3: # 3 C S c s     3: !  +  5  ?  I  S  ]  g   q   {
       4: $ 4 D T d t     4: "  ,  6  @  J  T  ^  h   r   |
       5: % 5 E U e u     5: #  -  7  A  K  U  _  i   s   }
       6: & 6 F V f v     6: $  .  8  B  L  V  `  j   t   ~
       7: ' 7 G W g w     7: %  /  9  C  M  W  a  k   u  DEL
       8: ( 8 H X h x     8: &  0  :  D  N  X  b  l   v
       9: ) 9 I Y i y     9: '  1  ;  E  O  Y  c  m   w
       A: * : J Z j z
       B: + ; K [ k {
       C: , < L \ l |
       D: - = M ] m }
       E: . > N ^ n ~
       F: / ? O _ o DEL
```

---

## Ascii Math

* The lowercase and uppercase letters are all in a row
* The distance from 'a' to 'A' is the same as from 'z' to 'Z'
  * 'a' - 'A' = 97 - 65 = 32
* So we can just subtract that distance from a lowercase letter to get an uppercase letter

---

## Upper Case Code

```C
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
    if (argc < 2) {
        printf("This program requires a data file.\n");
        return 1;
    }

// Open datafile for reading.
    FILE* datafile = fopen(argv[1], "r");
    if (datafile == NULL) {
        printf("Error opening file.\n");
        return 2;
    }

// We don't need to get complicated with sscanf
    // Just read in one character at a time
    // fgetc returns some special numbers, so we read into an int rather than a char
    int input = fgetc(datafile);
    // EOF is defined in stdio.h, means "End Of File"
    while (input != EOF) {
        // Print a single char
        // Ascii characters appear in numerical order, so we can do math to capitalize
        if (input >= 'a' && input <= 'z') {
            // Increase by the distance from the lower case letters to the upper case
            input += 'A' - 'a';
        }
        putchar(input);
        // For the next loop iteration
        input = fgetc(datafile);
    }

// Finished with the file
    fclose(datafile);

return 0;
}
```

---

## Reversing

* What if we wanted to reverse the characters?
  * We could make a giant array, shove the characters into it, and then loop backwards over it.
  * Not very elegant
* Instead we'll use a stack, which you've hopefully seen before!

---

## Stack (data structure)

* We push onto the top to add
* Pop to retrieve and remove from the top
* Last in, first out
* This shows up in our architecture, so let's get experience working with one

---

## Stack Definitions

* stack.h

```C
// For the size_t type definition
#include <stddef.h>

typedef struct Stack Stack;
struct Stack {
    char *memory;
    // Remember how much memory has been allocated
    size_t memsize;
    // Remember the top of the stack
    size_t index;
};

Stack newStack(size_t size);

void freeStack(Stack s);

void push(Stack* s, char element);

char pop(Stack* s);

size_t len(Stack s);
```

---

## Push and dynamic memory

* What if we push and there isn't enough memory?
* We are going to allocate more memory
  * Double whatever we have currently
* Then copy over the existing memory values before free-ing it

---

## Stack Functions

```C
#include <stdlib.h>
#include <string.h>

#include "stack.h"

Stack newStack(size_t size) {
    Stack s = { .memory = NULL, .memsize = 0, .index = 0 };
    s.memory = calloc(size, sizeof(char));
    s.memsize = size;
    return s;
}

void freeStack(Stack s) {
    free(s.memory);
}

void push(Stack* s, char element) {
    // Double the memory whenever it is filled
    if (s->index + 1 >= s->memsize) {
        char* new_memory = calloc(s->memsize * 2, sizeof(char));
        // Copy from the old memory into the new memory and then free the old memory
        memcpy(new_memory, s->memory, s->memsize);
        free(s->memory);
        s->memory = new_memory;
        s->memsize = s->memsize * 2;
    }
    s->memory[s->index] = element;
    s->index += 1;
}

char pop(Stack* s) {
    if (0 == s->index) {
        return '\0';
    }
    // Index is at the first empty position
    s->index -= 1;
    char output = s->memory[s->index];
    return output;
}

size_t len(Stack s) {
    return s.index;
}

```

---

## Using the stack

```C
#include <stdio.h>
#include <stdlib.h>

#include "stack.h"

int main(int argc, char** argv) {
    if (argc < 2) {
        printf("This program requires a data file.\n");
        return 1;
    }

// Open datafile for reading.
    FILE* datafile = fopen(argv[1], "r");
    if (datafile == NULL) {
        printf("Error opening file.\n");
        return 2;
    }

Stack s = newStack(10);

// We don't need to get complicated with sscanf
    // Just read in one character at a time
    int input = fgetc(datafile);
    // EOF is defined in stdio.h, means "End Of File"
    while (input != EOF) {
        push(&s, input);
        // For the next loop iteration
        input = fgetc(datafile);
    }

// Finished with the file
    fclose(datafile);

// Reverse
    while (0 < len(s)) {
        putchar(pop(&s));
    }
    putchar('\n');

freeStack(s);

return 0;
}
```

---

## Makefile

```Makefile
reverse: stack.h stack.c reverse.c
	gcc $^ -o $@
```

* (The indentation should be a tab)

---

## Structs and Memory

* C structs make it easier to work with memory
  * So of course they have memory-optimizing features
* We can specify the length of each struct field, in bits

```C
struct Bits {
    int a : 5;
};
```

* Bits.a is a 5 bit type

---

## Why?

* The compiler automatically packs `bool` into 1 bit
  * But not other types

```C
typedef struct Bits Bits;
struct Bits {
    bool a : 1;
    bool b : 1;
    bool c : 1;
    bool d : 1;
    bool e : 1;
    bool f : 1;
    bool g : 1;
    bool h : 1;
};
```

* `sizeof(Bits)` will be 8

---

## Bit packing

```C
typedef struct Numbers Numbers;
struct Numbers {
    int a : 10;
    int b : 10;
    int c : 10;
};
```

* `sizeof(Numbers)` will be 4
  * 30 bits fits into 4 bytes
  * Size will always be a whole number of bytes

---

## Memory Alignment

```C
typedef struct Aligned Aligned;
struct Aligned {
    int a : 10;
    int b : 10;
    int c : 10;
    long long int d : 64;
};
```

* `sizeof(Aligned)` will have a size of 16
  * On this architecture at least
  * Why? Should be 4 + 8 = 12, right?

---

## Words

* Data sizes are bits, nibbles, bytes, words, dwords, and qwords
  * 1 bit, 4 bits, 8 bits, 16 bits, 32 bits, and 64 bits
* Memory is fetched at particular byte boundaries

---

## Alignment

* Aligning memory to those boundaries makes a program faster
  * But less space efficient
* The compiler aligns the "Aligned" struct members to qword boundaries
  * So there is empty space between Aligned.c and Aligned.d

---

## Packing

* We can tell the compiler to pack things together to save space

```
#pragma pack(push, 4)
typedef struct Unaligned Unaligned;
struct Unaligned {
    int a : 10;
    int b : 10;
    int c : 10;
    long long int d : 64;
};
#pragma pack(pop)
```

* This tells the compiler to align to 4 byte (double word) boundaries
* Now the size is 12

---

## Code

```C
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

typedef struct Bits Bits;
struct Bits {
    bool a : 1;
    bool b : 1;
    bool c : 1;
    bool d : 1;
    bool e : 1;
    bool f : 1;
    bool g : 1;
    bool h : 1;
};

typedef struct Numbers Numbers;
struct Numbers {
    int a : 10;
    int b : 10;
    int c : 10;
};

typedef struct Aligned Aligned;
struct Aligned {
    int a : 10;
    int b : 10;
    int c : 10;
    long long int d : 64;
};

#pragma pack(push, 4)
typedef struct Unaligned Unaligned;
struct Unaligned {
    int a : 10;
    int b : 10;
    int c : 10;
    long long int d : 64;
};
#pragma pack(pop)

int main(int argc, char** argv) {
    printf("The size of Bits is %lu\n", sizeof(Bits));
    printf("The size of numbers is %lu\n", sizeof(Numbers));
    printf("The size of aligned is %lu\n", sizeof(Aligned));
    printf("The size of unaligned is %lu\n", sizeof(Unaligned));

return 0;
}
```

---

## Interpreting Memory

* One more trick
* The way data at a memory location is treated is arbitrary
* Normally we have to get a pointer and recast to change it

```C
int a = 10;
char* b = (char*)(&a);
```

---

## Unions

* In assembly, data is loaded into a register before use
* The type of register changes the way the data is treated
  * Not the source of the data!
  * All data is just data, only the registers have types!
* So we could treat memory any way we want, or in multiple ways
  * C calls this a Union

---

## Union Syntax

* Declaring a union looks like declaring a struct
* But every member is treated as having overlapping memory

---

## Union Example

```C
union together {
    int number;
    unsigned char chars[4];
};
```

* Equivalent to having an int and a char* to its memory.

---

## Useful Union

```C
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

typedef struct Bits Bits;
struct Bits {
    bool a : 1;
    bool b : 1;
    bool c : 1;
    bool d : 1;
    bool e : 1;
    bool f : 1;
    bool g : 1;
    bool h : 1;
};

typedef union LLUBits LLUBits;
union LLUBits {
    // 8 bytes to a long long
    Bits bits[8];
    long long number;
};

int main(int argc, char** argv) {
    if (argc < 2) {
        printf("Give me a number!");
        return 1;
    }

char* endptr = argv[1];
    LLUBits value;
    value.number = strtoll(argv[1], &endptr, 10);

for (int i = sizeof(LLUBits)-1; i >= 0; --i) {
        printf("%i", value.bits[i].h);
        printf("%i", value.bits[i].g);
        printf("%i", value.bits[i].f);
        printf("%i", value.bits[i].e);
        printf("%i", value.bits[i].d);
        printf("%i", value.bits[i].c);
        printf("%i", value.bits[i].b);
        printf("%i", value.bits[i].a);
        printf(" ");
    }
    printf("\n");

return 0;
}
```

---

## Outputs

<pre>
$ ./print_bits 1
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001
$ ./print_bits 9223372036854775807
01111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111
$ ./print_bits -9223372036854775807
10000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001
$ ./print_bits -9223372036854775808
10000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
</pre>

---

## Numeric Types

* Since we're working with numbers, let's talk about fractions
* If you were designing new numbers from scratch, you might consider this:

```C
typedef struct Fraction Fraction;
struct Fraction {
    unsigned int fraction : 4;
    unsigned int decimal : 28;
};
```

---

## Supporting Fractions

* The Fraction struct groups two different numbers:
  * Fractional
  * Regular integer
* The fraction is evenly divided into the range from 0 to 1
  * So with 4 bits, the step size is $1/2^4 = \frac{1}{16} = 0.0625$

---

## Fixed Point

* Because this uses a fixed precision past the decimal point, we call is a fixed point number

```C
typedef struct FixedFour FixedFour;
struct FixedFour {
    unsigned int fraction : 4;
    unsigned int decimal : 28;
};
```

---

## Doing Math

* But how do we perform math on our fixed point number?
* Let's consider 0.5
  * Fraction field is 8
    * Recall that each step is $\frac{1}{16} = 0.0625$
  * Decimal field is 0

---

## Addition and Multiplication

* If we added two structs equal to 0.5, the fraction field would overflow
  * And if the overflow went into the decimal portion, that would be correct
* So why don't we treat the two fields as a single number when doing math?

---

## Fixed Point Type

```C
typedef struct FixedFourStorage FixedFourStorage;
struct FixedFourStorage {
    unsigned int fraction : 4;
    unsigned int value : 28;
};

// If 0x1000 is 0.5, then 0x0001 must be 0.0625
// Always print with width 4 when using printf
const unsigned int ffour_fraction = 625;

typedef union FixedFour FixedFour;
union FixedFour {
    FixedFourStorage storage;
    // Used for math
    unsigned int raw;
};
```

---

## Math

* Addition and subtraction are unchanged
* But things go wrong if we multiply or divide
  * 0.5 * 0.5 should be 0.25
  * But b1000 * b1000 won't be b0100
* Why? Because multiply treats the lowest bit as the 1s place, but it should be the 0.0625s place
* Solution? Right shift by 4 after multiplication

---

## Fixed Point Multiply

```C
// Create 10.5 and 0.5. When multiplying, rescale afterwards
FixedFour ff = {.storage.value = 10, .storage.fraction = 1};
FixedFour other = {.storage.value = 0, .storage.fraction = 0x1<<3};
ff.raw = (ff.raw * other.raw) >> 4;
```

* Division is similar, but we need to upscale first before dividing or we lose precision
  * `result.raw = ((a.raw << 4) / b.raw);`

---

## Fixed Point Math

```C
#include <stdio.h>
#include <stdlib.h>

typedef struct FixedFourStorage FixedFourStorage;
struct FixedFourStorage {
    unsigned int fraction : 4;
    unsigned int value : 28;
};

// If 0x1000 is 0.5, then 0x0001 must be 0.0625
// Always print with width 4 when using printf
const unsigned int ffour_fraction = 625;

typedef union FixedFour FixedFour;
union FixedFour {
    FixedFourStorage storage;
    // Used for math
    unsigned int raw;
};

int main(void) {
    FixedFour ff = {.storage.value = 10, .storage.fraction = 1};

// Print with %04u means pad with leading zeros to width 4
    printf("ff value is %i.%04u\n", ff.storage.value, ff.storage.fraction * ffour_fraction);

FixedFour other = {.storage.value = 10, .storage.fraction = 1};
    ff.raw = ff.raw + other.raw;
    printf("Adding with another value.\n");
    printf("ff value is %i.%04u\n", ff.storage.value, ff.storage.fraction * ffour_fraction);

printf("Tripling.\n");
    ff.raw *= 3;
    printf("ff value is %i.%04u\n", ff.storage.value, ff.storage.fraction * ffour_fraction);

printf("Dividing by 7.\n");
    ff.raw /= 7;
    printf("ff value is %i.%04u\n", ff.storage.value, ff.storage.fraction * ffour_fraction);

printf("Multiplying by the other fixed point with value 0.5, then rescaling.\n");
    other.storage.value = 0;
    other.storage.fraction = 0x1 << 3;
    ff.raw = (ff.raw * other.raw) >> 4;
    printf("ff value is %i.%04u\n", ff.storage.value, ff.storage.fraction * ffour_fraction);

printf("Multiplying by -1.\n");
    other.storage.value = -1;
    other.storage.fraction = 0;
    ff.raw = (ff.raw * other.raw) >> 4;
    printf("ff value is %i.%04u\n", ff.storage.value, ff.storage.fraction * ffour_fraction);

return 0;
}
```

---

## Output

<pre>
ff value is 10.0625
Adding with another value.
ff value is 20.1250
Tripling.
ff value is 60.3750
Dividing by 7.
ff value is 8.6250
Multiplying by the other fixed point with value 0.5, then rescaling.
ff value is 4.3125
Multiplying by -1.
ff value is 16777211.6875
</pre>

---

## Negatives

* Our math fails on negatives
* When we right shift, we always lose the upper bits
  * Obviously wrong with negative numbers
* We could add a sign bit to fix this

---

## Negative Divisions

* Speaking of sign bits
  * When we divide, we don't want the math to use the MSB
  * It changes the number, and thus the math
  * So we have to mask it out before dividing

---

## Floats

* We could keep going, but doesn't this seem complicated?
* Is this how floating point values are implemented in C?
  * Not at all!

---

## Fixed Point Disadvantages

* Poor range; less than an int
* Poor precision; gave up 4 bits for increments of 0.0625
* Floating point numbers are the prefered alternative
  * More complicated than fixed point
  * Should be easier to understand if you understand fixed points

---

## Float Union

```C
#include <stdio.h>
#include <stdlib.h>

typedef struct FloatBits FloatBits;
struct FloatBits {
    unsigned int significand : 23;
    unsigned int exponent : 8;
    unsigned int sign : 1;
};

typedef union FloatIntBits FloatIntBits;
union FloatIntBits {
    float the_float;
    int the_int;
    FloatBits the_bits;
};

int main(int argc, char** argv) {
    if (argc < 2) {
        printf("Give me a float!");
        return 0;
    }

FloatIntBits fib = {.the_float = atof(argv[1])};

printf("The size of fib is %lu\n", sizeof(fib));
    printf("The size of FloatBits is %lu\n", sizeof(FloatBits));
    printf("The sign is %u, the exponent is %u, and the significand is %u\n", fib.the_bits.sign, fib.the_bits.exponent, fib.the_bits.significand);

printf("%f as an int is %i\n", fib.the_float, fib.the_int);
    return 0;
}
```

---

## Next Topics

* We'll go in depth on floating point number
  * This will finally finish chapter 2 in the text book