# CS 211 - Lecture 24 ## Computer Architecture ### Digital Logic Design Bernhard Firner 2026-04-22 --- ## Topic Intro * Not enough time to dive deeply into this topic * So why bother? * Give you a sense of "what is happening" when thinking about hardware * Move some earlier topics from the realm of memorization into the realm of understanding --- ## Types of Logic * We separate logic into two categories * combinational: stateless * sequential: has state * Let's begin with combinational * Some of this may be a review if you remember your truth tables --- ## Combinational Logic * Stateless part of our systems use combinational logic * Stateless means that an output only depends upon its current inputs * Recall the adder example? This is stateless --- ## Addition * Addition is done one bit at a time * This is called a *half adder* because it doesn't handle carry in
--- ## Full Adder * Need to also handle carry in from the previous bit * This is called a *full adder*
--- ## Critical Path * Notice that this logic uses more gates? * The pathways are also deeper * The longest path is the *critical path*, and will determine the overall delay of the circuit * The delay here is 2 gates
--- ## Ripple Carry Adder * Just cascade the one bit adders together to an arbitrary precision * No memory exists in this circuit, just the current inputs
--- ## Details * We could start making things complicated * For example, voltage takes some time to propagate across a gate * Each of those full adders is full of gates, so this adds up * The *critical path* was 2 for each full adder, but the longest pathway here isn't 8 --- ## Critical Path * It takes 2 gates to get the first carry out * but by then all of the adders are waiting for the carry input * With 4 full adders, the first has a delay of 2 gates * Then the carry bit moves through the rest in 3 more gates * So the critical path is 5 gates
--- ## Adder * Let's go back to a full adder and describe it with a truth table
Input A
Input B
Sum
Carry
0
0
0
0
0
1
1
0
1
0
1
0
1
1
0
1
* sum = A xor B * carry = A and B --- ## All Gate Symbols
--- ## Adder with Carry
* Our adder needs a carry input * Things are getting complicated already!
Input A
Input B
Carry In
Sum
Carry Out
0
0
0
0
0
0
0
1
1
0
0
1
0
1
0
0
1
1
0
1
1
0
0
1
0
1
0
1
0
1
1
1
0
0
1
1
1
1
1
1
--- ## Complexity and Optimization
* We could muddle our way through this * First represent as an equation * $Sum = (C_{in} \land \bar{B} \land \bar{A}) \lor (B \land \overline{C_{in}} \land \bar{A}) ...$ * $Sum = (C_{in} \oplus A \oplus B)$ * $\land$ means logical and, $\lor$ is logical or, $\oplus$ is xor, and a line is negation * Is that minimization obvious?
Input A
Input B
Carry In
Sum
Carry Out
0
0
0
0
0
0
0
1
1
0
0
1
0
1
0
0
1
1
0
1
1
0
0
1
0
1
0
1
0
1
1
1
0
0
1
1
1
1
1
1
--- ## Circuit Minimization * It turns out that circuit minimization is difficult * There are some algorithms we could learn * Karnough maps, Quine-McCluskey algorithm, etc * In practice, humans use software to do our circuit minimization * So with limited time, let's focus on other fundamentals --- ## Improved Addition * Is there a minimized circuit for addition of multiple bits? * Of course! * We need the carry in to calculate the sum, and on the last bit that causes the delay * So let's shorten this circuit
--- ## Propagate or Generate * Let's describe the carry out behavior for an adder as either generating a carry out, or propagating a carry out * If A and B are both 1, then the adder has a carry out * If either A or B are 1, then the adder will propagate a carry out if there is a carry in * Call those $P_0$ and $G_0$ * Can we calculate the second adder's output faster now? --- ## Lookahead * $C_1 = A_0B_0 + (A_0\oplus B_0)C_0 = P_0 + G_0C_0$ * $S_1 = C_1 \oplus (A_1 \oplus B_1) = C_1 \oplus P_1$ * $C_2 = A_1B_1 + (A_1 \oplus B_1)G_0 + (A_1 \oplus B_1)(A_0 \oplus B_0)C_0 * = A_1B_1 + P_1G_0 + P_1P_0C_0$ * The values from P and G have all of the information from the carry, but are available 1 gate sooner * This is called a *carry-lookahead adder* --- ## Carry-Lookahead Adder
* We are trading off circuit elements for speed * Trace the longest paths and you will see that they do not grow * We saved 1 step at each full adder * 4 steps for sum bits other than $S_0$
From
wikimedia
--- ## Numbers
* P and G took 1 gate to calculate * And the sum calculation is 1 more gate once the carry is available * The carry lookahead calculations take multiple AND gates, followed by an OR gate
From
wikimedia
--- ## Assumptions
* We are assuming that the OR gate can take more than 2 inputs * The input count is the *fan-in* of that gate * In real hardware, it is not infinite * So eventually we will need to nest the OR gates * But for small numbers of bits, lookahead adders have constant delays
From
wikimedia
--- ## Why Logic Gates? * You may wonder why we need to make this complicated * If we just added those 1s directly, we wouldn't need logic gates * Except that isn't how voltage works * The implementation of something else would be even more difficult --- ## Voltage and Logic * Your computer, phone, watch, toaster, etc use two logic states * Voltage low (or ground) * Voltage high (or Vcc) * This is a fundamental of digital circuits --- ## Digital Vs Analog * So why not use analog? * Can you add Vcc + Vcc and get 2Vcc? * No. That's not how voltage works. * Maybe if our building blocks were amplifiers, but that would be a tough system to build * Our fundamental building block is the transistor --- ## Transistors * Transistors have 3 terminals, with one terminal acting as a control switch * I'm simplifying a lot here, there are other uses and form factors * One input controls the transistor state * Think of it as one switch and two terminals for output levels * We are actually changing the conductivity of the transistor junction * Take a class on transport theory to learn more! --- ## Gates * We assemble our gates with transistors, so they naturally work with bistate logic * But it is easier to fabricate systems with a subset of the different logic gates * Homogeneity simplifies the hardware design process --- ## Universal Gates * We do not need the complete set of possible gates to represent all of our digital logic * The subset of AND, OR, and NOT is sufficient * That makes them a "universal set"
--- ## Proof
| C | B | A | Out | | :----: | :---: | :---: | :---: | | 0 | 0 | 0 | 1 | | 0 | 0 | 1 | 0 | | 0 | 1 | 0 | 1 | | 0 | 1 | 1 | 0 | | 1 | 0 | 0 | 0 | | 1 | 0 | 1 | 0 | | 1 | 1 | 0 | 1 | | 1 | 1 | 1 | 0 |
* We can write a truth table for any expression * Let's say we want to express this one * We can **always** write a truth table as a sum of products * AND together the elements of each row where the output is 1 * OR those together to get a possible circuit
--- ## Proof
| C | B | A | Out | | :----: | :---: | :---: | :---: | | 0 | 0 | 0 | 1 | | 0 | 0 | 1 | 0 | | 0 | 1 | 0 | 1 | | 0 | 1 | 1 | 0 | | 1 | 0 | 0 | 0 | | 1 | 0 | 1 | 0 | | 1 | 1 | 0 | 1 | | 1 | 1 | 1 | 0 |
* Products: * $\overline{C}~ AND~ \overline{B}~ AND ~\overline{A}$ * $\overline{C}~ AND ~B ~AND ~\overline{A}$ * $C ~AND ~B ~AND ~\overline{A}$ * Sum * $\overline{CBA} ~ OR ~\overline{C}B\overline{A} ~ OR ~ CB\overline{A}$ * Often written * $\overline{CBA} + \overline{C}B\overline{A} + CB\overline{A}$ * The sum of products is called the *disjunctive normal form*
--- ## Other Options * But maybe our manufacturing processes more easily fabricate a single type of gate * In that case, either NAND or NOR are also universal
--- ## Demonstration * We can simply show that the other gates are representable with NAND and NOR
| x | y | NAND | | :----: | :---: | :---: | | 0 | 0 | 1 | | 0 | 1 | 1 | | 1 | 0 | 1 | | 1 | 1 | 0 |
| x | y | NOR | | :----: | :---: | :---: | | 0 | 0 | 1 | | 0 | 1 | 0 | | 1 | 0 | 0 | | 1 | 1 | 0 |
--- ## Not Representation
| A | A | NAND A| | :----: | :---: | :---: | | 0 | 0 | 1 | | 1 | 1 | 0 |
| A | A | NOR A | | :----: | :---: | :---: | | 0 | 0 | 0 | | 1 | 1 | 1 |
--- ## OR Representation
| A | B | NAND A | NAND B | (NAND A) NAND (NAND B) | | :----: | :---: | :---: | :---: | :---: | | 0 | 0 | 1 | 1 | 0 | | 0 | 1 | 1 | 0 | 1 | | 1 | 0 | 0 | 1 | 1 | | 1 | 1 | 0 | 0 | 1 |
| A | B | A NOR B | NOR(A NOR B) | | :----: | :---: | :---: | :---: | | 0 | 0 | 1 | 0 | | 0 | 1 | 0 | 1 | | 1 | 0 | 0 | 1 | | 1 | 1 | 0 | 1 |
--- ## AND Representation
| A | B | A NAND B | NAND(A NAND B) | | :----: | :---: | :---: | :---: | | 0 | 0 | 1 | 0 | | 0 | 1 | 1 | 0 | | 1 | 0 | 1 | 0 | | 1 | 1 | 0 | 1 |
| A | B | NOR A | NOR B | (NOR A) NOR (NOR B) | | :----: | :---: | :---: | :---: | :---: | | 0 | 0 | 1 | 1 | 0 | | 0 | 1 | 1 | 0 | 0 | | 1 | 0 | 0 | 1 | 0 | | 1 | 1 | 0 | 0 | 1 |
--- ## Important Logic * A few structures tend to repeat * ALUs * Multiplexers * Decoders --- ## Multiplexer * Just call it a MUX * Allows us to select one output from several * Here $S$ is the selector * Turn A "off" if S is set to 1, B "off" is S is not set to 1
--- ## Using Muxes * Muxes can select from more inputs * For $2^n$ inputs, we need $n$ select inputs * FPGA (field programmable gate arrays) are programmable hardware * They use MUXes as part of their programmable logic, along with look up tables, adders, and a flip-flop * Look up tables are also implemented with a MUX * "Programming" them means setting all of the control inputs --- ## Logic With A MUX
* Let's say that I have three input signals, A, B, and C * I want a signal to be 1 with certain combinations * For example, maybe these are bits from an opcode that will select how the ALU will be used * We could write out the truth table and then implement that circuit
| C | B | A | Out | | :----: | :---: | :---: | :---: | | 0 | 0 | 0 | 1 | | 0 | 0 | 1 | 0 | | 0 | 1 | 0 | 1 | | 0 | 1 | 1 | 0 | | 1 | 0 | 0 | 0 | | 1 | 0 | 1 | 0 | | 1 | 1 | 0 | 1 | | 1 | 1 | 1 | 0 |
--- ## Logic With A MUX
* Instead of putting down individual logic gates, I could re-use a multiplexer * We can use A, B, and C as the select inputs to a MUX * And then we hard-wire the inputs to Vcc or ground
| C | B | A | Out | | :----: | :---: | :---: | :---: | | 0 | 0 | 0 | 1 | | 0 | 0 | 1 | 0 | | 0 | 1 | 0 | 1 | | 0 | 1 | 1 | 0 | | 1 | 0 | 0 | 0 | | 1 | 0 | 1 | 0 | | 1 | 1 | 0 | 1 | | 1 | 1 | 1 | 0 |
--- ## MUX Look Up Table
| C | B | A | Out | | :----: | :---: | :---: | :---: | | 0 | 0 | 0 | 1 | | 0 | 0 | 1 | 0 | | 0 | 1 | 0 | 1 | | 0 | 1 | 1 | 0 | | 1 | 0 | 0 | 0 | | 1 | 0 | 1 | 0 | | 1 | 1 | 0 | 1 | | 1 | 1 | 1 | 0 |
--- ## FPGAs * Logic represented by MUXes is slower than using the gates directly * e.g. implementing AND with MUXes is a waste * But it is still faster than a software implementation * Field Programmable Gate Arrays (FPGAs) use repeated elements, like muxes, to represent arbitrary combinational logic --- ## FPGA Architectures * There are a lot of special purpose systems that use FPGAs or similar devices * Used where there are too many settings for one piece of hardware, but speed is still important * reconfigurable wireless radios, for example * The most popular FPGA maker was Xilinx * They have been a part of AMD since their acquisition in 2022 --- ## FPGAs Example * FPGAs are generally hyper-focused on a particular task * [This one](https://www.amd.com/en/products/adaptive-socs-and-fpgas/fpga/artix-ultrascale-plus-xa.html) is optimized for automotive applications * They are CPUs and clock rates, but their performance is also measured by the number of look up tables and logic elements available * Multiplexors are used to "glue" different elements together, taking the place of arbitrary logic programmed into the system * But let's keep going through a tour of other combinational logic --- ## Decoder * A decoder converts $n$ inputs to $2^n$ unique outputs * These are often used when turning opcodes into signal to control other parts of the CPU * For example, selecting which input to a MUX goes into the ALU
--- ## Stacked Decoders * We can stack multiple decoders together * Notice that only one output is active at a time because of the NOT gates followed by AND * For another example, this could be used in a cache to select one set out of four given a set address
--- ## Tying Things Together * Why should you care? * A decoder is an excellent way to convert an address to a single selector * You might see this in a cache, for example * Think about set selection during cache access * What other circuit elements have we been using? --- ## Comparison * Remember how the `cmpq` instead was implemented with subtraction? * We can compare tags in a cache by: * Taking the 2's complement of the tag * Adding it to the tag address from a line * If the sum is still negative, then the tag is larger * But if we only care about equality, just check for all 0s and no carry out --- ## Practical Concerns * Notice how more complex logic requires deeper nesting of gates? * It takes actual time for voltage to stabilize * This is why our systems cannot run infinitely fast * Physics is holding us back * Delays are a combination of propagation delays, and voltage settling times --- ## Next Time * Hopefully this is helping to make computers feel less abstract * We'll talk more about time and delays when we cover sequential logic