This is a “hello world” for neural networks. You will load some numbers, put them into matrices, and do some multiplication. This homework also verifies that you can run a Python script and can run a provided test script. Failure to master those skills will negatively impact your ability to keep up with material in this class.
An archive (in tgz format) is provided. It has the following contents:
All pt files can be loaded with torch.load. For example:
    w0 = torch.load("weight0.pt")
The arrays were saved with torch.save, so they can be loaded with torch.load. When your program is invoked, you may assume that the pt files are in the same directory as your program (for your own testing, just extract the provided archive into that directory).
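As a minimal round-trip sketch of how torch.save and torch.load pair up: the tensor values and the file name example.pt below are made up for illustration and are not part of the provided archive.

```python
import os
import tempfile

import torch

# A made-up tensor standing in for one of the provided arrays.
original = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.pt")  # hypothetical file name
    torch.save(original, path)  # write the tensor to disk
    loaded = torch.load(path)   # read it back as a tensor

print(loaded.shape)                   # shape is preserved by the round trip
print(torch.equal(original, loaded))  # values are preserved as well
```

Printing the shape of each loaded array is a quick way to see how the weights and biases fit together before multiplying anything.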
Your program will take two arguments. The first is the filename to load for the X values, and the second is the filename to load for Y values.
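One possible way to read the two arguments is sketched below. The helper name parse_args and the usage message are assumptions made for illustration; the assignment does not prescribe how arguments are read.

```python
def parse_args(argv):
    """Return (x_path, y_path) from an argv-style list; argv[0] is the script name."""
    if len(argv) != 3:
        raise SystemExit(f"usage: {argv[0]} X_FILE Y_FILE")
    return argv[1], argv[2]


# Simulated command line, equivalent to: python3 hw01.py X1.pt Y1.pt
x_path, y_path = parse_args(["hw01.py", "X1.pt", "Y1.pt"])
print(x_path, y_path)
```

In an actual script, the list passed in would be sys.argv (after import sys) rather than a hard-coded list.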
Load the weight and bias values and reconstruct the network operation. A linear layer in torch is described here: https://docs.pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear
To summarize, the operation of a linear layer is: y = xA^T + b
A is the matrix of weights, and b is the bias. Follow that equation, applying a ReLU activation after each hidden layer (the first two layers). Once you have calculated the network output, calculate the mean squared error relative to Y and print it with two digits after the decimal point:
    print(f'Error is {output:.2f}')
Name your program “hw01.py” and submit it through Canvas.
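The layer equation, ReLU placement, and error calculation described above can be sketched on a toy network. Everything in this example is an assumption made for illustration: the 1 -> 2 -> 2 -> 1 layer sizes, the weight and bias values, the inputs, and the targets are all invented, and the real shapes come from the provided pt files.

```python
import torch

# Made-up parameters for a tiny 1 -> 2 -> 2 -> 1 network (illustrative only).
# Each entry is (A, b): the weight matrix and bias vector of one linear layer.
layers = [
    (torch.tensor([[1.0], [-1.0]]), torch.tensor([0.0, 0.0])),            # hidden layer 1
    (torch.tensor([[1.0, 0.0], [0.0, 1.0]]), torch.tensor([0.5, -0.5])),  # hidden layer 2
    (torch.tensor([[1.0, 1.0]]), torch.tensor([0.0])),                    # output layer
]


def forward(x, layers):
    """Apply y = x A^T + b per layer, with ReLU after every layer except the last."""
    for i, (A, b) in enumerate(layers):
        x = x @ A.T + b
        if i < len(layers) - 1:
            x = torch.relu(x)
    return x


x = torch.tensor([[2.0], [-1.0]])  # two samples, one feature each (made up)
y = torch.tensor([[3.5], [1.0]])   # made-up targets

prediction = forward(x, layers)
error = torch.mean((prediction - y) ** 2).item()  # mean squared error
print(f'Error is {error:.2f}')  # prints: Error is 0.50
```

If the prediction and the target have different shapes, for example (N, 1) versus (N,), the subtraction broadcasts to an (N, N) result and the mean is taken over the wrong set of values, so it is worth comparing shapes before computing the error.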
Your code must contain an author block at the top of the file. The author block must match the following format:
# Author: your name
# Netid: your netid
# Aid: If in doubt, list anything you referenced: a Stack Overflow post, ChatGPT, Gemini, your friend Bob, etc. If referring to an LLM, you must include a link to the particular session. Just saying "I asked ChatGPT" isn't a get-out-of-trouble-free statement. If two submissions are nearly identical and nothing is listed, I will have to assume that there was copying with an attempt at obfuscation.
Example runs and their expected output:
$ python3 hw01.py X1.pt Y1.pt
Error is 0.00
$ python3 hw01.py X2.pt Y2.pt
Error is 7.14
$ python3 hw01.py X3.pt Y3.pt
Error is 0.99
The first X and Y are the training data. The second pair goes outside of the training range (it covers 0 to 2π instead of 0 to π). The third pair comes from a sin function instead of the cos function that was used for training. Notice that going outside of the training range results in the largest error; outputs outside of the training range can take on arbitrary values, and are often quite large or small.
Due to the class size, grading will be automated whenever possible. In this case, you can run the grade_hw1.sh script to verify that your program has the expected error for the three provided tests.