# CS 530 - Lecture 02

## Environments and Knowledge

Bernhard Firner

2026-01-22

---

## Review: Environments

* Environments have many attributes
* The most difficult to deal with is a partially observable, multiagent, non-deterministic, sequential, dynamic, continuous environment with unknown rules
  * Big mouthful!
* The most ambiguous of these is perhaps "rules"
  * For example, does driving follow rules or not?

---

## Why Care?

* AI agents are interactive
* But different AI agents operate in different environments
* It is critical to understand your environment before rushing forward

---

## Review: Agents

* Agents interact with an environment (and perhaps each other)
---

## Planning

* Planning is a key component of any non-reflexive agent
  * Enables self-evaluation, required for robust systems
  * Remember the Sphex wasp and the vacuum cleaner
* In order to plan, an agent needs to retain information about its environment
  * More memory increases agent complexity, but often improves performance

---

## Simplest Environments

* We can create a simple environment
  * With enough sensors, there is no hidden state
  * Or, if we restrict the environment, we simplify our task

---

## Simplifying an Environment

* Consider an automatic braking system on a train
  * [https://en.wikipedia.org/wiki/Automatic_train_control](https://en.wikipedia.org/wiki/Automatic_train_control)
* The maximum speed at different parts of the track is precomputed
* The maximum speed is sent to a train as it approaches a section of track, and the train automatically adjusts
* The agent on the train needs to care about very little, so it is a simple agent

---

## Sensing

* Agents can only have knowledge of their environment from three sources
  * Foreknowledge, in the form of programmed or learned rules
  * Onboard environmental sensors
  * Communication from other agents or an external sensing system
* When we think of autonomous AIs, we usually think of something that operates independently via a camera

---

## Sensing and Environments

* An environment may have more in it than can be sensed
  * For example, our sense of smell has no analogous digital sensor
* The most ubiquitous sensor is a temperature sensor, since it is on almost every circuit
* The most useful, in general, is likely a camera

---

## Domain

* The part of the environment about which an agent needs to express knowledge
  * "The things we need to know about"
* For the train, it is just the maximum speed on the track
* For the mother-to-be Sphex wasp, it is the state of its burrow and the state of its prey

---

## In Domain

* We decide what is "in domain" and "out of domain" when defining agent goals
  * For example, no one expects your vacuum cleaner to be able to drive you to work
  * Also, no one expects your autonomous car to clean your house
* When designing a new system, you have to make difficult choices
  * Should an autonomous car be able to avoid hitting chipmunks?

---

## Domains and Sensors

* An environment will have more in it than your sensors can detect
* An agent's domain is just what it can sense
* For example, a cooking robot may not have a sense of smell
  * That means that its domain could be IR and visible light images along with a microphone
  * It cannot know anything about smell or taste, since it has neither, and must act based only on its sensors

---

## Domain Example

* What is the domain of a lane keeping system with a single forward-facing camera?
  * Images are 1024x768 in YUV format
  * Field of view is $120^\circ$

---

## Domain Example

* We could say that the domain is all possible values of pixels
  * $1024\times768\times3$ values per image
* But that would be overstating the size of the domain
* Instead, it is more realistic to restrict our agent to all images with an asphalt or concrete road

---

## Limitations Imposed by Sensors

* The size of a domain defines two things
  * The maximum capabilities of an agent
  * The maximum complexity of an agent
* This vehicle cannot see to the sides, so safe lane changes are out of domain

---

## Mistakes with Domains

* The first mistake made in many systems is to choose the wrong sensors
* A sensor may *feel* sufficient, but we often bring biases
  * When humans glance at a picture, we tend to insert our own knowledge
* Instead of using intuition, we should rigorously quantify what information is present

---

## Example: Camera

* A car has a forward-facing camera with a $70^\circ$ field of view (FOV)
* How far away can we predict a path along a road?

---

## Example: Camera

* We can begin by asking: what defines a path?
* Generally, it is lane markers
* The U.S. Federal Highway Administration set a minimum width of 4 inches
* How many pixels is that?
* It depends upon the distance

---

## Example: Camera

* We will simplify a few things for the $70^\circ$ FOV, 1024x768 image
* This is a triangle, with the height equal to 4 inches
* At 10 feet, a 4 inch wide object is approximately 28 pixels wide
  * from $\arctan(4 / 120) \approx 1.9^\circ$, at $1024 / 70 \approx 14.6$ pixels per degree
* By the time we reach 85 m (279 ft), we are down to a single pixel

---

## Lane Sizes

* This is an example image from KITTI
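---

## Example: Camera

The pixel estimates for the $70^\circ$, 1024-pixel-wide camera can be checked with a short calculation. This is only a sketch: it assumes the same simplification as the slides (every pixel covers the same angle, with no lens distortion), and the function names are made up for illustration.

```python
import math


def marker_width_px(width_in, distance_in, fov_deg=70.0, image_width_px=1024):
    """Approximate on-image width, in pixels, of an object seen head-on."""
    # Assumption: uniform angular resolution across the image.
    px_per_deg = image_width_px / fov_deg
    angle_deg = math.degrees(math.atan(width_in / distance_in))
    return angle_deg * px_per_deg


def single_pixel_distance_in(width_in, fov_deg=70.0, image_width_px=1024):
    """Distance, in inches, at which the object spans exactly one pixel."""
    angle_rad = math.radians(fov_deg / image_width_px)
    return width_in / math.tan(angle_rad)


near_px = marker_width_px(4, 120)      # 4 inch marker at 10 feet
far_in = single_pixel_distance_in(4)   # where the marker is one pixel wide
print(f"{near_px:.1f} px at 10 ft")
print(f"1 px at {far_in / 12:.0f} ft ({far_in * 0.0254:.0f} m)")
```

Under these assumptions the result is about 28 pixels at 10 feet and one pixel at roughly 279 ft (85 m). Changing `width_in` gives the same kind of estimate for other objects.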
---

## Lane Sizes

* Nearby lanes are clearly multiple pixels wide

---

## Lane Sizes

* We don't get very far before they are single pixels

---

## Information Content

* You may not see it, since your brain inserts more information
---

## Example: Camera

* What does that mean?
* Path prediction performance will break down quickly after 85 m
  * It may get bad before then, since 1 pixel is a weak signal
* We can use the same approach to estimate how far away we could detect humans, animals, road signs, etc.
* The first step of any research project is to quantify the information content of the data

---

## Impacts of Data

* Notice that data quality affects all parts of our system
  * Yes, the ML part of it
  * Also the labelling part
* If the labels are off by 10 cm, what does that do to our ML predictions?

---

## Noise

* This brings us to noise and uncertainty
* If the labels are off by 10 cm, does it matter?
* Is the noise unbiased and Gaussian?

---

## Is Noise Bad?

* Noise in the labels can be awful, or it can be no big deal
* Let's make an example:
  * Take MNIST Digits and train a ResNet
  * We will look at results with clean data and with 10% label errors
  * 10% of training labels are randomly changed to a different digit

---
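## Is Noise Bad?

The label corruption step of that experiment could look roughly like the sketch below. It covers only the "10% of training labels are randomly changed to a different digit" part, not the ResNet training. The function name `corrupt_labels`, its parameters, and the stand-in label list are assumptions for illustration, since the slides do not give the actual procedure.

```python
import random


def corrupt_labels(labels, error_rate=0.10, num_classes=10, seed=0):
    """Return a copy of labels with a fraction changed to a different class."""
    rng = random.Random(seed)  # fixed seed so the corruption is repeatable
    noisy = list(labels)
    n_flip = round(error_rate * len(noisy))
    # Pick distinct positions so exactly n_flip labels are altered.
    for i in rng.sample(range(len(noisy)), n_flip):
        # Draw only from the other classes so the label always changes.
        others = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(others)
    return noisy


# Stand-in for MNIST training labels (hypothetical data, digits 0-9).
clean = [i % 10 for i in range(1000)]
noisy = corrupt_labels(clean)
changed = sum(a != b for a, b in zip(clean, noisy))
print(f"{changed} of {len(clean)} labels changed")
```

Because the replacement digit is drawn uniformly from the other nine classes, this noise is unbiased by construction. Label errors in real data rarely come with that guarantee.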
---

## Noise

* Noise isn't necessarily a disaster
* But it can be difficult to guarantee that it is unbiased
* Sometimes you can redefine your labels to avoid labelling problems

---

## Example System

* An example of a successful system will hopefully drive some lessons home
* The first is a purely reflexive agent
* Pay attention to how the labels are defined and how data is collected
* And pay attention to what makes those labels usable

---

## Obstacle Avoidance

* [Off-Road Obstacle Avoidance through End-to-End Learning](https://proceedings.neurips.cc/paper/2005/hash/fdf1bc5669e8ff5ba45d02fded729feb-Abstract.html)
* [https://cs.nyu.edu/~yann/research/dave/](https://cs.nyu.edu/~yann/research/dave/)
---

## Description

* Goal is obstacle avoidance in off-road environments
* A human drove the vehicle at 2 m/s, and steering angles were recorded
* Two front-facing cameras, recording 320x240 images at 15 fps

---

## Data Collection Rig

* This is from Yann LeCun's website, and the simplicity of the setup is informative
---

## What is the Environment?

* Partially observable, single-agent, deterministic, sequential, dynamic, continuous, with known rules
  * Deterministic: There are no other agents in the data, and the environment is static
  * Known rules: as long as the vehicle isn't placed on a slippery hill, the physics during training will match testing

---

## Sensors

* The image quality is awful since the cameras are wireless
* But the vehicle makes a new decision at each frame
---

## Data

* 17 days of data collection
* 127,000 frame pairs
  * 95k training
  * 31.8k testing

---

## Training Target

* The training target is simply the steering angle
* So what needs to be done to create labels?
  * Nothing

---

## Simplified Labelling

* Notice that a slightly different environment would make that label impossible
  * For example, the correct steering angle depends upon the vehicle speed
* The steering angles must also be consistent
  * So the driver must adhere to a set of rules

---

## Disadvantages

* If the environment did not have consistent rules, what would we do?
* Two examples:
  * Vehicle speed is inconsistent due to inclines
  * Vehicle turning is not as commanded due to slippery surfaces

---

## Solutions

* We could solve those problems by adding memory to the system
* Like the Sphex wasp, we need to remember what we were trying to do, and evaluate whether or not we did it
* This gets into planning in non-deterministic environments
  * Our next topic!

---

## One More Point

* There is one more important point from this paper
* The authors binned outputs into "left", "straight", and "right"
  * Error was 25.1% on the training set and 35.8% on the test set
* This is a wildly inaccurate measure of system performance

---

## Agent Evaluation

* As stated before, agent evaluation is difficult
* If our evaluation metrics are so bad, how can we establish a measure of trust?
* This has similarities to the [Chinese Room](https://en.wikipedia.org/wiki/Chinese_room) posed by philosophers
* We will eventually revisit the idea of trust and verifiability in more detail
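---

## Agent Evaluation

The small sketch below gives a feel for why a binned "left", "straight", "right" error can misrepresent performance. It is not the paper's evaluation code. The 5 degree threshold, the sign convention (negative means left), and the sample angles are all assumptions chosen for illustration.

```python
def bin_steering(angle_deg, threshold_deg=5.0):
    """Map a steering angle to a coarse class (negative = left, by assumption)."""
    if angle_deg < -threshold_deg:
        return "left"
    if angle_deg > threshold_deg:
        return "right"
    return "straight"


def binned_error(predicted, recorded, threshold_deg=5.0):
    """Fraction of frames where predicted and recorded angles fall in different bins."""
    wrong = sum(
        bin_steering(p, threshold_deg) != bin_steering(r, threshold_deg)
        for p, r in zip(predicted, recorded)
    )
    return wrong / len(recorded)


# Hypothetical angles in degrees, not data from the paper.
recorded = [-12.0, -1.0, 0.5, 4.9, 20.0]
predicted = [-6.0, 2.0, -3.0, 5.1, 6.0]
print(binned_error(predicted, recorded))
```

In this toy data, the only counted error is the frame where the prediction differs from the driver by 0.2 degrees (4.9 versus 5.1). The frame that is off by 14 degrees (20.0 versus 6.0) counts as correct. The binned error therefore says little about how well the vehicle would actually steer.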