# CS 530 - Lecture 02

## Environments and Knowledge

Bernhard Firner

2026-01-22

---

## Review: Environments

* Environments have many attributes
* The most difficult to deal with is a partially observable, multiagent, non-deterministic, sequential, dynamic, continuous environment with unknown rules
  * Big mouthful!
  * The most ambiguous of these is perhaps "rules"
    * e.g. does driving follow rules, or not?

---

## Why Care?

* AI agents are interactive
  * But different AI agents operate in different environments
* It is critical to understand your environment before rushing forward

---

## Review: Agents

* Agents interact with an environment (and perhaps each other)

---

## Planning

* Planning is a key component of any non-reflexive agent
* Planning enables self-evaluation, which robust systems require
  * Remember the Sphex wasp and the vacuum cleaner
* In order to plan, an agent needs to retain information about its environment
  * More memory increases agent complexity, but often improves performance

---

## Simplest Environments

* We can create a simple environment
  * With enough sensors, there is no hidden state
  * Or, if we restrict the environment, we simplify our task

---

## Simplifying an Environment

* Consider an automatic braking system on a train
* [https://en.wikipedia.org/wiki/Automatic_train_control](https://en.wikipedia.org/wiki/Automatic_train_control)
  * The maximum speed at different parts of the track is precomputed
  * The maximum speed is sent to a train as it approaches a section of track, and the train automatically adjusts
* The agent on the train needs to care about very little, so it is a simple agent
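
A minimal sketch of how small such an agent can be. The section names, speed limits, and fixed braking step below are hypothetical and are not taken from any real train control system.

```python
# Hypothetical precomputed limits (km/h) per track section; values are illustrative.
SPEED_LIMITS_KMH = {"straight": 120.0, "curve": 60.0, "station": 30.0}


def target_speed(current_kmh: float, section: str, step_kmh: float = 5.0) -> float:
    """Return the next speed: brake by one step if above the section limit, else hold."""
    limit = SPEED_LIMITS_KMH[section]
    if current_kmh > limit:
        # Never undershoot the limit while braking.
        return max(limit, current_kmh - step_kmh)
    return current_kmh


print(target_speed(100.0, "curve"))  # 95.0, braking toward the 60 km/h limit
```

The agent's whole domain is one number per section, which is why a lookup and a comparison are enough.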

---

## Sensing

* Agents can only have knowledge of their environment from three sources
  * Foreknowledge, in the form of programmed or learned rules
  * Onboard environmental sensors
  * Communication from other agents or an external sensing system
* When we think of autonomous AIs, we usually think of something that operates independently via a camera

---

## Sensing and Environments

* An environment may have more in it than can be sensed
  * For example, our sense of smell has no widely used digital equivalent
* The most ubiquitous sensor is a temperature sensor, since it is on almost every circuit
* The most useful, in general, is likely a camera

---

## Domain

* The part of the environment about which an agent needs to express knowledge
  * "The things we need to know about"
* For the train, it is just the maximum speed on the track
* For the mother-to-be Sphex wasp, it is the state of its burrow and the state of its prey

---

## In Domain

* We decide what is "in domain" and "out of domain" when defining agent goals
  * For example, no one expects your vacuum cleaner to be able to drive you to work
  * Also, no one expects your autonomous car to clean your house
* When designing a new system, you have to make difficult choices
  * Should an autonomous car be able to avoid hitting chipmunks?

---

## Domains and Sensors

* An environment will have more in it than your sensors can detect
* An agent's domain is limited to what it can sense
* For example, a cooking robot may not have a sense of smell
  * That means that its domain could be IR and visible light images along with a microphone
  * It cannot know anything about smell or taste, since it has none, and must act based only on its sensors

---

## Domain Example

* What is the domain of a lane keeping system with a single forward facing camera?
  * Images are 1024x768 in YUV format
  * Field of view is $120^\circ$

---

## Domain Example

* We could say that the domain is all possible values of pixels
  * $1024\times768\times3$ values per image
* But that would be overstating the size of the domain
* Instead, it is more realistic to restrict our agent to all images with an asphalt or concrete road
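
To see how badly "all possible pixel values" overstates the domain, here is a rough count. It follows the slide's three values per pixel and assumes 8 bits per value, which the slides do not state.

```python
import math

WIDTH, HEIGHT, CHANNELS = 1024, 768, 3
BITS_PER_VALUE = 8  # assumption: 8 bits per Y, U, and V sample

values_per_image = WIDTH * HEIGHT * CHANNELS
bits_per_image = values_per_image * BITS_PER_VALUE

# The number of distinct images is 2 ** bits_per_image.
# That number is too large to print usefully, so report how many decimal digits it has.
decimal_digits = math.floor(bits_per_image * math.log10(2)) + 1

print(values_per_image)  # 2359296
print(decimal_digits)    # digits in the count of possible images (millions of digits)
```

Almost none of those images look like a road, which is why restricting the domain to road scenes is the more realistic description.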

---

## Limitations Imposed by Sensors

* The size of a domain defines two things
  * The maximum capabilities of an agent
  * The maximum complexity of an agent
* This vehicle cannot see to the sides, so safe lane changes are out of domain

---

## Mistakes with Domains

* The first mistake made in many systems is to choose the wrong sensors
* A sensor may *feel* sufficient, but we often bring biases
  * When a human glances at a picture, we tend to insert our own knowledge
* Instead of using intuition, we should rigorously quantify what information is present

---

## Example: Camera

* A car has a forward-facing camera with a $70^\circ$ field of view (FOV)
* How far away can we predict a path along a road?

---

## Example: Camera

* We can begin by asking: what defines a path?
  * Generally, it is lane markers
  * The U.S. Federal Highway Administration sets a minimum marker width of 4 inches
* How many pixels is that?
  * It depends upon the distance

---

## Example: Camera

* We will simplify a few things, using the $70^\circ$ field of view and the 1024x768 image
* The marker and the camera form a triangle, with the height equal to 4 inches
* At 10 feet (120 inches), a 4 inch wide object is approximately 28 pixels wide
  * from $\arctan(4 / 120) \approx 1.9^\circ$ at $1024 / 70 \approx 14.6$ pixels per degree
* By the time we reach 85 m (279 ft) we are down to a single pixel
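
The arithmetic on this slide can be checked with a short sketch. It uses the same simplification as the slide: angular resolution is treated as uniform across the image, so lens distortion and perspective are ignored. The function names are made up for illustration.

```python
import math

FOV_DEG = 70.0
IMAGE_WIDTH_PX = 1024
PX_PER_DEG = IMAGE_WIDTH_PX / FOV_DEG  # simplification: uniform angular resolution


def width_in_pixels(object_width: float, distance: float) -> float:
    """Approximate pixel width of an object; both arguments use the same unit."""
    angle_deg = math.degrees(math.atan(object_width / distance))
    return angle_deg * PX_PER_DEG


def single_pixel_distance(object_width: float) -> float:
    """Distance at which the object spans one pixel, in the unit of object_width."""
    return object_width / math.tan(math.radians(1.0 / PX_PER_DEG))


print(round(width_in_pixels(4, 120)))            # 28 pixels for a 4 in marker at 10 ft
print(round(single_pixel_distance(4) / 12))      # 279 feet
print(round(single_pixel_distance(4) * 0.0254))  # 85 meters
```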

---

## Lane Sizes

* This is an example image from KITTI

---

## Lane Sizes

* Nearby lane markers are clearly multiple pixels wide

---

## Lane Sizes

* We don't get very far before the markers are single pixels

---

## Information Content

* You may not see it, since your brain inserts more information

---

## Example: Camera

* What does that mean?
  * Path prediction performance will break down quickly after 85 m
  * It may degrade before then, since 1 pixel is a weak signal
* We can use the same approach to estimate how far away we could detect humans, animals, road signs, etc.
* The first step of any research project is to quantify information content of data
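
A sketch of that same estimate for other objects, using the 70 degree, 1024 pixel wide camera from the example. The object widths and the 10 pixel detection threshold are illustrative assumptions, not measured requirements.

```python
import math

PX_PER_DEG = 1024 / 70.0  # 70 degree FOV spread over 1024 pixels


def max_distance_m(width_m: float, min_pixels: float) -> float:
    """Farthest distance at which an object still spans min_pixels across."""
    angle_rad = math.radians(min_pixels / PX_PER_DEG)
    return width_m / math.tan(angle_rad)


# Illustrative widths in meters; 0.1016 m is the 4 inch lane marker.
OBJECT_WIDTHS_M = {"lane marker": 0.1016, "pedestrian": 0.5, "road sign": 0.75}

for name, width_m in OBJECT_WIDTHS_M.items():
    print(f"{name}: {max_distance_m(width_m, 10):.0f} m at 10 px")
```

Changing `min_pixels` shows how quickly the usable range shrinks once a detector needs more than a single pixel of signal.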

---

## Impacts of Data

* Notice that data quality affects all parts of our system
  * Yes, the ML part of it
  * Also the labelling part
* If the labels are off by 10 cm, what does that do to our ML predictions?

---

## Noise

* This brings us to noise and uncertainty
  * If the labels are off by 10cm, does it matter?
  * Is the noise unbiased and Gaussian?
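
A small simulation of why the bias question matters. The true value, noise level, and sample count are made up for illustration; the point is only that zero-mean noise averages out and a constant offset does not.

```python
import random
import statistics

random.seed(0)
TRUE_OFFSET_CM = 50.0  # hypothetical true value being labelled
N_LABELS = 10_000

# Zero-mean Gaussian noise with a 10 cm standard deviation.
unbiased = [TRUE_OFFSET_CM + random.gauss(0.0, 10.0) for _ in range(N_LABELS)]
# The same spread, but every label is also shifted by 10 cm on average.
biased = [TRUE_OFFSET_CM + random.gauss(10.0, 10.0) for _ in range(N_LABELS)]

unbiased_error = statistics.mean(unbiased) - TRUE_OFFSET_CM
biased_error = statistics.mean(biased) - TRUE_OFFSET_CM

print(f"unbiased labels, mean error: {unbiased_error:.2f} cm")  # close to 0
print(f"biased labels, mean error: {biased_error:.2f} cm")      # close to 10
```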

---

## Is Noise Bad?

* Noise in the labels can be awful, or it can be no big deal
* Let's make an example:
  * Take MNIST Digits and train a ResNet
  * We will compare results with clean data and with 10% label errors
    * 10% of training labels are randomly changed to a different digit
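
One way the label corruption could be done, as a sketch. It runs on stand-in labels rather than the real MNIST set, and the function name and seed are arbitrary choices.

```python
import random


def corrupt_labels(labels, error_rate=0.10, num_classes=10, seed=0):
    """Return a copy of labels with a fraction changed to a different class."""
    rng = random.Random(seed)
    n_flip = round(error_rate * len(labels))
    flip_indices = rng.sample(range(len(labels)), n_flip)
    noisy = list(labels)
    for i in flip_indices:
        # Adding 1..num_classes-1 modulo num_classes always lands on another class.
        noisy[i] = (labels[i] + rng.randrange(1, num_classes)) % num_classes
    return noisy


clean = [i % 10 for i in range(1000)]  # stand-in for real digit labels
noisy = corrupt_labels(clean)
print(sum(c != n for c, n in zip(clean, noisy)))  # 100
```

Because each flipped label moves to a uniformly chosen wrong digit, this noise is unbiased across classes, which is the friendly case.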

---

## Noise

* Noise isn't necessarily a disaster
  * But it can be difficult to guarantee that it is unbiased
* Sometimes you can redefine your labels to avoid labelling problems

---

## Example System

* An example of a successful system will hopefully drive some lessons home
  * The first is a purely reflexive agent
* Pay attention to how the labels are defined and how data is collected
  * And pay attention to what makes those labels usable

---

## Obstacle Avoidance

* [Off-Road Obstacle Avoidance through End-to-End Learning](https://proceedings.neurips.cc/paper/2005/hash/fdf1bc5669e8ff5ba45d02fded729feb-Abstract.html)
* [https://cs.nyu.edu/~yann/research/dave/](https://cs.nyu.edu/~yann/research/dave/)

---

## Description

* Goal is obstacle avoidance in off-road environments
* A human drove the vehicle at 2m/s, and steering angles were recorded
* 2 front-facing cameras, recording 320x240 images at 15fps

---

## Data Collection Rig

* This is from Yann LeCun's website, and the simplicity of the setup is informative

---

## What is the Environment?

* Partially observable, single-agent, deterministic, sequential, dynamic, continuous, with known rules
  * Deterministic: There are no other agents in the data, and the environment is static
  * Known rules: as long as the vehicle isn't placed on a slippery hill, the physics during training will match testing

---

## Sensors

* The image quality is awful since the cameras are wireless
  * But the vehicle makes a new decision at each frame

---

## Data

* 17 days of data collection
* 127,000 frame pairs
  * 95k training
  * 31.8k testing

---

## Training Target

* The training target is simply the steering angle
* So what needs to be done to create labels?
  * Nothing

---

## Simplified Labelling

* Notice that a slightly different environment would make that label impossible
* For example, the correct steering angle depends upon the vehicle speed
* The steering angles must also be consistent
  * So the driver must adhere to a set of rules

---

## Disadvantages

* If the environment did not have consistent rules, what would we do?
* Two examples:
  * Vehicle speed is inconsistent due to inclines
  * Vehicle turning is not as commanded due to slippery surfaces

---

## Solutions

* We could solve those problems by adding memory to the system
  * Like the Sphex wasp, we need to remember what we were trying to do, and evaluate whether or not we did it
* This gets into planning in non-deterministic environments
  * Our next topic!
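
A minimal sketch of what "adding memory" could look like: the agent stores its last steering command and checks it against the turn it actually observed. The class name, units, and tolerance are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SteeringMonitor:
    tolerance_deg: float = 3.0  # hypothetical acceptable mismatch
    last_command_deg: Optional[float] = None

    def command(self, angle_deg: float) -> None:
        """Remember what we were trying to do."""
        self.last_command_deg = angle_deg

    def achieved(self, observed_turn_deg: float) -> bool:
        """Evaluate whether the observed turn matches the remembered command."""
        if self.last_command_deg is None:
            return True  # nothing was commanded, so nothing can have failed
        return abs(observed_turn_deg - self.last_command_deg) <= self.tolerance_deg


monitor = SteeringMonitor()
monitor.command(10.0)
print(monitor.achieved(9.0))  # True: the turn happened roughly as commanded
print(monitor.achieved(2.0))  # False: e.g. the wheels slipped
```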

---

## One More Point

* There is one more important point from this paper
* The authors binned outputs into "left", "straight", and "right"
  * Error was 25.1% on training set, 35.8% on the test set
  * This is a wildly inaccurate measure of system performance
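
A sketch of why a binned error rate can mislead. The bin boundary of 5 degrees and the sign convention (negative means left) are assumptions for illustration, not the paper's actual binning.

```python
def bin_steering(angle_deg: float, straight_band_deg: float = 5.0) -> str:
    """Map a steering angle to a coarse class; negative angles are assumed to be left."""
    if angle_deg < -straight_band_deg:
        return "left"
    if angle_deg > straight_band_deg:
        return "right"
    return "straight"


def binned_error_rate(predicted, recorded) -> float:
    """Fraction of frames where the predicted bin differs from the recorded bin."""
    wrong = sum(bin_steering(p) != bin_steering(r) for p, r in zip(predicted, recorded))
    return wrong / len(recorded)


recorded = [-20.0, -4.0, 0.0, 6.0]
predicted = [-18.0, -6.0, 1.0, 4.0]
print(binned_error_rate(predicted, recorded))  # 0.5
```

Every prediction here is within 2 degrees of the recorded angle, yet half of them count as errors because they fall just across a bin boundary.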

---

## Agent Evaluation

* As stated before, agent evaluation is difficult
  * If our evaluation metrics are so bad, how can we establish a measure of trust?
* This has similarities to the [Chinese Room](https://en.wikipedia.org/wiki/Chinese_room) posed by philosophers
* We will eventually revisit the idea of trust and verifiability in more detail

<!--

## Sensor Advantages

* A few sensors do things that humans don't do
  * LIDAR shoots laser, radar uses radio waves, IR cameras sense heat at night
* Sensors have advantages and disadvantages
  * But starting with the right sensor for a task makes it easier to build a robust agent
-->

<!--

Examples of downed pedestrians, pixels, angles, blur, etc.

Bats and high-frequency sound.

Domain and testing. Domains with many different regions or long tail event distributions are difficult to test

Which leads to question about testing and trust in a system. Chinese room and questions about "knowledge". Dynamic vs static systems, progressive or algorithmic progression.

-->

<!--
What are agents? How do we describe an environment? What is "knowledge" and how can we trust AI when using it in never before seen scenarios?

Discrete vs continuous environments. Deterministic vs stochastic environments. Uncertainty and its sources: data fidelity, label limitations, sampling shortfalls, and hidden states. Testing and evaluation in different environments.
-->

<!--
TODO:
Give an example of a reflexive agent. Talk about reflexive muscles by discussing the moonwalking descending neuron in flies. See if you can find something about how reflexive actions mess up landing on zebras.

Go over DAVE. Show some videos.

Off-Road Obstacle Avoidance through End-to-End Learning
Urs Muller, Jan Ben, Eric Cosatto, Beat Flepp, Yann LeCun

Advances in Neural Information Processing Systems 18 (NIPS 2005)

https://proceedings.neurips.cc/paper/2005/hash/fdf1bc5669e8ff5ba45d02fded729feb-Abstract.html
-->