# CS 530 - Lecture 01

## Principles of AI

Bernhard Firner

2026-01-20

---

## Course Details

* My email: `bfirner@cs.rutgers.edu`
* Canvas
* Office: Hill 273
* Office hours: TBD

---

## Syllabus: Book

* "Artificial Intelligence: A Modern Approach," fourth edition, by Russell and Norvig.
  * Recommended
  * Less depth than focused texts, but broad
  * Gives a good starting point
* I also recommend "The Mind of a Bee," by Lars Chittka
  * Bees have around 1 million neurons and can outperform anything we can build

---

## Syllabus: Topics

* Follows CS 440 or CS 520
* Precedes more advanced courses
  * CS [533](https://www.cs.rutgers.edu/academics/graduate/m-s-program/course-synopses/course-details/16-198-533-natural-language-processing) (natural language processing)
  * CS [535](https://www.cs.rutgers.edu/academics/graduate/m-s-program/course-synopses/course-details/16-198-535-pattern-recognition-theory-and-applications) (pattern recognition)
  * CS [536](https://www.cs.rutgers.edu/academics/graduate/m-s-program/course-synopses/course-details/16-198-536-machine-learning) (machine learning)
* Less focus on *implementation*, more focus on practical *application*

---

## Philosophy

* The topics may, at times, feel slightly philosophical
* They will also be driven by my own personal experiences
  * Mostly self-driving cars: [https://arxiv.org/pdf/2010.08776](https://arxiv.org/pdf/2010.08776)

---

## Syllabus: Tools and Languages

* Python, scikit-learn, NumPy, and PyTorch
  * [https://scikit-learn.org/stable/](https://scikit-learn.org/stable/)
  * [https://numpy.org/doc/stable/reference/index.html#reference](https://numpy.org/doc/stable/reference/index.html)
  * [https://pytorch.org/get-started/locally/](https://pytorch.org/get-started/locally/)
  * [https://docs.pytorch.org/docs/stable/index.html](https://docs.pytorch.org/docs/stable/index.html)
* Farama Foundation Gymnasium for RL Examples
  * [https://github.com/Farama-Foundation/Gymnasium](https://github.com/Farama-Foundation/Gymnasium)

---

## Structure

* Lectures (hopefully interactive)
* Examinations
* Projects
  * Will ask you to evaluate and plot more than implement
  * Final project will feel like preparatory work for a large-scale project

---

## Syllabus: Grading

* 15% Midterm
* 30% Final
* 20% Homework
* 35% Final Project and Report

---

## Academic Integrity

* [CS Academic Integrity Policy](https://www.cs.rutgers.edu/academics/undergraduate/academic-integrity-policy)
* Academic Integrity applies to both exams and assignments
* Violating the integrity policy negatively impacts your classmates, and current and future Rutgers students
  * So enforcement will be strict

---

## Assistance on Assignments

* Your work must be your own
* You may ask for advice from other students, on Canvas, in recitation, and in office hours
  * And you can use online material, including LLMs, to clarify any confusion
  * If you make heavy use of assistance, from a person, tool, or website, you must cite them

---

## Topics

* Introduction to agents and environments (ch 1-2)
* Probability and Planning with Uncertainty (ch 11-14)
* Evaluating Decisions and Actuation (ch 15-16)
* Learning Behaviors from Data (ch 19-23)

---

## Real-World Progress

* Progress in real-world academic and industry research is an arduous journey
  * It takes a deep understanding of the problem domain, available sensors and actuators, and data

---

## Diving In

* Let's start off with agents and environments
  * Chapters 1-2 in Russell and Norvig's book

---

## Agents

* What are agents?
  * They interact with an environment
  * They perceive and they act
* What is not an agent?
  * A program that takes in a medical image and prints out a list of diagnoses and probabilities
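
A minimal sketch of that distinction. Everything here is hypothetical and invented for illustration: `LaneWorld` is a toy one-dimensional environment, `steering_agent` is a trivial policy, and `classify_image` stands in for the non-agent program.

```python
from dataclasses import dataclass


@dataclass
class LaneWorld:
    """Toy environment: the state is the car's lateral offset from lane center (m)."""
    offset_m: float = 1.0

    def percept(self) -> float:
        # What the agent senses; in this toy world it is the offset itself.
        return self.offset_m

    def apply(self, steer_m: float) -> None:
        # Acting changes the environment, and therefore the next percept.
        self.offset_m += steer_m


def steering_agent(offset_m: float) -> float:
    """Agent policy: steer back toward the center by half the perceived offset."""
    return -0.5 * offset_m


def classify_image(pixels: list) -> dict:
    """Not an agent: maps an input to an output and never acts on its source."""
    score = sum(pixels) / len(pixels)
    return {"finding": score, "no finding": 1.0 - score}


def run(world: LaneWorld, steps: int) -> list:
    """Perceive-act loop; returns the sequence of percepts the agent saw."""
    seen = []
    for _ in range(steps):
        p = world.percept()
        seen.append(p)
        world.apply(steering_agent(p))
    return seen


percepts = run(LaneWorld(offset_m=1.0), steps=4)
print(percepts)  # every percept after the first depends on an earlier action
```

The classifier can be scored on a fixed set of images. The steering policy cannot, because changing the policy changes the percepts it will be shown.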

---

## Agent Difficulties

* Interaction with the environment means that an agent can influence the data that it sees
  * Imagine a self-driving car
    * The current view of the environment is influenced by past steering
* This makes evaluation difficult

---

## Ramifications

* Agent actuation actually changes the environment

---

## Evaluating an Agent

* Let's say that I have a system that predicts the proper steering angle for a vehicle
* Through extensive testing, over multiple continents, weather conditions, and lighting, I can guarantee that the steering angle predicted has an error of at most $\frac{1}{100}$ of a degree
* Is this a good agent?

---

## Evaluating an Agent

* If the agent is *biased*, then it could steer to the left by $\frac{1}{100}$ of a degree at all times
  * It won't be long before we crash
* If the agent has too much latency, it will not be able to track a curve
  * Its late corrections will also lead to oscillations
* Agents must be evaluated on their successful interaction with their environment
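
Both failure modes can be sketched numerically. This is a rough illustration under stated assumptions, not a vehicle model anyone should rely on: the bias is treated as a front-wheel angle in a kinematic bicycle model, the speed (30 m/s), wheelbase (2.7 m), gain (0.5), and delay (3 steps) are made-up values, and nothing corrects the drift.

```python
import math


def lateral_drift_m(bias_deg, speed_mps, wheelbase_m, duration_s, dt=0.01):
    """Lateral offset of a kinematic bicycle model driven with a constant steering bias."""
    delta = math.radians(bias_deg)
    y = heading = 0.0
    for _ in range(round(duration_s / dt)):
        y += speed_mps * math.sin(heading) * dt
        heading += speed_mps / wheelbase_m * math.tan(delta) * dt
    return y


def delayed_correction(gain, delay_steps, steps=40, start=1.0):
    """Proportional correction x[t+1] = x[t] - gain * x[t - delay_steps]."""
    history = [start] * (delay_steps + 1)  # the error was `start` before we began
    for _ in range(steps):
        stale = history[-1 - delay_steps]  # the old measurement the agent acts on
        history.append(history[-1] - gain * stale)
    return history[delay_steps:]


def sign_changes(values):
    """Count zero crossings, a crude indicator of oscillation."""
    signs = [v > 0 for v in values if v != 0]
    return sum(a != b for a, b in zip(signs, signs[1:]))


# Assumed values: 30 m/s (about 67 mph), 2.7 m wheelbase, 6 s with no correction.
print(f"drift after 6 s: {lateral_drift_m(0.01, 30.0, 2.7, 6.0):.2f} m")
print("zero crossings, no delay:", sign_changes(delayed_correction(0.5, 0)))
print("zero crossings, 3-step delay:", sign_changes(delayed_correction(0.5, 3)))
```

Under these assumptions the biased car drifts about one meter sideways in six seconds, and the same correction gain that converges smoothly without delay overshoots repeatedly once its measurements are three steps old.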

---

## Another Example

* I have a lane detection system that is always correct to within 1 cm, out to 300 meters
  * Can I build an autonomous vehicle?

---

## Sufficient

* Lane markers are highly correlated with where vehicles drive
  * But humans ignore them all the time, when necessary
* Obviously we change lanes
  * Also drive around double-parked cars and pedestrians
  * We also drive in places where there are no lane markers
* Perfect knowledge of lanes is insufficient to drive

---

## Environments and Metrics

* It is tempting to simplify a problem by evaluating metrics on the *environment* rather than on the *agent*
  * This divides the problem into multiple pieces
  * Usually a good software engineering approach
* Software engineering is able to decouple components
* But when an agent interacts with an environment, they are tightly coupled

---

## Sensing the Environment

* Cameras, radar, lidar, ultrasonics, GPS, accelerometer, thermometer, barometer, magnetometer, microphone, etc.
* Nearly all sensors are *discrete* in time
  * This is a problem, as the world is continuous

---

## Discrete Vs Continuous

* In a game of checkers or chess, squares are either occupied or not
* In the real world, things are messier
  * Pay attention to how you park, shifting from one side of the space to the other depending upon the adjacent vehicles
* Not only that, moves in chess are instantaneous: you cannot intercept a rook with your pawn as it moves across the board
  * If you've ever attempted to merge in traffic, you must have noticed that driving is continuous in both time and space

---

## Static Vs Dynamic

* This brings up another distinction
  * A chessboard is static, meaning that it does not change as your agent deliberates
    * Also true for some continuous environments, such as a factory
* Not true for driving, cyber intrusion detection, a robotic catheter, etc.
* Sometimes we are caught in between
  * Playing chess on a timer, for example, makes the game semi-dynamic

---

## Ramifications

* Actuation is also (generally) a continuous activity
  * e.g. the car keeps moving, unlike a chess piece
* An agent's updates are generally limited by its sensors
  * So, perhaps 30fps for cameras
  * We also need to "fuse" different sensors, despite their different sensing rates

---

## Solutions

* Discrete predictions are ill-suited for continuous environments
  * Steering angles for a car, or instantaneous yaw-pitch-roll commands for a drone
* Instead, we would be better off predicting a continuous path
  * Or a path with enough points on it that we could interpolate
* The agent then issues actuation commands that attempt to follow the path
  * This transforms our discrete predictions into continuous actuation
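
A minimal sketch of the interpolation step, assuming the prediction arrives as timestamped waypoints and that linear interpolation is good enough. The function name `interpolate_path`, the waypoint values, and the 100 Hz controller rate are all hypothetical.

```python
from bisect import bisect_right


def interpolate_path(times_s, points, t):
    """Linearly interpolate an (x, y) position along timed waypoints."""
    if t <= times_s[0]:
        return points[0]
    if t >= times_s[-1]:
        return points[-1]
    i = bisect_right(times_s, t)          # index of the first waypoint after t
    t0, t1 = times_s[i - 1], times_s[i]
    w = (t - t0) / (t1 - t0)              # fraction of the way to that waypoint
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))


# Hypothetical prediction: four waypoints, 0.5 s apart, in vehicle coordinates (m).
times_s = [0.0, 0.5, 1.0, 1.5]
points = [(0.0, 0.0), (10.0, 0.0), (20.0, 1.0), (30.0, 3.0)]

# A controller running at 100 Hz can ask for a target at any instant,
# even though the predictor only produced four points.
for step in range(0, 150, 25):
    t = step / 100.0
    print(t, interpolate_path(times_s, points, t))
```

The predictor and the actuation loop no longer have to run at the same rate: a new path can arrive at camera rate while the controller keeps following the most recent one.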

---

## Forced Decision Cadence

* Physical systems often force decisions at some rate
* For example, self-driving cars must keep steering if they aren't sitting still
* Another example: automatically steered catheters
  * [https://journals.sagepub.com/doi/abs/10.1177/0278364920903785](https://journals.sagepub.com/doi/abs/10.1177/0278364920903785)

---

## Discrete Environments

* Many digital systems are discrete (although still complicated)
* Network security, such as agents that defend or exploit cyber systems
  * See [https://cage-challenge.github.io/cage-challenge-4/](https://cage-challenge.github.io/cage-challenge-4/)
* Video games

---

## Assembly Line Example

* Sometimes, when we can control an environment, we can simplify a continuous world to make it seem discrete
* Consider a fully-automated assembly line
  * The assembly line can be made to limit activity until each robotic action completes
  * The individual actions may be continuous, but the steps become fully discrete
* This is similar to using clock-driven sequential logic rather than asynchronous logic

---

## Determinism

* Is the environment fully known?
* Many are not
  * Too much complexity to measure
  * Sensors are insufficient
  * The world is stochastic in nature
* Outside of games, most environments are not deterministic

---

## More Details

* Even some things that seem deterministic are not, in practice
* For example, a game using pseudo-random numbers is technically deterministic
  * But, because we do not know the RNG state, it won't seem that way
  * This is a problem of observability
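
A small illustration using Python's standard `random` module. The dice game and the seed values are made up; the point is only that the outcomes are a pure function of hidden state.

```python
import random


def roll_dice(seed, n):
    """Roll n six-sided dice from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(n)]


# Same hidden state (seed) gives identical "random" outcomes: deterministic.
assert roll_dice(seed=530, n=10) == roll_dice(seed=530, n=10)

# A player who cannot observe the seed sees only the rolls, which look stochastic.
print(roll_dice(seed=530, n=10))
print(roll_dice(seed=531, n=10))
```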

---

## Observability

* Hidden state means that an environment is only partially observable
* Most environments are not entirely observable
  * Any sensor has limited temporal and spatial resolution, for example

---

## Number of Agents

* The number of agents can also make an environment more difficult
  * Single-agent environments are comparatively straightforward
  * So most of the problems we still want to work on will be multi-agent
* Driving is, once again, a great example

---

## Cooperativity

* Are other drivers cooperative? Competitive?
  * Probably easiest to describe them as a mix
* For a non-game example of a competitive system, consider high-frequency trading agents operating in a financial market

---

## Episodic Vs Sequential

* Games reset themselves to a known starting state
  * This makes the decision space easier to explore
* The real world is sequential, with the current state always dependent upon the previous ones
* If possible, we want a way to "reset" an environment, but that is often impossible

---

## Known Vs Unknown

* This is the final attribute to consider
* Do we know all of the rules of our environment?
  * This determines our ability to simulate something
  * And that determines how well reinforcement learning (RL) will work

---

## Known Vs Unknown

* The rules of chess, shogi, and go are all known
  * So, even with huge state spaces, reinforcement learning is effective
* How about driving or the stock market?
  * Anything can happen
* If our simulation is an approximation, then RL can be insufficient
  * And *that* means that we'll have to actually collect real world data

---

## Worst-Case Environment

* In the worst case, we have to deal with a partially observable, multi-agent, non-deterministic, sequential, dynamic, continuous environment with unknown rules
* Notice that this includes autonomous driving, drone delivery systems, robotic housekeepers, and so on
  * I would not expect any of those problems to be fully solved any time soon

---

## Reflex Agents

* Since environments are so complicated, sometimes it makes sense to simplify things
* A **simple reflex agent** only cares about the current state of an environment
  * For many fundamental rules, this is fine
  * Slam on the brakes if the car in front of you suddenly stops, for example

---

## Failures of Reflex Agents

* Reflex agents fail when the environment is not fully observable
* Imagine a cleaning robot that goes into one room and cleans it
  * Now what? It needs to remember if the last room was also clean.
  * If not, it will wander forever, never knowing if its task is complete
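
A toy version of the cleaning robot, assuming a two-room world in which the robot can only sense the room it is in and cleaned rooms stay clean. The names `reflex_agent`, `RememberingAgent`, and `run` are hypothetical.

```python
def reflex_agent(percept):
    """Simple reflex agent: decides from the current percept only."""
    room, dirty = percept
    if dirty:
        return "clean"
    return "move"


class RememberingAgent:
    """Keeps state: the set of rooms it has seen clean."""

    def __init__(self, rooms):
        self.rooms = set(rooms)
        self.known_clean = set()

    def __call__(self, percept):
        room, dirty = percept
        if dirty:
            return "clean"
        self.known_clean.add(room)
        if self.known_clean == self.rooms:
            return "stop"  # every room has been observed clean
        return "move"


def run(agent, dirty_rooms, max_steps=20):
    """Two-room world ('A', 'B'); returns the list of actions taken."""
    dirt = set(dirty_rooms)
    room, actions = "A", []
    for _ in range(max_steps):
        action = agent((room, room in dirt))
        actions.append(action)
        if action == "stop":
            break
        if action == "clean":
            dirt.discard(room)
        else:
            room = "B" if room == "A" else "A"
    return actions


print(len(run(reflex_agent, {"A", "B"})))      # uses every step it is given
print(run(RememberingAgent("AB"), {"A", "B"}))  # stops once both rooms are known clean
```

The reflex agent cleans both rooms just as quickly, but it has no way to conclude that it is finished, so it shuttles between clean rooms until the step limit cuts it off.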

---

## Randomness as a Solution

* *Velella velella* is a hydrozoan that drifts on the surface of the ocean
  * [Citizen Science Writeup](https://theoryandpractice.citizenscienceassociation.org/articles/10.5334/cstp.847)
* They are pushed by the wind and cannot steer themselves, so they often end up beached
  * Some sails are angled to the left and others to the right, so they cannot all suffer the same fate

---

## Planning

* Randomness works with swarm systems, but is a poor choice for most agents
* Planning is a better alternative
  * Evaluate the environment, then formulate some plan
    * Plan could be a path or a sequence of actions
* The agent evaluates its progress, allowing it to identify deficiencies
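
As one concrete and deliberately simple planner, here is a breadth-first search over a small grid. The slide does not name an algorithm, so BFS is an assumption chosen for brevity, and the grid, `plan`, and `execute` are hypothetical.

```python
from collections import deque

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def plan(grid, start, goal):
    """Breadth-first search; returns a list of moves, or None if unreachable."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        (r, c), actions = frontier.popleft()
        if (r, c) == goal:
            return actions
        for name, (dr, dc) in MOVES.items():
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if inside and grid[nr][nc] != "#" and (nr, nc) not in visited:
                visited.add((nr, nc))
                frontier.append(((nr, nc), actions + [name]))
    return None


def execute(actions, start):
    """Apply a plan and report where the agent ends up, so progress can be checked."""
    r, c = start
    for name in actions:
        dr, dc = MOVES[name]
        r, c = r + dr, c + dc
    return (r, c)


# '#' marks a blocked cell; the plan is a sequence of actions from start to goal.
grid = ["..#",
        "..#",
        "..."]
actions = plan(grid, start=(0, 0), goal=(2, 2))
print(actions, execute(actions, (0, 0)))
```

Comparing the result of `execute` against the goal is the "evaluate progress" step: in a real environment the outcome may differ from the plan, and that mismatch is what tells the agent to replan.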

---

## Metacognition

* This is a term that means "thoughts about thinking"
* Agents must do more than simply interact with an environment
  * They must take feedback from the environment after acting
  * They must hold on to state
* A planning agent is able to evaluate its own actions, going beyond a reflexive agent

---

## Biological Agent Example

* Sphex wasp

---

## Another Driving Example

* Let's say that your desired speed is 65 mph, but you are only moving at 55 mph
* $55 < 65$, so you push down the pedal
* You don't go any faster
  * Response?

---

## Possible Responses

* We could evaluate $55 < 65$ in a loop, pushing down the pedal until it is fully depressed
* A better agent would evaluate why we aren't accelerating
  * Maybe we are missing the pedal with our foot?
  * Perhaps we are going up a steep incline and cannot expect to reach 65 mph?
  * Or perhaps we have lost traction and we should not accelerate
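
The two responses can be contrasted in a few lines. This is a sketch only: the pedal is modeled as a number in [0, 1], the 0.1 step size is arbitrary, and `wheel_slip` is a hypothetical sensor reading that a real vehicle may or may not expose.

```python
def naive_controller(speed_mph, target_mph, pedal):
    """Reflex rule: if too slow, press harder (pedal is in [0, 1])."""
    if speed_mph < target_mph:
        return min(1.0, pedal + 0.1)
    return pedal


def reflective_controller(speed_mph, target_mph, pedal, prev_speed_mph, wheel_slip):
    """Checks whether the last pedal command had any effect before pressing harder."""
    if speed_mph >= target_mph:
        return pedal, "at target"
    if wheel_slip:
        return max(0.0, pedal - 0.1), "lost traction: ease off"
    if pedal > 0.0 and speed_mph <= prev_speed_mph:
        return pedal, "no response: check pedal contact or incline"
    return min(1.0, pedal + 0.1), "accelerating: press harder"


# The speed never changes, so the naive rule just keeps pressing.
pedal = 0.3
for _ in range(10):
    pedal = naive_controller(55, 65, pedal)
print("naive pedal:", pedal)

# The reflective rule notices that its action had no effect and reports why.
print(reflective_controller(55, 65, 0.3, prev_speed_mph=55, wheel_slip=False))
print(reflective_controller(55, 65, 0.3, prev_speed_mph=55, wheel_slip=True))
```

The second controller needs two things the first does not: memory of the previous speed and a model of what its action should have done.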

---

## Exploring the Environment

* If we want to know if our agent properly deals with those kinds of complicated situations, we must explore our environment
  * Here, environment means everything that our agent can experience through its sensors
* We need a good sampling of the environment for testing, at least
* Most modern techniques also require data for training

---

## The Data Problem

* Current machine learning techniques are data hungry
  * Put to shame by biological systems
* A few hundred hours is sufficient training for a bee to forage for nectar, protect its hive, and build honeycombs
  * Sure, some of that is instinctive behavior, but all of it is fine-tuned for each specific individual
  * And if we knew how to hard-code any "instincts" into our learning systems we would

---

## Course Goal

* Learning good approaches to AI agent development is the desired outcome
  * What problems can AI solve and when does current AI struggle?
  * What can we do to simplify problems so that AI can solve them?
  * What specific AI techniques can we apply to different environments?