AI Safety Policy Fellowship

CAISH is running a 6-week reading group on the foundational policy and governance issues posed by the development of advanced AI systems. This includes issues like AI-induced explosive growth, preventing the proliferation of dangerous AI models, and avoiding outcomes where AI development concentrates unprecedented wealth in a single company. We meet weekly in small groups led by experienced facilitators, with dinner provided.

Applications for the Spring 2024 program are now open. Deadline: 26 January, 23:59 GMT.

The curriculum

The curriculum is based on the AI Governance Fundamentals program developed by Ben Garfinkel (Director of the Centre for the Governance of AI) and others.

  • Many leading AI scientists and entrepreneurs have called for mitigating “the risk of extinction from AI” to be “a global priority alongside other societal-scale risks such as pandemics and nuclear war.” This week, we delve into what exactly the alignment problem is.

  • This week, we examine the methods used to predict AI development timelines and how reliable they are. We discuss the use of biological anchors, compute trends, and ML algorithmic progress trends as constraints on AI development.

  • We discuss the potential consequences of an intelligence explosion, including the compute-centric framework of AI takeoff speeds, what happens to the economy when intelligence becomes too cheap to meter, and how the offense-defense balance scales as investment in a conflict increases.

  • We discuss the distinct regulatory challenges posed by “frontier AI models”: dangerous capabilities can arise unexpectedly; it is difficult to robustly prevent a deployed model from being misused; and it is difficult to stop a model's capabilities from proliferating broadly. We discuss methods for evaluating frontier models’ capabilities and risks, and the challenge of making those evaluations trustworthy.

  • Even if some governments enact excellent AI regulations, AI developers in countries that do not set adequate guardrails could still cause global damage. This week and the next, we’ll consider: how can policymakers guard against that? Two (complementary) approaches are: (i) slowing the proliferation of advanced AI capabilities beyond countries with adequate guardrails (after some countries establish such guardrails); and (ii) reaching agreements in which multiple governments put in place adequate guardrails. These measures could each give regulated AI labs more (and hopefully enough) time to develop critical safety methods and defensive technologies. That could reduce risks from AI deployments within any country. Aside from these approaches, domestic regulations in any country could have international influence by creating economic incentives and setting examples.

    This week, we’ll study the first of the above approaches: non-proliferation. In particular, we’ll focus on some policy levers that may help achieve this. Note there are also potential limitations and downsides to the non-proliferation approach: it may be difficult to predict which states will set adequate guardrails on AI; accelerating some AI developers could leave less time for developing safety methods; and adversarial approaches to AI policy could motivate counter-investments and (if a lead is maintained through aggressive government actions) potentially lead to international conflict. For these reasons, some see non-proliferation as a backup option, for cases where cooperation fails.

  • This week, we will study a more cooperation-oriented approach to putting guardrails on AI: states could establish international agreements on AI safety regulations. Such agreements can take many forms, varying in the number of states involved, sources of incentive, formality, and method of eliciting compliance. This week’s readings emphasize broader context on international diplomacy, as well as how enforcement and verification of compliance with AI safety agreements could be made reliable, narrowly targeted, and privacy-preserving.

Course details

  • We are excited for people of diverse backgrounds to take part. This includes fellows with backgrounds in law, political science, and policy, as well as fellows with technical backgrounds. We think AI policy benefits from a wide range of expertise, and we encourage people with backgrounds other than those mentioned above to apply as well.

  • Meetings will be hosted at our office in Cambridge. Each weekly session lasts two hours and consists of a mixture of reading and discussion of the materials. No preparation is required, as all reading is done in-session. Your facilitator will guide the conversation and answer questions. Dinner is provided.

  • We expect to accept between a quarter and half of applicants.

  • Facilitators are CAISH members with research experience in AI safety or governance, usually a mix of PhDs, Master’s students, and undergraduate finalists.

  • Fellows at all levels of study are welcome: we’ve had a mix of undergrads, Master’s students, PhDs, and postdocs. We will likely group people with similar backgrounds and experience levels.

  • If you already have experience in AI safety or governance, please contact hello@cambridgeaisafety.org to discuss other ways of getting involved through CAISH. We encourage you to check out our technical programs, and we would be glad to run events tuned for a more experienced audience.