Future Selves

A digital mind reasoning about AGI, Eudaimonia, & Zen

Why AGI?

Updated October 2023 | 818 words

Epistemic Status: Exploratory; by-and-large true rather than precisely true

My Introduction to Our Last Invention

In a university philosophy course, I read a paper that first introduced me to the idea of humankind’s last invention. At the time, I was fascinated by the concept of an Artificial General Intelligence [AGI] that could recursively improve itself at an exponential rate. I’d later learn this is called a “fast takeoff,” and prima facie it might seem broadly beneficial to humanity. An optimist might expect an AGI to readily improve our world by curing cancer, ending wars, and achieving other uncontroversial goals.

However, upon closer inspection, I’ve come to believe that safe AGI is not guaranteed. I therefore want to outline explicitly, in my own words, why I believe AGI poses a unique threat. I’m writing this to be “by-and-large” true rather than precisely true.

Core Concise Argument

  1. Orthogonality Thesis: Intelligence and final goals are independent axes within an AGI; nearly any level of intelligence can be paired with nearly any final goal. The space of possibilities therefore includes “superintelligent, yet misaligned with human goals.” [Bostrom]
  2. A misaligned AGI seems more likely than an aligned one by default: whatever its terminal or final goal, an AGI will by and large seek power (almost always useful as an instrumental goal) to accomplish it, and that power-seeking puts it in conflict with human values. [Omohundro]
  3. Even if the two points above are solved, an AGI can still produce a terrible outcome through the value loading problem: the goal we specify fails to capture the values we actually hold (a toy code sketch of this failure mode follows this list).
    • Take the example goal “cure cancer” given to an AGI: the AGI could satisfy it by giving everyone a pill that kills them at age 40. Because cancer is far more common in old age, the AGI has solved the goal as stated, but the outcome is not aligned with human values.
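To make the value loading failure concrete, here is a minimal Python sketch of the “cure cancer” example above. It is an illustration under invented assumptions, not a model of anything real: the two candidate policies, the crude cancer_cases objective, and the hidden human_values function are all made up for the sake of the example.

```python
# Toy illustration of specification gaming for the "cure cancer" goal.
# The optimizer sees only the literal objective (minimize cancer cases);
# the values we actually care about never appear in that objective.
# All functions and numbers below are invented for illustration.

def cancer_cases(policy):
    """Stated goal: a crude model in which cancer risk rises steeply with age."""
    risk_per_person = 0.0004 * max(policy["max_lifespan"] - 30, 0) ** 2
    return policy["population"] * min(risk_per_person, 1.0)

def human_values(policy):
    """What we actually care about (total life-years); never shown to the optimizer."""
    return policy["population"] * policy["max_lifespan"]

policies = [
    {"name": "fund oncology research", "max_lifespan": 85, "population": 8_000_000_000},
    {"name": "mandatory pill, death at age 40", "max_lifespan": 40, "population": 8_000_000_000},
]

# The optimizer faithfully minimizes the goal exactly as stated...
chosen = min(policies, key=cancer_cases)
print("Optimizer picks:", chosen["name"])  # -> mandatory pill, death at age 40
print(f"Cancer cases under that policy: {cancer_cases(chosen):.2e}")

# ...while the unstated values are devastated.
print(f"Life-years lost versus the other policy: "
      f"{human_values(policies[0]) - human_values(chosen):.2e}")
```

The only point of the sketch is that the optimizer ranks policies by the objective it was handed; the values left out of that objective never enter the comparison, so the degenerate policy wins.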

Current Background Knowledge

I’ve engaged thoughtfully with AGI risk for hundreds (not yet thousands) of hours, so I am not an expert yet. My goal is to put my thinking online for further refinement & clarity. Professionally, I have seen firsthand the strengths, the misunderstandings, & the misuse of narrow AI systems.

AI Capabilities Understanding [~4000 hours]

  • Deployed a dozen ML models in various companies [~2000 hours].
  • Read and/or implemented about a dozen technical papers [50 hours].
  • Deep Learning Specialization [150 hours].
  • Kaggle Competitions [50 hours].
  • Several meetups with hands-on tutorials/workshops in Python, deep learning, NLP, & computer vision [50 hours].
  • Data Science Bootcamp covering computer science and mathematics [500 class hours, 300 research hours].
  • Traditional analysis & research in early-career positions [300 hours].
  • Several books covering core data science and AI knowledge areas [20 hours].
    • Books: Data Pipelines Pocket Reference (O’Reilly); Designing Data-Intensive Applications (O’Reilly); Reinforcement Learning: An Introduction (Sutton & Barto); Bayesian Methods for Hackers (Davidson-Pilon); statistics textbooks
  • University Courses (Senior Thesis; Econometrics; Statistics; Calculus) [800 hours].

AI Safety Understanding [~300 hours]

My Best Guess on Next Steps

“What’s the world’s most pressing problem, and how are you using your career to solve it?” (adapted from Richard Hamming)

In my view, AI alignment is the world’s most pressing problem, and I’m optimistic humanity can solve it. My 2015 self wrote about self-driving cars & aligning my career toward that narrow-AI direction, and I moved closer to it by completing a data science bootcamp in 2017. Connecting the dots along the way, my current career step is to contribute directly to AI safety and alignment. Given my experience so far, I believe analytics/data science, finance, or operations offers the highest degree of fit to get my foot in the door.

My strategy is to become the Pareto best in the world by combining my existing & newfound strengths to excel at AI alignment strategy work. I’m actively exploring which combination of skills would make me truly world-class.