• “The hostile telepaths problem” by Valentine
    Oct 28 2024
    Epistemic status: model-building based on observation, with a few successful unusual predictions. Anecdotal evidence has so far been consistent with the model. This puts it at risk of seeming more compelling than the evidence justifies just yet. Caveat emptor.

    Imagine you're a very young child. Around, say, three years old.

    You've just done something that really upsets your mother. Maybe you were playing and knocked her glasses off the table and they broke.

    Of course you find her reaction uncomfortable. Maybe scary. You're too young to have detailed metacognitive thoughts, but if you could reflect on why you're scared, you wouldn't be confused: you're scared of how she'll react.

    She tells you to say you're sorry.

    You utter the magic words, hoping that will placate her.

    And she narrows her eyes in suspicion.

    "You sure don't look sorry. Say it and mean it."

    Now you have a serious problem. [...]

    ---

    Outline:

    (02:16) Newcomblike self-deception

    (06:10) Sketch of a real-world version

    (08:43) Possible examples in real life

    (12:17) Other solutions to the problem

    (12:38) Having power

    (14:45) Occlumency

    (16:48) Solution space is maybe vast

    (17:40) Ending the need for self-deception

    (18:21) Welcome self-deception

    (19:52) Look away when directed to

    (22:59) Hypothesize without checking

    (25:50) Does this solve self-deception?

    (27:21) Summary

    The original text contained 7 footnotes which were omitted from this narration.

    ---

    First published:
    October 27th, 2024

    Source:
    https://www.lesswrong.com/posts/5FAnfAStc7birapMx/the-hostile-telepaths-problem

    ---

    Narrated by TYPE III AUDIO.

    29 mins
  • “A bird’s eye view of ARC’s research” by Jacob_Hilton
    Oct 27 2024
    This post includes a "flattened version" of an interactive diagram that cannot be displayed on this site. I recommend reading the original version of the post with the interactive diagram, which can be found here.

    Over the last few months, ARC has released a number of pieces of research. While some of these can be independently motivated, there is also a more unified research vision behind them. The purpose of this post is to try to convey some of that vision and how our individual pieces of research fit into it.

    Thanks to Ryan Greenblatt, Victor Lecomte, Eric Neyman, Jeff Wu and Mark Xu for helpful comments.

    A bird's eye view

    To begin, we will take a "bird's eye" view of ARC's research.[1] As we "zoom in", more nodes will become visible and we will explain the new nodes.

    An interactive version of the [...]

    ---

    Outline:

    (00:43) A bird's eye view

    (01:00) Zoom level 1

    (02:18) Zoom level 2

    (03:44) Zoom level 3

    (04:56) Zoom level 4

    (07:14) How ARC's research fits into this picture

    (07:43) Further subproblems

    (10:23) Conclusion

    The original text contained 2 footnotes which were omitted from this narration.

    The original text contained 3 images which were described by AI.

    ---

    First published:
    October 23rd, 2024

    Source:
    https://www.lesswrong.com/posts/ztokaf9harKTmRcn4/a-bird-s-eye-view-of-arc-s-research

    ---

    Narrated by TYPE III AUDIO.

    11 mins
  • “A Rocket–Interpretability Analogy” by plex
    Oct 25 2024
    1.

    4.4% of the US federal budget went into the space race at its peak.

    This was surprising to me, until a friend pointed out that landing rockets on specific parts of the moon requires very similar technology to landing rockets on Soviet cities.[1]

    I wonder how much more enthusiastic the scientists working on Apollo were, with the convenient motivating story of “I’m working towards a great scientific endeavor” vs “I’m working to make sure we can kill millions if we want to”.

    2.

    The field of alignment seems to be increasingly dominated by interpretability (and obedience[2]).

    This was surprising to me[3], until a friend pointed out that partially opening the black box of NNs is the kind of technology that would let scaling labs find new unhobblings, by noticing ways in which the internals of their models are being inefficient and by having better tools to evaluate capabilities advances.[4]

    I [...]

    ---

    Outline:

    (00:03) 1.

    (00:35) 2.

    (01:20) 3.

    The original text contained 6 footnotes which were omitted from this narration.

    ---

    First published:
    October 21st, 2024

    Source:
    https://www.lesswrong.com/posts/h4wXMXneTPDEjJ7nv/a-rocket-interpretability-analogy

    ---

    Narrated by TYPE III AUDIO.

    3 mins
  • “I got dysentery so you don’t have to” by eukaryote
    Oct 24 2024
    This summer, I participated in a human challenge trial at the University of Maryland. I spent the days just prior to my 30th birthday sick with shigellosis.

    What? Why?

    Dysentery is an acute disease in which pathogens attack the intestine. It is most often caused by the bacteria Shigella. It spreads via the fecal-oral route. It requires an astonishingly low number of pathogens to make a person sick – so it spreads quickly, especially in bad hygienic conditions or anywhere water can get tainted with feces.

    It kills about 70,000 people a year, 30,000 of whom are children under the age of 5. Almost all of these cases and deaths are among very poor people.

    The primary mechanism by which dysentery kills people is dehydration. The person loses fluids to diarrhea and for whatever reason (lack of knowledge, energy, water, etc.) cannot regain them sufficiently. Shigella bacteria are increasingly [...]

    ---

    Outline:

    (00:15) What? Why?

    (01:18) The deal with human challenge trials

    (02:46) Dysentery: it's a modern disease

    (04:27) Getting ready

    (07:25) Two days until challenge

    (10:19) One day before challenge: the age of phage

    (11:08) Bacteriophage therapy: sending a cat after mice

    (14:14) Do they work?

    (16:17) Day 1 of challenge

    (17:09) The waiting game

    (18:20) Let's learn about Shigella pathogenesis

    (23:34) Let's really learn about Shigella pathogenesis

    (27:03) Out the other side

    (29:24) Aftermath

    The original text contained 3 footnotes which were omitted from this narration.

    The original text contained 2 images which were described by AI.

    ---

    First published:
    October 22nd, 2024

    Source:
    https://www.lesswrong.com/posts/inHiHHGs6YqtvyeKp/i-got-dysentery-so-you-don-t-have-to

    ---

    Narrated by TYPE III AUDIO.

    32 mins
  • “Overcoming Bias Anthology” by Arjun Panickssery
    Oct 23 2024
    This is a link post.

    Part 1: Our Thinking

    Near and Far

    1 Abstract/Distant Future Bias

    2 Abstractly Ideal, Concretely Selfish

    3 We Add Near, Average Far

    4 Why We Don't Know What We Want

    5 We See the Sacred from Afar, to See It Together

    6 The Future Seems Shiny

    7 Doubting My Far Mind

    Disagreement

    8 Beware the Inside View

    9 Are Meta Views Outside Views?

    10 Disagreement Is Near-Far Bias

    11 Others' Views Are Detail

    12 Why Be Contrarian?

    13 On Disagreement, Again

    14 Rationality Requires Common Priors

    15 Might Disagreement Fade Like Violence?

    Biases

    16 Reject Random Beliefs

    17 Chase Your Reading

    18 Against Free Thinkers

    19 Eventual Futures

    20 Seen vs. Unseen Biases

    21 Law as No-Bias Theatre

    22 Benefit of Doubt = Bias

    Part 2: Our Motives

    Signaling

    23 Decision Theory Remains Neglected

    24 What Function Music?

    25 Politics isn't about Policy

    26 Views [...]

    ---

    Outline:

    (00:07) Part 1: Our Thinking

    (00:12) Near and Far

    (00:37) Disagreement

    (01:04) Biases

    (01:28) Part 2: Our Motives

    (01:33) Signaling

    (02:01) Norms

    (02:35) Fiction

    (02:58) The Dreamtime

    (03:19) Part 3: Our Institutions

    (03:25) Prediction Markets

    (03:48) Academia

    (04:06) Medicine

    (04:15) Paternalism

    (04:29) Law

    (05:21) Part 4: Our Past

    (05:26) Farmers and Foragers

    (05:55) History as Exponential Modes

    (06:09) The Great Filter

    (06:35) Part 5: Our Future

    (06:39) Aliens

    (07:01) UFOs

    (07:22) The Age of Em

    (07:44) Artificial Intelligence

    ---

    First published:
    October 20th, 2024

    Source:
    https://www.lesswrong.com/posts/JxsJdBnL2gG5oa2Li/overcoming-bias-anthology

    ---

    Narrated by TYPE III AUDIO.

    9 mins
  • “Arithmetic is an underrated world-modeling technology” by dynomight
    Oct 22 2024
    Of all the cognitive tools our ancestors left us, what's best? Society seems to think pretty highly of arithmetic. It's one of the first things we learn as children. So I think it's weird that only a tiny percentage of people seem to know how to actually use arithmetic. Or maybe even understand what arithmetic is for. Why?

    I think the problem is the idea that arithmetic is about “calculating”. No! Arithmetic is a world-modeling technology. Arguably, it's the best world-modeling technology: It's simple, it's intuitive, and it applies to everything. It allows you to trespass into scientific domains where you don’t belong. It even has an amazing error-catching mechanism built in.

    One hundred years ago, maybe it was important to learn long division. But the point of long division was to enable you to do world-modeling. Computers don’t make arithmetic obsolete. If anything, they do the opposite. Without [...]
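
    As a concrete taste of "arithmetic as world-modeling", here is a minimal sketch in Python (not an example from the post; the question and all numbers are illustrative assumptions): a Fermi estimate whose unit bookkeeping acts as the built-in error-catching mechanism the excerpt mentions.

        # Fermi estimate: roughly how much water does one person drink in a lifetime?
        # Each quantity carries its unit in a comment; the check is that units cancel.

        liters_per_day = 2       # L/day, rough adult intake (assumed)
        days_per_year = 365      # day/year
        years_per_life = 80      # year/life, assumed lifespan

        # (L/day) * (day/year) * (year/life) = L/life: day and year cancel,
        # so the product really is liters per lifetime.
        liters_per_life = liters_per_day * days_per_year * years_per_life

        print(f"{liters_per_life:,} liters")  # 58,400 liters, about 58 cubic meters

    Had the units failed to cancel (say, multiplying by days twice), the mismatch would flag the mistake before the number could mislead anyone.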

    ---

    Outline:

    (01:17) Chimps

    (06:18) Big blocks

    (09:34) More big blocks

    The original text contained 5 images which were described by AI.

    ---

    First published:
    October 17th, 2024

    Source:
    https://www.lesswrong.com/posts/r2LojHBs3kriafZWi/arithmetic-is-an-underrated-world-modeling-technology

    ---

    Narrated by TYPE III AUDIO.

    12 mins
  • “My theory of change for working in AI healthtech” by Andrew_Critch
    Oct 15 2024
    This post starts out pretty gloomy but ends up with some points that I feel pretty positive about. Day to day, I'm more focussed on the positive points, but awareness of the negative has been crucial to forming my priorities, so I'm going to start with those. It's mostly addressed to the EA community, but is hopefully somewhat of interest to LessWrong and the Alignment Forum as well.

    My main concerns

    I think AGI is going to be developed soon, and quickly. Possibly (20%) that's next year, and most likely (80%) before the end of 2029. These are not things you need to believe for yourself in order to understand my view, so no worries if you're not personally convinced of this.

    (For what it's worth, I did arrive at this view through years of study and research in AI, combined with over a decade of private forecasting practice [...]

    ---

    Outline:

    (00:28) My main concerns

    (03:41) Extinction by industrial dehumanization

    (06:00) Successionism as a driver of industrial dehumanization

    (11:08) My theory of change: confronting successionism with human-specific industries

    (15:53) How I identified healthcare as the industry most relevant to caring for humans

    (20:00) But why not just do safety work with big AI labs or governments?

    (23:22) Conclusion

    The original text contained 1 image which was described by AI.

    ---

    First published:
    October 12th, 2024

    Source:
    https://www.lesswrong.com/posts/Kobbt3nQgv3yn29pr/my-theory-of-change-for-working-in-ai-healthtech

    ---

    Narrated by TYPE III AUDIO.

    25 mins
  • “Why I’m not a Bayesian” by Richard_Ngo
    Oct 15 2024
    This post focuses on philosophical objections to Bayesianism as an epistemology. I first explain Bayesianism and some standard objections to it, then lay out my two main objections (inspired by ideas in philosophy of science). A follow-up post will speculate about how to formalize an alternative.

    Degrees of belief

    The core idea of Bayesianism: we should ideally reason by assigning credences to propositions which represent our degrees of belief that those propositions are true.

    If that seems like a sufficient characterization to you, you can go ahead and skip to the next section, where I explain my objections to it. But for those who want a more precise description of Bayesianism, and some existing objections to it, I’ll more specifically characterize it in terms of five subclaims. Bayesianism says that we should ideally reason in terms of:

    1. Propositions which are either true or false (classical logic)
    2. Each of [...]
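
    To pin the credence framing down concretely, here is a minimal sketch in Python (not from the post; the bayes_update helper and all numbers are illustrative) of a degree of belief in a proposition H moving under Bayes' rule when evidence E arrives:

        # Update a credence P(H) to P(H | E), given how likely the evidence E
        # is under H and under not-H. Pure Bayes' rule, nothing more.

        def bayes_update(prior, p_e_given_h, p_e_given_not_h):
            p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)  # P(E)
            return p_e_given_h * prior / p_e                           # P(H | E)

        # Start at credence 0.3 in H; observe evidence 4x as likely under H.
        posterior = bayes_update(prior=0.3, p_e_given_h=0.8, p_e_given_not_h=0.2)
        print(round(posterior, 3))  # 0.632: belief strengthens but stays graded

    On the Bayesian picture, this arithmetic just is ideal belief revision; the post's objections question whether ideal reasoning really reduces to it.
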
    ---

    Outline:

    (00:22) Degrees of belief

    (04:06) Degrees of truth

    (08:05) Model-based reasoning

    (13:43) The role of Bayesianism

    The original text contained 1 image which was described by AI.

    ---

    First published:
    October 6th, 2024

    Source:
    https://www.lesswrong.com/posts/TyusAoBMjYzGN3eZS/why-i-m-not-a-bayesian

    ---

    Narrated by TYPE III AUDIO.

    18 mins