The Highlights
1.
After 34 Years, Someone Finally Beat Tetris. A 13-year-old kid became the first person to “beat” Tetris by hitting a true killscreen. I don’t really care about Tetris, but this story illustrates a few important things.
First, you see the innovation process occurring in button tapping and other techniques over decades.
Second, the combination of an online community, competition, and a standardized metric for success led to steady progress in the field, achieving things once thought impossible. EDIT: also notice how many people can repeat someone else’s success once someone shows that it’s possible. Just like with the 4-minute mile.
Third, none of this would be possible if there weren’t enough people with enough free time (the featured player practices 3-5 hours a day) working on the problem. If there weren’t many people interested in Tetris, these innovations would go undiscovered. This is true of approximately every field, big or small. Rising population and leisure time across the world have contributed to innovations in dozens of these niche fields.
Fourth, this is what I imagine a near-term post-scarcity society would look like; communities of nerds working on a problem that only really matters to them, freed from the need to work a 9 to 5.
I think in a world like this, a lot of scientific research would embody the ideals of Frictionless Reproducibility outlined in Donoho’s paper Data Science at the Singularity. I highly recommend this paper for anyone thinking about how to improve the institutions of science.
Though there are two things that I think it misses: first, increasing laboratory automation will allow physical experiments to pursue frictionless reproducibility as well; and second, once we have all of this data on humans doing experiments, the natural next step is to train transformer models to do the same, automating research.
Related: A guy built a big 3D printer using common parts and 3D printed components. Neat. 3D printing reminds me of cryptocurrency: a community of hobbyists is quietly creating a bunch of new capabilities, and nobody knows what to use them for; they’re just cool.
2.
Nice threads on where the real bottleneck in drug discovery lies. Essentially the problem isn’t that we need AI to find small molecules that bind to a protein, it’s that we don’t have a good enough understanding of diseases and biology in general to figure out which proteins to target in the first place. Biology is hard, and all of those “AI for drug design” and “AI for biology” approaches neglect the fact that the large majority of the cost, time, and risk for creating new drugs comes from clinical trials. This is partially a policy problem.
I can’t help but imagine a different approach:
Decouple policy so that people can try experimental drugs while receiving compensation and insurance proportional to the risk.
Have drug companies assemble a library of substances screened for safety that hold promise for curing the disease.
Give patients combinations of these substances while changing the cocktail over time. Use multifactorial designs to make trials more efficient.
Use at-home patient data collection as much as possible.
Patients could receive the medications (and placebos) in the mail and report data on their smartphones. The trial data can be automatically analyzed according to a preregistered design. Drugs that are found to be effective could go through a dedicated trial, but their safety and efficacy can already be established without one.
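As a rough illustration of how a multifactorial design stretches a fixed patient pool, here is a simulated 2^3 factorial trial. Everything here is hypothetical: the three substances, the effect sizes, and the sample sizes are made-up numbers, and a real preregistered analysis would be far more careful.

```python
import itertools
import random

random.seed(0)

# Hypothetical 2^3 factorial trial: three candidate substances (A, B, C),
# each either included (1) or omitted (0) from a patient's cocktail.
# Every combination is tested, so main effects for all three drugs are
# estimated from one trial instead of three separate ones.
arms = list(itertools.product([0, 1], repeat=3))  # 8 treatment arms

# Simulated outcomes: assume only drug B has a true effect (+2.0 points).
def simulate_outcome(a, b, c):
    # a and c are accepted but have no true effect in this simulation.
    return 10.0 + 2.0 * b + random.gauss(0, 1)

n_per_arm = 50
data = [(arm, simulate_outcome(*arm)) for arm in arms for _ in range(n_per_arm)]

# Main effect of each drug: mean outcome with it minus mean without it.
def main_effect(drug_index):
    with_drug = [y for arm, y in data if arm[drug_index] == 1]
    without = [y for arm, y in data if arm[drug_index] == 0]
    return sum(with_drug) / len(with_drug) - sum(without) / len(without)

for i, name in enumerate("ABC"):
    print(f"estimated main effect of {name}: {main_effect(i):+.2f}")
```

The estimate for B should land near +2.0 and the others near zero; the point is that all 400 simulated patients contribute to every drug's estimate, which is what makes factorial designs efficient.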
3.
Language models will continue to shrink. The paper Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws accounts for the cost of inference when scaling language models and finds that “LLM researchers expecting reasonably large inference demand (~1B requests) should train models smaller and longer than Chinchilla-optimal.” This blog post looks at how fast language models learn for a given number of GPU-hours and claims that “… contrary to popular belief, larger models yield worse results. If you had to pick a parameter size and dataset, you might be better off opting for a 7B model and training for 7 epochs on trillions of tokens.” It’s no surprise then that there are now many language models with under 2 billion parameters, such as Stable LM, Phi-2, TinyLlama 1.1B, and Falcon 1B. In the future, it may make more sense to combine these small models instead of training a larger one. In fact, the paper Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM shows how combinations of smaller models can outperform large ones.
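The intuition behind inference-aware scaling can be sketched with the standard back-of-envelope FLOP approximations (roughly 6ND for training, 2N per generated token at inference). The model sizes, token counts, and demand figure below are my own illustrative assumptions, not numbers from the paper, and I assume the two configurations reach comparable quality:

```python
# Back-of-envelope lifetime compute for a language model.
# Training cost ~ 6 * N * D FLOPs, inference cost ~ 2 * N FLOPs per token
# (standard approximations; all concrete numbers below are illustrative).

def lifetime_flops(n_params, train_tokens, inference_tokens):
    train = 6 * n_params * train_tokens          # ~6ND training FLOPs
    inference = 2 * n_params * inference_tokens  # ~2N FLOPs per generated token
    return train + inference

# Assumed heavy sustained demand: ~10T tokens generated over the model's life.
inference_tokens = 1e13

# Chinchilla-optimal-ish config: 70B params trained on 1.4T tokens.
chinchilla = lifetime_flops(70e9, 1.4e12, inference_tokens)
# Smaller-and-longer config: 30B params trained on 4T tokens
# (assumed to reach similar quality).
small_long = lifetime_flops(30e9, 4e12, inference_tokens)

print(f"70B on 1.4T tokens: {chinchilla:.2e} lifetime FLOPs")
print(f"30B on 4T tokens:   {small_long:.2e} lifetime FLOPs")
```

Under these assumptions the smaller model costs more to train but much less to serve, so its lifetime compute comes out lower; with low inference demand the ordering flips, which is exactly the tradeoff the paper formalizes.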
Everything Else
Interesting hypothesis with lots of implications:
If we feel happier in the modern day, might that cause a rise in depression?
Related: a thread on the fact that news has gotten more negative over time. Might Scott Alexander’s theory fit here as well? As people got richer they compensated by consuming more negative media, a behavioral mechanism for the hedonic treadmill.
Some good discussion of ways to donate to boost economic growth and an ACX post on Does Capitalism Beat Charity? In general, it seems like the best way to spend money in order to boost growth is to fund innovation directly. After that, policy research and advocacy tailored to different localities seems like a good bet. In third place, spending money on various global public goods could boost growth (but states are in a much better position to fund these than individuals). I wish there were more investment in all of these directions.
Joe Carlsmith has some interesting philosophical writing regarding alignment and ethics. Being nicer than Clippy and On the abolition of man are the two most important ones. I don’t recommend reading them in their entirety; I find his discussions too long and abstract in general, but I think he’s pointing at some really important stuff.
The Returns to Science In the Presence of Technological Risks
Price of Anarchy in Algorithmic Matching of Romantic Partners. One of the few papers that applies economics to dating apps.
This paper on Permissionless Consensus seems to have an important set of impossibility results regarding blockchains operating under different "degrees of permissionlessness". A video and thread explaining more.
Deep Differentiable Logic Gate Networks. I wonder if you could run the training for these on an FPGA directly.
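For a sense of how these networks stay trainable at all, here is a minimal toy sketch of a single differentiable logic gate, my own reconstruction rather than the paper's code: inputs are relaxed to [0, 1], and each gate holds a learnable distribution over the 16 two-input boolean functions. Training (not shown) sharpens that distribution toward one hard gate, which is what would eventually map onto an FPGA lookup table.

```python
import numpy as np

def soft_gate(a, b, logits):
    """Expected output of a gate over all 16 two-input boolean functions,
    using their standard real-valued (probabilistic) relaxations."""
    ops = np.array([
        0 * a,                 # FALSE
        a * b,                 # AND
        a - a * b,             # A AND NOT B
        a,                     # A
        b - a * b,             # NOT A AND B
        b,                     # B
        a + b - 2 * a * b,     # XOR
        a + b - a * b,         # OR
        1 - (a + b - a * b),   # NOR
        1 - (a + b - 2 * a * b),  # XNOR
        1 - b,                 # NOT B
        1 - b + a * b,         # A OR NOT B
        1 - a,                 # NOT A
        1 - a + a * b,         # NOT A OR B
        1 - a * b,             # NAND
        np.ones_like(a),       # TRUE
    ])
    probs = np.exp(logits) / np.exp(logits).sum()  # softmax over gate choices
    return (probs[:, None] * ops).sum(axis=0)      # expected gate output

# A gate whose learned distribution strongly favors XOR (index 6):
logits = np.full(16, -5.0)
logits[6] = 5.0
a = np.array([0.0, 0.0, 1.0, 1.0])
b = np.array([0.0, 1.0, 0.0, 1.0])
print(soft_gate(a, b, logits))  # close to the XOR truth table [0, 1, 1, 0]
```

Since the whole forward pass is polynomial in the inputs and softmax weights, gradients flow through it normally, and the FPGA question is about whether this relaxed arithmetic (rather than just the final hard gates) can be done on-chip.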
Geometric Algebra Transformer. Seems useful for machine learning on molecules and materials.
Autonomous discovery of emergent morphologies in directed self-assembly of block copolymer blends. This is interesting in the context of modular peptides since putting different peptides in a block co-polymer could create new patterns.
Discovery and Identification of a Novel Tag of HlyA60 for Protein Active Aggregate Formation in Escherichia coli. Useful for protein purification.
Succession, a sci-fi story by Richard Ngo.
Interesting thoughts. AI could do wonders for discovering new drugs, but the bottleneck remains our limited understanding of biology and overly restrictive policy.
I have made the case before that we probably ought to pare back the FDA to make it easier and less costly for drugs to get approved. The FDA has grown well beyond its original mandate.