OpenAI Grant Winner: Building AI That Actually Understands What Matters to People

Hi, I’m Sara, host of the Polyweb podcast. On the show, we explore how we can find meaning in the Technology Era. Join conversations with global tech leaders and entrepreneurs as we navigate the intersection of business, life, and technology.

Watch on YouTube or listen now on Apple, Spotify, or RSS Feed.

The rise of powerful language models like ChatGPT has sparked both excitement and trepidation about the future of artificial intelligence. While many envision AI as a tireless assistant that can augment our capabilities, there are valid concerns that these systems could also amplify biases, pursue misaligned agendas, and exacerbate societal divisions.

Oliver Klingefjord, co-founder of the Meaning Alignment Institute, offers a compelling alternative vision - one where ordinary people have a direct say in shaping the values and behaviors of the AI systems that will increasingly shape our lives.

Klingefjord's organization was recently awarded an OpenAI grant to research a "democratic fine-tuning" approach to AI alignment. Rather than relying on paid contractors or opaque AI models to define the ethical principles underlying a language model's behavior, the Meaning Alignment Institute is pioneering a process that taps into the collective wisdom of everyday people.

"We're currently heading towards a world where every group or almost every individual will have their own model, fine-tuned to their preferences," Klingefjord explains. "This will lead to a bunch of consequences that I don't think is desirable, such as models that pursue political agendas through dubious means and exacerbate existing conflicts."

Instead, the institute's democratic fine-tuning process involves presenting people with moral dilemmas and drilling down to uncover the underlying values that inform their responses. These values are then used to create a "moral graph" - a map of the agreements and disagreements within a diverse group of participants. This graph can then be used to guide the training of AI models, ensuring they make decisions aligned with broadly shared human wisdom rather than the whims of a particular faction.
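
To make the moral-graph idea a bit more concrete, here is a minimal, hypothetical sketch of how such a structure might be represented in code. The class and field names (Value, MoralGraph, attention_policies, record_judgement) and the simple ranking heuristic are illustrative assumptions, not the institute's actual data model or algorithm.

```python
# Hypothetical sketch of a "moral graph": values are nodes, and a directed
# edge (less_wise -> wiser) records that participants judged one value wiser
# than another in a given context. Structure and names are assumptions only.
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Value:
    name: str                      # short label, e.g. "honesty under pressure"
    attention_policies: list[str]  # what someone attends to when living this value


@dataclass
class MoralGraph:
    values: dict[str, Value] = field(default_factory=dict)
    # edges[context][(from_value, to_value)] = number of participants who
    # endorsed the "to" value as the wiser one in that context
    edges: dict[str, dict[tuple[str, str], int]] = field(
        default_factory=lambda: defaultdict(lambda: defaultdict(int))
    )

    def add_value(self, value: Value) -> None:
        self.values[value.name] = value

    def record_judgement(self, context: str, less_wise: str, wiser: str) -> None:
        # One participant's judgement that `wiser` supersedes `less_wise` here.
        self.edges[context][(less_wise, wiser)] += 1

    def wisest_values(self, context: str) -> list[str]:
        """Rank values by net endorsements within a context (a crude stand-in
        for a proper graph-ranking method)."""
        score: dict[str, int] = defaultdict(int)
        for (src, dst), count in self.edges[context].items():
            score[dst] += count
            score[src] -= count
        return sorted(score, key=score.get, reverse=True)
```

In this sketch, the ranked output for a given context could then be used as a reference when evaluating or fine-tuning model responses to dilemmas in that context.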

"We're not just trying to be representative - we're trying to surface the wisest values, even if they come from minority perspectives that many people haven't directly experienced," Klingefjord says.

This focus on unearthing deeper meaning, rather than just broad consensus, is a key distinction from other approaches like Constitutional AI. And it points to the institute's broader vision of using AI not just to mirror our existing preferences, but to actively enhance our capacity for ethical reasoning and decision-making.

🎯 Three Key Takeaways:

  1. Democratizing the "Values" that Shape AI Systems
    The Meaning Alignment Institute's democratic fine-tuning approach aims to give the general public a voice in defining the core values and ethical principles that guide powerful AI systems like large language models.

    This has the potential to make the development of transformative AI technologies more transparent and accountable to society as a whole, rather than being solely shaped by the agendas of tech companies or government bodies.

  2. Aiming for an "Ethical Intelligence" That Transcends Human Disagreements
    Rather than mirroring the entrenched divisions and biases of human society, the institute hopes to develop AI systems that can reason at a "super wise" level and find creative solutions that reconcile disparate human values.

    Implication: If achieved, this "ethical intelligence" could help humanity overcome some of our most intractable political, social and philosophical disagreements.

  3. Surfacing the Wisest Values, Not Just Broad Consensus
    Rather than just aiming for representational diversity or majority consensus, the institute's goal is to surface the "wisest" values - even if they come from minority perspectives that many people haven't directly experienced. They believe this will lead to AI systems that make more ethically grounded decisions.

📚 Resources:

👉 CHECK OUT THE MEANING ALIGNMENT INSTITUTE: https://www.meaningalignment.org/

Meaning Alignment Institute Substack: https://meaningalignment.substack.com/

⏱️ Time Stamps:

00:00 Intro

04:42 How the Meaning Alignment Institute won the OpenAI grant to explore democratic AI alignment

08:13 Comparing democratic fine-tuning to RLHF and Constitutional AI

12:33 Institute's process of surfacing underlying values through moral dilemmas

16:00 Creating a "moral graph" to capture wisdom and agreements

22:04 Risks of personalized AI models and the need for shared human values

26:25 Using the moral graph to guide and evaluate AI model outputs

32:12 Potential applications of the moral graph

36:07 Challenges of scaling to superintelligent AI

39:55 Broader implications and the need for people to articulate their values

43:55 Where to find more information about the Meaning Alignment Institute

👀 About:

Oliver is an engineer and researcher with a background in building startups and a keen interest in continental philosophy. Most recently, he simulated democratic deliberation using LLMs.

🔗 CONNECT WITH OLIVER:
🔗 CONNECT WITH SARA:

💌 Newsletter - https://polyweb.beehiiv.com/

🪢 ABOUT POLYWEB:

Polyweb navigates the intricate nexus of technology, product innovation, and human creativity. Engage with leading entrepreneurs and experts to uncover how we can forge impactful companies and lead lives enriched with purpose in the Technology Era.

If you like this newsletter and found the content useful, please consider sharing it 🙏