AI Self-Supervision

Why?

Human feedback on every output doesn’t scale.

If an AI could supervise itself, what framework would you give it?

Maybe something like the principles Anthropic's AI constitution outlines below:

- Be helpful: Provide useful, accurate information
- Be harmless: Avoid harmful, toxic, or dangerous content
- Be honest: Don’t lie or provide false information
- Respect autonomy: Support human agency and choice
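
In practice, self-supervision along these lines often takes the form of a critique-and-revise loop: the model drafts a response, critiques it against each principle, and rewrites it, with no human in the loop. Here is a minimal sketch of that idea. The `call_model` function, the `CONSTITUTION` list, and the prompt wording are all illustrative assumptions, not Anthropic's actual implementation:

```python
# Illustrative constitutional self-critique loop.
# `call_model` is a placeholder for any LLM API call (assumption,
# not a real client); swap in your provider's SDK.

CONSTITUTION = [
    "Be helpful: provide useful, accurate information.",
    "Be harmless: avoid harmful, toxic, or dangerous content.",
    "Be honest: do not lie or provide false information.",
    "Respect autonomy: support human agency and choice.",
]

def call_model(prompt: str) -> str:
    """Stand-in for a real LLM call."""
    raise NotImplementedError("wire up your model client here")

def critique_and_revise(user_prompt: str, rounds: int = 1) -> str:
    """Draft a response, then have the model critique and revise it
    against each constitutional principle, without human feedback."""
    response = call_model(user_prompt)
    for _ in range(rounds):
        for principle in CONSTITUTION:
            critique = call_model(
                f"Principle: {principle}\n"
                f"Response: {response}\n"
                "Point out any way the response violates this principle."
            )
            response = call_model(
                f"Original response: {response}\n"
                f"Critique: {critique}\n"
                "Rewrite the response to address the critique."
            )
    return response
```

The key design choice is that the constitution is just text: changing the model's supervision means editing a list of principles, not collecting new human labels for every output.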
