During the 2022 AGI Safety Fundamentals course, I ran some small investigations into deceptively aligned agents in a toy environment.

You can find the relevant code, results, and ideas for future work and extensions in this GitLab repository.