You're pledging to donate if the project hits its minimum goal and gets approved. If not, your funds will be returned.
World Model Lens is a research‑oriented software toolkit for analyzing, debugging, and understanding world models (e.g., Dreamer‑style RSSMs, JEPA, transformers, video prediction, and robotics/autonomous‑driving world models). It provides observability, replay, causal analysis, safety auditing, and probing tools so researchers can inspect activations, beliefs, uncertainties, and failure modes in a structured way. The project’s main goal is to make world models more interpretable, safer, and easier to debug, especially for teams working on AI safety, RL, planning, and autonomous systems. It standardizes logging, analysis, and benchmarks so that different groups can compare and reproduce findings on world‑model behavior.
P.S. Inspired by @NeelNanda's TransformerLens library and the AgentLens library from MATS alumni.
GOALS
Provide production‑grade observability infrastructure for world models: activation caching, saliency maps, surprise detection, and belief tracking.
Enable replay and counterfactual analysis: trajectory replay, intervention replay, imagination branching, and “what‑if” exploration.
Support causal and mechanistic analysis: causal tracing, circuit discovery, path patching, and bottleneck detection.
Build safety‑first tooling: OOD detection, hallucination analysis, robustness testing, and safety audits for deployed models.
Offer standardized probing and metrics: linear probes, semantic probes (DINO/CLIP‑style), concept discovery, and disentanglement metrics (MIG, DCI, SAP).
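To make the probing goal concrete, here is a minimal sketch of a linear probe over world-model latents, the simplest tool in the planned probing suite. This is illustrative code we wrote for this page, not the toolkit's actual API; the latents and target factor are synthetic stand-ins for real RSSM posterior states and a ground-truth environment variable (e.g., agent position).

```python
import numpy as np

def fit_linear_probe(latents, targets):
    """Least-squares linear probe from latents (N, D) to a scalar
    ground-truth factor (N,); returns weights and the R^2 score."""
    X = np.hstack([latents, np.ones((latents.shape[0], 1))])  # add bias column
    w, *_ = np.linalg.lstsq(X, targets, rcond=None)
    pred = X @ w
    ss_res = np.sum((targets - pred) ** 2)
    ss_tot = np.sum((targets - targets.mean()) ** 2)
    return w, 1.0 - ss_res / ss_tot

# Toy check: latents that linearly encode the target give R^2 near 1,
# i.e., the factor is linearly decodable from the latent state.
rng = np.random.default_rng(0)
Z = rng.normal(size=(256, 16))          # stand-in latent states
y = Z @ rng.normal(size=16) + 0.5       # stand-in ground-truth factor
w, r2 = fit_linear_probe(Z, y)
```

A high R² on held-out trajectories suggests the model's belief state actually encodes the factor; the toolkit's semantic probes and disentanglement metrics (MIG, DCI, SAP) build on the same latent-to-factor setup.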
These goals will be achieved by:
Extending the existing HookedWorldModel (which we built ourselves) and adapter system to support more backends and use cases.
Writing and integrating reusable analysis modules (belief analyzer, causal tracer, safety auditor).
Adding benchmarks and example workflows for RL, robotics, autonomous driving, and scientific simulation settings.
Documenting and packaging the toolkit so that external research groups can plug their own models into it with minimal effort.
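The hook-based design above can be sketched as follows. This is a hypothetical, stripped-down illustration of the TransformerLens-style pattern we follow; the names `HookedWorldModel`, `add_hook`, and `run_with_cache`, and the toy "encoder"/"RSSM" arithmetic, are stand-ins for illustration only, not the project's real API.

```python
from collections import defaultdict

class HookedWorldModel:
    """Toy world model whose named activation sites accept hooks,
    so external code can cache or edit activations mid-forward-pass."""

    def __init__(self):
        self._hooks = defaultdict(list)

    def add_hook(self, name, fn):
        self._hooks[name].append(fn)

    def _call_hooks(self, name, value):
        # Each hook may return an edited activation (intervention)
        # or None to merely observe it (caching).
        for fn in self._hooks[name]:
            out = fn(value)
            if out is not None:
                value = out
        return value

    def step(self, obs):
        h = self._call_hooks("encoder.out", obs * 2.0)  # stand-in encoder
        z = self._call_hooks("rssm.post", h + 1.0)      # stand-in posterior
        return z

    def run_with_cache(self, obs):
        """Run one step while recording every hooked activation."""
        cache = {}
        for name in ("encoder.out", "rssm.post"):
            self.add_hook(name, lambda v, n=name: cache.__setitem__(n, v))
        return self.step(obs), cache

model = HookedWorldModel()
out, cache = model.run_with_cache(3.0)  # cache holds both activations
```

The same mechanism supports intervention replay: registering a hook that returns a modified activation (e.g., zeroing part of the belief state) yields the counterfactual rollout, which is how our causal tracer and "what-if" tools are meant to plug in.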
Funding will be used for:
GPU/compute costs for running experiments, benchmarking, and example workflows on real world‑model checkpoints.
Tooling and deployment (CI/CD, testing, documentation, API/CLI improvements, and monitoring integrations such as Prometheus/OpenTelemetry if needed).
We see little risk of this project failing: what we need is support, and we are confident we can publish this work and make it useful to the public.
We are international students (at NYU) and have been funding testing out of our personal savings, which are now exhausted, so we need outside support. We found this platform through tweets from @ethanjperez and @RyanKidd (one of our team members is actively preparing to apply to the MATS program).