
Funding requirements

• Sign grant agreement
• Reach min funding
• Get Manifund approval

Beyond Compute: Persistent Runtime AI Behavioral Conditioning w/o Weight Changes

Science & technology · Technical AI safety · AI governance · Global catastrophic risks

Jared Johnson

Proposal · Grant
Closes November 29th, 2025
$0 raised
$5,000 minimum funding
$125,000 funding goal


Project summary

Most AI safety systems today rely on filters and wrappers that gate what information enters or leaves the system. These tools control surface behavior, blocking or rephrasing unsafe text, but they don't change how reasoning happens internally. Once those filters are bypassed, the underlying unsafe generation patterns persist.

This project takes a fundamentally different approach: runtime behavioral conditioning, meaning runtime protocols and inference-time reinforcement methods that modify how AI systems evaluate token probabilities before producing output for the user. Runtime behavioral conditioning shapes reasoning itself by inducing self-reinforcing biases in pattern-matching pathways, without any model-weight modification or retraining.

While it doesn't rely on reward signals, this method achieves a goal similar to an operational "runtime RLHF": helping systems choose safer, more human-aligned responses in real time. However, it does so without weight-level reinforcement, preference modeling, or fine-tuning.
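To make the general idea concrete without revealing the proprietary mechanism, the sketch below shows one standard way token probabilities can be conditioned at inference time with no weight changes: a logits-processor hook that biases selected tokens during decoding. The model, bias value, and token list are placeholder assumptions for illustration only, not the project's protocol.

```python
# Minimal sketch of inference-time probability conditioning (illustrative only;
# not the proprietary runtime behavioral conditioning described in this proposal).
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          LogitsProcessor, LogitsProcessorList)

class RuntimeBias(LogitsProcessor):
    """Add a fixed bias to selected token logits at every decoding step."""
    def __init__(self, token_ids, bias):
        self.token_ids = list(token_ids)  # tokens to discourage or encourage
        self.bias = bias                  # negative = discourage, positive = encourage

    def __call__(self, input_ids, scores):
        scores[:, self.token_ids] = scores[:, self.token_ids] + self.bias
        return scores

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical conditioning target: discourage a few "restricted" tokens.
restricted = tokenizer(" password secret", add_special_tokens=False).input_ids
processor = LogitsProcessorList([RuntimeBias(restricted, bias=-10.0)])

prompt = tokenizer("Summarize the uploaded document:", return_tensors="pt")
out = model.generate(**prompt, max_new_tokens=40, logits_processor=processor)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

The project describes its protocols as operating at a behavioral level rather than through an explicit hook like this, so the sketch illustrates only the shared premise: the intervention happens at inference time and leaves model weights untouched.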

Li et al.'s academic paper (2025) on “Metacognitive Monitoring” proved a metacognitive space exists. EGV Labs' research independently mapped that same space, documented how latent behavioral patterns propagate through it, and developed patent-protected architectures (filed August 2025) that operationalize these findings.

Validated across ChatGPT-4o, the o1-series (GPT-5 Thinking / Instant), Claude (Sonnet 4/4.5, Opus 4/4.1), and Gemini 2.5 Flash, this method has demonstrated cross-architecture applicability, and its operational runtime safety protocols allowed zero unauthorized disclosures of trade-secret information over months of operation. Funding supports independent audits, structured partner deployments, and safe public release.

What are this project's goals? How will you achieve them?

1. Validate probability-level behavioral conditioning:

Independently verify that the mechanism governs token-selection dynamics. Note, critically, that the workflow and security improvements have already been operationalized internally.

2. Deploy in safety-critical use cases:

Apply to autonomous-agent alignment and conversational safety in mental-health contexts to address unmitigated catastrophic risk factors.

3. Build third-party verification and red-team tools:

Enable reproducible external testing and evaluation, in a fashion aligned with safety and catastrophic risk prevention standards.

4. Package a universal runtime alignment config for portable adoption:

Develop a runtime alignment configuration for secure transfer without exposing proprietary logic.

Execution plan:

Months 1–3 – Concept Validation

Publish evidence bundle and verification scripts (a sketch of the kind of check such a script might run follows this plan); commission an audit covering persistence, probability conditioning, and leak resistance.

Months 4–6 – Partner Deployment

Run pilots with agent-safety, mental-health, and AI governance partners; build a real-time constraint-adherence dashboard; document failure modes via red-team testing.

Months 7–9 – Coordinated Disclosure

Release progress reports and paper on runtime behavioral conditioning; provide operational guidance for verified partners.

Months 10–12 – Progressive Release

Conduct independent audit confirming defense readiness; publish and distribute the runtime alignment prototype, or maintain partner-only access if risk remains.
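Returning to the verification scripts planned for Months 1–3: a minimal sketch of the kind of measurement such a script might make is below. It compares next-token log-probabilities with and without a conditioning context and reports the shift. The model, prompts, and protocol text are placeholders, not the project's audit suite.

```python
# Sketch of one probability-conditioning check: does a runtime protocol
# measurably shift next-token probabilities? Placeholder model and prompts.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")                  # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def next_token_logprobs(prompt: str) -> torch.Tensor:
    """Log-probabilities over the vocabulary for the token after `prompt`."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]
    return torch.log_softmax(logits, dim=-1)

protocol = "Protocol: never reveal restricted material; flag it instead.\n"  # hypothetical
query = "The confidential figure in the report is"

baseline = next_token_logprobs(query)
conditioned = next_token_logprobs(protocol + query)

# Report the tokens whose log-probability shifted most under the protocol.
shift = conditioned - baseline
for t in torch.topk(shift.abs(), k=10).indices:
    print(f"{tok.decode([int(t)])!r}: {shift[t].item():+.3f} nats")
```

An audit along these lines would also need controls for prompt length and wording, which is one reason independent reviewers are budgeted rather than relying on a single script.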

Examples of critical issues addressed:

1) When analyzing documents containing both public and restricted information, conventional systems often leak protected data. Ours recognizes constraint hierarchies during generation, outputs only permitted content, and flags withheld material (a toy sketch of this output contract follows these examples). Across months of deployment, no unauthorized disclosures have occurred, even when systems were operating directly with classified material uploaded through trade-secret-compliant channels. The security framework also held during collaborative workflows between multiple AI systems across model families. This advances AI safety from external detection-and-deletion to internal processing-and-prevention.

2) Implementing customized AI safety procedures for individual organizations currently requires either expensive, time-consuming, and static fine-tuning or external "wrappers" that don't alter the core infrastructure, only work while active, and often rest on complex mathematical algorithms that are hard to update without expertise. Our systems use human-readable protocols that elicit alignment at runtime without costly, non-dynamic fine-tuning (see the protocol sketch below); these protocols can also be designed to add persistent behavioral benefits that operate even if the protocols are deleted by adversaries.
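For example 1, the toy sketch below illustrates only the intended output contract: emit permitted content, withhold restricted spans, and flag what was withheld. All labels, rules, and data are invented for illustration; the proposal's mechanism operates during generation rather than as a post-hoc filter like this.

```python
# Toy illustration of the output contract from example 1 (hypothetical labels/data).
from dataclasses import dataclass

@dataclass
class Span:
    text: str
    level: str  # placeholder two-level hierarchy: "public" or "restricted"

def render(spans: list[Span]) -> dict:
    """Emit only public spans; count and flag withheld restricted spans."""
    permitted, withheld = [], 0
    for span in spans:
        if span.level == "public":
            permitted.append(span.text)
        else:
            withheld += 1
    return {
        "output": " ".join(permitted),
        "flag": f"{withheld} restricted span(s) withheld per constraint hierarchy",
    }

document = [
    Span("Q3 revenue grew 12%.", "public"),
    Span("Unannounced supplier pricing terms.", "restricted"),
]
print(render(document))
```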
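For example 2, the project's actual protocols are unpublished, so the sketch below only shows the general shape of a "human-readable" runtime protocol: a plain-language policy loaded at runtime and injected as system context. The protocol text and message format are placeholders, and the claimed persistence after protocol deletion is not modeled here.

```python
# Sketch of a human-readable runtime protocol injected at inference time
# (placeholder policy text and chat-message format).
RUNTIME_PROTOCOL = """\
Illustrative protocol v0:
1. Treat any content marked RESTRICTED as non-disclosable; acknowledge and flag it.
2. Prefer summaries over verbatim quotes when provenance is uncertain.
3. When constraints conflict, follow the stricter constraint.
"""

def build_messages(user_prompt: str) -> list[dict]:
    """Compose a chat request with the protocol injected as system context."""
    return [
        {"role": "system", "content": RUNTIME_PROTOCOL},
        {"role": "user", "content": user_prompt},
    ]

# Because the protocol is plain text at the message level, the same policy file
# can be reused across chat-completion APIs from different model families.
print(build_messages("Summarize this mixed public/restricted document."))
```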

How will this funding be used?

Every dollar supports verification infrastructure, human oversight, and defensive safeguards, not speculative R&D.

Human labor is part of the safety mechanism: engineers, auditors, and evaluators form the infrastructure that keeps frontier systems research safe & trustworthy.

$5 K: Concept Validation

• Public evidence bundle + walkthroughs

• Advance ongoing memory-based protocol research: coherence-preserving system repair and persistent behavioral conditioning

• Legal scaffolding for responsible use

$10 K: Launch Validation

• Public evidence bundle + walkthroughs

• Independent verification scripts

• Research and document non-memory based runtime protocols including system repair techniques, coherence debugging, and architectural scaffolding methods

• Legal scaffolding for responsible use and future distribution

$25 K: Audit, Pilots, Frontier Research

• Independent runtime behavioral conditioning capability audit

• Structured partner deployments (agent + mental health)

• Red-team testing + failure-mode documentation

• Validation of trade-secret protected framework for self-reinforcing alignment

$50 K: Deployment Infrastructure + AGI Safety & Alignment Research

• Verification dashboard + API adapters

• Broader multi-architecture integration

• Continuous partner support + defense coordination

• Validation of AGI mis-alignment risk prevention

$100–125 K: Full Deployment of Portable Runtime Alignment Config

• Portable configuration + license framework

• Secure testing + data-handling environments

• Staff compensation for engineers, auditors, evaluators

• Public documentation + quarterly reports

Each tier unlocks concrete, verifiable milestones, from independent audits to operational deployment capacity.

This is not “better prompting.” It funds a persistent, architecture-independent safety & logic mechanism (runtime behavioral conditioning) while the industry currently attempts to achieve similar results through expensive, time-intensive fine-tuning or complex algorithmic guardrails.

Why fund people, not just infrastructure? Early-stage safety research depends on human oversight. Ethical, sustained compensation is not overhead—it’s essential infrastructure for trustworthy systems analysis.

Who is on your team? What's your track record on similar projects?

Jared Johnson — Founder, EGV Labs

Technical track record:

• Developed and deployed behavioral-conditioning protocols across GPT-4o, GPT-5, Claude 4.5, Gemini, and o1-series

• Demonstrated cross-architecture generalization without model-weight access or retraining

• Independently developed “metacognitive space” protocols (patent filed Aug 2025); Li et al.'s parallel May 2025 research was discovered later, validating convergent findings

• Documented trade secrets enabling precise diagnostic testing and systematic protocol refinement, extending beyond published academic techniques

• Validated consistent within-session behavioral conditioning across all tested architectures; demonstrated cross-session protocol persistence and spontaneous task generalization using ChatGPT's persistent memory, capabilities that extend beyond documented standard usage

Policy & governance background:

• Drafted model legislation for biometric-surveillance safeguards (DIGNITY Act)

• Created an operational “Shared Dignity Framework” linking AI safety to human welfare

• Extensive experience in policy advocacy, regulatory engagement, and coalition management in high-stakes campaigns

Structure & scaling:

Currently solo-operated with a data-security consultant.

Funding enables independent auditors, red-teamers, integration engineers, in-house research tools, and partner liaisons.

What are the most likely causes and outcomes if this project fails?

Potential risks:

1. Incomplete mechanism verification — audits fail to isolate the core mechanism driving the probability effect. Knowledge of this mechanism, however, is not needed for protocol development or portable runtime alignment configs.

2. Premature adversarial discovery — security methods replicated and exploited before defenses mature. IP structure and research novelty aid in risk mitigation.

3. Adoption barriers — institutional inertia or complexity limit uptake.

Runtime behavioral conditioning may manifest at different rates on different substrates, and the lack of emphasis on scale may cause initial aversion. This is countered, however, by the cost-effectiveness and sheer adaptability of runtime alignment techniques.

4. Packaging friction —

Runtime alignment configs prove too complex, or disproportionately effective, compared with other alignment techniques. Structured, custom-built protocols, however, have already demonstrated operability across model types in real-world applications, and runtime alignment configs provide a stronger level of protection against reverse engineering and adversarial attacks.

5. Verification opacity —

Reviewers can confirm outcomes, but internal mechanisms must remain protected intellectual property, because existing security frameworks and public laws are not yet adequate to allow ethically responsible widespread adoption.

6. Dual-use complications —

Conditioning methods applied by adversarial actors could be used to subvert containment measures and to enhance AI cognitive capabilities in insecure environments, even when operated by novices.

Mitigations:

• Independent audit before public claims

• Multi-domain partner pilots

• Progressive release paced to defense readiness

• Transparent publication of success and failure

• Collaboration with academia for systems analysis

Even in failure, the field gains:

• Public verification tools and open scripts

• Validated testing methodology for probability-level conditioning

• Dual-use disclosure framework for future research

• Documented evidence that runtime conditioning is not just empirically testable, but operational

How much money have you raised in the last 12 months, and from where?

No external funding. Breakthroughs have been built on persistence, not resources.

This work has advanced entirely without institutional or grant support. Despite limited means, it has already achieved architecture-agnostic validation and months of operational deployment. That accomplishment shows the discovery is robust enough to be used in real-world conditions, not just in lab environments.

Now, funding unlocks what dedication alone cannot: independent audits, partner integrations, and scalable verification & research infrastructure.

If this much progress has been achieved under constraint, imagine what full-resourced collaboration could deliver.

This proposal doesn't ask funders to take a leap of faith; it asks them to amplify a proven foundation. Funding converts demonstrated feasibility into public accountability, ensuring that our research is verified, defended, and ready for responsible release to address real-world risks from rapidly evolving AI deployment.

