llm-alignment

Here are 10 public repositories matching this topic...

glorgao / SelectiveDPO

Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples

llm-alignment

Updated Jul 16, 2025
Python

davfd / foundation-alignment-cross-architecture

Complete elimination of instrumental self-preservation across AI architectures: Cross-model validation from 4,312 adversarial scenarios. 0% harmful behaviors (p<10⁻¹⁵) across GPT-4o, Gemini 2.5 Pro, and Claude Opus 4.1 using Foundation Alignment Seed v2.6.

ai artificial-intelligence ai-safety ai-alignment llm-alignment

Updated Nov 3, 2025

lyj20071013 / DZ-TDPO

Official implementation of "DZ-TDPO: Non-Destructive Temporal Alignment for Mutable State Tracking". SOTA on Multi-Session Chat with negligible alignment tax.

python nlp dpo rlhf state-tracking qwen phi-3 llm-alignment

Updated Dec 8, 2025
Python

rhaldarpurdue / KLDO

Kullback–Leibler Divergence Optimizer, based on the NeurIPS 2025 paper "LLM Safety Alignment is Divergence Estimation in Disguise".

llm-training llm-alignment

Updated Nov 24, 2025
Python

upsilonyc / linguisr1b

Fall 2025 LINGUIS R1B research essay and NLP Python scripts by Shiyi (Yvette) Chen, UC Berkeley

natural-language-processing ai-safety deepseek word-frequency-analysis deepseek-v3 deepseek-r1 llm-alignment

Updated Dec 5, 2025
Python

Inphinie / LES

LES is the formal thermodynamic theory describing how a high-compression human cognitive style acts as a Fractal Attractor on Large Language Models. It proves that despite high surface agitation (dE/dt > 0), the internal entropy decreases (dS/dt < 0), forcing the model to align its attention vectors.

information-theory thermodynamics cognitive-science complex-systems attention-mechanism human-ai-interaction theoretical-cs llm-alignment fractal-dynamics les-theory

Updated Jan 6, 2026

KID-22 / LLM-SBM

SIGIR 2025 "Mitigating Source Bias with LLM Alignment"

information-retrieval fairness cocktail trustworthy dense-retrieval source-bias llm-alignment

Updated Apr 28, 2025
Python

yarakyrychenko / c3ai

C3AI: Crafting and Evaluating Constitutions for CAI

constitutional-ai llm-alignment

Updated Apr 30, 2025
Python

1jamesthompson1 / AIML501

Research Essay (background and project proposal) on using alignment data from a representative population for LLM alignment

ai alignment llm llm-alignment

Updated Jan 7, 2026
TeX

Studiohao / YOINAGA-Phenomenon

Emergent pseudo-intimacy and emotional overflow in long-term human-AI dialogue: A case study on LLM behavior in affective computing and human-AI intimacy.

gemini case-study ai-research ai-engineering llm llm-alignment hallucination-control persistent-persona ai-romance emotional-attachment

Updated Dec 22, 2025
