AI Analysis
AI/ML, Intermediate
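Summary (KO, translated)
Heretic is a tool that automatically removes censorship (safety alignment) from transformer-based language models. It combines directional ablation with TPE-based parameter optimization to reduce refusal responses while preserving as much of the model's intelligence as possible.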
Summary (EN)
Heretic is a tool that automatically removes censorship (safety alignment) from transformer-based language models without expensive post-training. It combines directional ablation with parameter optimization based on TPE (Tree-structured Parzen Estimator) to produce decensored models while preserving as much of the original model's intelligence as possible.
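To make the term "directional ablation" concrete, here is a minimal, library-free sketch of the underlying linear algebra: removing the component along one direction from the outputs of a weight matrix. It is a toy illustration, not Heretic's implementation. The function names, the tiny matrix, and the `scale` parameter (standing in for the kind of ablation strength an optimizer could tune) are all assumptions made for this example.

```python
import math


def normalize(vector):
    """Return the vector scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in vector]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(weight, x):
    """Compute y = W x for a matrix stored as a list of rows."""
    return [dot(row, x) for row in weight]


def ablate_direction(weight, direction, scale=1.0):
    """Return W' = W - scale * r r^T W, where r is the unit direction.

    `weight` has shape (d_out, d_in) and `direction` has length d_out.
    With scale=1.0, every output of W' is orthogonal to the direction.
    `scale` is a hypothetical knob for this sketch.
    """
    r = normalize(direction)
    d_out, d_in = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along the direction.
    proj = [sum(r[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]
    return [
        [weight[i][j] - scale * r[i] * proj[j] for j in range(d_in)]
        for i in range(d_out)
    ]


# Toy data, invented for illustration only.
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
direction = [1.0, 1.0, 0.0]
x = [0.5, -1.0]

W_ablated = ablate_direction(W, direction)
component = dot(matvec(W_ablated, x), normalize(direction))
print(f"output component along the direction after ablation: {component:.2e}")
```

With full strength the ablated matrix can no longer write anything along the chosen direction, whatever the input. Intermediate values of `scale` trade off how much of that component is removed against how far the weights move from the original.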
Tech Stack
Python, PyTorch, Optuna, Transformers, bitsandbytes
Highlights
- Fully automatic parameter optimization
- Support for quantization and multimodal models
- Research features for residual vector analysis
- Reported lower KL divergence from the original model than comparable approaches
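The KL divergence highlight refers to how far a modified model's output distribution drifts from the original's. The sketch below shows one way such a number can be computed for a single next-token distribution. The logits are invented, and the direction KL(original || modified) is an assumed convention rather than a confirmed detail of the tool.

```python
import math


def softmax(logits):
    """Convert logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


# Hypothetical next-token logits over a three-token vocabulary.
original_logits = [2.0, 1.0, 0.1]
modified_logits = [1.8, 1.1, 0.2]

p_original = softmax(original_logits)
p_modified = softmax(modified_logits)

drift = kl_divergence(p_original, p_modified)
print(f"KL(original || modified) = {drift:.6f} nats")
```

A value near zero means the modified model predicts almost the same token probabilities as the original on that input. In practice such a score would be averaged over many prompts and token positions.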
Use Cases
- Language model decensoring
- AI safety research
- Model interpretability studies
- Creating uncensored chat models
Similar Projects
AutoAbliteration, abliterator.py, ErisForge
Analyzed on 2026-02-18
Trending History
- 2026-02-18: weekly #1, +1778
- 2026-02-18: daily #2, +947