git-trend

p-e-w/heretic

Fully automatic censorship removal for language models

Python · 7,681 stars · 781 forks
AI Analysis
AI/ML · Intermediate

Summary

Heretic is a tool that automatically removes censorship (safety alignment) from transformer-based language models without expensive post-training. It combines directional ablation with TPE-based parameter optimization to reduce refusal responses while preserving as much of the original model's intelligence as possible.
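
To make the idea of directional ablation concrete, here is a minimal sketch of the underlying linear algebra. It is not Heretic's actual code: the function names (`ablate_direction`, `matvec`, `dot`) and the `scale` parameter are hypothetical, plain Python lists stand in for PyTorch tensors, and the "refusal direction" is assumed to be given (in the abliteration literature it is typically estimated from differences in residual activations between prompts that are refused and prompts that are not). The sketch removes the component along a unit vector r from everything a weight matrix can output, W' = W - scale * r (r^T W).

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matvec(weight, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, x) for row in weight]


def ablate_direction(weight, direction, scale=1.0):
    """Return a copy of `weight` with `direction` removed from its output space.

    weight:    d_out x d_in matrix as a list of rows
    direction: vector of length d_out (need not be normalized)
    scale:     hypothetical ablation strength; 1.0 removes the direction fully
    """
    norm = math.sqrt(dot(direction, direction))
    r = [x / norm for x in direction]  # unit vector along the direction
    d_out, d_in = len(weight), len(weight[0])
    # r^T W: how strongly each input coordinate writes along r
    proj = [sum(r[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]
    # W' = W - scale * r (r^T W)
    return [
        [weight[i][j] - scale * r[i] * proj[j] for j in range(d_in)]
        for i in range(d_out)
    ]
```

With `scale=1.0`, any output of the ablated matrix has (numerically) zero component along the chosen direction. A tool in this family would tune values such as the ablation strength automatically rather than fixing them by hand.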

Tech Stack

Python, PyTorch, Optuna, Transformers, bitsandbytes

Highlights

  • Fully automatic parameter optimization
  • Support for quantization and multimodal models
  • Research features for residual vector analysis
  • Lower KL divergence from the original model than comparable tools, indicating less damage to its capabilities
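
The KL divergence mentioned in the highlights measures how far a modified model's next-token distribution drifts from the original's; a value near zero means the two models behave almost identically on that input. The following is a minimal dependency-free sketch of the metric, not Heretic's evaluation code. The function names are hypothetical, and treating the original model as P and the modified model as Q is an assumption made here for illustration.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p_logits, q_logits):
    """KL(P || Q) in nats between two next-token distributions.

    p_logits: logits from the original model (assumed to be P)
    q_logits: logits from the modified model (assumed to be Q)
    """
    p = softmax(p_logits)
    q = softmax(q_logits)
    # Terms with p_i == 0 contribute nothing, by the usual convention
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

In practice the value would be averaged over many prompts, usually ones the original model answers without refusing, so that the metric reflects collateral damage rather than the intended change in refusal behavior.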

Use Cases

  • Language model decensoring
  • AI safety research
  • Model interpretability studies
  • Creating uncensored chat models

Similar Projects

AutoAbliteration, abliterator.py, ErisForge

Analyzed on 2/18/2026

Trending History
2026-02-18 · weekly #1 · +1778
2026-02-18 · daily #2 · +947