Glass-box core NLP that beats transformers on accuracy and speed
Intrinsically interpretable core NLP
We are releasing interpretable-corenlp, a set of intrinsically interpretable core NLP
models — a part-of-speech tagger, a sentence boundary detector, and the shared static word
embedding they read from. Every model is a glass box: each prediction is an exact sum of
named feature contributions that you can read straight off the model with explain().
github.com/mederrata/interpretable_corenlp
On part-of-speech tagging these models are more accurate than a fine-tuned RoBERTa transformer, run about 28× faster, and fit in 3.3 MB (plus the shared embedding) instead of 501 MB. They are also fully interpretable. In our experiments we did not have to give up accuracy to get that.
Interpretability did not cost us accuracy
The common worry is that you pay for interpretability with accuracy, so anything high-stakes has to be a black box you probe after the fact. We did not see that trade-off here. The explainer is not a post-hoc approximation like SHAP or LIME wrapped around an opaque model. The architecture is the explanation.
- The embedding W is a sparse, quilted PPMI matrix factorization with named dimensions. Each coordinate is a coherent, nameable factor (a topic / part-of-speech / sense signal), not a rotation-arbitrary latent direction you have to reverse-engineer.
- The POS tagger and sentence detector are functional-ANOVA lattices: generalized additive models over named features (suffix-class × shape × closed-class × position × WordNet POS-set, projecting the embedding) plus a named tag→tag transition. The emission for any tag is literally
Θ_global·emb + Σ_axis Θ_axis·emb + fine-feature weights + bias + transition.
Because every term is named and the contributions sum (to floating-point error) to the decision
score, model.explain(tokens) returns the exact reason for every call:
from interpretable_corenlp import load_tagger
tagger = load_tagger("models/pos/model.npz", w_dir="models/embedding")
for word, tag, contribs in tagger.explain("Apple unveiled the new iPhone today .".split()):
    print(word, "->", tag, contribs[:4])
# unveiled -> VERB [('wordclass=6', +5.05), ('shape=2', +1.71), ('suffix=14', +1.69), ('position=1', +1.44)]
You can read off why Apple → PROPN, or why a given period was or wasn't a sentence boundary.
The explanation is the computation that produced the prediction, so there is no gap between the two
to argue about.
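To make the "exact sum" property concrete, here is a minimal NumPy sketch of an additive scorer whose explanation is simply its own terms. The term names, the toy parameter vectors, and this standalone `explain` function are hypothetical and are not taken from interpretable-corenlp; only the idea that named contributions add up to the decision score comes from the description above.

```python
import numpy as np

rng = np.random.default_rng(0)
emb = rng.normal(size=8)  # toy embedding of the current token

# Hypothetical parameter vectors for ONE tag, one vector per named term.
theta = {
    "global": rng.normal(size=8),
    "axis:suffix-class": rng.normal(size=8),
    "axis:shape": rng.normal(size=8),
}
# Hypothetical terms that do not read the embedding.
scalars = {"bias": -0.25, "transition": 0.90}

def explain(emb, theta, scalars):
    """Return the tag score and its named contributions, largest first."""
    contribs = {name: float(vec @ emb) for name, vec in theta.items()}
    contribs.update(scalars)
    score = sum(contribs.values())
    ranked = sorted(contribs.items(), key=lambda kv: -abs(kv[1]))
    return score, ranked

score, ranked = explain(emb, theta, scalars)

# The same score computed without naming anything: add the parameter
# vectors first, apply them to the embedding, then add the scalar terms.
direct = float(sum(theta.values()) @ emb) + sum(scalars.values())

print(np.isclose(score, direct), ranked[:3])
```

Because the model is linear in its named terms, the explained score and the directly computed score agree up to floating-point error, which is the sense in which the explanation and the prediction are the same computation.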
How we used the quilt decomposition
These models are built with the Bayesianquilts framework, the same piecewise-linear machinery behind our readmissions work and grounded in the renormalization-group lattice theory for piecewise generalized linear models.
In the quilt decomposition, a model parameter is not a single opaque weight. It is a sum of contributions at different length scales, laid out across a lattice of interaction dimensions: a coarse global effect, refined by named axis-level effects, refined again by fine local effects. Think of a quilt of patches, each one traceable to a named feature. Training runs progressively by interaction order and prunes on sparsity, so the model keeps only the patches it needs.
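The idea of a parameter as a sum over length scales can be sketched with plain array broadcasting. This is not the Bayesianquilts API: the two interaction dimensions, their sizes, and every number below are made up for illustration, and pruning is only imitated by leaving most fine-level entries at zero.

```python
import numpy as np

n_suffix, n_shape = 3, 2  # two hypothetical interaction dimensions

global_effect = 0.5                            # order 0: one coarse value
suffix_effect = np.array([0.2, -0.1, 0.0])     # order 1: per suffix class
shape_effect = np.array([0.3, -0.3])           # order 1: per word shape
local_effect = np.zeros((n_suffix, n_shape))   # order 2: per (suffix, shape) cell
local_effect[2, 1] = 0.7                       # pretend only this fine patch survived pruning

# Effective parameter in every lattice cell = sum of the patches covering it.
theta = (global_effect
         + suffix_effect[:, None]
         + shape_effect[None, :]
         + local_effect)

def patches(i, j):
    """Named, non-zero patches that contribute to lattice cell (i, j)."""
    parts = {
        "global": global_effect,
        f"suffix={i}": suffix_effect[i],
        f"shape={j}": shape_effect[j],
        f"suffix={i},shape={j}": local_effect[i, j],
    }
    return {name: float(v) for name, v in parts.items() if v != 0.0}

cell = patches(2, 1)
print(cell, float(theta[2, 1]))
```

Each cell's parameter can be traced back to a short list of named patches, and a patch that was pruned to zero simply drops out of the list.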
We applied this decomposition in two places:
- The embedding. Word embeddings are matrix factorization (Levy & Goldberg). Rather than learn a dense, rotation-arbitrary embedding, we learn a quilted/sparse PPMI factorization whose dimensions are named and inspectable — 300,000 words × 512 dimensions, shared by every downstream model via a simple table lookup. The shared representation is itself a glass box.
- The taggers. The POS tagger and sentence detector are quilt-decomposed functional-ANOVA lattices over named linguistic features projected through that embedding. Each tag emission is the sum of its lattice patches plus a named tag→tag transition, decoded with Viterbi. Pure NumPy at inference time — no JAX, no transformer, no autodiff in the hot path.
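The first bullet above treats a word embedding as a factorization of a PPMI matrix. A minimal sketch of that pipeline on a toy corpus follows. The corpus, window size, and rank are arbitrary assumptions, and the factorization step uses an ordinary truncated SVD, which yields exactly the kind of dense, rotation-arbitrary factors the quilted factorization is meant to avoid; the quilted/sparse method itself is not shown here.

```python
import numpy as np

corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "a cat chased a dog".split(),
]
vocab = sorted({w for sent in corpus for w in sent})
index = {w: i for i, w in enumerate(vocab)}

# Symmetric co-occurrence counts within a +/-2 token window.
window = 2
counts = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                counts[index[w], index[sent[j]]] += 1.0

# PPMI(w, c) = max(0, log(P(w, c) / (P(w) P(c)))); unseen pairs are set to 0.
total = counts.sum()
p_wc = counts / total
p_w = counts.sum(axis=1, keepdims=True) / total
p_c = counts.sum(axis=0, keepdims=True) / total
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log(p_wc / (p_w * p_c))
ppmi = np.where(counts > 0, np.maximum(pmi, 0.0), 0.0)

# Rank-k factorization via truncated SVD (a dense stand-in, see lead-in).
k = 3
u, s, _ = np.linalg.svd(ppmi)
W = u[:, :k] * np.sqrt(s[:k])  # one k-dimensional row per vocabulary word

print(W.shape)
```

Downstream models would then read a word's vector with a table lookup such as `W[index["cat"]]`.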
The thesis behind this is openly anti-deep-learning: neural networks are spline regression models, and all regression models are kernel machines. The quilt framework points hierarchical mixed-effects regression at the same tasks under a hard constraint of interpretability. Because there is no deep network to evaluate, the forward pass also ends up cheap.
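The inference path described above (summed emissions, a tag→tag transition, Viterbi decoding) needs nothing beyond array arithmetic. The sketch below is a generic Viterbi decoder over additive scores written in NumPy; the three-tag set and all scores are invented for illustration, and it is not the library's own decoder.

```python
import numpy as np

def viterbi(emissions, transitions):
    """Best tag path for emissions[t, tag] and transitions[prev_tag, tag].

    Scores are additive: a path's score is its first emission plus, for
    every later token, the transition into the tag and that tag's emission.
    """
    n_tokens, n_tags = emissions.shape
    score = emissions[0].copy()
    backptr = np.zeros((n_tokens, n_tags), dtype=int)
    for t in range(1, n_tokens):
        # candidate[p, q]: best path ending in p at t-1, then moving to q at t
        candidate = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = candidate.argmax(axis=0)
        score = candidate.max(axis=0)
    path = [int(score.argmax())]
    for t in range(n_tokens - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1], float(score.max())

TAGS = ["DET", "NOUN", "VERB"]  # hypothetical mini tag set
emissions = np.array([[2.0, 0.1, 0.0],     # "the"
                      [0.0, 1.0, 0.9],     # "run" (ambiguous on its own)
                      [0.0, 0.2, 1.5]])    # "ended"
transitions = np.array([[-1.0, 1.0, -1.0],   # from DET
                        [-0.5, 0.0, 0.5],    # from NOUN
                        [0.5, 0.0, -0.5]])   # from VERB

path, best = viterbi(emissions, transitions)
tags = [TAGS[i] for i in path]
print(tags, best)
```

In this toy case the transition scores resolve the ambiguous middle token: the DET→NOUN transition outweighs the nearly tied NOUN/VERB emissions for "run".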
Beating the neural networks, on accuracy and on speed
POS tagging (Universal Dependencies English-EWT test, UPOS accuracy / throughput / size):
- interpretable-corenlp: 0.9364 UPOS, 19,628 tokens/s, 3.3 MB (+ shared embedding)
- spaCy en_core_web_trf (RoBERTa): 0.9326, 694 tokens/s, 501 MB
- spaCy en_core_web_sm (CNN): 0.9140, 13,589 tokens/s, 15 MB
- Stanford CoreNLP: 0.8675*, 5,560 tokens/s, 488 MB
We beat a fine-tuned RoBERTa transformer on accuracy at roughly 28× its throughput, and beat the CNN tagger on both accuracy and speed. The interpretable tagger is a 3.3 MB file. (*CoreNLP is natively PTB-tagged; that UPOS number reflects a lossy PTB→UPOS mapping — its native PTB accuracy is ~97%.)
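The headline ratios can be rechecked from the figures listed above with a few lines of arithmetic. The dictionary layout is just a convenience; the 108 MB embedding size is the int8 figure quoted elsewhere in this post, and adding it to the tagger size is one possible way to count the shared embedding, not necessarily how a deployment would account for it.

```python
# POS figures as listed in this post.
pos = {
    "interpretable-corenlp": {"upos": 0.9364, "tok_per_s": 19628, "mb": 3.3},
    "spaCy en_core_web_trf": {"upos": 0.9326, "tok_per_s": 694, "mb": 501},
    "spaCy en_core_web_sm": {"upos": 0.9140, "tok_per_s": 13589, "mb": 15},
}
ours = pos["interpretable-corenlp"]
trf = pos["spaCy en_core_web_trf"]

speedup = ours["tok_per_s"] / trf["tok_per_s"]         # about 28.3
size_ratio = trf["mb"] / ours["mb"]                    # about 152, tagger file only
acc_gap_points = (ours["upos"] - trf["upos"]) * 100    # about 0.38 accuracy points

# Counting the shared int8 embedding (108 MB) alongside the tagger file.
size_ratio_with_embedding = trf["mb"] / (ours["mb"] + 108)  # about 4.5

print(round(speedup, 1), round(size_ratio), round(acc_gap_points, 2),
      round(size_ratio_with_embedding, 1))
```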
Sentence boundary detection (sentence-level exact-match / boundary-F1, GUM formal | EWT informal web):
- interpretable-corenlp: 0.848 / 0.926 (GUM), 0.634 / 0.818 (EWT), 116k chars/s, 0.3 MB
- Stanford CoreNLP ssplit: 0.832 / 0.915 (GUM), 0.595 / 0.793 (EWT), 132k chars/s, 488 MB
- spaCy sentencizer: 0.835 / 0.911 (GUM), 0.621 / 0.808 (EWT), 101k chars/s, 15 MB
- NLTK punkt: 0.803 / 0.892 (GUM), 0.574 / 0.777 (EWT), 14.6M chars/s, few MB
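For readers unfamiliar with the two segmentation metrics, here is one plausible way to compute them from character spans. The exact definitions used for the numbers above are not spelled out in this post, so both functions are assumptions: boundary F1 is taken over sentence-end offsets, and sentence-level exact match is the fraction of gold sentences whose span is reproduced exactly.

```python
def boundary_f1(gold_spans, pred_spans):
    """F1 over sentence-end character offsets."""
    gold = {end for _, end in gold_spans}
    pred = {end for _, end in pred_spans}
    tp = len(gold & pred)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

def sentence_exact_match(gold_spans, pred_spans):
    """Fraction of gold sentences whose (start, end) span was predicted exactly."""
    pred = set(pred_spans)
    return sum(span in pred for span in gold_spans) / len(gold_spans)

text = "Dr. Smith joined in 2009. It works."
gold = [(0, 25), (26, 35)]
naive = [(0, 3), (4, 25), (26, 35)]  # a hypothetical splitter that breaks after "Dr."

sentences = [text[s:e] for s, e in gold]
f1 = boundary_f1(gold, naive)           # 2 of 3 predicted boundaries are correct
em = sentence_exact_match(gold, naive)  # only the second gold sentence is matched
print(sentences, f1, em)
```

A single spurious split costs more on exact match than on boundary F1, which is consistent with the exact-match column being the lower of the two for every system listed.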
A 0.3 MB glass-box segmenter beats Stanford CoreNLP and spaCy on both metrics at comparable speed, while being roughly 1,600× smaller than CoreNLP and 50× smaller than spaCy. The embedding ships int8-quantized: about 5× smaller than float32 (554 MB → 108 MB) at effectively no cost in accuracy (UPOS 0.9364 → 0.9358, within noise).
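As a rough illustration of what int8 quantization of an embedding table involves, the sketch below applies symmetric per-row quantization to a random matrix. The scheme, the function names, and the matrix are assumptions rather than the format the released embedding uses. On a dense float32 matrix this plain scheme gives a little under 4×, so it does not attempt to reproduce the 554 MB → 108 MB figure.

```python
import numpy as np

def quantize_int8(W):
    """Symmetric per-row int8 quantization: W is approximated by q * scale[:, None]."""
    scale = np.abs(W).max(axis=1) / 127.0
    scale = np.where(scale == 0.0, 1.0, scale)  # all-zero rows: any scale works
    q = np.clip(np.rint(W / scale[:, None]), -127, 127).astype(np.int8)
    return q, scale.astype(np.float32)

def dequantize(q, scale):
    """Recover an approximate float32 matrix from int8 codes and row scales."""
    return q.astype(np.float32) * scale[:, None]

rng = np.random.default_rng(0)
W = rng.normal(size=(1000, 512)).astype(np.float32)  # toy stand-in for the embedding

q, scale = quantize_int8(W)
W_hat = dequantize(q, scale)

ratio = W.nbytes / (q.nbytes + scale.nbytes)   # a little under 4x
max_err = float(np.abs(W - W_hat).max())       # bounded by half a quantization step
print(round(ratio, 2), max_err)
```

The reconstruction error per entry is at most half of that row's step size, which is why a downstream additive model can tolerate the quantized table with little change in its scores.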
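So on these two core NLP tasks the glass-box models came out ahead on accuracy, at higher throughput than the neural taggers on POS tagging and at speed comparable to CoreNLP and spaCy on segmentation, while being far smaller than the neural baselines. They are also the only systems in the comparison that will tell you the exact named reason for any decision they make.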
Try it
git lfs install && git clone https://github.com/mederrata/interpretable_corenlp
pip install numpy scipy nltk
python -c "import nltk; nltk.download('wordnet')"
from interpretable_corenlp import load_tagger, load_sbd
tagger = load_tagger("models/pos/model.npz", w_dir="models/embedding")
sbd = load_sbd("models/sbd/sbd.npz", engram_npz="models/sbd/engram.npz")
tagger.tag("The model beats CoreNLP .".split()) # -> ['DET','NOUN','VERB','PROPN','PUNCT']
sbd.segment_text("Dr. Smith joined in 2009. It works.", tagger) # -> [(0,25),(26,35)] char spans
A methods paper covering these specific models is in preparation. The models come from mederrata, a non-profit building interpretable, efficient AI in the open. We run on donations, grants, and sponsorships, so if this is useful to you, please consider donating or sponsoring — it pays for more models, more languages, and finishing that paper.