Live · trained & serving

AGPT-1

AGPT-1 is our first model, and the proof of the whole thesis: there is no foundation model underneath it. We trained it from random weights on our own data, and the result — tokenizer, architecture, and weights — is owned outright, with no base-weight license to answer to.


What it is

A from-scratch text-embedding model.

AGPT-1 turns text into vectors, the foundation for the semantic search we're bringing to the platform. We pre-trained it ourselves on a license-clean corpus, starting from randomly initialized weights rather than fine-tuning someone else's base.

That distinction is the whole point: the usual shortcut is to start from a licensed foundation model. AGPT-1 starts from nothing but our data and our design, so the resulting IP is unambiguously ours.
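
For readers new to embeddings, here is a rough sketch of how text vectors can support semantic search: documents and queries are compared by the angle between their vectors, and the closest documents rank first. The vectors below are hand-made toy values, not AGPT-1 output, and the function names and document IDs are hypothetical; nothing here describes how AGPT-1 is actually served.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors, invented for illustration only.
# A real embedding model would produce these from text.
doc_vecs = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "api-rate-limits": [0.0, 0.2, 0.9],
}

# Stands in for the embedding of a query such as "how do I get my money back".
query_vec = [0.8, 0.2, 0.1]

ranking = rank_by_similarity(query_vec, doc_vecs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

The query shares no keywords with the document ID it matches best; the ranking comes entirely from vector proximity, which is what makes embedding-based search "semantic".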

Status: Live

Type: Text embedding (stretch-embed)

Architecture: Custom (built in-house)

Tokenizer: Proprietary (our own)

Training: In-house (on our own data)

Base model: None (randomly initialized)

Ownership: 100% (no base-weight license)

Why from scratch

Owned weights change what we can do.

Clean IP

No upstream license terms, usage caps, or model-provider dependency. The weights are an asset we own.

Our tokenizer

A proprietary vocabulary trained on our own corpus, tuned to our domain language rather than a generic web crawl.

Efficient by design

Compact enough to train and serve efficiently, and fast enough to embed at the scale retrieval will need.

The spine

The same from-scratch pipeline extends to our other models — AGPT-1 is the template, not a one-off.