Introducing Topaz

Topaz is a small, closed language for application intent: one canonical way to say each thing, written by people and by agents, checked by the toolchain.

Version note: This documentation reflects Topaz v5.2 canonical syntax, a strict superset of frozen v5.1. Modules are new in v5.2: see Modules & Visibility.

Topaz reads like Python or TypeScript, but the surface is deliberately closed: for each intent there is exactly one canonical form, locked in the v5.2 specification. For who it is for, and who it is not for, see Why Topaz?.

Every program on this page that shows output is executed against the v5.2 toolchain by the site verifier. Each output block is pinned byte-for-byte to a captured run, and the remaining fences are parse-checked.

Lena Code interop. Topaz is one of the languages covered by Lena Code conversion workflows. It is not Lena Code's core surface or intermediate language: Topaz remains an independent language, used as an engineering language for parts of the CSKernel™ implementation and validation.

First proof: Unicode-first

Identifiers are Unicode from the lexer up. Domain words stay in your language instead of becoming romanized approximations:

TOPAZ
function greet(name: string, language: string) -> string {
    return match language {
        case "한국어" => "안녕하세요, {name}님!"
        case "Русский" => "Привет, {name}!"
        case _ => "Hello, {name}!"
    }
}

let 사용자 = "김토파즈"
print(greet(사용자, "한국어"))
print(greet("Topaz", "Русский"))
OUTPUT
안녕하세요, 김토파즈님!
Привет, Topaz!

사용자 is an ordinary variable. Unicode identity is not a tolerated edge case. Identifiers are exactly the scalar sequences you wrote, with no silent normalization (§1), and the module resolver rejects look-alike module names across files under exact, NFC/NFD, and case-fold collision keys (§17).
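
As a rough illustration of what collision keys of this kind involve, here is a minimal sketch in Python (not Topaz, so it is not checked by the site verifier). The names collision_keys and find_collisions are hypothetical, and the way the keys are combined is an assumption made for illustration; §17 of the specification defines the actual rule.

```python
import unicodedata


def collision_keys(name: str) -> set[tuple[str, str]]:
    """Return the keys under which a module name could collide with another.

    Assumption: the exact, NFC, NFD, and case-fold forms are each compared
    independently. The real resolver's rule is defined by the spec, not here.
    """
    return {
        ("exact", name),
        ("nfc", unicodedata.normalize("NFC", name)),
        ("nfd", unicodedata.normalize("NFD", name)),
        ("casefold", name.casefold()),
    }


def find_collisions(names: list[str]) -> list[tuple[str, str]]:
    """Pair up declared module names that share at least one collision key."""
    collisions = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if collision_keys(a) & collision_keys(b):
                collisions.append((a, b))
    return collisions


# Two spellings of the same visible word: precomposed vs. combining accent.
composed = "caf\u00e9"
decomposed = "cafe\u0301"
```

In this sketch composed and decomposed are unequal as strings, which matches the "no silent normalization" rule for identifiers, yet find_collisions([composed, decomposed]) reports them as a pair because they share an NFC key. The same holds for "Users" and "users" under the case-fold key.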

The four pillars

A small, closed surface

For each intent there is one way to write it. The language decides policy, so people and agents don't re-decide it per file: Result for recoverable failure, optionals for absence, defer for cleanup.

TOPAZ
function parsePort(raw: string) -> Result<int, string> {
    let n = toInt(raw) ?? -1
    if n < 1 {
        return Err("not a port: {raw}")
    }
    return Ok(n)
}

function startup(raw: string) -> Result<string, string> {
    defer print("config closed")
    let port = parsePort(raw)?
    return Ok("listening on {port}")
}

print("{startup("8080")}")
print("{startup("http")}")
OUTPUT
config closed
Ok(listening on 8080)
config closed
Err(not a port: http)

The ? operator propagates the Err to the caller, and defer runs on every exit path. That is why config closed appears before both results in the output above, for the successful call and for the failing one.
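
For readers coming from Python, the same control flow can be approximated as below. This is an analogy only, not a description of how Topaz is implemented: the Ok and Err classes are hypothetical stand-ins for Result, the early return stands in for ?, try/finally stands in for defer, and the log list is added here only so the cleanup order can be observed.

```python
from dataclasses import dataclass


@dataclass
class Ok:
    value: object


@dataclass
class Err:
    error: str


def parse_port(raw: str):
    # Stand-in for `toInt(raw) ?? -1`: fall back to -1 when parsing fails.
    try:
        n = int(raw)
    except ValueError:
        n = -1
    if n < 1:
        return Err(f"not a port: {raw}")
    return Ok(n)


def startup(raw: str, log: list):
    try:
        result = parse_port(raw)
        if isinstance(result, Err):
            return result  # roughly what `?` does: hand the Err to the caller
        return Ok(f"listening on {result.value}")
    finally:
        # Roughly what `defer` does: run on every exit path, success or failure.
        log.append("config closed")
```

The Python version needs an explicit isinstance check and a try/finally block at each call site; in the Topaz program above, ? and defer express the same two policies in one token each.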

Unicode-first identity

Proven above. Strings are Unicode scalar sequences with no implicit normalization, so what you wrote is what compares equal (§1). Grapheme-cluster APIs are deferred to a future v5 decision (§20, §22).

Agent-ready

The specification is small enough for an agent to hold in full, with machine-checkable profiles and documentation that is itself verified. Every canonical fence on this site is parse-checked against the toolchain, and runnable examples execute with their output pinned. See how this site is verified.

Templates with intent

sql, sh, and path strings are templates. They are structured values whose parts and interpolations stay separate instead of collapsing into a string:

TOPAZ
let table = "users"
let q = sql"select * from {table} where active = {true}"
print("{q}")
OUTPUT
<sql template, 3 part(s), 2 interpolation(s)>

That placeholder rendering is the honest state today: the template value exists and nothing was concatenated, while the per-domain rendering policies (quoting, escaping) are deferred to a future v5 decision. Keeping the parts separate is what the grammar enforces, so you don't have to remember to.
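
To make "parts and interpolations stay separate" concrete, here is a small Python model of such a value. The Template class and its field names are hypothetical, chosen only for illustration, and counting the empty trailing segment as a third part is an assumption made to match the transcript above. Topaz's own representation and its eventual rendering policy are defined by the specification, not by this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    domain: str
    parts: tuple           # literal text segments, in source order
    interpolations: tuple  # interpolated values, kept apart from the text

    def __str__(self) -> str:
        # Placeholder rendering only: nothing is quoted, escaped, or joined.
        return (f"<{self.domain} template, {len(self.parts)} part(s), "
                f"{len(self.interpolations)} interpolation(s)>")


# Model of: sql"select * from {table} where active = {true}"
q = Template(
    domain="sql",
    parts=("select * from ", " where active = ", ""),
    interpolations=("users", True),
)
```

Because the interpolated values never enter the literal text, a later rendering step can decide per domain how each one is quoted or bound, which is the policy the page describes as deferred.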

The type discipline, honestly

Topaz is statically typed by specification. Annotations, inference, and literal types are all part of the locked v5.2 surface.

TOPAZ
type TrafficLight = "red" | "yellow" | "green"

function next(light: TrafficLight) -> TrafficLight {
    return match light {
        case "red" => "green"
        case "green" => "yellow"
        case _ => "red"
    }
}

The static checker ships in v5.2: topaz check type-checks a whole compilation unit by default. Contracts that the interpreter once enforced only at runtime, as dynamic guards (TPZ5xxx), are now static diagnostics, reported before the program runs. The status page tracks the toolchain gates.

What runs today

Topaz v5.2 runs end to end:

  • A Unicode-first lexer, a full parser, and a multi-file module resolver.
  • A whole-unit static type checker (topaz check).
  • The reference interpreter behind topaz run, which handles pattern matching, defer, and concurrent, and renders faults with source carets.
  • A Rust emission backend (topaz emit / topaz build) that lowers a program to a self-contained native binary and is differential-tested against the interpreter, so that the two are checked to agree.

This site publishes the counts and the gates on Toolchain status.

Run your first program in the Getting Started guide. Every transcript there is captured from the real toolchain.

The road ahead

The language surface is locked at v5.2. The items below are deferred on purpose rather than overlooked:

  • Grapheme-cluster APIs and per-domain template rendering (quoting/escaping policy): a future v5 decision (§20, §22).
  • Optimized lowering: the Rust backend is correctness-first today, and typed/monomorphic lowering is a later record.

This page promises nothing about the items above beyond their labels; the status page is the source of truth.

Start here

One way to say it.