JuLC Compiler Design

A progressive guide to the JuLC compiler architecture — from high-level overview to internal implementation details. Intended for anyone from first-time contributors to seasoned compiler engineers.

JuLC (Java UPLC Compiler) compiles a subset of Java into Untyped Plutus Lambda Calculus (UPLC), the on-chain execution language for Cardano smart contracts. It lets Java developers write Cardano validators using familiar syntax, types, and tooling (IDEs, Gradle, JUnit) while targeting the same on-chain format (UPLC) that Haskell-based Plutus and Aiken compile to.

┌─────────────────┐ ┌─────────────────┐ ┌──────────────┐
│ Java Source │ ──────► │ JuLC Compiler │ ──────► │ UPLC Program │
│ (@Validator) │ │ (julc-compiler) │ │ (on-chain) │
└─────────────────┘ └─────────────────┘ └──────────────┘

What JuLC is:

  • A source-level compiler (Java source code in, UPLC bytecode out)
  • A subset compiler — only a safe, deterministic subset of Java is allowed
  • A Plutus V3 compiler targeting the Conway era

What JuLC is NOT:

  • Not a JVM bytecode compiler (it reads .java files, not .class files)
  • Not a full Java compiler (no mutable classes, inheritance, exceptions, threads, I/O)
  • Not a general-purpose transpiler (output is specifically UPLC for Cardano)

JuLC accepts a functional subset of Java designed for on-chain safety:

| Supported | Not Supported |
| --- | --- |
| record types | class with mutable fields |
| sealed interface + pattern matching | Class inheritance |
| for-each loops, while loops | for(;;) C-style loops, do-while |
| if/else, switch expressions | try/catch, throw |
| Optional<T> | null |
| BigInteger, byte[], String, boolean | float, double, arrays (other than byte[]) |
| Static methods | Instance methods, this, super |
| Immutable variables (final semantics) | Reassignment (except loop accumulators) |
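
To give a feel for the style this subset encourages, here is a small sketch in plain Java. It uses no JuLC annotations, so it runs on any recent JDK. The names (Action, Bid, Cancel, isValid, highest) are hypothetical and chosen only for illustration, and whether every library call shown here (for example compareTo or Optional.isEmpty) is accepted by JuLC is an assumption.

```java
import java.math.BigInteger;
import java.util.List;
import java.util.Optional;

final class SubsetStyleSketch {
    // A sealed interface with record variants replaces class inheritance.
    sealed interface Action permits Bid, Cancel {}
    record Bid(BigInteger amount) implements Action {}
    record Cancel() implements Action {}

    // Static method: no `this`, no exceptions, no null.
    static boolean isValid(Action action, BigInteger minBid) {
        if (action instanceof Bid bid) {
            return bid.amount().compareTo(minBid) >= 0;
        }
        return action instanceof Cancel;
    }

    // Optional<T> models absence; `best` is a loop accumulator,
    // the one kind of reassignment the table above allows.
    static Optional<BigInteger> highest(List<BigInteger> bids) {
        Optional<BigInteger> best = Optional.empty();
        for (BigInteger b : bids) {
            if (best.isEmpty() || b.compareTo(best.get()) > 0) {
                best = Optional.of(b);
            }
        }
        return best;
    }
}
```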

Cardano uses the Extended UTXO (eUTxO) model. Each unspent transaction output (UTxO) can optionally be locked by a validator script. To spend that UTxO, a transaction must provide a redeemer (the “proof” or “action”) and the validator script must return true.

┌──────────────────────────────────────────────────────────────┐
│ Transaction │
│ │
│ Inputs: Outputs: │
│ ┌──────────────────┐ ┌─────────────────┐ │
│ │ UTxO (locked by │ │ UTxO (new, may │ │
│ │ validator script)│ │ be locked too) │ │
│ │ + Redeemer │ └─────────────────┘ │
│ └────────┬─────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────┐ │
│ │ Validator Script (UPLC program) │ │
│ │ Input: ScriptContext (tx data) │ │
│ │ Output: true (accept) / false (fail) │ │
│ └──────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘

The ScriptContext passed to every validator contains:

  • TxInfo — the full transaction (inputs, outputs, fees, minting, signatories, etc.)
  • Redeemer — the action data provided by the transaction submitter
  • ScriptInfo — identifies which script purpose triggered this execution (spending, minting, etc.)

| Purpose | When Triggered | Java Annotation |
| --- | --- | --- |
| Spending | Consuming a UTxO locked at a script address | @SpendingValidator |
| Minting | Minting or burning tokens under this policy | @MintingValidator |
| Withdrawing | Withdrawing staking rewards | @WithdrawValidator |
| Certifying | Publishing delegation certificates | @CertifyingValidator |
| Voting | Casting governance votes | @VotingValidator |
| Proposing | Submitting governance proposals | @ProposingValidator |

3. From Java to On-Chain: The 10,000-Foot View

The JuLC compilation pipeline transforms Java source through three intermediate representations:

Java Source (human-readable)
│ JavaParser
Java AST (abstract syntax tree — JavaParser nodes)
│ PirGenerator
PIR (Plutus Intermediate Rep) (typed, named variables, high-level constructs)
│ UplcGenerator
UPLC (Untyped Plutus LC) (untyped, De Bruijn indices, minimal)
│ FlatEncoder + CborEncoder
On-chain bytecode (binary, embedded in transactions)

Why three stages?

  • Java AST → PIR: Translates Java constructs (records, loops, method calls) into a typed lambda calculus with Let bindings, LetRec for recursion, and DataMatch for pattern matching. Types are preserved for correct encode/decode insertion.

  • PIR → UPLC: Erases types, converts named variables to De Bruijn indices, lowers Let to function application, LetRec to the Z-combinator, and DataMatch to tag-based dispatch. This is the “compilation” step.

  • UPLC → Binary: Serializes using FLAT encoding (compact bit-level format) wrapped in CBOR. This is what goes on-chain.


JuLC is organized into focused modules. Here are the key ones grouped by role:

| Module | Role | Key Types |
| --- | --- | --- |
| julc-core | UPLC AST, constants, serialization | Term (10 variants), DefaultFun (102 builtins), PlutusData, Program |
| julc-ledger-api | Java records for all Cardano V3 ledger types | ScriptContext, TxInfo, TxOut, Value, Address, Credential + 35 more |

| Module | Role | Key Types |
| --- | --- | --- |
| julc-compiler | Java source → UPLC compilation (36 Java files) | JulcCompiler, PirGenerator, PirTerm, PirType, UplcGenerator |
| julc-stdlib | On-chain standard library (13 libraries, ~65 methods) | StdlibRegistry, ListsLib, MapLib, ValuesLib, etc. |

| Module | Role | Key Types |
| --- | --- | --- |
| julc-vm | VM SPI: pluggable execution backend | JulcVm, JulcVmProvider, EvalResult |
| julc-vm-scalus | Default VM backend (wraps Scalus CEK machine) | ScalusVmProvider |
| julc-vm-java | Future pure-Java VM backend | (in development) |

| Module | Role |
| --- | --- |
| julc-onchain-api | Annotations (@Validator, @Entrypoint, @Param, @OnchainLibrary) + off-chain stubs for IDE support |
| julc-testkit | ValidatorTest base class for JUnit-based validator testing |
| julc-testkit-jqwik | Property-based testing support |
| julc-annotation-processor | Java annotation processor that compiles validators at build time |
| julc-gradle-plugin | Gradle plugin wrapping the annotation processor |
| julc-blueprint | Plutus blueprint (CIP-57) generation |
| julc-cli | Command-line compiler interface |

| Module | Role |
| --- | --- |
| julc-cardano-client-lib | Integration with cardano-client-lib (transaction building) |
| julc-e2e-tests | End-to-end integration tests (CIP-113, etc.) |
| julc-examples | Example validators and library code |
| julc-bom | Bill of Materials for dependency management |

| Module | Role |
| --- | --- |
| julc-analysis | Static analysis tooling |
| julc-decompiler | UPLC → human-readable decompilation |
| julc-benchmark | Performance benchmarking |
| julc-bls | BLS12-381 cryptographic operations |
| julc-vm-truffle | GraalVM Truffle-based VM backend |
| julc-playground | Web-based compiler playground |

┌────────────────┐
│ julc-bom │ (Bill of Materials)
└────────────────┘
┌─────────────────────────┐
│ julc-core │
│ Term, DefaultFun, │
│ PlutusData, Program │
└────────┬────────────────┘
┌──────────────┼──────────────────┐
│ │ │
▼ ▼ ▼
┌────────────┐ ┌──────────────┐ ┌───────────────┐
│ julc-vm │ │julc-ledger- │ │julc-onchain- │
│ (VM SPI) │ │ api │ │ api │
└─────┬──────┘ │ (40 types) │ │ (Annotations) │
│ └──────┬───────┘ └───────┬───────┘
│ │ │
┌─────┴──────┐ ┌─────┴──────────────────┴────────┐
│julc-vm- │ │ julc-compiler │
│ scalus │ │ (Java → PIR → UPLC pipeline) │
└────────────┘ │ 36 files, 7 packages │
└──────────┬───────────────────────┘
┌─────────────────┼────────────────────┐
│ │ │
▼ ▼ ▼
┌──────────────┐ ┌───────────────┐ ┌───────────────┐
│ julc-stdlib │ │ julc-testkit │ │julc-annotation│
│ (13 libs) │ │(ValidatorTest)│ │ -processor │
└──────────────┘ └───────────────┘ └───────────────┘
┌────────┴────────┐
│ julc-gradle- │
│ plugin │
└─────────────────┘

Key dependency rules:

  • julc-core has no dependency on other julc modules (only cbor-java)
  • julc-compiler depends on julc-core and julc-ledger-api but not on julc-vm
  • julc-vm depends only on julc-core
  • julc-stdlib provides StdlibRegistry used by julc-compiler at compile time
  • julc-testkit composes julc-compiler + julc-vm for test-time compile-and-execute

The on-chain language has 10 term variants:

sealed interface Term {
record Var(NamedDeBruijn name) // Variable (De Bruijn indexed)
record Lam(String name, Term body) // Lambda abstraction
record Apply(Term function, Term argument) // Function application
record Force(Term term) // Force polymorphic term
record Delay(Term term) // Delay evaluation (thunk)
record Const(Constant value) // Constant value
record Builtin(DefaultFun fun) // Built-in function (102 total)
record Error() // Halt execution
record Constr(long tag, List<Term> fields) // Constructor (Plutus V3)
record Case(Term scrutinee, List<Term> branches) // Case match (Plutus V3)
}

The intermediate representation adds types and high-level constructs:

sealed interface PirTerm {
record Var(String name, PirType type) // Named, typed variable
record Let(String name, PirTerm value, PirTerm body) // Let binding
record LetRec(List<Binding> bindings, PirTerm body) // Recursive let (loops)
record Lam(String param, PirType paramType, PirTerm body) // Typed lambda
record App(PirTerm function, PirTerm argument) // Application
record Const(Constant value) // Constant
record Builtin(DefaultFun fun) // Builtin
record IfThenElse(PirTerm cond, PirTerm then_, PirTerm else_) // Conditional
record DataConstr(int tag, PirType type, List<PirTerm> fields) // Data constructor
record DataMatch(PirTerm scrutinee, List<MatchBranch> branches) // Pattern match
record Error(PirType type) // Typed error
record Trace(PirTerm message, PirTerm body) // Debug trace
}
sealed interface PirType {
// Primitives
record IntegerType()
record ByteStringType()
record StringType()
record BoolType()
record UnitType()
record DataType() // Raw untyped Data (escape hatch)
// Containers
record ListType(PirType elemType)
record PairType(PirType first, PirType second)
record MapType(PirType keyType, PirType valueType)
record OptionalType(PirType elemType)
record ArrayType(PirType elemType) // PV11, CIP-156
// Functions
record FunType(PirType paramType, PirType returnType)
// Algebraic data types
record RecordType(String name, List<Field> fields)
record SumType(String name, List<Constructor> constructors)
}

The universal on-chain value encoding:

sealed interface PlutusData {
record ConstrData(long tag, List<PlutusData> fields)
record MapData(List<Map.Entry<PlutusData, PlutusData>> entries)
record ListData(List<PlutusData> items)
record IntData(BigInteger value)
record BytesData(byte[] value)
}

Every Java value eventually becomes PlutusData on-chain. The compiler inserts encode/decode operations at type boundaries.


JulcCompiler.compile() orchestrates a 24-step pipeline. Here is the complete flow:

Java Source(s)
┌─────────────────────────────────────────┐
│ 1. Parse (JavaParser → AST) │
│ 2. Validate (SubsetValidator) │
│ 3. Library check │
│ 4. Annotated class discovery │
│ 5. Script purpose detection │
│ 6. Type registration (ledger + user) │
│ 7. @Param field detection │
│ 8. Static field detection │
│ 9. Entrypoint discovery │
│ 10. Parameter validation │
│ 11. Library compilation (multi-pass) │
│ 12. Compose stdlib + library lookups │
│ 13. Symbol table setup │
│ 14. Helper method PIR generation │
│ 15. Entrypoint PIR generation │
│ 16. Helper method wrapping (Let) │
│ 17. Static field wrapping (Let) │
│ 18. Library method wrapping (Let/LetRec)│
│ 19. Validator wrapping (ScriptContext) │
│ 20. @Param wrapping (outer lambdas) │
│ 21. UPLC generation │
│ 22. Optimization (6 passes, fixpoint) │
│ 23. Program creation (PlutusV3) │
│ 24. ParamInfo creation │
└─────────────────────────────────────────┘
CompileResult(program, params, diagnostics)

The pipeline can be divided into four major phases:

| Phase | Steps | Input → Output |
| --- | --- | --- |
| Frontend | 1-10 | Java source → validated AST + metadata |
| Middle-end | 11-18 | AST → PIR term tree |
| Backend | 19-22 | PIR → optimized UPLC |
| Output | 23-24 | UPLC → serialized Program |

JuLC uses JavaParser to parse Java source into an AST:

StaticJavaParser.getParserConfiguration()
.setLanguageLevel(ParserConfiguration.LanguageLevel.JAVA_21);
CompilationUnit cu = StaticJavaParser.parse(source);

The parser handles Java 21 features: records, sealed interfaces, pattern matching in switch, instanceof patterns.

After parsing, SubsetValidator walks the AST to reject unsupported Java constructs. It extends JavaParser’s VoidVisitorAdapter and collects multiple errors:

SubsetValidator rejects:
try/catch, throw, synchronized, for(;;), do-while,
null, this, super, new T[], float/double, class inheritance
SubsetValidator allows:
for-each, while, break (in loops), records, sealed interfaces,
switch expressions, instanceof patterns, Optional<T>

Each rejection includes a suggestion pointing to the on-chain alternative:

ERROR: null is not supported on-chain
Suggestion: Use Optional<T> to represent absence

The compiler discovers:

  1. The validator class — annotated with @SpendingValidator, @MintingValidator, etc.
  2. The entrypoint method — annotated with @Entrypoint
  3. @Param fields — deployment-time parameters
  4. Static fields — compile-time constants

Before PIR generation can resolve types, all record and sealed interface types must be registered.

Stage 1: Ledger Types (LedgerTypeRegistry)

Pre-registers ~40 Cardano ledger types in 4 tiers (dependency order):

Tier 1: Leaf records → TxOutRef, Value, IntervalBound
Tier 2: Sealed interfaces → Credential, OutputDatum, ScriptInfo, Vote, DRep
Tier 3: Composite records → Address, TxOut, TxInInfo, Interval
Tier 4: Top-level → TxInfo (16 fields), ScriptContext

Governance types (Conway era): Vote, DRep, Voter, StakingCredential, Delegatee, TxCert, GovernanceAction, ProposalProcedure, Committee, ScriptPurpose.

Stage 2: User Types (TypeRegistrar)

Processes all compilation units (validator + libraries) together:

1. Collect all record and sealed interface declarations
2. Validate no duplicate type names
3. Build dependency graph (field types create edges)
4. Topological sort (Kahn's algorithm)
5. Register in dependency order

Circular dependencies are detected and reported as errors.
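
The five steps above can be sketched with Kahn's algorithm over a map of type names. This is a minimal illustration under stated assumptions, not TypeRegistrar's actual code: the class name, the method name, and the Map<String, Set<String>> input shape are hypothetical, and dependencies on types outside the map (such as pre-registered ledger types) are simply ignored.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class TypeOrderSketch {
    /** deps maps each user type name to the type names its fields reference. */
    static List<String> registrationOrder(Map<String, Set<String>> deps) {
        Map<String, Integer> unmet = new TreeMap<>();            // unmet dependency count per type
        Map<String, List<String>> dependents = new HashMap<>();  // reverse edges
        for (var e : deps.entrySet()) {
            int count = 0;
            for (String d : e.getValue()) {
                if (!deps.containsKey(d)) continue;              // not a user type: already registered
                count++;
                dependents.computeIfAbsent(d, k -> new ArrayList<>()).add(e.getKey());
            }
            unmet.put(e.getKey(), count);
        }
        Deque<String> ready = new ArrayDeque<>();
        for (var e : unmet.entrySet()) {
            if (e.getValue() == 0) ready.add(e.getKey());
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String type = ready.poll();
            order.add(type);                                     // "register" the type
            for (String user : dependents.getOrDefault(type, List.of())) {
                if (unmet.merge(user, -1, Integer::sum) == 0) ready.add(user);
            }
        }
        if (order.size() != deps.size()) {
            throw new IllegalStateException("Circular type dependency detected");
        }
        return order;
    }
}
```

Any type left unregistered once the queue drains must be part of a cycle, which is how the circular-dependency error falls out of the same pass.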

TypeResolver maps Java types to PIR types:

| Java Type | PIR Type |
| --- | --- |
| int, long, BigInteger | IntegerType |
| byte[], PubKeyHash, TxId, PolicyId, … | ByteStringType |
| boolean | BoolType |
| String | StringType |
| void | UnitType |
| PlutusData and subtypes | DataType |
| List<T> / JulcList<T> | ListType(resolve(T)) |
| Map<K,V> / JulcMap<K,V> | MapType(resolve(K), resolve(V)) |
| Optional<T> | OptionalType(resolve(T)) |
| User records | RecordType(name, fields) |
| User sealed interfaces | SumType(name, constructors) |

This is the heart of the compiler. PirGenerator (2,147 lines) transforms Java AST nodes into PIR terms, assisted by three extracted helper classes.

┌──────────────────────────────────────────────────────────┐
│ PirGenerator (2,147 lines) │
│ Entry points: generateMethod(), generateExpression() │
│ Owns: SymbolTable, TypeResolver, StdlibLookup │
│ │
│ Delegates to: │
│ ┌────────────────────────────────────────────────┐ │
│ │ AccumulatorTypeAnalyzer (432 lines) │ │
│ │ Pure AST analysis — detects loop accumulator │ │
│ │ types (List vs Map) from usage patterns │ │
│ └────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────┐ │
│ │ TypeInferenceHelper (282 lines) │ │
│ │ Read-only type inference — │ │
│ │ resolveExpressionType, inferPirType, │ │
│ │ inferBuiltinReturnType │ │
│ └────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────┐ │
│ │ LoopBodyGenerator (531 lines) │ │
│ │ Loop body compilation — 5 paths: │ │
│ │ single/multi-acc × break/no-break + zero-acc │ │
│ │ Pack/unpack accumulators, nested loop handling │ │
│ └────────────────────────────────────────────────┘ │
│ │
│ Also uses: │
│ ┌────────────────────────────────────────────────┐ │
│ │ TypeMethodRegistry (905 lines) │ │
│ │ Instance method dispatch (~50 methods) │ │
│ └────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────┐ │
│ │ PirHelpers (356 lines) │ │
│ │ wrapDecode, wrapEncode, list utilities │ │
│ └────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────┐ │
│ │ PirHofBuilders (244 lines) │ │
│ │ HOF PIR builders (map, filter, any, all, find) │ │
│ └────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────┘

| Java Expression | PIR Term |
| --- | --- |
| 42 | Const(Integer(42)) |
| true | Const(Bool(true)) |
| "hello" | Const(String("hello")) |
| x | Var("x", type) |
| a + b | App(App(Builtin(AddInteger), a), b) |
| a == b | App(App(Builtin(EqualsInteger), a), b) (type-aware) |
| !x | IfThenElse(x, false, true) |
| cond ? a : b | IfThenElse(cond, a, b) |
| new Point(x, y) | DataConstr(0, RecordType, [x, y]) |
| point.x() | wrapDecode(HeadList(SndPair(UnConstrData(point))), fieldType) |

| Java Statement | PIR Term |
| --- | --- |
| int x = 5; rest... | Let("x", Const(5), rest) |
| return expr | generateExpression(expr) (becomes final term) |
| someCall(); rest... | Let("_", someCall, rest) (evaluate and discard) |
| if (c) { t } else { e } | IfThenElse(c, t, e) |
| switch (s) { case A a -> ... } | DataMatch(s, [branches...]) |

| Op | Integer | String | ByteString | Data |
| --- | --- | --- | --- | --- |
| + | AddInteger | AppendString | AppendByteString | |
| == | EqualsInteger | EqualsString | EqualsByteString | EqualsData |
| < | LessThanInteger | | LessThanByteString | |

Since UPLC has no loops, all loops are transformed into recursive LetRec patterns.

long sum = 0;
for (var item : items) {
sum = sum + item;
}

Becomes:

LetRec([loop__forEach__0 = \xs \acc ->
IfThenElse(NullList(xs),
acc, // base case: return accumulator
Let(item, wrapDecode(HeadList(xs), elemType),
loop__forEach__0(TailList(xs), acc + item))) // recursive case
], loop__forEach__0(items, 0)) // initial call

long n = x;
while (n > 0) {
n = n - 1;
}

Becomes:

LetRec([loop__while__0 = \n ->
IfThenElse(n > 0,
loop__while__0(n - 1), // recursive case
n) // base case: return accumulator
], loop__while__0(x))

| Path | Accumulators | Break? | Strategy |
| --- | --- | --- | --- |
| A | 1 | No | Simple LetRec with single accumulator parameter |
| B | 1 | Yes | Break returns accumulator directly; continue recurses |
| C | 2+ | No | Pack accumulators into Data list tuple |
| D | 0 | | Unit accumulator, discard result |
| E | 2+ | Yes | Tuple packing + break-aware body |
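
The single-accumulator case (path A) can be mirrored in ordinary Java to show that the loop and its recursive lowering compute the same thing. This is only an analogy: the class and method names are hypothetical, and the comments map each step to the PIR shown above.

```java
import java.math.BigInteger;
import java.util.List;

final class LoopLoweringSketch {
    // Source form: a for-each loop with one accumulator.
    static BigInteger sumLoop(List<BigInteger> items) {
        BigInteger sum = BigInteger.ZERO;
        for (BigInteger item : items) {
            sum = sum.add(item);
        }
        return sum;
    }

    // Lowered form: the loop becomes a recursive function of (xs, acc).
    static BigInteger sumRec(List<BigInteger> xs, BigInteger acc) {
        if (xs.isEmpty()) {
            return acc;                                   // NullList(xs): return accumulator
        }
        BigInteger item = xs.get(0);                      // HeadList(xs)
        List<BigInteger> rest = xs.subList(1, xs.size()); // TailList(xs)
        return sumRec(rest, acc.add(item));               // recursive call with the updated accumulator
    }
}
```

Calling sumRec(items, BigInteger.ZERO) plays the role of the initial call loop__forEach__0(items, 0).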

AccumulatorTypeAnalyzer distinguishes ListType vs MapType accumulators by scanning the loop body for evidence:

| Evidence Found | Inferred Type |
| --- | --- |
| mkNilPairData() initialization | MapType |
| fstPair()/sndPair() on cursor element | MapType |
| mkCons() with mkPairData() items | MapType |
| mkNilData() without pair evidence | ListType |

UplcGenerator lowers PIR to UPLC by erasing types and eliminating high-level constructs.

Let → Application:

Let(x, val, body) → Apply(Lam("x", body'), val')

LetRec → Z-combinator (strict fixed-point):

Z = \f -> (\x -> f (\v -> x x v)) (\x -> f (\v -> x x v))
LetRec([name = body], expr)
→ Apply(Lam(name, expr'), Apply(Z, Lam(name, body')))
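
For readers who have not met the Z-combinator before, the following Java sketch builds recursion out of nothing but lambdas, the same trick the lowering relies on. The helper interface Self and the example function are hypothetical names introduced only for this illustration.

```java
import java.util.function.Function;

final class ZCombinatorSketch {
    // Self-application (x x) needs a recursive type in Java.
    interface Self<A, B> {
        Function<A, B> apply(Self<A, B> self);
    }

    // Z = \f -> (\x -> f (\v -> x x v)) (\x -> f (\v -> x x v))
    // The inner \v -> ... wrapper delays x x until an argument arrives,
    // which is what makes the combinator safe under strict evaluation.
    static <A, B> Function<A, B> z(Function<Function<A, B>, Function<A, B>> f) {
        Self<A, B> w = x -> f.apply(v -> x.apply(x).apply(v));
        return w.apply(w);
    }

    // A "LetRec body" written without naming itself: it receives `self` instead.
    static final Function<Long, Long> SUM_TO =
        ZCombinatorSketch.<Long, Long>z(self -> n -> n <= 0 ? 0L : n + self.apply(n - 1));
}
```

Here SUM_TO.apply(4L) adds 4 + 3 + 2 + 1, with each recursive step going through the combinator rather than through a named function.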

IfThenElse → Force/Delay (lazy branches):

IfThenElse(c, t, e)
→ Force(Apply(Apply(Apply(Force(Builtin(IfThenElse)), c'),
Delay(t')), ← prevent eager evaluation
Delay(e')))
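
The reason for the Delay/Force wrapping is that a builtin evaluates all of its arguments before it runs, so an unwrapped branch would execute even when it is not selected. A rough Java analogy uses Supplier as Delay and get() as Force; the class and method names below are hypothetical.

```java
import java.util.function.Supplier;

final class LazyBranchSketch {
    // Behaves like a strict builtin: both t and e are evaluated by the caller first.
    static <T> T ifThenElse(boolean c, T t, T e) {
        return c ? t : e;
    }

    // Without delaying, the division runs even when b == 0.
    static int eagerDiv(int a, int b) {
        return ifThenElse(b == 0, 0, a / b);
    }

    // Delay each branch (Supplier), select one, then Force only the winner (get()).
    static int lazyDiv(int a, int b) {
        Supplier<Integer> chosen =
            LazyBranchSketch.<Supplier<Integer>>ifThenElse(b == 0, () -> 0, () -> a / b);
        return chosen.get();
    }
}
```

With b == 0, eagerDiv throws ArithmeticException because the unselected branch was already evaluated, while lazyDiv returns 0.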

DataConstr → ConstrData:

DataConstr(tag, type, [f1, f2])
→ ConstrData(tag, MkCons(encode(f1), MkCons(encode(f2), MkNilData)))

DataMatch → Tag dispatch:

DataMatch(scrutinee, branches)
→ Let(pair, UnConstrData(scrutinee),
Let(tag, FstPair(pair),
Let(fields, SndPair(pair),
IfThenElse(tag == 0, branch0(fields),
IfThenElse(tag == 1, branch1(fields),
Error)))))

De Bruijn Indexing: Named variables become integer indices counting lambda binders outward:

\x -> \y -> x → Lam("x", Lam("y", Var(2)))
^^^ cross y(1) then x(2)
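
A minimal sketch of the name-to-index conversion, assuming 1-based indices as in the example above. The tiny Named term type and the string output format are hypothetical stand-ins for the real PIR and UPLC classes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class DeBruijnSketch {
    sealed interface Named permits NVar, NLam, NApp {}
    record NVar(String name) implements Named {}
    record NLam(String param, Named body) implements Named {}
    record NApp(Named fun, Named arg) implements Named {}

    static String convert(Named term) {
        return go(term, new ArrayDeque<>());
    }

    private static String go(Named term, Deque<String> scope) {
        if (term instanceof NVar v) {
            int index = 1;
            for (String bound : scope) {          // iterates innermost binder first
                if (bound.equals(v.name())) return "Var(" + index + ")";
                index++;
            }
            throw new IllegalArgumentException("Unbound variable: " + v.name());
        }
        if (term instanceof NLam l) {
            scope.push(l.param());                // entering a binder
            String body = go(l.body(), scope);
            scope.pop();                          // leaving it again
            return "Lam(" + body + ")";
        }
        NApp a = (NApp) term;
        return "Apply(" + go(a.fun(), scope) + ", " + go(a.arg(), scope) + ")";
    }
}
```

For \x -> \y -> x the scope stack is [y, x] when the variable is reached, so x resolves to index 2.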
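
Polymorphic builtins must be wrapped in one Force per type parameter:

| Forces | Builtins |
| --- | --- |
| 2 (∀ a b) | FstPair, SndPair, ChooseList |
| 1 (∀ a) | IfThenElse, Trace, MkCons, HeadList, TailList, NullList, … |
| 0 (monomorphic) | Arithmetic, comparisons, crypto, encode/decode |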

UplcOptimizer runs 6 passes in a fixpoint loop (max 20 iterations, stops when unchanged):

Force/Delay cancellation:

Force(Delay(t)) → t

Undoes the lazy wrapping when the result is immediately forced.

Constant folding:

AddInteger(3, 4) → 7
EqualsInteger(8, 8) → True

Supports: AddInteger, SubtractInteger, MultiplyInteger, EqualsInteger, LessThanInteger, LessThanEqualsInteger, EqualsByteString, AppendByteString.

Dead code elimination:

Apply(Lam(x, body), val) → body (when x unused in body, val has no side effects)

Trace calls are considered side-effecting and preserved.

Beta reduction (inlining):

Apply(Lam(x, body), arg) → substitute(body, arg) (when x used once, arg is "simple")

“Simple” = Const, Var, Builtin, or Force of these. Complex args are not inlined.

Eta reduction:

Lam(x, Apply(f, Var(1))) → f (when x not free in f, f is a value)

Constr/Case reduction:

Case(Constr(tag, fields), branches) → Apply(branches[tag], fields...)
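
The fixpoint structure can be sketched on a toy term language with just two of the rewrite rules. Everything here (the Term variants, pass, optimize) is a hypothetical miniature for illustration and not UplcOptimizer's real code; only the 20-iteration cap and the stop-when-unchanged rule are taken from the description above.

```java
final class OptimizerSketch {
    sealed interface Term permits Int, Add, Force, Delay {}
    record Int(long value) implements Term {}
    record Add(Term left, Term right) implements Term {}
    record Force(Term term) implements Term {}
    record Delay(Term term) implements Term {}

    // One bottom-up pass applying two rewrite rules.
    static Term pass(Term t) {
        if (t instanceof Force f) {
            Term inner = pass(f.term());
            // Force(Delay(t)) → t
            return inner instanceof Delay d ? d.term() : new Force(inner);
        }
        if (t instanceof Delay d) {
            return new Delay(pass(d.term()));
        }
        if (t instanceof Add a) {
            Term l = pass(a.left());
            Term r = pass(a.right());
            // Constant folding: Add(Int, Int) → Int
            if (l instanceof Int x && r instanceof Int y) return new Int(x.value() + y.value());
            return new Add(l, r);
        }
        return t;
    }

    // Repeat until nothing changes, with a hard cap on iterations.
    static Term optimize(Term t) {
        for (int i = 0; i < 20; i++) {
            Term next = pass(t);
            if (next.equals(t)) return t;   // records compare structurally
            t = next;
        }
        return t;
    }
}
```

Because records implement structural equals, the "unchanged" check is a plain equality test between the term before and after a pass.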

ValidatorWrapper adds ScriptContext decoding and bool→unit/error conversion:

\scriptContextData ->
let ctxFields = SndPair(UnConstrData(scriptContextData))
let redeemer = HeadList(TailList(ctxFields)) // field 1
let result = validate(redeemer, scriptContextData)
in IfThenElse(result, Unit, Error)

For spending validators with datum (3-param), the wrapper also extracts the datum from ScriptInfo.SpendingScript.

Each @Param field adds an outer lambda:

\param1__raw -> Let(param1, UnIData(param1__raw),
\param2__raw -> Let(param2, UnBData(param2__raw),
<validator body>))

The final UPLC term is wrapped in Program.plutusV3(term) and serialized:

  1. FLAT encoding — compact bit-level binary format
  2. CBOR wrapping — standard Cardano on-chain format

PIR serves as the typed bridge between Java and UPLC. It preserves type information that UPLC lacks, enabling the compiler to insert correct encode/decode operations.

UPLC is too low-level to compile to directly:

  • No named variables (De Bruijn indices only)
  • No types (everything is untyped)
  • No Let bindings (only lambda application)
  • No loops (only recursion via combinators)
  • No pattern matching (only tag extraction)

PIR adds these as first-class constructs, making compilation from Java AST straightforward:

| Java Construct | PIR Construct | UPLC Lowering |
| --- | --- | --- |
| Variable declaration | Let(name, value, body) | Apply(Lam(name, body), value) |
| Loop | LetRec([binding], body) | Z-combinator application |
| Record construction | DataConstr(tag, type, fields) | ConstrData(tag, fieldList) |
| Pattern matching | DataMatch(scrutinee, branches) | Tag dispatch chain |
| Method call | App(function, argument) | Apply(function, argument) |
| Conditional | IfThenElse(c, t, e) | Force(Apply(Apply(Apply(Force(IfThenElse), c), Delay(t)), Delay(e))) |

On the Cardano ledger, all values are encoded as Data — a universal 5-constructor representation:

| Data Constructor | Encodes | Encode Builtin | Decode Builtin |
| --- | --- | --- | --- |
| I(integer) | int, long, BigInteger | IData | UnIData |
| B(bytestring) | byte[], hash types | BData | UnBData |
| Constr(tag, fields) | Records, sealed interfaces, booleans, Optional | ConstrData | UnConstrData |
| Map(pairs) | Map<K,V> | MapData | UnMapData |
| List(items) | List<T> | ListData | UnListData |

The compiler must insert encode/decode wrappers at every type boundary. Two key helpers in PirHelpers.java:

wrapDecode(data, targetType) — Extract typed value from raw Data:

IntegerType → UnIData(data)
ByteStringType → UnBData(data)
ListType → UnListData(data)
MapType → UnMapData(data)
BoolType → EqualsInteger(FstPair(UnConstrData(data)), 1)
StringType → DecodeUtf8(UnBData(data))
DataType → data (pass through)

wrapEncode(value, type) — Wrap typed value back to Data:

IntegerType → IData(value)
ByteStringType → BData(value)
BoolType → IfThenElse(value, ConstrData(1,[]), ConstrData(0,[]))
StringType → BData(EncodeUtf8(value))
ListType → ListData(value)
MapType → MapData(value)
DataType → value (pass through)
// Record: tag 0, fields in order
record Point(int x, int y) {}
// Point(10, 20) → Constr(0, [I(10), I(20)])
// Sealed interface: ascending tags per variant
sealed interface Shape {
record Circle(int radius) implements Shape {} // tag 0
record Rect(int w, int h) implements Shape {} // tag 1
}
// Circle(5) → Constr(0, [I(5)])
// Rect(3,4) → Constr(1, [I(3), I(4)])

Booleans map to Constr(1, []) (True) and Constr(0, []) (False), matching the Haskell Plutus convention.
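
The record and boolean conventions above can be exercised with a cut-down Data type. This sketch models only two of the five constructors and is not the julc-core PlutusData API; the class, the encode/decode helper names, and the use of intValueExact are assumptions made for the example.

```java
import java.math.BigInteger;
import java.util.List;

final class DataEncodingSketch {
    // Two of the five Data constructors, enough for records and booleans.
    sealed interface Data permits Constr, I {}
    record Constr(long tag, List<Data> fields) implements Data {}
    record I(BigInteger value) implements Data {}

    record Point(int x, int y) {}

    // Record: tag 0, fields in declaration order, each field wrapped with I(...).
    static Data encodePoint(Point p) {
        return new Constr(0, List.of(new I(BigInteger.valueOf(p.x())), new I(BigInteger.valueOf(p.y()))));
    }

    // Decoding reverses it: unwrap the constructor, then unwrap each field.
    static Point decodePoint(Data d) {
        Constr c = (Constr) d;
        return new Point(unI(c.fields().get(0)), unI(c.fields().get(1)));
    }

    // Booleans: Constr(1, []) is True, Constr(0, []) is False.
    static Data encodeBool(boolean b) {
        return new Constr(b ? 1 : 0, List.of());
    }

    static boolean decodeBool(Data d) {
        return ((Constr) d).tag() == 1;   // mirrors EqualsInteger(FstPair(UnConstrData(data)), 1)
    }

    private static int unI(Data d) {
        return ((I) d).value().intValueExact();
    }
}
```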


17. Data Encoding: The Bridge Between Java and Plutus

Understanding encode/decode insertion is key to understanding the compiler. Here’s when each happens:

| Situation | Direction | Example |
| --- | --- | --- |
| Record field access | Decode | point.x() → UnIData(HeadList(fields)) |
| Record construction | Encode | new Point(x, y) → ConstrData(0, [IData(x), IData(y)]) |
| List element access | Decode | list.head() → wrapDecode(HeadList(list), elemType) |
| List prepend | Encode | list.prepend(elem) → MkCons(wrapEncode(elem), list) |
| Method parameter | Pass-through | Entrypoint params are raw Data |
| @Param decode | Decode | @Param int fee → UnIData(param__raw) |
| HOF lambda argument | Decode | list.map(x -> ...) → unwrap x from Data |
| HOF lambda result | Encode | list.map(x -> x+1) → wrap result to Data |

18. PirGenerator and Its Extracted Helpers

Before the ADR-018 refactoring, PirGenerator was a 3,369-line monolith. It was decomposed into focused classes:

| Class | Lines | Responsibility |
| --- | --- | --- |
| PirGenerator | 2,147 | Core: expressions, statements, method calls, record access, control flow |
| LoopBodyGenerator | 531 | Loop body compilation across 5 paths (single/multi-acc × break/no-break, plus zero-acc) |
| AccumulatorTypeAnalyzer | 432 | Pure AST analysis: detect accumulator types from usage patterns |
| TypeInferenceHelper | 282 | Read-only type queries: resolve expression types, infer PIR types |
| TypeMethodRegistry | 905 | Instance method dispatch: 50+ methods across 11 type categories |
| PirHelpers | 356 | Static utilities: wrapDecode, wrapEncode, list operations |
| PirHofBuilders | 244 | HOF PIR construction: map, filter, any, all, find, foldl, zip |

PirGenerator
├── calls ──► AccumulatorTypeAnalyzer.refineAccumulatorTypes()
│ (before loop compilation, to determine accumulator types)
├── calls ──► TypeInferenceHelper.resolveExpressionType()
│ (during expression compilation, for type-aware dispatch)
├── calls ──► LoopBodyGenerator.generateSingleAccBody() / etc.
│ (loop body compilation; LoopBodyGenerator calls BACK to
│ PirGenerator for nested expressions/statements)
├── calls ──► TypeMethodRegistry.resolve(type, method, args)
│ (instance method dispatch: list.head(), map.get(), etc.)
├── calls ──► PirHelpers.wrapDecode() / wrapEncode()
│ (encode/decode insertion at type boundaries)
└── calls ──► PirHofBuilders.buildMap() / buildFilter() / etc.
(HOF lambda compilation for list.map(), list.filter(), etc.)

PirGenerator.generateMethodCall() implements a 5-level cascading dispatch. After the ADR-018 refactoring, each level is a named method:

private PirTerm generateMethodCall(MethodCallExpr mce) {
    // scope, method, and args are extracted from mce (extraction omitted here)
    PirTerm result;
    // Level 1: BigInteger.valueOf(n) → identity
    result = tryBigIntegerConstant(mce, scope, method, args);
    if (result != null) return result;
    // Level 2: PlutusData.fromPlutusData() → identity
    result = tryFromPlutusDataIdentity(mce, scope, method, args);
    if (result != null) return result;
    // Level 3: (Type)(Object) cast patterns
    result = tryPlutusDataCast(mce, scope, method, args);
    if (result != null) return result;
    // Level 4: Stdlib static methods (Builtins.*, ListsLib.*, Math.*)
    result = tryStaticStdlibCall(mce, scope, method, args);
    if (result != null) return result;
    // Level 5: Instance methods via TypeMethodRegistry.
    // If no instance method matches, this falls through to:
    // record field access → helper method → error
    return resolveInstanceMethodCall(mce, scope, method, args);
}

| Java Call | Dispatch | PIR Output |
| --- | --- | --- |
| list.head() | TypeMethodRegistry → ListType.head | wrapDecode(HeadList(list), elemType) |
| list.map(x -> x+1) | TypeMethodRegistry → ListType.map → PirHofBuilders | LetRec fold with lambda |
| map.get(key) | TypeMethodRegistry → MapType.get | LetRec pair list search |
| value.lovelaceOf() | TypeMethodRegistry → “Value.lovelaceOf” (named dispatch) | Nested map lookup |
| n.abs() | TypeMethodRegistry → IntegerType.abs | IfThenElse(n < 0, 0-n, n) |

StdlibRegistry (in julc-stdlib) provides PIR term builders for ~65 methods. These are methods that cannot be compiled from Java source because they require low-level PIR/UPLC constructs (lambdas as values, LetRec builders, raw builtin chaining).

Categories:

| Category | Examples |
| --- | --- |
| Raw builtins | Builtins.headList(), Builtins.iData(), Builtins.sha2_256() |
| HOF builders | ListsLib.map(), ListsLib.filter(), ListsLib.foldl() |
| Math delegates | Math.abs(), Math.max(), Math.min() |
| Factory methods | Optional.of(), Optional.empty(), PubKeyHash.of() |

These are @OnchainLibrary-annotated Java classes compiled from source:

| Library | Methods | Focus |
| --- | --- | --- |
| ListsLib | 17+ | List operations + HOFs (map, filter, any, all, find, foldl, zip) |
| MapLib | 9 | Map operations (lookup, insert, delete, keys, values) |
| ValuesLib | 9 | Multi-asset Value operations (add, subtract, compare) |
| ContextsLib | 13 | ScriptContext/TxInfo field extraction helpers |
| OutputLib | 8 | TxOut querying (outputsAt, lovelacePaidTo, etc.) |
| MathLib | 8 | Math utilities (abs, pow, divMod, quotRem, expMod) |
| IntervalLib | 5 | Time interval operations |
| CryptoLib | 3 | ECDSA, Schnorr, RIPEMD-160 |
| ByteStringLib | 8 | ByteString manipulation |
| BitwiseLib | 10 | Bitwise operations |
| AddressLib | 3 | Address/credential utilities |
| BlsLib | | BLS12-381 operations (PV11) |
| NativeValueLib | | Native Value operations (PV11) |

When the compiler encounters SomeClass.method(args):

CompositeStdlibLookup
├── 1. StdlibRegistry (builtins, HOFs, math — ~65 methods)
└── 2. LibraryMethodRegistry (compiled @OnchainLibrary methods)

First match wins. If neither matches, the compiler falls through to helper method lookup or reports an error.
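
The "first match wins" rule can be sketched as a composite over an ordered list of lookups. The Lookup interface, the string results, and the map-backed registries are hypothetical simplifications; the real registries return PIR term builders rather than strings.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class CompositeLookupSketch {
    // Hypothetical stand-in for a stdlib/library lookup.
    interface Lookup {
        Optional<String> find(String className, String method);
    }

    // A registry backed by a simple "Class.method" table.
    static Lookup fromMap(Map<String, String> table) {
        return (c, m) -> Optional.ofNullable(table.get(c + "." + m));
    }

    // Consults each lookup in order and stops at the first hit.
    static Lookup composite(List<Lookup> ordered) {
        return (c, m) -> ordered.stream()
            .map(lookup -> lookup.find(c, m))
            .flatMap(Optional::stream)
            .findFirst();
    }
}
```

An empty result from the composite corresponds to the fall-through case, where the compiler tries helper methods next or reports an error.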


Library methods may depend on each other across files. LibraryCompiler (extracted from JulcCompiler in ADR-018) uses a multi-pass retry strategy:

Pass 1: Try compiling all library CUs
→ LibA succeeds (no dependencies)
→ LibB fails (depends on LibA, not yet available)
Pass 2: Try remaining CUs
→ LibB succeeds (LibA now in registry)
Pass 3: No progress → report errors for any remaining
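
A minimal sketch of the retry loop, assuming that a library "compiles" once everything it calls into has already been compiled. The requires map, the class name, and the error message are hypothetical; the real LibraryCompiler attempts an actual compilation and retries on failure.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class MultiPassSketch {
    /** requires maps each library to the libraries it calls into. Returns the compile order. */
    static List<String> compileAll(Map<String, Set<String>> requires) {
        List<String> compiled = new ArrayList<>();
        List<String> pending = new ArrayList<>(new TreeSet<>(requires.keySet())); // stable order
        boolean progress = true;
        while (!pending.isEmpty() && progress) {       // one iteration = one pass
            progress = false;
            for (Iterator<String> it = pending.iterator(); it.hasNext();) {
                String lib = it.next();
                if (compiled.containsAll(requires.get(lib))) {   // stands in for "compilation succeeds"
                    compiled.add(lib);
                    it.remove();
                    progress = true;
                }
            }
        }
        if (!pending.isEmpty()) {                      // a whole pass made no progress
            throw new IllegalStateException("Unresolved libraries: " + pending);
        }
        return compiled;
    }
}
```

The loop terminates because every pass either compiles at least one library or stops with an error.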

Before wrapping library methods as Let bindings around the validator:

  1. Build dependency graph: method A depends on method B if A’s PIR body references B
  2. Kahn’s algorithm sorts dependencies first
  3. Self-recursive methods (detected via PirHelpers.containsVarRef()) use LetRec instead of Let

The VM uses Java’s ServiceLoader pattern for pluggable backends:

// SPI interface
public interface JulcVmProvider {
EvalResult evaluate(Program program, PlutusLanguage language,
ExBudget budget, PlutusData... args);
int priority(); // higher = preferred
}
// Facade (auto-discovers best provider)
JulcVm vm = new JulcVm();
EvalResult result = vm.evaluate(program, scriptContext);

| Backend | Module | Priority | Status |
| --- | --- | --- | --- |
| Scalus | julc-vm-scalus | 50 | Default, wraps Scalus 0.16.0 (Scala) |
| Pure Java | julc-vm-java | | In development |
| Truffle | julc-vm-truffle | | Experimental (GraalVM) |

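
Priority-based discovery can be sketched with java.util.ServiceLoader, which is an Iterable over the providers it finds. The VmProvider interface below is a simplified, hypothetical stand-in for JulcVmProvider, and the priority of 10 used in the usage note is invented for the example (only the Scalus priority of 50 comes from the table).

```java
import java.util.Comparator;
import java.util.Optional;
import java.util.ServiceLoader;
import java.util.stream.StreamSupport;

final class ProviderSelectionSketch {
    // Simplified stand-in for the SPI: just a name and a priority.
    interface VmProvider {
        String name();
        int priority();   // higher = preferred
    }

    record Fixed(String name, int priority) implements VmProvider {}

    // Picks the provider with the highest priority, if any exist.
    static Optional<VmProvider> pickBest(Iterable<VmProvider> providers) {
        return StreamSupport.stream(providers.spliterator(), false)
            .max(Comparator.comparingInt(VmProvider::priority));
    }

    // In a real deployment the candidates come from META-INF/services entries.
    static Optional<VmProvider> discover() {
        return pickBest(ServiceLoader.load(VmProvider.class));
    }
}
```

Given new Fixed("scalus", 50) and new Fixed("other", 10), pickBest selects the Scalus entry; with no providers on the classpath it returns an empty Optional, which a facade would turn into a clear error.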
sealed interface EvalResult {
record Success(Term result, ExBudget budget)
record Failure(String message)
record BudgetExhausted(ExBudget consumed)
}

julc-testkit provides ValidatorTest for JUnit-based testing:

class MyValidatorTest extends ValidatorTest {
@Test
void testValidator() {
// Compile
var result = compile("MyValidator.java source...");
assertNotNull(result.program());
// Evaluate
var evalResult = evaluate(result.program(), testScriptContext);
assertTrue(evalResult instanceof EvalResult.Success);
}
}
┌─────────────────────────────┐
│ E2E Tests │ julc-e2e-tests (CIP-113, on-chain deploy)
│ (Yaci DevKit) │ Requires external devnet
├─────────────────────────────┤
│ Integration Tests │ julc-compiler tests (compile + evaluate)
│ (Compile → Evaluate) │ 3,442+ tests, 0 failures
├─────────────────────────────┤
│ Unit Tests │ Individual component tests
│ (TypeResolver, SymbolTable,│ (PirHelpers, UplcOptimizer, etc.)
│ SubsetValidator, etc.) │
└─────────────────────────────┘
@Test
void testFeature() {
var source = """
@Validator
public class Test {
@Entrypoint
public static boolean validate(PlutusData redeemer, PlutusData ctx) {
// Test logic that should return true
return someCondition;
}
}
""";
// Compile and evaluate in one step
compileAndAssertTrue(source);
}

GoldenUplcTest captures compiled UPLC hex for 8 representative validators. After any refactoring, golden files must match byte-for-byte — ensuring that internal restructuring produces identical output.


Let’s trace a simple validator through the entire pipeline:

@SpendingValidator
public class AlwaysSucceeds {
@Entrypoint
public static boolean validate(PlutusData redeemer, PlutusData scriptContext) {
int x = 5;
int y = x + 3;
return y == 8;
}
}

JavaParser produces a CompilationUnit with one class declaration containing one method with three statements.

SubsetValidator walks the AST. No rejected constructs.

No user-defined types. LedgerTypeRegistry pre-registers ScriptContext, TxInfo, etc.

Lam("redeemer", DataType,
  Lam("scriptContext", DataType,
    Let("x", Const(Integer(5)),
      Let("y", App(App(Builtin(AddInteger), Var("x", IntegerType)), Const(Integer(3))),
        App(App(Builtin(EqualsInteger), Var("y", IntegerType)), Const(Integer(8)))))))

Adds ScriptContext decoding and bool→unit/error:

Lam("__scriptContextData", DataType,
  Let("__ctxFields", SndPair(UnConstrData(Var("__scriptContextData"))),
    Let("__redeemer", HeadList(TailList(Var("__ctxFields"))),
      Let("__result",
        App(App(entrypoint, Var("__redeemer")), Var("__scriptContextData")),
        IfThenElse(Var("__result"), Const(Unit), Error)))))

Let → Apply(Lam(...), ...), named variables → De Bruijn indices, IfThenElse → Force/Delay.
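Two of these lowering steps can be illustrated on a toy lambda calculus: a let binding becomes an applied lambda, and names are replaced by 1-based De Bruijn indices counted from the innermost binder. This is a minimal sketch, not `UplcGenerator`: the `Named` term types, `letBinding`, and the printed notation are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: a toy named lambda calculus, not the real PIR or UPLC term types.
final class DeBruijnDemo {
    sealed interface Named {}
    record NVar(String name) implements Named {}
    record NLam(String param, Named body) implements Named {}
    record NApp(Named fn, Named arg) implements Named {}

    /** let name = value in body  ==>  (\name -> body) value */
    static Named letBinding(String name, Named value, Named body) {
        return new NApp(new NLam(name, body), value);
    }

    /** Distance from the use site to its binder; 1 means the innermost enclosing lambda. */
    static int indexOf(String name, List<String> binders) {
        for (int i = binders.size() - 1; i >= 0; i--) {
            if (binders.get(i).equals(name)) {
                return binders.size() - i;
            }
        }
        throw new IllegalArgumentException("free variable: " + name);
    }

    /** Renders the term with every variable name replaced by its De Bruijn index. */
    static String convert(Named term, List<String> binders) {
        return switch (term) {
            case NVar v -> String.valueOf(indexOf(v.name(), binders));
            case NLam l -> {
                List<String> inner = new ArrayList<>(binders);
                inner.add(l.param()); // the new binder is now the innermost one
                yield "(lam " + convert(l.body(), inner) + ")";
            }
            case NApp a -> "[" + convert(a.fn(), binders) + " " + convert(a.arg(), binders) + "]";
        };
    }

    public static void main(String[] args) {
        // \x -> \y -> x y
        Named term = new NLam("x", new NLam("y", new NApp(new NVar("x"), new NVar("y"))));
        System.out.println(convert(term, List.of())); // (lam (lam [2 1]))
    }
}
```

Copying the binder list at each lambda keeps the sketch simple; a production lowering pass would more likely push and pop a single scope stack.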

  • Constant folding: AddInteger(5, 3) → 8
  • Constant folding: EqualsInteger(8, 8) → True
  • Beta reduction: inline single-use variables
  • Dead code elimination: remove unused bindings
  • Force/Delay cancellation: Force(Delay(True)) → True

After optimization, the validator becomes equivalent to \ctx -> Unit (always succeeds).
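The constant-folding steps in this trace can be sketched on a small expression tree. The code below is an assumption-laden illustration, not `UplcOptimizer`: the `Expr` types and `fold` are hypothetical, and the input is the example after `x` and `y` have already been inlined.

```java
import java.math.BigInteger;

// Sketch only: a toy expression AST, not the real UPLC Term type.
final class ConstFoldDemo {
    sealed interface Expr {}
    record IntConst(BigInteger value) implements Expr {}
    record BoolConst(boolean value) implements Expr {}
    record Var(String name) implements Expr {}
    record Add(Expr left, Expr right) implements Expr {}   // stands in for AddInteger
    record Eq(Expr left, Expr right) implements Expr {}    // stands in for EqualsInteger

    /** Folds bottom-up: children first, then the node itself if both sides are constants. */
    static Expr fold(Expr e) {
        return switch (e) {
            case Add a -> {
                Expr l = fold(a.left());
                Expr r = fold(a.right());
                if (l instanceof IntConst x && r instanceof IntConst y) {
                    yield new IntConst(x.value().add(y.value()));
                }
                yield new Add(l, r);
            }
            case Eq q -> {
                Expr l = fold(q.left());
                Expr r = fold(q.right());
                if (l instanceof IntConst x && r instanceof IntConst y) {
                    yield new BoolConst(x.value().equals(y.value()));
                }
                yield new Eq(l, r);
            }
            case IntConst c -> c;
            case BoolConst b -> b;
            case Var v -> v; // unknown at compile time, left untouched
        };
    }

    static IntConst num(long n) {
        return new IntConst(BigInteger.valueOf(n));
    }

    public static void main(String[] args) {
        // EqualsInteger(AddInteger(5, 3), 8)
        Expr inlined = new Eq(new Add(num(5), num(3)), num(8));
        System.out.println(fold(inlined)); // BoolConst[value=true]
    }
}
```

A single bottom-up traversal is enough here because folding the inner `Add` exposes the constant that the outer `Eq` needs; passes that enable each other across traversals are what make fixpoint iteration useful in a real optimizer.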

Program.plutusV3(optimizedTerm) → FLAT encoding → CBOR wrapping → hex string ready for on-chain submission.
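The last two steps of this stage, CBOR wrapping and hex encoding, can be sketched independently of FLAT. The example assumes the FLAT bytes already exist (the input here is placeholder data) and wraps them once as a CBOR byte string following RFC 8949; `CborHexDemo` and its methods are hypothetical names, and the real serializer may differ in how many wrapping layers it applies.

```java
import java.io.ByteArrayOutputStream;
import java.util.HexFormat;

// Sketch only: wraps already-encoded bytes; it does not implement FLAT itself.
final class CborHexDemo {
    /** Encodes the payload as a CBOR byte string (major type 2, RFC 8949). */
    static byte[] cborByteString(byte[] payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int n = payload.length;
        if (n < 24) {
            out.write(0x40 | n);          // length fits in the initial byte
        } else if (n < 0x100) {
            out.write(0x58);              // 1-byte length follows
            out.write(n);
        } else if (n < 0x10000) {
            out.write(0x59);              // 2-byte big-endian length follows
            out.write(n >>> 8);
            out.write(n);
        } else {
            out.write(0x5A);              // 4-byte big-endian length follows
            out.write(n >>> 24);
            out.write(n >>> 16);
            out.write(n >>> 8);
            out.write(n);
        }
        out.write(payload, 0, n);
        return out.toByteArray();
    }

    /** Lowercase hex string (java.util.HexFormat requires Java 17+). */
    static String toHex(byte[] bytes) {
        return HexFormat.of().formatHex(bytes);
    }

    public static void main(String[] args) {
        byte[] placeholderFlat = {0x01, 0x02, 0x03}; // stand-in for real FLAT output
        System.out.println(toHex(cborByteString(placeholderFlat))); // 43010203
    }
}
```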


Top-level:

| File | Lines | Role |
| --- | --- | --- |
| JulcCompiler.java | 1,221 | Main pipeline orchestrator (24-phase compilation) |
| LibraryCompiler.java | 138 | Library compilation sub-pipeline |
| CompileResult.java | | Compilation result (program + diagnostics + params) |
| CompilerException.java | | Fatal compiler error |
| CompilerOptions.java | | Compilation options |
| LibrarySourceResolver.java | | Classpath scanning + transitive BFS library resolution |

pir/ — PIR generation subsystem:

| File | Lines | Role |
| --- | --- | --- |
| PirGenerator.java | 2,147 | Core Java AST → PIR transformer |
| LoopBodyGenerator.java | 531 | Loop body compilation (5 paths × break/no-break) |
| AccumulatorTypeAnalyzer.java | 432 | Accumulator type analysis (pair list detection) |
| TypeInferenceHelper.java | 282 | Read-only type inference |
| TypeMethodRegistry.java | 905 | Instance method dispatch (~50 methods across 11 types) |
| PirHelpers.java | 356 | wrapDecode/wrapEncode + list utilities |
| PirHofBuilders.java | 244 | HOF PIR builders (map, filter, any, all, find, foldl, zip) |
| PirTerm.java | | PIR term AST (12 variants) |
| PirType.java | | PIR type system (13+ variants) |
| PirFormatter.java | | PIR pretty-printing |
| PirSubstitution.java | | PIR variable substitution |
| StdlibLookup.java | | Functional interface for stdlib resolution |
| CompositeStdlibLookup.java | | Chains multiple StdlibLookup instances |

resolve/ — Type resolution:

| File | Role |
| --- | --- |
| TypeResolver.java | Java → PIR type mapping |
| TypeRegistrar.java | Topological type registration (Kahn’s algorithm) |
| SymbolTable.java | Scope stack for variable/method management |
| LedgerSourceLoader.java | Dynamic ledger type loading from META-INF |
| LibraryMethodRegistry.java | Compiled library method storage + typed coercion |
| ImportResolver.java | Import resolution |

Other packages:

| File | Role |
| --- | --- |
| codegen/ValidatorWrapper.java | ScriptContext decoding + bool→unit/error wrapping |
| codegen/DataCodecGenerator.java | Data codec generation |
| desugar/LoopDesugarer.java | For-each/while → LetRec transformation |
| desugar/PatternMatchDesugarer.java | Switch/instanceof → DataMatch transformation |
| error/CompilerDiagnostic.java | Diagnostic record (level, message, location) |
| error/DiagnosticCollector.java | Structured error collection |
| uplc/UplcGenerator.java | PIR → UPLC lowering |
| uplc/UplcOptimizer.java | 6-pass UPLC optimizer with fixpoint iteration |
| validate/SubsetValidator.java | Java subset enforcement |
| util/MethodDependencyResolver.java | Method dependency graph construction |
| util/StringUtils.java | String utilities (Levenshtein distance) |

| File | Role |
| --- | --- |
| julc-core/.../Term.java | UPLC term AST (10 variants) |
| julc-core/.../DefaultFun.java | 102 Plutus builtin functions |
| julc-core/.../PlutusData.java | Universal on-chain data encoding |
| julc-stdlib/.../StdlibRegistry.java | PIR term builders for ~65 stdlib methods |
| julc-vm/.../JulcVmProvider.java | VM SPI interface |
| julc-testkit/.../ValidatorTest.java | Testing base class |

| Term | Definition |
| --- | --- |
| UPLC | Untyped Plutus Lambda Calculus, the on-chain execution language |
| PIR | Plutus Intermediate Representation, the typed bridge between Java and UPLC |
| De Bruijn index | Variable indexing where variables reference lambda binders by distance (1 = innermost) |
| Data | Universal on-chain value encoding (5 constructors: Constr, Map, List, I, B) |
| CEK machine | Control-Environment-Kontinuation machine, the abstract machine that evaluates Plutus scripts |
| ScriptContext | Ledger data passed to every validator (TxInfo + redeemer + script info) |
| eUTxO | Extended Unspent Transaction Output, Cardano’s accounting model |
| Force/Delay | UPLC constructs for lazy evaluation (needed because UPLC is strict/call-by-value) |
| Z-combinator | Strict fixed-point combinator enabling recursion in UPLC |
| wrapDecode | PirHelpers method to extract a typed value from raw Data |
| wrapEncode | PirHelpers method to wrap a typed value back into Data |
| SumType | Tagged union: a sealed interface with record variants |
| RecordType | Product type: a record with named typed fields |
| LetRec | Recursive let binding, used for loops and self-recursive functions |
| Accumulator | Variable modified across loop iterations (packed into Data tuples for multi-acc) |
| Kahn’s algorithm | Topological sort used for type registration and library ordering |
| StdlibLookup | Interface for resolving static method calls to PIR terms |
| TypeMethodRegistry | Registry mapping (PirType, method) pairs to instance method handlers |
| @OnchainLibrary | Annotation marking a class as a reusable on-chain library |
| @Param | Annotation marking a validator field as a deployment-time parameter |
| @Entrypoint | Annotation marking the main validator method |
| FLAT | Binary encoding format for UPLC programs on-chain |
| SOPs | Sums of Products: Plutus V3 constructor/case terms |
| Golden test | Test that compares output byte-for-byte against a saved reference file |
| Conway era | Current Cardano era with governance features (CIP-1694) |
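Kahn’s algorithm, listed in the glossary as the ordering used for type registration, can be sketched as follows. This is a generic illustration rather than the `TypeRegistrar` implementation: `KahnDemo`, `topoSort`, and the example type names are hypothetical, and the input is assumed to map each type to the types it depends on.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: a generic Kahn's-algorithm topological sort over type names.
final class KahnDemo {
    /** Returns the names ordered so that every dependency precedes its dependents. */
    static List<String> topoSort(Map<String, List<String>> deps) {
        Map<String, Integer> unmet = new LinkedHashMap<>();       // unresolved dependency count
        Map<String, List<String>> dependents = new HashMap<>();   // reverse edges
        for (Map.Entry<String, List<String>> e : deps.entrySet()) {
            unmet.putIfAbsent(e.getKey(), 0);
            for (String dep : e.getValue()) {
                unmet.putIfAbsent(dep, 0);
                unmet.merge(e.getKey(), 1, Integer::sum);
                dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(e.getKey());
            }
        }

        // Start with every node that depends on nothing.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : unmet.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String next = ready.remove();
            order.add(next);
            // Emitting a node satisfies one dependency of each of its dependents.
            for (String d : dependents.getOrDefault(next, List.of())) {
                if (unmet.merge(d, -1, Integer::sum) == 0) {
                    ready.add(d);
                }
            }
        }

        // Nodes that were never emitted sit on a dependency cycle.
        if (order.size() != unmet.size()) {
            throw new IllegalStateException("cyclic type dependency");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("Order", List.of("Asset", "Owner"));
        deps.put("Asset", List.of());
        deps.put("Owner", List.of());
        System.out.println(topoSort(deps)); // [Asset, Owner, Order]
    }
}
```

The cycle check falls out of the algorithm for free: any type left with unmet dependencies when the queue empties must be part of a cycle, which a compiler would want to report as an error.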