lab

Current experiments.

Two public projects are live now: one playable game arena, one fresh cube-solving benchmark receipt.

playable eval · live

Octopus Arena

Eleven local model lanes built the same JavaScript space shooter. Play the games, rate them, and compare public ratings against code review.

Open project →

visual benchmark · new

Rubik's Arena

Models solve deterministic 3x3 cube states under a strict referee. The first receipt proves the visual net, traces, replay, and leaderboard pipeline.

Open project →