Research

We build what doesn't exist yet.

Three active research directions: Physical AI (primary focus), Armenian & low-resource language AI, and visual document understanding. We publish openly and put models on Hugging Face.

Direction 01
Primary Focus · Physical AI

Where AI meets the real world.

We believe the most interesting unsolved problems in AI are about understanding and acting in physical environments. Our research sits at the intersection of two hard problems: reasoning about the physical world and planning actions within it.

Thread A

Real-World Intelligence

Training VLMs to reason about spatial structure, scene geometry, and physical constraints — moving beyond "what is this" to "where is this, how does it move, what will happen next."

  • Spatial intelligence and 3D scene understanding in VLMs
  • Embodied reasoning — what can and can't happen in a physical scene
  • Depth, affordance, and object permanence in vision-language models

Thread B

Robotic Control & Planning

Joint Embedding Predictive Architectures (JEPA) for long-horizon robotic planning. Teaching models to predict useful representations of future states, not just next tokens.

  • JEPA-based planners for multi-step manipulation tasks
  • Vision-Language-Action models for dexterous control
  • Long-horizon prediction without step-by-step supervision
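
To make the idea of planning in representation space concrete, here is a minimal toy sketch. It is not the architecture used in our research: the encoder and predictor are hand-written stand-ins for learned networks, the 2D grid world is invented for illustration, and every name (`encode`, `predict`, `plan`, `ACTIONS`) is hypothetical. It shows only the general pattern: encode the current and goal observations, roll candidate action sequences forward with a predictor that operates on latents, and keep the sequence whose predicted final latent lands closest to the goal latent.

```python
from itertools import product

# Hypothetical action set for a toy 2D world: unit moves along x or y.
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def encode(observation):
    """Toy encoder: map an (x, y) observation to a latent vector.

    A real JEPA encoder is a learned network; this scaling is a stand-in.
    """
    x, y = observation
    return (0.5 * x, 0.5 * y)


def predict(latent, action):
    """Toy predictor: next latent from the current latent and an action.

    It works purely in latent space and never reconstructs the observation.
    """
    dx, dy = action
    return (latent[0] + 0.5 * dx, latent[1] + 0.5 * dy)


def latent_distance(a, b):
    """Squared Euclidean distance between two latent vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def plan(start_obs, goal_obs, horizon):
    """Search all action sequences of length `horizon`.

    Each sequence is rolled out with the predictor in latent space; the
    sequence whose final predicted latent is closest to the goal latent wins.
    """
    z_goal = encode(goal_obs)
    best_seq, best_cost = None, float("inf")
    for seq in product(ACTIONS, repeat=horizon):
        z = encode(start_obs)
        for action in seq:
            z = predict(z, action)
        cost = latent_distance(z, z_goal)
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return list(best_seq), best_cost


if __name__ == "__main__":
    actions, cost = plan(start_obs=(0, 0), goal_obs=(2, 1), horizon=3)
    print(actions, cost)
```

Exhaustive search is only workable for tiny horizons and action sets; it stands in here for whatever optimizer a real planner would use.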

Latest paper

JEPA for Long-Horizon Robotic Planning

Submitted to NeurIPS 2026. JEPA-based architectures trained for extended planning over physical action sequences in robotics tasks.

NeurIPS 2026 · Under Review

Direction 02
Open Science · Pro Bono · Armenian AI & Low-Resource Languages

Armenian AI — and a blueprint for every underrepresented language.

We are Armenians. We invest our own budget and time into building AI infrastructure for Armenian, not because it's a business, but because it matters. The methods we develop carry over to other low-resource languages; we have already validated them on Georgian and Uzbek.

🇦🇲 Released

ATE-1 & ATE-2

Armenian Text Embeddings: state-of-the-art embedding models for Armenian. They outperform Gemini and OpenAI embeddings on every Armenian benchmark.

#1 on ArmBench-TextEmbed

Open

ArmBench Leaderboards

ArmBench-TextEmbed and ArmBench-LLM — the first open benchmarks for evaluating text embedding and LLM performance in Armenian.

First Armenian benchmarks

In progress

Armenian LLM

Training an LLM with a localized tokenizer from scratch. Existing models tokenize Armenian inefficiently; we're fixing that with a tokenizer built for the language and our own data pipeline.

Active training
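
One reason tokenizers tend to be inefficient for Armenian can be seen at the byte level. Armenian letters each take two bytes in UTF-8, so a byte-level tokenizer with few or no learned merges for Armenian script may fall back to roughly one token per byte. The sketch below measures only that byte-level upper bound. It does not run any real tokenizer, the function names are hypothetical, and the sample words are chosen purely for illustration.

```python
def utf8_bytes_per_char(text):
    """Average number of UTF-8 bytes per character in `text`."""
    return len(text.encode("utf-8")) / len(text)


def worst_case_byte_tokens(text):
    """Token count if a byte-level tokenizer had no merges for this text.

    One token per UTF-8 byte: an upper bound, not a measurement of any
    particular model's tokenizer.
    """
    return len(text.encode("utf-8"))


if __name__ == "__main__":
    # Illustrative samples: the same country name in English and Armenian.
    samples = {"English": "Armenia", "Armenian": "Հայաստան"}
    for language, word in samples.items():
        print(
            language,
            "chars:", len(word),
            "worst-case byte tokens:", worst_case_byte_tokens(word),
            "bytes/char:", utf8_bytes_per_char(word),
        )
```

A tokenizer trained on Armenian text can learn merges over these byte pairs and whole word pieces, which is the kind of gain a localized tokenizer is after.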

Published · EACL 2026 · LoResLM Workshop

Adapting Text Embeddings to Low-Resource Languages with Noisy Translations

SOTA embedding quality in a low-resource language using only 10k noisy translations — no large parallel corpora required. Applied to Armenian; validated on Georgian and Uzbek. Method generalises to any low-resource language.

EACL 2026 · Published

Direction 03
Past Research · Document AI

We pioneered visual document retrieval. Then moved on.

We built ColPali-style multimodal embeddings for visual document retrieval and held the #1 rank on the ViDoRe benchmark globally. The field matured and became crowded — we turned our attention to harder problems.
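
ColPali-style retrievers represent a query and a page as sets of vectors (one per query token, one per image patch) and compare them with late-interaction scoring: for each query vector, take its best-matching page vector, then sum those maxima. The sketch below shows that scoring rule in plain Python, assuming tiny hand-made 2D vectors in place of real model outputs. The function names and the example pages are hypothetical, and this is not our production code.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs, page_vecs):
    """Late-interaction score: sum over query vectors of the best page match."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def rank_pages(query_vecs, pages):
    """Return (page_name, score) pairs sorted from best to worst match."""
    scored = [(name, maxsim_score(query_vecs, vecs)) for name, vecs in pages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical embeddings: two query-token vectors, a few patch vectors per page.
    query = [[1.0, 0.0], [0.0, 1.0]]
    pages = {
        "invoice": [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
        "memo": [[0.5, 0.5], [0.0, 0.5]],
    }
    for name, score in rank_pages(query, pages):
        print(name, score)
```

Because every query vector is matched independently, a page can score well when different patches answer different parts of the query, which suits pages that mix text, tables, and figures.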

ViDoRe Benchmark · Globally #1 (held)

ColQwen Multimodal Document Embedding Series

Visual document retrieval embeddings that ranked #1 on the ViDoRe benchmark. The models remain available and are used in production; the research direction is discontinued.

View on Hugging Face →

Everything is open.

All our models and benchmarks are public. We believe open science makes everyone better.

Metric-AI on Hugging Face · Collaborate with us →