OktoScript

A decision-driven language for training, evaluating and governing AI models

A domain-specific language (DSL) designed for autonomous AI pipelines with built-in decision, control, monitoring and governance capabilities

About OktoScript

OktoScript is a decision-driven language created by OktoSeek AI to design, train, evaluate, control and govern AI models end-to-end.

It goes far beyond a simple training script. OktoScript introduces native intelligence, autonomous decision-making and behavioral control into the AI development lifecycle.

OktoScript uses the file extension .okt and is executed by the OktoEngine, which is integrated into the OktoSeek IDE. This allows you to define complete AI training pipelines with built-in decision logic using a simple, declarative syntax.

Currently at version 1.2, OktoScript is the official language of the OktoSeek ecosystem, used by the OktoSeek IDE, OktoEngine, the VS Code extension, and other AI development tools.

Key Features:
  • Decision-driven - Built-in CONTROL logic (IF, WHEN, SET, STOP, LOG, SAVE…)
  • Human-readable - Intuitive syntax that engineers and non-engineers can understand
  • Self-monitoring - Tracks metrics, detects anomalies and adapts automatically
  • Behavior-aware - Control personality, language, restrictions and style
  • Safe by design - Integrated GUARD and SECURITY layers
  • Strongly structured - Validated, deterministic and reproducible pipelines
  • Training-aware - Created specifically for AI training and optimization
  • Extensible - Can be extended through OktoEngine and custom modules

🚀 Quick Start - Your First 5 Minutes

Get started with OktoScript in just 5 minutes. Follow these simple steps:

Step 1: Create Your First File

Create a file named train.okt:

PROJECT "MyFirstModel"
DESCRIPTION "My first OktoScript project"
DATASET {
    train: "dataset/train.jsonl"
    format: "jsonl"
    type: "chat"
}
MODEL {
    base: "oktoseek/base-mini"
}
TRAIN {
    epochs: 3
    batch_size: 16
    device: "cpu"
}
EXPORT {
    format: ["okm"]
    path: "export/"
}

Step 2: Prepare Your Dataset

Create dataset/train.jsonl:

{"input":"Hello","output":"Hi! How can I help?"}
{"input":"What's the weather?","output":"I don't have weather data."}
{"input":"Thank you","output":"You're welcome!"}
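Before handing the file to the engine, it can be useful to check it yourself. The sketch below writes the three sample records and confirms that every line is a JSON object with input and output keys. The helper names write_jsonl and check_jsonl are hypothetical and are not part of OktoEngine; the only assumption is the one-object-per-line layout shown above.

```python
import json
import tempfile
from pathlib import Path

# The three sample records from Step 2.
SAMPLES = [
    {"input": "Hello", "output": "Hi! How can I help?"},
    {"input": "What's the weather?", "output": "I don't have weather data."},
    {"input": "Thank you", "output": "You're welcome!"},
]


def write_jsonl(path, records):
    """Write one JSON object per line (hypothetical helper)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def check_jsonl(path, required=("input", "output")):
    """Return a list of (line_number, problem) pairs for malformed lines."""
    problems = []
    with Path(path).open(encoding="utf-8") as fh:
        for number, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # blank lines are skipped
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((number, f"invalid JSON: {exc.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((number, "not a JSON object"))
                continue
            missing = [key for key in required if key not in record]
            if missing:
                problems.append((number, f"missing keys: {missing}"))
    return problems


if __name__ == "__main__":
    # Demo in a temporary directory so nothing in the project is overwritten.
    target = Path(tempfile.mkdtemp()) / "dataset" / "train.jsonl"
    write_jsonl(target, SAMPLES)
    print(check_jsonl(target))
```

An empty list means every line parsed and carried both keys. This check does not replace okto validate, which validates the .okt configuration itself.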

Step 3: Validate & Train

# Validate your configuration
okto validate train.okt
# Train your model
okto train train.okt

That's it! Your model will be trained and exported automatically.

What is OktoScript?

OktoScript allows you to define:

  • How a model is trained - Full fine-tuning, LoRA adapters, multi-dataset training
  • How it should behave - Personality, verbosity, language, response style
  • How it should react to problems - Automatic parameter adjustment, early stopping, checkpoint management
  • How and when it should stop, adapt or improve itself - CONTROL block with conditional logic

All using clear, readable and structured commands, built specifically for AI engineering.

Why OktoScript is Different

Traditional AI development is reactive.
You manually monitor metrics, fix problems and restart training.

OktoScript is proactive.

It allows the model to:

  • Detect instability automatically
  • Reduce or increase learning rate based on metrics
  • Adapt batch size based on GPU memory
  • Stop when performance drops
  • Save only the best checkpoints
  • Apply rules when patterns are detected

In other words, OktoScript doesn't just train models — it governs intelligence.

💡 Two Ways to Use OktoSeek:
  • No-Code Interface: Create agents and models using the visual IDE
  • OktoScript (Advanced): Use our decision-driven language for full customization and autonomous control

Use Cases & Examples

OktoScript fits a range of AI development scenarios:

1. Conversational AI & Chatbots

Create intelligent chatbots for customer service, support, or entertainment.

PROJECT "CustomerSupportBot"
DATASET {
    train: "dataset/support_qa.jsonl"
    format: "jsonl"
    type: "chat"
}
MODEL {
    base: "oktoseek/chat-base"
    architecture: "transformer"
}
TRAIN {
    epochs: 10
    batch_size: 32
    device: "cuda"
}

2. Computer Vision & Image Classification

Build vision models for object detection, image classification, or visual understanding.

PROJECT "ImageClassifier"
DATASET {
    train: "dataset/images/"
    format: "image+caption"
    type: "vision"
    augmentation: ["flip", "rotate"]
}
MODEL {
    base: "oktoseek/vision-base"
    architecture: "vision-transformer"
}

3. Fine-Tuning Large Language Models

Fine-tune LLMs with checkpoint resume, custom callbacks, and advanced configurations.

See: finetuning-llm.okt example

4. Question-Answering Systems

Build QA systems with embeddings and similarity search for semantic retrieval.

See: qa-embeddings.okt example

5. Recommendation Systems

Create recommendation engines for e-commerce, content platforms, or personalized experiences.

See: recommender.okt example

Basic File Structure

Every OktoScript file (.okt) follows this general structure:

PROJECT "ProjectName" {
    DATASET { ... }
    MODEL { ... }
    TRAIN { ... }
    # Optional blocks:
    EVALUATE { ... }
    INFER { ... }
    EXPORT { ... }
    DEPLOY { ... }
}

Required blocks: PROJECT, DATASET, MODEL, TRAIN

Optional blocks: EVALUATE, INFER, EXPORT, DEPLOY

PROJECT Block

The PROJECT block defines your project metadata, including name, description, author, and version.

PROJECT "PizzaBot"
DESCRIPTION "AI specialized in pizza restaurant service"
VERSION "1.0"
AUTHOR "OktoSeek"
TAGS ["food", "restaurant", "chatbot"]

Available fields:

  • PROJECT - Project name (required)
  • DESCRIPTION - Project description (optional)
  • VERSION - Version string (optional)
  • AUTHOR - Author name (optional)
  • TAGS - List of tags (optional)

DATASET Block

The DATASET block defines your training data, including paths, format, type, and language.

DATASET {
    train: "dataset/train.jsonl"
    validation: "dataset/val.jsonl"
    test: "dataset/test.jsonl"
    format: "jsonl"
    type: "chat"
    language: "en"
}

Supported formats: JSONL, CSV, TXT, Parquet, Image+Caption, QA, Instruction, Multi-modal

Supported types: classification, generation, qa, chat, vision, regression

Supported languages: en, pt, es, fr, multilingual

MODEL Block

The MODEL block defines your base model, architecture, parameters, and precision settings. It can optionally include an ADAPTER sub-block to apply parameter-efficient fine-tuning methods such as LoRA, QLoRA, PEFT, or other adapters.

MODEL {
    name: "oktogpt"
    base: "google/flan-t5-base"
    architecture: "transformer"
    parameters: 120M
    context_window: 2048
    precision: "fp16"
    device: "cuda"
    ADAPTER {
        type: "lora"
        path: "D:/model_trainee/phase1_sharegpt/ep2"
        rank: 16
        alpha: 32
    }
}

Available fields:

  • name - Model name (optional)
  • base - Base model name (required)
  • architecture - transformer, cnn, rnn, diffusion, vision-transformer
  • parameters - Model size (e.g., 120M, 7B)
  • context_window - Maximum context length
  • precision - fp32, fp16, int8, int4
  • device - cuda, cpu, mps, auto
  • ADAPTER - LoRA/PEFT adapter configuration (optional)

ADAPTER types: lora, qlora, adapter, peft
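To give a feel for what rank and alpha mean, the sketch below applies the standard LoRA formulation: the low-rank update is scaled by alpha / rank, and a weight matrix of shape d_out x d_in gains rank * (d_in + d_out) trainable parameters. Whether OktoEngine follows this formulation exactly is an assumption, and the 768 x 768 layer size is chosen only for illustration.

```python
def lora_scaling(rank: int, alpha: int) -> float:
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return alpha / rank


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# rank and alpha come from the ADAPTER example above.
RANK, ALPHA = 16, 32
# Illustrative layer size (an assumption, not read from any model).
D_IN = D_OUT = 768

scaling = lora_scaling(RANK, ALPHA)
adapter_params = lora_trainable_params(D_IN, D_OUT, RANK)
full_params = D_IN * D_OUT

if __name__ == "__main__":
    print(f"scaling factor: {scaling}")
    print(f"adapter params per layer: {adapter_params} of {full_params}")
```

With these numbers the adapter trains 24,576 parameters for a layer that holds 589,824, which is why adapters reduce memory use during fine-tuning.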

TRAIN Block

The TRAIN block is one of the most important blocks. It defines all training hyperparameters, optimizer settings, and training strategy.

TRAIN {
    epochs: 10
    batch_size: 32
    gradient_accumulation: 2
    learning_rate: 0.00025
    optimizer: "adamw"
    scheduler: "cosine"
    loss: "cross_entropy"
    device: "cuda"
    gpu: true
    mixed_precision: true
    early_stopping: true
    checkpoint_steps: 100
    output: "./models/pizzabot-v2"
}

Key parameters:

  • epochs - Number of training epochs
  • batch_size - Batch size for training
  • learning_rate - Learning rate
  • optimizer - adam, adamw, sgd, rmsprop
  • scheduler - linear, cosine, step
  • device - cpu, cuda, mps
  • early_stopping - Enable early stopping
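Two of these settings interact in ways that are easy to miss. The sketch below shows the usual meaning of gradient accumulation (the effective batch size is batch_size times gradient_accumulation) and a plain cosine learning-rate schedule. Both follow common deep-learning conventions; the source does not specify OktoEngine's exact scheduler (for example whether it uses warmup), so treat the functions as illustrative assumptions.

```python
import math


def effective_batch_size(batch_size: int, gradient_accumulation: int = 1) -> int:
    """Samples contributing to each optimizer update."""
    return batch_size * gradient_accumulation


def cosine_lr(step: int, total_steps: int, base_lr: float, min_lr: float = 0.0) -> float:
    """Cosine annealing from base_lr at step 0 down to min_lr at total_steps."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Values taken from the TRAIN example above; 1000 total steps is arbitrary.
    print(effective_batch_size(32, 2))
    for step in (0, 250, 500, 750, 1000):
        print(step, cosine_lr(step, 1000, 0.00025))
```

With the example values, each optimizer update sees 64 samples, and the learning rate falls from 0.00025 to half that value at the midpoint of training.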

EVALUATE Block

The EVALUATE block defines which metrics to calculate during evaluation.

EVALUATE {
    metrics: [
        "loss",
        "accuracy",
        "perplexity",
        "f1",
        "bleu",
        "rouge"
    ]
}

Supported metrics: loss, accuracy, perplexity, precision, recall, f1, bleu, rouge, mae, mse, cosine_similarity, token_efficiency, response_coherence, hallucination_score
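For readers who want to see what a few of these metrics compute, here is a minimal sketch using the textbook definitions of accuracy, precision, recall, F1 and perplexity. It is not OktoEngine's implementation, and the function names are hypothetical.

```python
import math


def binary_classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Perplexity is the exponential of the mean cross-entropy (natural log)."""
    return math.exp(mean_cross_entropy)


if __name__ == "__main__":
    print(binary_classification_metrics([1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1]))
    print(perplexity_from_loss(2.0))
```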

🧠 CONTROL Block — Decision Engine

The CONTROL block enables logical, conditional, event-based, and metric-based decisions during training and inference. It lets you declare, in one place, how a model should make decisions, adjust itself, and regulate itself.

OktoScript enables true declarative AI governance. CONTROL blocks can contain nested conditions and nested event triggers, making it a unique declarative decision-making language in the market.

Supported Events

  • on_step_end - Executed at the end of each training step
  • on_epoch_end - Executed at the end of each epoch
  • validate_every - Execute validation every X steps
  • on_memory_low - Triggered when GPU/RAM is low
  • on_nan - Triggered when NaN values are detected
  • on_plateau - Triggered when loss is stagnant

Supported Directives

IF, WHEN, EVERY, SET, STOP, LOG, SAVE, RETRY, REGENERATE, STOP_TRAINING, DECREASE, INCREASE

Example with Nested Blocks

CONTROL {
    on_epoch_end {
        IF loss > 2.0 {
            SET LR = 0.00005
            LOG "High loss detected"
            WHEN gpu_usage > 90% {
                SET batch_size = 16
                LOG "Reducing batch size due to GPU pressure"
            }
            IF val_loss > 3.0 {
                STOP_TRAINING
            }
        }
        IF accuracy > 0.9 {
            SAVE "best_model"
            LOG "High accuracy reached"
        }
        EVERY 2 epochs {
            SAVE "checkpoint_epoch_{epoch}"
        }
    }
    validate_every: 200
    IF epoch > 5 AND accuracy < 0.6 {
        SET LR = 0.00001
        LOG "Model is stagnated"
    }
}

Philosophy: OktoScript keeps the surface clean and simple, while the engine behind it performs complex cognitive decision-making. CONTROL defines logic, MONITOR defines awareness, GUARD defines safety, BEHAVIOR defines personality.
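To make the nested rules concrete, the following sketch hand-translates the on_epoch_end part of the example into plain Python. It is only one reading of the rules: WHEN is treated as an ordinary condition and EVERY 2 epochs as "epoch number divisible by 2", both of which are assumptions about semantics the source does not spell out. The function and dictionary keys are hypothetical and this is not how OktoEngine is implemented.

```python
def on_epoch_end(metrics, state):
    """Apply the example's on_epoch_end rules and return the actions taken.

    metrics: dict with epoch, loss, val_loss, accuracy, gpu_usage (percent).
    state:   mutable dict with learning_rate, batch_size, stop.
    """
    actions = []

    if metrics["loss"] > 2.0:
        state["learning_rate"] = 0.00005          # SET LR = 0.00005
        actions.append('LOG "High loss detected"')
        if metrics["gpu_usage"] > 90:             # WHEN gpu_usage > 90%
            state["batch_size"] = 16              # SET batch_size = 16
            actions.append('LOG "Reducing batch size due to GPU pressure"')
        if metrics["val_loss"] > 3.0:
            state["stop"] = True                  # STOP_TRAINING
            actions.append("STOP_TRAINING")

    if metrics["accuracy"] > 0.9:
        actions.append('SAVE "best_model"')
        actions.append('LOG "High accuracy reached"')

    epoch = metrics["epoch"]
    if epoch % 2 == 0:                            # EVERY 2 epochs (assumed meaning)
        actions.append(f'SAVE "checkpoint_epoch_{epoch}"')

    return actions


if __name__ == "__main__":
    state = {"learning_rate": 0.00025, "batch_size": 32, "stop": False}
    sample = {"epoch": 2, "loss": 2.4, "val_loss": 3.2, "accuracy": 0.55, "gpu_usage": 95}
    for action in on_epoch_end(sample, state):
        print(action)
    print(state)
```

The point of the comparison is that the declarative block replaces this kind of hand-written callback: the conditions live in the .okt file rather than in training code.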

📊 MONITOR Block — Full Metrics Support

The MONITOR block tracks ANY available training or system metric. It supports all native and custom metrics, including loss, accuracy, GPU usage, throughput, latency, confidence, and more.

MONITOR {
    metrics: [
        "loss",
        "accuracy",
        "val_loss",
        "gpu_usage",
        "ram_usage",
        "throughput",
        "latency",
        "confidence"
    ]
    notify_if {
        loss > 2.0
        gpu_usage > 90%
        temperature > 85
        hallucination_score > 0.5
    }
    log_to: "logs/training.log"
}
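The notify_if rules are simple threshold checks. As a rough illustration, the sketch below encodes the four thresholds from the example as data and reports which ones a metrics snapshot crosses. The rule table and the function name are hypothetical; how OktoEngine evaluates and delivers notifications is not described in the source.

```python
import operator

# Comparison operators a rule may use.
OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le}

# Thresholds copied from the notify_if example (gpu_usage in percent).
RULES = [
    ("loss", ">", 2.0),
    ("gpu_usage", ">", 90.0),
    ("temperature", ">", 85.0),
    ("hallucination_score", ">", 0.5),
]


def triggered_alerts(metrics, rules=RULES):
    """Return a message for every rule whose threshold is crossed.

    Metrics missing from the snapshot are skipped rather than treated as errors.
    """
    alerts = []
    for name, op, threshold in rules:
        if name in metrics and OPS[op](metrics[name], threshold):
            alerts.append(f"{name} {op} {threshold} (current: {metrics[name]})")
    return alerts


if __name__ == "__main__":
    snapshot = {"loss": 2.5, "gpu_usage": 70, "temperature": 86}
    for alert in triggered_alerts(snapshot):
        print(alert)
```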

🛡️ GUARD Block — Safety / Ethics / Protection

The GUARD block defines safety rules during generation and training. The engine knows exactly what to prevent, how to detect violations, and what action to take.

GUARD {
    prevent {
        hallucination
        toxicity
        bias
        data_leak
        unsafe_code
        personal_data
        illegal_content
    }
    detect_using: ["classifier", "regex", "embedding"]
    on_violation {
        REPLACE with_message: "Sorry, this request is not allowed."
    }
}

Detection methods: classifier, embedding, regex, rule_engine, ml_model
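Of the detection methods listed, regex is the easiest to picture. The sketch below shows a regex-based check for one category, personal_data, followed by the REPLACE action from the example. The e-mail pattern and the function name are illustrative assumptions; the patterns OktoEngine actually ships with are not documented here.

```python
import re

REFUSAL = "Sorry, this request is not allowed."

# One illustrative pattern per guarded category (assumed, not from OktoEngine).
PATTERNS = {
    "personal_data": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),  # e-mail-like strings
}


def apply_guard(text, patterns=PATTERNS, message=REFUSAL):
    """Return (output, violated_category); output is replaced on a violation."""
    for category, pattern in patterns.items():
        if pattern.search(text):
            return message, category
    return text, None


if __name__ == "__main__":
    print(apply_guard("You can reach the manager at jane@example.com"))
    print(apply_guard("We close at 10 pm."))
```

A regex only catches patterns that can be written down in advance, which is presumably why the block also allows classifier and embedding detectors.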

🎭 BEHAVIOR Block — Model Personality

The BEHAVIOR block defines how the model should behave in chat/inference. It sets personality, verbosity, language, and content restrictions.

BEHAVIOR {
    mode: "chat"
    personality: "friendly"
    verbosity: "medium"
    language: "en"
    avoid: ["violence", "hate", "politics"]
    fallback: "How can I help you?"
    prompt_style: "User: {input}\nAssistant:"
}

Personality types: professional, friendly, assistant, casual, formal, creative

💬 INFERENCE Block — Advanced Configuration

The INFERENCE block defines how the model behaves during inference, prediction, or interactive chat. It supports multiple modes, format templates, and nested CONTROL logic.

INFERENCE {
    mode: "chat"
    format: "User: {input}\nAssistant:"
    exit_command: "/exit"
    params {
        max_length: 120
        temperature: 0.7
        top_p: 0.9
        beams: 2
        do_sample: true
    }
    CONTROL {
        IF confidence < 0.3 {
            RETRY
        }
        IF hallucination_score > 0.5 {
            REPLACE WITH "I'm not certain about that."
        }
    }
}
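The nested CONTROL rules amount to a retry-then-fallback loop around generation. The sketch below is one way to read them, assuming a hypothetical generate callable that returns the text together with a confidence and a hallucination score. The retry cap of 2 is also an assumption, since the source does not say how many times RETRY may fire.

```python
FALLBACK = "I'm not certain about that."


def respond(generate, prompt, max_retries=2):
    """Generate a reply, retrying on low confidence and replacing risky output.

    generate(prompt) must return (text, confidence, hallucination_score).
    """
    text, confidence, hallucination = generate(prompt)
    retries = 0
    while confidence < 0.3 and retries < max_retries:   # IF confidence < 0.3 { RETRY }
        text, confidence, hallucination = generate(prompt)
        retries += 1
    if hallucination > 0.5:                             # REPLACE WITH fallback
        return FALLBACK
    return text


if __name__ == "__main__":
    # Stand-in generator: first attempt is low-confidence, second is fine.
    canned = iter([("uh", 0.1, 0.0), ("We have Margherita and Pepperoni.", 0.8, 0.1)])
    print(respond(lambda prompt: next(canned), "what flavors do you have?"))
```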

Supported modes: chat, intent, translate, classify, custom

Format variables: {input}, {context}, {labels}
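The format string works like a template with named placeholders. As a minimal sketch, Python's str.format can stand in for the substitution step; how OktoEngine performs it internally is not documented here, and the classify template below is a made-up example of using {input} together with {labels}.

```python
def render_prompt(template: str, **variables: str) -> str:
    """Fill named placeholders such as {input}, {context} or {labels}."""
    return template.format(**variables)


# Template from the INFERENCE example.
CHAT_FORMAT = "User: {input}\nAssistant:"

# Hypothetical template for a classify mode (not taken from the source).
CLASSIFY_FORMAT = "Text: {input}\nLabels: {labels}\nAnswer:"

if __name__ == "__main__":
    print(render_prompt(CHAT_FORMAT, input="what flavors do you have?"))
    print(render_prompt(CLASSIFY_FORMAT, input="Great pizza!", labels="positive, negative"))
```

Note that str.format treats every brace as a placeholder, so a template containing literal braces would need them doubled in this sketch.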

INFER Block (Legacy)

Note: The INFER block has been replaced by the more powerful INFERENCE block in v1.2. See INFERENCE Block above.

EXPORT Block

The EXPORT block defines which formats to export your trained model to.

EXPORT {
    format: ["gguf", "onnx", "okm", "safetensors"]
    path: "export/"
    quantization: "int8"
}

You can export to multiple formats in a single run. The exported files are saved in the directory given by path.

DEPLOY Block

The DEPLOY block defines deployment targets for your model. The engine will create the server, generate routes, export in the required format, and configure limits and authentication.

DEPLOY {
    target: "api"
    host: "0.0.0.0"
    endpoint: "/chatbot"
    requires_auth: true
    port: 9000
    max_concurrent_requests: 100
    protocol: "http"
    format: "onnx"
}

Supported targets: local, cloud, edge, api, android, ios, web, desktop

Protocols: http, https, grpc, ws

Formats: onnx, tflite, gguf, pt, okm

🔒 SECURITY Block

The SECURITY block defines security measures for input validation, output validation, rate limiting, and encryption.

SECURITY {
    input_validation {
        max_length: 500
        disallow_patterns: [
            "<script>",
            "DROP TABLE",
            "rm -rf",
            "sudo"
        ]
    }
    output_validation {
        prevent_data_leak: true
        mask_personal_info: true
    }
    rate_limit {
        max_requests_per_minute: 60
    }
    encryption {
        algorithm: "AES-256"
    }
}
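As an illustration of what the input_validation and rate_limit settings imply, the sketch below checks a request against max_length and disallow_patterns and applies a sliding-window limit of 60 requests per minute. Case-insensitive substring matching and the sliding-window strategy are assumptions; the source does not say how OktoEngine matches patterns or counts requests, and the names used here are hypothetical.

```python
import time
from collections import deque

# Values copied from the SECURITY example.
MAX_LENGTH = 500
DISALLOWED = ["<script>", "DROP TABLE", "rm -rf", "sudo"]


def validate_input(text):
    """Return (accepted, reason) for a single request body."""
    if len(text) > MAX_LENGTH:
        return False, "input exceeds max_length"
    lowered = text.lower()
    for pattern in DISALLOWED:
        if pattern.lower() in lowered:  # case-insensitive match (assumed)
            return False, f"disallowed pattern: {pattern}"
    return True, "ok"


class RateLimiter:
    """Sliding-window limiter: at most max_requests per window seconds."""

    def __init__(self, max_requests=60, window=60.0):
        self.max_requests = max_requests
        self.window = window
        self._stamps = deque()

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_requests:
            return False
        self._stamps.append(now)
        return True


if __name__ == "__main__":
    print(validate_input("Good evening, I want a pizza"))
    print(validate_input("please DROP TABLE orders"))
    limiter = RateLimiter()
    print(limiter.allow())
```

Substring blocklists like this are easy to bypass and can reject harmless text (for example a sentence that happens to contain "sudo"), so they are best seen as one layer alongside the GUARD rules rather than a complete defense.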

Export Formats

OktoScript supports multiple export formats for different use cases:

Standard Formats

Format       | Purpose                              | Compatibility
-------------|--------------------------------------|----------------------
.onnx        | Universal inference, production-ready | All platforms
.gguf        | Local inference, Ollama, Llama.cpp   | Local deployment
.safetensors | HuggingFace, research, training      | Standard ML tools
.tflite      | Mobile deployment                    | Android, iOS (future)

OktoSeek Optimized Formats

Format | Purpose                                 | Benefits
-------|-----------------------------------------|----------------------------------------------
.okm   | OktoModel - Optimized for OktoSeek SDK  | Flutter plugins, mobile apps, exclusive tools
.okx   | OktoBundle - Mobile + Edge package      | iOS, Android, Edge AI deployment

💡 Note: .okm and .okx formats are optional and optimized for the OktoSeek ecosystem. They provide better integration with OktoSeek Flutter SDK, mobile apps, and exclusive tools. You can always export to standard formats (ONNX, GGUF, SafeTensors) for universal compatibility.

Why use OktoModel (.okm)?

  • ✅ Optimized for OktoSeek Flutter SDK
  • ✅ Better performance on mobile devices
  • ✅ Access to exclusive OktoSeek tools and plugins
  • ✅ Seamless integration with OktoSeek ecosystem
  • ✅ Support for iOS and Android apps

OktoEngine CLI Commands

The OktoEngine CLI is minimal by design. All intelligence lives in the .okt file. The terminal is just the execution port.

Core Commands

Initialize project:

okto init my-project

Validate syntax:

okto validate script.okt

Train a model:

okto train script.okt

Evaluate a model:

okto eval script.okt

Export model:

okto export script.okt

Convert model formats:

okto convert --input model.pt --from pt --to gguf --output model.gguf
okto convert --input model.pt --from pt --to onnx --output model.onnx

Inference Commands

Direct inference (single input/output):

okto infer --model models/chatbot.okm --text "Good evening, I want a pizza"

Automatically respects BEHAVIOR, GUARD, INFERENCE, and CONTROL blocks.

Interactive chat mode:

okto chat --model models/chatbot.okm

Opens an interactive loop. Uses prompt_style from BEHAVIOR, enforces GUARD rules, and supports session context. Type '/exit' to quit.

Terminal - Chat Session
🟢 Okto Chat started (type '/exit' to quit)
You: hi
Bot: Hello! How can I help you today?
You: what flavors do you have?
Bot: We have a great selection! Margherita, Pepperoni, Four Cheese, Hawaiian, and Vegetarian. What sounds good to you?
You: /exit
🔴 Session ended

Analysis Commands

Compare two models:

okto compare models/v1.okm models/v2.okm

Compares latency, accuracy, loss, and resource usage. Perfect for A/B testing.

View historical logs:

okto logs my-model

Shows loss per epoch, validation loss, accuracy, CPU/GPU/RAM usage, and decisions made by CONTROL block.

Auto-tune training:

okto tune script.okt

Uses CONTROL block to auto-adjust training parameters (learning rate, batch size, early stopping). This is unique in the market.

Utility Commands

List resources:

okto list projects
okto list models
okto list datasets
okto list exports

System diagnostics:

okto doctor            # Shows: GPU, CUDA, RAM, Drivers, Disks, Recommendations
okto doctor --install  # Auto-install missing dependencies

Other commands:

okto upgrade    # Update OktoEngine
okto about      # Show information
okto --version  # Show version
okto exit       # Exit interactive mode

Complete Example: PizzaBot v1.2

Here's a complete, working OktoScript v1.2 example demonstrating all new features:

# okto_version: "1.2"
PROJECT "PizzaBot"
DESCRIPTION "AI specialized in pizza restaurant service"
ENV {
    accelerator: "gpu"
    min_memory: "16GB"
    precision: "fp16"
}
DATASET {
    train: "dataset/train.jsonl"
    validation: "dataset/val.jsonl"
    format: "jsonl"
    type: "chat"
    language: "en"
}
MODEL {
    name: "pizzabot-model"
    base: "google/flan-t5-base"
    device: "cuda"
    ADAPTER {
        type: "lora"
        path: "./adapters/my-adapter"
        rank: 16
        alpha: 32
    }
}
TRAIN {
    epochs: 10
    batch_size: 32
    learning_rate: 0.0001
    optimizer: "adamw"
    scheduler: "cosine"
    device: "cuda"
    checkpoint_steps: 100
}
MONITOR {
    metrics: [
        "loss",
        "val_loss",
        "accuracy",
        "gpu_usage",
        "ram_usage",
        "throughput",
        "latency",
        "confidence"
    ]
    notify_if {
        loss > 2.0
        gpu_usage > 90%
        hallucination_score > 0.5
    }
    log_to: "logs/training.log"
}
BEHAVIOR {
    mode: "chat"
    personality: "assistant"
    verbosity: "medium"
    language: "en"
    avoid: ["politics", "violence", "hate"]
    fallback: "How can I help you?"
    prompt_style: "User: {input}\nAssistant:"
}
STABILITY {
    stop_if_nan: true
    stop_if_diverges: true
    min_improvement: 0.001
}
CONTROL {
    on_epoch_end {
        SAVE model
        LOG "Epoch completed"
        IF loss > 2.0 {
            SET LR = 0.00005
            LOG "High loss detected"
            WHEN gpu_usage > 90% {
                SET batch_size = 16
            }
        }
        IF val_loss > 2.5 {
            STOP_TRAINING
        }
    }
    validate_every: 200
}
INFERENCE {
    mode: "chat"
    format: "User: {input}\nAssistant:"
    exit_command: "/exit"
    params {
        temperature: 0.7
        max_length: 120
        top_p: 0.9
        do_sample: true
    }
    CONTROL {
        IF confidence < 0.3 {
            RETRY
        }
        IF hallucination_score > 0.5 {
            REPLACE WITH "I'm not certain about that."
        }
    }
}
GUARD {
    prevent {
        hallucination
        toxicity
        bias
        data_leak
    }
    detect_using: ["classifier", "regex"]
    on_violation {
        REPLACE with_message: "Sorry, this request is not allowed."
    }
}
SECURITY {
    input_validation {
        max_length: 500
        disallow_patterns: ["<script>", "DROP TABLE"]
    }
    output_validation {
        prevent_data_leak: true
        mask_personal_info: true
    }
    rate_limit {
        max_requests_per_minute: 60
    }
}
EXPORT {
    format: ["okm", "onnx", "gguf"]
    path: "export/"
}
DEPLOY {
    target: "api"
    host: "0.0.0.0"
    endpoint: "/chatbot"
    port: 9000
    protocol: "http"
    format: "onnx"
}

To execute this file:

okto train pizzabot.okt

The OktoEngine will automatically:

  1. Load and validate the dataset
  2. Initialize the model with LoRA adapter
  3. Train with autonomous CONTROL decisions
  4. Monitor all metrics in real-time
  5. Apply GUARD rules during inference
  6. Enforce SECURITY policies
  7. Export to multiple formats
  8. Deploy as API if configured

This example demonstrates: CONTROL with nested blocks, MONITOR with notify_if, GUARD with multiple detection methods, BEHAVIOR with prompt_style, INFERENCE with nested CONTROL, SECURITY with validation, and DEPLOY with full configuration.

Advanced Examples

For experienced developers, here are complex examples demonstrating advanced features:

v1.2 Examples with Decision-Making

New examples demonstrating CONTROL blocks, nested logic, and autonomous decision-making:

Fine-Tuning LLM with Checkpoints

Complete example with checkpoint resume, custom hooks, and multiple export formats:

View finetuning-llm.okt →

Features demonstrated:

  • Checkpoint resume from previous training
  • Custom hooks (Python scripts)
  • Advanced metrics and validation
  • Multiple export formats
  • Production deployment configuration

Complete Vision Pipeline

Production-ready vision system with augmentation and ONNX export:

View vision-pipeline.okt →

QA System with Embeddings

Question-answering system with semantic search capabilities:

View qa-embeddings.okt →

Complete Grammar Reference

For the complete formal grammar specification with all constraints, validation rules, and advanced features:

📖 Full Documentation:

Key Grammar Features

  • Model Inheritance - Reuse configurations with inherit
  • Extension Points - Add custom Python/JS hooks
  • Type Safety - All values validated against constraints
  • Error Handling - Clear error messages with solutions
  • Comprehensive Validation - Pre-flight checks before training

Get Started with OktoScript

Ready to start using OktoScript? Check out our examples and documentation:

OktoScript is developed and maintained by OktoSeek AI.

Note: OktoScript is designed for advanced users, engineers, and researchers. For no-code AI creation, use the OktoSeek IDE visual interface.
