OktoScript

About OktoScript

OktoScript is a decision-driven language created by OktoSeek AI to design, train, evaluate, control and govern AI models end-to-end.

It goes far beyond a simple training script. OktoScript introduces native intelligence, autonomous decision-making and behavioral control into the AI development lifecycle.

OktoScript uses the file extension .okt and is executed by the OktoEngine, which is integrated into the OktoSeek IDE. This allows you to define complete AI training pipelines with built-in decision logic using a simple, declarative syntax.

Currently at version 1.2, OktoScript is the official language of the OktoSeek ecosystem, used by the OktoSeek IDE, OktoEngine, the VS Code extension, and various AI development tools.

Key Features:
Decision-driven - Built-in CONTROL logic (IF, WHEN, SET, STOP, LOG, SAVE…)
Human-readable - Intuitive syntax that engineers and non-engineers can understand
Self-monitoring - Tracks metrics, detects anomalies and adapts automatically
Behavior-aware - Control personality, language, restrictions and style
Safe by design - Integrated GUARD and SECURITY layers
Strongly structured - Validated, deterministic and reproducible pipelines
Training-aware - Created specifically for AI training and optimization
Extensible - Add capabilities through OktoEngine and custom modules

      

🚀 Quick Start - Your First 5 Minutes

Get started with OktoScript in just 5 minutes. Follow these simple steps:

Step 1: Create Your First File

Create a file named train.okt:

PROJECT "MyFirstModel"
DESCRIPTION "My first OktoScript project"

DATASET {
    train: "dataset/train.jsonl"
    format: "jsonl"
    type: "chat"
}

MODEL {
    base: "oktoseek/base-mini"
}

TRAIN {
    epochs: 3
    batch_size: 16
    device: "cpu"
}

EXPORT {
    format: ["okm"]
    path: "export/"
}
      

Step 2: Prepare Your Dataset

Create dataset/train.jsonl:

{"input":"Hello","output":"Hi! How can I help?"}
{"input":"What's the weather?","output":"I don't have weather data."}
{"input":"Thank you","output":"You're welcome!"}
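Before running validation it can help to check the dataset file itself. The following is a minimal Python sketch, not part of OktoEngine: it assumes every record is a JSON object with string "input" and "output" keys, as in the sample lines above. The function name check_jsonl is hypothetical.

```python
import json

# Keys taken from the sample records above; other dataset types may differ.
REQUIRED_KEYS = ("input", "output")


def check_jsonl(lines):
    """Return (line_number, problem) pairs; an empty list means no problems found."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "record is not a JSON object"))
            continue
        for key in REQUIRED_KEYS:
            if not isinstance(record.get(key), str):
                problems.append((number, f"missing or non-string '{key}'"))
    return problems
```

To check a file, pass an open file handle: check_jsonl(open("dataset/train.jsonl", encoding="utf-8")).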
      

Step 3: Validate & Train

# Validate your configuration
okto validate train.okt

# Train your model
okto train train.okt
      

That's it! Your model will be trained and exported automatically. Read the full guide →

What is OktoScript?

OktoScript lets you define:

How a model is trained - Full fine-tuning, LoRA adapters, multi-dataset training
How it should behave - Personality, verbosity, language, response style
How it should react to problems - Automatic parameter adjustment, early stopping, checkpoint management
How and when it should stop, adapt or improve itself - CONTROL block with conditional logic

All using clear, readable and structured commands, built specifically for AI engineering.

Why OktoScript is Different

Traditional AI development is reactive.
You manually monitor metrics, fix problems and restart training.

OktoScript is proactive.

It allows the model to:

Detect instability automatically
Reduce or increase learning rate based on metrics
Adapt batch size based on GPU memory
Stop when performance drops
Save only the best checkpoints
Apply rules when patterns are detected

In other words, OktoScript doesn't just train models — it governs intelligence.

💡 Two Ways to Use OktoSeek:
No-Code Interface: Create agents and models using the visual IDE
OktoScript (Advanced): Use our decision-driven language for full customization and autonomous control

      

Use Cases & Examples

OktoScript fits a range of AI development scenarios:

1. Conversational AI & Chatbots

Create intelligent chatbots for customer service, support, or entertainment.

PROJECT "CustomerSupportBot"
DATASET {
    train: "dataset/support_qa.jsonl"
    format: "jsonl"
    type: "chat"
}
MODEL {
    base: "oktoseek/chat-base"
    architecture: "transformer"
}
TRAIN {
    epochs: 10
    batch_size: 32
    device: "cuda"
}
      

2. Computer Vision & Image Classification

Build vision models for object detection, image classification, or visual understanding.

PROJECT "ImageClassifier"
DATASET {
    train: "dataset/images/"
    format: "image+caption"
    type: "vision"
    augmentation: ["flip", "rotate"]
}
MODEL {
    base: "oktoseek/vision-base"
    architecture: "vision-transformer"
}
# TRAIN block omitted here for brevity
      

3. Fine-Tuning Large Language Models

Fine-tune LLMs with checkpoint resume, custom callbacks, and advanced configurations.

See: finetuning-llm.okt example

4. Question-Answering Systems

Build QA systems with embeddings and similarity search for semantic retrieval.

See: qa-embeddings.okt example

5. Recommendation Systems

Create recommendation engines for e-commerce, content platforms, or personalized experiences.

See: recommender.okt example

Basic File Structure

Every OktoScript file (.okt) follows this general structure:

PROJECT "ProjectName"

DATASET { ... }

MODEL { ... }

TRAIN { ... }

# Optional blocks:
EVALUATE { ... }
INFERENCE { ... }
EXPORT { ... }
DEPLOY { ... }
      

Required blocks: PROJECT, DATASET, MODEL, TRAIN

Optional blocks: EVALUATE, INFERENCE (INFER before v1.2), EXPORT, DEPLOY, and the blocks described below: CONTROL, MONITOR, GUARD, BEHAVIOR and SECURITY

PROJECT Block

The PROJECT block defines your project metadata, including name, description, version, author, and tags.

PROJECT "PizzaBot"
DESCRIPTION "AI specialized in pizza restaurant service"
VERSION "1.0"
AUTHOR "OktoSeek"
TAGS ["food", "restaurant", "chatbot"]
      

Available fields:

PROJECT - Project name (required)
DESCRIPTION - Project description (optional)
VERSION - Version string (optional)
AUTHOR - Author name (optional)
TAGS - List of tags (optional)

DATASET Block

The DATASET block defines your training data, including paths, format, type, and language.

DATASET {
    train: "dataset/train.jsonl"
    validation: "dataset/val.jsonl"
    test: "dataset/test.jsonl"
    format: "jsonl"
    type: "chat"
    language: "en"
}
      

Supported formats: JSONL, CSV, TXT, Parquet, Image+Caption, QA, Instruction, Multi-modal

Supported types: classification, generation, qa, chat, vision, regression

Supported languages: en, pt, es, fr, multilingual

MODEL Block

The MODEL block defines your base model, architecture, parameters, and precision settings. It can optionally include an ADAPTER sub-block to apply parameter-efficient fine-tuning methods such as LoRA, QLoRA, PEFT, or other adapters.

MODEL {
    name: "oktogpt"
    base: "google/flan-t5-base"
    architecture: "transformer"
    parameters: 120M
    context_window: 2048
    precision: "fp16"
    device: "cuda"
    
    ADAPTER {
        type: "lora"
        path: "./adapters/my-adapter"
        rank: 16
        alpha: 32
    }
}
      

Available fields:

name - Model name (optional)
base - Base model name (required)
architecture - transformer, cnn, rnn, diffusion, vision-transformer
parameters - Model size (e.g., 120M, 7B)
context_window - Maximum context length
precision - fp32, fp16, int8, int4
device - cuda, cpu, mps, auto
ADAPTER - LoRA/PEFT adapter configuration (optional)

ADAPTER types: lora, qlora, adapter, peft
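The parameters and precision fields together give a rough idea of how much memory the model weights need. The Python sketch below is an illustration only: it assumes that size strings use the K/M/B suffixes shown above and applies the standard storage size of each precision. It ignores activations, optimizer state and adapter weights, and it says nothing about how OktoEngine itself interprets these fields. Both function names are hypothetical.

```python
# Bytes per parameter for each precision value listed above (standard storage sizes).
BYTES_PER_PARAMETER = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Suffix multipliers assumed from the examples "120M" and "7B".
SUFFIXES = {"K": 10**3, "M": 10**6, "B": 10**9}


def parse_parameter_count(text):
    """Convert a size string such as '120M' or '7B' into a parameter count."""
    text = text.strip().upper()
    if text and text[-1] in SUFFIXES:
        return int(round(float(text[:-1]) * SUFFIXES[text[-1]]))
    return int(text)


def weight_bytes(parameters, precision):
    """Estimate the bytes needed to store the weights alone."""
    return int(parse_parameter_count(parameters) * BYTES_PER_PARAMETER[precision])
```

For instance, a 7B model stored in fp16 needs about 14 GB for its weights, which is why lower precisions such as int8 or int4 matter on smaller devices.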

TRAIN Block

The TRAIN block is one of the most important blocks. It defines all training hyperparameters, optimizer settings, and training strategy.

TRAIN {
    epochs: 10
    batch_size: 32
    gradient_accumulation: 2
    learning_rate: 0.00025
    optimizer: "adamw"
    scheduler: "cosine"
    loss: "cross_entropy"
    device: "cuda"
    gpu: true
    mixed_precision: true
    early_stopping: true
    checkpoint_steps: 100
    output: "./models/pizzabot-v2"
}
      

Key parameters:

epochs - Number of training epochs
batch_size - Batch size for training
learning_rate - Learning rate
optimizer - adam, adamw, sgd, rmsprop
scheduler - linear, cosine, step
device - cpu, cuda, mps
early_stopping - Enable early stopping
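The example above combines batch_size: 32 with gradient_accumulation: 2. Assuming gradient_accumulation has its usual meaning (the number of batches whose gradients are summed before one optimizer update), the effective batch size and the number of updates per epoch can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes a partial final batch still triggers an update.

```python
import math


def effective_batch_size(batch_size, gradient_accumulation=1):
    """Examples contributing to a single optimizer update."""
    return batch_size * gradient_accumulation


def optimizer_steps_per_epoch(num_examples, batch_size, gradient_accumulation=1):
    """Optimizer updates needed to see every example once."""
    per_step = effective_batch_size(batch_size, gradient_accumulation)
    return math.ceil(num_examples / per_step)
```

With the values from the example, each update sees 64 examples, so a learning rate tuned for a batch of 32 without accumulation may need revisiting.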

EVALUATE Block

The EVALUATE block defines which metrics to calculate during evaluation.

EVALUATE {
    metrics: [
        "loss",
        "accuracy",
        "perplexity",
        "f1",
        "bleu",
        "rouge"
    ]
}
      

Supported metrics: loss, accuracy, perplexity, precision, recall, f1, bleu, rouge, mae, mse, cosine_similarity, token_efficiency, response_coherence, hallucination_score

🧠 CONTROL Block — Decision Engine

The CONTROL block enables logical, conditional, event-based, and metric-based decisions during training and inference. It introduces a cognitive-level abstraction that allows AI models to make decisions, self-adjust, and self-regulate in a clean, declarative way.

This gives OktoScript declarative AI governance: CONTROL blocks can contain nested conditions and nested event triggers.

Supported Events

on_step_end - Executed at the end of each training step
on_epoch_end - Executed at the end of each epoch
validate_every - Runs validation every N steps
on_memory_low - Triggered when GPU/RAM is low
on_nan - Triggered when NaN values are detected
on_plateau - Triggered when loss is stagnant
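The exact conditions under which OktoEngine raises these events are not specified here. As a rough illustration of what on_nan and on_plateau could mean, the Python sketch below inspects a list of per-epoch loss values. The patience and min_delta parameters are assumptions introduced for this example, and the history is assumed to contain no NaN except possibly in the last position.

```python
import math


def detect_events(loss_history, patience=3, min_delta=0.001):
    """Return the event names this sketch would raise for a loss history."""
    events = []
    if loss_history and math.isnan(loss_history[-1]):
        events.append("on_nan")
        return events
    if len(loss_history) > patience:
        # Compare the best loss of the last `patience` epochs with the best before them.
        best_before = min(loss_history[:-patience])
        recent_best = min(loss_history[-patience:])
        if best_before - recent_best < min_delta:
            events.append("on_plateau")
    return events
```

A loss that keeps falling raises nothing, while a loss that has not improved by at least min_delta over the last few epochs raises on_plateau.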

Supported Directives

IF, WHEN, EVERY, SET, STOP, LOG, SAVE, RETRY, REPLACE, REGENERATE, STOP_TRAINING, DECREASE, INCREASE

Example with Nested Blocks

CONTROL {
    on_epoch_end {
        IF loss > 2.0 {
            SET LR = 0.00005
            LOG "High loss detected"
            
            WHEN gpu_usage > 90% {
                SET batch_size = 16
                LOG "Reducing batch size due to GPU pressure"
            }
            
            IF val_loss > 3.0 {
                STOP_TRAINING
            }
        }
        
        IF accuracy > 0.9 {
            SAVE "best_model"
            LOG "High accuracy reached"
        }
        
        EVERY 2 epochs {
            SAVE "checkpoint_epoch_{epoch}"
        }
    }
    
    validate_every: 200
    
    IF epoch > 5 AND accuracy < 0.6 {
        SET LR = 0.00001
        LOG "Model has stagnated"
    }
}
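To make the nesting concrete, here is one possible reading of the on_epoch_end rules above, written as plain Python. It is a sketch of the semantics, not OktoEngine code. It assumes that WHEN is evaluated like IF at the moment the enclosing block runs, that gpu_usage is a percentage, and that EVERY 2 epochs fires when the epoch number is divisible by 2. The function name and the tuple format for actions are hypothetical.

```python
def control_on_epoch_end(metrics, epoch):
    """Return the actions the on_epoch_end rules would request, in order."""
    actions = []
    if metrics["loss"] > 2.0:
        actions.append(("SET", "LR", 0.00005))
        actions.append(("LOG", "High loss detected"))
        # Nested rules only run when the outer condition holds.
        if metrics["gpu_usage"] > 90:
            actions.append(("SET", "batch_size", 16))
            actions.append(("LOG", "Reducing batch size due to GPU pressure"))
        if metrics["val_loss"] > 3.0:
            actions.append(("STOP_TRAINING",))
    if metrics["accuracy"] > 0.9:
        actions.append(("SAVE", "best_model"))
        actions.append(("LOG", "High accuracy reached"))
    if epoch % 2 == 0:
        actions.append(("SAVE", f"checkpoint_epoch_{epoch}"))
    return actions
```

Note that the GPU and val_loss checks are skipped entirely when loss is at or below 2.0, which is the practical effect of nesting them inside the first IF.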
      

Philosophy: OktoScript keeps the surface clean and simple, while the engine behind it performs complex cognitive decision-making. CONTROL defines logic, MONITOR defines awareness, GUARD defines safety, BEHAVIOR defines personality.
      

📊 MONITOR Block — Full Metrics Support

The MONITOR block tracks ANY available training or system metric. It supports all native and custom metrics, including loss, accuracy, GPU usage, throughput, latency, confidence, and more.

MONITOR {
    metrics: [
        "loss",
        "accuracy",
        "val_loss",
        "gpu_usage",
        "ram_usage",
        "throughput",
        "latency",
        "confidence"
    ]
    
    notify_if {
        loss > 2.0
        gpu_usage > 90%
        temperature > 85
        hallucination_score > 0.5
    }
    
    log_to: "logs/training.log"
}
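The notify_if conditions are simple threshold comparisons. The Python sketch below shows how such conditions could be checked against a snapshot of metrics. It assumes each condition has the form "name operator value", that a trailing % only marks the value as a percentage, and that metrics missing from the snapshot are skipped. None of this is taken from OktoEngine, and the function name is hypothetical.

```python
import operator

# Comparison operators this sketch understands.
OPERATORS = {">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le}


def triggered_conditions(conditions, metrics):
    """Return the conditions that hold for the given metric snapshot."""
    hits = []
    for condition in conditions:
        name, op, raw_value = condition.split()
        threshold = float(raw_value.rstrip("%"))
        if name in metrics and OPERATORS[op](metrics[name], threshold):
            hits.append(condition)
    return hits
```

Each returned condition corresponds to one notification that the rules above would produce for that snapshot.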
      

🛡️ GUARD Block — Safety / Ethics / Protection

The GUARD block defines safety rules during generation and training. The engine knows exactly what to prevent, how to detect violations, and what action to take.

GUARD {
    prevent {
        hallucination
        toxicity
        bias
        data_leak
        unsafe_code
        personal_data
        illegal_content
    }
    
    detect_using: ["classifier", "regex", "embedding"]
    
    on_violation {
        REPLACE
        with_message: "Sorry, this request is not allowed."
    }
}
      

Detection methods: classifier, embedding, regex, rule_engine, ml_model
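As an illustration of the regex detection method combined with the REPLACE action, the Python sketch below swaps a response for the configured message when it appears to contain an email address. The pattern, the choice of email addresses as the example of personal_data, and the decision to replace the whole response are all assumptions made for this example.

```python
import re

# A deliberately simple email pattern; real personal-data detection needs far more.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def guard_output(text, message="Sorry, this request is not allowed."):
    """Return the replacement message if the text matches the pattern, else the text."""
    if EMAIL_PATTERN.search(text):
        return message
    return text
```

A classifier or embedding detector would slot into the same place as the pattern check, returning a violation flag instead of a regex match.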

🎭 BEHAVIOR Block — Model Personality

The BEHAVIOR block defines how the model should behave in chat/inference. It sets personality, verbosity, language, and content restrictions.

BEHAVIOR {
    mode: "chat"
    personality: "friendly"
    verbosity: "medium"
    language: "en"
    avoid: ["violence", "hate", "politics"]
    fallback: "How can I help you?"
    prompt_style: "User: {input}\nAssistant:"
}
      

Personality types: professional, friendly, assistant, casual, formal, creative

💬 INFERENCE Block — Advanced Configuration

The INFERENCE block defines how the model behaves during inference, prediction, or interactive chat. It supports multiple modes, format templates, and nested CONTROL logic.

INFERENCE {
    mode: "chat"
    format: "User: {input}\nAssistant:"
    exit_command: "/exit"
    
    params {
        max_length: 120
        temperature: 0.7
        top_p: 0.9
        beams: 2
        do_sample: true
    }
    
    CONTROL {
        IF confidence < 0.3 { RETRY }
        IF hallucination_score > 0.5 { 
            REPLACE WITH "I'm not certain about that." 
        }
    }
}
      

Supported modes: chat, intent, translate, classify, custom

Format variables: {input}, {context}, {labels}
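The format template works like ordinary placeholder substitution. A minimal Python sketch of that idea follows. It assumes that variables which are not supplied render as empty strings, which may differ from what OktoEngine does, and it does not handle templates that contain literal braces. The function name is hypothetical.

```python
from collections import defaultdict


def render_prompt(template, **variables):
    """Fill {name} placeholders; names that are not supplied render as empty strings."""
    return template.format_map(defaultdict(str, variables))
```

For example, render_prompt("User: {input}\nAssistant:", input="Do you deliver?") yields the user line followed by the "Assistant:" cue on the next line, ready for the model to complete.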

INFER Block (Legacy)

Note: The INFER block has been replaced by the more powerful INFERENCE block in v1.2. See INFERENCE Block above.

EXPORT Block

The EXPORT block defines which formats to export your trained model to.

EXPORT {
    format: ["gguf", "onnx", "okm", "safetensors"]
    path: "export/"
    quantization: "int8"
}
      

You can export to multiple formats in a single run. The exported files are written to the directory given by path.

DEPLOY Block

The DEPLOY block defines deployment targets for your model. The engine will create the server, generate routes, export in the required format, and configure limits and authentication.

DEPLOY {
    target: "api"
    host: "0.0.0.0"
    endpoint: "/chatbot"
    requires_auth: true
    port: 9000
    max_concurrent_requests: 100
    protocol: "http"
    format: "onnx"
}
      

Supported targets: local, cloud, edge, api, android, ios, web, desktop

Protocols: http, https, grpc, ws

Formats: onnx, tflite, gguf, pt, okm

🔒 SECURITY Block

The SECURITY block defines security measures for input validation, output validation, rate limiting, and encryption.

SECURITY {
    input_validation {
        max_length: 500
        disallow_patterns: [
            "<script>",
            "DROP TABLE",
            "rm -rf",
            "sudo"
        ]
    }
    
    output_validation {
        prevent_data_leak: true
        mask_personal_info: true
    }
    
    rate_limit {
        max_requests_per_minute: 60
    }
    
    encryption {
        algorithm: "AES-256"
    }
}
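The input_validation rules above amount to a length check plus a blocklist. The Python sketch below illustrates them under two assumptions that the source does not state: patterns are matched as case-insensitive substrings, and max_length counts characters. The function name is hypothetical.

```python
# Values copied from the input_validation example above.
MAX_LENGTH = 500
DISALLOW_PATTERNS = ["<script>", "DROP TABLE", "rm -rf", "sudo"]


def validate_input(text):
    """Return (ok, reason); reason is empty when the input is accepted."""
    if len(text) > MAX_LENGTH:
        return False, "input exceeds max_length"
    lowered = text.lower()
    for pattern in DISALLOW_PATTERNS:
        if pattern.lower() in lowered:
            return False, f"input contains disallowed pattern: {pattern}"
    return True, ""
```

Substring blocklists are easy to bypass and can reject harmless text (a sentence mentioning "sudo", for instance), so they work best as one layer alongside the GUARD rules rather than as the only defence.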
      

Export Formats

OktoScript supports multiple export formats for different use cases:

Standard Formats

Format	Purpose	Compatibility
`.onnx`	Universal inference, production-ready	All platforms
`.gguf`	Local inference, Ollama, Llama.cpp	Local deployment
`.safetensors`	HuggingFace, research, training	Standard ML tools
`.tflite`	Mobile deployment	Android, iOS (future)

OktoSeek Optimized Formats

Format	Purpose	Benefits
`.okm`	OktoModel - Optimized for OktoSeek SDK	Flutter plugins, mobile apps, exclusive tools
`.okx`	OktoBundle - Mobile + Edge package	iOS, Android, Edge AI deployment

💡 Note: .okm and .okx formats are optional and optimized for the OktoSeek ecosystem. They provide better integration with OktoSeek Flutter SDK, mobile apps, and exclusive tools. You can always export to standard formats (ONNX, GGUF, SafeTensors) for universal compatibility.
      

Why use OktoModel (.okm)?

✅ Optimized for OktoSeek Flutter SDK
✅ Better performance on mobile devices
✅ Access to exclusive OktoSeek tools and plugins
✅ Seamless integration with OktoSeek ecosystem
✅ Support for iOS and Android apps

OktoEngine CLI Commands

The OktoEngine CLI is minimal by design. All intelligence lives in the .okt file. The terminal is only the entry point for execution.

Core Commands

Initialize project:

okto init my-project

Validate syntax:

okto validate script.okt

Train a model:

okto train script.okt

Evaluate a model:

okto eval script.okt

Export model:

okto export script.okt

Convert model formats:

okto convert --input model.pt --from pt --to gguf --output model.gguf
okto convert --input model.pt --from pt --to onnx --output model.onnx

Inference Commands

Direct inference (single input/output):

okto infer --model models/chatbot.okm --text "Good evening, I want a pizza"

Automatically respects BEHAVIOR, GUARD, INFERENCE, and CONTROL blocks.

Interactive chat mode:

okto chat --model models/chatbot.okm

Opens an interactive loop. Uses prompt_style from BEHAVIOR, enforces GUARD rules, and supports session context. Type '/exit' to quit.

Terminal - Chat Session

🟢 Okto Chat started (type '/exit' to quit)

You: hi
Bot: Hello! How can I help you today?
You: what flavors do you have?
Bot: We have a great selection! Margherita, Pepperoni, Four Cheese, Hawaiian, and Vegetarian. What sounds good to you?
You: /exit

🔴 Session ended

Analysis Commands

Compare two models:

okto compare models/v1.okm models/v2.okm

Compares latency, accuracy, loss, and resource usage. Perfect for A/B testing.

View historical logs:

okto logs my-model

Shows loss per epoch, validation loss, accuracy, CPU/GPU/RAM usage, and decisions made by the CONTROL block.

Auto-tune training:

okto tune script.okt

Uses the CONTROL block to auto-adjust training parameters (learning rate, batch size, early stopping).

Utility Commands

List resources:

okto list projects
okto list models
okto list datasets
okto list exports

System diagnostics:

okto doctor
# Shows: GPU, CUDA, RAM, Drivers, Disks, Recommendations
okto doctor --install  # Auto-install missing dependencies

Other commands:

okto upgrade    # Update OktoEngine
okto about      # Show information
okto --version  # Show version
okto exit       # Exit interactive mode

Complete Example: PizzaBot v1.2

Here's a complete, working OktoScript v1.2 example demonstrating all new features:

# okto_version: "1.2"
PROJECT "PizzaBot"
DESCRIPTION "AI specialized in pizza restaurant service"

ENV {
    accelerator: "gpu"
    min_memory: "16GB"
    precision: "fp16"
}

DATASET {
    train: "dataset/train.jsonl"
    validation: "dataset/val.jsonl"
    format: "jsonl"
    type: "chat"
    language: "en"
}

MODEL {
    name: "pizzabot-model"
    base: "google/flan-t5-base"
    device: "cuda"
    
    ADAPTER {
        type: "lora"
        path: "./adapters/my-adapter"
        rank: 16
        alpha: 32
    }
}

TRAIN {
    epochs: 10
    batch_size: 32
    learning_rate: 0.0001
    optimizer: "adamw"
    scheduler: "cosine"
    device: "cuda"
    checkpoint_steps: 100
}

MONITOR {
    metrics: [
        "loss",
        "val_loss",
        "accuracy",
        "gpu_usage",
        "ram_usage",
        "throughput",
        "latency",
        "confidence"
    ]
    
    notify_if {
        loss > 2.0
        gpu_usage > 90%
        hallucination_score > 0.5
    }
    
    log_to: "logs/training.log"
}

BEHAVIOR {
    mode: "chat"
    personality: "assistant"
    verbosity: "medium"
    language: "en"
    avoid: ["politics", "violence", "hate"]
    fallback: "How can I help you?"
    prompt_style: "User: {input}\nAssistant:"
}

STABILITY {
    stop_if_nan: true
    stop_if_diverges: true
    min_improvement: 0.001
}

CONTROL {
    on_epoch_end {
        SAVE model
        LOG "Epoch completed"
        
        IF loss > 2.0 {
            SET LR = 0.00005
            LOG "High loss detected"
            
            WHEN gpu_usage > 90% {
                SET batch_size = 16
            }
        }
        
        IF val_loss > 2.5 {
            STOP_TRAINING
        }
    }
    
    validate_every: 200
}

INFERENCE {
    mode: "chat"
    format: "User: {input}\nAssistant:"
    exit_command: "/exit"
    
    params {
        temperature: 0.7
        max_length: 120
        top_p: 0.9
        do_sample: true
    }
    
    CONTROL {
        IF confidence < 0.3 { RETRY }
        IF hallucination_score > 0.5 { 
            REPLACE WITH "I'm not certain about that." 
        }
    }
}

GUARD {
    prevent {
        hallucination
        toxicity
        bias
        data_leak
    }
    
    detect_using: ["classifier", "regex"]
    
    on_violation {
        REPLACE
        with_message: "Sorry, this request is not allowed."
    }
}

SECURITY {
    input_validation {
        max_length: 500
        disallow_patterns: ["<script>", "DROP TABLE"]
    }
    
    output_validation {
        prevent_data_leak: true
        mask_personal_info: true
    }
    
    rate_limit {
        max_requests_per_minute: 60
    }
}

EXPORT {
    format: ["okm", "onnx", "gguf"]
    path: "export/"
}

DEPLOY {
    target: "api"
    host: "0.0.0.0"
    endpoint: "/chatbot"
    port: 9000
    protocol: "http"
    format: "onnx"
}
      

To execute this file:

okto train pizzabot.okt

The OktoEngine will automatically:

Load and validate the dataset
Initialize the model with LoRA adapter
Train with autonomous CONTROL decisions
Monitor all metrics in real-time
Apply GUARD rules during inference
Enforce SECURITY policies
Export to multiple formats
Deploy as API if configured

This example demonstrates: CONTROL with nested blocks, MONITOR with notify_if, GUARD with multiple detection methods, BEHAVIOR with prompt_style, INFERENCE with nested CONTROL, SECURITY with validation, and DEPLOY with full configuration.

Advanced Examples

For experienced developers, here are complex examples demonstrating advanced features:

v1.2 Examples with Decision-Making

New examples demonstrating CONTROL blocks, nested logic, and autonomous decision-making:

control-nested.okt - Nested CONTROL blocks with advanced decision-making
behavior-chat.okt - BEHAVIOR with mode and prompt_style
guard-safety.okt - GUARD with multiple detection methods
inference-advanced.okt - Advanced INFERENCE with nested CONTROL
complete-v1.2.okt - Complete example with all v1.2 features

Fine-Tuning LLM with Checkpoints

Complete example with checkpoint resume, custom hooks, and multiple export formats:

View finetuning-llm.okt →

Features demonstrated:

Checkpoint resume from previous training
Custom hooks (Python scripts)
Advanced metrics and validation
Multiple export formats
Production deployment configuration

Complete Vision Pipeline

Production-ready vision system with augmentation and ONNX export:

View vision-pipeline.okt →

QA System with Embeddings

Question-answering system with semantic search capabilities:

View qa-embeddings.okt →

Complete Grammar Reference

For the complete formal grammar specification with all constraints, validation rules, and advanced features:

📖 Full Documentation:

Grammar Specification - Complete formal grammar
Getting Started Guide - Step-by-step tutorial
Validation Rules - Complete validation reference
All Examples - From basic to advanced

Key Grammar Features

Model Inheritance - Reuse configurations with inherit
Extension Points - Add custom Python/JS hooks
Type Safety - All values validated against constraints
Error Handling - Clear error messages with solutions
Comprehensive Validation - Pre-flight checks before training

Get Started with OktoScript

Ready to start using OktoScript? Check out our examples and documentation:

GitHub Repository - View examples and source code
Examples - Complete working examples
Grammar Specification - Complete formal grammar

OktoScript is developed and maintained by OktoSeek AI.

Note: OktoScript is designed for advanced users, engineers, and researchers. For no-code AI creation, use the OktoSeek IDE visual interface.
