This project implements a complete intent parsing system for technician-style instructions in industrial and microgrid environments. It takes raw natural language like:
“check the inverter temperature and update the power limit to 20%”
…and outputs structured actions:
```json
{
  "intent": "update_parameter",
  "target": "inverter",
  "parameter": {
    "name": "power_limit",
    "value": "20%"
  }
}
```

The system benchmarks three NLP modeling families:
- TF-IDF + Linear SVM — baseline intent classification
- LSTM / BiLSTM — multi-output classification (intent + target + parameter)
- DistilBERT Token Classifier — end-to-end structured extraction
The goal: build a clean, production-style pipeline that demonstrates how traditional ML, classical deep learning, and modern transformers differ in capability and performance.
Because real technician logs are private and inconsistent, the project builds a controlled, balanced synthetic dataset covering:
- 10+ intents
- 15+ equipment targets
- 20+ parameter types
- Numeric, categorical, and percentage values
The generator is flexible enough to extend or adapt to real operational logs later.
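A minimal sketch of how such a template-based generator could look. The specific templates, targets, and parameter lists below are illustrative assumptions, not the notebook's actual vocabulary:

```python
import random

# Hypothetical intent templates; the notebook's real template set is larger
# and covers 10+ intents, 15+ targets, and 20+ parameter types.
INTENTS = {
    "update_parameter": "set the {target} {param} to {value}",
    "check_status": "check the {target} {param}",
    "reset": "reset the {target} {param} to {value}",
}
TARGETS = ["inverter", "battery bank", "transformer"]
PARAMS = {
    "power_limit": ["20%", "50%"],      # percentage values
    "frequency": ["50hz", "60hz"],      # numeric values with units
    "mode": ["auto", "manual"],         # categorical values
}

def generate_example(rng=random):
    """Sample one (text, structured label) pair from the templates."""
    intent = rng.choice(list(INTENTS))
    template = INTENTS[intent]
    target = rng.choice(TARGETS)
    param = rng.choice(list(PARAMS))
    # Only fill a value when the template actually mentions one.
    value = rng.choice(PARAMS[param]) if "{value}" in template else None
    text = template.format(target=target,
                           param=param.replace("_", " "),
                           value=value)
    return {"text": text, "intent": intent, "target": target,
            "parameter": {"name": param, "value": value}}
```

Balancing then reduces to sampling each intent/target/parameter combination a fixed number of times.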
Each model solves the same set of tasks:
- Intent classification
- Target identification
- Parameter extraction
This allows for direct comparison between:
| Model | Strengths | Weaknesses |
|---|---|---|
| TF-IDF + SVM | Fast, simple, solid baseline | No structured extraction |
| LSTM / BiLSTM | Multi-output, learns patterns | Needs hand-engineered preprocessing |
| DistilBERT | Best overall generalisation; robust extraction | Heavier, slower on CPU |
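The baseline row in the table can be sketched in a few lines of scikit-learn. The toy training set here is illustrative; the notebook trains on the full synthetic dataset:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Tiny illustrative corpus standing in for the generated dataset.
texts = [
    "check the inverter temperature",
    "set the power limit to 20%",
    "reset the inverter frequency to 50hz",
    "check the battery voltage",
]
labels = ["check_status", "update_parameter", "reset", "check_status"]

# Word and bigram TF-IDF features feeding a linear SVM.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(texts, labels)

print(clf.predict(["check the transformer temperature"]))
```

As the table notes, this gives the intent label only; it has no mechanism for extracting the target or parameter values.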
A final pipeline demonstrates how raw text becomes structured output:
- Tokenisation
- Transformer inference
- Slot grouping
- Value extraction
- JSON-like final output
This is the most production-like part of the system and the highlight of the project.
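The "slot grouping" step above can be sketched as collapsing per-token BIO tags into field values. The tag names (`B-TARGET`, etc.) are illustrative and may differ from the notebook's exact label set:

```python
def group_slots(tokens, tags):
    """Merge BIO-tagged tokens into {slot_name: text_span} pairs."""
    slots, current_tag, current_toks = {}, None, []
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):            # a new slot begins
            if current_tag:
                slots[current_tag] = " ".join(current_toks)
            current_tag, current_toks = tag[2:].lower(), [tok]
        elif tag.startswith("I-") and current_tag == tag[2:].lower():
            current_toks.append(tok)        # slot continues
        else:                               # outside any slot
            if current_tag:
                slots[current_tag] = " ".join(current_toks)
            current_tag, current_toks = None, []
    if current_tag:                         # flush the last open slot
        slots[current_tag] = " ".join(current_toks)
    return slots

tokens = ["reset", "the", "inverter", "frequency", "to", "50hz"]
tags = ["O", "O", "B-TARGET", "B-PARAM", "O", "B-VALUE"]
# group_slots(tokens, tags)
# → {"target": "inverter", "param": "frequency", "value": "50hz"}
```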
The notebook is intentionally structured so that:
- Students can follow it
- Recruiters can evaluate it quickly
- Engineers can adapt it to production
Sections include:
- Imports & Setup
- Dataset Generation
- Preprocessing
- TF-IDF Baseline
- LSTM / BiLSTM Models
- DistilBERT Model
- Unified Evaluation
- End-to-End Demo
- Model Comparison Summary
```
Raw Instruction
      ↓
Preprocessing
      ↓
Three Model Families
      ↓
Unified Evaluation Layer
      ↓
End-to-End Parser Demo
      ↓
Structured JSON Output
```
Run the final cell in the notebook:
```python
parse_instruction("reset the inverter frequency to 50hz")
```

Output:
```json
{
  "intent": "reset",
  "target": "inverter",
  "parameter": {
    "name": "frequency",
    "value": "50hz"
  }
}
```

While scores vary depending on the exact synthetic dataset, trends are consistent:
- TF-IDF + SVM performs well for intent only
- BiLSTM improves multi-output performance
- DistilBERT dominates structured extraction
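Extracted `value` strings such as `"50hz"` or `"20%"` can be normalised further before reaching control code. A minimal sketch; the regex and unit list are assumptions, not part of the notebook:

```python
import re

# Hypothetical post-processing: split a raw value like "50hz" or "20%"
# into a number and a unit so downstream systems get typed data.
UNIT_PATTERN = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*(%|hz|v|a|c|kw)?\s*$", re.I)

def normalise_value(raw):
    m = UNIT_PATTERN.match(raw)
    if not m:
        return {"raw": raw}  # leave unrecognised (e.g. categorical) values as-is
    number, unit = m.groups()
    return {"number": float(number), "unit": (unit or "").lower(), "raw": raw}

normalise_value("50hz")  # → {"number": 50.0, "unit": "hz", "raw": "50hz"}
normalise_value("20%")   # → {"number": 20.0, "unit": "%", "raw": "20%"}
```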
Transformers are recommended if you want:
- robustness to grammar changes
- better understanding of technician lingo
- high accuracy on parameter extraction
```bash
pip install -r requirements.txt
```

Or install the notebook dependencies manually:

```
transformers
tensorflow
torch
pandas
numpy
scikit-learn
tqdm
matplotlib
```
```
.
├── e2e_intent_parser.ipynb
├── README.md
└── data/
```
Industrial environments need interpretable AI — not black boxes.
Technicians speak in short, imperative commands with variable structure. This project shows how to build an AI system that can:
- parse real operator instructions
- extract actionable parameters
- integrate into predictive maintenance and control systems
- adapt across equipment types
It’s both a portfolio piece and a template for real deployments.
Planned improvements:
- Add CRF layer on top of DistilBERT
- Add error-recovery heuristics for incomplete queries
- Export a standalone Python library (`intent_parser/`)
- Production API (FastAPI + lightweight ONNX model)