Open
Description
When using the HBOS (Histogram-based Outlier Detection) algorithm with n_bins='auto', an IndexError occurs during prediction if the test data contains values that fall outside the training data range for any feature.
Environment
- pyod version: [Check with pip show pyod]
- Python version: 3.12
- Operating System: Linux
Bug Report
Expected Behavior
The HBOS model with n_bins='auto' should handle test data with values outside the training range gracefully, similar to how it handles test data with n_bins=<integer>.
Actual Behavior
An IndexError is raised when predicting on test data that contains values exceeding the training data range.
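The failing expression in the traceback, `out_score_i[bin_inds[j] - 1]`, suggests an off-by-one lookup into the per-bin scores. The NumPy-only sketch below reproduces that pattern under the assumption that the bin indices come from `np.digitize(..., right=True)`; the names `bin_edges` and `bin_scores` are hypothetical stand-ins, not pyod attributes, and the clipping shown is one possible guard, not pyod's actual fix.

```python
import numpy as np

# Hypothetical histogram for one feature: 5 bins, hence 6 edges.
bin_edges = np.linspace(0.0, 10.0, 6)    # [0, 2, 4, 6, 8, 10]
bin_scores = np.arange(5, dtype=float)   # one score per bin

# -1 lies below the training range, 15 lies above it.
values = np.array([-1.0, 5.0, 10.0, 15.0])

# ASSUMPTION: indices are computed with right=True, as the traceback hints.
bin_inds = np.digitize(values, bin_edges, right=True)

# Looking up `bin_inds - 1` fails for 15: digitize returns len(bin_edges),
# so the lookup index equals len(bin_scores), one past the last valid slot.
try:
    bin_scores[bin_inds - 1]
    raised = False
except IndexError:
    raised = True

# One possible guard: clamp the lookup index into the valid bin range.
safe_inds = np.clip(bin_inds - 1, 0, len(bin_scores) - 1)
safe_scores = bin_scores[safe_inds]
```

Note that the value below the range does not raise here: its index becomes -1, which NumPy silently resolves to the last bin, so a guard on both ends may be needed.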
Error Traceback
Traceback (most recent call last):
File "test.py", line 97, in <module>
predictions = model.predict(test_data.iloc[:4])
File ".../site-packages/pyod/models/base.py", line 162, in predict
pred_score = self.decision_function(X)
File ".../site-packages/pyod/models/hbos.py", line 171, in decision_function
outlier_scores = _calculate_outlier_scores_auto(X, self.bin_edges_,
File ".../site-packages/pyod/models/hbos.py", line 274, in _calculate_outlier_scores_auto
outlier_scores[j, i] = out_score_i[bin_inds[j] - 1]
IndexError: index 147 is out of bounds for axis 0 with size 147
Minimal Reproducible Example
from pyod.models.hbos import HBOS
import numpy as np
# Create training data with limited range [0, 10]
np.random.seed(42)
X_train = np.random.randn(100, 5) * 2 + 5
X_train = np.clip(X_train, 0, 10)
print(f"Training range: [{X_train.min():.2f}, {X_train.max():.2f}]")
# Fit model with auto bins
model = HBOS(n_bins='auto', contamination=0.1)
model.fit(X_train)
# Create test data with value OUTSIDE training range
X_test = np.array([[5, 5, 15, 5, 5]]) # Feature 2 value (15) exceeds training max (10)
# This raises IndexError
predictions = model.predict(X_test) # ❌ IndexError!
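Until the library handles this case, a possible stopgap is to clip each test feature into the range seen during training before calling `predict`. The helper `clip_to_training_range` below is hypothetical (not part of pyod) and has not been verified against pyod's internals; in particular, whether a value lying exactly on the outermost bin edge is accepted is an assumption.

```python
import numpy as np

def clip_to_training_range(X_train, X_test):
    """Clip every test feature into the [min, max] observed in training."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    # Per-feature bounds broadcast across the rows of X_test.
    return np.clip(X_test, X_train.min(axis=0), X_train.max(axis=0))

# Data shaped like the example above (values differ: another RNG is used).
rng = np.random.default_rng(42)
X_train = np.clip(rng.normal(5, 2, size=(100, 5)), 0, 10)
X_test = np.array([[5, 5, 15, 5, 5]])

X_test_clipped = clip_to_training_range(X_train, X_test)
# X_test_clipped would then be passed to model.predict(...) in place of X_test.
```

Clipping scores an out-of-range value as if it sat on the boundary, which can understate how anomalous it is, so this is a way to avoid the crash, not a substitute for a fix in the scoring code.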