-
Notifications
You must be signed in to change notification settings - Fork 2
feat: add OpenTelemetry plugin for database contention detection #20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Implements a comprehensive OpenTelemetry instrumentation plugin for the Colony entity manager to detect and monitor database transaction contention and performance issues. Features: - Automatic instrumentation of entity manager operations - Distributed tracing for transactions, queries, and locks - Database-specific contention detectors for PostgreSQL and MySQL - Real-time blocking query detection - Lock wait monitoring and analysis - Transaction statistics collection - Slow query identification - Deadlock and serialization failure detection Components: - telemetry_plugin.py: Main plugin following Colony patterns - system.py: Core telemetry system with instrumentation logic - decorators.py: Instrumented transaction decorators - contention.py: Database-specific contention detection - test.py: Unit tests for telemetry functionality Configuration: - Environment-based configuration (OTEL_* variables) - Support for OTLP exporters (Jaeger, Tempo, etc.) - Console exporter for development/debugging - Configurable slow query thresholds Metrics: - db.transaction.duration: Transaction execution time - db.query.duration: Query execution time - db.lock.wait_duration: Lock acquisition time - db.transaction.active: Active transaction count - db.contention.events: Contention event counter Documentation: - README.md: Comprehensive usage guide - QUICK_START.md: Quick setup instructions - example_usage.py: Practical examples The implementation is non-intrusive and follows the Colony plugin architecture, making it easy to integrate with existing applications.
Enhances the OpenTelemetry telemetry plugin with full MySQL 5.7+ and 8.0+ compatibility for database contention detection and monitoring. MySQL-Specific Features: - InnoDB lock wait detection with fallback for MySQL 5.7/8.0+ - Deadlock detection via error code 1213 and InnoDB status parsing - Transaction statistics from information_schema.innodb_trx - Slow query detection from PROCESSLIST - SHOW ENGINE INNODB STATUS parsing for recent deadlocks - Automatic version detection (sys.innodb_lock_waits for 8.0+) Enhanced Error Detection: - MySQL deadlock error code 1213 detection - Improved contention type classification (deadlock, lock_timeout, serialization_failure, lock_wait) - Database-specific error code parsing - Enhanced span attributes for contention events MySQLContentionDetector Methods: - get_blocking_queries(): Detects InnoDB lock waits - get_lock_waits(): Returns lock wait statistics - get_transaction_stats(): InnoDB transaction metrics + deadlock count - get_slow_queries(): Queries from PROCESSLIST exceeding threshold - get_deadlock_info(): Parses SHOW ENGINE INNODB STATUS Updated Instrumentation (system.py): - MySQL-specific error code detection in execute_query wrapper - Contention type attribute (db.contention.type) in spans - Improved error categorization for both MySQL and PostgreSQL Documentation: - example_mysql.py: Complete MySQL usage examples - README.md: Added MySQL-specific examples and usage - MySQL best practices for contention avoidance Query Compatibility: - Uses sys.innodb_lock_waits for MySQL 8.0+ - Falls back to information_schema.innodb_lock_waits for 5.7 - CASE WHEN aggregation instead of FILTER for MySQL compatibility - GLOBAL_STATUS for deadlock count retrieval The implementation maintains full backward compatibility with existing PostgreSQL functionality while adding comprehensive MySQL support.
WalkthroughAdds a new Telemetry plugin with OpenTelemetry integration: instrumentation, decorators, a Telemetry system, cross-database contention detectors (PostgreSQL, MySQL, SQLite), examples, docs, tests, and requirements updates to include OpenTelemetry packages. Changes
Sequence DiagramsequenceDiagram
actor App as Application
participant Dec as Decorator
participant Tele as Telemetry System
participant EM as Entity Manager
participant DB as Database
participant OTEL as OpenTelemetry
App->>Dec: Call decorated transactional function
Dec->>Tele: Request tracer/span
Tele->>OTEL: Start span (db.transaction.<name>)
Dec->>EM: begin_transaction()
EM->>DB: BEGIN
DB-->>EM: OK
Dec->>App: Execute function body (calls EM queries)
App->>EM: execute_query() / lock()
EM->>DB: SQL statements
DB-->>EM: Results / errors
EM->>Tele: Report metrics/events (query duration, lock wait)
Tele->>OTEL: Emit metrics & events
alt Success
Dec->>EM: commit_transaction()
EM->>DB: COMMIT
DB-->>EM: OK
Tele->>OTEL: Mark span OK, add commit event
else Exception (lock/deadlock/timeout)
EM->>Dec: raise exception
Dec->>Tele: Classify contention, increment retry metadata
Dec->>EM: rollback_transaction()
EM->>DB: ROLLBACK
DB-->>EM: OK
Tele->>OTEL: Record error, retry_count, and contention event
end
Dec->>OTEL: End span
OTEL-->>Dec: Closed
Dec-->>App: Return or raise
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~45 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
🧹 Nitpick comments (17)
telemetry/src/telemetry/test.py (1)
142-142: Consider extracting the exception message.The error message "Test error" could be extracted to a constant for better maintainability, though this is a test-only concern.
Based on static analysis hints.
telemetry/src/telemetry_plugin.py (1)
46-58: Consider adding ClassVar annotations for mutable class attributes.Static analysis suggests annotating these mutable class attributes with
typing.ClassVarto indicate they are class-level rather than instance-level. However, this may be a Colony framework convention that doesn't require type hints.Proposed enhancement
+from typing import ClassVar + class TelemetryPlugin(colony.Plugin): ... - platforms = [colony.CPYTHON_ENVIRONMENT] - capabilities = ["telemetry"] - dependencies = [ + platforms: ClassVar = [colony.CPYTHON_ENVIRONMENT] + capabilities: ClassVar = ["telemetry"] + dependencies: ClassVar = [ ... ] - main_modules = ["telemetry"] + main_modules: ClassVar = ["telemetry"]Based on static analysis hints.
telemetry/src/telemetry/system.py (1)
321-324: Consider restructuring the return statement.The
returnstatement at line 324 could be moved to anelseblock for better code structure, though the current implementation is functionally correct.Proposed refactor
try: result = original_execute_query(query, close_cursor) span.set_status(Status(StatusCode.OK)) - return result - - except Exception as e: + except Exception as e: + # error handling... + raise + else: + return resultBased on static analysis hints.
telemetry/src/telemetry/decorators.py (5)
81-84: Bareexceptswallows all exceptions silently.Using a bare
exceptwithpasshides failures when obtaining the tracer. This can mask issues like misconfigured OpenTelemetry or unexpected errors. Consider catching specific exceptions or at minimum logging a debug message.🔎 Proposed fix
if OTEL_AVAILABLE and trace_enabled: try: tracer = trace.get_tracer("colony.entity_manager") - except: - pass + except Exception: + # Tracer initialization failed; proceed without tracing + tracer = None
91-94: Bareexceptwhen retrieving engine name.Similar to the tracer acquisition, this bare
exceptsilently catches all exceptions. The fallback to "unknown" is reasonable, but consider catchingExceptionexplicitly for clarity.🔎 Proposed fix
try: engine_name = self_value.entity_manager.engine.get_engine_name() - except: + except Exception: engine_name = "unknown"
106-106: Unusedctx_tokenvariable.The result of
trace.set_span_in_context(span)is assigned but never used. If context propagation is intended, this token should be used to activate the context. Otherwise, consider removing it. Note:start_as_current_spancontext manager (used inmonitored_query) handles this automatically.🔎 Proposed fix - remove if not needed
- ctx_token = trace.set_span_in_context(span) + # Note: span is manually managed; context propagation not required here
150-155:retry_countis misleading—increment without retry logic.The
retry_countis incremented on contention detection and recorded as a span attribute, but the decorator does not implement actual retry logic. This could mislead users into thinking retries are handled automatically. Consider renaming tocontention_detected(boolean) or documenting that retry logic is the caller's responsibility.🔎 Proposed fix - clarify intent
- if is_contention: - span.set_attribute("db.transaction.contention", True) - span.add_event("contention_detected", {"error.type": type(e).__name__}) - retry_count += 1 - - span.set_attribute("db.transaction.retry_count", retry_count) + if is_contention: + span.set_attribute("db.transaction.contention", True) + span.set_attribute("db.transaction.retry_suggested", True) + span.add_event("contention_detected", {"error.type": type(e).__name__})
174-187: Unused*args, **kwargsin decorator signature.The
decoratorfunction accepts*args, **kwargsbut never uses them. This appears to be a remnant of boilerplate. Remove to avoid confusion.🔎 Proposed fix
- def decorator(function, *args, **kwargs): + def decorator(function): """ The decorator function for the instrumented transaction decorator. :type function: Function :param function: The function to be decorated. :rtype: Function :return: The decorator interceptor function. """ decorator_interceptor_function = create_decorator_interceptor(function) return decorator_interceptor_functiontelemetry/example_mysql.py (4)
33-40: Hardcoded credentials in example code.The example uses hardcoded credentials (
root/root). While acceptable for demonstration purposes, consider adding a comment noting this is example-only and should use environment variables in production, or use placeholder values like"your_password".🔎 Proposed improvement
# Create MySQL entity manager + # NOTE: Use environment variables for credentials in production entity_manager = entity_manager_plugin.load_entity_manager_properties("mysql", { "id": "mysql_database", "host": "localhost", "port": 3306, - "user": "root", - "password": "root", + "user": os.environ.get("MYSQL_USER", "root"), + "password": os.environ.get("MYSQL_PASSWORD", ""), "database": "example_db", "isolation": "read committed", })
119-119: f-string without placeholders.Line 119 uses an f-string but contains no interpolation. Remove the
fprefix.🔎 Proposed fix
- print(f"\nDeadlock/Block detected:") + print("\nDeadlock/Block detected:")
164-164: f-string without placeholders.Same issue as above—remove the
fprefix.🔎 Proposed fix
- print(f"\nSlow query detected:") + print("\nSlow query detected:")
280-318: Monitor thread lacks graceful shutdown mechanism.The
monitorfunction runs an infinitewhile Trueloop with no exit condition. Consider adding a shutdown flag or usingthreading.Eventfor graceful termination. The broad exception handling is acceptable here to keep the monitoring thread alive.🔎 Proposed improvement for graceful shutdown
-def monitor_mysql_periodically(entity_manager, telemetry_plugin, interval=30): +def monitor_mysql_periodically(entity_manager, telemetry_plugin, interval=30, stop_event=None): """ Sets up periodic MySQL monitoring for production environments. ... + :type stop_event: threading.Event + :param stop_event: Optional event to signal shutdown. """ import time import threading detector = telemetry_plugin.get_contention_detector("mysql") + if stop_event is None: + stop_event = threading.Event() def monitor(): print(f"\nStarted MySQL monitoring (checking every {interval}s)") - while True: + while not stop_event.is_set(): try: # ... monitoring logic ... - time.sleep(interval) + stop_event.wait(interval) except Exception as e: print(f"Error in monitoring thread: {e}") - time.sleep(interval) + stop_event.wait(interval)telemetry/src/telemetry/contention.py (2)
383-385: Silent exception swallowing in fallback path.The
try-except-passpattern silently ignores why the MySQL 8.0+ query failed. Consider logging at debug level to aid troubleshooting version compatibility issues.🔎 Proposed fix
except Exception: # Fallback to information_schema for older MySQL versions - pass + # Note: sys.innodb_lock_waits may not exist in MySQL < 8.0 + pass # Proceed to legacy fallbackOr if the telemetry system supports debug logging:
- except Exception: + except Exception as e: # Fallback to information_schema for older MySQL versions - pass + self.telemetry_system.plugin.debug( + "sys.innodb_lock_waits unavailable, using legacy query: %s" % str(e) + )
512-519: Silent exception when fetching deadlock count.The nested
try-except-passsilently ignores errors fetching the deadlock count fromGLOBAL_STATUS. This could hide permission issues or MySQL configuration problems.🔎 Proposed fix
try: cursor = entity_manager.execute_query(deadlock_query) row = cursor.fetchone() if row: stats["deadlocks"] = int(row[0]) if row[0] else 0 cursor.close() - except Exception: - pass + except Exception as e: + # GLOBAL_STATUS may not be accessible; deadlock count unavailable + stats["deadlocks"] = Nonetelemetry/example_usage.py (3)
32-39: Hardcoded credentials in example code.Similar to the MySQL example, consider adding a note about using environment variables in production.
🔎 Proposed improvement
- # Create entity manager with your configuration + # Create entity manager with your configuration + # NOTE: Use environment variables for credentials in production entity_manager = entity_manager_plugin.load_entity_manager_properties("pgsql", { "id": "example_database", "host": "localhost", - "user": "postgres", - "password": "postgres", + "user": os.environ.get("POSTGRES_USER", "postgres"), + "password": os.environ.get("POSTGRES_PASSWORD", ""), "database": "example_db", "isolation": "serializable", })
129-129: f-strings without placeholders at lines 129, 142, 164.Multiple f-strings lack interpolation. Remove the
fprefix from these lines.🔎 Proposed fixes
- print(f"\nBlocking situation detected:") + print("\nBlocking situation detected:")- print(f"\nLock wait detected:") + print("\nLock wait detected:")- print(f"\nSlow query detected:") + print("\nSlow query detected:")
229-254: Monitor thread lacks graceful shutdown mechanism.Same issue as the MySQL example—the infinite loop has no exit condition. Consider adding a
stop_eventparameter for graceful termination.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
requirements.txttelemetry/MYSQL_SUPPORT.mdtelemetry/QUICK_START.mdtelemetry/README.mdtelemetry/example_mysql.pytelemetry/example_usage.pytelemetry/src/telemetry/__init__.pytelemetry/src/telemetry/contention.pytelemetry/src/telemetry/decorators.pytelemetry/src/telemetry/system.pytelemetry/src/telemetry/test.pytelemetry/src/telemetry_plugin.py
🧰 Additional context used
🧬 Code graph analysis (7)
telemetry/src/telemetry/test.py (2)
telemetry/src/telemetry/decorators.py (1)
transaction(42-187)telemetry/src/telemetry/contention.py (10)
ContentionDetector(32-88)get_blocking_queries(51-62)get_blocking_queries(97-166)get_blocking_queries(339-426)get_lock_waits(64-75)get_lock_waits(168-212)get_lock_waits(428-466)get_transaction_stats(77-88)get_transaction_stats(214-273)get_transaction_stats(468-527)
telemetry/example_mysql.py (3)
data/src/entity_manager_plugin.py (1)
load_entity_manager_properties(78-91)telemetry/src/telemetry/system.py (2)
instrument_entity_manager(226-461)get_contention_detector(463-478)telemetry/src/telemetry/contention.py (9)
get_blocking_queries(51-62)get_blocking_queries(97-166)get_blocking_queries(339-426)get_lock_waits(64-75)get_lock_waits(168-212)get_lock_waits(428-466)get_slow_queries(275-329)get_slow_queries(529-581)get_deadlock_info(583-625)
telemetry/src/telemetry_plugin.py (2)
telemetry/src/telemetry/system.py (4)
Telemetry(52-478)instrument_entity_manager(226-461)get_contention_detector(463-478)configure(198-224)telemetry/src/telemetry/decorators.py (1)
transaction(42-187)
telemetry/src/telemetry/contention.py (1)
telemetry/src/telemetry/test.py (1)
warning(161-162)
telemetry/src/telemetry/system.py (2)
telemetry/src/telemetry_plugin.py (1)
instrument_entity_manager(68-79)telemetry/src/telemetry/contention.py (3)
PgSQLContentionDetector(91-329)MySQLContentionDetector(332-625)ContentionDetector(32-88)
telemetry/src/telemetry/decorators.py (3)
telemetry/src/telemetry/test.py (6)
begin(90-91)begin(127-128)commit(93-94)commit(130-131)rollback(96-97)rollback(133-134)threads/src/thread_pool/system.py (1)
set_status(947-955)template_engine/src/template_engine/system.py (1)
end(597-598)
telemetry/src/telemetry/__init__.py (3)
telemetry/src/telemetry/contention.py (2)
ContentionDetector(32-88)PgSQLContentionDetector(91-329)telemetry/src/telemetry/system.py (1)
Telemetry(52-478)telemetry/src/telemetry/test.py (1)
TelemetryTest(34-49)
🪛 GitHub Actions: Main Workflow
requirements.txt
[error] 1-1: pip install failed: No matching distribution found for opentelemetry-api>=1.20.0.
🪛 Ruff (0.14.8)
telemetry/src/telemetry/test.py
142-142: Avoid specifying long messages outside the exception class
(TRY003)
telemetry/example_mysql.py
68-68: Undefined name Warehouse
(F821)
75-75: Undefined name Warehouse
(F821)
93-93: Do not catch blind exception: Exception
(BLE001)
119-119: f-string without any placeholders
Remove extraneous f prefix
(F541)
164-164: f-string without any placeholders
Remove extraneous f prefix
(F541)
311-311: Do not catch blind exception: Exception
(BLE001)
telemetry/src/telemetry_plugin.py
46-46: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
47-47: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
48-57: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
58-58: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
telemetry/src/telemetry/contention.py
51-51: Unused method argument: entity_manager
(ARG002)
64-64: Unused method argument: entity_manager
(ARG002)
77-77: Unused method argument: entity_manager
(ARG002)
160-160: Consider moving this statement to an else block
(TRY300)
162-162: Do not catch blind exception: Exception
(BLE001)
206-206: Consider moving this statement to an else block
(TRY300)
208-208: Do not catch blind exception: Exception
(BLE001)
267-267: Consider moving this statement to an else block
(TRY300)
269-269: Do not catch blind exception: Exception
(BLE001)
323-323: Consider moving this statement to an else block
(TRY300)
325-325: Do not catch blind exception: Exception
(BLE001)
381-381: Consider moving this statement to an else block
(TRY300)
383-385: try-except-pass detected, consider logging the exception
(S110)
383-383: Do not catch blind exception: Exception
(BLE001)
420-420: Consider moving this statement to an else block
(TRY300)
422-422: Do not catch blind exception: Exception
(BLE001)
460-460: Consider moving this statement to an else block
(TRY300)
462-462: Do not catch blind exception: Exception
(BLE001)
518-519: try-except-pass detected, consider logging the exception
(S110)
518-518: Do not catch blind exception: Exception
(BLE001)
521-521: Consider moving this statement to an else block
(TRY300)
523-523: Do not catch blind exception: Exception
(BLE001)
575-575: Consider moving this statement to an else block
(TRY300)
577-577: Do not catch blind exception: Exception
(BLE001)
619-619: Consider moving this statement to an else block
(TRY300)
621-621: Do not catch blind exception: Exception
(BLE001)
telemetry/src/telemetry/system.py
324-324: Consider moving this statement to an else block
(TRY300)
telemetry/example_usage.py
67-67: Undefined name Person
(F821)
73-73: Undefined name Address
(F821)
89-89: Undefined name Person
(F821)
92-92: Undefined name Person
(F821)
112-112: Do not catch blind exception: Exception
(BLE001)
129-129: f-string without any placeholders
Remove extraneous f prefix
(F541)
142-142: f-string without any placeholders
Remove extraneous f prefix
(F541)
164-164: f-string without any placeholders
Remove extraneous f prefix
(F541)
248-248: Do not catch blind exception: Exception
(BLE001)
telemetry/src/telemetry/decorators.py
83-83: Do not use bare except
(E722)
83-84: try-except-pass detected, consider logging the exception
(S110)
93-93: Do not use bare except
(E722)
106-106: Local variable ctx_token is assigned to but never used
Remove assignment to unused variable ctx_token
(F841)
174-174: Unused function argument: args
(ARG001)
174-174: Unused function argument: kwargs
(ARG001)
220-220: Consider moving this statement to an else block
(TRY300)
🔇 Additional comments (18)
telemetry/README.md (1)
1-294: Well-structured documentation.The README provides comprehensive coverage of features, configuration, usage patterns, metrics, and troubleshooting. The examples for both PostgreSQL and MySQL are clear and helpful.
telemetry/QUICK_START.md (1)
1-184: Effective quick-start guide.The guide provides a clear path from installation through basic usage, with practical examples and troubleshooting advice.
telemetry/MYSQL_SUPPORT.md (1)
1-328: Comprehensive MySQL support documentation.This document provides excellent coverage of MySQL-specific features, including version compatibility, error detection, and InnoDB monitoring. The compatibility matrix and examples are particularly helpful.
telemetry/src/telemetry/__init__.py (1)
1-44: Clean module initialization.The module structure follows Python best practices with appropriate exports for the public API.
telemetry/src/telemetry/test.py (1)
57-174: Solid test coverage for core functionality.The tests appropriately cover plugin loading, decorator behavior with and without OpenTelemetry, exception handling with rollback, and base detector functionality using mock objects.
telemetry/src/telemetry_plugin.py (1)
60-112: Well-structured plugin implementation.The plugin follows Colony patterns correctly and provides a clean API surface for instrumentation, configuration, and contention detection.
telemetry/src/telemetry/system.py (6)
38-49: Proper graceful degradation for optional dependency.The try/except block correctly handles the case where OpenTelemetry packages are not installed, setting OTEL_AVAILABLE flag appropriately.
86-151: Robust default configuration with fallback options.The initialization logic properly checks for external configuration, falls back to OTLP or console exporters, and handles missing dependencies gracefully.
153-196: Clean metrics initialization.All metric instruments are properly created with appropriate names, descriptions, and units following OpenTelemetry conventions.
336-355: Comprehensive contention detection with MySQL support.The error detection logic properly identifies MySQL error code 1213 (deadlock) and falls back to string matching for other databases. The contention type classification (deadlock, lock_timeout, serialization_failure, lock_wait) is well-structured.
226-461: Thorough entity manager instrumentation.The instrumentation wraps all critical entity manager methods (begin, commit, rollback, execute_query, lock, lock_table) with proper tracing spans, metrics recording, and error handling. The implementation correctly tracks transaction lifetimes and detects slow queries.
463-478: Clean contention detector factory.The factory method provides appropriate detector instances based on the database engine.
requirements.txt (1)
2-4: The OpenTelemetry packages specified in requirements.txt are available at the requested versions. Latest versions are 1.39.1, released in December 2025, which exceeds the >=1.20.0 constraint. No changes needed.Likely an incorrect or invalid review comment.
telemetry/src/telemetry/decorators.py (2)
162-170:return_valuemay be referenced before assignment.If an exception occurs, the
finallyblock executes and thenraisere-throws the exception. However, thereturn return_valueon line 170 is unreachable after araise. This is actually safe because theraiseon line 160 exits before reaching line 170, but the code structure is slightly confusing.The current flow is correct—no fix needed.
190-231: LGTM!The
monitored_querydecorator is well-structured, using thestart_as_current_spancontext manager correctly for automatic span lifecycle management. The graceful degradation when OTEL is unavailable is properly implemented.telemetry/example_mysql.py (1)
320-347: LGTM!The
main()function provides clear guidance on MySQL-specific features and usage instructions. The documentation withinexample_mysql_best_practices()is informative and aligns with MySQL/InnoDB best practices.telemetry/src/telemetry/contention.py (1)
32-89: LGTM!The base
ContentionDetectorclass provides a clean interface with sensible default implementations (empty list/dict). The unusedentity_managerparameters in base methods are intentional to define the interface contract for subclasses.telemetry/example_usage.py (1)
259-281: LGTM!The
main()function provides clear usage instructions and documentation references. The overall example structure is well-organized and educational.
| # Lock source warehouse (this order of locking can cause deadlocks) | ||
| self.entity_manager.lock(Warehouse, from_warehouse) | ||
|
|
||
| # Simulate some processing | ||
| import time | ||
| time.sleep(0.1) | ||
|
|
||
| # Lock destination warehouse | ||
| self.entity_manager.lock(Warehouse, to_warehouse) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Undefined Warehouse class referenced.
The Warehouse class is used but never defined or imported. For a runnable example, consider either defining a stub class or adding a comment explaining this is pseudocode.
🔎 Proposed fix - add stub or comment
+# Note: Warehouse is a placeholder entity class representing your domain model
+# In a real application, this would be an Entity subclass defined in your codebase
+
class OrderService:
def __init__(self, entity_manager):
self.entity_manager = entity_managerCommittable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.14.8)
68-68: Undefined name Warehouse
(F821)
75-75: Undefined name Warehouse
(F821)
🤖 Prompt for AI Agents
In telemetry/example_mysql.py around lines 67 to 75 the code references an
undefined Warehouse class when locking warehouses; either define a minimal stub
Warehouse class in this module (e.g., a simple dataclass or class with an
id/name) above these lines, or add an explicit import from the module where
Warehouse is defined, or replace the reference with a clear comment that this is
pseudocode and requires a real Warehouse model in real usage; implement one of
these fixes so the example is runnable and the reference is resolved.
| class UserService: | ||
| def __init__(self, entity_manager): | ||
| self.entity_manager = entity_manager | ||
|
|
||
| @decorators.transaction() | ||
| def create_user_with_address(self, user_data, address_data): | ||
| """ | ||
| Creates a user and address in a single transaction. | ||
| This will be automatically traced. | ||
| """ | ||
|
|
||
| # Create user entity | ||
| user = Person() | ||
| user.name = user_data["name"] | ||
| user.email = user_data["email"] | ||
| self.entity_manager.save(user) | ||
|
|
||
| # Create address entity | ||
| address = Address() | ||
| address.street = address_data["street"] | ||
| address.city = address_data["city"] | ||
| address.person = user | ||
| self.entity_manager.save(address) | ||
|
|
||
| return user | ||
|
|
||
| @decorators.transaction() | ||
| def update_user_email(self, user_id, new_email): | ||
| """ | ||
| Updates a user's email. | ||
| Demonstrates lock acquisition tracking. | ||
| """ | ||
|
|
||
| # Lock the user record | ||
| self.entity_manager.lock(Person, user_id) | ||
|
|
||
| # Find and update | ||
| user = self.entity_manager.get(Person, user_id) | ||
| user.email = new_email | ||
| self.entity_manager.update(user) | ||
|
|
||
| return user | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Undefined Person and Address classes referenced.
These entity classes are used but never defined or imported. Add a comment clarifying these are placeholder domain entities.
🔎 Proposed fix
+ # Note: Person and Address are placeholder entity classes
+ # In a real application, these would be Entity subclasses from your domain model
class UserService:
def __init__(self, entity_manager):
self.entity_manager = entity_manager📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| class UserService: | |
| def __init__(self, entity_manager): | |
| self.entity_manager = entity_manager | |
| @decorators.transaction() | |
| def create_user_with_address(self, user_data, address_data): | |
| """ | |
| Creates a user and address in a single transaction. | |
| This will be automatically traced. | |
| """ | |
| # Create user entity | |
| user = Person() | |
| user.name = user_data["name"] | |
| user.email = user_data["email"] | |
| self.entity_manager.save(user) | |
| # Create address entity | |
| address = Address() | |
| address.street = address_data["street"] | |
| address.city = address_data["city"] | |
| address.person = user | |
| self.entity_manager.save(address) | |
| return user | |
| @decorators.transaction() | |
| def update_user_email(self, user_id, new_email): | |
| """ | |
| Updates a user's email. | |
| Demonstrates lock acquisition tracking. | |
| """ | |
| # Lock the user record | |
| self.entity_manager.lock(Person, user_id) | |
| # Find and update | |
| user = self.entity_manager.get(Person, user_id) | |
| user.email = new_email | |
| self.entity_manager.update(user) | |
| return user | |
| # Note: Person and Address are placeholder entity classes | |
| # In a real application, these would be Entity subclasses from your domain model | |
| class UserService: | |
| def __init__(self, entity_manager): | |
| self.entity_manager = entity_manager | |
| @decorators.transaction() | |
| def create_user_with_address(self, user_data, address_data): | |
| """ | |
| Creates a user and address in a single transaction. | |
| This will be automatically traced. | |
| """ | |
| # Create user entity | |
| user = Person() | |
| user.name = user_data["name"] | |
| user.email = user_data["email"] | |
| self.entity_manager.save(user) | |
| # Create address entity | |
| address = Address() | |
| address.street = address_data["street"] | |
| address.city = address_data["city"] | |
| address.person = user | |
| self.entity_manager.save(address) | |
| return user | |
| @decorators.transaction() | |
| def update_user_email(self, user_id, new_email): | |
| """ | |
| Updates a user's email. | |
| Demonstrates lock acquisition tracking. | |
| """ | |
| # Lock the user record | |
| self.entity_manager.lock(Person, user_id) | |
| # Find and update | |
| user = self.entity_manager.get(Person, user_id) | |
| user.email = new_email | |
| self.entity_manager.update(user) | |
| return user |
🧰 Tools
🪛 Ruff (0.14.8)
67-67: Undefined name Person
(F821)
73-73: Undefined name Address
(F821)
89-89: Undefined name Person
(F821)
92-92: Undefined name Person
(F821)
🤖 Prompt for AI Agents
In telemetry/example_usage.py around lines 55 to 97, the code references
undefined domain entity classes Person and Address; add a short comment near the
top of this snippet (or import area) stating that Person and Address are
placeholder domain entities used for example purposes and should be replaced
with real model definitions or imports in production, or alternatively
import/define minimal stub classes for the example so the sample code is
self-contained and clear to readers.
| # Complex query that might be slow | ||
| query = """ | ||
| SELECT u.name, u.email, COUNT(o.id) as order_count | ||
| FROM users u | ||
| LEFT JOIN orders o ON o.user_id = u.id | ||
| WHERE o.created_at BETWEEN %s AND %s | ||
| GROUP BY u.id, u.name, u.email | ||
| ORDER BY order_count DESC | ||
| LIMIT 100 | ||
| """ | ||
|
|
||
| cursor = self.entity_manager.execute_query(query % (start_date, end_date)) | ||
| results = cursor.fetchall() | ||
| cursor.close() | ||
|
|
||
| return results |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SQL injection vulnerability in example code.
The generate_user_report method uses string interpolation (%) to build the query with start_date and end_date. Even in example code, this demonstrates an insecure pattern that users might copy. Consider using parameterized queries or adding a security warning comment.
🔎 Proposed fix
# Complex query that might be slow
+ # WARNING: Use parameterized queries in production to prevent SQL injection
query = """
SELECT u.name, u.email, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
- WHERE o.created_at BETWEEN %s AND %s
+ WHERE o.created_at BETWEEN ? AND ?
GROUP BY u.id, u.name, u.email
ORDER BY order_count DESC
LIMIT 100
"""
- cursor = self.entity_manager.execute_query(query % (start_date, end_date))
+ # Use parameterized query (syntax depends on your database driver)
+ cursor = self.entity_manager.execute_query(query, params=(start_date, end_date))Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In telemetry/example_usage.py around lines 190 to 205, the SQL is built via
Python string interpolation which exposes an SQL injection risk; replace the
string interpolation with a parameterized query call using the DB driver's
parameter substitution (e.g., pass the query with placeholders and a tuple of
(start_date, end_date) to execute), or use the entity_manager/ORM parameter
binding API if available, and add a brief comment warning not to interpolate
user input directly into SQL.
| ```bash | ||
| pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update installation command with correct package versions.
This installation command will fail due to the invalid version constraint (>=1.20.0) mentioned in requirements.txt. Update once the correct versions are determined.
🤖 Prompt for AI Agents
In telemetry/QUICK_START.md around lines 9–11, the pip install example uses
unpinned packages but conflicts with the version constraints in requirements.txt
(the >=1.20.0 constraint is causing install failures); update the command to use
the concrete, compatible package versions listed in requirements.txt (or valid
semver specifiers that match those constraints), e.g. replace the current line
with a pip install that pins each opentelemetry package to the exact versions
from requirements.txt or to compatible >=/<== ranges that are valid.
| ```bash | ||
| pip install opentelemetry-api opentelemetry-sdk | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update installation instructions with correct package versions.
The installation command references OpenTelemetry packages, but the version constraint in requirements.txt (>=1.20.0) appears to be invalid based on the pipeline failure. Once the correct versions are verified, ensure this documentation is updated accordingly.
🤖 Prompt for AI Agents
In telemetry/README.md around lines 19-21, the install instruction lists
opentelemetry packages without correct version constraints that conflict with
requirements.txt (the >=1.20.0 constraint appears invalid and caused pipeline
failures); verify the exact, published OpenTelemetry package versions on PyPI
that are compatible with the project, then update the README install command to
include those explicit, working version(s) that match requirements.txt (or
alternatively fix requirements.txt to a valid constraint) so docs and CI are
consistent.
| def get_slow_queries(self, entity_manager, duration_threshold="5 seconds"): | ||
| """ | ||
| Returns currently executing slow queries. | ||
|
|
||
| :type entity_manager: EntityManager | ||
| :param entity_manager: The entity manager to query. | ||
| :type duration_threshold: String | ||
| :param duration_threshold: PostgreSQL interval for slow query threshold. | ||
| :rtype: list | ||
| :return: List of slow query information. | ||
| """ | ||
|
|
||
| query = """ | ||
| SELECT | ||
| pid, | ||
| usename, | ||
| application_name, | ||
| client_addr, | ||
| state, | ||
| query, | ||
| NOW() - query_start AS duration, | ||
| wait_event_type, | ||
| wait_event | ||
| FROM pg_stat_activity | ||
| WHERE state != 'idle' | ||
| AND query_start IS NOT NULL | ||
| AND NOW() - query_start > interval '%s' | ||
| ORDER BY duration DESC; | ||
| """ % duration_threshold |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SQL injection vulnerability via string interpolation.
The duration_threshold parameter is directly interpolated into the SQL query using % string formatting. If user input reaches this parameter, it could lead to SQL injection. Use parameterized queries instead.
🔎 Proposed fix
The fix depends on whether the entity manager's execute_query supports parameterized queries. If it does:
query = """
SELECT
pid,
usename,
application_name,
client_addr,
state,
query,
NOW() - query_start AS duration,
wait_event_type,
wait_event
FROM pg_stat_activity
WHERE state != 'idle'
AND query_start IS NOT NULL
- AND NOW() - query_start > interval '%s'
+ AND NOW() - query_start > interval %s
ORDER BY duration DESC;
- """ % duration_threshold
+ """
+ # Pass duration_threshold as a parameter
+ cursor = entity_manager.execute_query(query, params=(duration_threshold,))If parameterized queries aren't supported, validate the input:
import re
if not re.match(r'^\d+\s*(second|seconds|minute|minutes|hour|hours)$', duration_threshold, re.IGNORECASE):
raise ValueError("Invalid duration_threshold format")Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In telemetry/src/telemetry/contention.py around lines 275 to 303, the
duration_threshold is directly interpolated into the SQL string causing a SQL
injection risk; change the implementation to use a parameterized query (pass the
interval value as a bound parameter to entity_manager.execute_query) or, if the
entity manager does not support parameter binding for interval literals,
validate and whitelist the input format (e.g., allow only numeric value + unit
like "5 seconds|minutes|hours" via a strict regex) and raise ValueError on
invalid input before formatting into the SQL.
| def get_slow_queries(self, entity_manager, duration_threshold_seconds=5): | ||
| """ | ||
| Returns currently executing slow queries from processlist. | ||
|
|
||
| :type entity_manager: EntityManager | ||
| :param entity_manager: The entity manager to query. | ||
| :type duration_threshold_seconds: int | ||
| :param duration_threshold_seconds: Minimum query duration in seconds. | ||
| :rtype: list | ||
| :return: List of slow query information. | ||
| """ | ||
|
|
||
| query = """ | ||
| SELECT | ||
| ID as pid, | ||
| USER as user, | ||
| HOST as host, | ||
| DB as db, | ||
| COMMAND as command, | ||
| TIME as duration_seconds, | ||
| STATE as state, | ||
| INFO as query | ||
| FROM information_schema.PROCESSLIST | ||
| WHERE COMMAND != 'Sleep' | ||
| AND TIME > %d | ||
| AND INFO IS NOT NULL | ||
| ORDER BY TIME DESC; | ||
| """ % duration_threshold_seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SQL injection vulnerability via string interpolation.
Similar to the PostgreSQL detector, duration_threshold_seconds is directly interpolated into the SQL query. Although the docstring specifies int, the type is not enforced at runtime.
🔎 Proposed fix
- def get_slow_queries(self, entity_manager, duration_threshold_seconds=5):
+ def get_slow_queries(self, entity_manager, duration_threshold_seconds=5):
+ # Ensure parameter is an integer to prevent SQL injection
+ duration_threshold_seconds = int(duration_threshold_seconds)
+
"""
Returns currently executing slow queries from processlist.
...Or use parameterized queries if supported by the entity manager.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def get_slow_queries(self, entity_manager, duration_threshold_seconds=5): | |
| """ | |
| Returns currently executing slow queries from processlist. | |
| :type entity_manager: EntityManager | |
| :param entity_manager: The entity manager to query. | |
| :type duration_threshold_seconds: int | |
| :param duration_threshold_seconds: Minimum query duration in seconds. | |
| :rtype: list | |
| :return: List of slow query information. | |
| """ | |
| query = """ | |
| SELECT | |
| ID as pid, | |
| USER as user, | |
| HOST as host, | |
| DB as db, | |
| COMMAND as command, | |
| TIME as duration_seconds, | |
| STATE as state, | |
| INFO as query | |
| FROM information_schema.PROCESSLIST | |
| WHERE COMMAND != 'Sleep' | |
| AND TIME > %d | |
| AND INFO IS NOT NULL | |
| ORDER BY TIME DESC; | |
| """ % duration_threshold_seconds | |
| def get_slow_queries(self, entity_manager, duration_threshold_seconds=5): | |
| """ | |
| Returns currently executing slow queries from processlist. | |
| :type entity_manager: EntityManager | |
| :param entity_manager: The entity manager to query. | |
| :type duration_threshold_seconds: int | |
| :param duration_threshold_seconds: Minimum query duration in seconds. | |
| :rtype: list | |
| :return: List of slow query information. | |
| """ | |
| # Ensure parameter is an integer to prevent SQL injection | |
| duration_threshold_seconds = int(duration_threshold_seconds) | |
| query = """ | |
| SELECT | |
| ID as pid, | |
| USER as user, | |
| HOST as host, | |
| DB as db, | |
| COMMAND as command, | |
| TIME as duration_seconds, | |
| STATE as state, | |
| INFO as query | |
| FROM information_schema.PROCESSLIST | |
| WHERE COMMAND != 'Sleep' | |
| AND TIME > %d | |
| AND INFO IS NOT NULL | |
| ORDER BY TIME DESC; | |
| """ % duration_threshold_seconds |
🤖 Prompt for AI Agents
In telemetry/src/telemetry/contention.py around lines 529 to 556, the SQL builds
a query by interpolating duration_threshold_seconds directly into the string
which allows SQL injection; change this to use a parameterized query or at
minimum enforce an integer cast before using it. Replace the string
interpolation with a parameter placeholder supported by the entity_manager (e.g.
"%s" or "?" depending on its DB API) and pass duration_threshold_seconds as a
bound parameter to entity_manager.execute; if parameterization is not possible,
explicitly validate/cast duration_threshold_seconds =
int(duration_threshold_seconds) and use that integer in the query string.
| def get_deadlock_info(self, entity_manager): | ||
| """ | ||
| Returns recent deadlock information from InnoDB status. | ||
|
|
||
| :type entity_manager: EntityManager | ||
| :param entity_manager: The entity manager to query. | ||
| :rtype: dict | ||
| :return: Deadlock information. | ||
| """ | ||
|
|
||
| query = "SHOW ENGINE INNODB STATUS" | ||
|
|
||
| try: | ||
| cursor = entity_manager.execute_query(query) | ||
| row = cursor.fetchone() | ||
|
|
||
| if row and len(row) > 2: | ||
| status_text = row[2] | ||
|
|
||
| # Parse the latest deadlock information | ||
| deadlock_info = {} | ||
| if "LATEST DETECTED DEADLOCK" in status_text: | ||
| # Extract deadlock section | ||
| start = status_text.find("LATEST DETECTED DEADLOCK") | ||
| end = status_text.find("------------", start + 1) | ||
| if end > start: | ||
| deadlock_section = status_text[start:end] | ||
| deadlock_info["has_recent_deadlock"] = True | ||
| deadlock_info["deadlock_text"] = deadlock_section[:500] # First 500 chars | ||
| else: | ||
| deadlock_info["has_recent_deadlock"] = False | ||
|
|
||
| cursor.close() | ||
| return deadlock_info | ||
|
|
||
| cursor.close() | ||
| return {} | ||
|
|
||
| except Exception as e: | ||
| self.telemetry_system.plugin.warning( | ||
| "Failed to get deadlock info: %s" % str(e) | ||
| ) | ||
| return {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deadlock parsing may miss section boundary.
The parsing logic looks for "------------" as the end marker, but InnoDB status sections can have varying delimiter counts. If the delimiter isn't found exactly, end becomes -1, causing end > start to fail, and deadlock_info remains empty (not even has_recent_deadlock: False).
🔎 Proposed fix
if "LATEST DETECTED DEADLOCK" in status_text:
# Extract deadlock section
start = status_text.find("LATEST DETECTED DEADLOCK")
end = status_text.find("------------", start + 1)
- if end > start:
+ if end > start and end != -1:
deadlock_section = status_text[start:end]
deadlock_info["has_recent_deadlock"] = True
deadlock_info["deadlock_text"] = deadlock_section[:500] # First 500 chars
+ else:
+ # Section found but couldn't determine end; still flag as detected
+ deadlock_info["has_recent_deadlock"] = True
+ deadlock_info["deadlock_text"] = status_text[start:start+500]
else:
deadlock_info["has_recent_deadlock"] = False🧰 Tools
🪛 Ruff (0.14.8)
619-619: Consider moving this statement to an else block
(TRY300)
621-621: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
telemetry/src/telemetry/contention.py around lines 583 to 625: the current logic
looks for the exact string "------------" as the end of the deadlock section
which can fail (find returns -1) and leaves deadlock_info empty; update parsing
to locate the next section delimiter more robustly — after finding the "LATEST
DETECTED DEADLOCK" start, search for the next line that is a sequence of hyphens
of length >= 3 using a regex or a loop over lines, treat a not-found end as
end-of-text, set deadlock_info["has_recent_deadlock"] = True when a section is
found and always set it to False when the marker isn't present, trim the
captured section to the first 500 chars, and ensure the cursor is closed in all
paths (preferably via finally or context manager).
Adds complete SQLite support to the OpenTelemetry telemetry plugin for database monitoring and contention detection. SQLite's file-based architecture requires different monitoring approaches than client-server databases. SQLite-Specific Features: - Database file monitoring (path, size, page statistics) - PRAGMA-based diagnostics for database configuration - SQLITE_BUSY error detection (database locked errors) - Database integrity checks (PRAGMA integrity_check) - Optimization support (VACUUM and ANALYZE) - Lock configuration monitoring (busy_timeout, locking_mode) - Journal mode detection (DELETE, WAL, etc.) SQLiteContentionDetector Methods: - get_database_info(): File path, size, PRAGMA settings - get_lock_waits(): Lock configuration (timeout, mode) - get_transaction_stats(): Database statistics and transaction state - get_integrity_check(): Runs PRAGMA integrity_check - optimize_database(): Runs VACUUM and ANALYZE - get_blocking_queries(): Returns empty (no system tables) - get_slow_queries(): Returns empty (use OTel traces) Error Detection Enhancements (system.py): - SQLite OperationalError detection for "database is locked" - Contention type classification: "database_locked" - Enhanced error keyword detection including "busy" - Automatic SQLITE_BUSY error categorization Key Differences from PostgreSQL/MySQL: - File-based database (not client-server) - Database-level locks (not row-level) - No system tables for monitoring connections - PRAGMA statements for diagnostics - VACUUM needed for space reclamation - WAL mode recommended for concurrency Documentation: - example_sqlite.py: Complete SQLite usage examples - README.md: SQLite-specific examples and usage - Best practices for WAL mode and optimization - Monitoring strategies for file-based databases Components Updated: - contention.py: Added SQLiteContentionDetector class (338 lines) - system.py: Enhanced error detection for SQLite - __init__.py: Export SQLiteContentionDetector - README.md: SQLite examples and documentation The implementation provides appropriate monitoring for SQLite's unique characteristics while maintaining API compatibility with the PostgreSQL and MySQL detectors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
♻️ Duplicate comments (3)
telemetry/src/telemetry/contention.py (3)
275-303: SQL injection vulnerability remains unaddressed.The
duration_thresholdparameter is still directly interpolated into the SQL query. This was flagged in a previous review. Consider validating input format or using parameterized queries.
529-556: SQL injection vulnerability via integer interpolation remains unaddressed.The
duration_threshold_secondsparameter is interpolated without explicit type enforcement. This was flagged in a previous review.
604-616: Deadlock parsing edge case remains unaddressed.If the
"------------"delimiter isn't found after the deadlock section start,endwill be-1and nohas_recent_deadlockkey is set in the returned dict. This was flagged in a previous review.
🧹 Nitpick comments (1)
telemetry/SQLITE_SUPPORT.md (1)
207-222: Add language specifiers to fenced code blocks.The trace attribute code blocks at lines 207-213 and 216-222 are missing language specifiers. Consider adding
textoryamlfor better rendering and markdown linter compliance.🔎 Suggested fix
## 🔍 Trace Attributes for SQLite ### Transaction Spans -``` +```text db.system = "sqlite" db.transaction.type = "required" db.transaction.duration_ms = 125 db.transaction.contention = true # If SQLITE_BUSY occurred db.contention.type = "database_locked" # SQLite-specificQuery Spans
-
+text
db.system = "sqlite"
db.statement = "UPDATE products SET..."
db.slow_query = true
db.lock.contention = true
db.contention.type = "database_locked"
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
telemetry/README.mdtelemetry/SQLITE_SUPPORT.mdtelemetry/example_sqlite.pytelemetry/src/telemetry/__init__.pytelemetry/src/telemetry/contention.pytelemetry/src/telemetry/system.py
🚧 Files skipped from review as they are similar to previous changes (1)
- telemetry/src/telemetry/init.py
🧰 Additional context used
🧬 Code graph analysis (2)
telemetry/example_sqlite.py (3)
data/src/entity_manager_plugin.py (1)
load_entity_manager_properties(78-91)telemetry/src/telemetry/system.py (1)
get_contention_detector(471-488)telemetry/src/telemetry/contention.py (10)
get_database_info(637-738)get_lock_waits(64-75)get_lock_waits(168-212)get_lock_waits(428-466)get_lock_waits(755-788)get_transaction_stats(77-88)get_transaction_stats(214-273)get_transaction_stats(468-527)get_transaction_stats(790-858)get_integrity_check(932-963)
telemetry/src/telemetry/contention.py (1)
telemetry/src/telemetry/test.py (1)
warning(161-162)
🪛 markdownlint-cli2 (0.18.1)
telemetry/SQLITE_SUPPORT.md
207-207: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
216-216: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
330-330: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
361-361: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
452-452: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
telemetry/README.md
299-299: Heading levels should only increment by one level at a time
Expected: h3; Actual: h4
(MD001, heading-increment)
🪛 Ruff (0.14.8)
telemetry/src/telemetry/system.py
324-324: Consider moving this statement to an else block
(TRY300)
telemetry/example_sqlite.py
34-34: Probable insecure usage of temporary file or directory: "/tmp/example.db"
(S108)
64-64: Undefined name Order
(F821)
76-76: Do not catch blind exception: Exception
(BLE001)
166-166: f-string without any placeholders
Remove extraneous f prefix
(F541)
295-295: Do not catch blind exception: Exception
(BLE001)
345-345: Do not catch blind exception: Exception
(BLE001)
telemetry/src/telemetry/contention.py
51-51: Unused method argument: entity_manager
(ARG002)
64-64: Unused method argument: entity_manager
(ARG002)
77-77: Unused method argument: entity_manager
(ARG002)
160-160: Consider moving this statement to an else block
(TRY300)
162-162: Do not catch blind exception: Exception
(BLE001)
206-206: Consider moving this statement to an else block
(TRY300)
208-208: Do not catch blind exception: Exception
(BLE001)
267-267: Consider moving this statement to an else block
(TRY300)
269-269: Do not catch blind exception: Exception
(BLE001)
323-323: Consider moving this statement to an else block
(TRY300)
325-325: Do not catch blind exception: Exception
(BLE001)
381-381: Consider moving this statement to an else block
(TRY300)
383-385: try-except-pass detected, consider logging the exception
(S110)
383-383: Do not catch blind exception: Exception
(BLE001)
420-420: Consider moving this statement to an else block
(TRY300)
422-422: Do not catch blind exception: Exception
(BLE001)
460-460: Consider moving this statement to an else block
(TRY300)
462-462: Do not catch blind exception: Exception
(BLE001)
518-519: try-except-pass detected, consider logging the exception
(S110)
518-518: Do not catch blind exception: Exception
(BLE001)
521-521: Consider moving this statement to an else block
(TRY300)
523-523: Do not catch blind exception: Exception
(BLE001)
575-575: Consider moving this statement to an else block
(TRY300)
577-577: Do not catch blind exception: Exception
(BLE001)
619-619: Consider moving this statement to an else block
(TRY300)
621-621: Do not catch blind exception: Exception
(BLE001)
662-663: try-except-pass detected, consider logging the exception
(S110)
662-662: Do not catch blind exception: Exception
(BLE001)
678-679: try-except-pass detected, consider logging the exception
(S110)
678-678: Do not catch blind exception: Exception
(BLE001)
688-689: try-except-pass detected, consider logging the exception
(S110)
688-688: Do not catch blind exception: Exception
(BLE001)
698-699: try-except-pass detected, consider logging the exception
(S110)
698-698: Do not catch blind exception: Exception
(BLE001)
709-710: try-except-pass detected, consider logging the exception
(S110)
709-709: Do not catch blind exception: Exception
(BLE001)
719-720: try-except-pass detected, consider logging the exception
(S110)
719-719: Do not catch blind exception: Exception
(BLE001)
729-730: try-except-pass detected, consider logging the exception
(S110)
729-729: Do not catch blind exception: Exception
(BLE001)
732-732: Consider moving this statement to an else block
(TRY300)
734-734: Do not catch blind exception: Exception
(BLE001)
740-740: Unused method argument: entity_manager
(ARG002)
782-782: Consider moving this statement to an else block
(TRY300)
784-784: Do not catch blind exception: Exception
(BLE001)
813-814: try-except-pass detected, consider logging the exception
(S110)
813-813: Do not catch blind exception: Exception
(BLE001)
829-830: try-except-pass detected, consider logging the exception
(S110)
829-829: Do not catch blind exception: Exception
(BLE001)
839-840: try-except-pass detected, consider logging the exception
(S110)
839-839: Do not catch blind exception: Exception
(BLE001)
849-850: try-except-pass detected, consider logging the exception
(S110)
849-849: Do not catch blind exception: Exception
(BLE001)
852-852: Consider moving this statement to an else block
(TRY300)
854-854: Do not catch blind exception: Exception
(BLE001)
860-860: Unused method argument: entity_manager
(ARG002)
860-860: Unused method argument: duration_threshold_seconds
(ARG002)
904-904: Do not catch blind exception: Exception
(BLE001)
905-905: Use explicit conversion flag
Replace with conversion flag
(RUF010)
913-913: Do not catch blind exception: Exception
(BLE001)
914-914: Use explicit conversion flag
Replace with conversion flag
(RUF010)
924-924: Consider moving this statement to an else block
(TRY300)
926-926: Do not catch blind exception: Exception
(BLE001)
953-957: Consider moving this statement to an else block
(TRY300)
959-959: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (9)
telemetry/src/telemetry/system.py (5)
249-261: LGTM!The per-connection transaction timing tracking using
id(connection)as the key is appropriate for the use case. The closure-scoped dictionary ensures isolation per instrumented entity manager.
263-303: LGTM!The commit and rollback wrappers correctly handle timing and metrics. The
finallyblock ensuresactive_transactionsis decremented even if the underlying operation fails.
305-400: LGTM!The query instrumentation is comprehensive with proper span creation, query truncation (500 chars), contention detection for multiple database engines, and configurable slow query thresholds. The error classification logic correctly identifies MySQL deadlocks (error 1213), SQLite BUSY errors, and generic lock-related errors.
401-458: LGTM!The lock method wrappers correctly instrument both row-level and table-level locks with appropriate span attributes and duration metrics.
471-488: LGTM!Clean factory method for creating engine-specific contention detectors with a sensible fallback to the base
ContentionDetector.telemetry/example_sqlite.py (1)
13-44: LGTM!The setup function follows the correct Colony plugin loading pattern and demonstrates proper instrumentation of the SQLite entity manager.
telemetry/src/telemetry/contention.py (3)
637-738: LGTM!The
get_database_infomethod comprehensively collects SQLite database information using PRAGMA commands. The individual try-except blocks provide graceful degradation if specific PRAGMAs fail, which is appropriate for diagnostic tooling.
877-930: LGTM!The
optimize_databasemethod correctly handles both ANALYZE and VACUUM operations with appropriate error handling. The comment noting that VACUUM cannot run inside a transaction is helpful, and failures are properly captured in the results dict.
932-963: LGTM!The
get_integrity_checkmethod correctly implements SQLite's PRAGMA integrity_check with proper result parsing and structured return values.
| order = self.entity_manager.get(Order, order_id) | ||
| order.status = "processed" | ||
| self.entity_manager.update(order) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Order class is undefined.
The Order class is used but never imported or defined. This will raise a NameError if the example is executed. Consider either:
- Adding a comment explaining that
Ordershould be replaced with the user's entity class, or - Creating a placeholder class for the example.
🔎 Suggested fix
def example_sqlite_lock_handling(entity_manager):
"""
Example demonstrating SQLite lock detection and handling.
"""
+ # Note: Replace 'Order' with your actual entity class
+ # class Order:
+ # status = None
+
class OrderService:
def __init__(self, entity_manager):
self.entity_manager = entity_manager
@decorators.transaction()
def process_order(self, order_id):
"""
Processes an order.
If the database is locked, OpenTelemetry will detect it.
"""
- # Simulate some database work
+ # Example: Replace Order with your entity class
order = self.entity_manager.get(Order, order_id)Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.14.8)
64-64: Undefined name Order
(F821)
🤖 Prompt for AI Agents
In telemetry/example_sqlite.py around lines 64 to 66, the example uses Order but
Order is not defined or imported causing a NameError; fix by either adding a
brief inline comment above this snippet stating "Replace Order with your entity
class" or by adding a minimal placeholder dataclass/entity definition for Order
in the example (with the fields used, e.g., id and status) and import it
appropriately so the snippet runs as-is.
| print("\nRunning VACUUM and ANALYZE...") | ||
| results = detector.optimize_database(entity_manager) | ||
|
|
||
| print(f"\nOptimization Results:") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unnecessary f-string prefix.
The string has no placeholders, so the f prefix is unnecessary.
🔎 Suggested fix
- print(f"\nOptimization Results:")
+ print("\nOptimization Results:")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| print(f"\nOptimization Results:") | |
| print("\nOptimization Results:") |
🧰 Tools
🪛 Ruff (0.14.8)
166-166: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
In telemetry/example_sqlite.py at line 166, the print statement uses an
unnecessary f-string prefix for a plain string; remove the leading "f" so the
statement becomes a normal string literal (print("\nOptimization Results:")) to
avoid misleading use of f-strings.
| # Check integrity periodically (e.g., daily) | ||
| import datetime | ||
| if datetime.datetime.now().hour == 3: # 3 AM | ||
| integrity = detector.get_integrity_check(entity_manager) | ||
| if not integrity.get('is_ok'): | ||
| print("\n[ALERT] Database integrity check failed!") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Integrity check may run multiple times during 3 AM hour.
The condition datetime.datetime.now().hour == 3 will evaluate to True for every iteration within the 3 AM hour, causing the integrity check to run repeatedly (every interval seconds). Consider tracking whether the check was already performed today.
🔎 Suggested fix
+ last_integrity_date = None
+
def monitor():
print(f"\nStarted SQLite monitoring (checking every {interval}s)")
+ nonlocal last_integrity_date
while True:
try:
# ... existing code ...
# Check integrity periodically (e.g., daily)
import datetime
- if datetime.datetime.now().hour == 3: # 3 AM
+ today = datetime.date.today()
+ if datetime.datetime.now().hour == 3 and last_integrity_date != today:
integrity = detector.get_integrity_check(entity_manager)
if not integrity.get('is_ok'):
print("\n[ALERT] Database integrity check failed!")
+ last_integrity_date = todayCommittable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In telemetry/example_sqlite.py around lines 336 to 341, the integrity check
conditional uses datetime.datetime.now().hour == 3 which will be True for every
loop iteration during the 3 AM hour and cause repeated checks; fix this by
adding a persistent tracker (e.g., last_integrity_date variable declared outside
the loop) and only run the check when the current date is different from
last_integrity_date and the hour is 3, then update last_integrity_date to today
after running the check so it only runs once per day.
| ## License | ||
|
|
||
| Apache License, Version 2.0 | ||
|
|
||
| #### SQLite Example | ||
|
|
||
| ```python | ||
| # Get a SQLite contention detector | ||
| detector = telemetry_plugin.get_contention_detector("sqlite") | ||
|
|
||
| # Get database information (file path, size, PRAGMA settings) | ||
| info = detector.get_database_info(entity_manager) | ||
| print(f"File path: {info['file_path']}") | ||
| print(f"File size: {info['file_size_mb']} MB") | ||
| print(f"Journal mode: {info['journal_mode']}") | ||
| print(f"Synchronous mode: {info['synchronous']}") | ||
| print(f"Busy timeout: {info['busy_timeout_ms']} ms") | ||
|
|
||
| # Get lock configuration | ||
| lock_info = detector.get_lock_waits(entity_manager) | ||
| for info in lock_info: | ||
| print(f"Busy timeout: {info['busy_timeout_ms']} ms") | ||
| print(f"Locking mode: {info['locking_mode']}") | ||
|
|
||
| # Get transaction statistics | ||
| stats = detector.get_transaction_stats(entity_manager) | ||
| print(f"Database size: {stats['database_size_mb']} MB") | ||
| print(f"Page count: {stats['page_count']}") | ||
| print(f"Freelist count: {stats['freelist_count']}") | ||
| print(f"Active transaction: {stats['active_transaction']}") | ||
|
|
||
| # Run integrity check | ||
| integrity = detector.get_integrity_check(entity_manager) | ||
| if integrity['is_ok']: | ||
| print("✓ Database integrity OK") | ||
| else: | ||
| print("⚠️ Database integrity issues detected") | ||
|
|
||
| # Optimize database (VACUUM and ANALYZE) | ||
| results = detector.optimize_database(entity_manager) | ||
| print(f"VACUUM: {results['vacuum']}") | ||
| print(f"ANALYZE: {results['analyze']}") | ||
| print(f"Space saved: {results.get('space_saved_mb', 0)} MB") | ||
| ``` | ||
|
|
||
| **SQLite-Specific Notes:** | ||
| - SQLite uses database-level locks (not row-level) | ||
| - Contention detected via SQLITE_BUSY/OperationalError | ||
| - No processlist or system tables like PostgreSQL/MySQL | ||
| - Use PRAGMA statements for diagnostics | ||
| - Enable WAL mode for better concurrency | ||
|
|
||
| See [example_sqlite.py](example_sqlite.py) for complete SQLite examples. | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SQLite Example section is misplaced after License.
The SQLite Example section (lines 299-348) appears after the License section, which breaks the document flow. It should be placed alongside the PostgreSQL and MySQL examples in the "Contention Detection" section (around line 128).
Additionally, the heading level #### doesn't follow the expected increment from ## (should start at ###).
🔎 Suggested restructure
Move the SQLite example block (lines 299-348) to after the MySQL example (line 168), changing the heading to:
-#### SQLite Example
+### SQLite ExampleThen ensure the License section remains at the end of the document.
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
299-299: Heading levels should only increment by one level at a time
Expected: h3; Actual: h4
(MD001, heading-increment)
🤖 Prompt for AI Agents
telemetry/README.md lines 295-348: the "SQLite Example" block is incorrectly
placed after the License and uses an incorrect heading level; cut the entire
block (lines 299-348) and paste it into the "Contention Detection" examples area
immediately after the MySQL example (around line ~168), change the heading from
"#### SQLite Example" to "### SQLite Example" so it follows the document
hierarchy, and confirm the License section stays at the end of the file.
| } | ||
| ``` | ||
|
|
||
| **PRAGMACommands Used:** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: Missing space in "PRAGMACommands".
Should be "PRAGMA Commands" for readability.
🔎 Suggested fix
-**PRAGMACommands Used:**
+**PRAGMA Commands Used:**📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| **PRAGMACommands Used:** | |
| **PRAGMA Commands Used:** |
🤖 Prompt for AI Agents
In telemetry/SQLITE_SUPPORT.md around line 64, the heading "PRAGMACommands
Used:" has a missing space; change it to "PRAGMA Commands Used:" (i.e., insert a
space between "PRAGMA" and "Commands") to improve readability and consistency
with other headings.
Implements a comprehensive OpenTelemetry instrumentation plugin for the
Colony entity manager to detect and monitor database transaction
contention and performance issues.
Features:
Components:
Configuration:
Metrics:
Documentation:
The implementation is non-intrusive and follows the Colony plugin
architecture, making it easy to integrate with existing applications.
Summary by CodeRabbit
New Features
Documentation
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.