This repository demonstrates AutoQUEST, a task-oriented agent designed to streamline informatics research workflows. Using the MAUDE (Manufacturer and User Facility Device Experience) database as an example data source, AutoQUEST automates research question generation, data integration, querying, and analysis to uncover actionable insights.
The MAUDE database, maintained by the U.S. FDA, contains detailed records of medical device failures, patient problems, and adverse events. Despite its potential, its size and complexity make it challenging to extract meaningful insights efficiently.
AutoQUEST simplifies this process by automating key aspects of data querying and analysis, allowing researchers to explore trends, identify high-risk devices, and generate evidence-based insights.
Medical device safety is critical in healthcare. The vast and complex MAUDE database provides valuable insights into device-related adverse events but poses significant challenges:
- Data Volume: Millions of records spanning decades.
- Complex Schema: Multiple linked tables and inconsistent formats.
- Manual Overhead: Significant time and expertise required for analysis.
AutoQUEST addresses these challenges with:
- Dynamic Research Question Generation: Automatically formulates research questions based on user goals and database structure.
- SQL Query Automation: Generates optimized queries for extracting data from complex schemas such as MAUDE's linked tables.
- Data Integration and Analysis: Handles data cleaning, merging, and trend analysis across large datasets.
- Reporting and Visualization: Produces actionable insights with dynamic visualizations and structured reports.
AutoQUEST is an intelligent agent built for automating research workflows. It operates through a modular THINK/DO Chain framework, optimizing both cognitive and operational processes:
- THINK Chain
  - Focuses on reasoning, planning, and memory to generate and refine research questions, develop execution plans, and manage contextual knowledge.
  - Example in MAUDE: AutoQUEST identifies key fields such as device problem codes or event types, ensuring targeted analysis.
- DO Chain
  - Executes SQL queries, retrieves data, handles errors, and performs analysis.
  - Example in MAUDE: Extracts longitudinal trends in adverse events by querying `Merged_Table_1` and visualizing outcomes.
This separation ensures scalability and adaptability to other datasets while maintaining the MAUDE database as a benchmark for its capabilities.
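The THINK/DO split can be sketched in a few lines of Python. This is a minimal illustration rather than the AutoQUEST implementation: the `Plan` class, the `think` and `do` functions, the hard-coded planner, and the `event_year` and `device_type` columns are all assumptions, and an in-memory SQLite table stands in for MAUDE.

```python
from __future__ import annotations

import sqlite3
from dataclasses import dataclass


@dataclass
class Plan:
    """Output of the THINK step: a research question plus the SQL to answer it."""
    question: str
    sql: str


def think(goal: str) -> Plan:
    # Hypothetical planner: a real THINK chain would reason over the schema
    # and the user's goal; here the mapping is hard-coded for illustration.
    question = f"How many adverse events are reported per year? (goal: {goal})"
    sql = (
        "SELECT event_year, COUNT(*) AS n_events "
        "FROM Merged_Table_1 GROUP BY event_year ORDER BY event_year"
    )
    return Plan(question=question, sql=sql)


def do(conn: sqlite3.Connection, plan: Plan) -> list[tuple]:
    # The DO step only executes the plan; it makes no planning decisions.
    return conn.execute(plan.sql).fetchall()


def build_demo_db() -> sqlite3.Connection:
    # Toy stand-in for MAUDE; the column names are assumptions.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Merged_Table_1 (event_year INTEGER, device_type TEXT)")
    conn.executemany(
        "INSERT INTO Merged_Table_1 VALUES (?, ?)",
        [(2021, "pump"), (2021, "stent"), (2022, "pump")],
    )
    return conn


conn = build_demo_db()
plan = think("Analyze adverse event trends by device type")
rows = do(conn, plan)
print(plan.question)
print(rows)
```

Because `do` receives a finished `Plan`, the same executor could in principle run against another dataset by swapping only the planner.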
The term QUEST reflects the core functionalities of the AutoQUEST system and can be understood as:
- Query: Formulating precise research questions and extracting relevant data.
- Understand: Analyzing the input and context to guide planning.
- Execute: Running queries and retrieving results.
- Synthesize: Processing and summarizing the data into actionable insights.
- Transform: Converting the findings into meaningful outputs such as reports or visualizations.
These components work together seamlessly to streamline complex informatics research workflows.
- User input: "Analyze adverse event trends by device type."
- AutoQUEST generates specific research questions, such as:
  - "Which devices have the highest frequency of adverse events over the last five years?"
  - "What trends exist in device failure rates across medical specialties?"
- AutoQUEST dynamically generates SQL queries:
  - Fetching adverse events by `device_problem_code` and `event_type` from `Merged_Table_3`.
  - Analyzing event frequency over time from `Merged_Table_1`.
- MAUDE datasets from multiple years are cleaned and merged.
  - Example: `Merged_Table_8` consolidates patient demographics to correlate adverse events with age or gender.
- AutoQUEST executes the queries and generates visualizations:
  - Heatmaps of adverse event frequencies by device type.
  - Trend graphs showing longitudinal changes in event rates.
- The results are summarized into dynamic reports tailored to the research question, such as:
  - "Top 5 High-Risk Devices by Event Frequency."
  - "Correlation Between Adverse Events and Patient Demographics."
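The query and reporting steps of this workflow might look roughly like the sketch below. It is not taken from `sql.ipynb`: the `device_type`, `event_type`, and `event_year` columns, the toy rows, and the helper names `top_devices` and `yearly_trend` are assumptions made for illustration. The real column names are described in `prompt.txt`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical columns on a toy stand-in for Merged_Table_1.
conn.execute(
    "CREATE TABLE Merged_Table_1 (device_type TEXT, event_type TEXT, event_year INTEGER)"
)
conn.executemany(
    "INSERT INTO Merged_Table_1 VALUES (?, ?, ?)",
    [
        ("infusion pump", "Malfunction", 2021),
        ("infusion pump", "Injury", 2022),
        ("infusion pump", "Malfunction", 2022),
        ("stent", "Injury", 2022),
        ("catheter", "Malfunction", 2021),
        ("catheter", "Malfunction", 2022),
    ],
)


def top_devices(conn, since_year, limit=5):
    """Rank device types by number of reported events since a given year."""
    sql = """
        SELECT device_type, COUNT(*) AS n_events
        FROM Merged_Table_1
        WHERE event_year >= ?
        GROUP BY device_type
        ORDER BY n_events DESC, device_type
        LIMIT ?
    """
    return conn.execute(sql, (since_year, limit)).fetchall()


def yearly_trend(conn):
    """Count events per device type and year (input for a heatmap or trend graph)."""
    sql = """
        SELECT device_type, event_year, COUNT(*) AS n_events
        FROM Merged_Table_1
        GROUP BY device_type, event_year
        ORDER BY device_type, event_year
    """
    return conn.execute(sql).fetchall()


top = top_devices(conn, since_year=2021, limit=2)
trend = yearly_trend(conn)
print(top)
print(trend)
```

The `(device_type, event_year, n_events)` rows returned by `yearly_trend` are already in the long format that most plotting libraries accept for heatmaps and line charts.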
- Reasoning: Generates targeted research questions based on the MAUDE schema and user goals.
  Example: Identifies relevant tables like `Merged_Table_1` and `Merged_Table_3` for event frequency analysis.
- Planning: Designs the workflow to query and analyze adverse events by device type and timeframe.
- Memory: Stores and recalls prior query results or reusable SQL snippets for iterative research.
- Execution: Runs SQL queries on the MAUDE database to fetch relevant data.
- Error Handling: Automatically corrects invalid queries using feedback from database errors.
- Analysis: Processes extracted data into insights, such as device safety trends or risk matrices.
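The error-handling step above, correcting an invalid query from the database's own error message, can be sketched as a retry loop. This is an assumption about the general shape, not AutoQUEST's actual code: `run_with_repair` and `toy_repair` are hypothetical names, and the rule-based fix stands in for whatever correction step the agent really uses.

```python
import re
import sqlite3


def run_with_repair(conn, sql, repair, max_attempts=3):
    """Execute sql; on a database error, ask `repair` for a corrected query and retry."""
    attempts = []  # (failed_sql, error_message) pairs, kept for inspection
    for _ in range(max_attempts):
        try:
            return conn.execute(sql).fetchall(), attempts
        except sqlite3.Error as exc:
            attempts.append((sql, str(exc)))
            sql = repair(sql, str(exc))
    raise RuntimeError(f"query still failing after {max_attempts} attempts")


def toy_repair(sql, error_message):
    # Hypothetical stand-in for the agent's correction step: it fixes one
    # known misspelling when the database reports an unknown column.
    if "no such column" in error_message:
        return re.sub(r"\bevent_typ\b", "event_type", sql)
    return sql


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Merged_Table_3 (event_type TEXT)")
conn.executemany(
    "INSERT INTO Merged_Table_3 VALUES (?)",
    [("Malfunction",), ("Injury",), ("Malfunction",)],
)

bad_sql = (
    "SELECT event_typ, COUNT(*) FROM Merged_Table_3 "
    "GROUP BY event_typ ORDER BY event_typ"
)
rows, attempts = run_with_repair(conn, bad_sql, toy_repair)
print(rows)
print(attempts)
```

Capping the number of attempts keeps a query that cannot be repaired from looping indefinitely, and the recorded attempts show which errors triggered each rewrite.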
Repository files:
- `sql.ipynb`: Jupyter notebook demonstrating SQL query generation and analysis with MAUDE data.
- `prompt.txt`: Describes MAUDE table schemas and fields for reference during query generation.
Merged tables used in the examples:
- `Merged_Table_1`: Longitudinal view of device-related events.
- `Merged_Table_3`: Focuses on event reports and device problems.
- `Merged_Table_8`: Consolidates patient data for demographic analysis.
- `Merged_Table_9`: Captures free-text narratives of adverse events.
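As an illustration of how these tables could be combined, the sketch below joins event reports to patient data to count events by demographic group. The join key `mdr_report_key` and the `patient_sex` column are assumptions about the merged schema (only `event_type` is named in this README), so check `prompt.txt` for the actual fields before adapting it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy stand-ins for two merged tables; key and demographic columns are assumed.
conn.executescript(
    """
    CREATE TABLE Merged_Table_3 (mdr_report_key INTEGER, event_type TEXT);
    CREATE TABLE Merged_Table_8 (mdr_report_key INTEGER, patient_sex TEXT);
    INSERT INTO Merged_Table_3 VALUES (1, 'Injury'), (2, 'Malfunction'), (3, 'Injury');
    INSERT INTO Merged_Table_8 VALUES (1, 'F'), (2, 'M'), (3, 'F');
    """
)

# Count events per (patient_sex, event_type) pair across the joined tables.
sql = """
    SELECT p.patient_sex, e.event_type, COUNT(*) AS n_events
    FROM Merged_Table_3 AS e
    JOIN Merged_Table_8 AS p ON p.mdr_report_key = e.mdr_report_key
    GROUP BY p.patient_sex, e.event_type
    ORDER BY p.patient_sex, e.event_type
"""
counts = conn.execute(sql).fetchall()
print(counts)
```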
AutoQUEST, demonstrated here with MAUDE, can be applied to other datasets requiring dynamic research workflows. Use cases shown with MAUDE include:
- Trend analysis of medical device adverse events.
- Identification of high-risk devices and failure patterns.
- Analysis of correlations between patient demographics and outcomes.
This project is licensed under the MIT License. It leverages publicly available data from the FDA's MAUDE database to demonstrate the capabilities of AutoQUEST.
For more details on the MAUDE database, refer to the FDA MAUDE Documentation.
For more research and projects from the author, visit: Lei's AI Research Lab
