Skip to content

Component migration (8.4–9.6) and release workflow update#76

Open
askdba wants to merge 38 commits intomainfrom
feature/component-migration-8.4-9.6
Open

Component migration (8.4–9.6) and release workflow update#76
askdba wants to merge 38 commits intomainfrom
feature/component-migration-8.4-9.6

Conversation

@askdba
Copy link
Owner

@askdba askdba commented Feb 12, 2026

Summary

  • Component migration work for MySQL 8.4 / 9.6 compatibility (build, CI, component sources).
  • Release workflow: bump softprops/action-gh-release from v1 to v2 in .github/workflows/release.yml (no input/env changes required).

Release action change

  • Create Release step now uses softprops/action-gh-release@v2.
  • Same inputs used: files, generate_release_notes; v2 is backward compatible for this usage.

Made with Cursor


Note

High Risk
High risk because it introduces a new MySQL component build path (new CMake targets, build scripts, and component lifecycle/binlog persistence code) and significantly rewires CI/release automation across multiple MySQL versions.

Overview
Adds a first-class MySQL Component build mode alongside the existing in-tree plugin build: CMakeLists.txt now builds myvector_component when MYSQL_SOURCE_DIR is provided (and still supports MYSQL_ADD_PLUGIN when built inside mysql-server), with new build scripts (scripts/build-component.sh, scripts/build-component-9.6-docker.sh) and a component manifest (src/component_src/myvector.json).

Expands CI to build and validate component install/uninstall/reinstall across MySQL 8.0/8.4/9.0 plus a best-effort MySQL 9.6 path, and strengthens plugin tests with an explicit reinstall cycle; also extends cppcheck coverage to src/component_src.

Updates automation and repo hygiene: release workflow now builds component artifacts (and uses softprops/action-gh-release@v2 with corrected checksum generation), Docker publish runs via workflow_run gating, super-linter config adds targeted actionlint/markdownlint overrides, and docs/.gitignore are updated (including removing the committed top-level Makefile).

Written by Cursor Bugbot for commit d2d5f33. This will update automatically on new commits. Configure here.

askdba and others added 2 commits February 12, 2026 00:13
…it rollback)

- Add component build: myvector_component.cc, query rewriter, binlog service, UDF service
- Binlog state: persist/load binlog_state.json (file, position, server UUID); refuse start on UUID mismatch
- Init: UDF register then binlog start; rollback (deregister UDFs, release handle) on failure
- CMake: require MYSQL_SOURCE_DIR; build-component.sh and pre-pr.sh
- CI: build-component, test-component (INSTALL COMPONENT + UDF smoke), lint component_src
- Release: build-and-release-component job; create-release needs component artifacts
- Fix: fprintf format string in myvector_binlog_service.cc

Co-authored-by: Cursor <cursoragent@cursor.com>
- Update Create Release step to use action-gh-release@v2
- No input or env changes required; v2 accepts same files and generate_release_notes

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a66f5fe832

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

src/myvectorutils.cc
)

add_library(myvector_component SHARED ${MYVECTOR_COMPONENT_SRCS})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restore a myvector plugin build target

This commit replaces the old plugin-oriented MYSQL_ADD_PLUGIN(myvector, ...) flow with only add_library(myvector_component ...), but the CI plugin job still copies this CMakeLists.txt into mysql-server/plugin/myvector and runs make myvector (.github/workflows/ci.yml, build job). Without a myvector target, that path cannot produce the plugin artifact, so plugin CI and downstream plugin packaging regress.

Useful? React with 👍 / 👎.

return 0.0;
}

double distance = HammingDistanceFn(v1_raw, v2_raw, (void*)&dim1_bytes); // Pass dimension in bits, function expects bytes

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Pass bit length to HammingDistanceFn

HammingDistanceFn consumes a bit-count quantity (it computes loop count as qty / (sizeof(unsigned long) * 8) in src/myvector.cc), but this code passes dim1_bytes, which is already divided by 8. That makes the UDF process only one-eighth of each binary vector and return systematically incorrect hamming distances for valid inputs.

Useful? React with 👍 / 👎.

v2 = (FP32*)(v2_raw);
#endif

double distance = computeL2Distance(v1, v2, dim1);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Honor the metric argument in myvector_distance

The component registers myvector_distance with up to 3 arguments, but the implementation always computes L2 and ignores args[2]. Calls like myvector_distance(v1, v2, 'cosine') or 'ip' now silently return L2 values, which corrupts query semantics for users relying on non-L2 metrics.

Useful? React with 👍 / 👎.

askdba and others added 17 commits February 12, 2026 22:40
…ce metric, fix Hamming bit length

Co-authored-by: Cursor <cursoragent@cursor.com>
… for UDF

Co-authored-by: Cursor <cursoragent@cursor.com>
…rsion.h, my_config.h)

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
…rintf, script cleanup

- Install mysql_version.h stub into MySQL cache include/mysql/ so server
  headers (plugin.h) find it when building from tarball
- Add noexcept to service base and derived destructors to fix exception
  specification override errors
- Fix snprintf format: use ER_MYVECTOR_INDEX_NOT_FOUND with (%s) and col
  in UDF and plugin (removes -Wformat-extra-args)
- Remove debug logging from build-component.sh
- Include mysql_version.h, my_config.h, component_config.cc and other
  component migration changes

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
…warnings

- Clarify build-component.sh version examples and demo path notes in CONTRIBUTING.md
- Fix format specifier (%zu) and string literal warnings in myvector.cc
- Component, binlog, RFC, and build script updates; remove generated Makefile

Co-authored-by: Cursor <cursoragent@cursor.com>
…own validators

- Add chown step after docker build so boost_cache is writable by cache action
- Re-enable VALIDATE_GITHUB_ACTIONS and VALIDATE_MARKDOWN in linter
- Add .markdownlint.yaml and .github/actionlint.yaml for targeted exceptions

Co-authored-by: Cursor <cursoragent@cursor.com>
…o defaults apply

Co-authored-by: Cursor <cursoragent@cursor.com>
- actionlint: ignore shellcheck SC2015 (A && B || C pattern)
- markdownlint: add ---, disable MD030 for list marker spaces
- disable SHELL_SHFMT (pre-existing script style)

Co-authored-by: Cursor <cursoragent@cursor.com>
- Rewrite A && B || C to proper if/fi in ci.yml (fixes shellcheck SC2015)
- Add MD040 false for fenced code without language in .markdownlint.yaml

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
- Trigger on workflow_run of MyVector CI or Release (remove pull_request)
- Skip job when triggering workflow failed
- Push images only when triggered by Release workflow

Co-authored-by: Cursor <cursoragent@cursor.com>
askdba and others added 2 commits February 25, 2026 12:37
…de and output

- Run verification query into variable instead of pipeline
- Explicitly check mysql_exit before testing output
- Prevents false negatives when $MYSQL_CMD fails

Co-authored-by: Cursor <cursoragent@cursor.com>
- Add libbinlogevents/include for MySQL 8.0 binlog headers
- Use mysql::binlog::event namespace alias for 8.4+ (was binary_log in 8.0)
- Include mysql/mysql.h and MYSQL_ERRMSG_SIZE fallback in UDF service
- Generate version-specific mysql_version.h from mysql-tag (8.0.40->80040, etc.)
  so correct code paths are selected per MySQL version

Co-authored-by: Cursor <cursoragent@cursor.com>
askdba and others added 2 commits February 25, 2026 13:42
- Add <vector> to myvector.h (std::vector used in AbstractVectorIndex)
- Add <mysql/plugin.h> before service_my_plugin_log in UDF service (MYSQL_PLUGIN)
- Add <arpa/inet.h> for ntohs/htonl when using MySQL binlog headers on Unix

Co-authored-by: Cursor <cursoragent@cursor.com>
Set myvector_feature_level=3 (bit 0 set) so start_binlog_monitoring
returns early. The binlog thread requires config (host, user, socket)
which is not available in CI. Basic UDF tests don't need binlog.

Co-authored-by: Cursor <cursoragent@cursor.com>
…L 8.0)

Co-authored-by: Cursor <cursoragent@cursor.com>
Copilot AI added a commit that referenced this pull request Feb 25, 2026
- Move BITS_PER_BYTE and SharedLockGuard to include/myvector.h for
  broader access and cleaner architecture
- Add #include <vector> to myvector.h (missing include)
- Improve SharedLockGuard: rename clear() to release(), add deleted
  copy/move, make constructor explicit
- Fix: VectorIndexCollection::open() now acquires shared lock before
  returning (matching get() behavior), preventing UB when SharedLockGuard
  releases a lock never acquired on the open() path
- Fix: myvector_ann_set_init moves lock guard creation after null check
  to avoid constructing guard with null index
- Fix: improve ER_MYVECTOR_INDEX_NOT_FOUND message to include index name
- Fix: myvector_display returns string literal instead of writing to UDF
  result buffer (fixes invalid memory access)
- Fix: format string %d -> %zu for size_t colinfo.length()

Co-authored-by: askdba <5901296+askdba@users.noreply.github.com>
Copilot AI added a commit that referenced this pull request Feb 25, 2026
…atibility

The PR #76 introduces a macro to abstract UDF metadata service access,
preparing for component builds where the service is accessed differently.
This aligns with the component migration plan in docs/COMPONENT_MIGRATION_PLAN.md.

Co-authored-by: askdba <5901296+askdba@users.noreply.github.com>
askdba and others added 2 commits February 25, 2026 17:21
- SharedLockGuard: make release-only (no acquire in constructor) to avoid
  double-lock when used with get() which already acquires before returning
- VectorIndexCollection::open(): acquire lockShared() before returning to
  match get() behavior; SharedLockGuard expects caller to hold lock
- Keeps component build intact (PR #78 targeted main without component)

Co-authored-by: Cursor <cursoragent@cursor.com>
…orkflow_run gating

- build-component.sh: write mysql_version.h to project include/ so component build
  uses version-specific stub (project include/ is first in CMake path)
- docker-publish.yml: use workflow_run.name instead of workflow_run.path for
  gating Docker push (path may be undocumented)

Made-with: Cursor
myvector.cc
myvectorutils.cc
)
MYSQL_ADD_PLUGIN(myvector ${MYVECTOR_PLUGIN_SRCS})
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing MODULE_ONLY breaks in-tree plugin shared library build

High Severity

The MYSQL_ADD_PLUGIN(myvector ${MYVECTOR_PLUGIN_SRCS}) call is missing MODULE_ONLY, MODULE_OUTPUT_NAME "myvector", and LINK_LIBRARIES ${LIBMYSQLCLIENT} that were present in the old CMakeLists.txt. Without MODULE_ONLY, MySQL's CMake defaults to building the plugin as a static library rather than a shared .so module. This means myvector.so is never produced, breaking the CI build job, test job, and Docker publish workflow which all expect plugin_output_directory/myvector.so to exist.

Additional Locations (1)

Fix in Cursor Fix in Web

@askdba
Copy link
Owner Author

askdba commented Mar 1, 2026

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b2bac5500d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +510 to +512
if (!vec || vecsz == 0) {
continue; /* skip row: no vector data */
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Exit row parse loop when vector payload is missing

When a row has no usable vector payload, this branch continues before the loop’s only index >= event_len termination check, so the next iteration can read event_buf[index] past the event buffer. In practice, inserts with NULL/empty vector values can make the parser spin on invalid memory or crash instead of safely skipping the row.

Useful? React with 👍 / 👎.

Comment on lines +1230 to +1233
// error_print("Binlog fetch failed: %s", err.c_str());
// TODO: Replace with component-specific logging
break;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Signal worker shutdown before breaking binlog fetch loop

On non-timeout mysql_binlog_fetch errors, this break exits the fetch loop without notifying the queue/worker threads to stop. The function then joins worker_threads_, but those threads are blocked in gqueue_.dequeue() waiting for items, so the binlog thread can block indefinitely and incremental index updates stop until an external stop path forces shutdown.

Useful? React with 👍 / 👎.

--suppress=missingIncludeSystem \
--error-exitcode=0 \
-I include -I src/component_src \
src/component_src/
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect YAML indentation breaks cppcheck run block

High Severity

The second cppcheck invocation is indented at 8 spaces, matching the run: key level, while the first cppcheck inside the run: | block is at 10 spaces. In a YAML literal block scalar, content indentation is set by the first line (10 spaces). Lines at 8 spaces fall outside the block, causing a YAML parse error that will break the entire lint job.

Fix in Cursor Fix in Web

askdba added 2 commits March 7, 2026 19:34
- CI: add DROP for all UDFs in plugin reinstall test
- CI: allow build-component-9-6 to fail (dnf flaky)
- Charset metadata for component UDFs (ann_set, display)
- NULL check for myvector_construct first arg
- Stub comments in mysql_version.h, my_config.h
- Failure modes docs for binlog state
- MYSQL handle leak fix on connect retry
- request_shutdown before worker join on binlog error
- escape_identifier buffer overflow fix
- Release: use build-component-9.6-docker.sh for 9.6
- Update PR76_STATUS with completion log

Made-with: Cursor
@askdba askdba force-pushed the feature/component-migration-8.4-9.6 branch from 43a5e42 to d2d5f33 Compare March 7, 2026 16:37
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

-DWITH_NDBCLUSTER=OFF \
-DWITH_EXAMPLE_STORAGE_ENGINE=OFF \
-DCMAKE_BUILD_TYPE=Release \
>/dev/null 2>&1
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing -DWITH_BOOST causes MySQL 9.6 cmake to fail

High Severity

The cmake invocation inside the docker build script specifies -DDOWNLOAD_BOOST=1 but omits the required -DWITH_BOOST=<path> flag. MySQL's build system requires both flags together — without -DWITH_BOOST, cmake will fail with an error asking for the download destination. The error output is suppressed by >/dev/null 2>&1, making diagnosis difficult. Every other cmake invocation in the codebase (CI build-component job, sandbox-component-test.sh) correctly includes both flags. The release workflow also calls this script for 9.6.0 builds, meaning 9.6 component releases will fail.

Additional Locations (1)

Fix in Cursor Fix in Web

delete item;
}
}
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worker threads busy-spin when binlog loop exits on error

Medium Severity

When the binlog fetch loop exits on a non-timeout error, gqueue_.request_shutdown() is called but shutdown_binlog_thread_ is never set to true. Worker threads loop on !shutdown_binlog_thread_.load(), and since the queue's shutting_down_ flag makes dequeue() return nullptr immediately without blocking, workers enter a tight CPU-burning busy-spin. The subsequent join() calls in binlog_loop_fn will block until stop_binlog_monitoring() is eventually called from component deinit, which finally sets shutdown_binlog_thread_.

Additional Locations (1)

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant