[Misc] Add Mariadb compose example #79

Open
grooverdan wants to merge 2 commits into bytedance:main from grooverdan:mariadb_compose_example

@grooverdan

Pull Request Summary

Add a MariaDB compose file to show how VIDEX and MariaDB components are related and the steps required.

Detailed Description

With VIDEX now available as a MariaDB plugin, I was looking at how to make use of it. I thought a compose example that includes a container running just the VIDEX server would show the steps that need to occur.

  • What problem does this PR solve?
  • Current documentation focuses heavily on running MariaDB or MySQL in the same container as VIDEX.
  • This shows the separation of components.
  • How does your solution work?

With a single service per function, a user can run the same EXPLAIN in the plain mariadb container and in the VIDEX-plugin-enabled mariadb container, then compare the two results.

$ docker-compose -f build/mariadb-compose.yml up
$ podman exec videx-mariadb-1 mariadb -u root -prootpwd tpch_tiny -e "EXPLAIN
FORMAT = JSON
SELECT s_name, count(*) AS numwait
FROM supplier,
     lineitem l1,
     orders,
     nation
WHERE s_suppkey = l1.l_suppkey
  AND o_orderkey = l1.l_orderkey
  AND o_orderstatus = 'F'
  AND l1.l_receiptdate > l1.l_commitdate
  AND EXISTS (SELECT *
              FROM lineitem l2
              WHERE l2.l_orderkey = l1.l_orderkey
                AND l2.l_suppkey <> l1.l_suppkey)
  AND NOT EXISTS (SELECT *
                  FROM lineitem l3
                  WHERE l3.l_orderkey = l1.l_orderkey
                    AND l3.l_suppkey <> l1.l_suppkey
                    AND l3.l_receiptdate > l3.l_commitdate)
  AND s_nationkey = n_nationkey
  AND n_name = 'IRAQ'
GROUP BY s_name
ORDER BY numwait DESC, s_name\G" > /tmp/t.json

$ podman exec videx-mariadb_videx-1 mariadb -u root -prootpwd videx_tpch_tiny -e "EXPLAIN
FORMAT = JSON
SELECT s_name, count(*) AS numwait
FROM supplier,
     lineitem l1,
     orders,
     nation
WHERE s_suppkey = l1.l_suppkey
  AND o_orderkey = l1.l_orderkey
  AND o_orderstatus = 'F'
  AND l1.l_receiptdate > l1.l_commitdate
  AND EXISTS (SELECT *
              FROM lineitem l2
              WHERE l2.l_orderkey = l1.l_orderkey
                AND l2.l_suppkey <> l1.l_suppkey)
  AND NOT EXISTS (SELECT *
                  FROM lineitem l3
                  WHERE l3.l_orderkey = l1.l_orderkey
                    AND l3.l_suppkey <> l1.l_suppkey
                    AND l3.l_receiptdate > l3.l_commitdate)
  AND s_nationkey = n_nationkey
  AND n_name = 'IRAQ'
GROUP BY s_name
ORDER BY numwait DESC, s_name\G" > /tmp/u.json
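
The two saved outputs can then be compared. The following is a minimal sketch, not part of this PR: `compare_plans` is a hypothetical helper name, and it assumes the two EXPLAIN outputs are meant to be textually identical. It strips carriage returns first, since an exec with a TTY attached can produce CRLF line endings.

```shell
# compare_plans FILE_A FILE_B   (hypothetical helper, not part of the PR)
# Strips carriage returns from both files, prints a unified diff,
# and returns 0 only when the two plans are identical.
compare_plans() {
    plan_a=$(mktemp)
    plan_b=$(mktemp)
    tr -d '\r' < "$1" > "$plan_a"
    tr -d '\r' < "$2" > "$plan_b"
    if diff -u "$plan_a" "$plan_b"; then
        status=0
        echo "plans match"
    else
        status=1
        echo "plans differ"
    fi
    rm -f "$plan_a" "$plan_b"
    return "$status"
}

# Usage with the files written above:
#   compare_plans /tmp/t.json /tmp/u.json
```

A plain textual diff is the simplest check; if the outputs differ only in fields that are expected to vary, a JSON-aware comparison would be a better fit.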
  • Any trade-offs or alternative approaches considered?

I attempted to use tpch_tiny.sql.tar.gz directly; however, the MariaDB container treats a tar.gz file as an indicator that it is a MariaDB backup and attempts to restore it.

I think that build/Dockerfile.videxserver should be a separately published container rather than a build in compose, but feedback welcome on this.

I could probably factor out the compose file's environment variables rather than keeping multiple copies of the same values (passwords, database names, ...) across services.
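
One way this factoring might look is sketched below, under stated assumptions: `write_env_file` and the `VIDEX_DATABASE` variable are hypothetical names, the `build/.env` path is illustrative, and the compose file would have to reference these variables (for example `${MARIADB_ROOT_PASSWORD}`) instead of repeating literals. The values are the ones used in the commands above.

```shell
# write_env_file PATH   (hypothetical helper, not part of the PR)
# Writes the shared settings once so each service can reference them.
write_env_file() {
    cat > "$1" <<'EOF'
MARIADB_ROOT_PASSWORD=rootpwd
MARIADB_DATABASE=tpch_tiny
VIDEX_DATABASE=videx_tpch_tiny
EOF
}

# Usage (assumes the compose file uses ${...} substitution for these names):
#   write_env_file build/.env
#   docker-compose --env-file build/.env -f build/mariadb-compose.yml up
```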

Important: Before submitting, please complete the description above and review the checklist below.


Contribution Guidelines (Expand for Details)

We appreciate your contribution to VIDEX! To ensure a smooth review process and maintain high code quality, please adhere to the following guidelines:

Pull Request Title Format

Your PR title should start with one of these prefixes to indicate the nature of the change:

  • [Core]: Changes to core engine functionality
  • [Opt]: Changes to VIDEX-Optimizer-Plugin
  • [Stats]: Changes to VIDEX-Statistic-Server
  • [Algo]: Implementation of new algorithms for NDV, cardinality estimation, etc.
  • [Pipe]: Enhancements to the pipeline (e.g., data collection, environment setup)
  • [Bug]: Corrections to existing functionality
  • [CI]: Changes to build process or CI pipeline
  • [Docs]: Updates or additions to documentation
  • [Test]: Adding or updating tests
  • [Perf]: Performance improvements
  • [Misc]: For changes not covered above (use sparingly)

Note: For changes spanning multiple categories, use the most specific prefix or multiple prefixes in order of importance (e.g., [Algo][Stats]).

Submission Checklist

  • PR title includes appropriate prefix(es)
  • Changes are clearly explained in the PR description
  • [N/A] New and existing tests pass successfully
  • [?] Code adheres to project style and best practices
  • Documentation updated to reflect changes (if applicable)
  • [N/A] Changes have been tested on both Plugin-Mode and Standalone-Mode (if applicable)
  • [N/A] Statistical accuracy has been verified (for algorithm or optimizer changes)
  • [N/A] No regression in query plan accuracy compared to InnoDB (if applicable)
  • [N/A] Performance benchmarks conducted (for performance-sensitive changes)

By submitting this PR, you confirm that you've read these guidelines and your changes align with the project's contribution standards.

@kr11
Member

kr11 commented Dec 24, 2025

Huge thanks to @grooverdan! This provides a great reference. Haibo and I will review it ASAP.

@YoungHypo
Contributor

Great work! Thanks @grooverdan. The one-command setup makes it perfect for quick testing and demos.
As you said before, database names, credentials, and data paths are hardcoded. Using .env files would make this more maintainable and reusable.
Apart from that, LGTM. This is a solid foundation for the MariaDB integration. Thanks again for putting this together!

@CLAassistant

CLAassistant commented Dec 29, 2025

CLA assistant check
All committers have signed the CLA.

@kr11
Member

kr11 commented Jan 11, 2026

@grooverdan Hi Daniel, thanks for your great PR. Your docker-compose stack is a really nice end-to-end example (bringing up all components and validating tpch-tiny).

Sorry for the delayed reply — while reproducing this PR on Debian + Docker, we ran into (and spent some time chasing down) a mismatch between mariadb and mariadb-videx. After digging in, we confirmed the issue is not caused by this PR.

The root cause is a stats-refresh issue during metadata collection: if we run videx_build_env immediately after loading data into a fresh MariaDB instance, information_schema.TABLES.TABLE_ROWS may temporarily be 0 (observed on supplier), which leads to inconsistent stats and can produce different EXPLAIN results between mariadb and mariadb-videx (details in PR #81).

We’ve opened PR #81 to address this by running ANALYZE TABLE on all tables in the target schema before collecting stats. If you have any suggestions, we’d love to hear them — otherwise we’re happy to proceed and expect to merge both PRs soon.
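
As a rough illustration of the idea only (this is not the code from PR #81), the refresh step could be sketched as below. `analyze_statements` is a hypothetical helper name; the credentials and schema name in the usage comment are the ones from the example commands earlier in this thread.

```shell
# analyze_statements SCHEMA TABLE...   (hypothetical helper)
# Prints one ANALYZE TABLE statement per table name given.
analyze_statements() {
    schema=$1
    shift
    for table in "$@"; do
        printf 'ANALYZE TABLE `%s`.`%s`;\n' "$schema" "$table"
    done
}

# Usage: list the base tables of the target schema, then run ANALYZE TABLE
# on each before collecting stats (assumes table names without whitespace):
#   tables=$(mariadb -u root -prootpwd -N -B -e \
#     "SELECT table_name FROM information_schema.TABLES
#      WHERE table_schema = 'tpch_tiny' AND table_type = 'BASE TABLE'")
#   analyze_statements tpch_tiny $tables | mariadb -u root -prootpwd
```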

@kr11
Member

kr11 commented Jan 11, 2026

I think that build/Dockerfile.videxserver should be a separately published container rather than a build in compose, but feedback welcome on this.

@grooverdan

Agree — I also think build/Dockerfile.videxserver would be better as a separately published image rather than built inside compose.

In the spirit of https://jira.mariadb.org/browse/MDEV-38409 (make VIDEX easy to install/start), as the next step, we plan to provide a dedicated videx-server image and bundle the videx_build_env workflow with it.

videx_build_env (possibly better renamed to videx-sync) would let users specify the target dbname; the scripts would then collect metadata and import it into the long-lived VIDEX server.

@grooverdan
Author

Great. I'll leave this on hold until there's a published container image and then rework this as an example to use the published container image.

This removes the tar layer for a single file.
This shows how a MariaDB instance is created, and videx
uses the server implementation which is populated from
the instance data. A videx plugin is used in a second
instance that forms the virtual plans.
@grooverdan force-pushed the mariadb_compose_example branch from 428d3ca to 5650b81 on February 9, 2026 03:51
@grooverdan
Author

updated with env file and test container
