Google Summer of Code 2025
This year we are again participating in the Google Summer of Code. Together with the MariaDB Foundation, we believe we are building a better database that remains application-compatible with MySQL. We also work on LGPL connectors (currently C, C++, ODBC, Java, Node.js) and on MariaDB Galera Cluster, which allows you to scale your reads and writes. We also have MariaDB ColumnStore, a columnar storage engine designed to process petabytes of data with real-time response to analytical queries.
Contents
- Where to Start
- List of Tasks
- MariaDB Server
- MDEV-35997 MariaDB support in ann-benchmarks and VectorDBBench
- MDEV-28395 LOAD DATA plugins
- MDEV-36100 Generate vector embeddings automatically on INSERT
- MDEV-36107 expressions in mysqltest
- MDEV-36108 variable substitutions in mysqltest
- MDEV-18827 Create utility to parse frm files and print their DDL
- Buildbot build statistics dashboard
- Suggest a Task
Where to Start
Please join us on Zulip to mingle with the community. You should also subscribe to the developers mailing list (this is the main list where we discuss development - there are also other mailing lists).
To improve your chances of being accepted, it is a good idea to submit a pull request with a bug fix to the server.
Also see the List of beginner friendly issues from the MariaDB Issue Tracker.
List of Tasks
MariaDB Server
MDEV-35997 MariaDB support in ann-benchmarks and VectorDBBench
Part-time project 175h
Add MariaDB support to ann-benchmarks and VectorDBBench.
Skills needed: Python, Docker
Mentors: Vicențiu Ciorbaru
MDEV-28395 LOAD DATA plugins
Full-time project 350h
LOAD DATA INFILE can flexibly load data into a table from CSV-like files accessible by the mariadbd process. LOAD XML INFILE can do the same for XML files. LOAD DATA LOCAL INFILE and LOAD XML LOCAL INFILE can do it with files accessible by the client, but not by the server. However, there are requests to support loading more file formats and loading from other locations, for example from S3.
This project is to implement support for LOAD plugins and refactor the current LOAD code accordingly. There are two kinds of plugins: data parser plugins (CSV-like and XML) and transfer plugins (file and LOCAL). Implementing new plugins is not in the scope of this task; it is mainly about moving existing code around to make new plugins (such as JSON or S3) possible.
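The separation between "where the bytes come from" and "how they become rows" could look roughly like the following. This is only a conceptual sketch, written in Python for brevity (the actual work is in the server's C++ code); every class and function name here is hypothetical and does not correspond to any existing MariaDB plugin API.

```python
import csv
import io
from abc import ABC, abstractmethod
from typing import Dict, Iterator, List


class TransferPlugin(ABC):
    """Hypothetical: knows where the data lives (server file, LOCAL, S3, ...)."""

    @abstractmethod
    def open(self, location: str) -> io.TextIOBase: ...


class ParserPlugin(ABC):
    """Hypothetical: knows how to turn a text stream into rows (CSV-like, XML, JSON, ...)."""

    @abstractmethod
    def rows(self, stream: io.TextIOBase) -> Iterator[List[str]]: ...


class InMemoryTransfer(TransferPlugin):
    """Stand-in for a 'file' transfer plugin; serves data from a dict so the sketch needs no I/O."""

    def __init__(self, files: Dict[str, str]):
        self.files = files

    def open(self, location: str) -> io.TextIOBase:
        return io.StringIO(self.files[location])


class CsvParser(ParserPlugin):
    """Stand-in for the CSV-like parser plugin."""

    def __init__(self, delimiter: str = ","):
        self.delimiter = delimiter

    def rows(self, stream: io.TextIOBase) -> Iterator[List[str]]:
        yield from csv.reader(stream, delimiter=self.delimiter)


def load_data(transfer: TransferPlugin, parser: ParserPlugin, location: str) -> List[List[str]]:
    """Generic LOAD driver: it depends only on the two interfaces, not on source or format."""
    with transfer.open(location) as stream:
        return list(parser.rows(stream))
```

With this shape, a new format or a new source would only mean adding one more class; the driver stays untouched, which is the point of the refactoring.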
Skills needed: C++, bison
Mentors: Sergei Golubchik
MDEV-36100 Generate vector embeddings automatically on INSERT
Full-time project 350h
Implement a syntax and a plugin API that the server will use to generate embeddings for documents that the user stores in the database. This should significantly simplify the vector pipeline. mariadbd will not generate embeddings internally; it will invoke a plugin to do that.
Skills needed: C++
Mentors: Sergei Golubchik
MDEV-36107 expressions in mysqltest
Extend the mysqltest language to support:
- standard arithmetic: +, -, *, /, %
- comparisons: ==, !=, <, <=, >, >=
- boolean: &&, ||, and maybe ? :
- if possible: string repetition with a Perl-style x (to replace SELECT REPEAT() in test files)
This should work in the commands let, if, and while.
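To make the intended semantics concrete, here is a small prototype evaluator for the arithmetic, comparison, and boolean operators listed above. It is a sketch in Python, not mysqltest code, and it rests on assumptions the task description does not state: C-like operator precedence, integer-only operands, comparisons yielding 1 or 0, and no short-circuit evaluation. The ternary operator and string repetition are left out. Note that Python's // and % round toward negative infinity, which differs from C for negative operands.

```python
import re

# Binary operators grouped by precedence, lowest first (C-like ordering is an assumption).
LEVELS = [
    {"||": lambda a, b: int(bool(a) or bool(b))},
    {"&&": lambda a, b: int(bool(a) and bool(b))},
    {"==": lambda a, b: int(a == b), "!=": lambda a, b: int(a != b)},
    {"<": lambda a, b: int(a < b), "<=": lambda a, b: int(a <= b),
     ">": lambda a, b: int(a > b), ">=": lambda a, b: int(a >= b)},
    {"+": lambda a, b: a + b, "-": lambda a, b: a - b},
    {"*": lambda a, b: a * b, "/": lambda a, b: a // b, "%": lambda a, b: a % b},
]

# Two-character operators are listed before the single-character class.
TOKEN = re.compile(r"\s*(\d+|==|!=|<=|>=|&&|\|\||[-+*/%<>()])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected input at {text[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def evaluate(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        return tok

    def primary():
        tok = take()
        if tok == "(":
            value = binary(0)
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        if tok == "-":  # unary minus
            return -primary()
        return int(tok)  # raises ValueError for anything that is not a number

    def binary(level):
        # One recursive-descent level per precedence group, left-associative.
        if level == len(LEVELS):
            return primary()
        left = binary(level + 1)
        while peek() in LEVELS[level]:
            op = LEVELS[level][take()]
            left = op(left, binary(level + 1))
        return left

    result = binary(0)
    if peek() is not None:
        raise ValueError(f"trailing input: {peek()!r}")
    return result
```

For example, evaluate("1 + 2 * 3") gives 7 and evaluate("10 % 4 == 2 && 3 < 5") gives 1. A real implementation would also have to decide how mysqltest variables and string values take part in expressions.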
Skills needed: C++
Mentors: Sergei Golubchik
MDEV-36108 variable substitutions in mysqltest
Extend the mysqltest language to support bash-like substitutions:
- ${var}
- ${parameter:offset:length}
- ${#parameter}
- ${parameter/pattern/string/flags}
- maybe ${parameter^}, ${parameter^^}, ${parameter,}, ${parameter,,}
- maybe ${parameter@function} with functions like u, U, Q, etc.
- recursive expansion: ${${var}}
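The core forms above can be prototyped in a few lines to pin down the expected behavior. The following is a Python sketch under stated assumptions, not a proposal for the actual C++ implementation: the pattern in ${parameter/pattern/string} is treated as a literal string and only the first occurrence is replaced, flags are ignored, offsets are assumed to be non-negative, and the case-modification and @function forms are omitted. Recursive expansion falls out of always expanding the innermost ${...} first; a side effect of this simplification is that a value which itself contains ${...} would be expanded again.

```python
import re

INNERMOST = re.compile(r"\$\{([^{}]*)\}")      # a ${...} with no braces inside
BODY = re.compile(r"(#?)(\w+)(.*)", re.S)       # optional '#', name, rest


def expand_one(body, variables):
    m = BODY.fullmatch(body)
    if not m:
        raise ValueError(f"bad substitution: {body!r}")
    length_of, name, rest = m.groups()
    value = variables[name]
    if length_of:                               # ${#parameter}
        if rest:
            raise ValueError(f"bad substitution: {body!r}")
        return str(len(value))
    if rest == "":                              # ${var}
        return value
    if rest[0] == ":":                          # ${parameter:offset:length}
        offset, _, length = rest[1:].partition(":")
        start = int(offset)
        return value[start:start + int(length)] if length else value[start:]
    if rest[0] == "/":                          # ${parameter/pattern/string}
        pattern, _, replacement = rest[1:].partition("/")
        return value.replace(pattern, replacement, 1)
    raise ValueError(f"bad substitution: {body!r}")


def expand(text, variables, max_rounds=100):
    # Expand the leftmost innermost ${...} until none is left.
    for _ in range(max_rounds):
        m = INNERMOST.search(text)
        if not m:
            return text
        text = text[:m.start()] + expand_one(m.group(1), variables) + text[m.end():]
    raise ValueError("too many expansion rounds")
```

With variables = {"var": "name", "name": "MariaDB Server"}, expand("${${var}}", variables) first turns the inner ${var} into name and then expands ${name}, giving "MariaDB Server".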
Skills needed: C++
Mentors: Sergei Golubchik
MDEV-18827 Create utility to parse frm files and print their DDL
Full-time project - potential part-time (175 - 350h, depending on scope)
FRM files are what MariaDB uses to store metadata about tables. These files can be used to generate DDL statements (CREATE TABLE ...). We are lacking a utility to parse these files, which could make DBAs' lives easier. The task of this project is to implement this utility, making use of MariaDB's FRM parsing logic. You may have to carry out some refactoring to extract the parsing code into a reusable library that is used both by MariaDB Server and by the FRM parsing tool.
Skills needed: C/C++, understanding libraries and APIs.
Mentors: Vicențiu Ciorbaru / Sergei Golubchik
Buildbot build statistics dashboard
Part-time project 175h
TODO: a more detailed description will be added.
Skills needed:
Mentors: Vlad Radu
Suggest a Task
Do you have an idea of your own that is not listed above? Let us know in the comments below (click 'Login' at the top of the page first)!