OneFreeVoice

November 18, 2008

Interesting SQL Challenge

Filed under: Code, MySQL — Gregory Haase @ 3:45 pm

A former colleague recently posed a problem to me: “This guy has a table full of table names, and he wants to run a query that unions those tables.” I spent a couple of minutes telling him why it was a bad idea. It’s not a very robust solution, and an incorrect value in your table of tables could cause the query to fail, as could a missing, renamed, or altered column on one of the target tables. In the end, my former colleague was able to argue for an architecture change. However, my curiosity had been piqued.

I ask: “Can it be done in a stored proc?”

Answer: “No.”

Hmmm… now I feel a challenge!

The Setup
First, I created a simple test environment – three simple identical tables and a table of tables.

USE test;

CREATE TABLE `a`
(`a_id` INT AUTO_INCREMENT,
 `some_data`  VARCHAR(20),
 PRIMARY KEY (`a_id`)
);
INSERT INTO `a` (`some_data`) VALUES ('z'),('y');

CREATE TABLE `b`
(`b_id` INT AUTO_INCREMENT,
 `some_data`  VARCHAR(20),
 PRIMARY KEY (`b_id`)
);
INSERT INTO `b` (`some_data`) VALUES ('x'),('w');

CREATE TABLE `c`
(`c_id` INT AUTO_INCREMENT,
 `some_data`  VARCHAR(20),
 PRIMARY KEY (`c_id`)
);
INSERT INTO `c` (`some_data`) VALUES ('v'),('u');

CREATE TABLE `union_tables`
(`union_tables_id` INT AUTO_INCREMENT,
 `table_name` VARCHAR(64) NOT NULL,
 PRIMARY KEY (`union_tables_id`)
);

INSERT INTO `union_tables` (`table_name`)
       VALUES ('a'),('b'),('c');

The Execution
Once I thought about it, the solution was actually really easy. I new that I was going to be using CONCAT, and suspected that GROUP_CONCAT would also come into play. The hinge pin would be making “UNION” the concatenation separator. I already new about CONCAT_WS(), but I didn’t know about the SEPARATOR keyword for GROUP_CONCAT (see: MySQL Reference Manual). With that, everything else fell into place:

SELECT GROUP_CONCAT( CONCAT( 'SELECT `some_data` FROM `', table_name,'`' ) SEPARATOR ' UNION ' )
  INTO @sqlstring
  FROM `union_tables`;

PREPARE sqlquery FROM @sqlstring;
EXECUTE sqlquery;
DEALLOCATE PREPARE sqlquery;

Now this example runs inside of the MySQL command line client. But you should be able to run the initial SELECT statement inside of any programming language and create a prepared statement from the results.

About these ads

3 Comments »

  1. Every time you try to mix data and metadata in SQL, you run into this kind of trouble. The solutions are fragile and potential security vulnerabilities. This is not how SQL is intended to be used.

    If you want a technology that can query data and metadata orthogonally, use RDF.

    Comment by Bill Karwin — November 18, 2008 @ 5:29 pm

  2. Not a bad hack, though it is a hack, as Bill says.

    The biggest issue I see is that you have to update the union_tables data…..I would use a VIEW for that, accessing table names from the INFORMATION_SCHEMA database.

    Alternatively, you can also use a MyISAM table for the underlying tables, and a MERGE table for the wrapper table (instead of UNIONs). There’s still the problem of keeping the MERGE table up to date, but the UNION disappears and now you can also use a fulltext index.

    Comment by Sheeri K. Cabral — November 19, 2008 @ 6:40 am

  3. I see a couple of issues using the MERGE approach.

    First of all, it requires a CREATE statement. Presumably, they would want to add and remove table records on the fly – otherwise there would be no need for a dynamic statement to begin with. I’d be hesitant to have the application DROP and CREATE a table every time a query needs to be run. If you limit the changes to only occurring during a major update – it’s just as easy to skip the whole exercise and just include the UNION in the application itself.

    Second, there is no guarantee that the tables have exactly the same column names and indexes.

    Regarding the INFORMATION_SCHEMA, you would need to some how identify the tables that you need in the UNION. You might be able to limit it to tables that include some kind of secret/specific column name, but then that’s really a hack! You risk someone coming in and adding that column name to an unrelated table and breaking the query.

    Regarding Bill’s comments… after going out and looking up RDF and orthogonally… I come to the conclusion that I’m a.) not storing XML in my database, and b.) not recommending that a developer with only a limited grasp of SQL go out and learn a new meta-language and then go out and learn SPARQL.

    Comment by Gregory Haase — November 19, 2008 @ 10:13 am


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Silver is the New Black Theme. Create a free website or blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: