Friday, June 26, 2015

Oracle dump utility

I have created a small program to extract data from an Oracle database in a format suitable for importing into MariaDB (and MySQL I guess). It is pretty fast as it is written in C and uses the OCI interface. Also, it supports most of the Oracle basic types, including BLOB, CLOB and LONG. UTF8 is also supported and there are many generic features, as well as some features specific for later import into MariaDB.

Download the 1.0 version from Sourceforge where the programs source, that is using autotools for building, as well as documentation is available. I have not tested to build on any other version of Oracle than 11, but maybe someone could help me there.

/Karlsson

MariaDB with Galera available on the IBM Power8 platform

It was a very long time since I wrote something in this blob, but I have been very busy this spring with MariaDB on Power mostly. This has been a lot of work, but also a lot of fun. So, what is this MariaDB on Power thing all about, well I wrote an introduction to the Power platform late last year. Since then a lot of things has happened though.

One thing is that several service providers out there has adopted Power8 as a platform. To be honest, this really isn't sexy, but it is useful and as a user of one of these services, you will just see the same old Linux you are used to, but potentially it is more powerful and reliable. One such provider is OVH, whose service is more known as RunAbove. If you want to try it, you can do so for free for 7 7 days, just go there and off you go.

Another important thing is that MariaDB is now available on Power8 running, RedHat, SUSE or Ubuntu Linux. To get access to this, pop by MariaDB and if you are not yet signed up, then do this now and then go to "My Portal", further to "Downloads" and then select "MariaDB Enterprise and MariaDB Enterprise Cluster". You are now ready to install using the operating system of your choise, but on Power you are, as I said before,limited to SUSE, RedHat and Ubuntu, and if you want to test MariaDB Enterprise Cluster, i.e. MariaDB with Galera, you have to go with Ubuntu.

Installing MariaDB Enterprise Cluster on Power8 is no more complex than on Intel. There are a few thing to adjust before you can get started with this, after having installed the software. The first node has, as usual, to be configured with wsrep_cluster_adress set to gcomm:// to ensure that this first node will bootstrap without having to connect to a cluster. Once the cluster is up and running though, this variable is set to the cluster addresses. In my case, this what the Galera setting look like in /etc/mysql/my.cnf which is the location of this file on Ubuntu.
# Galera
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="GaleraPower1"
wsrep_cluster_address=gcomm://92.127.22.124
wsrep_node_address=92.127.22.121
wsrep_node_name=galera3
binlog_format=ROW

Note in particular the binlog_format setting. This MUST be set to ROW for Galera to work. But fact is that these setting are not particular to MariaDB on Power, this is the same even on Intel.

Tf this isn't enough to convice you about the advantages of running MariaDB on IBM Power, then see what Foedus in Italy has to say about this combination in this Video:


There is more to say about running MariaDB on Power and there is more to come here, I'll look at some performance data, we'll have a look at MaxScale on Power (this is not official yet, but that isn't stopping me) as well as a blog on how to run a Power8 emulation on Intel which I have promissed before.

So, don't touch that dial!

/Karlsson

Tuesday, December 16, 2014

If you can't do it with Power, you can do it with MORE Power

IBM has released their Power8 CPU and with that a bunch of servers and also some other news around this, like the OpenPower consortium that has been around for a year or so now. Besides IBM, Tyan and Google have announced that they are building systems around the Power8 CPU, with the a Tyan server being on sale for some $2800, but this is not a server with a high-end configuration, but it is in a 2U rack mount case and is extensible with more disks and RAM.

There is one thing with the Power8 CPU that sets it apart from the previous Power CPUs that is worth mentioning, which is the byte ordering or endianess of the CPU. The order of the bytes in value in a computer comes in two main flavours, the "Motorola style" AKA "big endian" which is how we write numbers in general, with the most significant digit (or byte in the case of computers) first, and then we have the "Intel style" or "Little endian" where the most significant byte comes last. All of this has little meaning to most of you so far. Thar Big endian is old school mainframe style, but is also used by many RISC CPUs. For those of you with a flair for weird facts one oddball of endianess was PDP-11, where each byte in a 16-bit part in a 32-bit word was swapped (this is called PDP-endian). And by the way, Oddball of Endianess would be a great name for a Country-Rock band (having at least 3 Telecaster equipped guitarists on stage).

What makes Power8 a bit different is that it supports operating systems using either big endian or little endian. Big endian is used by AIX and some Linuxes, but IBM is trying to get more Linuxes onto the little endian train. Among the Big endian Linuxes on Power8 are Red Hat and derivatives, like Fedora (note that CentOS isn't one of them, so far, as Centos 7 doesn't yet support Power at all, nor is 32-bit Intel supported anymore) as well as Debian and SUSE 11 (which will continue to be supported). The little endian Linuxes on Power8 so far includes SUSE 12 and Ubuntu 14.04.  But IBM really wants Linux on Power to run Little endian, so it seems that Debian is on it's way there and Red Hat 7.1 will also support Little endian. But to be honest, we don't have the full story yet.

There are some really good things with the Power8 architecture, like really good performance (see this one for example by MariaDB and IBM) and also it is a great platform for Virtualization / Cloud infrastructure. And all that fuzz with endianess will hopefully be history soon when we have all Linuxes on Power8, as well as x86_64 of course, running Little endian (thank you). And the Country-Rock band mentioned above could call their first record "Fuzz with Endianess".

So, I have convinced you? If so, I guess you want to know what it costs. Well, Power8 machines from  IBM starts at around $10.000, which is a lot it seems, but then you have to remember that used the way this type of system is probably mostly used, as a host for numerous virtual machines, then it is actually pretty reasonable. If you really want a less expensive machine, Tyan has one available which should sell for some $3.000 which is looking more reasonable. Eventually I guess Tyan will sell their motherboards, and the components in this, beside the CPU, is pretty standard stuff (DDR RAM, PCI buses, SATA disks etc). But the CPU is pretty expensive. Anyway, there is a way around all this, if you really want to try running Power8, and that is to use emulation, and once you get the hang of it, this works well, although slowly, but no so slow that you cannot test things on it (but too slow for production use I guess).

I'll show you how to set up an emulated Power8 system in a later blog post, so stay tuned and Don't touch that dial!

/Karlsson

Thursday, October 2, 2014

OraMySQL 1.0 Alpha released - Replication from Oracle to MariaDB and MySQL!!

Now it's time to release something useful! At least I hope so. I have been going through how I came up with this idea and how I came up with the implementation in a series of blog posts:
 But now it's time for the real deal, the software itself. This is an Alpha 1.0 release but it should work OK in the more basic setups. It's available for download from sourceforge, and here you find the source package which uses GNU autotools to build. The manual is part of this download, but is also available as a separately.

You do need MariaDB libraries and includefiles to build it, but no Oracle specific libraries are needed, only Oracle itself of course. There are build instructions in the documentation.

Cheers and happy replication folks

/Karlsson

Wednesday, October 1, 2014

Replication from Oracle to MariaDB the simple way - Part 4

Now it's time to get serious about replicating to MariaDB from Oracle, and we are real close now, right? What I needed was a means of keeping track of what happens in a transaction, such as a LOG table of some kind, and then an idea of applying this log to MariaDB when there is a COMMIT in Oracle. And thing is, these two don't have to be related. So I can have a table which I write to and also have a Materialized View that is refreshed on COMMIT on, and I need a log table or something. And when the Materialized View is refreshed, as there is a COMMIT, then the log can be applied. From a schematic point-of-view, it looks something like this:
This looks more complex than it is, actually, all that is needed is some smart PL/SQL and this will work. I have not done much of any kind of testing, except checking that the basics work, but the PL/SQL needed I have done for you, and the order table triggers and what have you not is also created for you by a shell script that can do this for any table.

As for the DUMMY table that I have to use to get a trigger on COMMIT, this doesn't have to have that many rows, I actually just INSERT into it once per transaction, and then I INSERT the transaction id, which I get from Oracle. This table will have some junk in it after a while, all the transactions that were started and COMMITted will have an entry here. But in my code for this, I have included a simple job that purges this table from inactive transactions.

Best of all is that this works even with Oracle Express, so no need to pay for "Advanced Replication", not that I consider it really advanced or anything. I'd really like to know what you think about these ideas? Would it work? I know it's not perfect, far from it, for for the intent of having a MariaDB table reasonable well syncronized with an Oracle, this should work. Or? The solution is on one hand simple and lightweight, but I have given up on the number of features and possibly also the design affects performance a bit.But it should be good enough for many uses I think?

Let me hear what you think, I'm just about to release this puppy!

/Karlsson

Monday, September 29, 2014

Replication from Oracle to MariaDB the simple way - Part 3

In this third installment in this series, I'll explain why the smart solution I described in the previous post actually wasn't that good, and then I go on to explain how to fix it, and why that fix wasn't such a smart thing after all. So, this was the design we ended with last time:
We have Oracle replicating to a Materialized View, this to ensure that we can run triggers when the is a commit, and then triggers on this Materialized View updates MariaDB by sending a UDP message to a server that in turn is connected to MariaDB.

The issue with the above thingy was that a Materialized View by default is refreshed in it's entirety when there is a refresh, so if the table has 10.000 rows and 1 is inserted, then there will be 20.001 messages sent to MariaDB (10.000 rows deleted, 10.001 inserted). Not fun. And it seems that Materialized Views in Oracle aren't so smart, but I was sure they were this dumbed down, if they were, noone would be using them. So I rush for the Oracle documentation, yiihaa!

The default way of updating a Materialized View is not that fast, but there is a supposedly fast, alternative, method, appropriately named FAST (that the default method isn't called something like AWFULLY CRAZY SLOW is beyond me). So the materialized view using FAST REFRESH for the orders table should really be created like this:
CREATE MATERIALIZED VIEW orders_mv
  REFRESH FAST ON COMMIT
  AS SELECT *
  FROM orders;

But this gives an error from Oracle:
ORA-23413: table "SYSTEM"."ORDERS" does not have a materialized view log
Well the, let's create a MATERIALIZED VIEW LOG for table ORDERS then, that's no big deal:
CREATE MATERIALIZED VIEW LOG ON t1_mv;
But again I get an error, and this time indicating that old Larry has run out of gas in his MIG-21 and need my money to fill it up again, so he can fly off to his yacht:
ORA-00439: feature not enabled: Advanced replication

Grrnn (this is a sound I make when I get a bit upset)! Yes, if you want Materialized Views to work properly, they way the were designed, you need to part with some $$$, and as the cheap person I am, I run Oracle Express instead of SE or EE editions, as I rather spend my hard earned money on expensive beer than on Larrys stupid MIG-21. So, as they say, "Close, but no cigar".

But I'm not one to give up easily, as you probably know. And fact is, I don't need the whole Materialized View thing, all I want is a TRIGGER to execute on COMMIT. Hmm this requires a huge box of prescription drugs to fix, and I am already on the case. Sorry Larry, but you'll have to park your MIG-21 and have someone else buy you some gas.

More details on how I tricked Larry in the next part of this series on replication from Oracle to MariaDB.

/Karlsson

Replication from Oracle to MariaDB the simple way - Part 2

The theme for this series of posts is, and indicated in the previous post, "Try and try, again", and there will be more of this now when I start to make this work by playing with Oracle, with PL/SQL and with the restrictions of Oracle Express (which is the version I have available).

So, what we have right now is a way of "sending" SQL statements from Oracle to MariaDB, the question is when and how to send them from Oracle. The idea for this was then to use triggers on the Oracle tables to send the data to MariaDB, like this, assuming we are trying to replicate the orders table from Oracle to MariaDB:
In Oracle, and assuming that the extproc I have that created to send UDP messages is called oraudp, then I would do something like this:
CREATE OR REPLACE TRIGGER orders_insert
AFTER INSERT ON orders
FOR EACH ROW
BEGIN
   CALL oraudp('INSERT INTO orders VALUES '||
     :new.order_id||', '||:new.customer_id||', '||
     :new.amount||')');
END;
/

CREATE OR REPLACE TRIGGER orders_update
AFTER UPDATE ON orders
FOR EACH ROW
BEGIN
   CALL oraudp('UPDATE orders SET order_id = '||:new.order_id||
     ', customer_id = '||:new.customer_id||', amount = '||:new.amount||
     ' WHERE order_id = '||:old.order_id);
END;
/

CREATE OR REPLACE TRIGGER orders_delete
AFTER DELETE ON orders
FOR EACH ROW
BEGIN
    CALL oraudp('DELETE FROM orders WHERE order_id = '||:old.order_id);
END;
/
 As I noted in my previous post though, DML TRIGGERs execute when the DML statement they are defined for executes, not when that DML is committed, which means that if I replicate to MariaDB in a trigger and then roll back, the data will still be replicated to MariaDB. So the above design wasn't such a good one after all. But the theme is, as I said, try and try again, so off to think and some googling.

Google helped me find a workaround, which is by using Oracle MATERIALIZED VIEWs. And I know what you think, you think "What has dear old Karlsson been smoking this time, I bet it's not legal", how is all this related to MATERIALIZED VIEWs? Not at all, actually, but this is a bit of a kludge, it does actually work, and this is the nature of a kludge (I've done my fair share of these over the years). There are two attributes of Oracle MATERIALIZED VIEWs that makes these work for us:
  • A MATERIALIZED VIEW can be updated ON COMMIT, which is one of the two ways a MATERIALIZED VIEW is updated (the other is ON DEMAND). So when there is a COMMIT that affects the tables underlying the MATERIALIZED VIEW, we have something happening.
  • And the above would be useless if you couldn't attach a TRIGGER to a MATERIALIZED VIEW, but you can!
So, to make this work, I can define a MATERIALIZED VIEW, one that I don't really use for any other purpose, on the table I want to replicate. So then the design gets something like this, again using the orders table as an example:
In practice, if we continue to use theorders table, this is what I would do in Oracle:
CREATE MATERIALIZED VIEW orders_mvREFRESH ON COMMIT
AS SELECT order_id, customer_id, amount
FROM orders;

CREATE OR REPLACE TRIGGER orders_mv_insert
AFTER INSERT ON orders_mv
FOR EACH ROW
BEGIN
   CALL oraudp('INSERT INTO orders VALUES '||
     :new.order_id||', '||:new.customer_id||', '||
     :new.amount||')');
END;
/

CREATE OR REPLACE TRIGGER orders_mv_update
AFTER UPDATE ON orders_mv
FOR EACH ROW
BEGIN
   CALL oraudp('UPDATE orders SET order_id = '||:new.order_id||
     ', customer_id = '||:new.customer_id||', amount = '||:new.amount||
     ' WHERE order_id = '||:old.order_id);
END;
/

CREATE OR REPLACE TRIGGER orders_mv_delete
AFTER DELETE ON orders_mv
FOR EACH ROW
BEGIN
    CALL oraudp('DELETE FROM orders WHERE order_id = '||:old.order_id);
END;


Now we have a solution that actually works, and it also works with Oracle transactions, nothing will be sent to MariaDB if a transaction is rolled back. There is a slight issue with this though, you might think it's no big deal, but fact is that performance sucks when we start adding data to the orders table, and the reason for this is that the default way for Oracle to refresh a materialized view is to delete all rows and then add the new ones. In short, Oracle doesn't keep track of the changes to the base table, except writing them to the base table just as usual, so when the MATERIALIZED VIEW is to be refreshed, all rows in the materialized view will first be deleted, and then all the rows in the base table inserted. So although this works, it is painfully slow.

So it's back to the drawing board, but we are getting closer, now it works at least, we just have a small performance problem to fix. I'll show you how I did that in the next post in this series, so see you all soon. And don't you dare skip a class...

/Karlsson