2
votes

I have a lot of (about 1 million in second)structural data that must be insert to database I see a lot of benchmark about sql vs noSql and type of Nosql then collect cassandra as database

but I create a benchmark to test mysql vs cassandra in write/update/select speed mysql have better performance in my benchmark, I want to know what is my mistake??

php use as programming language YACassandraPDO and cataloniaframework use as php driver and PDO use as mysql driver

my server is centOS 6.5 with 2 core CPU and 2GB RAM, mysql and cassandra have default configuration

detail of benchmark:

cassandra keyspace and column family structure: CREATE KEYSPACE test2 WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 } AND durable_writes = false;

CREATE TABLE test (
    uuid int PRIMARY KEY,
    symbol_id int,
    bid int,
    ask int,
    time timestamp,
);

mysql database and table structure: CREATE DataBase test;

CREATE TABLE `test` (
    `id` INT NOT NULL ,
    `symbol_id` int,
    `bid` int,
    `ask` int,
    time timestamp,
    PRIMARY KEY (id)
)ENGINE=MyISAM;

my result of benchmark:

Insert each 100000 record in cassandra in about 26s, Insert each 100000 record in mysql in about 11s

UPDATE each 100000 in 24s in about cassandra, UPDATE each 100000 in 12s in about mysql

SELECT each 10000 in 741s in about cassandra, SELECT each 10000 in 21s in about mysql

my php code use for benchmark:

cassandra code:

$db_handle = new PDO("$dbtype:host=$dbhost;port=$dbport;cqlversion=3.0.0;dbname=$dbname", $dbuser, $dbpass);
while ($count < $rowNum){
    $stmt = $db_handle->prepare("INSERT INTO test (uuid, symbol_id, bid, ask, time) values ($count, " . rand(1, 100) . ", " . rand(1, 10000) . ", ".  rand(1, 10000).", dateof(now())); ");
    $exec = $stmt->execute();     
 }
 unset($db_handle);

mysql code:

$db_handle = new PDO("$dbtype:host=$dbhost;dbname=$dbname", $dbuser, $dbpass);
while ($count < $rowNum){
    $stmt = $db_handle->prepare("INSERT INTO test (id, symbol_id, bid, ask, time) values ($count, " . rand(1, 100) . ", " . rand(1, 10000) . ", ".  rand(1, 10000).", now()); ");
    $exec = $stmt->execute();
}
unset($db_handle);
2
Why do you think you did something wrong?Dan Bracuk
@shahab . . . Your test is likely more of the prepare time for the queries than of the underlying database, and I wouldn't be surprised if MySQL makes better use of a query cache. You can test this by moving the prepare outside the while and running it a zillion times. Another possibility is to prepare the statement once with parameters and then just assign the parameters inside the loop.Gordon Linoff
It could be also that the driver is slow. MySQL’s driver is years old and was improved over that time. Cassandra’s driver is rather new comparingly.peter
@DanBracuk In most benchmark show cassandra write is better than mysql but my benchmark show opposite of it, why?shahab
I'm talking about client processes, not server. If you have a single client writing to cassandra, you won't get anywhere near to the maximum write speed, even for a single server node - cassandra is designed for a lot of concurrency.DNA

2 Answers

0
votes

Disable possible MySQL caching by saying

SELEC SQL_NO_CACHE ...

MySQL checks the PRIMARY KEY integrity with each INSERT/UPDATE. MariaDB at least can disable this for updates, might work in MySQL, too.

0
votes

if you wanna test cassandra, you can simply use cassandra-stress tools, installed with datastax. you can find it in C:\Program Files\DataStax-DDC\apache-cassandra\tools\bin it's a bat file. no need to write even a line of code, simply execute it with desired parameter and benchmark the cassandra.