Author: barce

  • Social Media Backup – What are the options now?

    Bad things happen:

    Dear Twitter,

    My Twitter page @mostlylisa has been hacked and deleted. It’s GONE!!! I am currently catatonic. Please help me restore my account, it’s like, my meaning in life.

    Much love to whom ever helps me!

    PS. If you miss me like I miss you, you can always be my Friend OR Fan on Facebook. I know it’s not the same, but it’s all I have now. *hold me*

    It only took Twitter about 3 days to recover from this.

    — djsteen

    Is there a faster way?

    First let’s look at the current options:

    • TweetBackup (NY Times)
    • BackUpMyTweets
    • If you are popular enough and folks raise a ruckus, Twitter will go into the database backup and restore your account to its pristine setup.

    BackupMyTweets required too much info to get it working. No, you cannot have my gmail password.

    I’ve tried TweetBackup, and they get kudos for using OAuth to make it easy to back your tweets up.

    The third option, begging Twitter, simply can’t scale; it will only work for the few elites who are close to Twitter or popular enough. It isn’t a consumer solution.

    How do we solve the problem of social media backup?

    The great thing is that the problem:

    • is technical
    • can have the same business model as insurance
    • will gain recognition as more snafus happen

    Once again, if you haven’t already, use BackUpMyTweets.

  • Benchmarking Inserts on Drizzle and MySQL

    I’m not comparing apples to apples yet… but out of the box, drizzle does inserts faster than MySQL using the same table type, InnoDB.

    Here’s what I’m comparing:
    drizzle r1126 configured with defaults, and
    MySQL 5.1.38 configured with

    ./configure --prefix=/usr/local/mysql --with-extra-charsets=complex \
    --enable-thread-safe-client --enable-local-infile --enable-shared \
    --with-plugins=partition,innobase
    

    which is really nothing complicated.

    SQL query caching is turned off on both database servers. Both are using the InnoDB engine plug-in.
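
    On the MySQL side you can double-check that the cache really is off before trusting the numbers; assuming you can connect as root on the test box, something like this should report OFF and 0:

    mysql -u root -e "SHOW VARIABLES LIKE 'query_cache_type';"
    mysql -u root -e "SHOW VARIABLES LIKE 'query_cache_size';"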

    I’m running these benchmarks on a MacBook Pro 2.4 GHz Intel Core 2 Duo with 2GB 1067 MHz DDR3 RAM.

    I wrote benchmarking software about 2 years ago to test partitions but I’ve since abstracted the code to be database agnostic.

    You can get the benchmarking code at Github.

    At the command-line, you type:

    php build_tables.php 10000 4 drizzle

    where 10000 is the total number of rows allocated, and 4 is the number of partitions for those rows.

    You can type the same thing for mysql:

    php build_tables.php 10000 4 mysql

    and get interesting results.

    Here’s what I got:

    MySQL

    bash-3.2$ php build_tables.php 10000 4 mysql
    Elapsed time between Start and Test_Code_Partition: 13.856538
    last table for php partition: users_03
    Elapsed time between No_Partition and Code_Partition: 14.740206
    -------------------------------------------------------------
    marker           time index            ex time         perct   
    -------------------------------------------------------------
    Start            1252376759.26094100   -                0.00%
    -------------------------------------------------------------
    No_Partition     1252376773.11747900   13.856538       48.45%
    -------------------------------------------------------------
    Code_Partition   1252376787.85768500   14.740206       51.54%
    -------------------------------------------------------------
    Stop             1252376787.85815000   0.000465         0.00%
    -------------------------------------------------------------
    total            -                     28.597209      100.00%
    -------------------------------------------------------------
    20000 rows inserted...
    

    drizzle

    bash-3.2$ php build_tables.php 10000 4 drizzle
    Elapsed time between Start and Test_Code_Partition: 7.502141
    last table for php partition: users_03
    Elapsed time between No_Partition and Code_Partition: 7.072367
    -------------------------------------------------------------
    marker           time index            ex time         perct   
    -------------------------------------------------------------
    Start            1252376733.68141500   -                0.00%
    -------------------------------------------------------------
    No_Partition     1252376741.18355600   7.502141        51.47%
    -------------------------------------------------------------
    Code_Partition   1252376748.25592300   7.072367        48.52%
    -------------------------------------------------------------
    Stop             1252376748.25627400   0.000351         0.00%
    -------------------------------------------------------------
    total            -                     14.574859      100.00%
    -------------------------------------------------------------
    20000 rows inserted...
    

    MySQL: 699 inserts per second
    drizzle: 1372 inserts per second
    As far as inserts go, drizzle is about 2 times faster out of the box than MySQL.
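
    If you want a rough, independent sanity check without the PHP harness, you can time a batch of plain INSERT statements through each server’s client. This is only a sketch; it assumes a test database on each server with a simple users (id, name) table already created, and that both the mysql and drizzle clients are on your PATH:

    # generate 10,000 single-row INSERT statements
    for i in $(seq 1 10000); do
      echo "INSERT INTO users (id, name) VALUES ($i, 'user_$i');"
    done > inserts.sql

    # time the same batch against each server
    time mysql -u root test < inserts.sql
    time drizzle test < inserts.sql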

  • If You Miss tr.im use j.mp

    I’ve been using j.mp for two weeks now and it’s filled the void that tr.im left after going out of business.

    j.mp

    Long Live j.mp.

  • Advice for Middle Management

    Your team will sabotage your career worse than any other nemesis at work, if you let them.

    Here’s what you need to know to protect yourself and your company from sabotage:

    Who’s popular? Yeah, I know. It sounds like high school, but just like then, it’s still an important and socially real factor, one that’s now tracked on social media sites.

    What is your team’s weakness as perceived by those outside? By the team itself? A good manager can reconcile the two.

    Whose skills are the most respected? Yup, you have to get along with this douchebag, if she or he is one. Just create enough space between the two of you.

    Any others?

  • Amazon EC2 in the Enterprise

    This is just a quick summary of what it was like implementing Amazon’s EC2 in an enterprise environment.

    1. You’ll need to write your own LDAP plug-ins to interface with any access control lists. E.g., where I work, WordPress is used for corporate communications, so an LDAP plug-in had to be written to make sure the right people saw the right information.

    2. Migration can be expensive if you’re using EBS on the first go. On Windows, and I’m not sure why, it can cost about $50 to migrate 2GB of data into EBS. On Linux, it happens at a fraction of that cost and as advertised.

    3. Windows can be very expensive. Although they say it’s 12 cents per hour for a small instance, beware of hidden costs like authentication services and SQL Server. With both, you are paying $1.35 per hour for the server, which IMHO a small Linux instance could do for 10 cents per hour.

    I’m pretty sure that with the right Amazon EC2 setup you could run a cluster of servers for a Fortune 500 company for under $1,000 (one thousand dollars) per month, without the CapEx costs associated with new hardware.

    If you have any more questions about Amazon EC2 in the enterprise I’d be happy to answer them. Please ask them in the comments below.

  • How to Load Balance and Auto Scale with Amazon’s EC2

    This blog post is a quick introduction to load balancing and auto scaling with Amazon’s EC2.

    I was kinda amazed at how easy it was.

    Prelims: Download the load balancer API software, auto scaling software, and cloud watch software. You can get all three at a download page on Amazon.
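
    Before any of the commands below will run, the three tool bundles need Java and your AWS credentials wired up. The exact locations are up to you; a typical shell setup looks something like this, where every path and the credential file are placeholders:

    export AWS_ELB_HOME=~/aws/ElasticLoadBalancing
    export AWS_AUTO_SCALING_HOME=~/aws/AutoScaling
    export AWS_CLOUDWATCH_HOME=~/aws/CloudWatch
    export PATH=$PATH:$AWS_ELB_HOME/bin:$AWS_AUTO_SCALING_HOME/bin:$AWS_CLOUDWATCH_HOME/bin
    export JAVA_HOME=/Library/Java/Home   # the tools are Java based; point this at your JDK
    # credential-file.txt contains two lines: AWSAccessKeyId=... and AWSSecretKey=...
    export AWS_CREDENTIAL_FILE=~/aws/credential-file.txt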

    Let’s load balance two servers.

    elb-create-lb lb-example --headers \
    --listener "lb-port=80,instance-port=80,protocol=http" \
    --availability-zones us-east-1a

    The above creates a load balancer called “lb-example,” and will load balance traffic on port 80, i.e. the web pages that you serve.

    To attach specific servers to the load balancer you just type:

    elb-register-instances-with-lb lb-example --headers \
    --instances i-example,i-example2

    where i-example and i-example2 are the instance IDs of the servers you want added to the load balancer.

    You’ll also want to monitor the health of the load balanced servers, so please add a health check:

    elb-configure-healthcheck lb-example --headers \
    --target "HTTP:80/index.html" --interval 30 --timeout 3 \
    --unhealthy-threshold 2 --healthy-threshold 2
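
    Once the health check is in place you can ask the load balancer which instances are currently passing it; the ELB tools include a describe command for this:

    elb-describe-instance-health lb-example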

    Now let’s set up autoscaling:

    as-create-launch-config example3autoscale --image-id ami-mydefaultami \
    --instance-type m1.small
    as-create-auto-scaling-group example3autoscalegroup  \
    --launch-configuration example3autoscale \
    --availability-zones us-east-1a \
    --min-size 2 --max-size 20 \
    --load-balancers lb-example
    as-create-or-update-trigger example3trigger \
    --auto-scaling-group example3autoscalegroup --namespace "AWS/EC2" \
    --measure CPUUtilization --statistic Average \
    --dimensions "AutoScalingGroupName=example3autoscalegroup" \
    --period 60 --lower-threshold 20 --upper-threshold 40 \
    --lower-breach-increment=-1 --upper-breach-increment 1 \
    --breach-duration 120

    With the three commands above I’ve created an auto-scaling scenario where a new server is spawned and added to the load balancer whenever average CPU utilization stays above 40% for two minutes, and a server is removed whenever it stays below 20% for two minutes.

    Ideally you want to set --lower-threshold to something high like 70 and --upper-threshold to 90, but I set them to 20 and 40 respectively just to be able to test.

    I tested using siege.
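
    A siege run against the load balancer’s DNS name is enough to push CPU over the threshold; the hostname below is just a placeholder for whatever DNS name elb-create-lb handed back:

    siege -c 50 -t 10M http://lb-example-1234567890.us-east-1.elb.amazonaws.com/index.html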

    Caveats: the auto-termination part is buggy, or simply didn’t work. As the load went down, the number of servers online remained the same. Anybody have thoughts on this?
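
    If you hit the same thing, it’s worth asking the auto-scaling group what it thinks its current size is before blaming the trigger; the auto scaling tools have a describe command for that:

    as-describe-auto-scaling-groups example3autoscalegroup --headers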

    What do auto-scaling and load balancing in the cloud mean? Well, the total cost of ownership for scalable, enterprise infrastructure just went down by a lot. It also means that IT departments can just hire a cloud expert and deploy solutions from a single laptop instead of having to figure out the cost of hardware load balancers and physical servers.

    The age of Just-In-Time IT just got ushered in with auto-scaling and load balancing in the cloud.

  • Monitoring Websites on the Cheap: Screen and Sitebeagle

    If you don’t fail fast enough, you’re on the slow road to success.

    One idea that recently failed for me was using screen and sitebeagle to monitor sites.

    It’s not a complete failure… it works okay.

    Due to budget constraints, I put my screen and sitebeagle setup on a production server.

    For some reason that production server ran out of space and became unresponsive. Screen no doubt caused this. I was alerted to the issue and did a reboot.

    After the reboot, although Amazon’s monitoring tools told me the server was okay, the server was not. The MySQL database was in an EBS volume and needed to be re-mounted.

    The solution I now have in place is still screen and sitebeagle. But I use another server with screen and sitebeagle on it to monitor the production server that gave me the issue in the first place.
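
    For reference, the setup on the monitoring box is nothing fancy. Here’s a minimal sketch, assuming the sitebeagle checks live in a newLISP script (the script name is illustrative):

    # detached screen session named "monitor" that re-runs the checks every 5 minutes
    screen -dmS monitor bash -c 'while true; do newlisp check.lsp; sleep 300; done'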

    It’s a question of who will monitor the monitors… in a world of web sites with few users, the answer’s pretty bleak. In the world of super popular commercial sites, the answer’s clear: the wisdom of crowds will monitor the web sites.

  • Lone Coder in a Sea of Power Users?

    Hey folks, I might expand this into a larger article for either mashable.com or techcrunch.com.

    I was wondering if you folks who are coders feel that you’ve been put in a situation where you are the lone coder in a sea of power users?

    If so, is this situation ideal for you? Not ideal?

    How do you deal with job queues?

    How do you deal with working with power users with conflicting interests?

    I’m really interested in war stories where you feel you’re the lone expert.

    Cheers, Barce

  • Git: How to Cherry Pick Commits and Package them Under a Tag

    I’ve pretty much come to rely on git to pull me out of any bad jams in the chaotic environment I work in.

    One thing I’ve had to learn to do is cherry pick commits and package them under a tag in git.

    Here’s how to do it if you were working with my newLISP project called Sitebeagle:

    fork sitebeagle on this page

    cd sitebeagle

    git fetch --tags

    git checkout 8f5bb33a771f7811d21b8c96cec67c28818de076

    git checkout -b sample_cherry_pick

    git cherry-pick 22aab7

    git cherry-pick b1334775

    git diff sample_cherry_pick..master

    git tag leaving_out_one_commit

    git push origin --tags

    At this point, you should have a tagged branch that doesn’t have the commit with the change to the “2nd file.” The diff should look exactly like this:

    diff --git a/test.lsp b/test.lsp
    index 9cf1667..158b625 100755
    --- a/test.lsp
    +++ b/test.lsp
    @@ -1,6 +1,7 @@
    #!/usr/bin/newlisp

    ; test tag test_a
    +; cherry pick test 2

    (load "sitebeagle.lsp")
    (load "twitter.lsp");
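
    If you want to double-check which commits actually made it under the tag, listing them is a quick sanity check:

    git log --oneline leaving_out_one_commit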

  • A Cross Platform Browser, Windows 2003 EC2 AMI

    I recently created a cross platform browser, Windows 2003 EC2 AMI: ami-69739500

    It has the following pre-installed:

    • gvim
    • IE 7
    • Firefox 3 with Web Developer, yslow & Firebug
    • opera
    • Putty SSH
    • Putty SCP

    With that list you’re pretty much all set to troubleshoot cross-platform browser issues.

    There’s IIS 6.0 and SQL Server, too.

    I’ve linked the password to this AMI at http://www.codebelay.com/ami-69739500.txt . It’s a shortcoming of Windows AMIs on EC2 that I have to link the password, so please change it once you get into the instance.
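
    If you want to try the AMI out with the EC2 API tools, launching it looks roughly like this, where the key pair name and instance size are placeholders; once Windows finishes booting, RDP into the instance’s public DNS name and change that password:

    ec2-run-instances ami-69739500 -k my-keypair -t m1.small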