Author: barce

  • Social Media Backup – What are the options now?

    Bad things happen:

    Dear Twitter,

    My Twitter page @mostlylisa has been hacked and deleted. It’s GONE!!! I am currently catatonic. Please help me restore my account, it’s like, my meaning in life.

    Much love to whom ever helps me!

    PS. If you miss me like I miss you, you can always be my Friend OR Fan on Facebook. I know it’s not the same, but it’s all I have now. *hold me*

    It only took Twitter about 3 days to recover from this.

    — djsteen

    Is there a faster way?

    First let’s look at the current options:

    • TweetBackup (NY Times)
    • BackUpMyTweets
    • If you are popular enough and folks raise a ruckus, Twitter will go into the database backup and restore your account to its pristine setup.

    BackupMyTweets required too much info to get it working. No, you cannot have my gmail password.

    I’ve tried TweetBackup, and they get kudos for using OAuth to make it easy to back your tweets up.

    The third option, begging Twitter, simply can’t scale; it will only work for the few elites who are close to Twitter or popular enough. It isn’t a consumer solution.

    How do we solve the problem of social media backup?

    The great thing is that the problem:

    • is technical
    • can have the same business model as insurance
    • will gain recognition as more snafus happen

    Once again, if you haven’t already, use BackUpMyTweets.

  • Benchmarking Inserts on Drizzle and MySQL

    I’m not comparing apples to apples yet… but out of the box, drizzle does inserts faster than MySQL using the same table type, InnoDB.

    Here’s what I’m comparing:
    drizzle r1126 configured with defaults, and
    MySQL 5.1.38 configured with

    ./configure --prefix=/usr/local/mysql --with-extra-charsets=complex \
    --enable-thread-safe-client --enable-local-infile --enable-shared \
    --with-plugins=partition,innobase
    

    which is really nothing complicated.

    SQL query caching is turned off on both database servers. Both are using the InnoDB engine plug-in.
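
    On the MySQL side you can double-check that the cache really is off before trusting the numbers; assuming you can connect as root on the test box, something like this should report OFF and 0:

    mysql -u root -e "SHOW VARIABLES LIKE 'query_cache_type';"
    mysql -u root -e "SHOW VARIABLES LIKE 'query_cache_size';"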

    I’m running these benchmarks on a MacBook Pro 2.4 GHz Intel Core 2 Duo with 2GB 1067 MHz DDR3 RAM.

    I wrote benchmarking software about 2 years ago to test partitions but I’ve since abstracted the code to be database agnostic.

    You can get the benchmarking code at Github.

    At the command-line, you type:

    php build_tables.php 10000 4 drizzle

    where 10000 is the total number of rows allocated, and 4 is the number of partitions for those rows.

    You can type the same thing for mysql:

    php build_tables.php 10000 4 mysql

    and get interesting results.

    Here’s what I got:

    MySQL

    bash-3.2$ php build_tables.php 10000 4 mysql
    Elapsed time between Start and Test_Code_Partition: 13.856538
    last table for php partition: users_03
    Elapsed time between No_Partition and Code_Partition: 14.740206
    -------------------------------------------------------------
    marker           time index            ex time         perct   
    -------------------------------------------------------------
    Start            1252376759.26094100   -                0.00%
    -------------------------------------------------------------
    No_Partition     1252376773.11747900   13.856538       48.45%
    -------------------------------------------------------------
    Code_Partition   1252376787.85768500   14.740206       51.54%
    -------------------------------------------------------------
    Stop             1252376787.85815000   0.000465         0.00%
    -------------------------------------------------------------
    total            -                     28.597209      100.00%
    -------------------------------------------------------------
    20000 rows inserted...
    

    drizzle

    bash-3.2$ php build_tables.php 10000 4 drizzle
    Elapsed time between Start and Test_Code_Partition: 7.502141
    last table for php partition: users_03
    Elapsed time between No_Partition and Code_Partition: 7.072367
    -------------------------------------------------------------
    marker           time index            ex time         perct   
    -------------------------------------------------------------
    Start            1252376733.68141500   -                0.00%
    -------------------------------------------------------------
    No_Partition     1252376741.18355600   7.502141        51.47%
    -------------------------------------------------------------
    Code_Partition   1252376748.25592300   7.072367        48.52%
    -------------------------------------------------------------
    Stop             1252376748.25627400   0.000351         0.00%
    -------------------------------------------------------------
    total            -                     14.574859      100.00%
    -------------------------------------------------------------
    20000 rows inserted...
    

    MySQL: 699 inserts per second
    drizzle: 1372 inserts per second
    As far as inserts go, drizzle is about 2 times faster out of the box than MySQL.
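
    If you want a rough, independent sanity check without the PHP harness, you can time a batch of plain INSERT statements through each server’s client. This is only a sketch; it assumes a test database on each server with a simple users (id, name) table already created, and that both the mysql and drizzle clients are on your PATH:

    # generate 10,000 single-row INSERT statements
    for i in $(seq 1 10000); do
      echo "INSERT INTO users (id, name) VALUES ($i, 'user_$i');"
    done > inserts.sql

    # time the same batch against each server
    time mysql -u root test < inserts.sql
    time drizzle test < inserts.sql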

  • If You Miss tr.im use j.mp

    I’ve been using j.mp for two weeks now and it’s filled the void that tr.im left after going out of business.

    j.mp

    Long Live j.mp.

  • Advice for Middle Management

    Your team will sabotage your career worse than any other nemesis at work, if you let them.

    Here’s what you need to know to protect yourself and your company from sabotage:

    Who’s popular? Yeah, I know. It sounds like high school, but just like then, it’s still an important and socially real factor, one that’s now tracked on social media sites.

    What is your team’s weakness as perceived by those outside? By the team itself? A good manager can reconcile the two.

    Whose skills are the most respected? Yup, you have to get along with this douchebag, if she or he is one. Just create enough space between the two of you.

    Any others?

  • Amazon EC2 in the Enterprise

    This is just a quick summary of what it was like implementing Amazon’s EC2 in an enterprise environment.

    1. You’ll need to write your own LDAP plug-ins to interface with any access control lists. E.g., where I work, WordPress is used for corporate communications, so an LDAP plug-in had to be written to make sure the right people saw the right information.

    2. Migration can be expensive if you’re using EBS on the first go. On Windows, and I’m not sure why, it can cost about $50 to migrate 2GB of data into EBS. On Linux, it happens at a fraction of that cost and as advertised.

    3. Windows can be very expensive. Although they say it’s 12 cents per hour for a small instance, beware of hidden costs like authentication services and SQL Server. With both, you are paying $1.35 per hour for the server, which IMHO a small Linux instance could do for 10 cents per hour.

    I’m pretty sure that with the right Amazon EC2 setup you could run a cluster of servers for a Fortune 500 company for under $1,000 (one thousand dollars) per month, without the CapEx costs associated with new hardware.

    If you have any more questions about Amazon EC2 in the enterprise I’d be happy to answer them. Please ask them in the comments below.

  • How to Load Balance and Auto Scale with Amazon’s EC2

    This blog post is a quick introduction to load balancing and auto scaling with Amazon’s EC2.

    I was kinda amazed at how easy it was.

    Prelims: Download the load balancer API software, auto scaling software, and cloud watch software. You can get all three at a download page on Amazon.
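
    Before any of the commands below will run, the three tool bundles need Java and your AWS credentials wired up. The exact locations are up to you; a typical shell setup looks something like this, where every path and the credential file are placeholders:

    export AWS_ELB_HOME=~/aws/ElasticLoadBalancing
    export AWS_AUTO_SCALING_HOME=~/aws/AutoScaling
    export AWS_CLOUDWATCH_HOME=~/aws/CloudWatch
    export PATH=$PATH:$AWS_ELB_HOME/bin:$AWS_AUTO_SCALING_HOME/bin:$AWS_CLOUDWATCH_HOME/bin
    export JAVA_HOME=/Library/Java/Home   # the tools are Java based; point this at your JDK
    # credential-file.txt contains two lines: AWSAccessKeyId=... and AWSSecretKey=...
    export AWS_CREDENTIAL_FILE=~/aws/credential-file.txt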

    Let’s load balance two servers.

    elb-create-lb lb-example --headers \
    --listener "lb-port=80,instance-port=80,protocol=http" \
    --availability-zones us-east-1a

    The above creates a load balancer called “lb-example,” and will load balance traffic on port 80, i.e. the web pages that you serve.

    To attach specific servers to the load balancer you just type:

    elb-register-instances-with-lb lb-example --headers \
    --instances i-example,i-example2

    where i-example and i-example2 are the instance IDs of the servers you want added to the load balancer.

    You’ll also want to monitor the health of the load balanced servers, so please add a health check:

    elb-configure-healthcheck lb-example --headers \
    --target "HTTP:80/index.html" --interval 30 --timeout 3 \
    --unhealthy-threshold 2 --healthy-threshold 2
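
    Once the health check is in place you can ask the load balancer which instances are currently passing it; the ELB tools include a describe command for this:

    elb-describe-instance-health lb-example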

    Now let’s set up autoscaling:

    as-create-launch-config example3autoscale --image-id ami-mydefaultami \
    --instance-type m1.small
    as-create-auto-scaling-group example3autoscalegroup  \
    --launch-configuration example3autoscale \
    --availability-zones us-east-1a \
    --min-size 2 --max-size 20 \
    --load-balancers lb-example
    as-create-or-update-trigger example3trigger \
    --auto-scaling-group example3autoscalegroup --namespace "AWS/EC2" \
    --measure CPUUtilization --statistic Average \
    --dimensions "AutoScalingGroupName=example3autoscalegroup" \
    --period 60 --lower-threshold 20 --upper-threshold 40 \
    --lower-breach-increment=-1 --upper-breach-increment 1 \
    --breach-duration 120

    With the three commands above I’ve created an auto-scaling scenario where a new server is spawned and added to the load balancer whenever average CPU utilization stays above 40% for two minutes, and a server is removed whenever it stays below 20% for two minutes.

    Ideally you want to set --lower-threshold to something high like 70 and --upper-threshold to 90, but I set them to 20 and 40 respectively just to be able to test.

    I tested using siege.
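
    A siege run against the load balancer’s DNS name is enough to push CPU over the threshold; the hostname below is just a placeholder for whatever DNS name elb-create-lb handed back:

    siege -c 50 -t 10M http://lb-example-1234567890.us-east-1.elb.amazonaws.com/index.html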

    Caveats: the auto-termination part is buggy, or simply didn’t work. As the load went down, the number of servers online remained the same. Anybody have thoughts on this?
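
    If you hit the same thing, it’s worth asking the auto-scaling group what it thinks its current size is before blaming the trigger; the auto scaling tools have a describe command for that:

    as-describe-auto-scaling-groups example3autoscalegroup --headers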

    What do auto-scaling and load balancing in the cloud mean? Well, the total cost of ownership for scalable, enterprise infrastructure just went down by a lot. It also means that IT departments can just hire a cloud expert and deploy solutions from a single laptop instead of having to figure out the cost of hardware load balancers and physical servers.

    The age of Just-In-Time IT just got ushered in with auto-scaling and load balancing in the cloud.

  • Monitoring Websites on the Cheap: Screen and Sitebeagle

    If you don’t fail fast enough, you’re on the slow road to success.

    One idea that recently failed for me was using screen and sitebeagle to monitor sites.

    It’s not a complete failure… it works okay.

    Due to budget constraints, I put my screen and sitebeagle setup on a production server.

    For some reason that production server ran out of space and became unresponsive. Screen no doubt caused this. I was alerted to the issue and did a reboot.

    After the reboot, although Amazon’s monitoring tools told me the server was okay, the server was not. The MySQL database was in an EBS volume and needed to be re-mounted.

    The solution I now have in place is still screen and sitebeagle. But I use another server with screen and sitebeagle on it to monitor the production server that gave me the issue in the first place.
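
    For reference, the setup on the monitoring box is nothing fancy. Here’s a minimal sketch, assuming the sitebeagle checks live in a newLISP script (the script name is illustrative):

    # detached screen session named "monitor" that re-runs the checks every 5 minutes
    screen -dmS monitor bash -c 'while true; do newlisp check.lsp; sleep 300; done'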

    It’s a question of who will monitor the monitors… in a world of web sites with few users, the answer’s pretty bleak. In the world of super popular commercial sites, the answer’s clear: the wisdom of crowds will monitor the web sites.

  • Lone Coder in a Sea of Power Users?

    Hey folks, I might expand this into a larger article for either mashable.com or techcrunch.com.

    I was wondering if you folks who are coders feel that you’ve been put in a situation where you are the lone coder in a sea of power users?

    If so, is this situation ideal for you? Not ideal?

    How do you deal with job queues?

    How do you deal with working with power users with conflicting interests?

    I’m really interested in war stories where you feel you’re the lone expert.

    Cheers, Barce

  • Git: How to Cherry Pick Commits and Package them Under a Tag

    I’ve pretty much come to rely on git to pull me out of any bad jams in the chaotic environment I work in.

    One thing I’ve had to learn to do is cherry pick commits and package them under a tag in git.

    Here’s how to do it if you were working with my newLISP project called Sitebeagle:

    fork sitebeagle on this page

    cd sitebeagle

    git fetch --tags

    git checkout 8f5bb33a771f7811d21b8c96cec67c28818de076

    git checkout -b sample_cherry_pick

    git cherry-pick 22aab7

    git cherry-pick b1334775

    git diff sample_cherry_pick..master

    git tag leaving_out_one_commit

    git push origin --tags

    At this point, you should have a tagged branch that doesn’t have the commit with the change to the “2nd file.” The diff should look exactly like this:

    diff --git a/test.lsp b/test.lsp
    index 9cf1667..158b625 100755
    --- a/test.lsp
    +++ b/test.lsp
    @@ -1,6 +1,7 @@
    #!/usr/bin/newlisp

    ; test tag test_a
    +; cherry pick test 2

    (load "sitebeagle.lsp")
    (load "twitter.lsp");
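
    If you want to double-check which commits actually made it under the tag, listing them is a quick sanity check:

    git log --oneline leaving_out_one_commit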

  • A Cross Platform Browser, Windows 2003 EC2 AMI

    I recently created a cross platform browser, Windows 2003 EC2 AMI: ami-69739500

    It has the following pre-installed:

    • gvim
    • IE 7
    • Firefox 3 with Web Developer, yslow & Firebug
    • opera
    • Putty SSH
    • Putty SCP

    With that list you’re pretty much all set to troubleshoot cross-platform browser issues.

    There’s IIS 6.0 and SQL Server, too.

    I’ve linked the password to this AMI at http://www.codebelay.com/ami-69739500.txt . It’s a shortcoming of Windows AMIs on EC2 that I have to link the password, so please change it once you get into the instance.
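
    If you want to try the AMI out with the EC2 API tools, launching it looks roughly like this, where the key pair name and instance size are placeholders; once Windows finishes booting, RDP into the instance’s public DNS name and change that password:

    ec2-run-instances ami-69739500 -k my-keypair -t m1.small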