Finally decided to spend the time to move the blog to my own domain instead of a WordPress domain. Looked at different options, including hosting WordPress on my own server, but Tumblr seemed like the best option. It still lacks some features but it’s getting there, and I like its interface.

http://himanshubaweja.com

RSS feed for the blog:
http://himanshubaweja.com/rss

Note: This guide works as of 11th April, 2010. It’s a bit more technical but I have tried to make it as step by step as I can. Post your doubts in the comments and I will try to help you out. Since it’s 5am, the blog post is going to contain a whole lot of rants :). Skip the portions which are in italics and contained in “#region”.

Background Storyline:

(Skip if you don’t know me and just want to download the video)

#region Rant

I hate lock-downs imposed by companies on their users. Yeah, there are people out there who will misuse your service, but don’t put roadblocks in front of the faithful users of your service. I love Ustream and use them at least once a week, but I guess they don’t like me that much.

#regionend

I had a video of my presentation at TC50 on Ustream. My not so computer savvy sis has a habit of backing up everything on her backup hard-disk (every photo, every video about your family – well at least she is not taking print-outs). She asked me for the video and I am like “Okie give me two minutes, I will send it over Skype”. Since the video was uploaded by the TechCrunch team, Ustream didn’t provide me a direct way to download it.

I started searching on the internet and found a couple of how-to blog posts, but going through them I realized the Ustream team has been working overtime to disable workarounds. I searched for software and found one for $90, but yippee! there is a trial version. Normally I would have given the link to the maker’s website here, but their website says it’s a 15-day trial, so you download, install and it starts downloading the video. Halfway through it says the trial version only downloads the first 10 MB, so “Buy now” or “No video for you :(”. Damn!!! So I kept searching some more and didn’t find anything, and we were already 10 minutes in when my sis pinged me again.

Time to turn on geek mode and download and send the video before the next ping – expected time “t-10 minutes”.

Step 1:

Download and install Wireshark from http://www.wireshark.org/download.html.

Well, you might ask, what is Wireshark? The answer in our case: the Ustream servers are going to send the video and its details to your computer over the network. Wireshark will capture all this data flowing over your network and let you write it to a text file.

#region rant

Surprisingly, Wireshark’s website has a whole lot of crap on its front page except for saying what the damn software does!!! Oh, there is a link to “Learn Wireshark”. Cool. But even that doesn’t say what it does!!! Yeah, their target base is a whole lot different, but still.

#regionend

Step 2:

Run Wireshark and capture the data

i) Go to “Capture –> Interfaces”

image

ii) It will bring up a popup window. You will see all your network cards like wifi, wired etc. The simplest option is to disconnect all the others and just use one type of network, for example wifi. Click the “Start” button in the row whose number of packets is non-zero. If there is more than one, just select the one with the highest number.

image

iii) Wireshark will minimize itself. Open the browser, open the Ustream page which contains the video and start playing the video.

#region rant

If you need a screenshot for this, tell me how in the world you were able to do step 1?

#regionend

iv) Wait till the video starts playing, then go back to Wireshark and stop the capture from “Capture –> Stop”

image

v) Export all the data you captured to a text file for easy searching.

image

vi) Enter a name for the file and save it on the desktop. Change the “Save as type” to “Plain text”, check the “Packet Bytes” checkbox on the right, make sure “All packets” is selected on the left and then click “Save”

 

image

Step 4:

Find the URL of the video

Open the file you just saved in any text editor like Notepad (WordPad will be better if you know how to use it). Search for “.flv”. You will find a couple of instances. Select the one whose line contains a lot of numbers and where the 3-4 lines above it contain the words “http://” and “ustreamstorage”. Concentrate on the rightmost column only. The rest is just gibberish :).

image

Now start copying the text from the right column, starting with “http://” and ending with “.flv”. In the above example I copied “http://vod-sto”, “rage1.ustream.tv”, “/ustreamstorage/”, “content/0/1/2/21”, “63/2163474/1_280” and “032_2163474.flv”. Yup, this also seems like gibberish, but combine all of them and it becomes a URL: “http://vod-storage1.ustream.tv/ustreamstorage/content/0/1/2/2163/2163474/1_280032_2163474.flv”. Yippie! We are almost done :).
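If you would rather not glue the pieces together by hand, the joining can be scripted. Here is a minimal sketch in Python (my pick for illustration, nothing Ustream-specific) using the fragments from the example above. The function name join_fragments is made up, and it assumes you have already copied the right-column pieces in top-to-bottom order.

```python
# Fragments copied from the rightmost column of the Wireshark text export,
# in the order they appear (top to bottom).
fragments = [
    "http://vod-sto",
    "rage1.ustream.tv",
    "/ustreamstorage/",
    "content/0/1/2/21",
    "63/2163474/1_280",
    "032_2163474.flv",
]


def join_fragments(pieces):
    """Concatenate the copied pieces, dropping stray whitespace around each one."""
    return "".join(piece.strip() for piece in pieces)


video_url = join_fragments(fragments)
print(video_url)
```

Stripping whitespace is safe here only because the URL itself contains no spaces.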

Step 5:

Download the video

Since the last step was complex, let’s make this one simple. Open the URL you just made in your browser and it will prompt you to “Save” or “Open” the file. Save it and that’s your video. If you want it in some other format, search on Google for how to convert “flv” into “avi” or “mp4”.

#region recommendedTools

i) Tools used in writing this blog post: WordPress, Windows Live Writer and Paint.Net. All three are pretty good. Paint.Net will be excellent if they just find another name :).

ii) I downloaded the video using “wget”. If you are a hacker on Windows and don’t have Cygwin or GnuWin32 or some other Linux shell installed on Windows, go die!

iii) It’s 6am. Do you seriously expect me to write a third one!!!

#regionend

One of the features we wanted in the Windows version of iMo was that it should periodically check for updates and prompt the user to upgrade if updates are available. The update should be a 1-click process. Since iMo integrates deeply into the operating system, using ClickOnce was not an option. One additional constraint was that we are using Visual Studio Setup to deploy our application, so the update mechanism should also go through it (to avoid inconsistencies like “Add-Remove programs” showing version 1.0.9 whereas you really have 1.1.3).

As always, I tried to find an existing open source solution but found none which satisfied all the requirements, so I went ahead and wrote a custom solution. Most of it is copy-paste from Code Project, MSDN, Google etc. Links to references at the bottom.

I will outline my approach here. Looking forward to hearing your approaches and any loopholes mine might have. So here we go :).

Step 1 : Download the text file from server and compare with current version:
I used WebRequest asynchronously to download the file. Don’t use the main thread, as it might take a few seconds and the user might feel the application has hung. Now, you don’t want to be doing this each time your application starts (once a day is a good enough number). To save the lastUpdateCheckTime, I used User Settings. It’s much better and cleaner than using the registry or maintaining your own text file. You can get more info on User Settings here.
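The decision logic of this step is small enough to sketch. The snippet below is Python rather than the .NET code the application actually uses, and it is only an illustration: it assumes the text file on the server holds a dotted version string such as “1.1.3”, and the names parse_version, update_available and should_check are made up for the example.

```python
from datetime import datetime, timedelta

# Once a day is a good enough number.
CHECK_INTERVAL = timedelta(days=1)


def parse_version(text):
    """Turn '1.0.9' into (1, 0, 9) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.strip().split("."))


def update_available(current, latest):
    """True if the version published on the server is newer than the installed one."""
    return parse_version(latest) > parse_version(current)


def should_check(last_check, now):
    """True if we never checked before, or the last check is at least a day old."""
    return last_check is None or now - last_check >= CHECK_INTERVAL


print(update_available("1.0.9", "1.1.3"))
print(should_check(None, datetime.now()))
```

Comparing tuples of integers avoids the trap of plain string comparison, where “1.10.0” would sort before “1.9.0”.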

Step 2 : Update available, now what:
Prompt the user whether he wants to update now or later. If he clicks later, I don’t prompt him for the next 4 days but start showing a small “Update Now” button in the corner. If he says, yeah I want to update now, download the latest MSI setup from the web server into the “Temp” folder. To download the file, I used WebClient.DownloadFileAsync. Show a progress bar in your application to show the download progress. Bonus points for allowing the user to use the application while the update is being downloaded :).

Step 3 : Verify Downloaded File:
Haven’t finalized this, but will most probably be using the signature to verify.

Step 4 : Prepare for the update:
Now, you can’t replace an application while it’s running. Some people rename their main application and update, but I feel it’s a very ugly hack. I included a small “iMoUpdate.exe” which was copied to the “Temp” folder by the main application and started, and then the main application shut itself down. The job of iMoUpdate.exe was just to run the setup we downloaded in Step 2.

Step 5 : Run update:
“msiexec” is a small utility included with all recent versions of Windows (XP+). Run it using System.Diagnostics.Process and give it the argument “/qb” and the path to your downloaded MSI setup. This runs it in quiet mode, i.e. it will just show a progress bar and not prompt the user at all. Bonus points for restarting the main application after the update is complete :).
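To make the command line concrete, here is a rough Python sketch of what the updater launches. The real helper is a .NET executable using System.Diagnostics.Process, so treat this purely as an illustration. The “/i” (install) switch is my assumption, since the step above only names “/qb” and the path, and the function names and the example path are made up.

```python
import subprocess


def build_msiexec_command(msi_path):
    """Return the argument list for a quiet install of the given MSI package."""
    # /i  : install the package (assumed; the post only mentions /qb and the path)
    # /qb : basic UI, i.e. a progress bar and no prompts
    return ["msiexec", "/i", msi_path, "/qb"]


def run_update(msi_path):
    """Run the installer and wait for it to finish. Only meaningful on Windows."""
    return subprocess.run(build_msiexec_command(msi_path), check=True)


print(build_msiexec_command(r"C:\Temp\iMoSetup.msi"))
```

Passing the arguments as a list rather than one string avoids quoting problems when the Temp path contains spaces.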

That’s it! Simple and easy. Less than 50 lines of code.

Update:

  • One important thing which I added later was a Mutex in “Step 5” which makes sure that the main application has exited before running the update setup using “msiexec”
  • Based on feedback, I am attaching the file containing code snippets for each of the steps. You can download it here. It’s not commented or step by step but should help in copy pasting/googling :)

Mozilla released the problem statement for the Summer Design Challenge a few days back. Instead of directly submitting my entry, I thought I would publish it in parts on my blog and iterate based on the feedback I receive.

The Problem Statement

For this Design Challenge we are focusing on finding creative solutions to the question: “Reinventing Tabs in the Browser – How can we create, navigate and manage multiple web sites within the same browser instance?”

Tabs worked well on slow machines on a thin Internet, where ten browser sessions were “many browser sessions”. Today, 20+ parallel sessions are quite common; the browser is more of an operating system than a data display application; we use it to manage the web as a shared hard drive. However, if you have more than seven or eight tabs open they become pretty much useless. And tabs don’t work well if you use them with heterogeneous information. They’re a good solution to keep the screen tidy for the moment. And that’s just what they should continue doing.

My Solution
My browsing always starts with a few tabs – my RSS reader, two news websites and Facebook. While browsing from each of these tabs I keep opening new tabs, reading them and closing some of them. Before I know it, I have more than 25 tabs.

Browsing Patterns

Something very similar happens when I am doing some research. I will start with a google search tab. From the results I will open 5 new tabs. Then I will open a few fresh tabs to try searching for new keywords I learnt from my earlier results.

So using the same tree-like structure, we will now have two levels of tabs. The mockup will make it clearer:

Mockup for Tabs

We have two levels of tabs. When you open a new blank tab, a first level tab is always opened. Whenever you open a link in a new tab from a top level tab, a new second level tab (subtab) is created within it. So let’s say I start searching for “Firefox reviews” on Google. That’s the top level tab. As I open each of the results, each will be opened in a new subtab.

So the assumption we are using is that all tabs opened from a tab are related and can be clubbed under the initial tab.

Note: Let’s say we currently have just one maintab with a Google search. When I open my first Google result in a new tab, two subtabs will be created: the first being the Google results listing (the content of the maintab) and the second being the new subtab. All new tabs I open from any of the subtabs will remain under the maintab, which is the Google results listing.

First Subtab

Q1: What happens for Newbies?
The subtabs are turned off by default. The first time a user opens more than 15 simultaneous tabs, we prompt him to enable subtabs and show him a preview. Otherwise he can enable them from settings. Once enabled, they remain enabled across upgrades etc.

Q2: What happens when I close the first tab which started it all (google results page)?
Since it is now a subtab, it is closed, while the maintab itself and its name are retained as they are.

Q3: Why subtabs at the top?
Width is a valuable resource, so we can’t put them on the left (one solution on the Mozilla site does put them on the left). The bottom doesn’t make sense, as closely related browsing elements should be near each other. Instinctively, my mouse goes up to change tabs, so this continues with that convention. Users who switch to subtabs later will also find the shift easier.

Q4: How do I create maintabs?
By new tab (Ctrl+T) or right click – open in main tab.

Q5: When I go from main tab 1 to main tab 2 and then come back to 1, what do I see?
The same as what you were seeing earlier, i.e. the subtab that was open does not change.

Q6: New tabs created by opening links from a subtab?
These new tabs are created as subtabs within the same higher level maintab.
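The rules from the note and the answers above can be written down as a tiny data model. This is only a sketch in Python to check that the rules hang together, not a browser implementation. The class and method names (MainTab, open_link, close_subtab) are invented for illustration, and opening new subtabs in the background is my assumption.

```python
class MainTab:
    """A first level tab that groups every tab opened from it."""

    def __init__(self, title, url):
        self.title = title    # name of the maintab, kept even if its first page is closed
        self.url = url        # page shown while there are no subtabs
        self.subtabs = []     # URLs of the second level tabs
        self.active = None    # index of the subtab currently shown, if any

    def open_link(self, url):
        """Open a link in a new tab, from the maintab or from any of its subtabs."""
        if not self.subtabs:
            # First link: the original page becomes the first subtab (see the note).
            self.subtabs.append(self.url)
            self.active = 0
        # Assumed: the new subtab opens in the background, so active is unchanged.
        self.subtabs.append(url)

    def close_subtab(self, index):
        """Close one subtab; the maintab and its title stay as they are (Q2)."""
        del self.subtabs[index]
        if not self.subtabs:
            self.active = None
        elif index < self.active or self.active >= len(self.subtabs):
            self.active -= 1


tab = MainTab("Firefox reviews", "google.com/search?q=firefox+reviews")
tab.open_link("review-site-1")
tab.open_link("review-site-2")
tab.close_subtab(0)
print(tab.title, tab.subtabs)
```

Because each maintab remembers its own active subtab, switching between maintabs and coming back shows the same subtab as before, which is the behaviour described in Q5.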

“This work is licensed under a Creative Commons Attribution 3.0 United States License (http://creativecommons.org/licenses/by/3.0/us/)”

So you are free to copy, modify, reuse etc etc (I don’t really understand these licenses; from my side it’s as good as ours, so iterate over it, make it better and submit it in your name).

So let’s start the questioning, bashing and proposing alternate solutions :).

Overview: I will go over sharding basics and how to overcome problems like calculating ranks and inbox.

Sharding is splitting your database into partitions and keeping each of the partitions on different servers.
The following figure will make it more clear.

Database Sharding Basic Overview

What we did is, instead of keeping the details of all users on one database server, we kept the users with odd userids on one server and the others on a second server. Let’s say our system allowed us to make only 1000 queries on the user table; we can now make 2000 (1000 on each server, more or less). You can split your data into any number of shards and it scales horizontally.

Q1: How many shards (partitions)?
The main effort in sharding is upfront, while doing the partitions. So it’s better to plan for the next 6 months or so. Let’s say you have only 2 servers: you can partition into 10 shards and place 5 of them on one server and the other 5 on the second. Later on, if you add one more server, you can move 2 shards each from the first and second servers onto the third one. You might ask why 10. Well, like many other decisions in a startup, this is more about gut feel than anything else.
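The 10-shards-on-2-servers idea boils down to a lookup table from shard to server, which is what makes adding a server cheap. A minimal sketch in Python (chosen just for illustration), where the server names db1, db2, db3 and the helper move_shards are made up:

```python
from collections import Counter

NUM_SHARDS = 10

# Initial layout: shards 0-4 on the first server, 5-9 on the second.
shard_to_server = {
    shard: ("db1" if shard < 5 else "db2") for shard in range(NUM_SHARDS)
}


def move_shards(mapping, shards, target):
    """Return a new mapping with the given shards reassigned to the target server."""
    updated = dict(mapping)
    for shard in shards:
        updated[shard] = target
    return updated


# A third server arrives: take two shards from each existing server.
rebalanced = move_shards(shard_to_server, [3, 4, 8, 9], "db3")
print(Counter(rebalanced.values()))
```

Only the table changes when a server is added. Which shard a user belongs to stays the same, so no user has to be re-partitioned.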

Now, how to implement sharding in your code? There are a number of options like MySQL Proxy, but the easiest method (both in effort and maintenance) is to change your database API layer (the class in your codebase which handles the actual query execution, maintaining connections etc) and pass the userid as an extra parameter. The database layer can determine the server and database name to use based on the userid. So:

apps_mysql_query(query) => apps_mysql_query(userid, query)
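As a sketch of what the changed signature might do internally, here is a Python version (for illustration only) that resolves a userid to a server and database. Picking the shard with userid modulo the shard count generalises the odd/even split and is my assumption, as are the host names, the database naming scheme and the helper locate. The function returns the routing decision instead of opening a real connection.

```python
NUM_SHARDS = 10

# Hypothetical layout: which server hosts each shard.
SHARD_SERVERS = {
    shard: ("db1.example.internal" if shard < 5 else "db2.example.internal")
    for shard in range(NUM_SHARDS)
}


def locate(userid):
    """Map a userid to (server, database name) using userid modulo the shard count."""
    shard = userid % NUM_SHARDS
    return SHARD_SERVERS[shard], f"app_shard_{shard}"


def apps_mysql_query(userid, query):
    """Decide where the query should run.

    A real database layer would execute the query on a connection to that
    server and database; this sketch only returns the routing decision.
    """
    server, database = locate(userid)
    return {"server": server, "database": database, "query": query}


print(apps_mysql_query(7, "SELECT name FROM user WHERE userid = 7"))
```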

The biggest issue with sharding is how to compute something at a global level. Let’s say in the above example, each user had a score. You want to know the top 25 leaders. Earlier it was one simple query:
“SELECT userid, name, photo FROM user WHERE user = 'active' ORDER BY totalscore DESC LIMIT 25”
Since now you have 10 different shards, you need to execute the same query on 10 different shards, merge the results and select the top 25 from this combined list. Downright ugly!
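For reference, the merge step of this scatter-and-merge approach looks roughly like the Python sketch below. It assumes each shard has already returned its own top rows as dictionaries with a totalscore field. The function name top_n_global and the sample rows are made up.

```python
import heapq
from itertools import chain


def top_n_global(per_shard_rows, n):
    """Merge the per-shard result lists and keep the n highest scores overall."""
    all_rows = chain.from_iterable(per_shard_rows)
    return heapq.nlargest(n, all_rows, key=lambda row: row["totalscore"])


# Toy data: the rows each of two shards returned for its own top list.
shard_results = [
    [{"userid": 1, "totalscore": 50}, {"userid": 3, "totalscore": 10}],
    [{"userid": 2, "totalscore": 40}, {"userid": 4, "totalscore": 30}],
]

leaders = top_n_global(shard_results, 3)
print([row["userid"] for row in leaders])
```

The merge itself is cheap. The ugly part is the one query per shard that has to run before it on every leaderboard view.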

To get around this problem, maintain two snapshots of the database: one sharded version and one complete version (called the central repository). So what we now have is:

Sharding - Central Repository Model

Q2: Doesn’t this take away all advantage we had with sharding?
Not really. Actually, far from it. You only need to keep in the central repository those tables on which you need to perform global queries. So the User table is one. Tables like “User Permissions” or “User Actions Done” don’t need to be kept in the central repository. Now why have a central repository for the user table at all? Things like a user’s complete details for the profile page, or the name and picture to show, will all go to the corresponding user shard. Only when you want to show things like the leaderboard will you query the central repository.

We need to change our database API so that if the userid sent is null, it queries the central repository.

Now, we come to the second problem. Let’s say user 1 has sent a message to user 2. Do we keep it on user 1’s shard or user 2’s shard? If we keep it on only one of them, do we query all shards to show the messages received by user 2? The answer is, keep it on both shards. The one extra insert is much more efficient than the 10 selects you would have to do each time a user wants to see the messages he has received. But doesn’t that increase data size? Isn’t disk space close to free :).

So how does our database api change to reflect the above:

apps_mysql_query(userid, query) => apps_mysql_query(userid1, userid2, query)

In case of a write query, if userid2 is null, it will just write to the shard of userid1, but if userid2 is not null, it will write to both users’ shards. In case of a read, userid2 is ignored.
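Putting the routing rules together (null userid goes to the central repository, writes with two userids go to both shards, reads ignore userid2), a minimal Python sketch could look like this. The names query_targets and shard_name, the userid-modulo shard choice and the is_write flag are assumptions made for the example. Skipping the second write when both users live on the same shard is also my addition.

```python
NUM_SHARDS = 10
CENTRAL = "central"


def shard_name(userid):
    """Name of the shard holding this user (assumed: userid modulo shard count)."""
    return f"shard{userid % NUM_SHARDS}"


def query_targets(userid1, userid2=None, is_write=False):
    """Return the list of databases a query has to run on."""
    if userid1 is None:
        # No user given: this is a global query, so use the central repository.
        return [CENTRAL]
    targets = [shard_name(userid1)]
    if is_write and userid2 is not None:
        other = shard_name(userid2)
        # Assumed: one write is enough if both users share a shard.
        if other not in targets:
            targets.append(other)
    return targets


print(query_targets(1, 2, is_write=True))   # message from user 1 to user 2
print(query_targets(2))                     # user 2 reads his inbox
print(query_targets(None))                  # leaderboard
```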

So that’s it. Let me know in the comments if you have used an alternative strategy or have any feedback on the current system.

pain point:

i use my google search box as a shortcut to the web. at least 95% of my queries are for things i know where to find and might have even used in the past, but i am too lazy to go to the site and browse to that link. lets take a simple example: i want to see the imdb rating of dark knight so that i can decide whether i should buy the dvd or not.

case 1:

  • go to imdb.com
  • search for dark knight
  • click on the search results and open the movie page

case 2:

  • search for “imdb dark knight” in my search box in browser
  • click on the search results and open the movie page

so the normal method is 3 steps. using google search is 2 steps. can we make it 1?

solution:

enter split search.

the above is what you see when you search for “imdb dark knight” on split search. the screen is split into two columns with left hand side showing the actual search results and right side opens up the first result so that you can directly see the info you want. so 1 step search now :).

how to use:

click here and try out the search. if you like it, you can directly add it to your browser (almost all supported) by clicking on “add to browser”.

technologies used:

since my school days i have been doing a lot of coding and building things, but the lifespan of each was till the next time i formatted my computer. one of my friends gave me the idea that i should start organizing all this stuff instead of just letting it go down the drain.

so here we go :).

to answer in two lines: i will be writing a blog post each time i build something. some info on architecture and tools used.

to bootstrap the blog, over the next week or so, i will be adding blog posts on things which i have built since december 2008. also i might do guest author posts if any friend of mine has built something really interesting and wants to share :).
