Steven You

A Forged Geek.

An Alternative to Symbolic Links: mount --bind

Symbolic links work well in most cases, but a symlink is not the same as a “real” file path. For example, nginx can be configured not to follow symlinks. Hard links have their own limitations; for instance, you cannot hard-link a directory.

The alternative is using:

$ sudo mount --bind /source_path /dest_path

What it does is:

Remount a subtree somewhere else (so that its contents are available in both places)

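The main thing to keep in mind is that you shouldn’t rm -rf the dest folder while it is mounted, because that would also delete the files under the source path. Unmount it first with sudo umount /dest_path.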
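One way to guard against deleting a mounted folder by accident is to check whether the path is currently a mount point. The sketch below is only an illustration, assuming a Linux system where /proc/self/mountinfo is available; the function names are made up for this example. It parses the mount table directly because os.path.ismount is documented as unable to reliably detect bind mounts on the same filesystem.

```python
import os


def mount_points(mountinfo_text):
    """Collect mount points from text in /proc/self/mountinfo format."""
    points = set()
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 5:
            # The fifth field is the mount point; spaces are escaped as \040.
            points.add(fields[4].replace("\\040", " "))
    return points


def is_mount_point(path, mountinfo_text):
    """Return True if path appears as a mount point in the given mount table."""
    return os.path.abspath(path) in mount_points(mountinfo_text)


def read_mountinfo():
    """Read the live mount table (Linux only)."""
    with open("/proc/self/mountinfo") as f:
        return f.read()
```

On a Linux machine, is_mount_point("/dest_path", read_mountinfo()) would be the check to run before removing the folder.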

Interview Questions - Programming

Tomorrow will be the first day of my new job. Since graduating, I have gone through quite a large number of articles on interview questions and programming tricks. I am going to list some interesting ones.

1. Check if at least 2 out of 3 booleans are true

This one comes from StackOverflow. I like to solve simple problems in one line, which is:

return a ? (b || c) : (b && c);

or

return a && (b || c) || (b && c);

or, here is some awesomeness:

return a ^ b ? c : a;
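To convince yourself that the three one-liners are equivalent, you can check them against the full truth table. The following is a small Python sketch of that check; the function names are mine, and the expressions are direct translations of the versions above.

```python
from itertools import product


def ternary(a, b, c):
    # a ? (b || c) : (b && c)
    return (b or c) if a else (b and c)


def and_or(a, b, c):
    # a && (b || c) || (b && c)
    return a and (b or c) or (b and c)


def xor(a, b, c):
    # a ^ b ? c : a
    # If a and b differ, exactly one of them is true, so c decides.
    # If they are equal, the answer is whatever a (and b) is.
    return c if a ^ b else a


def majority(a, b, c):
    # Reference definition: at least two of the three are true.
    return a + b + c >= 2


# Compare every one-liner with the reference on all 8 input combinations.
all_agree = all(
    f(a, b, c) == majority(a, b, c)
    for a, b, c in product([False, True], repeat=3)
    for f in (ternary, and_or, xor)
)
```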

Vertically Scaling Web Applications Using Non-blocking I/O

I have been working on web application scaling for a while. There is a lot of interesting stuff. I will cover vertical scaling first.

Non-blocking I/O

Blocking I/O means that program execution is put on hold while the I/O is going on: the program waits until the I/O is finished and then continues its execution.

With non-blocking I/O, however, the program can continue during I/O operations and is notified via a callback when an I/O operation finishes. This forces programmers to design programs differently, which can make them perform a lot better.

Non-blocking I/O is supported by most operating systems. On Windows there is underlying OS support for non-blocking I/O, and Microsoft’s CLR (Common Language Runtime, the virtual machine component of the .NET Framework) takes advantage of it.
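As a rough illustration of the idea, here is a minimal Python sketch using asyncio (Python 3.7 or later). The fake_io coroutine is a made-up stand-in: asyncio.sleep plays the role of a network or disk wait. Three waits are started together, a callback records each one as it finishes, and the total time stays close to one wait rather than three.

```python
import asyncio
import time


async def fake_io(name, delay):
    # asyncio.sleep stands in for real I/O; while it waits,
    # the event loop is free to run other tasks.
    await asyncio.sleep(delay)
    return name


async def main():
    finished = []
    tasks = [asyncio.ensure_future(fake_io(n, 0.2)) for n in ("a", "b", "c")]
    for task in tasks:
        # The callback is invoked once the task's I/O has completed.
        task.add_done_callback(lambda fut: finished.append(fut.result()))
    await asyncio.gather(*tasks)
    return finished


start = time.perf_counter()
finished = asyncio.run(main())
elapsed = time.perf_counter() - start
```

Run one after another with blocking calls, the same three waits would take at least 0.6 seconds.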

Non-blocking I/O for web applications

The implementation of non-blocking I/O contributed to the success of projects like node.js.

Common web stacks such as Java Servlets and the Apache web server have what I consider a design flaw: they traditionally dedicate a thread to each request, which adds a lot of overhead and consumes a lot of memory and CPU because threads are expensive. Non-blocking I/O is smart because a pending I/O connection consumes very few resources. By contrast, if every request is handled by its own thread, the underlying OS spends more and more of its time switching between threads and checking their status.

Here is a good slide deck describing node.js:

Download YouTube Videos Using Python

I searched around for a simple way to download YouTube videos. However, Google doesn’t provide an API for downloading, and the YouTube page structure seems to have changed a lot, so some old posts about downloading YouTube videos with a one-line Python script no longer work.

Finally I found an actively maintained repository on GitHub and forked it: python-youtube-download. To make it simpler to use, so that you don’t have to select the resolution and specify the format, I added a main script that lets it run in one line:

python youtube.py "http://www.youtube.com/watch?v=6bXOOz8mN6U"

Concurrent Tasks Execution in Python

Some tasks need to be done with multiple threads; e.g., I need to request thousands of URLs in order to train a collaborative filtering service. This can easily be done in Python.

First way: manage the threads yourself

I have a repo on GitHub, Tumblr Image Downloader, which batch-downloads images from a Tumblr blog using the Tumblr API.

Basically, there is a task queue and a worker.

What the download_img function does is get the image URL and save it to save_path.

The program calls the download_imgs function.
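The queue-and-worker pattern described here can be sketched roughly as follows. This is not the code from the repository; it is a minimal illustration that reuses the names download_img and download_imgs, with a stand-in download_img that only computes the target path instead of touching the network. The worker function, the None sentinel, and the num_workers parameter are assumptions made for the example.

```python
import queue
import threading


def download_img(url, save_path):
    # Stand-in for the real download: a real version would fetch the
    # URL and write the bytes to disk. Here we only build the target path.
    filename = url.rsplit("/", 1)[-1]
    return f"{save_path}/{filename}"


def worker(tasks, results, lock):
    # Each worker keeps pulling tasks until it sees the None sentinel.
    while True:
        item = tasks.get()
        if item is None:
            tasks.task_done()
            break
        url, save_path = item
        try:
            saved = download_img(url, save_path)
            with lock:
                results.append(saved)
        finally:
            tasks.task_done()


def download_imgs(urls, save_path, num_workers=4):
    tasks = queue.Queue()
    results = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(tasks, results, lock))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for url in urls:
        tasks.put((url, save_path))
    for _ in threads:
        tasks.put(None)  # one sentinel per worker so every thread exits
    tasks.join()
    for t in threads:
        t.join()
    return results
```

Even in this small form there is a fair amount of bookkeeping: starting threads, signalling shutdown, and waiting for completion all have to be written by hand.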

Better and simpler way: use the concurrent.futures module

PEP 3148 gives the motivation for this module:

Python currently has powerful primitives to construct multi-threaded and multi-process applications but parallelizing simple operations requires a lot of work i.e. explicitly launching processes/threads, constructing a work/results queue, and waiting for completion or some other termination condition (e.g. failure, timeout). It is also difficult to design an application with a global process/thread limit when each component invents its own parallel execution strategy.

This module makes life easier. It has been part of the standard library since Python 3.2, and a backport is available for older versions. There are two types of executor: ThreadPoolExecutor and ProcessPoolExecutor.

ThreadPoolExecutor is the natural fit for the URL-requesting task above, since that work is I/O-bound.
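A minimal sketch of how ThreadPoolExecutor could be used for a batch of URLs is shown below. The fetch function is a hypothetical stand-in that does no network access (a real version might call urllib.request.urlopen), and fetch_all and max_workers are names chosen for this example; the executor calls themselves (submit, as_completed, result) are the standard concurrent.futures API.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch(url):
    # Hypothetical stand-in for a real HTTP request; it returns the
    # length of the URL so the example runs without network access.
    return len(url)


def fetch_all(urls, max_workers=8):
    results = {}
    # The with-block waits for all submitted work before exiting.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_url = {executor.submit(fetch, url): url for url in urls}
        # as_completed yields each future as soon as it finishes.
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                # Keep the exception so one failure does not stop the batch.
                results[url] = exc
    return results
```

The pool size, the result collection, and the shutdown are all handled by the executor, which is the work the hand-written queue and worker threads had to do explicitly.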

-EOF-

Switching to ZSH

Following Mako, and after reading some posts about the benefits of using ZSH, I finally ran the command to jump into the zsh world:

curl -L https://github.com/robbyrussell/oh-my-zsh/raw/master/tools/install.sh | sh

The good thing

It works pretty well. It “acts extremely similar to bash”, but not always, which I will explain in the painful part. It does typo correction, which is very helpful to careless typists like me.

The pain during the switch

After playing around with zsh, I was going to spread the word. I use Octopress for blogging, and I was astonished to find that zsh shows the git status of the folder in the prompt.

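However, when I used rake to create a new post, it showed zsh: no matches found: new_post[Switching to ZSH]. I thought it was a Ruby gem version problem, as I had installed something for Red Hat’s OpenShift. But it wasn’t.

The reason I found is that zsh doesn’t know about the RVM function; RVM needs to be loaded into the shell session as a function. Note that the “no matches found” message itself comes from zsh treating the square brackets as a glob pattern, so quoting the argument, as in rake "new_post[Switching to ZSH]", or prefixing the command with noglob also avoids it.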

Appending

[[ -s "$HOME/.rvm/scripts/rvm" ]] && . "$HOME/.rvm/scripts/rvm"  # This loads RVM

to .zshrc fixes the problem.

Investigate more

Go through the zsh user guide, enjoy!

Playing Around With Django

I recently worked on a simple repo: a diary site built with Django-nonrel, which can use Google App Engine authentication.

GitHub: https://github.com/GoSteven/Diary

Django-nonrel contains all the major functionality of Django and can host a pure Django application on Google App Engine. It just works!