## Wednesday, May 28, 2014

### Fix Mid 2009 MBP RAM not recognized issue

Possible cause of unrecoganized RAM

A few days ago, I encountered an issue which seems to be common among mid-2009 MBPs: one of the RAM (in slot 1) is not recognized anymore. Or, sometimes it is recognized, but after sleep and wake up, the computer freezes and impossible to recover but force power off.

It turns out this is a common issue in this model. See the discussion in this thread and this thread. In the following, I am going to present my temporary fix for this problem. For those of you that still want to stick to the old MBP, the fix shall last for a while. But I do recomend you backup all the files and prepare to migrate some day soon.

As I suggested in my reply, this may due to a RAM slot degradation. My guess is, the RAM slot cannot align the RAM to a correct contact positions any more. Precisely, see the post image above. Notice that the two clips on the left and right are used to hold the rams in a horizontal position, otherwise they will bend upwards. I took a close look at those clips and found that the plastic wore out and cannot hold them as original. I don't have great ways to fix them, so I just cropped some papers and insert them between the RAMs and the back edges on the body, hoping they can help tucking the RAMs.

As I am fixing it, I accidentally broke the left holder. So I have to customized a plastic holder and stuck it with the logic board to hold the inner ram (slot 1).

To verify my theory, I then taped a padding at the corresponding RAM position in the back cover:

And they look like the following before I close:

Note that you have to screw it real tight to create the pressure such that the RAM is aligned. That said, there will be a 'bump' at that position, and will easily cause scratch. So I use an apple sticker to cover my ass:

This method worked for me, well, at 99% of the time. Sometimes after sleep, the MBP still won't wake up. I notice that this usually due to running the MBP for a long time, and it's hot inside. Nevertheless, this is the best solution I can come up with by now. If you have any other cheap solution that does not require replacing the logic board, please let me know in the comments.

Finally, Zach Clawson created a dedicated page for this issue, which lists lots of reference and provides explanation to it. Make sure you check it out if you have encountered similar issue.

There are other common issues for this model, and they can be easily fixed. See my following posts:

If you have similar experience, do not hesitate to let me know. If you find my instruction helpful, leave a comment and share it!

via

### Replacing SATA cable in Mid 2009 MBP

Replacing MBP cable

Recently, my mid 2009 MBP (Model A1278) fails to recognize the hard drive. My first bet was another disk failure on me, but it was not the case. I took down the hard drive and put it to a mobile hard drive case and it can be read smoothly.

It turns out that it is due to the SATA cable fault, which is a notorious problem for mid 2009 MBP model. See the threads and discussions here, here, here and here.

Luckily, the solution is simple, just go ahead and purchase a replacement cable and replace it. iFixit has a very detailed illustrated document on the procedures. However, Amazon has a cheaper option, and it works fine for me. I didn't test the infra-red though, since I never and not plan to use it.

There are other common issues for this model, and they can be easily fixed. See my following posts:

If you have similar experience, do not hesitate to let me know. If you find my instruction helpful, leave a comment and share it!

via

## Friday, May 23, 2014

### My Octopress Blogging Flow

Writing in Sublime Text and previewing in Marked

After blogging with Octopress for a while, I have already gained some insights on it, and my publishing flow has been smoother. I think it is right time to share my flow as a reference.

## The Flow

The following sections outlines the flow. The last section contains assorted tips and tricks. For the basic configuration of Octopress, please refer to the official website. I also recommend installing Alfred.app.

### Creating a post: alfred workflow (blog publish)

Artem Yakimenko created an awesome alfred workflow for publishing and generating octopress websites. Use blog publish [title] to create a new post:

It then opens the post in your specified text editor with template.

### Editing: Sublime Text.app + Marked.app + Marked App Menu (ST Plugin)

I use Sublime Text (ST) as my text editor, because it provides VIM keybinding and there is a huge repository of plugins. While editing, I use [Marked] to instantly render the markdown file and view the result. In fact, the title image shows my editing and previewing in action.

To make the process sweeter, there is a ST plugin called Marked App Menu that allows you to open the current file in Marked. Just search in ST Package Control to install it.

### Previewing: pow + rake watch + pow alfred workflow

To preview the generate website, simply install pow and execute rake watch under octopress directory to monitor the change. Octopress official website provides some explanation. After installation, you can view your website locally at http://octopress.dev.

You should also install the pow alfred workflow, which can help you open pow website in a breeze.

### Math Rendering: MathJax

Since I am an EECS student, I write a lot of Optimization, Machine Learning and Computer Vision stuff, which relies heavily on mathematics. Thus, writing math formulas is a must for me. MathJax is a javascript library for rendering math by writing LaTeX math. To do this, one needs to configure the site and link to the library. Put these two lines into <octopress>/source/_includes/custom/head.html:

 1 2 3 4 5   

The first script block loads MathJax, and the second loads a custom configuration in source/javascripts/MathJaxLocal.js. It is a good place to write your own macro there. For instance:

 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15  MathJax.Hub.Config({ TeX: { equationNumbers: { autoNumber: "AMS" }, TagSide: "left", Macros: { reals: ['\\mathbb{R}'], complex: ['\\mathbb{C}'], norm: ['\\left\\lVert#1\\right\\rVert', 1], given: ['\\:#1\\vert\\:', 1], data: ['\\mathcal{D}'], } } }); MathJax.Ajax.loadComplete("/javascripts/MathJaxLocal.js"); 

Now you can write math!

$e^{i \pi} + 1 = 0$

There are a couple of good articles for reference:

### Image hosting: Dropbox public folder

I host my images in Dropbox Public folder, since you can simply copy the public link and paste it to the post source, for example:

### Image editing: Skitch + OmniGraffle

The previous image is done by a timed capture from skitch. For advanced vector graphics, I use OmniGraffle.

### Markdown converter: pandoc

Pandoc is a swiss-army knife like tool that convert documents in multiple formats to several dozens of output formats. I mainly use it as the markdown converter for Octopress. A plugin can help you with that.

After installation, I update the markdown section of _config.yml with the following:

 1 2 3 4 5 6 7 8  markdown: pandoc pandoc: format: html5 extensions: - smart - mathjax - bibliography: source/blog.bib - csl: _style/ieee.csl 

which tells Octopress to use pandoc, and pass the option smart, mathjax and use the style file ieee.csl to format the biliography blog.bib. For example, refer to [@xiao2013optimally] generates refer to [1] (scroll down to see the References section).

### Pandoc's Markdown Reference: Dash.app and Dash alfred workflow

Dash.app is an API Documentation Browser and Code Snippet Manager. It provides an convinient alfred workflow that searches the documents:

opens

### Tips and tricks

Continuously growing…

• There are lots of undocumented/less documented things in Octopress, which can help you write blog posts kinda 'programmatically'. For example, http://bit.ly/1om3icj returns the url of site, which is http://bit.ly/1om3icj in my case. In fact, anything in _config.yml is a variable under site.

## Conclusion

In conclusion, Octopress is a revolutionary blogging framework. It provides a robust static site building framework (jekyll, bootstrap, scss, etc.) and allows complete control over the source, which is perfect for users that have basic coding and source control skills. In fact, it gives me a similar feeling of getting touch with a Mac. That is, compared to Windows, which is too close and does not provide built-in programming-friendly environment (Console, UNIX stuff, etc.), and compared to Linux, which is very open but too many variations and too many customizations needed, it combines their advantages by presenting a user- friendly interface and provides all sorts of underlying UNIX tools. I am very satisfied about this and my intention to write posts have revived. However, some sort of basic configuration is still needed. In particular, I would say Mathjax rendering and better image support definitely need to be integrated in the next release.

## References

[1] Z. Xiao, Y. Du, H. Tian, and M. D. Wong, "Optimally minimizing overlay violation in self-aligned double patterning decomposition for row-based standard cell layout in polynomial time," in Computer-aided design (iCCAD), 2013 iEEE/aCM international conference on, 2013, pp. 32–39.

via

## Thursday, May 22, 2014

### Sparse Gradient Image Reconstruction via L1-minimization

 Original Minimum Energy Reconstruction Sparse Reconstruction

## Introduction

This is a follow up of the L1-minimization series. The previous two posts are:

We have explored using L1-minimization technique to recover a sparse signal. The example shows a 1D example. This post demonsrates on a 2D example, where the image is viewed as a signal. This makes sense as we can perform 2D Fourier Transform in the image, where the basis are a combination of horizontal and vertical waves. For a complete introduction to FFT on images, refer to this tutorial. Notice that similar to 1D signal, we do not measure the image directly in time domain, but we do it in the frequency domain. Concretely, say $x$ is the 2D image collapsed to 1D, and $A \in \reals^{k\times n}$ is the measurement matrix, $b$ is the observation, we then have $Ax=b$. Usually we will require $k = n$ to obtain an exact solution for $x$ given $A$ and $b$. Now, if we use FFT and obtain the frequency coefficients as $\hat{x}$, we can also perform similar measurements $\hat{A} \hat{x} = \hat{b}$, and the requirement $k = n$ is the same. In other words, the required samples (the information) is the same. By using the inverse fourier transform, we can convert $\hat{x}$ back to $x$. The only difference is that the measurement $\hat{A}$ is taken in frequency (Fourier) domain. As we can see later, we can utilize sparse information to reduce $k$.

## Image Gradients and Total Variation

We first introduct the concept of image gradients. For any 2D real image I, if we think about each row as a signal, we can then view the 'difference' between adjacent pixels as (horizontal) gradient Gx(I), this makes sense since a sharpe change denotes an edge. Similary, we can define the vertical gradient Gy(I) for columns. Thus, we have

$Gx(I) = \begin{cases} I_{i+1, j} - I_{ij} & i < n \\ 0 & i = n \end{cases} \qquad Gy(I) = \begin{cases} I_{i, j+1} - I_{ij} & j < n \\ 0 & j = n \end{cases}$

where the image size is $n\times n$.

Collectively, the image gradient G(I) is defined as the magnitude (2-norm) of both components:

$G(I)_{ij} = \sqrt{(Gx(I)_{ij})^2 + (Gy(I)_{ij})^2}$

The following shows Gx, Gy and G of the phantom image:

 gx(I) Gy(I) G(I)

The total variation TV(I) of an image is just the sum of this discrete gradient at every point.

$TV(I)= \norm{G(I)}_1 = \sum_{i,j} G(I)_{ij}$

We notice that $TV(I)$ is just the L1-norm of $G(I)$, which leads us to the following: if we have an image that is sparse in its image gradients, we can exploit that and use our L1-minimization trick.

The ratio of non-zero elements in Gx, Gy and G of the phantom image is 0.0711, 0.0634 and 0.0769, respectively. These ratios are really small - and we consider the gradient as sparse.

Let $F: \reals^{n\times n} -to \complex^{n\times n}$ be the FFT operator, and $F I$ be the Fourier transform taken on image I. Define a set $\Omega$ as the $k$ two-dimensional frequencies chosen according to some sampling pattern from the $n \times n$. We further define $F_\Omega I: \reals^{n \times n} \to \complex^k$ as the $k$ observation taken from the fourier transform of image I. We can then solve the following optimization problem to recover $I$:

$\min_I \norm{F_\Omega I - b}^2_2$

where $F_\Omega$ can be view as the measurement matrix, $b$ is the observation, and we want to find $I$ such that the reconstruction cost (energy) is minimized.

However, the above does not quite work. As we can see in the following images, the L2-minimization does a poor job, either for a random measurement or a radial measurement [4] in Fourier domain.

 Random measurement L2-minimization L1-minimization

To utilize the sparse information, we add a L1-regularization term to the above objective function, which yields the following:

$(TV_1) \quad \min_I \norm{F_\Omega I - b}^2_2 + \lambda TV(I)$

Without surprise, optimizing the above gives us a perfect reconstruction of the original image. It is shown that if there exists a piecewise constant I with sufficiently few edges (i.e., $G(I)_{ij}$ is nonzero for only a small number of indices i, j), then $(TV_1)$ will recover I exactly.

A heavily commented code example is available in my github repository. Leave a comment if you have any question.

## Probing Further

Now, take a look at another example cameraman, which has the following gradients (intensity rescaled using matlab's imagesc.

The following shows the reconstructions (left two are using random measurements, right two are using radial measurements).

As we can see, the results are not as good. In fact, the non-zero ratio of its gradient is 0.9928, which is not sparse at all. However, if we plot the histogram of gradients, we will find that most of the gradient magnitudes are small:

In particular, most of them are smaller than 200, which means the number of 'changes' that are larger than 200 is small. In fact, the ratio of gradient > 200 is only 0.0964! Thus, there are two possible ways to discard these information and get a 'compressed' image that is sparse in gradients:

1. Use mean-shift algorithm to segment the regions such that they have the same color intensities. K-means or quantization should achieve a similar result, though might not as good as mean-shift.
2. Use image filtering to smooth the image, which can effectively average colors and discard high frequency information.

I'll leave these conjectures for furture implementation. For those intereted, please try them yourself and let me know your results. If you have any thoughts, do not hesitate to leave a comment.

## References

[1] E. Candes and J. Romberg, "l1-magic: Recovery of sparse signals via convex programming," vol. 4, 2005.

[2] J. S. Hesthaven, K. Chowdhary, E. Walsh, and others, "Sparse gradient image reconstruction from incomplete fourier measurements and prior edge information," IEEE TRANSACTIONS ON IMAGE PROCESSING, 2012.

[3] J. K. Pant, W.-S. Lu, and A. Antoniou, "A new algorithm for compressive sensing based on total-variation norm," in Circuits and systems (iSCAS), 2013 iEEE international symposium on, 2013, pp. 1352–1355.

[4] E. J. Candès, J. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," Information Theory, IEEE Transactions on, vol. 52, no. 2, pp. 489–509, 2006.

via

## Monday, May 19, 2014

### Sparse Signal Reconstruction via L1-minimization

This is a follow-up of the previous post on applications of L1 minimization.

As we know, any signal can be decomposed into a linear combination of basis, and the most famous one is Fourier Transform. For simplicity, let's assume that we have a signal that is a superposition of some sinusoids. For example, the following:

 1  .5*sin(3*x).*cos(.1*x)+sin(1.3*x).*sin(x)-.7*sin(.5*x).*cos(2.3*x).*cos(x); 

With discrete consine transform (DCT), we can easily find the coefficients of corresponding sinusoid components. The above example's coefficients (in frequency domain) and signal in time domain are shown in the post figure.

Now, let's assume we do not know the signal and want to reconstruct it by sampling. Theorectically, the number of samples required is at least two times the signal frequency, according to the famous Nyquist–Shannon sampling theorem.

However, this assume zero-knowledge about the signal. If we know some structure of the signal, e.g., the DCT coefficients are sparse in our case, we can further reduce the number of samples required.1

The following code snippet demonstrates how this works. We generate the original signal in time domain and then perform a DCT to obtain the coefficients.

 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18  % sparse signal recovery using L1 rng(0); N = 256; R = 3; C = 2; % some superposition of sinoisoids, feel free to change and experiment f = @(x) .5*sin(3*x).*cos(.1*x)+sin(1.3*x).*sin(x)-.7*sin(.5*x).*cos(2.3*x).*cos(x); x = linspace(-10*pi, 10*pi, N); y = f(x); subplot(R,C,1); coef = dct(y)'; stem(coef); xlim([0 N]); title('Original signal in frequency domain'); subplot(R,C,2); plot(x,y); xlim([min(x) max(x)]); title('Original signal in time domain'); 

Let's assume that we have a device that can sample from the frequency domain. To do this, we create a random measurement matrix to obtain the samples. We use 80 samples here. Note that we normalize the measurement matrix to have orthonormal basis, i.e., the norm of each row is 1, and the dot product of different row is 0.

 1 2 3 4 5 6 7  % measurement matrix K=80; A=randn(K, N); A=orth(A')'; % observations b=A*coef; 

We first try a least-square approach, which boils down to inverse the matrix and obtain $\hat{x}=A^{-1} b$. Note that as A is not square, we are using its pseudo-inverse here. Furthermore, as A is othornormal, its transpose is the same as pseudo-inverse.

 1 2 3 4 5 6 7 8 9 10 11  % min-energy observations c0 = A'*b; % A' = pinv(A) here since A is a full-rank orthonormal matrix subplot(R,C,3); stem(c0); xlim([0 N]); title('Minimum energy recovery - coef'); subplot(R,C,4); y0 = idct(c0, N); plot(1:N, y0,'r', 1:N, y, 'b'); xlim([0 N]); title('Minimum energy recovery - signal'); legend('Recovered', 'Original'); 

As we can see, there are lots of non-zeros in the coefficients, and the recovered signal is very different from the original signal.

Finally, we use L1-minimization for reconstruction. I used lasso to perform a L1-regualarized minimization. Another package that performs various L1-minimization is l1-magic.

 1 2 3 4 5 6 7 8 9 10 11 12 13 14  % L1-minimization [c1, fitinfo] = lasso(A, b, 'Lambda', 0.01); % If use L1-magic % addpath /path/to/l1magic/Optimization % [c1] = l1eq_pd(c0, A, [], b, 1e-4); subplot(R,C,5); stem(c1); xlim([0 N]); title('L1 recovery - coef'); subplot(R,C,6); y1 = idct(c1, N); plot(1:N, y1, 'r', 1:N, y, 'b'); xlim([0 N]); title('L1 recovery - signal'); legend('Recovered', 'Original'); 

The above shows that L1-minimization successfully recovered the original signal. A complete code snippet can be found here.

1. In order to recover f perfectly, we need at least $B \log (N)$ samples (source).

via

## Thursday, May 15, 2014

### A Comparison of Least Square, L2-regularization and L1-regularization

Recovered Coefficients by Different Methods

# Problem Setting

Ordinary Least Square (OLS), L2-regularization and L1-regularization are all techniques of finding solutions in a linear system. However, they serve for different purposes. Recently, L1-regularization gains much attention due to its ability in finding sparse solutions. This post demonstrates this by comparing OLS, L2 and L1 regularization.

Consider the following linear system:

$Ax = y$

where $A \in \reals^{m \times n}$, $m$ is the number of rows (observations) and $n$ is the number of columns (variable dimension), $x$ is the variable coefficients and $y$ is the response. There are three cases to consider:

1. $m=n$. This is the common-seen case. If $A$ is not degenerate, the solution is unique.
2. $m>n$. This is called over-determined linear system. There is usually no solutions, but an approximation can be easily found by minimizing the residue cost $\norm{Ax-y}^2_2$ using least square methods, and it has a nice closed-form solution $x_{ls}=(A^T A)^{-1} A^T y$. In L2-regularization, we add a penalize term to minimize the 2-norm of the coefficients. Thus, the objective becomes: $\min_x \norm{Ax-y}^2_2 + \alpha \norm{x}_2$ where $\alpha$ is a weight to decide the importance of the regularization.
3. $m<n$. This is called under-determined linear system. There is usually no solution or infinite solutions. This is where it get interesting: when we have some prior knowledge in the solution structure, such as sparsity, we can have a 'metric' to find a better solution among a whole bunch. The objective is thus: $\min_x \norm{Ax-y}^2_2 + \alpha \norm{x}_1$ The optimization technique for the above problem is called lasso, and there is an advanced version called elastic net, which combines the L2 and L1 regularization together, hoping to get the advantages of both: L1 regularization finds sparse solution but introduces a large Mean Square Error (MSE) error, while L2 is better at minimizing MSE.

## An Example

In the following, we show their performances by solving a simple case.

regression_ex.m
 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60  % Compare Ordinary Least square (no regularization), L2-reguarlized (Ridge), % L1-regualarized (Lasso) regression in finding the sparse coefficient % in a underdetermined linear system rng(0); % for reproducibility m = 50; % num samples n = 200; % num variables, note that n > m A = rand(m, n); x = zeros(n, 1); nz = 10; % 10 non-zeros variables (sparse) nz_idx = randperm(n); x(nz_idx(1:nz)) = 3 * rand(nz, 1); y = A*x; y = y + 0.05 * rand(m, 1); % add some noise % plot original x subplot(2, 2, 1); bar(x), axis tight; title('Original coefficients'); % OLS x_ols = A \ y; subplot(2, 2, 2); bar(x_ols), axis tight; title('Ordinary Least Square'); y_ols = A * x_ols; % L2 (Ridge) x_l2 = ridge(y, A, 1e-5, 0); % last parameter = 00 to generate intercept term b_l2 = x_l2(1); x_l2 = x_l2(2:end); subplot(2, 2, 3); bar(x_l2), axis tight; title('L2 Regularization'); y_l2 = A * x_l2 + b_l2; % L1 (Lasso) [x_l1, fitinfo] = lasso(A, y, 'Lambda', 0.1); b_l1 = fitinfo.Intercept(1); y_l1 = A * x_l1 + b_l1; subplot(2, 2, 4); bar(x_l1), axis tight; title('L1 Regularization'); % L1 (Elastic Net) [x_en, fitinfo_en] = lasso(A, y, 'Lambda', 0.1, 'Alpha', 0.7); b_en = fitinfo_en.Intercept(1); y_en = A * x_en + b_en; MSE_y = [mse(y_ols-y), mse(y_l2-y), mse(y_l1-y), mse(y_en-y)]; disp('Mean square error: ') fprintf('%g ', MSE_y); fprintf('\n\n'); % Plot the recovered coefficients figure, hold on plot(x_l1, 'b'); plot(x_en, 'r'); plot(x, 'g--'); legend('Lasso Coef', 'Elastic Net coef', 'Original Coef'); 

Output:

 1 2  Mean square error: 1.81793e-29 7.93494e-15 0.0975002 0.0641214 

The above code snippets generates an under-determined matrix $A$, and a sparse coefficients which has 200 variables but only 10 of them are non-zeros. Noises are added to the responses. We then run the proposed three methods to try to recover the coefficients. It then generates two plots:

1. The first plot is as shown in the top. As we can see, OLS does a very bad job, though the MSE is minimized to zero. L2-regularization do find some of the sparks, but there are also lots of non-zeros introduced. Finally, L1-regularization finds most of the non-zeros correctly and resembles the original coefficients most.
2. The second plot shows how similar the recovered coefficients by lasso and elastic nets resemble the original coefficients. As we can see, both of them can recover most parts, while elastic nets contain some small 'noise'. However, elastic net yields a slightly better MSE than lasso.
Plot of Coefficients

## Probing Further

Scikit has some excellent examples on regualarization (1, 2). Quora has an excellent discussion on L2 vs L1 regualarization. I found the top three answers very useful in understanding deeper, especially from the Bayesian regularization paradigm perspective by thinking the regularization as MAP (Maximum A Posteriori) that adds a Laplacian (L1) or Gaussian (L2) prior to the original objective.

via

### A Comparison of Least Square

Ordinary Least Square (OLS),

Consider the following linear system:

via

## Monday, May 5, 2014

### Debugging AppleScript: print to a file

Debugging AppleScript is easy when you work with the script editor, simply use log to print out anything in the console. However, after you compiled it to an app, this cannot work anymore.

I find there are several ways to do it in this thread. The two approaches that work best for me are:

1. Use logger to log to the syslog. E.g.,
 1  do shell script "logger -t 'AS DEBUG' " & myObj 

However, I don't know why sometimes this is not logged. So I will use the following:

1. Echo to file
 1 2  do shell script "echo " & quoted form of (myObj as string) & ¬ " >> ~/Desktop/as_debug.txt" 

via Zigang Xiao http://bit.ly/1nWnLGH