JavaScript: Does removing comments from a function make it faster?

A few days ago I heard that putting comments in functions slows them down. I decided to put this to the test: I couldn't believe it unless I saw it.

The method I used was a statistical hypothesis test. I'm not here to teach maths (I wouldn't be very good at it anyway), but the process involves gathering raw data, setting up hypotheses about what distribution those data fit, and calculating how likely the data fit that distribution. The higher the likelihood, the stronger the case for the hypothesis that nothing is happening.

(Somewhat) relevant info

Here, I take two JavaScript source files, differing only in the amount of non-code text present, and run them through the JavaScript console that comes with Mozilla's SpiderMonkey.

perf1.js:

function f1(n) {
    /**
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     * lolololololololololololololololololololololololololo
     */
    var i = n + 1;
    return i;
}

for (var i = 0; i < 10000000; ++i) f1(42);

perf2.js:

function f2(n){var i=n+1;return i}
for (var i=0;i<10000000;++i)f2(42);

I'm running SpiderMonkey 1.7:

[tung@eee ~/Code/JavaScript]$ js -v
JavaScript-C 1.7.0 2007-10-03

Each sample can be obtained using the Unix time utility.

[tung@eee ~/Code/JavaScript]$ time js perf1.js 
 
real    0m10.239s
user    0m10.143s
sys 0m0.010s

real is wall-clock time, user is the CPU time the process spends in user space, and sys is the CPU time the kernel spends on the process's behalf (mostly system call overhead). We choose the user time, since that's where the JavaScript code itself runs. real would include process scheduling overhead, while sys would only measure things like file loading, so we don't use those numbers.

Data

I sampled 10 running times for perf1.js and then 10 for perf2.js:

[tung@eee ~/Code/JavaScript]$ time js perf1.js 
 
real    0m10.239s
user    0m10.143s
sys 0m0.010s
[tung@eee ~/Code/JavaScript]$ time js perf1.js 
 
real    0m10.456s
user    0m10.159s
sys 0m0.010s
[tung@eee ~/Code/JavaScript]$ time js perf1.js 
 
real    0m10.323s
user    0m10.289s
sys 0m0.007s
[tung@eee ~/Code/JavaScript]$ time js perf1.js 
 
real    0m10.461s
user    0m10.383s
sys 0m0.023s
[tung@eee ~/Code/JavaScript]$ time js perf1.js 
 
real    0m10.481s
user    0m10.089s
sys 0m0.017s
[tung@eee ~/Code/JavaScript]$ time js perf1.js 
 
real    0m10.294s
user    0m10.289s
sys 0m0.003s
[tung@eee ~/Code/JavaScript]$ time js perf1.js 
 
real    0m10.272s
user    0m10.193s
sys 0m0.017s
[tung@eee ~/Code/JavaScript]$ time js perf1.js 
 
real    0m10.307s
user    0m10.283s
sys 0m0.003s
[tung@eee ~/Code/JavaScript]$ time js perf1.js 
 
real    0m10.518s
user    0m10.199s
sys 0m0.023s
[tung@eee ~/Code/JavaScript]$ time js perf1.js 
 
real    0m10.519s
user    0m10.129s
sys 0m0.003s
[tung@eee ~/Code/JavaScript]$ time js perf2.js 
 
real    0m10.407s
user    0m10.319s
sys 0m0.013s
[tung@eee ~/Code/JavaScript]$ time js perf2.js 
 
real    0m10.477s
user    0m10.216s
sys 0m0.003s
[tung@eee ~/Code/JavaScript]$ time js perf2.js 
 
real    0m10.398s
user    0m10.123s
sys 0m0.010s
[tung@eee ~/Code/JavaScript]$ time js perf2.js 
 
real    0m10.619s
user    0m10.209s
sys 0m0.010s
[tung@eee ~/Code/JavaScript]$ time js perf2.js 
 
real    0m10.250s
user    0m10.153s
sys 0m0.010s
[tung@eee ~/Code/JavaScript]$ time js perf2.js 
 
real    0m10.572s
user    0m10.236s
sys 0m0.007s
[tung@eee ~/Code/JavaScript]$ time js perf2.js 
 
real    0m10.303s
user    0m10.296s
sys 0m0.003s
[tung@eee ~/Code/JavaScript]$ time js perf2.js 
 
real    0m10.276s
user    0m10.176s
sys 0m0.003s
[tung@eee ~/Code/JavaScript]$ time js perf2.js 
 
real    0m10.476s
user    0m10.279s
sys 0m0.003s
[tung@eee ~/Code/JavaScript]$ time js perf2.js 
 
real    0m10.678s
user    0m10.169s
sys 0m0.007s

perf1.js runtime data without the cruft lines:

user    0m10.143s
user    0m10.159s
user    0m10.289s
user    0m10.383s
user    0m10.089s
user    0m10.289s
user    0m10.193s
user    0m10.283s
user    0m10.199s
user    0m10.129s

perf2.js runtime data without the cruft lines:

user    0m10.319s
user    0m10.216s
user    0m10.123s
user    0m10.209s
user    0m10.153s
user    0m10.236s
user    0m10.296s
user    0m10.176s
user    0m10.279s
user    0m10.169s
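
Stripping the cruft by hand gets old quickly, so here is a minimal sketch of how the user times could be pulled out of captured time output and turned into plain seconds. The function names parseUserSeconds and extractUserTimes are made up for this example, and it assumes the bash-style "0m10.143s" format shown above.

```javascript
// Convert one line of `time` output, e.g. "user    0m10.143s", to seconds.
// Returns null for any line that is not a "user" line.
function parseUserSeconds(line) {
    var match = /^user\s+(\d+)m(\d+(?:\.\d+)?)s\s*$/.exec(line);
    if (!match) {
        return null;
    }
    return parseInt(match[1], 10) * 60 + parseFloat(match[2]);
}

// Keep only the user times from a captured block of `time` output.
function extractUserTimes(output) {
    var lines = output.split("\n");
    var times = [];
    for (var i = 0; i < lines.length; ++i) {
        var seconds = parseUserSeconds(lines[i]);
        if (seconds !== null) {
            times.push(seconds);
        }
    }
    return times;
}

// The first two perf1.js samples from above, pasted as captured.
var sample = [
    "real    0m10.239s",
    "user    0m10.143s",
    "sys 0m0.010s",
    "real    0m10.456s",
    "user    0m10.159s",
    "sys 0m0.010s"
].join("\n");

var userTimes = extractUserTimes(sample);
```

The resulting array of numbers is the form the calculations further down work with.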

Null hypothesis

The null hypothesis is the opposite of the proposed idea in play. If the results would be very unlikely under the null hypothesis, that is evidence for the alternate hypothesis. Conversely, if the results are quite likely under it, there is no reason to reject the null hypothesis.

Also, it's hard to type subscripts here, so I just use an underscore to indicate them.

H_0: Comments in a function do not affect runtime performance.

mean(x) = mean(y)

x consists of the samples of perf1's runs, and y consists of the samples of perf2's runs.

Alternate hypothesis

H_1: Comments in a function affect runtime performance.

mean(x) != mean(y)

Test statistic

Using the pooled two-sample variant of the t-test, the test statistic follows the t_(n_x + n_y - 2) probability distribution:

tau = (mean(X) - mean(Y)) / (S_p * sqrt(1/n_x + 1/n_y))

tau: test statistic

X and Y: random variables representing the data sets

S_p: random variable representing the pooled, common standard deviation obtained from the standard deviations of each data set (more on how to calculate this below)

n_x and n_y: the number of samples in each data set

We use this because there are two random, independent samples and we want to see how (un)likely it is that they share a common mean.

How to get S_p?

S_p^2 = ((n_x - 1) * S_x^2 + (n_y - 1) * S_y^2) / (n_x + n_y - 2)

S_p is the pooled standard deviation, while S_p^2 is the pooled variance.

n_x and n_y are as above.

S_x^2 and S_y^2 are the variances of each data set. Consult your stats course textbook for how to calculate that from the data.

Sampling distribution of the test statistic

As above, it's t_(n_x + n_y - 2), i.e. the so-called "Student's t distribution" with n_x + n_y - 2 degrees of freedom.

What values of the test statistic argue against the null hypothesis?

A large absolute value of the observed tau argues against the null hypothesis, since the test is two-sided.

What is the observed value for the test statistic?

First we need the variances S_x^2 and S_y^2:

S_x^2 = 1 / (n - 1) * (sum(x^2 for each x) - 1 / n * (sum(all x))^2)

x^2 for each x (all values are in seconds):

10.143 -> 102.880449
10.159 -> 103.205281
10.289 -> 105.863521
10.383 -> 107.806689
10.089 -> 101.787921
10.289 -> 105.863521
10.193 -> 103.897249
10.283 -> 105.740089
10.199 -> 104.019601
10.129 -> 102.596641
sum = 1043.660962
sum of all x = 102.156

S_x^2 = 1 / (10 - 1) * (1043.660962 - 1 / 10 * 102.156^2)
        = 1 / 9 * (1043.660962 - 1 / 10 * 102.156^2)
        = 0.008458711

S_y^2 = 1 / (n - 1) * (sum(y^2 for each y) - 1 / n * (sum(all y))^2)

y^2 for each y:

10.319 -> 106.481761
10.216 -> 104.366656
10.123 -> 102.475129
10.209 -> 104.223681
10.153 -> 103.083409
10.236 -> 104.775696
10.296 -> 106.007616
10.176 -> 103.550976
10.279 -> 105.657841
10.169 -> 103.408561
sum = 1044.031326
sum of all y = 102.176

S_y^2 = 1 / (10 - 1) * (1044.031326 - 1 / 10 * 102.176^2)
        = 1 / 9 * (1044.031326 - 1 / 10 * 102.176^2)
        = 0.004203156
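
The two variances above can be double-checked with a few lines of JavaScript. This is only a sketch of the same sum-of-squares shortcut, with sampleVariance being a name invented for the example. The shortcut can lose precision when the spread is tiny compared to the mean, but it behaves fine for these numbers.

```javascript
// Sample variance via the sum-of-squares shortcut used above:
// S^2 = 1 / (n - 1) * (sum(v^2) - 1 / n * (sum(v))^2)
function sampleVariance(values) {
    var n = values.length;
    var sum = 0;
    var sumOfSquares = 0;
    for (var i = 0; i < n; ++i) {
        sum += values[i];
        sumOfSquares += values[i] * values[i];
    }
    return (sumOfSquares - sum * sum / n) / (n - 1);
}

// User times in seconds, copied from the Data section.
var x = [10.143, 10.159, 10.289, 10.383, 10.089,
         10.289, 10.193, 10.283, 10.199, 10.129];
var y = [10.319, 10.216, 10.123, 10.209, 10.153,
         10.236, 10.296, 10.176, 10.279, 10.169];

var varX = sampleVariance(x); // compare with S_x^2 above
var varY = sampleVariance(y); // compare with S_y^2 above
```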

With S_x^2 and S_y^2 we can get the pooled variance S_p^2 and thus the pooled standard deviation S_p:

S_p^2 = ((n_x - 1) * S_x^2 + (n_y - 1) * S_y^2) / (n_x + n_y - 2)
        = ((10 - 1) * 0.008458711 + (10 - 1) * 0.004203156) / (10 + 10 - 2)
        = (9 * 0.008458711 + 9 * 0.004203156) / 18
        = 9 * (0.008458711 + 0.004203156) / 18
        = (0.008458711 + 0.004203156) / 2
        = 0.006330934

S_p = sqrt(S_p^2)
        = sqrt(0.006330934)

Finally, we can get our observed test statistic tau:

tau = (mean(X) - mean(Y)) / (S_p * sqrt(1/n_x + 1/n_y))
        = (10.2156 - 10.2176) / (sqrt(0.006330934) * sqrt(1/10 + 1/10))
        = -0.002 / (sqrt(0.006330934) * sqrt(1/5))
        = -0.056205796
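
For anyone who would rather not push this through a calculator, the whole chain from raw samples to tau fits in a short script. This is a sketch under the same equal-variance assumption as the formulas above. The helper names mean, sampleVariance and pooledTStatistic are invented for the example.

```javascript
// Arithmetic mean of an array of numbers.
function mean(values) {
    var sum = 0;
    for (var i = 0; i < values.length; ++i) {
        sum += values[i];
    }
    return sum / values.length;
}

// Sample variance (n - 1 in the denominator), computed around the mean.
function sampleVariance(values) {
    var m = mean(values);
    var sumOfSquaredDeviations = 0;
    for (var i = 0; i < values.length; ++i) {
        var d = values[i] - m;
        sumOfSquaredDeviations += d * d;
    }
    return sumOfSquaredDeviations / (values.length - 1);
}

// tau = (mean(X) - mean(Y)) / (S_p * sqrt(1/n_x + 1/n_y))
function pooledTStatistic(x, y) {
    var nx = x.length;
    var ny = y.length;
    var pooledVariance =
        ((nx - 1) * sampleVariance(x) + (ny - 1) * sampleVariance(y)) /
        (nx + ny - 2);
    return (mean(x) - mean(y)) /
        (Math.sqrt(pooledVariance) * Math.sqrt(1 / nx + 1 / ny));
}

// User times in seconds, copied from the Data section.
var x = [10.143, 10.159, 10.289, 10.383, 10.089,
         10.289, 10.193, 10.283, 10.199, 10.129];
var y = [10.319, 10.216, 10.123, 10.209, 10.153,
         10.236, 10.296, 10.176, 10.279, 10.169];

var tau = pooledTStatistic(x, y);
```

Up to rounding in the last digits, tau should agree with the hand calculation.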

What is the probability of a discrepancy at least as big as observed?

We look up the absolute value of tau in the t distribution table for 18 degrees of freedom. Consult your favourite stats table source for a t distribution table, e.g. a stats textbook.

The probability we get is only for one-sided tests, but since this test is two-sided, we need to double whatever we get from it.

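|tau| = 0.056 is smaller than the smallest table entry (0.688, for a one-sided probability of 0.25), so the one-sided probability is more than 0.25:

p > 0.25 * 2
  > 0.50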

That is, the probability of getting a discrepancy at least this big, given that H_0 holds, is over 50%.
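
A table only gives a bound here, but the probability can also be estimated numerically. The sketch below is not how the textbook does it: it integrates the t density with Simpson's rule after substituting t = sqrt(df) * tan(theta), which turns the density into a multiple of cos(theta)^(df - 1), so the normalising constant cancels out. The names simpson and twoSidedP are made up for this example.

```javascript
// Composite Simpson's rule for f over [a, b]; `intervals` must be even.
function simpson(f, a, b, intervals) {
    var h = (b - a) / intervals;
    var total = f(a) + f(b);
    for (var i = 1; i < intervals; ++i) {
        total += f(a + i * h) * (i % 2 === 1 ? 4 : 2);
    }
    return total * h / 3;
}

// Two-sided probability P(|T| >= |t|) for Student's t distribution
// with `df` degrees of freedom.
function twoSidedP(t, df) {
    var integrand = function (theta) {
        return Math.pow(Math.cos(theta), df - 1);
    };
    var theta0 = Math.atan(Math.abs(t) / Math.sqrt(df));
    // Probability mass inside [-|t|, |t|] as a ratio of two integrals.
    var inside = simpson(integrand, 0, theta0, 1000);
    var whole = simpson(integrand, 0, Math.PI / 2, 1000);
    return 1 - inside / whole;
}

// Observed tau with n_x + n_y - 2 = 18 degrees of freedom.
var p = twoSidedP(-0.056205796, 18);
```

For the observed tau this comes out at roughly 0.96, comfortably above the 0.50 bound from the table.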

Findings

The chance of getting these kinds of samples, given that comments do not affect runtime performance, is very high: over 50%.

In other words, the data give no reason to think that comments in a JavaScript function affect runtime performance. Exactly how far above 50% the probability sits can't be read from my textbook, since the smallest tau in its table is 0.688 for p = 0.25, which is still far larger than the tau calculated in this hypothesis test.

Correlation is not causation, and this test doesn't "prove" anything. Statistical methods only find likelihoods. To find out for certain whether comments have an effect on the runtime performance of functions in JavaScript, one could:

  • Google the question.
  • Ask somebody who works/has worked on SpiderMonkey.
  • Read the SpiderMonkey source code.
  • Dump the JavaScript byte code for the above source code samples.

Other conclusions

I should have used a spreadsheet for some of the longer calculations. You live, you learn.

I could have scattered the comments through the source just to be sure, but I doubt the outcome would be much different.

This would all look a thousand times better in LaTeX. LaTeX is awesome and gets awesome results, but I wanted to keep this accessible.

I don't normally like maths, and I hadn't touched statistics in ages, but, dare I say it, I actually enjoyed doing this. Maybe this is what modern mathematics curriculums are missing: a reason for doing it!

Math majors may notice that the hypothesis test is a bit loose: shouldn't I have tested whether removing comments was faster, not merely different? I modelled it after a question in my stats textbook so I wouldn't screw it up. The numbers may have been different, but the outcome would have been the same, since both sets of hypotheses share their null hypothesis: that nothing is happening.

And there is quite a high probability that nothing is happening.