Darel Rex Finley in PhotoBooth

Faster Is Better

2007.11.21

C is fast. Faster than most (all?) other popular programming languages. Fast enough that, earlier this year, Daniel Jalkut called it “the new assembly language” (courtesy Daring Fireball), by which he probably meant that when you need some part of your app to be especially fast (say, a critical graphics function), you write that part in C, but otherwise you use a scripting language, or at least an OO language like ObjC or C++.

It’s easy to think that C++ is just as fast, because “the compilers are so good these days, they can turn my C++ code into machine code that’s probably just as speedy.” But that will be true only if you mix C into your C++ to the right extent. Using C++ as the OO system it was intended to be is not a recipe for C-like speed.

Here’s a simple array access in C:

x[i]

When compiled, how does it translate into machine code? Something like this fictional (pseudocode) assembly:

LOAD REG1 with i
LOAD REG2 with (x) indexed by REG1

Pretty simple.

But what if x is an OO array? How does this:

x[i].value

translate into machine code? My guess is that it becomes something like this:

LOAD REG1 with i
LOAD REG2 with x-relative index of x’s upper-bound value
LOAD REG3 with (x) indexed by REG2
COMPARE REG1 with REG3
BRANCH-IF-GREATER to ErrorHandler
LOAD REG2 with x-relative index of x’s lower-bound value
LOAD REG3 with (x) indexed by REG2
COMPARE REG1 with REG3
BRANCH-IF-LESS to ErrorHandler
LOAD REG2 with index of x’s element-location array
LOAD REG3 with (REG2) indexed by REG1
LOAD REG2 with (REG3) indexed by #[element-relative position of value]

(The above code assumes that the OO array x contains an element-location array. If instead the array is a linked list where each element includes a pointer to the next element, then the above pseudocode is very overoptimistic.)
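To make the comparison concrete, here is a rough C sketch of the two kinds of access. The names Element, CheckedArray, raw_get and checked_get are invented for illustration, and the layout (two bounds plus an element-location array of pointers) is only the one guessed at in the pseudocode above, not what any particular compiler or OO library actually produces.

```c
#include <stddef.h>

/* Hypothetical element type with one field, as in x[i].value. */
typedef struct {
    int value;
} Element;

/* Hypothetical OO-style array: bounds plus an element-location array. */
typedef struct {
    long lower;          /* lowest valid index */
    long upper;          /* highest valid index */
    Element **location;  /* location[i - lower] points at element i */
} CheckedArray;

/* Plain C access: a single indexed load, no checks at all. */
int raw_get(const int *x, size_t i) {
    return x[i];
}

/* Checked access: two bound comparisons, then two dependent loads.
   Returns 0 and stores the value in *out on success,
   or returns -1 if i is outside [lower, upper]. */
int checked_get(const CheckedArray *x, long i, int *out) {
    if (i > x->upper) return -1;   /* upper-bound check */
    if (i < x->lower) return -1;   /* lower-bound check */
    const Element *e = x->location[i - x->lower];  /* find the element */
    *out = e->value;                               /* read its field */
    return 0;
}
```

The point of the sketch is only the shape of the work: raw_get does one load, while checked_get has to read both bounds, compare twice, and chase a pointer before it reaches the value.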

Assuming, arguendo, that each machine instruction takes about the same amount of time to execute, the OO-generated code takes about 6 times as long to execute (12 vs. 2 instructions). Or to put it another way, 83% of the processing power is being wasted on OO overhead.
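The arithmetic behind those two figures can be written out as a pair of small C helpers. The function names are made up for this example, and the only inputs are the instruction counts from the pseudocode (12 and 2), under the same equal-time-per-instruction assumption.

```c
/* How many times longer the OO path runs, by instruction count. */
double slowdown_ratio(int oo_instructions, int plain_instructions) {
    return (double)oo_instructions / plain_instructions;
}

/* Fraction of the OO path's instructions that are pure overhead. */
double overhead_fraction(int oo_instructions, int plain_instructions) {
    return (double)(oo_instructions - plain_instructions) / oo_instructions;
}
```

With 12 and 2 as inputs, the ratio is 12 / 2 = 6, and the overhead fraction is 10 / 12, or roughly 83%.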

How fast do processors need to get before it’s OK for your app to be wasting 83% of its processing power dealing with OO structures? Answer: A lot faster than they are today. Or will be tomorrow. We’re always coming up with new ways to challenge the speed of our current processors, and there’s no sign that we won’t continue to do so far into the future.

OK, you might say, but surely there are many common apps that don’t need to squeeze super speed out of the processor, right? For example, how fast does a word processor really need to be? Maybe it needs maximum speed when searching large documents for a specified phrase, but when you’re just merrily typing some text, the app is probably idle most of the time, so who cares how efficient the code is, right?

Wrong. When the user is typing text in a word processor, the time the user spends thinking about what to type next is indeed a processing eternity, and no special speed is needed for that. But no code is needed for that either, because an idle app isn’t running code. So it doesn’t matter how achingly long the delay is before the user presses a key — the relevant question is: What happens when the user does press a key?

Keep in mind that any half-ass touch-typist user is going to be pressing another key in maybe 0.1 seconds. Ideally, to make the typing process seem snappy and responsive to that user, the word processing app should update the window contents within a fraction of that 0.1 seconds (say, one quarter of it, which would be 0.025 seconds). So, even if the just-pressed key causes a cascading re-format of the entire paragraph and other text below the insertion point, the change should be almost instantaneous: all finished in just 0.025 seconds. And this must happen even as screen resolutions increase dramatically, as they seem about to do in the next several years.
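As a rough illustration of that budget, the following C sketch computes the 0.025-second figure and times an arbitrary piece of work against it. Everything named here (the two constants, update_budget_seconds, time_call, fits_budget) is hypothetical, and the 0.1-second interval and one-quarter fraction are just the estimates from the paragraph above.

```c
#include <time.h>

/* Assumed figures from the text: about 0.1 s between keystrokes,
   with one quarter of that allowed for updating the window. */
#define KEY_INTERVAL_SECONDS 0.1
#define BUDGET_FRACTION      0.25

/* Time allowed for handling one keystroke, in seconds. */
double update_budget_seconds(void) {
    return KEY_INTERVAL_SECONDS * BUDGET_FRACTION;
}

/* Runs work(arg) and returns the processor time it used, in seconds.
   Note: clock() measures CPU time, not wall-clock time. */
double time_call(void (*work)(void *), void *arg) {
    clock_t start = clock();
    work(arg);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Nonzero if the elapsed time fits inside the keystroke budget. */
int fits_budget(double elapsed_seconds) {
    return elapsed_seconds <= update_budget_seconds();
}
```

A keystroke handler (for example, a paragraph re-format routine) could be passed to time_call and its result handed to fits_budget to see whether it stays inside the 0.025 seconds.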

Even an allegedly processor-unhungry app like a word processor needs all the speed it can get. And so does every other app — really, the user wants everything to happen instantly.

Faster is better. Faster processors don’t change that equation; they just draw more tasks into the category of “tolerably fast” that previously resided in the “intolerably slow” bucket. Plus, I don’t think I’ve seen an app yet that didn’t exhibit a noticeable delay in some computationally stressful situation.

And even if your app really doesn’t do anything processor-demanding enough to need C for near-instantaneous reaction to all user input, wouldn’t it be nice if your app left more processing power for other apps and OS processes to use? And what happens if your app is sharing processing power with some other, processor-ravenous app? Every cycle counts.

- - - - -

Update 2008.01.07 — Fixed grammar of “extent” sentence.

 

