16 Mar 2008
A good, three-part interview with Tim Sweeney (the other FPS graphics guru):
His main thesis is that soon
GPUs will become so programmable that you won’t bother using a standard
Graphics API to program them. You’ll just fire up a C compiler. I think he’s
12 Mar 2008
John Carmack is experimenting with a “sparse octree” data structure for
accelerating 3D graphics rendering:
Carmack on Octrees
The direction that everybody is looking at for next generation, both console and
eventual graphics card stuff, is a “sea of processors” model, typified by
Larrabee or enhanced CUDA and things like that, and everybody is sort of
waving their hands and talking about “oh we’ll do wonderful things with all
this” but there is very little in the way of real proof-of-concept work going
on. There’s no one showing the demo of like, here this is what games are going
to look like on the next generation when we have 10x more processing power -
nothing compelling has actually been demonstrated and everyone is busy making
these multi-billion dollar decisions about what things are going to be like 5
years from now in the gaming world. I have a direction in mind with this but
until everybody can actually make movies of what this is going to be like at
subscale speeds, it’s distressing to me that there is so much effort going on
without anybody showing exactly what the prize is that all of this is going to
Second-best quote is that he wants lots of bit-twiddling operations-
per-second to traverse the data structures rather than lots of floating-point-
operations-per-second. Must be scary for the Larrabee and NVIDIA CPU
architects to hear that, this late in their design cycles.
Hopefully John will come up with a cool demo that helps everyone understand whether his approach
is a good one or not. Hopefully the Larrabee / NVIDIA architectures are
flexible enough to cope. (Interestingly, no mention of ATI – have they bowed
out of the high-end graphics race?)
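To make the "bit-twiddling rather than floating-point" remark concrete, here is a minimal sketch of a sparse octree in Python. It is not Carmack's design (the interview gives no implementation details); the names Node, octant, slot, insert and contains are all hypothetical, and it assumes integer coordinates in the range 0 to 2**depth - 1. Each node stores an 8-bit mask of occupied octants and a packed list holding only the children that exist, so walking the tree is all shifts, masks and bit counts.

```python
class Node:
    """One octree node: a bitmask of occupied octants plus packed children."""
    __slots__ = ("mask", "children")

    def __init__(self):
        self.mask = 0        # bit i set => octant i has a child
        self.children = []   # only the occupied octants, in octant order


def octant(x, y, z, level):
    """Pick the child octant (0..7) from bit `level` of each coordinate."""
    return (((x >> level) & 1)
            | (((y >> level) & 1) << 1)
            | (((z >> level) & 1) << 2))


def slot(mask, octant_index):
    """Index into the packed child list: count the set bits below this octant."""
    return bin(mask & ((1 << octant_index) - 1)).count("1")


def insert(root, x, y, z, depth):
    """Mark the voxel at (x, y, z) as occupied, creating nodes as needed."""
    node = root
    for level in range(depth - 1, -1, -1):
        o = octant(x, y, z, level)
        i = slot(node.mask, o)
        if not node.mask & (1 << o):
            node.children.insert(i, Node())
            node.mask |= 1 << o
        node = node.children[i]


def contains(root, x, y, z, depth):
    """Return True if the voxel at (x, y, z) was inserted earlier."""
    node = root
    for level in range(depth - 1, -1, -1):
        o = octant(x, y, z, level)
        if not node.mask & (1 << o):
            return False   # empty region: stop early, nothing stored here
        node = node.children[slot(node.mask, o)]
    return True


root = Node()
insert(root, 3, 5, 7, depth=3)
insert(root, 3, 5, 6, depth=3)
```

The "sparse" part is that empty octants cost one zero bit and no child node, and a lookup into an empty region stops at the first missing bit.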
09 Mar 2008
This is an interesting role-playing-game set in current-day web forum culture:
It’s somewhat not-safe-for-work, and the humor is
pretty low-brow. But what’s neat is that you play it through your browser, and
it recreates the look-and-feel of web forum culture perfectly. It wouldn’t
surprise me if the authors just captured the HTML for various real-world
forums to create the resources for the game. (Or alternately, created their
own fictional forums using web tools, and then captured the HTML from those
The actual game didn’t hold my interest for very long, but
it’s free and it’s fun for a few days.
21 Feb 2008
Silicon Alley Insider seems to
have the best coverage of the Microsoft / Yahoo Merger. But for the grumpy
inside-Microsoft point-of-view you can’t beat Mini-
21 Feb 2008
The Google Analytics numbers for this blog are dismal. (Hi Mom! Hi Friends!) I
think it’s because right now I don’t have much to say that’s both unique and
interesting. Partly this is because so much of my life is off limits: I don’t
want to talk about the joys & cares of raising a family, and I mustn’t talk
about the joys & cares of raising a new product. What’s left are comments on
the general state of the web, and essays on general topics like this one.
Why do I write then, and who am I writing for? I write because something inside me
compels me to, and because it helps me think to get my ideas down in written
form. Who do I write for? From my Analytics numbers it’s clear that I’m
writing primarily for search engines (Hi Googlebot!) rather than people. And
that’s something interesting to think about: Barring a world-wide disaster or
cultural revolution, what I write today will persist for thousands and
probably even millions of years, and will be read countless times by search
engines, and only occasionally, if at all, by people.
My words will be torn
apart and merged with other web pages from other authors, becoming a mulch out
of which new insights will be gleaned. (Hmm, not unlike how my body will be
recycled when I die, its atoms used to make new things.)
Perhaps the last time
my essay will ever be read by a live human is in some far distant future when
some graduate student is writing an essay on early-web-era civilization, and
is trying to find out what those poor benighted souls thought of the future.
No doubt my words will be automatically translated from 21st-
century English into whatever language wins the world-wide language wars.
Perhaps my essay will even be automatically annotated, with a description of
who I was, and a best guess at what I looked like, from searching the world’s
photo archives. There will be footnotes and links to explain the archaic
topics I’m referencing. “Search engine” - they used to store data in separate
computers, and build the search index by brute force. How primitive! How
And no doubt the grad-student-of-the-future will glance over my words,
then move on to the hundreds of other essays on similar themes. (Good luck
with your own essay, future-guy!)
30 Jan 2008
I just noticed (while reading the 4chan prog forum for the first time) that
Paul Graham has put up a web site for his minimal Lisp language Arc:
Arc The language looks like a nice quiet Scheme-like
Lisp dialect. And it has a nice tutorial,
as you would expect from a Paul Graham language.
26 Jan 2008
3DMark is a GPU/CPU benchmark used by PC gamers to measure system performance.
Here are some great charts showing
My home computer system is very weak compared to these charts, except in one
dimension, which is that my 1600 x 1200 display puts me in the top 10% of
While many people (myself included) have switched to laptops
and/or all-in-ones, if you’re planning on building a new desktop, check out
the Ars Technica system guide. The guide
does a good job of speccing out a “Budget Box”, a “Hot Rod”, and a “God Box”,
and it’s updated every quarter.
25 Jan 2008
Currently I’m reading up on the following computer languages:
- Python - fun, easy to learn, batteries included
- Boo - fun like Python, but with macros and type declarations so that it can run fast.
- Erlang - very brief code. I’m impressed by how concise the Wings3D source code is.
- Typed Scheme - Scheme with type checking. (Could in theory run fast.)
I may try implementing my old “Dandy”
game in these languages to see how they feel.
25 Jan 2008
Ever since I upgraded to Apple Macintosh OS X 10.5 Leopard, I’ve run into
problems using the DarwinPorts “port” command to install new software.
The problem is that for some reason the GNU “patch” that I have
installed in /usr/bin/patch is version 2.5.8, and it doesn’t operate the way
that DarwinPorts expects. A typical error message is:
---> Applying patches to erlang
Error: Target org.macports.patch returned: shell command " cd "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_lang_erlang/work/erlang-R12B-0" && patch -p0 < '/opt/local/var/macports/sources/rsync.macports.org/release/ports/lang/erlang/files/patch-toolbar.erl'" returned error 2
Command output: Get file lib/toolbar/src/toolbar.erl from Perforce with lock? [y]
Perforce client error:
Connect to server failed; check $P4PORT.
TCP connect to perforce failed.
perforce: host unknown.
patch: **** Can't get file lib/toolbar/src/toolbar.erl from Perforce
Error: Status 1 encountered during processing.
The work-around is to define the environment variable
POSIXLY_CORRECT=1 , as in:
POSIXLY_CORRECT=1 sudo port install erlang
Now, I’ve done some web searching, and I haven’t seen anyone else complaining
about this problem, so perhaps there’s something odd about my setup.
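If the install is driven from a script rather than typed at a shell prompt, the same one-command override can be sketched in Python. This is only an illustration: run_with_env is a hypothetical helper name, and the demonstration runs a harmless child Python process instead of port so that it works on any machine.

```python
import os
import subprocess
import sys


def run_with_env(command, extra_env):
    """Run `command` with extra environment variables set for that child only."""
    env = dict(os.environ)   # copy, so the parent's environment is untouched
    env.update(extra_env)
    return subprocess.run(command, env=env, capture_output=True, text=True)


# Stand-in child process that just reports the variable it was given.
# For the real work-around the command would be something like
# ["port", "install", "erlang"], run with sufficient privileges (assumption).
result = run_with_env(
    [sys.executable, "-c", "import os; print(os.environ['POSIXLY_CORRECT'])"],
    {"POSIXLY_CORRECT": "1"},
)
```

As with the shell form, the variable is visible only to the child process, so nothing else in the session changes behavior.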
07 Jan 2008
Yesterday I wrote a web scraper. A web scraper is a program that crawls over a
set of web pages, following links and collecting data. Another name for this
kind of program is a “spider”, because it “crawls” the web.
In the past I’ve
written scrapers in Java and
F#, with good results. But
yesterday, when I wanted to write a new scraper, I thought I’d try using a
dynamically-typed language instead.
What’s a dynamically-typed language you
ask? Well, computer languages can generally be divided into two camps,
depending on whether they make you declare the type of data that can be stored
in a variable or not. Declaring the type up front can make the program run
faster, but it’s more work for the developer. Java and F#, the languages I
previously used to write a web scraper, are statically typed languages,
although F# uses type inference so you don’t actually have to declare types
very often – the computer figures it out for you.
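A small Python sketch of what "dynamically typed" means in practice. The names describe and total_length are made up for the illustration. The same variable holds an int, then a string, then a list, with no declarations anywhere; the optional type hints on the second function are not enforced by Python when the program runs.

```python
from typing import List


def describe(value):
    """Report the type a value has at run time; nothing was declared up front."""
    return f"{type(value).__name__}: {value!r}"


item = 42                   # item refers to an int...
first = describe(item)
item = "forty-two"          # ...now to a str...
second = describe(item)
item = ["forty", "two"]     # ...and now to a list of strings.
third = describe(item)


def total_length(strings: List[str]) -> int:
    """The hints document intent, but Python does not check them at run time."""
    return sum(len(s) for s in strings)


# A tuple is not a List[str], yet the call still works.
length = total_length(("forty", "two"))
```

In a statically typed language the compiler would reject the reassignments and the tuple argument before the program ever ran.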
In order to scrape HTML you need three things:
- a language
- a library that fetches HTTP pages
- a library that parses the HTML into a tree of HTML tags
Unless you’re using Mono or Microsoft’s Common Language Runtime, the language
you choose will restrict the libraries that you can use.
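Here is a rough sketch of how those three pieces fit together, written in Python 3 using only the standard library. It is not the scraper described in this post: urllib.request is where Python 2's urllib2 ended up in Python 3, html.parser stands in for a third-party HTML parsing library so that the example has no dependencies, and the names LinkCollector, extract_links, fetch and crawl are invented for the illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html, base_url):
    """Parse the HTML and return absolute URLs for all links on the page."""
    parser = LinkCollector()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.links]


def fetch(url):
    """Download one page and decode it as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, max_pages=10, fetch_page=fetch):
    """Breadth-first crawl: fetch a page, queue its links, repeat."""
    seen, queue, pages = set(), [start_url], {}
    while queue and len(pages) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        html = fetch_page(url)
        pages[url] = html
        queue.extend(extract_links(html, url))
    return pages
```

Calling crawl("http://example.com/") would fetch up to ten pages starting from that address. The fetch_page parameter exists so the crawl logic can be tried against canned pages without touching the network. A real spider would also want to stay on one site, handle download errors, and pause between requests.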
So, the first thing I
needed to do was choose a dynamic language. Since I just finished reading
“Practical Common Lisp”, an excellent
advanced tutorial on the Lisp language, I thought I’d try using Lisp. But that
didn’t work out very well at all. Lisp has neither a standard implementation
nor a set of standard libraries for downloading web pages and parsing HTML. I
did some Googling to try and find some combination of parts that would work
for me. Unfortunately, it seemed that every web page I visited recommended a
different combination of libraries, and none of the combinations I tried
worked for me. In the end I just gave up in frustration.
Then, I turned to
Python. I had not used Python much, but I knew it
had a reputation as an easy-to-use language with a lot of easy-to-use
libraries. And you know what? It really was easy! I did some web searches,
copied some example code, and voila, I had a working web spider in about an
hour. And the program was easy to write every step of the way. I used the
standard CPython implementation for the language, Python’s built-in urllib2
library to fetch the web data, and the Beautiful
Soup library for parsing the
How does Python compare to Java and F# for web scraping?
Python advantages:
- Very brief, easy to write code
- Libraries built in or easy to find
- Lots of web examples
- I didn’t have to think: I just used for loops and subroutine calls.
- Very fast turn-around.
- Easy to create and iterate over lists of strings.
Python non-issues for this application:
- Didn’t matter that the language was slow, because this task is totally I/O bound.
- Didn’t matter that the IDE is poor; using print and developing interactively was fine.
F# advantages:
- Good IDE (Visual Studio)
- Both URL fetching and HTML parsing libraries built into the CLR
F# disadvantages:
- The CLR libraries for URL fetching and HTML parsing are more difficult to use than Python’s. It takes more steps to complete similar operations.
- Strong typing gets in the way of writing simple code.
- Odd language syntax compared to Algol-derived languages.
- Hard-to-understand error messages from the compiler.
- Mixed functional/imperative programming is more complicated than just imperative programming.
- The language and library encourage you to use advanced concepts to do simple things. In my web scraper I wrote a lot of classes and had methods that took complicated curried functions as arguments. This made the code hard to debug. In retrospect perhaps I should have just used lists of strings, the same as I did in Python. Since F# supports lists of strings pretty well, maybe this is my problem rather than F#’s. ;-)
Java advantages:
- Good debugger
- Good libraries
Java disadvantages:
- Very wordy language
- Very wordy libraries
Lisp disadvantages:
- No standard implementation
- No standard libraries
Looking to the future, I’d be interested in writing a web scraper in
IronPython, which has good IDE support, and in C# 3.0, which has some support
for type inference.
In any event, I’m left with a very favorable impression of
Python, and plan to look into it some more. In the past I was put off from it
because it was slow, but now I see how useful it is when speed doesn’t matter.
[Note: When I first wrote this article I was under the impression that CPython
didn’t support threads. I since discovered (by reading the Python in a
Nutshell book) that it does support threads. Once I knew this, I was able to
easily add multi-threading to the web scraper. CPython’s threads are somewhat
limited: only one thread is allowed to run Python code at a time. But that’s
fine for this application, where the multiple threads spend most of their time
blocked waiting for C-based network I/O.]
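A minimal sketch of that kind of multi-threaded fetching, in Python 3. It is not the code from the scraper described above: fetch_all and its parameters are hypothetical names, and the fetch function is passed in so the example can run without a network connection. A fixed pool of worker threads pulls URLs from a shared queue, and each thread spends most of its time waiting on I/O, which is where threads help despite the one-at-a-time limit on running Python code.

```python
import threading
import time
from queue import Empty, Queue


def fetch_all(urls, fetch_page, num_threads=4):
    """Fetch every URL using a small pool of worker threads."""
    todo = Queue()
    for url in urls:
        todo.put(url)
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                url = todo.get_nowait()
            except Empty:
                return                 # queue drained: this worker is done
            body = fetch_page(url)     # thread blocks here waiting on I/O
            with lock:                 # guard the shared results dict
                results[url] = body

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def slow_fake_fetch(url):
    """Stand-in for a network download: sleep briefly, then return text."""
    time.sleep(0.01)
    return url.upper()


results = fetch_all(["a", "b", "c", "d", "e"], slow_fake_fetch)
```

One caveat with this sketch: if fetch_page raises an exception, that worker thread dies and the remaining URLs are left to the other workers, so real code would want to catch and record errors inside the loop.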