Re: [ProgSoc] question
On Thu, 11 Apr 2002, Andi Halliday wrote:
> the point to all of this is i'm interested if any of u have any idea
> of how this sort of thing is possible given the size and would be
> willing to discuss it. i have an interest in compression and graphics
> and i just can't comprehend how it does what it does with such a tiny
> file. any ideas?
Well okay, we have something that could get interesting when you go
in-depth here. There are an infinite number of ways to compress things,
and they all work on a vague principle, to one degree or another, of
describing what needs to be the output, rather than giving you the output
itself. So, in human terms, it would be a waste of your time and mine to
show you a large number with a hundred zeros on it, but it would be
quicker to say the word "googol" and then you can turn
that into a number. I could also write "10^100" and save space.
The tricky thing is always deciding how to compress, not how to
decompress. I can make up specific examples for a few cases like the
googol, but with much larger numbers I can't just make up names so I have
to go with some sort of systematic approach, like "10^100" and hope that I
can fit it in. Computer programs and data are just lots of large numbers
anyway ... a megabyte is just 8 megabits and so it's a binary number
that's eight million digits long. If you break it up, and there's a bit
of structure in there, you might be able to take the simpler parts and
represent them with shorter things instead. Some parts will be just
random and they're always going to be harder to compress. Writing a
computer program to do this sort of thing for you (eg, a zip program) is
going to be able to sort through this nice and quickly, but it's only a
program, and needs to cope with any file you ever throw at it, so it'll
have to take a fairly generic, systematic approach to this. Most things
out there, averaged up together, get a compression ratio of only 2:1 with
this kind of approach.
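To make the "structure versus randomness" point concrete, here is a rough sketch in Python. It uses the standard zlib module as a stand-in for a generic zip-style compressor; the two test inputs and the ratio() helper are made up for illustration, and the exact ratios you see will vary.

```python
import os
import zlib

def ratio(data: bytes) -> float:
    """Return original size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data, 9))

# 1 MB with an obvious repeating pattern: lots of structure to exploit.
structured = b"0123456789" * 100_000
# 1 MB of random bytes: no structure, so nothing shorter to describe it with.
random_ish = os.urandom(1_000_000)

print(f"structured: {ratio(structured):.1f}:1")
print(f"random:     {ratio(random_ish):.2f}:1")
```

The patterned input should shrink enormously, while the random input should come out at about 1:1 (usually a touch worse, because of the compressor's own overhead).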
MP3s are lossy but the point is that you can't tell some of the time.
The MP3 encoders are compressors that know what makes music into the sound
you hear, and compress according to the characteristics that will make a
decompressed sound file that sounds almost, but not quite, like the
original sound you started with. You typically find 10:1 is the ratio
involved here. This is because the encoder is intelligent, and encodes
the sound you think you hear, not the sound you gave it originally, and it
just happens that the sound it's compressing is something that's much more
systematic and has less randomness to it. (Specifically: It picks which
bits you can't hear and "removes" them, with instructions to the decoder
to say "you can get away with recreating nothing in this place, because
the listener won't be able to tell there was something here, and I can't
be bothered including it in the file" -- of course, this instruction is
represented in a very small way, because it happens all the time in MP3s,
and you're basically getting away with next to zero data required for the
"lost" bits in "lossy" compression).
If you knew even more about what you wanted to compress, you, as a human,
could probably describe it in an even briefer way than this. I can say "a
million million megabytes of zeros" quicker than most computers can output
what it represents, and in a lot less space. Of course, the computer
could do this too, but you'd have to teach it English, or work out some
other system.
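One such "other system" could be as simple as a (count, byte) pair. The sketch below is only an illustration of the idea, with a hypothetical expand() helper; the description stays a few bytes long no matter how big the output it stands for.

```python
from typing import Iterator, Tuple

def expand(description: Tuple[int, bytes], chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield the bytes described by (count, byte), one chunk at a time."""
    count, byte = description
    while count > 0:
        n = min(count, chunk_size)
        yield byte * n
        count -= n

# "A million million megabytes of zeros" is 10**18 bytes, yet the
# description of it is just this small tuple.
huge = (10**12 * 10**6, b"\x00")

# Only expand a small one here; the huge one would take far too long to output.
small = (10_000, b"\x00")
total = sum(len(chunk) for chunk in expand(small))
print(total)  # 10000
```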
So what you can do, in certain cases, is write a program specifically to
tell the computer how to create the output for something. The best
example is the program that creates the Mandelbrot set. (Go on Google to
see a picture of this if you don't know what it looks like). Basically if
you think of the Mandelbrot set as a picture you want to store somewhere,
you could spend all day describing its infinite complexities, limit the
detail you give to whatever level of detail you want to have as output,
and you'll end up with something fairly long, a large file to represent
the picture, so this means lousy compression ratios.
But if you know how to create the picture in the first place, you can just
represent the picture with a short mathematical formula: z = z^2 + c
(where z and c are complex numbers of the form a + bi),
and some coordinates to say where to start and finish, also what level of
detail you want to see the Mandelbrot set in, and maybe some instructions
about the colours to use. Now if you turned that into a computer program
you'd need a few extra bits and pieces to get into the detail of creating
the graphics and calculating the formula, but it wouldn't really be such a
big program. Now, you can say that this program is the compressed file;
executing it is doing the decompression; and the output is the
uncompressed data. And with this case, your compression ratio is infinity
because the output you create could be any part of the Mandelbrot set down
to any region, at any magnification.
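As a minimal sketch of how small such a program can be, here is a text-mode Mandelbrot renderer in Python. It uses the usual iteration z = z*z + c; the function name, the '#' and '.' characters, and the default size and iteration limit are my own choices for illustration. The coordinates passed in play the role of the "where to start and finish" data.

```python
def mandelbrot_rows(x_min, x_max, y_min, y_max, width=60, height=24, max_iter=50):
    """Render a region of the Mandelbrot set as a list of text rows."""
    rows = []
    for j in range(height):
        y = y_min + (y_max - y_min) * j / (height - 1)
        row = []
        for i in range(width):
            x = x_min + (x_max - x_min) * i / (width - 1)
            c = complex(x, y)
            z = 0j
            inside = True
            for _ in range(max_iter):
                z = z * z + c
                if abs(z) > 2.0:      # once |z| > 2 the point is sure to escape
                    inside = False
                    break
            row.append("#" if inside else ".")
        rows.append("".join(row))
    return rows

# The whole "compressed file" is the function above plus these four numbers.
picture = mandelbrot_rows(-2.0, 1.0, -1.2, 1.2)
print("\n".join(picture))
```

Changing the four coordinates zooms into a different region, and raising width, height and max_iter gives more detail, all without the program itself getting any bigger.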
But it's only for certain cases like that. You basically have to make
them up as you go along. Like, "paint the screen green with it fading to
blue 50% of the way across". You can go into more complex examples. You
could collect a bunch of ways to move a picture around inside a window.
Once you've collected a small set of nice examples, some of which take
input from each other, and you create it to show off in a window, you can
feed it a random number and away it goes, creating full-motion video that
looks kind of generic but is interesting to watch anyway. It's
full-motion video because it is represented by pixels, each having a value,
and the values update every second, so you could record this into one long
number and store the video like that (uncompressed). You could compress
this video like you do any other, by running it through video compression,
and you'd get maybe 10:1. But, the video could be a gigabyte big so it's
still a large compressed video file. Instead, you could grab the code you
used to generate the video, plus some more (tiny pieces of) data to get it
into the right starting point, and it will output the exact same thing
from almost nothing.
Okay, so not very useful unless you actually want to decompress and show
that particular video, or any others like it. Well you can and do, when
you listen to MP3s and watch the visualiser. It takes some of the music
as input data, and the rest is calculated to produce a video animation.
It also takes some other random numbers out of your computer, but if you
could make it take the same numbers, then you'd get the same animation as
an output to the very little pieces of data you started with. But you can
run it without music, and get it to create the same video over and over
again too, as long as the program isn't getting random numbers out of the
computer somewhere. So really, you could produce that video with the same
tiny bits of starting data, plus the data of the program itself, and say
"this data self-expands into this other data", and show that it's going
from compressed to uncompressed.
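Here is a toy version of that "same starting data, same video" idea, sketched in Python. The make_frames() function and its colour formula are invented for the example and have nothing to do with how any real visualiser works; the point is only that a private, seeded random generator makes the output depend on the seed and nothing else.

```python
import random

def make_frames(seed, n_frames=3, width=8, height=4):
    """Generate frames of pixel values from nothing but a seed."""
    rng = random.Random(seed)   # private generator: output depends only on seed
    # Draw a couple of starting parameters; everything else is calculated.
    phase = rng.randrange(256)
    step = rng.randrange(1, 16)
    frames = []
    for t in range(n_frames):
        frame = [[(phase + step * (x + y + t)) % 256 for x in range(width)]
                 for y in range(height)]
        frames.append(frame)
    return frames

a = make_frames(seed=42)
b = make_frames(seed=42)
print(a == b)  # True: same seed, same "video"
```

The seed plus the code is the "compressed" form here, and the list of frames is the "uncompressed" form.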
Now with theproduct.de, what I take it they've done is get really really
imaginative with what the program can output -- it's still really limited
compared to the output you could get out of an MPEG file, which has much
wider possibilities for output -- but you start with this tiny file and
you get a =respectable= amount of different outputs, too many to count,
you'll probably never see them all, but it's not as wide a range as how
many different outputs you could get from an MPEG file.
To compare this, you need to talk about really, =really= huge numbers ...
                          compressed size        output possibilities in 0.1 secs
theproduct                64kB = 512,000 bits    millions, maybe billions
MPEG, 0.1 seconds at a    512,000 bits           2 ^ 512,000
typical 5Mbps bitrate
It's kind of, your mountain is huge, but my galaxy is bigger.
There are millions of things you can represent in each 0.1 seconds, and in
the next 0.1 seconds you will need to pick another one out of those
millions to choose from. It's interesting, but after a while it gets
predictable and you may want to watch a TV program rather than swirling
colours.
So you switch to an MPEG decoder. I'll watch SBS, it sends me 4.5
megabits every second. Each second will have one of 2 ^ 4,500,000
audiovisual things to show me. In fact it's such a large range to choose
from that it's possible it's something I've never seen before. It's
entirely possible that in a 60 minute program it's all images and sounds
that are new to me, because this hour-long program contains 60 * 60 *
4,500,000 bits of data, which is 16,200,000,000 bits of data.
It could be any of 2 ^ 16,200,000,000 different hours of video+audio.
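If you want to check that arithmetic, or get a feel for how big 2 ^ 16,200,000,000 is, a few lines of Python will do it. This is just a sketch using the 4.5 megabit figure from above; rather than trying to print the number itself, it counts how many decimal digits the number would have.

```python
import math

bits_per_second = 4_500_000   # the 4.5 megabits per second figure from above
seconds = 60 * 60             # one hour

total_bits = bits_per_second * seconds
# 2 ** total_bits is far too big to be worth printing, so count its
# decimal digits instead: floor(total_bits * log10(2)) + 1.
decimal_digits = math.floor(total_bits * math.log10(2)) + 1

print(total_bits)  # 16200000000
print(decimal_digits)
```

The second number printed is the length of the count of possible hours, written out in decimal, which is itself in the billions of digits.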
Granted, most of the range of possibilities would be just purely garble.
But left over within that range is every single standard-definition TV
program ever made, every single movie or part of one, that you could fit
into an hour. Plus, and this is the freaky bit, every single TV program
or film you could put on TV, that will ever be made in the future.
A better compression scheme, like MPEG-4, will remove more of the garble
combinations and only let you specify something to watch which has a
significance to us humans. Fewer combinations per second, lower bitrate
and less data required. I could build an AI program and 3D render tools
that produce an hour of Max Headroom, and give it to you with much less
data than an hour of MPEG. But that would get limited after a while, by
repeating the same combinations or combinations that are too similar.
Interactive TV basically sends you just a few kilobytes of data, and keeps
you entertained for minutes at a time with just that. It draws pretty
shapes on the screen and writes text data into particular areas. You can
be entertained by this, if it's done properly. Requires a lot more
crafting and detailed work than pointing a camera at something and dishing
out the data coming out of the camera's cable.
But then, next time a plane parks itself on the 80th floor of a building,
you'll only be able to represent it really well using lots of megabits
each second, to show you a proper video. Just reading about it in text
form is a bit limited to what you can "show".
IMO video compression has been one of the miracle technologies of the
1990s.
CK.