Microsoft
The evolution of a data structure - the WAVEFORMAT.
Friday, October 19th, 2007
Thursday, October 18, 2007 3:13 PM LarryOsterman
In the beginning, there was a need to be able to describe the format contained in a stream of audio data.
And thus the WAVEFORMAT structure was born in Windows 3.1.
typedef struct WAVEFORMAT {
    WORD  wFormatTag;
    WORD  nChannels;
    DWORD nSamplesPerSec;
    DWORD nAvgBytesPerSec;
    WORD  nBlockAlign;
} WAVEFORMAT;
The problem with the WAVEFORMAT is that it was ok at expressing audio streams that contained samples whose size was a power of 2, but there was no way of representing audio streams that contained samples whose size was something other than that (like 24bit samples).
So the PCMWAVEFORMAT was born.
typedef struct PCMWAVEFORMAT {
    WAVEFORMAT wf;
    WORD       wBitsPerSample;
} PCMWAVEFORMAT;
If the application passed in a WAVEFORMAT with a wFormatTag of WAVE_FORMAT_PCM, it was required to actually pass in a PCMWAVEFORMAT so that the audio infrastructure could determine the number of bits per sample.
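As a rough illustration of how the PCMWAVEFORMAT fields relate to one another for uncompressed PCM, here is a minimal sketch. The typedefs are stand-ins so that it compiles without the Windows headers, and make_pcm_format is a hypothetical helper, not an API from the original post. It assumes samples occupy whole bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in typedefs so the sketch compiles without <windows.h>. */
typedef uint16_t WORD;
typedef uint32_t DWORD;

#define WAVE_FORMAT_PCM 1

typedef struct {
    WORD  wFormatTag;
    WORD  nChannels;
    DWORD nSamplesPerSec;
    DWORD nAvgBytesPerSec;
    WORD  nBlockAlign;
} WAVEFORMAT;

typedef struct {
    WAVEFORMAT wf;
    WORD       wBitsPerSample;
} PCMWAVEFORMAT;

/* Hypothetical helper: describe an uncompressed PCM stream. */
static PCMWAVEFORMAT make_pcm_format(WORD channels, DWORD rate, WORD bits)
{
    PCMWAVEFORMAT f;

    assert(bits % 8 == 0);  /* assumption: whole bytes per sample */

    f.wf.wFormatTag      = WAVE_FORMAT_PCM;
    f.wf.nChannels       = channels;
    f.wf.nSamplesPerSec  = rate;
    /* One block holds one sample for every channel. */
    f.wf.nBlockAlign     = (WORD)(channels * (bits / 8));
    /* For PCM the average byte rate is exact: blocks per second times block size. */
    f.wf.nAvgBytesPerSec = rate * f.wf.nBlockAlign;
    f.wBitsPerSample     = bits;
    return f;
}
```

For 16-bit stereo at 44,100 samples per second this gives an nBlockAlign of 4 and an nAvgBytesPerSec of 176,400. With 24-bit samples the block size becomes 6, which is the kind of stream that needs wBitsPerSample to be described at all.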
That worked fine and solved that problem, but the powers that be quickly realized that relying on the format tag for extensibility was going to be a problem in the future.
So once again, the structure was extended, and for Windows NT 3.5 and Windows 95, we got the WAVEFORMATEX that we know and love:
typedef struct tWAVEFORMATEX
{
    WORD  wFormatTag;      /* format type */
    WORD  nChannels;       /* number of channels (i.e. mono, stereo…) */
    DWORD nSamplesPerSec;  /* sample rate */
    DWORD nAvgBytesPerSec; /* for buffer estimation */
    WORD  nBlockAlign;     /* block size of data */
    WORD  wBitsPerSample;  /* number of bits per sample of mono data */
    WORD  cbSize;          /* the count in bytes of the size of */
                           /* extra information (after cbSize) */
} WAVEFORMATEX, *PWAVEFORMATEX, NEAR *NPWAVEFORMATEX, FAR *LPWAVEFORMATEX;
This solved the problem somewhat. But there was a problem - while all the APIs were changed to express a WAVEFORMATEX, there were still applications that passed in a WAVEFORMAT to the API (and there were WAV files that had been authored with WAVEFORMAT structures). The root of the issue is that there was no way of distinguishing between a WAVEFORMAT (which didn’t have a cbSize field) and a WAVEFORMATEX (which did). To resolve this, for WAVEFORMAT structures kept in files, the file metadata provided the size of the structure, so we could use the size of the structure to distinguish the various forms.
When the structure was passed in as a parameter to a function, there was still a problem. For that, the code that parses a WAVEFORMATEX structure must rely on the fact that if the wFormatTag field in the WAVEFORMAT structure is WAVE_FORMAT_PCM, then the WAVEFORMAT structure is actually a PCMWAVEFORMAT, which is the same as a WAVEFORMATEX with a cbSize field set to 0. For all other formats, the code simply assumes that the caller is passing in a WAVEFORMATEX structure.
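The parameter-parsing rule described above can be sketched as follows. This is only an illustration of the rule, not the actual Windows code: extra_format_bytes is a hypothetical name, and the typedefs are stand-ins for the ones in the Windows headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in typedefs so the sketch compiles without <windows.h>. */
typedef uint16_t WORD;
typedef uint32_t DWORD;

#define WAVE_FORMAT_PCM 1

typedef struct {
    WORD  wFormatTag;
    WORD  nChannels;
    DWORD nSamplesPerSec;
    DWORD nAvgBytesPerSec;
    WORD  nBlockAlign;
    WORD  wBitsPerSample;
    WORD  cbSize;
} WAVEFORMATEX;

/* Hypothetical helper: how many extra bytes follow the basic format? */
static WORD extra_format_bytes(const WAVEFORMATEX *fmt)
{
    assert(fmt != NULL);

    /* A PCM caller may really have passed a PCMWAVEFORMAT, which ends
       at wBitsPerSample. cbSize must not be read in that case, so the
       format is treated as a WAVEFORMATEX whose cbSize is 0. */
    if (fmt->wFormatTag == WAVE_FORMAT_PCM)
        return 0;

    /* Every other format tag is assumed to come with a real cbSize. */
    return fmt->cbSize;
}
```

The important detail is the order of the checks: the format tag is inspected first, so the cbSize field is never touched for a structure that might not contain it.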
Unfortunately, the introduction of the WAVEFORMATEX wasn’t quite enough. When you’re dealing with two channel audio streams, it’s easy to simply say that channel 0 is left and channel 1 is right (or whatever). But when you’re dealing with a multichannel audio stream, it’s not possible to determine which channel goes with which speaker. In addition, with a WAVEFORMATEX, there’s still a problem with non power-of-2 formats. This time, the problem happens when you take a 24bit waveformat and try to pack it into 32bit samples - doing this can dramatically speed up any manipulation that needs to be done on the samples, so it’s highly desirable.
So one final enhancement was made to the WAVEFORMAT structure, the WAVEFORMATEXTENSIBLE (introduced in Windows 2000):
typedef struct {
    WAVEFORMATEX Format;
    union {
        WORD wValidBitsPerSample; /* bits of precision */
        WORD wSamplesPerBlock;    /* valid if wBitsPerSample==0 */
        WORD wReserved;           /* If neither applies, set to zero. */
    } Samples;
    DWORD dwChannelMask;          /* which channels are */
                                  /* present in stream */
    GUID SubFormat;
} WAVEFORMATEXTENSIBLE, *PWAVEFORMATEXTENSIBLE;
The WAVEFORMATEXTENSIBLE contains the old WAVEFORMATEX and adds a couple of fields that allow the caller to specify the packing of the samples and to describe which channels in the stream should be directed to which speaker. For example, if the dwChannelMask is SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT | SPEAKER_LOW_FREQUENCY | SPEAKER_TOP_FRONT_LEFT, then channel 0 is the front left channel, channel 1 is the front right channel, channel 2 is the subwoofer, and channel 3 is the top front left speaker. The way you identify a WAVEFORMATEXTENSIBLE is that the Format.wFormatTag field is set to WAVE_FORMAT_EXTENSIBLE and the Format.cbSize field is always set to 0x16 (22 bytes, the size of the fields that follow the WAVEFORMATEX).
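To make the channel mask example concrete, here is a small sketch that maps a channel index to its speaker bit by walking the mask from the lowest set bit upward. speaker_for_channel is a hypothetical helper written for this illustration. The SPEAKER_* values match the Windows SDK definitions but are repeated here so the sketch stands alone.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in typedef so the sketch compiles without <windows.h>. */
typedef uint32_t DWORD;

/* Speaker position bits, repeated from the Windows SDK headers. */
#define SPEAKER_FRONT_LEFT     0x1
#define SPEAKER_FRONT_RIGHT    0x2
#define SPEAKER_LOW_FREQUENCY  0x8
#define SPEAKER_TOP_FRONT_LEFT 0x1000

/* Hypothetical helper: return the speaker bit assigned to the given
   channel index, or 0 if the mask describes fewer channels than that. */
static DWORD speaker_for_channel(DWORD mask, unsigned channel)
{
    DWORD bit;

    assert(channel < 32);  /* a 32-bit mask cannot describe more */

    /* Channels are assigned to the set bits in ascending bit order. */
    for (bit = 1; bit != 0; bit <<= 1) {
        if (mask & bit) {
            if (channel == 0)
                return bit;
            channel--;
        }
    }
    return 0;
}
```

With the mask from the example, channels 0 through 3 map to SPEAKER_FRONT_LEFT, SPEAKER_FRONT_RIGHT, SPEAKER_LOW_FREQUENCY and SPEAKER_TOP_FRONT_LEFT in that order, and channel 4 yields 0 because the mask only describes four speakers.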
That’s where things live for now - who knows if there will be another revision in the future.
Filed under: Microsoft History, Audio
Larry and the “Ping of Death”
Wednesday, October 17th, 2007
Tuesday, October 16, 2007 9:36 AM LarryOsterman
Also known as “Larry mounts a DDOS attack against every single machine running Windows NT”
Or: No stupid mistake goes unremembered.
I was recently in the office of a very senior person at Microsoft debugging a problem on his machine. He introduced himself, and commented “We’ve never met, but I’ve heard of you. Something about a ping of death?”
Oh. My. Word. People still remember the “ping of death”? Wow. I thought I was long past the ping of death (after all, it’s been 15 years), but apparently not. I’m not surprised when people who were involved in the PoD incident remember it (it was pretty spectacular), but to have a very senior person who wasn’t even working at the company at the time remember it is not a good thing :).
So, for the record, here’s the story of Larry and the Ping of Death.
First I need to describe my development environment at the time (actually, it’s pretty much the same as my dev environment today). My primary development machine ran a version of NT and a kernel debugger connected to my test machine over a serial cable. When my test machine crashed, I would use the kernel debugger on my dev machine to debug it. There was nothing debugging my dev machine, because NT was pretty darned reliable at that point and I didn’t need a kernel debugger 99% of the time. In addition, the corporate network wasn’t a switched network - as a result, each machine received datagram traffic from every other machine on the network.
Back in that day, I was working on the NT 3.1 browser (I’ve written about the browser here and here before). As I was working on some diagnostic tools for the browser, I wrote a tool to manually generate some of the packets used by the browser service.
One day, as I was adding some functionality to the tool, my dev machine crashed, and my test machine locked up.
*CRUD*. I can’t debug the problem to see what happened because I lost my kernel debugger. Ok, I’ll reboot my machines, and hopefully whatever happened will hit again.
The failure didn’t hit, so I went back to working on the tool.
And once again, my machine crashed.
At this point, everyone in the offices around me started to get noisy - there was a great deal of cursing going on. What I’d not realized was that every machine had crashed at the same time as my dev machine had crashed. And I do mean EVERY machine. Every single machine in the corporation running Windows NT had crashed. Twice (after allowing just enough time between crashes to allow people to start getting back to work).
I quickly realized that my test application was the cause of the crash, and I isolated my machines from the network and started digging in. I quickly root caused the problem - the broadcast that was sent by my test application was malformed and it exposed a bug in the bowser.sys driver. When the bowser received this packet, it crashed.
I quickly fixed the problem on my machine and added the change to the checkin queue so that it would be in the next day’s build.
I then walked around the entire building and personally apologized to every single person on the NT team for causing them to lose hours of work. And 15 years later, I’m still apologizing for that one moment of utter stupidity.
Filed under: It’s Funny, Microsoft History, Things you shouldn’t do.
How do I compare two different NetBIOS names?
Thursday, August 23rd, 2007
Wednesday, July 11, 2007 1:05 PM LarryOsterman
On one of our internal aliases, someone asked the following question:
[i]s there any API that I can use to do case insensitive comparison of two OEM strings? (NetBIOS names are encoded in OEMCP.)
Wow, that question was a blast from the past. Windows Networking before NT 3.1 (which includes NetBIOS) had this undeclared and undefined construct called the “network codepage”. Essentially an administrator was required to decide what the single codepage was for every computer on the network, and ensure that all computers were running in the same codepage.
History lesson: A NetBIOS “name” is actually a series of 16 octets, and as such can only be compared by memcmp. In DOS 3.1 (1984, which was before DNS was designed), Microsoft layered the concept of a “computer name” on top of a NetBIOS name. It did that by uppercasing (using the internal DOS case mapping table) the computername being contacted and setting that as the NetBIOS name on the PC Lan Adapter.
When DOS 3.3 came out, its major innovation was to borrow the concept of a “code page” from IBM’s mainframe systems. Essentially it meant that instead of the case mapping table being hard coded into the OS, it was loaded by an application (chcp). Note that there was still only one codepage per system, and that codepage case mapping was still per-machine. As such, if you have machines with more than one codepage on the network, you’re likely to have issues if those computernames contain internationalized characters. We received complaints from customers at the time about this. Microsoft’s answer was essentially “don’t use international characters in your computer names”.
Windows 1.0 added a second codepage to DOS, called the “ANSI Codepage” (or Windows Codepage). Windows applications used this codepage, while MS-DOS continued to use the codepage loaded by chcp. This MS-DOS codepage became known as the OEM codepage.
Fast forward 8 years when NT 3.1 came out. NT 3.1 still had a single OEM codepage, but added support for Unicode. Millions rejoiced, especially those customers who got pissed off by our answer from 8 years earlier. NT 3.1’s rules were slightly better than MS-DOS’s rules. NT 3.1 took the Unicode computername and uppercased it using the current active codepage. It then converted that uppercase Unicode string to the single OEM codepage and used that series of octets as the computer name.
The customers who had been pissed off by our answer 8 years earlier were somewhat happier, but not very much. If you have more than one codepage on the network, you STILL can have issues because the upper casing rules are still per-machine, and characters uppercase differently depending on the character set on the machine. Essentially NT 3.1 helped things for some computer names, but we STILL had this undefined, undeclared concept of a “network codepage”.
As far as I know, this is still the state of the world w.r.t. NetBIOS names.
In general, you’re better off matching two computernames (i.e. before the Unicode to OEM conversion) before you try matching two NetBIOS names.
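To illustrate why the comparison itself is the easy part, here is a minimal sketch that builds a 16-octet name and compares two of them with memcmp. Both helpers are hypothetical. The sketch assumes the common convention of a name padded with spaces to 15 octets followed by a one-octet suffix, which this post does not spell out, and it deliberately uppercases only the ASCII letters a-z so that it sidesteps the codepage-dependent case mapping that causes the trouble described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NETBIOS_NAME_LEN 16

/* Hypothetical helper: build a 16-octet name from an ASCII computer
   name. Assumption: space padding to 15 octets plus a suffix octet. */
static void make_netbios_name(const char *computer_name,
                              unsigned char suffix,
                              unsigned char out[NETBIOS_NAME_LEN])
{
    size_t i;
    size_t len = strlen(computer_name);

    assert(len <= NETBIOS_NAME_LEN - 1);

    memset(out, ' ', NETBIOS_NAME_LEN - 1);
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)computer_name[i];
        /* ASCII-only uppercasing: no codepage table is consulted. */
        if (c >= 'a' && c <= 'z')
            c = (unsigned char)(c - 'a' + 'A');
        out[i] = c;
    }
    out[NETBIOS_NAME_LEN - 1] = suffix;
}

/* The name is just 16 octets, so equality is a plain memcmp. */
static int netbios_names_equal(const unsigned char a[NETBIOS_NAME_LEN],
                               const unsigned char b[NETBIOS_NAME_LEN])
{
    return memcmp(a, b, NETBIOS_NAME_LEN) == 0;
}
```

Any character outside a-z passes through untouched, so two machines that uppercase an accented character differently would still produce different octets. That is exactly the “network codepage” problem, and this sketch does not solve it.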
If you were to root cause the problem, the issue is that most networking protocols were not designed with internationalization in mind - as a result, most of them seem to have an assumption that both sides of the network transaction are running with the same internationalization rules. It’s not surprising, honestly - I was involved in some of the efforts to define internationalization extensions to the IMAP4 protocol and it turned into a swamp pit. The problem is that at the time (late 1990s) there weren’t many international standards for case folding, and thus the group was stuck with essentially punting the problem to the host OS, which wasn’t considered a good solution because many OS’s had limited support for multiple case folding tables. As a result, networking protocols that specify case insensitivity tend to describe their command verbs as being in the 7bit ASCII set (which has relatively straightforward case folding rules) and punt the problem of case folding to the server (which essentially means that you either support case sensitivity or you assume some kind of network codepage).
Filed under: Microsoft History
FPO
Thursday, August 23rd, 2007
Monday, March 12, 2007 9:44 AM LarryOsterman
I was chatting with one of the perf guys last week and he mentioned something that surprised me greatly. Apparently he’s having perf issues that appear to be associated with a 3rd party driver. Unfortunately, he’s having problems figuring out what’s going wrong because the vendor built the driver using FPO (and hasn’t provided symbols), so the perf guy can’t track down the root cause of the problem.
The reason I was surprised was that I didn’t realize that ANYONE was using FPO any more.
What’s FPO?
To know the answer, you have to go way back into prehistory.
Intel’s 8088 processor had an extremely limited set of registers (I’m ignoring the segment registers), they were:
| AX | BX | CX | DX | IP |
| SI | DI | BP | SP | FLAGS |
With such a limited set of registers, the registers were all assigned specific purposes. AX, BX, CX, and DX were the “General Purpose” registers, SI and DI were “Index” registers, SP was the “Stack Pointer”, BP was the “Frame Pointer”, IP was the “Instruction Pointer”, and FLAGS was a read-only register that contained several bits that indicated information about the processor’s current state (whether the result of the previous arithmetic or logical instruction was 0, for instance).
The BX, SI, DI and BP registers were special because they could be used as “Index” registers. Index registers are critically important to a compiler, because they are used to access memory through a pointer. In other words, if you have a structure that’s located at offset 0x1234 in memory, you can set an index register to the value 0x1234 and access values relative to that location. For example:
MOV BX, [Structure]
MOV AX, [BX]+4
Will set the BX register to the value of the memory pointed to by [Structure] and set the value of AX to the WORD located at the 4th byte relative to the start of that structure.
One thing to note is that the SP register wasn’t an index register. That meant that to access variables on the stack, you needed to use a different register. That’s where the BP register came from - the BP register was dedicated to accessing values on the stack.
When the 386 came out, they stretched the various registers to 32bits, and they fixed the restrictions that only BX, SI, DI and BP could be used as index registers.
| EAX | EBX | ECX | EDX | EIP |
| ESI | EDI | EBP | ESP | FLAGS |
This was a good thing, all of a sudden, instead of being constrained to 3 index registers, the compiler could use 6 of them.
Since index registers are used for structure access, to a compiler they’re like gold - more of them is a good thing, and it’s worth almost any amount of effort to gain more of them.
Some extraordinarily clever person realized that since ESP was now an index register the EBP register no longer had to be dedicated for accessing variables on the stack. In other words, instead of:
MyFunction:
PUSH EBP
MOV EBP, ESP
SUB ESP, <LocalVariableStorage>
MOV EAX, [EBP+8]
:
:
MOV ESP, EBP
POP EBP
RETD
to access the 1st parameter on the stack (EBP+0 is the old value of EBP, EBP+4 is the return address), you can instead do:
MyFunction:
SUB ESP, <LocalVariableStorage>
MOV EAX, [ESP+4+<LocalVariableStorage>]
:
:
ADD ESP, <LocalVariableStorage>
RETD
This works GREAT - all of a sudden, EBP can be repurposed and used as another general purpose register! The compiler folks called this optimization “Frame Pointer Omission”, and it went by the acronym FPO.
But there’s one small problem with FPO.
If you look at the pre-FPO example for MyFunction, you’d notice that the first instruction in the routine was PUSH EBP followed by a MOV EBP, ESP. That had an interesting and extremely useful side effect. It essentially created a singly linked list that linked the frame pointer for each of the callers to a function. From the EBP for a routine, you could recover the entire call stack for a function. This was unbelievably useful for debuggers - it meant that call stacks were quite reliable, even if you didn’t have symbols for all the modules being debugged. Unfortunately, when FPO was enabled, that list of stack frames was lost - the information simply wasn’t being tracked.
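The linked list described above can be sketched in C. This is a simulation only: the frames are built by hand rather than read from a real stack, the Frame type and walk_frames are hypothetical names, and the sketch assumes the outermost frame stores a null saved frame pointer to end the chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What PUSH EBP / MOV EBP, ESP leaves at the address held in EBP:
   the caller's saved EBP, followed by the return address. */
typedef struct Frame {
    const struct Frame *caller_frame;   /* [EBP+0]: old value of EBP */
    uintptr_t           return_address; /* [EBP+4]: return address   */
} Frame;

/* Hypothetical helper: follow the saved-EBP chain and collect return
   addresses, innermost call first. Returns the number collected. */
static size_t walk_frames(const Frame *ebp, uintptr_t *out, size_t max)
{
    size_t n = 0;

    assert(out != NULL || max == 0);

    /* Assumption: the outermost frame has a null caller_frame. */
    while (ebp != NULL && n < max) {
        out[n++] = ebp->return_address;
        ebp = ebp->caller_frame;
    }
    return n;
}
```

Nothing in the walk needs symbols, which is why call stacks were reliable before FPO. Once a function stops pushing EBP, its frame never joins the list and the walker has nothing to follow.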
To solve this problem, the compiler guys put the information that was lost when FPO was enabled into the PDB file for the binary. Thus, when you had symbols for the modules, you could recover all the stack information.
FPO was enabled for all Windows binaries in NT 3.51, but was turned off for Windows binaries in Vista because it was no longer necessary - machines got sufficiently faster since 1995 that the performance improvements that were achieved by FPO weren’t sufficient to counter the pain in debugging and analysis that FPO caused.
Edit: Clarified what I meant by “FPO was enabled in NT 3.51” and “was turned off in Vista”, thanks Steve for pointing this out.
Filed under: Microsoft History, Fascinating geek stuff, Nifty Win32 tricks.
Vista Ship Gift, Part 2
Wednesday, August 22nd, 2007
Friday, February 09, 2007 10:16 AM LarryOsterman
It’s a Microsoft tradition that the people who worked on a project get a copy of the project when it ships. I’ve got copies of OS/2 1.1, NT 3.1, Exchange 4.0, 5.0, 5.5 and 2000 on my shelves, for example, all with their shrinkwrap untouched.
Well, my copy of Vista finally showed up yesterday, and the ship gift people totally outdid themselves this time.
Here it is.
Front:
Back:
Side:
You probably can’t make it out in the pictures, but across the front and back is a subtle wash consisting of code - don’t know what code it is, but it’s code.
On the front is the word “HANDCRAFTED” and the Vista logo
On the back is the text:
“BY YOU
We build software line by line, idea by idea, side by side. Our software is an expression of ourselves, our best moments, our toughest challenges, our greatest hopes. So it’s a strange and beautiful day when this handcrafted product leaves our labs and appears on millions of computers around the globe. Remember this day. You have changed the world.”
Inside the fold is a collection of pictures, some from the ship party, some from inside Microsoft:
I normally don’t open the packaging on my ship gifts but in this case I made an exception, because again, this one was special.
Front:
Back:
Side:
The text on the back reads:
“FOR ALL THE…
Delighted customers, great ideas, tough deadlines, clever solutions, lines of code, pages of specs, runs of automation, lines of text, screens of UI, missed dinners, fixed bugs, inspiring teamwork, countless iterations, courage to break the rules, time away from loved ones, times you rose to the occasion, late nights, early mornings, delayed vacations, chances you took, long meetings, short meetings, canceled meetings, killed features, features that wouldn’t die, crashed machines, moments of victory, moments of defeat, coffees, doughnuts, pizzas, beers, relentless dedication, blood, sweat, and tears.
THANK YOU”
And I want to thank whoever it was on the product team that designed this packaging. It’s absolutely awesome, and I think it totally captures the effort that went into Vista. I especially love the text on the inside package.
It’s funny - when the commemorative edition started showing up, I noticed something unique. In my 22+ years at Microsoft, I’ve NEVER seen people take the “thank you” copy of the product out and show it to others. But when we got this copy, there were lots of people walking around the halls showing the box off to others. Every one of them called out the text on the package as being meaningful.
Filed under: Microsoft History
Why was the ability to specify an allocator during CoInitialize removed from the system?
Wednesday, August 22nd, 2007
Thursday, February 08, 2007 12:30 PM LarryOsterman
Yesterday I talked about CoGetMalloc. One thing I didn’t include was why the ability to specify an allocator was removed from the system. If you’ve read Raymond’s blog, the answer should be obvious. I suspected it, but wasn’t sure, but after I submitted yesterday’s post, I got an email internally.
The ability to specify an allocator was a feature added back in the days when applications were trusted not to screw up.
Unfortunately, whenever applications were given the ability to screw up, they did. They provided their own IMalloc implementations, many of which didn’t correctly implement the IMalloc contract. Now this isn’t necessarily a big deal - the application just stomped on itself, right, so it’s the application’s fault, right? Well yeah, but sometimes these 3rd party allocators took out Windows components as well. All in all, it wasn’t very pretty.
When the 32-bit version of COM was implemented (for NT 3.5), the decision was made to deprecate the first parameter to CoInitialize/CoInitializeEx. The IMallocSpy API was added to allow applications the ability to track leaks and monitor memory. The COM guys were able to get away with this breaking change because all 32-bit applications were new applications, thus no existing applications would be broken by the change.
Filed under: Microsoft History, Fascinating geek stuff, Nifty Win32 tricks.