Free Web Hosting Provider - Web Hosting - E-commerce - High Speed Internet - Free Web Page
Search the Web

Game programming under Windows using CDX

Abstract: Maintaining constant speed of a game on different computers

Author: Peter Eichler (mailto:peichler@nikocity.de)

1         Requirements:

The reader should have average to good knowledge about programming under Windows, especially using the programming language C++. He also should have a basic understanding of Direct X and the free class library CDX. This abstract refers to the CDX library version 2.4. The user should also have a basic understanding on how a game and a Windows program works internally, since this abstract does not discuss the theory of game programming or Windows programming itself.

In this abstract, the term game will refer to all computer games using animated graphics such as sprites drawn in real time. Although this abstract will talk mostly only about 2D based games, the discussed principles can be used for 3D based games also. This abstract does not refer to games, where real time graphics is not at high priority or simply not present, such as chess games or any kind of board games or turn based games. Please note, that this abstract is no replacement for the description of the Windows API.

2         The Problem

Current PCs consist of a variety of hardware components, each different in quality, price and speed. For instance, modern microprocessors may vary in speed by roughly 250% (considering 400 MHz as low end processors and 1 GHz as high end). Whereas in most cases the user is happy that a given software will run as fast as possible on his computer, a good game should run with the very same speed, regardless what computer will be used (except that the computer should meet the minimum requirements for the game). Otherwise the game will probably be unplayable on very fast computers.

So how can a computer game achieve this? The magic words in this case are of course timing, frame rates and synchronization.

2.1        How a Standard Game Loop Works

Under Windows, a usual game loop will mostly do its work after the same scheme. First, it should process any messages (if present) received by the game window (please note, that even a Direct X full screen game has its own window, although this is completely invisible for the user). Then it should check if the game is still the active application and if so, the game may render the current frame and do all other necessary stuff, e.g. checking input from game devices. If the game is not the active application, the game should simply wait for a message to retrieve. In this way, the game consumes no additional computer resources and the computer is still usable for other tasks. If the current frame or scene was rendered successfully, everything will start all over. Let us have a look to such a loop, where the window messages will be processed and the scene gets rendered:

for(;;)

{

   MSG msg;

   if(PeekMessage(&msg, NULL, 0, 0, PM_NOREMOVE)) // Do we have a message present?

   {

      if(!GetMessage(&msg, NULL, 0, 0 ))          // Can we process the message?

         return msg.wParam;                       // No? Return to Windows!

 

      if(msg.message == WM_QUIT)                  // Check for a quit message

         break;

 

      TranslateMessage(&msg);                     // The usual Windows stuff...

      DispatchMessage(&msg);

   }

 

   if(fGameIsActive)                              // If the game is active either

      RenderScene();                              // enter the game loop or

   else

      WaitMessage();                              // wait to become active again

}

 

For the sake of readability I have left out all initialization of Direct X / CDX, the opening of the game window and the window callback procedure. Also note that the unknown variable in this example, fGameIsActive, is set in a different location (usually in the window callback procedure, processing the message WM_ACTIVATE_APP).

As we can see here, the game will run as fast as possible, when no messages getting processed and the game is active. In this case the loop would reduce (theoretically) itself to a mere:

for(;;)

{

   RenderScene();

}

 

Since this is not what we want, a more sophisticated solution should be found.

3         Two Ways to Solution

In fact there are countless ways to solve this problem, but the abstract will discuss the two most common approaches:

3.1        Constant Frame Rates:

In this case a constant number of frames will be rendered every second. This technique is very similar to a cartoon or movie, where several still pictures combined make the illusion of movement.

To have smooth movement, the number of frames per second must not change over time and the frame rate itself has to be high enough to deceive the human eye (some scientists state, that the human eye can resolve only 24 pictures/second for moving objects, so the frame rate should be higher to fool even very sensitive humans, if possible, try a rate of 30 frames/second or even higher).

The speed of movement (velocity) – e.g. for a sprite or other moving objects – is therefore often described in pixels per frame, instead of pixels per second, although you can derive one from another.

To achieve constant frame rates, one second is divided in many equal sized time slices and the rendering of each frame starts at the beginning of each time slice. The game has to guarantee, that the rendering of each frame takes no longer than one time slice, otherwise the constant frequency of frames will be broken. To calculate the maximum possible frame rate for a game in this case, the amount of time needed to render the most complex scene – the worst case – has to be measured. From this the number of frames per second can be calculated. For example, if the most complex scene in a game needs 25 milliseconds to render, a maximum of 40 frames per second can be achieved (1 second = 1000 milliseconds / 25 milliseconds = 40). Of course you get better frame rates on faster computers but this will not help for slower computers. So always calculate the frame rate on the slowest supported computer system by your game (minimum requirement).

On the other hand, if the game does not need the full time slice to render a frame, this spare time will not be used by the game and either wasted in a busy loop or handed over to the operating system, so other processes can use this amount of time. Although really sophisticated game engines can even use this spare time for other tasks than rendering, this technique will not be discussed in this abstract.

3.1.1       Advantages

This approach is rather easy to implement, is reliable and looks good (very smooth, the higher the frame rate, the better). The game will have the very same frame rate on every computer matching at least the minimum requirements.

3.1.2       Disadvantages

Spare time on faster computers will not be used to increase the frame rate (not the speed). This resource will be wasted. If the game was designed to run on a slow computer, it will always run like using a slow computer. Additional power will not be used.

The best frame rate is not easy to find. Higher frame rates may require faster computers and reduces the time slice for each frame. But the time slice must not be too small; otherwise a complex scene may not get rendered in time.

3.2        Dynamic Frame Rates:

In this case the game will render each frame as fast as possible, not relying on constant time slices. Since complex scenes take longer to render than less complex scenes, the frame rate may vary heavily, depending on what the game has to do.

To avoid stuttering or other signs of speed difference, the game has to measure the amount of time each frame needed to be rendered and adapt the speed of all moving objects accordingly. This sounds more difficult than it is.

The speed of movement is therefore often described in pixels per second, instead of pixels per frame. In this case you cannot derive one from another.

To achieve dynamic frame rates, the time needed to render a frame will be calculated. This is simply done by getting some kind of a time stamp at the beginning of rendering the frame, and a time stamp after rendering the frame. The difference is the amount of time the frame needed to be rendered. Since the game knows how long it took to render the last frame and it also knows the speed of an object as pixel per second, it can therefore calculate the amount of pixels to move the object in the current frame. For example, if an object moves 200 pixels per second, and the last frame took 15 milliseconds to render, the object has (virtually) moved 3 pixels in this time period or delta (200 pixels / 1 second * 15 milliseconds = 200 / 1000 * 15 = 3 pixels). So if the current frame is rendered, the object appears now 3 pixels away from the old position. The fraction of the real movement has to be calculated every frame again for every object.

3.2.1       Advantages

The maximum possible frame rate will be achieved on every computer, although the game is running at the same speed (not frame rate) on different systems. The implementation is not much more complex than the first approach with constant frame rates. So the full power of the computer system will be used.

3.2.2       Disadvantages

Sadly, this approach is not the right one for every type of game. If the fraction of movement of an object is calculated, the result is mostly not an integral number (e.g. the object has to be moved by 3.226 pixels). Due to rounding problems, an object may jump by one pixel. Even worse, this number changes from frame to frame. In fact, the speed of an object is constant over a period of time, but not from frame to frame. This means, that if an object is moving with a constant speed, this will not appear so, because the frame rate is not constant and therefore not the movement per frame. For objects, which accelerate or slow down, this is mostly not visible, because the speed of the object is changing anyway. But for objects with a constant speed, the movement appears anything else than smooth.

This kind of solution is really great for games where nearly the whole content of the screen is changing for every frame and/or the display hardware can smooth or filter objects. This is true for nearly every 3D ego shooter or similar games. For 2D bitmap based games or for games, where only few objects are moving, or where you have smooth a scrolling background (classic side scroller or arcade games) the result is not smooth and therefore it is not advisable to use this approach.

3.3        Timing

As we can see, exact measurement of time is the central point in every of these solutions. This abstract will discuss in the next chapters all alternatives of time measurement, timers and counters supported by Windows, while explaining the first solution of constant frame rates. The explanation of dynamic frame rates will use this knowledge and will only show up the differences to avoid double explanations.

4         Constant Frame Rates

As explained, constant frame rates means, that the game will render a fixed number of frames per second, regardless of the speed of the computer and the complexity of the scene. But how do we get the game to render this fixed number of frames per second? Think of the clock frequency of a microprocessor. Fortunately, Windows has several ways to offer us clocked signals.

4.1        (Periodical) Timers

Do not mistake timers with clocks. Whereas clocks simply measure the time, a timer is like an alarm, sending some kind of a signal, when a given time has been reached or a defined amount of time has been elapsed. Timers can send such signals either once or periodically. Since the game may run a little bit longer than for one frame, the latter should be preferred.

4.1.1       The WM_TIMER Message

This is the first and easiest way of synchronizing the game to one speed. We can make Windows to send us periodically a message to our window queue. We will render only a frame, if such a message is received by our game.

There are two ways of using the WM_TIMER message. Either Windows will send us the message (as any other window message) to the message queue of the window, or it will enter a special callback function specified in the program. To activate the timer, a call to SetTimer() is necessary. The function looks like this:

SetTimer(WindowHandle, SelfDefinedTimerID, TimeSlice, TimerFunction);

If you need only one timer, regardless of the window, you can set the window handle to NULL, the timer ID will be ignored then. If you specify a timer function, this one will be called instead of sending a message. Specifying NULL means sending a message. So the simplest call to SetTimer() is:

SetTimer(NULL, 0, TimeSlice, NULL);

The TimeSlice is the amount of time between to signals, measured in milliseconds. If we want to have 25 signals per second, we must therefore define a time slice of 40 milliseconds (1 sec / 25 = 1000 milliseconds / 25 = 40). If we incorporate this into our standard game loop from the first chapters, this would look like this (the changes are marked in red, for those who have a color printer):

SetTimer(NULL, 0, 40, NULL);

 

for(;;)

{

   MSG msg;

   if(PeekMessage(&msg, NULL, 0, 0, PM_NOREMOVE)) // Do we have a message present?

   {

      if(!GetMessage(&msg, NULL, 0, 0 ))          // Can we process the message?

         return msg.wParam;                       // No? Return to Windows!

 

      if(msg.message == WM_QUIT)                  // Check for a quit message

         break;

 

      if(msg.message == WM_TIMER && fGameIsActive)// Render the scene only on timer signal

         RenderScene();

 

      TranslateMessage(&msg);                     // The usual Windows stuff...

      DispatchMessage(&msg);

   }

 

   if(!fGameIsActive)                             // If the game is not active

      WaitMessage();                              // wait to become active again

}

 

What did we change? First of all, we activate the timer at the beginning of the loop to send us 25 signals per second by defining a time slice of 40 milliseconds. The second change is that we moved the call of RenderScene() from outside the message loop to the inside, since RenderScene()is now called only by receiving the WM_TIMER message. Due to this, we simplified the call to WaitMessage() at the very end of the loop.

Instead of receiving a WM_TIMER message, we can specify a timer function in SetTimer(). In this case, no WM_TIMER message is ever received by the program; instead your timer function will be called directly in a second thread. The advantage of this is that your timer function (actually, this will be in our case RenderScene() itself) is independent of the rest of the program. The disadvantage is, that programming for multiple threads is far more complicated than programming for single threading (think only of concurrent access to the same resources). And, to make it not easier, a second thread makes the game not faster, only slower (even if it is only a mere one or two percent). And since it will not simplify a game or this abstract, this alternative will not be discussed further.

At the end of your program you should call KillTimer() to end the signaling of WM_TIMER messages (not shown in the example).

In principle, we have solved the problem of our abstract, so why it is this not the end of the abstract? Sadly, the usage of WM_TIMER has several drawbacks:

·         Inaccuracy: Although you specify the period between to signals in steps of one millisecond, the resolution of the timer is not better than 55 milliseconds. In other words, regardless of the time slice you specify, it is going to be at least 55 milliseconds between two WM_TIMER messages. It may even took much longer before they are processed, because Windows does not give timer messages any priority, they have to wait in line like any other message. In our example, the time slice between two timer messages is not 40 but 55 milliseconds, such changing the frame rate from 25 frames/second to 18 frames/second without our knowledge.

·         Missing a beat: To avoid a message queue overflow, Windows only sends a timer message, when there is no other timer message present in your message queue. So if Windows is busy doing other things or your game just have to render a scene more complex than expected, your game simply will skip some message rather than racing to catch up. Since most games depend on a frequent scenery update, missing a beat is often not an option.

Due to this, the usage of SetTimer()is not an option for games which rely on stable timing signals. But – as usually for Windows – you can achieve the same goal on different ways.

4.1.2       The Multi Media Timer

Included in the multi media extension to Windows, there is a multimedia timer with a resolution of one millisecond, invoked with timeSetEvent(). Besides the fact, that the resolution is much higher, this timer is even more accurate, because it does not send any timer messages, which may have to wait in a message queue. Every timer activated through timeSetEvent()is being put into a separate thread calling a callback function directly, when the signal occurs.

The usage of timeSetEvent() has the same implications as using a callback function in SetTimer(): the pitfalls of multithreading. Since every timer has its own thread, you cannot avoid the aspects of multithreading. If you use a multi media timer, make sure you understand about synchronization, semaphores and the other aspects of multithread programming.

Although the multi media timer has a resolution high enough to fit the needs for a game, even this timer has at least one drawback:

·         Timer latency. The thread containing the multimedia timer is in a blocked state until the specified time is elapsed. It then will change its state to ready, but will not run, until the current running thread has eaten up its time slice or given away the control of the processor for other reasons. Even raising the thread priority of your timer thread will not solve this problem. The amount of the latency depends on what you will do in your other threads and what other programs will do, which may still running. The latency may vary between 0 to several milliseconds, depending on the current load of threads and their amount of work. This problem of latency refers to all kind of multithreading, not only to timers.

Although WM_TIMER is nearly unusable for games, the usage of the multimedia timer may not. This depends on what for an accuracy you will need for your game. If your frame rate is rather low (and therefore the time slice specified in timeSetEvent() rather high) than the latency of the timer is compared to your time slice probably small enough to be negligibly. If you are unsure if the accuracy and the latency of the multimedia timers will fit your needs, simply skip this alternative and try a better solution.

4.1.3       Conclusion

Timers are not the best alternative to be used in games, although they are easy to use and support the event driven programming technique. Either timers have a bad resolution (inaccuracy) or the signal may not arrive to a specific time (latency). They can be still used for games, where precision is not at high priority.

4.2        Clocks

As stated before, clocks only measure the time itself, but have no way of signaling, that a given period of time has elapsed. This is far away of the event driven programming style and the older style of polling data has to be used. Although clocks do not support time slices like timers, this can be easily achieved by calculating the difference between two calls to a clock.

4.2.1       The System Clock

The system clock will return the number of milliseconds since the operating system has been started (the famous system time). The resolution of the timer is different from operating system to operating system. Under Windows 95/98 the resolution is 1 millisecond, under Windows NT/2000 at least 5 milliseconds. And do not expect, that any successor of Windows 2000 have the same behavior. Fortunately, you can change the resolution of the clock under Windows NT/2000.

Be careful, the system clock is (only) a 32 bit counter, so after 232 milliseconds (which is about 49.71 days), the clock wraps around to 0 again. So do not expect, that a latter call to retrieve the system time will result in a bigger number than the first call. To measure a time slice, always calculate it by the absolute value of the difference of both numbers. Do not laugh, about this matter, a lot of people run their computers day and night, so it is really possible, that a system runs for more than 40 days without interruption.

With the function timeBeginPeriod() the resolution of the counter can be changed. This should always be done in the case that the game is running under Windows NT/2000. The resolution of the counter has to be set back, if the counter is not needed anymore. This is done with the function timeEndPeriod(). The system time can be retrieved using timeGetTime(). An example:

timeBeginPeriod(1);                            // Set clock resolution to 1 ms

 

long lStartTime = timeGetTime;                 // Get the system time

 

// Imagine, something very time consuming happens here

 

long lEndTime   = timeGetTime();               // Get the system time again

long lTimeSlice = labs(lEndTime - lStartTime); // This is the time slice

 

timeEndPeriod(1);                              // Reset the resolution of the clock

 

The calls to timeBeginPeriod() and timeEndPeriod() are only needed once in a program, so do not try to wrap this around every call to timeGetTime(), this is only wasted processor power.

The resolution of 1 millisecond is sufficient for nearly all games, although the functions for retrieving the system time cannot guarantee to support this resolution on each system. So always check the return codes of the functions. And of course, for a complete and detailed explanation of these (and all other) functions, read the documents about the Windows API.

The system clock – and this is valid for all clocks – has much less overhead than timers, because the time is retrieved directly. The operating system does neither need to send any messages nor establish a second thread for any callback procedure. Thus the problem of multithreading programming does not apply to counters. And since the time is retrieved immediately, latency is not a problem either.

4.2.2       The High-Resolution Performance Counter

Although not present on ancient hardware, modern computer have a high-resolution counter, the so called “performance counter” (named that way, because it was first used under Windows only for software performance measurement techniques). The high-resolution performance counter usually makes about 1.1 to 1.2 million ticks per second, thus having a resolution over 1000 times better than the multi media timer. Due to this the counter is optimal for game programming, since it allows a very fine timing.

With the a call to QueryPerformanceFrequency() we can examine the exact frequency (or the amount of ticks per second) of the counter. This number is rather odd and rather high and may be different from system to system, so do not rely on constant numbers. The call looks like this:

BOOL QueryPerformanceFrequency(LARGE_INTEGER* pFrequency);

The function returns FALSE, if there is no counter present in the system. This happens mainly on older computer systems. The pointer pFrequency points to a variable, where the frequency of the counter should be stored. As you can see, this is a LARGE_INTEGER, which is type compatible with _int64, so 64-bit arithmetic is involved here. LARGE_INTEGER is provided for those compilers, which cannot handle 64-bit arithmetic. 32 bit is much too small for this counter, imagine how long a counter would need to wrap around 0, if over 1 million ticks are added each second (in fact: about one hour only). With 64-bit arithmetic the system need much more than 475,000 years to put the counter to 0 again.

Since the frequency is probably not the same you need to measure on of your time slices, you have to recalculate some numbers a bit, but this is fairly easy.

Imagine that you want to have a frame rate of 40 frames per second. Which is the maximum of time your game can spend in rendering one frame? What is your time slice? Well of course 1/40 second. Converted in milliseconds that is 25 milliseconds. Simplified, you could use this:

#define FRAME_RATE 40

#define TIME_SLICE (1000 / FRAME_RATE)

Since the performance counter does not use milliseconds but rather an odd frequency, we have to convert these two frequencies:

_int64 iFrequency;

 

QueryPerformanceFrequency((LARGE_INTEGER*) &iFrequency);

 

_int64 iTicksPerTimeSlice = TIME_SLICE / 1000 * iFrequency;

4.2.3       Conclusion

Clocks (or counters) are much more effective than timers. First of all, the resolution is at least the same, if not better (the resolution of the performance counter is currently unbeatable). Second, the calls to retrieve the counter values have less overhead and work faster. Counters also do not have any latency problems, since the counter values will be retrieved directly.

If possible, use the high-resolution performance counter, which should be present in all modern computer system. The second best alternative is to use the system time. If you are really clever, implement a small wrapper, which checks for the presence of the performance counter and if not, simply uses the system time. If you make this wrapper transparent, the rest of the game has not to care about timers or clocks anymore and will always use the best possible solution.

4.3        Back to Constant Frame Rates

After this lengthy discussion about timers and counters we should come back to the original matter of this big chapter. The matter of constant frame rates was explained in theory some chapters ago and we have found a sample implementation for this with the WM_TIMER message, but now it is time to look at the more practical part. First, get rid of the whole discussion about counter and let us try to implement such a wrapper class for counters we discussed earlier in this abstract. This class is so small; it should be selfexplanatory (for readability, both implementation and definition of this class are put together):

class GameClock

{

 

protected:

 

   BOOL   bHPCpresent;          // Is high resolution performance counter present

   _int64 iFrequency;           // Frequency of the counter

   _int64 iTicksPerTimeSlice;   // Ticks/second of the counter

 

   _int64 iTimeStart;           // Time where clock has started

   _int64 iTimeEnd;             // Time where clock has stopped

 

public:

 

   GameClock(double dTimeSlice)

   {

      // Check frequency of counter and see if it is present

      bHPCpresent = QueryPerformanceFrequency((LARGE_INTEGER*) &iFrequency);

 

      // Convert from milliseconds to ticks per second

      if(bHPCpresent)

         iTicksPerTimeSlice = (_int64) (dTimeSlice / 1000.0 * iFrequency);

      else

      {

         timeBeginPeriod(1);

         iTicksPerTimeSlice = (_int64) dTimeSlice;

      }

 

      // Init time values

      iTimeStart = 0;

      iTimeEnd   = 0;

   }

 

   ~GameClock()

   {

      if(!bHPCpresent)

         timeEndPeriod(1);

   }

 

   void Start()

   {

      // Retrieve the start time

      if(bHPCpresent)

         QueryPerformanceCounter((LARGE_INTEGER*) &iTimeStart);

      else

         iTimeStart = (_int64) timeGetTime();

   }

 

   void Stop()

   {

      // Retrieve the end time

      if(bHPCpresent)

         QueryPerformanceCounter((LARGE_INTEGER*) &iTimeEnd);

      else

         iTimeEnd = (_int64) timeGetTime();

   }

 

   _int64 ElapsedTime()

   {

      // Calculate the time elapsed between Start() and Stop()

      _int64 iResult = iTimeEnd - iTimeStart;

 

      // For the uncommon case, that the counter has wrapped around:

      if(iResult < 0)

         iResult *= -1;

 

      return iResult;

   }

 

   BOOL TimeSliceOver()

   {

      // See, if one time slice is over

      return ElapsedTime() >= iTicksPerTimeSlice;

   }

 

   _int64& StartTime()  { return iTimeStart;         }

   _int64& EndTime()    { return iTimeEnd;           }

   _int64& TicksPerTS() { return iTicksPerTimeSlice; }

};

 

This class is safe against counters which like to wrap around and feel comfortable to people, who likes to increase the measured time slice by using several Stops() but only one Start().

After we have solved finally the problem of time measurement, we can go now over to our well-known message loop and try to bring it up to date. The interesting parts are marked in red:

 

#define FRAME_RATE 40                  // 40 frames per second

#define TIME_SLICE (1000 / FRAME_RATE) // Max. time (in ms) to render 1 frame = 1/40 sec.

 

GameClock Clock(TIME_SLICE);

 

Clock.Start();                                    // Start the clock.

 

for(;;)

{

   MSG msg;

   if(PeekMessage(&msg, NULL, 0, 0, PM_NOREMOVE)) // Do we have a message present?

   {

      if(!GetMessage(&msg, NULL, 0, 0 ))          // Can we process the message?

         return msg.wParam;                       // No? Return to Windows!

 

      if(msg.message == WM_QUIT)                  // Check for a quit message

         break;

 

      TranslateMessage(&msg);                     // The usual Windows stuff...

      DispatchMessage(&msg);

   }

 

   if(fGameIsActive)                              // Is the game the active application?

   {

      Clock.Stop();                               // Stop the clock to see...

 

      while(Clock.TimeSliceOver())                // ... if one time slice has elapsed

      {

         RenderScene();                           // If so, render the current frame and

         Clock.StartTime() += Clock.TicksPerTS(); // set the time for the next frame.

      }

   }

   else