Hardware interface.
-------------------

Real computer scientists despise the idea of actual hardware. Hardware has limitations, software doesn't. It's a real shame that Turing machines are so poor at I/O.

What does any graphics product start with? With the surface things are being rendered on. We will be talking about raster graphics, the kind used in almost all cases nowadays. The idea behind it is that the image is composed of little square cells - pixels (picture elements) - each having a particular colour. In terms of computer hardware it means that some part of the memory contains a big array filled with numbers which are interpreted as the colours of screen pixels. Changing those numbers triggers changes on the screen.

What should interactive graphics have? The plane we render on, and the way our program reacts to actions, that is keystrokes or mouse moves. This is closely related to one other thing: execution flow. The reason we are talking about these things is that their implementation differs from one computer system to another, and while graphics algorithms themselves are fairly general, the way a computer interfaces the screen or handles keyboard events is particular to every piece of hardware on the market. So what belongs in a hardware interface, and how can we provide a common denominator for different architectures? Before answering, let's consider how any interactive 3-D application works.

Since we want to see smooth motion whenever our position in the virtual world changes (or the world itself changes), frames have to be completely composed by the time they appear on screen. We don't really want to watch polygons or lines appear one after another. What is done instead is rendering the frame into an off-screen buffer, and then blitting the completely composed frame onto the actual visible surface. And, no surprise here, the way different systems handle screen accesses does indeed vary. In the case of MS-DOS the physical screen memory can be accessed directly. On the other hand, when creating MS-WINDOWS, OS/2, X-WINDOWS or NeXTStep applications we can't really write directly to the screen. The memory protection simply won't allow that, so instead various API (Application Programming Interface) functions, delegating tasks to the operating system, have to be called. (There actually are ways to bypass some operating systems, but that is more of an exception than a general way to do things.)

Similarly, we can access the keyboard hardware directly under, say, MS-DOS, but hardly under MS-WINDOWS, X-WINDOWS or NeXTStep. The latter, on the other hand, use a similar "event" based approach to deal with, among everything else, the keyboard or the mouse. That is, the operating system notifies our application in a pre-determined way when, for instance, a key was pressed.

We also have a bit of a problem finding common ground in how different systems execute code. "Event" based systems often presume that code must be executed as a reaction to an occurred event. That is, if a button was pressed the code associated with this button is executed. This approach is taken to the extreme in NeXTStep, to a lesser extent in MS-Windows and to a yet smaller extent in X-Windows. On the other hand a regular MS-DOS or, for that matter, UNIX application doesn't have a built-in event mechanism, and the code is just executed from the entry point until it exits (or until the operating system won't stand the humiliation anymore and dumps core on us, as UNIX does, or simply freezes speechless, as MS-DOS does).
This constitutes two areas where code particular to every system has to be written: screen/window management and event/execution management. (There are other things too, sound for example, but I am concentrating on those related to interactive graphics.)

Screen/window management.
-------------------------

Obviously we need a function opening and closing the output, and since the way to proceed is to make all drawings into an off-screen buffer and then blit it, there must be a routine to take care of that. The output plane itself would be either the whole physical screen or a window allocated by the operating system. Effectively, for us the off-screen buffer plays the role of the video memory. This buffer we will also be calling a "colourmap". A few assumptions have to be made about its format. Every screen/window pixel has to have a colour value associated with it. This value can be composed of bits located in different parts of the screen buffer/colourmap - the so-called planar format - or bunched together into bytes or words, each carrying the colour number of some pixel - the so-called flat format. Different computers have screen memory, and consequently colourmaps, of different formats; most support several. For simplicity we will from now on be considering just one format, well supported under different architectures: one byte per pixel, flat colourmaps.

The first byte in the colourmap describes the colour of the top leftmost pixel:

 +-+-+-+-+
 |G|R| ...                            0x1,0x2,...,
 +-+-+        +-----------+
 |G| |   <--  |  palette  |  --       0x1,...,
 +-+-+        |           |
 | | |        |0x0 - Blue |           ....,
 +-+-+-       |0x1 - Green|
 | ...        |0x2 - Red  |
 | ...        |           |
 |  screen    |           |           screen memory
              +-----------+
                                               pic 1.1

The next byte describes the second pixel from the left, and so on until the end of the line. From some point on, bytes start describing pixels in the second line, and so on until the end of the colourmap. What we have is a one-dimensional array covering a two-dimensional rectangular screen.

One other thing - a palette description allows us to establish the relation between values in the colourmap and physical colours on the screen. Since we are dealing with a one byte per pixel format, there are 256 possible values each byte can take, and consequently 256 different colours accessible at once. The colours are described by RGB intensities: that is, by components of Red, Green and Blue. (There actually exist paletteless colourmaps where the RGB value is stored directly, and not via a palette.) The bit sizes of the components differ from system to system, and in some cases certain palette entries might be reserved for the operating system. Most of the time we can take care of that in the hardware interface. If, for example, we decide to work with 8 bits per intensity, we can make sure this value is readjusted inside the interface function to match the particular hardware format (if the actual bit size is 6 bits, we take only the six leftmost bits of the intensity value and ignore the two lower ones).
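To make this layout concrete, here is a minimal sketch (not part of the interface itself) of how a pixel at coordinates (x,y) could be addressed in such a colourmap, and how an 8-bit intensity could be rescaled for a hypothetical 6-bit palette register; the 320-pixel width is an assumption used purely for illustration:

 /* A sketch only: addressing a pixel in a one byte per pixel flat colourmap
    and adjusting an 8-bit intensity for assumed 6-bit hardware registers.  */

 #define HW_SCREEN_X_SIZE 320                 /* assumed width, illustration only */

 void set_pixel(unsigned char *colourmap,int x,int y,unsigned char colour)
 {
  colourmap[y*HW_SCREEN_X_SIZE+x]=colour;     /* row y, column x of the flat array */
 }

 unsigned char adjust_intensity(int intensity_8bit)
 {
  return((unsigned char)(intensity_8bit>>2)); /* keep six leftmost bits, drop two  */
 }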
It is now time to write prototypes for the functions and structures we talked about:

 struct HW_palette_cell { int hw_r,hw_g,hw_b; };

 int  HW_open_screen(char *display_name,
                     struct HW_palette_cell palette[256],
                     unsigned char *off_screen_buffer);
 void HW_blit(void);
 void HW_close_screen(void);

The first function, HW_open_screen, would either set a required screen mode under some systems, or allocate a window under others. The ASCII string display_name is added to accommodate X11, or any other hypothetical architecture supporting multiple displays. The array of HW_palette_cell structures determines the colours we will have associated with the colourmap values for this screen/window. The purpose of the HW_blit and HW_close_screen functions is self-evident: copying the colourmap into physical screen memory in the case of HW_blit, and deleting the window or restoring the original screen mode in the case of HW_close_screen.

Event/execution management.
---------------------------

Let's again take a look at how 3-D applications work. If it is an interactive game, things on the screen are happening all the time. Whether we apply pressure to the keyboard or not, we still see, in a flight simulator say, the ground closing in with increasing speed, or, in some "Doom" style game, some hideous creature drooling (best scenario) at us. It all means that frames are being generated all the time. On the other hand, the code dealing with external events such as keystrokes is executed only once such an event has occurred. Some other kind of graphics application which doesn't involve motion may render a frame only in response to an event and be idle otherwise. For example, pressing an arrow key to turn an object would cause the application to act and re-render the scene. What we are going to do suits both situations:

 void HW_run_event_loop(void (*APP_main)(void),
                        void (*APP_key_handler)(int key_code));
 void HW_quit_event_loop(void);

Now, in HW_run_event_loop, APP_main is a pointer to a function where we want to have the code associated with rendering a frame. This function is to be called continuously, every time creating an image in the colourmap. The other one, APP_key_handler, is a pointer to a function which is called when a key has been pressed; it is passed a code associated with this key. Coming back to the two approaches described before: when we don't want to render frames all the time, APP_main may contain no code at all, and APP_key_handler would first deal with the pressed key and then re-render the image. The second function, HW_quit_event_loop, basically signals HW_run_event_loop to exit, and can be called from inside either APP_main or APP_key_handler. HW_run_event_loop is most likely to be implemented as a physical loop, inside of which APP_main is unconditionally called, then a check for occurred events is made, and if this check succeeds APP_key_handler is called. That's why such a scheme is called an "event loop".
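Just to illustrate the control flow, here is a minimal sketch of how HW_run_event_loop could look on a system where we poll the keyboard ourselves; the HW_check_key and HW_get_key helpers are hypothetical stand-ins for whatever system-specific mechanism is actually available:

 /* A sketch only: the real implementation is different for every system. */

 extern int HW_check_key(void);              /* hypothetical: a key is waiting? */
 extern int HW_get_key(void);                /* hypothetical: fetch its code    */

 static int HW_event_loop_running;

 void HW_quit_event_loop(void)
 {
  HW_event_loop_running=0;                   /* makes the loop below terminate  */
 }

 void HW_run_event_loop(void (*APP_main)(void),
                        void (*APP_key_handler)(int key_code))
 {
  HW_event_loop_running=1;
  while(HW_event_loop_running)
  {
   APP_main();                               /* compose (and blit) a frame      */
   if(HW_check_key())                        /* any event pending?              */
    APP_key_handler(HW_get_key());           /* let the application handle it   */
  }
 }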
There are a few definitions we would want to bring out in this hardware interface as well. Codes of the keys to be passed into APP_key_handler, for example; let's add a few:

 #define HW_KEY_ARROW_LEFT  ...
 #define HW_KEY_ARROW_RIGHT ...
 #define HW_KEY_ARROW_UP    ...
 #define HW_KEY_ARROW_DOWN  ...
 #define HW_KEY_PLUS        ...
 #define HW_KEY_MINUS       ...
 #define HW_KEY_ENTER       ...

The dimensions - the number of pixels along the X and Y axes of the screen or window - as well as the maximum values for X and Y (that is, one less than the sizes along the axes) and the coordinates of the screen centre:

 #define HW_SCREEN_X_SIZE   ...
 #define HW_SCREEN_Y_SIZE   ...
 #define HW_SCREEN_X_MAX    ...
 #define HW_SCREEN_Y_MAX    ...
 #define HW_SCREEN_X_CENTRE ...
 #define HW_SCREEN_Y_CENTRE ...

One other thing relates to hardware as well as compiler differences - the bit length of numbers. Most of the time we will be using the default int size, not caring much about its bit length. On a few occasions, however, it is important to deal with, say, 32-bit integers. The problem is that with some compilers such a number is a regular int, but with other compilers it might be called long int, whereas a regular int has a size of 16 bits. Let's add to the interface typedefs describing 32-bit and 16-bit signed and unsigned numbers:

 typedef ... signed_16_bit;
 typedef ... signed_32_bit;
 typedef ... unsigned_16_bit;
 typedef ... unsigned_32_bit;
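Purely as an illustration, on a typical compiler where short is 16 bits wide and long is 32 bits wide, these typedefs might be filled in as follows (the actual choice has to be verified against every compiler being used):

 /* An example only, assuming 16-bit short and 32-bit long;
    check the particular compiler's documentation before relying on this. */

 typedef short          signed_16_bit;
 typedef long           signed_32_bit;
 typedef unsigned short unsigned_16_bit;
 typedef unsigned long  unsigned_32_bit;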
Although all of this is to a large degree a simplification, functionally it allows us to hide the particular qualities of most systems within those five interface functions and a few definitions, allowing us to write machine-independent code and, even nicer, to think more in terms of algorithms and less in terms of the hardware we are lucky to have access to.

MS-DOS.
-------

MS-DOS is one of today's favourite operating systems among game programmers, and no, not because it is particularly good, only because it is not that much of a system, controlling very little and allowing itself to be overridden in every inhuman way invented. The only problem is that there are quite a few different C compilers on the PC market, each with particular qualities that have to be addressed. Among the most popular are: Turbo C/C++ and Borland C/C++ from Borland International, Visual C/C++ from Microsoft, Watcom C, and finally my favourite (read: the best) DJGPP, a free port of the unixy GNU C to MS-DOS by DJ Delorie.

The problem with MS-DOS is that it is essentially a 16-bit operating system, and a program under it uses segmented memory addressing, so that the address of a location in memory is contained in two registers: the base register holds the number of a 16-byte paragraph within the 1 meg of total allowed memory, and another register contains an offset from that position. It means that if we want to address more than 64K of memory (the maximum number of bytes one can cover with a 16-bit offset) we have to change the value in the base register. This rather ugly scheme forces compiler manufacturers to invent notions of memory models and other weird stuff to fit in. Until recently Borland and, to the best of my knowledge, Microsoft compilers would produce only 16-bit MS-DOS code. Accessing extended memory (memory above 1 meg) from such code can be tricky, and trying to fit an application within the 640K of conventional DOS memory can be tricky too. Today, on the other hand, we have the option of choosing a 32-bit compiler for MS-DOS: either the latest releases of Borland and Microsoft, which support 32-bit MS-DOS code as an option, or Watcom C and DJGPP, which have been doing it for a while already. 32-bit compilers produce code taking advantage of the protected addressing modes of i386+ based computers. They produce executables that work under DOS extenders - programs that switch the processor into protected mode and manage memory addressing from then on. In most cases "flat" addressing is used, which means that any location is addressed by a value located in one 32-bit register. No more array size problems with that (in reality the processor uses an even stranger scheme than plain 8086 segmentation, but effectively it is a flat mode for the program).

DOS extenders take care of extended memory, so for a programmer there is just one big flat memory, composed in reality of different kinds of RAM. Under MS-DOS video RAM can be accessed directly. First we have to set a screen mode, using interrupts (preferred) or direct manipulation of the video card (not recommended), and after that everything we write into the video memory is reflected on the screen. The address of the VGA (Video Graphics Array - the card mostly used today, allowing 256-colour modes) video RAM is 0xa000:0x0000 (base:offset). DOS extenders rearrange memory for flat 32-bit accesses, so it appears to be at 0xa0000 in the case of the dos4gw DOS extender from Rational Systems used together with Watcom C, or at 0xd0000000 in the case of the go32 extender used by DJGPP.

Implementations of event handling can be either quite tricky or extremely simple. The right way to do it is writing our own handler for int 9h, the hardware interrupt caused by the keyboard hardware in response to what is happening with the keys. The simple way is delegating those tasks to the operating system (or to a C stdlib function, which does the delegating for us). This latter approach is used in the enclosed code. This code covers the three compilers we talked about: Turbo/Borland C++, Watcom C and DJGPP-GNU C. The first is a 16-bit compiler, the latter two are 32-bit compilers and are preferable for fast graphics; however, GNU C especially lacks a bit of the friendliness needed by less experienced users.
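For the direct-access route, here is a minimal sketch for a 16-bit Turbo/Borland C compiler: setting the 320x200, 256-colour VGA mode through BIOS interrupt 0x10 and putting one pixel straight into video memory at 0xa000:0x0000. Under Watcom C with dos4gw the same memory would be reached through a flat pointer to 0xa0000, and DJGPP/go32 has its own convention; this fragment is only meant to show the idea:

 /* A sketch for Turbo/Borland C in 16-bit real mode; other compilers and
    DOS extenders reach the video memory differently. */

 #include <dos.h>

 void vga_demo(void)
 {
  union REGS regs;
  unsigned char far *screen=(unsigned char far*)MK_FP(0xA000,0);

  regs.x.ax=0x0013;                          /* mode 0x13: 320x200, 256 colours */
  int86(0x10,&regs,&regs);                   /* BIOS video services interrupt   */

  screen[100*320+160]=15;                    /* pixel at (160,100), colour 15   */

  regs.x.ax=0x0003;                          /* restore 80x25 text mode         */
  int86(0x10,&regs,&regs);
 }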
MS-WINDOWS.
-----------

It is not my particular purpose to judge, but the reality is that MS-Windows has yet to achieve a reputation as a system for fast 3-D graphics. (I guess it should read: "for fast anything".) Still, it might (then again, it might not). A program under MS-Windows works quite differently from one under MS-DOS. Windows has an extensive set of API functions an application should use in order to access resources or perform actions. Messages from the operating system, including information about occurred events, are sent to special functions which we have to write for some of the windows we want to open. We can't write directly to the screen; what we can do is call a Windows API bitblit function to do it for us. A Windows application has a few distinct parts: a WinMain entry point function, and one or several WndProc functions - the ones called by Windows when it feels like sending an event. In order to create a window the program has to register a structure which will be shared by all instances of this window. After that we can call the CreateWindow function, specifying the geometry and style of this particular instance. The memory where we will be making all the drawings has to be associated with a bitmap structure.

There is one slightly tricky thing - how to manage the palette. The problem is that there might be several colour-intensive applications active, but the hardware has only a limited number of palette registers, hence colours accessible at once. To partly solve this problem Windows reserves 20 colours, calling them "system" colours. They are the 10 first and 10 last colours of the 256 available in 8-bit modes, and they are used to paint borders, menus etc. They can be changed too (except for two: pure black and white), but I would recommend against doing so. In a lot of instances going from 256 to 236 colours doesn't make that much of a difference.

How does a Windows application work? The execution starts from the WinMain function. From here we call all the initialization routines and finally start the "event loop". In the event loop we retrieve the next event from the queue and translate the "virtual" key codes (whatever that might mean). After that the event is forwarded to the particular WndProc - an event handling routine. Since Windows 3.1 doesn't have true multitasking, one has to be careful when implementing this event loop. The only way for other applications to get served is from within our calls to GetMessage or PeekMessage. It is not a very good idea to place our own function calls (such as the frame rendering) directly inside the event loop; what can be done instead is putting those calls into the WndProc function and making sure, from within the loop, that the required message gets sent. The enclosed implementation code does just that to manage the control flow.

X11.
----

The beauty of X-Windows is that it is first of all an established protocol which any implementation of X, on any hardware, should abide by. So programs written on one UNIX machine supporting X will most likely recompile and execute just fine on some other. It also allows features such as remote clients, when a program is actually executing on one computer while the graphics output it produces is shown on another one via the network. X-Windows functions in the following way: an X server running on a particular workstation accepts requests from local X applications, or from those running on remote computers, and executes them, allowing the applications to display graphics or text. The majority of definitions an application needs are concentrated in the X11.h header file; the functions declared there are located in libX11.a, a library which should be linked with any X application. The way to implement our hardware interface functions is extremely straightforward and short, unlike for another previously described windowing system we wouldn't want to (although should) point an accusing finger at. Blitting is achieved with the XPutImage function. The event loop is exactly what it is - checking for events, and calling functions depending on the state of the event queue.
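As an illustration of how short the X version can be, here is a rough sketch of the blitting and event-checking parts, assuming HW_open_screen has already obtained the Display, Window, GC and an XImage wrapped around our off-screen colourmap (the hw_ globals and the HW_poll_events helper are hypothetical names used only here):

 /* A sketch only: window creation and palette handling are omitted. */

 #include <X11/Xlib.h>

 extern Display *hw_display;                 /* assumed set up by HW_open_screen */
 extern Window hw_window;
 extern GC hw_gc;
 extern XImage *hw_image;                    /* describes the off-screen buffer  */

 void HW_blit(void)
 {
  XPutImage(hw_display,hw_window,hw_gc,hw_image,
            0,0,0,0,HW_SCREEN_X_SIZE,HW_SCREEN_Y_SIZE);
 }

 void HW_poll_events(void (*APP_key_handler)(int key_code))
 {
  XEvent event;

  while(XPending(hw_display))                /* anything in the event queue?     */
  {
   XNextEvent(hw_display,&event);
   if(event.type==KeyPress)                  /* translating the raw keycode into */
    APP_key_handler((int)event.xkey.keycode);/* HW_KEY_ values is left out here  */
  }
 }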
NeXTStep.
---------

NeXTStep first appeared together with NeXT Inc.'s proprietary hardware, which was based on Motorola 68040 processors. Some time ago this powerful operating system became available for IBM PC compatible computers with i386+ processors, and later for HP hardware; I believe today it has been ported to other architectures as well. NeXTStep contains a windowing system functioning on top of a Mach-based, UNIX-like operating system. Unlike the previously described systems, NeXTStep is internally object-oriented; the majority of things on the screen are indeed managed as objects. NeXTStep applications are usually written in Objective-C (an object-oriented extension of C, completely different from C++). A lot of things (almost everything) when creating an application's interface can be done visually, without writing a single line of code, and the tools provided allow managing projects very efficiently. But everything which can be done visually can also be achieved just by writing code (as inefficient as that might sound). The majority of objects provided in the developer's kit have outlets to place our own functions into; for example, we would usually define a function to be associated with a button, and this function would be called when the button is pressed. The window manager makes sure the events flow to the particular objects, but we can nevertheless check the event queue ourselves, and there are also provisions for functions which should be executed on a timing basis. This operating system provides a great developer's platform, perhaps the most convenient there is today.

As to the overall power, convenience and ease of use of the different platforms, I suggest browsing the enclosed .c implementations of the hardware interface, counting their lengths and the number of places impossible to understand sober... guess who comes out the loser?

                                 * * *