Computer Networking over the IEEE-1394 Serial Bus

Kelvin Lawson
BEng Electronic & Computer Engineering
Supervisor: Dr W. Buchanan
Sponsor: British Telecom Research Labs
April 1999

ABSTRACT

This report documents the development of software to implement computer networking over the IEEE-1394 Serial Bus. Besides IEEE-1394, it encompasses a number of technologies, most notably the Windows networking model, NDIS, and hardware device drivers. The work was performed on behalf of British Telecom Research Labs, as part of their ongoing research into new communications technologies.

The document provides a complete and in-depth analysis of this wide-ranging project, and has been written to explain clearly the technologies involved. It investigates the practicalities of designing device drivers for IEEE-1394, and presents a working design for a Plug and Play device driver running under Windows. The information herein can also be applied to other device driver projects, as documentation for developers in this field is sparse. Thus the benefits of the experience gained on this project are made available to others.

The work carried out has broken new ground in using IEEE-1394 in an innovative fashion, and the product of the work can be used to network Windows PCs over this new bus technology. This can be done seamlessly, without the need to change any current network application software. The project will be completed within budget, and without any support from, or recourse to, the customer, BT Labs.

[...]
2.5.1 System Administrator Requirements
2.5.2 End User Requirements
2.5.3 System Resources
[...]
3.1.1 Topology
3.1.2 Asynchronous & Isochronous Transfer
3.1.3 1394 Packet Formats
3.1.4 Bus Management
3.1.5 Cable
3.1.6 Transmission Rates
3.2 THE PROSPECT OF NETWORKING
3.3 WHY IS 1394 NEEDED?
3.4 PROJECT DEVELOPMENT EQUIPMENT
4. NDIS – THE WINDOWS NETWORK DRIVER MODEL
4.1 NDIS & THE OSI MODEL
4.2 WINDOWS DRIVER ARCHITECTURES
4.2.1 Universal Driver Architecture
4.2.2 NDIS Architecture
[...]
5.1.1 Asynchronous/Isochronous Traffic
5.1.2 Broadcast Networks
5.1.3 Bus Management
5.2 IEEE-1394 & NDIS
5.2.1 Legacy VxD Driver
5.2.2 Miniport NIC Driver
5.2.3 Intermediate Driver / TI API
5.2.4 Intermediate Driver / WDM 1394 Bus Class Driver
5.2.5 Driver-Type Design Choice
5.2.6 Ethernet Emulation
5.2.7 1394 Transaction Type
5.2.8 MAC Addresses
5.2.9 MAC/Node ID Translation
5.2.10 Broadcast and [...]
5.5.1 Microsoft WDEB386
5.5.2 SoftICE
5.5.3 Which Debugger to Use
5.6 PROJECT PLANNING
6. FORMAL DESIGN SPECIFICATION
6.1 DEVELOPMENT APPROACH
6.2 FILE MAP
6.3 MODULE SPECIFICATION
6.3.1 tilynx.c
6.3.2 card.c
6.3.3 send.c
6.3.4 oid.c
6.3.5 interrup.c
[...]
7.2.1 Resource Allocation/Deallocation
7.2.2 Hardware Communication
7.3 PLUG AND PLAY
7.3.1 Integration into Windows Control Panel
7.3.2 Self ID Configuration
7.3.3 Transmission Between 1394 Nodes
7.3.4 Communication with Protocol Layer
7.3.5 MAC/1394 Translation Table Mechanism
7.3.6 Broadcast Mechanism
[...]
8.1.1 Operating System
8.1.2 Hardware
8.1.3 Transport Protocol
8.1.4 Software
8.1.5 System Resources
8.1.6 Speeds
8.1.7 Schedule
8.1.8 Budget
[...] – COMPANION CD-ROM

LIST OF FIGURES

FIGURE 3-1 - 1394 TOPOLOGY EXAMPLE
FIGURE 3-2 - BANDWIDTH ALLOCATION ON THE 1394 BUS
FIGURE 3-3 - ASYNCHRONOUS WRITE DATA BLOCK PAYLOAD
FIGURE 3-4 - 1394 CABLE AND CONNECTORS
FIGURE 4-1 - TYPICAL OPERATING SYSTEM
FIGURE 4-2 - LAYERS IN THE OSI MODEL
FIGURE 4-3 - WINDOWS NETWORK ARCHITECTURE
FIGURE 4-4 - UNIVERSAL/MINI DRIVER ARCHITECTURE
FIGURE 4-5 - ARCHITECTURE FOR NDIS 3.1 OR LATER
FIGURE 4-6 - NDIS WRAPPER
FIGURE 5-1 - 1394 AND THE OSI MODEL
FIGURE 5-2 - INTERMEDIATE DRIVER STRUCTURE
FIGURE 5-3 - DRIVER POSITION WITHIN NDIS
FIGURE 5-4 - ETHERNET/1394 SEND PROCESS
FIGURE 5-5 - 1394 ENCAPSULATED ETHERNET FRAME
FIGURE 5-6 - 1394 NODE ID FORMAT
FIGURE 5-7 - ASYNCHRONOUS WRITE QUADLET PAYLOAD
FIGURE 5-8 - PCI CONFIGURATION CONTROL AND STATUS REGISTERS
FIGURE 5-9 - PCILYNX DATA TRANSFER
FIGURE 5-10 - NDIS MINIPORT BUILD PROCESS
FIGURE 5-11 - NULL-MODEM CABLE
FIGURE 7-1 - WINDOWS DEVICE MANAGER
FIGURE 7-2 - NETWORK PROPERTIES APPLET

LIST OF TABLES

TABLE 3-1 - CABLE TRANSMISSION RATES
TABLE 4-1 - NDIS SUPPORT IN WINDOWS
TABLE 5-1 - MAXIMUM PAYLOAD SIZES
TABLE 5-2 - CABLE CONNECTORS
TABLE 6-1 - PROJECT FILE-MAP
TABLE 6-2 - MODULE SPECIFICATION FORMAT
TABLE 7-1 - TEST SYSTEMS
TABLE 7-2 - MODULE TEST RESULTS FORMAT
TABLE 7-3 - RESOURCE USAGE
TABLE 7-4 - WHQL RESULTS
TABLE 8-1 - STAGE COMPLETION STATUS

ACCOMPANYING MATERIAL

CD-ROM containing the complete project source code and the required hardware specifications.

1. INTRODUCTION

This document relates to an investigation into the IEEE-1394 High Performance Serial Bus, more commonly known as ‘FireWire’. IEEE-1394 is a new bus technology, and current applications have centred on connecting peripheral devices to PCs. This project aims to bring IEEE-1394 into the realm of computer networking.

The project has been set by the Local Area Networking team at British Telecommunications Research Labs (BTL), who have also provided the necessary equipment for the work. This team evaluates new networking technologies, with a view to selecting the best of them and providing this information to British Telecommunications at large. They also work with the various communications standards bodies, and present their findings so that improvements can be made and eventual standards agreed. Their evaluations generally involve setting up computer networks with these new network media, and carrying out rigorous tests to establish the capabilities and characteristics of the technology.

For this work to be carried out on IEEE-1394, they require a Windows device driver that can implement a network over the IEEE-1394 bus. With this in mind, the developer (Kelvin Lawson) has been given sole responsibility for developing this software and presenting the final result to BTL.
The software should be developed from the initial concept to the final implementation, using the IEEE-1394 hardware provided by BTL. The work encompasses three key areas:

• IEEE-1394 bus hardware and protocols
• The Windows network driver model (NDIS)
• Transport protocols, such as TCP/IP or IPX/SPX

1.1 Project Objectives

On initial consideration of the project, the following objectives were identified:

• Development of a requirements specification
• Research into IEEE-1394
• Research into NDIS
• Consideration of the implementation of IEEE-1394 within Windows and NDIS
• Development of a Plug and Play device driver

At a functional level, the final device driver should achieve the objectives set in the requirements specification.

1.2 Structure of Report

This document is laid out in chronological order. The initial requirements specification comes first, to give the reader an idea of the scope of the project. Following this, two chapters present the information required to understand the design considerations. These are the constraints and opportunities of the technology in its proposed application, which must be taken into account in order to justify the design approach that was ultimately used. Following these chapters, the reader is taken through the design of the actual project, up to the eventual results of the work and some discussion of those results.

2. FORMAL REQUIREMENTS SPECIFICATION

This section serves as a formal agreement between the developer and the customer, British Telecommunications Research Labs (BTL). It identifies the functions that must be performed by the final product, including any hardware and software constraints. Details of the actual method of implementation are left to the discretion of the developer. This student project is a new development and does not benefit from any prior work by BTL.

2.1 Project Overview

To develop a Plug and Play device driver that implements Windows networking over the IEEE-1394 Serial Bus.
The result should allow a number of PCs to be connected together and share resources in a similar fashion to the facilities available with existing network media, such as Ethernet. These facilities include file access to storage systems, such as hard drives, on networked PCs, and the use of any Windows-compatible network-oriented applications, such as e-mail. This should eventually provide the platform for a full analysis of IEEE-1394, including how well the Serial Bus is suited to networking PCs.

2.2 Operating System

The target operating system is Microsoft Windows 95, chosen because of the large number of systems in the Research Laboratories that run it. However, compatibility with later versions of Windows is seen as desirable, in order to allow for future upgrades to the computer installations.

2.3 Hardware

The user computer systems are assumed to be based on the Intel architecture, with a minimum specification of a Pentium processor, and to contain a local PCI bus. No further constraints are necessary, except that there must be sufficient RAM to run the Windows operating system, and one spare PCI slot. The minimum requirement for RAM can vary, but as a general indication at least 8 Megabytes is desirable.

Two Texas Instruments PCILynx development cards have been supplied by BTL, which provide a bridge between the PCI bus and the 1394 Bus.
A full description of these cards can be found in the PCILynx Specification [1]; some key characteristics are:

• Compliant with IEEE 1394-1995
• Compliant with PCI Specification revision 2.1
• Supports IEEE 1394 transfer rates of 100 and 200 Mbps
• Provides 5 scatter-gather DMA channels for the following 1394 operations:
• Asynchronous Transmit & Receive
• Isochronous Transmit & Receive
• Supports Plug and Play Specification

2.4 Transport Protocol

No assumptions have been made about the required transport protocol to be implemented over the 1394 Bus, except that it is available within the Windows Operating System. At the time of writing, available protocols include AppleTalk®, NetBEUI, NWLink, IPX/SPX and TCP/IP. However, current networks within BTL are based on TCP/IP and, to ease the transition to the new medium from older technologies, it is seen as desirable, but not required, that the same transport protocol be supported. Support for more than one transport protocol is also identified as useful for BTL, so that a fuller analysis of 1394 as a networking medium can be performed.

2.5 Software Functions

No software requirements have been specified by BTL, other than that the product should fit into Windows in the same way as previously available network media, usually Ethernet. The result should be seamless and invisible to the user, such that all functions normally associated with networking are available. There follows a summary of the relevant interactions which might be expected by both administrators and the end user, and which should be no different whether using 1394 or Ethernet.

2.5.1 System Administrator Requirements

The duties of the system or network administrator include setting up new PCs and connecting them to the Lab network. The installation of a networking device requires a device driver appropriate to the hardware.
Once the hardware is installed in a spare PCI slot, booting Windows will result in an installation dialogue, which asks the administrator for a disk containing the correct device driver. No other hardware-related tasks need be performed, because the device driver and Texas Instruments device are to be Plug and Play compliant. Installation can then proceed by inserting a disk containing the driver and any required installation details. Plug and Play removes the need to choose IRQs and memory ranges, all of which are configured completely transparently.

Once the device driver has been installed, it should be present in the Windows Control Panel, where it can be identified under the “Network” menu. The next stages involve the setting up of software such as the required protocols. This can be done by choosing Network properties in the Control Panel, or from the Network Neighborhood menu. From here the administrator should be able to set the transport protocol in use, currently TCP/IP on the Lab network. It must be possible to “bind” the 1394 device to the required protocol, where a binding describes a virtual connection between the device and the protocol.

Further configuration is not related to the device driver, and will be the same regardless of the underlying network card. This includes setting up the IP protocol with information relevant to BTL, such as a unique IP address assigned by the network administration, and eventually setting up the Client type in use. Thus the entire installation system should fit into BTL’s documented procedures.

2.5.2 End User Requirements

As with the network administrator, the requirement is that the end user experiences no change in the way they interact with the network, in the transition from old technologies to 1394. Typical interactions with the network include:

• Drive-mapping – Server hard disks are mapped to local disk names. Thus users can maintain private disk space on the server and are able to work from any PC on the network. Also, project resources can be pooled on the server.
• E-Mail – BTL provides employees with e-mail, and this must be maintained.
• Intranet/Internet – Each unit and team within BTL maintains web pages on the Intranet, and access to the global Internet is also provided through the network.
• Printers – Printers connected to machines on the network can be pooled, until 1394-capable printers become available.
• Remote login to distant machines on the network.
• Distributed processing.

These applications are merely provided as examples of important requirements to the end user, and are not a specification of the entire scope of functionality to be provided by the end product. By conforming to the Windows networking standard, all of these applications should be inherently possible. Note that if a 1394 Bus is to be internetworked with some other networking medium, a Bus Bridge will be required. Provision of such bridges is beyond the scope of this project.

2.5.3 System Resources

It is identified as important that the product does not consume a large amount of resources on the host computers. It is noted that current Ethernet networked PCs do not experience degradation in performance when connected to the network, or when the network device driver is loaded. Thus the product should consume as little memory, processor time, or other resource as possible, without degrading the performance of the driver. Actual projected memory and other consumption is, however, outwith the confines of this section.

2.6 Speeds

The development cards provided by BTL support physical transfer rates up to 200 Mbps. It is therefore a requirement that the product supports this transfer rate; however, 400 Mbps cards are available now, and higher rates are expected in the future.
Thus it is seen as desirable that the product should be easily upgraded to support these prospective rates.

2.7 Project Schedule

The development of a network device driver has never previously been undertaken within the team at BTL, or by the developer, so a time-scale for completion was not specified. BTL have agreed that the work involved may extend beyond April 1999.

2.8 Budget

It is understood that, within reason, funding is available from BTL; however, it is hoped that this will not be necessary. BTL have loaned two Texas Instruments 1394 devices, and one IBM PC system.

References

[1] Texas Instruments (1997), PCILynx Functional Specification, Rev 1.2, TI

3. IEEE-1394

The IEEE-1394 High Performance Serial Bus, hereafter referred to as 1394, is an emerging technology that aims to provide an advanced bus for connecting a wide range of electronic devices. Its flexibility should ensure that it is adopted across many applications in both the home and the office, and is what makes this project possible. This chapter introduces the technology, and goes on to discuss the current and future possibilities for its usage.

3.1 Technical Information

1394 was first officially defined as a standard in 1995, which resulted in the IEEE 1394-1995 specification [1]. This provided for data rates of approximately 100, 200 and 400 megabits per second (Mbps), known as S100, S200 and S400 respectively. Future standards promise higher data rates, and ultimately it is envisaged that rates of 3.2 gigabits per second will be achieved when optical fibre is introduced into the system.

1394 offers an attractive alternative to technologies such as SCSI, and it is hoped that it can eventually provide a universal connection to replace many of the older connectors normally found at the back of a standard PC.
This should subsequently reduce the costs of production of computer interfaces and peripheral connectors, as well as simplifying the requirements placed on users when setting up their devices. This is made possible by the following features of the 1394 bus:

• Hot Pluggable – Devices can be added or removed while the bus is still active.
• Easy to use – There are no terminators, device addressing or elaborate configuration of the kind often associated with technologies like SCSI.
• Flexible topology – Devices can be connected together in many configurations, thus the user need not consider logical locations on the network.
• Fast – Suitable for high bandwidth applications.
• Rate mixing – A single cable medium can carry a mix of different speed capabilities at the same time.
• Inexpensive – Targeted at consumer devices.

3.1.1 Topology

There are two bus categories, defined as cable and backplane. Cable refers to a bus connecting external devices via a cable, whereas backplane refers to an internal bus. The cable environment is described by Jennings (1997) [2] as “a non-cyclic network with finite branches consisting of bus bridges and nodes (cable devices)”. A non-cyclic network is one in which there are no loops. The resultant topology is a tree formation, with devices daisy-chained and branched (where more than one device branch is connected to a device). An example is shown in Figure 3-1, where the 1394 Splitter has three branches and the telephone is daisy-chained from the Digital Camera.

Figure 3-1 - 1394 Topology Example

The finite branches restriction imposes a limit of 16 cable hops between nodes. Therefore branching should be used to take advantage of the maximum number of nodes on a bus. 6-bit node addressing allows up to 63 nodes on each bus, while 10-bit bus addressing allows up to 1023 buses, interconnected using 1394 bridges. Devices on the bus are identified by node IDs.
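The hop and node limits just described can be checked mechanically for a proposed cable layout. The following is a minimal sketch in C, not part of the project driver. It assumes a hypothetical description of the topology as a `parent[]` array (each entry names the node a device hangs from, with -1 for the root) and assumes the array really does describe a loop-free tree. The function names are illustrative only.

```c
#include <assert.h>

#define MAX_NODES 63   /* 6-bit node addressing: up to 63 nodes per bus */
#define MAX_HOPS  16   /* limit on cable hops between any two nodes     */

/* Number of cable hops from a node up to the root (parent == -1).
 * Assumes parent[] describes a valid tree, i.e. contains no loops. */
int depth_of(const int parent[], int node)
{
    int depth = 0;
    while (parent[node] >= 0) {
        node = parent[node];
        depth++;
    }
    return depth;
}

/* Cable hops on the unique path between nodes a and b. */
int hops_between(const int parent[], int a, int b)
{
    int da = depth_of(parent, a);
    int db = depth_of(parent, b);
    int hops = 0;

    /* Bring the deeper node up to the same level as the other one. */
    while (da > db) { a = parent[a]; da--; hops++; }
    while (db > da) { b = parent[b]; db--; hops++; }

    /* Climb both branches together until they meet. */
    while (a != b) {
        a = parent[a];
        b = parent[b];
        hops += 2;
    }
    return hops;
}

/* Largest hop count between any pair of the n nodes. */
int max_hops(const int parent[], int n)
{
    int worst = 0;
    assert(n >= 1);
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            int h = hops_between(parent, i, j);
            if (h > worst)
                worst = h;
        }
    }
    return worst;
}

/* Non-zero if the layout respects both the node and hop limits. */
int topology_ok(const int parent[], int n)
{
    return n >= 1 && n <= MAX_NODES && max_hops(parent, n) <= MAX_HOPS;
}
```

Under these assumptions a pure daisy chain passes the check only up to 17 devices (16 hops), which illustrates why branching is needed to approach the 63-node limit.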
Configuration of the node IDs is performed by the self ID and tree ID processes after every bus reset. A reset occurs every time a device is added to or removed from the bus, and is invisible to the user. A final restriction is that, using standard cables, the length between nodes is limited to 4.5 metres. This can be increased by adding repeaters between nodes, and lengths are expected to improve as work on the standard continues.

Although a PC is shown in Figure 3-1, a principal advantage of 1394 is that, unlike USB, no PC is actually required to form a bus, and devices can talk to each other without intervention from a computer.

The backplane bus is well described by Hoffman (1995) [3]: “In addition to a cable specification, there is a backplane specification that extends the serial bus internally to a device. The internal 1394 device may be used alone, or incorporated into another backplane bus. For example, two pins are reserved for a serial bus by various ANSI and IEEE bus standards. Implementation of the backplane specification lags the development of the cable environment, but one could image internal 1394 hard disks in one computer being directly accessed by another 1394 connected computer.” The backplane bus is not used within this project, and is not discussed further in this document.

3.1.2 Asynchronous & Isochronous Transfer

One of the key capabilities of 1394 is isochronous data transfer. Both asynchronous and isochronous transfers are supported, and are useful for different applications. Mackenzie (1998) [4] describes isochronous transmission as a means to transmit “data like real-time speech and video, both of which must be delivered uninterrupted, and at the rate expected”. By contrast, asynchronous transmission is used to transfer data that is not tied to a specific transfer time.
In the context of 1394, asynchronous is the conventional transfer method of sending data to an explicit address, and receiving confirmation when it is received. Isochronous, however, is an unacknowledged guaranteed-bandwidth transmission method, useful for just-in-time delivery of multimedia-type data.

An isochronous ‘talker’ requests an amount of bandwidth and a channel number. Once the bandwidth has been allocated, it can transmit data preceded by a channel ID. The isochronous ‘listeners’ can then listen for the specified channel ID and accept the data following. A node for which the data is not intended will simply not be set to listen on that channel ID. Up to 64 isochronous channels are available, and these must be allocated, along with their respective bandwidths, by an isochronous resource manager on the bus.

Figure 3-2 - Bandwidth Allocation on the 1394 Bus

Figure 3-2, as designed by Hoffman3, shows an example situation where two isochronous channels are allocated. These have a guaranteed bandwidth, and any remaining bandwidth is used by pending asynchronous transfers. Thus isochronous traffic takes some priority over asynchronous traffic.

By comparison, asynchronous transfers are sent to explicit addresses on the 1394 bus. When data is to be sent, it is preceded by a destination address, which each node checks to identify packets for itself. If a node finds a packet addressed to itself, it copies it into its receive buffer. Each node is identified by a 16-bit ID, containing the 10-bit bus ID and 6-bit node or physical ID. The actual packet addressing, however, is 64 bits wide, providing a further 48 bits for addressing a specific offset within a node’s memory. This addressing conforms to the Control and Status Register (CSR) bus architecture standard5. As stated by Jennings (1997)2, “Conformance to ISO/IEC 13213:1994 minimizes the amount of circuitry required by 1394 ICs to interconnect with standard parallel buses”.
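The 64-bit addressing described above can be illustrated with a short sketch. The Python below is not taken from the project or from the 1394-1995 specification: the names `pack_address` and `unpack_address` are hypothetical, and the field order (bus ID in the top 10 bits, then the 6-bit physical ID, then the 48-bit offset) is an assumption based on the description of a 16-bit node ID followed by a 48-bit offset.

```python
# Field widths as described in the text: 10-bit bus ID, 6-bit node
# (physical) ID and a 48-bit offset, giving 64 bits in total.
BUS_BITS, NODE_BITS, OFFSET_BITS = 10, 6, 48

# Number of byte addresses reachable through the 48-bit offset on one node.
BYTES_PER_NODE = 1 << OFFSET_BITS


def pack_address(bus_id, phy_id, offset):
    """Combine the three fields into one 64-bit integer (assumed layout)."""
    if not 0 <= bus_id < (1 << BUS_BITS):
        raise ValueError("bus_id does not fit in 10 bits")
    if not 0 <= phy_id < (1 << NODE_BITS):
        raise ValueError("phy_id does not fit in 6 bits")
    if not 0 <= offset < (1 << OFFSET_BITS):
        raise ValueError("offset does not fit in 48 bits")
    node_id = (bus_id << NODE_BITS) | phy_id      # the 16-bit node ID
    return (node_id << OFFSET_BITS) | offset


def unpack_address(address):
    """Split a 64-bit address back into (bus_id, phy_id, offset)."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    node_id = address >> OFFSET_BITS
    return node_id >> NODE_BITS, node_id & ((1 << NODE_BITS) - 1), offset
```

Packing and then unpacking a value such as bus 3, node 5, offset 0x1000 returns the same three fields, and `BYTES_PER_NODE` equals 2^48, which is 256 × 2^40 bytes per node.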
The 48-bit offset allows for the addressing of 256 terabytes of memory and registers on each node. If the CSR specification is unavailable, some information is contained in the 1394-1995 specification6.

3.1.3 1394 Packet Formats

There are a number of different packet formats specified in 1394-1995, however only the asynchronous block write will be presented here, as it is the main transaction type used within this project.

Figure 3-3 - Asynchronous Write Data Block Payload

The asynchronous block write is described in the 1394-1995 specification7 as a packet type that “requests a data block be written to the specified destination address”. It is the packet type used on asynchronous transmits, for a variable length of data. The “destination_ID” field should contain the 16-bit destination node ID, while the “destination_offset” field contains the remaining 48 bits required for CSR-addressing. The data is sent in the data field, which can be any quadlet-aligned length up to a maximum given by the transmission speed. These maximums are shown in Table 5-1. At 200Mbps, for example, the data field may hold anything from 0 to 1024 bytes, in steps of 4 bytes. The header information is followed by a CRC (cyclic redundancy check) for error-checking, as is the block of data. For detailed information on the other fields shown, as well as the other packet types available, consult Clause 6.2 of the 1394-1995 specification8.

3.1.4 Bus Management

Two bus management entities are available in the cable environment: the isochronous resource manager and the bus manager. They provide services such as maintaining topology maps, or acting as a central resource from which bandwidth and channel allocations can be made. Further information on bus management can be found in the 1394-1995 specification9.

3.1.5 Cable

Figure 3-4 shows that the 1394 cable consists of three individually shielded cable pairs.
There are two power lines and two (screened) twisted pairs for data and strobe transmission.

Figure 3-4 - 1394 Cable and Connectors

Lewis (1999)10 provides a very good description of the cable transmission methods, as well as a good overview of 1394-1995 in general.

3.1.6 Transmission Rates

As already discussed, the cable rate definitions for 1394-1995 are termed S100, S200 and S400. These names do not, however, describe the exact data rates for transmission, which can be found in Table 3-1. As discussed by Lewis (1999)10, “The high data rates are achieved by using differential non return to zero, or nrz, signalling on each shielded twisted pair”.

1394 Definition           Actual Data Rate (megabits/sec)
S100 (Cable base rate)    98.304
S200                      196.608
S400                      393.216

Table 3-1 - Cable Transmission Rates

The information in this section serves as a brief introduction to the technology. If more detailed information is required, the 1394-1995 specification1 should be consulted.

3.2 The Prospect of Networking

A Texas Instruments paper11 states the following regarding the reasons for the development of 1394: “The need for 1394 and other next-generation network topologies and protocols is driven by the rapidly growing need for mass information transfer. Typical LANs and WANs simply cannot provide cost-effective connection capabilities nor do they easily support guaranteed bandwidth for “mission critical” applications. Additionally, parallel high-speed communications such as SCSI are not suited to long distances and do not support live connect/disconnect, making reconfiguration time-consuming. Other factors driving next generation protocols such as 1394 include the need for reliability, durability and universal interconnection.”

Although not stated in the citation, usage of 1394 has been primarily in the area of ‘interoperability’, the connectivity of peripherals.
It may eventually displace technologies such as Centronics, RS232 and SCSI, replacing their connectors with only one type, the 1394 connector. However, the citation alludes to the possibility of it being used in LAN and WAN situations, applications which have been largely unexplored.

The usage of 1394 in a LAN-type scenario is the subject of this project, and it would be useful to proffer a clear and exact definition of a LAN or computer network here. No clear definition was found, however, which could distinguish such network technologies from those found in peripheral buses, but this discussion proposes some possibilities.

Consideration of this question initially resulted in the conclusion that many computer networking technologies could be used in an interoperability scenario and, reciprocally, peripheral buses could conceivably be used as computer networks. Thus, the distinction between a computer network and a peripheral bus has less to do with the technology itself than with the application for which it is used. Examples might be the prospect of using Ethernet to connect peripherals, and using SCSI to network computers.

Further consideration of these examples, however, suggested a distinction based on certain characteristics of the technologies. Firstly, the medium access mechanism of a peripheral bus may be unsuitable for fair access. For example Ethernet, through CSMA/CD, provides a fair access system for all nodes. Equally, token ring methods provide a fair system for accessing the medium. A peripheral bus might have a master-slave mechanism, where one PC acts as the master with total responsibility for control of the bus, and would therefore not be suitable for networking PCs with equal priority. Secondly, a peripheral bus may not have a transmission protocol which is tailored to suit the medium. As a result, long distances may not be possible between nodes.
By contrast, networking technologies could still be suitable for connecting peripherals. In fact, this is already the case with networked printers, usually using Ethernet. File servers are also in some respects networked peripherals, although an interface bus is used to reach the hard drives. A possible drawback, however, is that traditional network media might be too slow for communication with some peripherals.

Given these points, the previous distinction between a computer network and a peripheral bus should be revised. Although the distinction generally depends on the application, rather than the technology, certain characteristics of the technology will shape what it eventually becomes used for. The above discussion may appear to present the view that networks can be used as peripheral buses, and not the other way round, but this is not necessarily always the case. However, some characteristics of peripheral buses are particularly detrimental to computer networking, such as short transmission distances and master/slave bus control.

Therefore, in considering the use of 1394 as a computer networking medium, we must consider its characteristics, and their suitability for the proposed application:

• Fair arbitration – each node has equal access to the bus.
• Bus management – a central bus controller is optional.
• Explicit addressing – packets can be addressed to specific nodes.
• Lengths up to 4.5m
• High data rates
• Bridging to form internetworks

Given the above points, it should be possible to see that 1394 offers the services normally expected of a computer network. Where SCSI, for instance, may be most suitable for interoperability only, 1394 is a truly versatile medium, and it should be possible to use it in both applications to good effect. In fact Lewis (1999)10 describes 1394-1995 as “a cross between a network and a bus extension system”.

3.3 Why Is 1394 Needed?
Although Section 3.2 proposed that it is technically possible to use 1394 to network computers, it did not explain why the end user should want to choose it over legacy network media. A primary driving force that should ensure the acceptance of 1394 computer networks in the marketplace is the increase in bandwidth it can offer over currently available technologies. Where networks commonly employ Ethernet at data rates of 10 or 100 Mbps, 1394 can offer rates of 400Mbps today and up to 3.2Gbps in the future. Demand for high bandwidths is ever increasing in the home and office, with the advent of applications such as video-conferencing, and companies can no longer rely on their old Ethernet networks to provide the required bandwidth. 1394, however, offers more advantages than bandwidth, and these were identified as:

• Manageability – Networks are easy to set up with little thought for the topology. 1394 offers plug and play, and automatic reconfiguration of device addressing, invisibly to the user or network administrator. This also eases the path of upgrading from older networks to 1394.
• Inexpensive – 1394 chips are priced for the consumer market, and should eventually be integrated on PC motherboards. Therefore companies need not purchase adapter cards in order to be network-ready.
• Universal connector – Can use the same port for peripherals and the network, also allowing easy sharing of peripherals over a network.
• Backwards compatibility – 1394 networking can be integrated into an operating system, while allowing current network applications to work as before.

Simplicity may well be key to the adoption of 1394 in the home. While companies can employ network administrators to take care of complex network setup, the average home user may be put off by complicated hardware issues. With the same connector for their printers, cameras, stereos and networks, they can feel at home setting up a simple computer network, with no regard for the topology.
Simplicity and price are not offset by poor performance. 1394 networks will provide an advanced network that can handle the coming increase in high-bandwidth applications.

It is with this in mind that this project was identified, the ultimate goal of which is to implement such computer networking using 1394. The developer is responsible for taking this from the initial concept into a working realisation. The result of the work is important to British Telecom Research Labs, in order that they can be at the forefront of emerging technologies, both for their own benefit and for the benefit of their many customers.

3.4 Project Development Equipment

Two adapter cards were provided by British Telecom Research Labs (BTL) for use with the project. These were Texas Instruments TSBKPCI Development Cards, with a physical layer (PHY) capable of 200Mbps transmission. The heart of these cards is a PCILynx ASIC, which provides the capability to transfer data between PCI and 1394 buses. A very complete specification is available12 and should be consulted for detailed information on using the PCILynx, however a list of capabilities is provided herein, to aid a quick understanding of the hardware.
Features of the BTL-supplied cards include:

• Compliant with 1394-1995 (and not later draft specifications such as p1394a)
• Compliant with PCI specification revision 2.113
• Generates 32-bit CRC for transmission of 1394 packets
• Performs 32-bit CRC checking on reception of 1394 packets
• Supports IEEE transfer rates of 100 and 200Mbps
• Provides 3 size-programmable FIFOs (Async & Iso Transmit & General Receive)
• Programmable 5 channel address comparator logic for receiving incoming 1394 packets and assigning them to a DMA channel
• Supports DMA transfers between 1394 and local bus RAM
• Provides PCI busmaster function for supporting DMA operations
• Provides PCI slave function for read-write access to internal registers
• Implements a 32-bit PCI address-data path
• Provides PCI address-data parity checking
• Provides software control of interrupt events
• Supports Plug and Play specification14

References

1 IEEE Std 1394-1995, Standard for a High Performance Serial Bus, IEEE
2 Jennings R. (1997), Fire on the Wire: The IEEE 1394 High Performance Serial Bus, Adaptec Inc.
3 Hoffman G. and Moore D. (1995), IEEE 1394: A Ubiquitous Bus, presented at Compcon’95 in San Francisco, 5 Mar 1995
4 Mackenzie L. (1998), Communications and Networks, McGraw-Hill, p140
5 ISO/IEC 13213:1994, ANSI/IEEE Std 1212, 1994 Edition, Control and Status Register (CSR) Architecture for Microcomputer Buses, ISO/IEC
6 IEEE Std 1394-1995, Standard for a High Performance Serial Bus, IEEE, Clause 3.3
7 IEEE Std 1394-1995, Standard for a High Performance Serial Bus, IEEE, Clause 6.2.2.3.1
8 IEEE Std 1394-1995, Standard for a High Performance Serial Bus, IEEE, Clause 6.2
9 IEEE Std 1394-1995, Standard for a High Performance Serial Bus, IEEE, Clause 3.8
10 Lewis G. (1999), FireWire – A Bus for all Systems ?, Electronics World (Jan 99)
11 Texas Instruments (1998), 1394 Technical Overview, TI
12 Texas Instruments (1997), PCILynx Functional Specification, Rev 1.2, TI
13 PCISIG (1995), PCI Local Bus Specification Revision 2.1, PCI Special Interest Group
14 Microsoft (1999), Plug and Play Design Specification for IEEE 1394, Microsoft Corporation

4. NDIS – THE WINDOWS NETWORK DRIVER MODEL

This chapter should serve as an aid to understanding the Windows network driver environment. An understanding of this environment is vital to comprehending the remainder of this document. Most important is the NDIS Miniport concept, which became the basis for the project, and is therefore the main feature of this chapter.

The primary sources of information on developing such software, whilst being useful as a reference or specification, tend not to provide an easy introduction to the subject. To fully grasp the topic, experience in the development of network drivers is most useful, and is the only way to learn the various undocumented nuances that come to light in the process of development. With this in mind, the following chapter has been written to provide the reader with an easy footing into the subject, given the experience gained in the development of this project.

A device driver is a piece of software responsible for controlling a hardware device. It knows the characteristics of the hardware it supports, and provides an interface between the hardware and the operating system or applications which wish to communicate with the device. Any physically unique device also requires a unique device driver, in order to function within an operating system. Due to the massive range of available devices for any one purpose, it is desirable that the drivers provide a common interface to the software accessing them. If it were not for a common interface, applications would need to be hardware-specific.
For example, there are many different network cards on the market at any time, and even more which are no longer in production. Each of the device drivers for these cards needs to be accessed in a specific fashion, which might be entirely different when compared with another manufacturer’s driver and card. For an application, such as a file sharing system, to send and receive network data using the many different hardware configurations, it would need to know the characteristics of all of these device driver interfaces; that is, which functions need to be called and how. This of course presents an unreasonable task to the software developer, and would result in unwieldy software. In addition to this, and perhaps more importantly, any new hardware which comes on the market would not be supported by the slightly older application.

Thankfully, in operating systems such as Windows, a common interface does exist, which makes the job of developing applications and supporting hardware far easier. Thus the hardware and device driver can be changed, and applications will still work.

Figure 4-1 represents a typical operating system, and how it interfaces with the hardware it is running on. Only three device drivers have been included for simplicity, however in practice there will tend to be many more. For example, the hardware devices might be a graphics card, a sound card and a network card. Each of these device types has a well-defined interface, particularly between the operating system and the device driver. As long as the device driver conforms to the rules associated with its device type, in Windows for example, the operating system and any relevant applications will be able to access it.

Figure 4-1 - Typical Operating System

This works by writing device drivers which export a particular set of functions, rather like a dynamic library, and applications or the operating system can call on these functions when they wish to program the hardware device underneath.
Of course the code within each device driver may vary widely, depending on the hardware it is controlling, but the functions it exports on its upper-edge (towards the operating system) must always conform to the rules governing the driver environment. These interfaces may be common for the same class of device, such as all sound cards, but may be entirely different when compared to other classes of device, and a particularly unique system is NDIS. This is described by Dhawan1: “The NDIS (Network Driver Interface Specification) was developed by 3COM and Microsoft. This interface specification defines an interface between the protocol stacks and network adapter card drivers. The network hardware and its associated NDIS driver are independent of the protocol stack and can communicate with each other and transmit and receive packets if the NDIS specification is used.” NDIS is the specification used when developing device drivers for network cards in all varieties of Windows. It standardises access to network cards, so that the same software may be used to access any brand of network device. It is the aim of this project to integrate 1394 into this environment, particularly with Texas Instruments 1394 cards. There are different versions of NDIS in use, depending on the particular release of Windows. This chapter introduces NDIS, particularly NDIS v4.0, which is primarily used in Windows98 and Windows NT v4.0, however Windows95 can be adapted to also implement this version. For a more complete reference on this specification, the NT v4.0 DDK2 should be consulted, which contains the actual specification for NDIS v4.0. 
4.1 NDIS & THE OSI MODEL

The Windows98 Resource Kit3 describes this relationship as: “The modular networking architecture of Windows 98 is based on two industry standard models for a layered networking architecture, namely the International Standards Organization (ISO) model for computer networking, called the Open Systems Interconnection (OSI) Reference Model, and the Institute of Electrical and Electronics Engineers (IEEE) 802 model. Windows NT and Windows for Workgroups are also designed according to these standard models. The ISO OSI and IEEE 802 models define a modular approach to networking, with each layer responsible for some discrete aspect of the networking process. They are only models; they do not correspond exactly to any existing network. However, they can help you understand how networks function in general.”

Although not specified in this citation, the same also applies to Windows95 implicitly, given that this describes Windows for Workgroups and Windows98.

The OSI model describes the flow of communication between well-defined layers in a network configuration. These layers are shown in Figure 4-2. Data to be transmitted by an application travels from the Application Layer, down through the other layers to the lowest layer, the Physical Layer, and across the network media, to be transported back up the layers at the destination node. Each layer performs its own functionality, and is only able to communicate with the layer immediately above and below itself, with no knowledge of the other layers. In this way each layer can be implemented as a single component, which does not rely on the conformance of distant layers. This concept has already been introduced above, with regard to NDIS, where applications are able to function properly with no regard to the intricacies of the network card in use.
This is because the device driver sits at the Data Link layer of the OSI model, and only talks to the Network layer (the protocol) and the Physical layer (the 1394 network card in this instance). Thus the Application layer is far removed from any of these layers, and must only be able to communicate with the Presentation layer.

Figure 4-2 - Layers in the OSI Model

To take this notion further, each layer of the OSI model assumes it is communicating directly with the same layer at the node with which it is communicating. Thus as far as any layer is concerned, the data being transferred has not travelled down through the other layers and eventually across the physical medium, but has travelled straight across to its respective layer at the distant node.

NDIS device drivers relate to the OSI model at the Data Link layer, and this layer is described by Dhawan4: “The purpose of the Data Link layer is to provide the functional and procedural means to transfer data between network entities and to detect, and possibly correct, errors which may occur in the Physical layer. Data Link layer protocols and services are very sensitive to the physical layer technology. While in the upper layers there is one protocol specified per layer, in the lower layers this is not the case. In order to ensure efficient and effective use of the variety of cabling technologies, protocols designed to their specific characteristics are required.”

Within the context of this project, the device driver developer is not only concerned with the characteristics of the 1394 protocol to be used for physical transfer, but also has to consider the Texas Instruments hardware, and how it fits into the local computer hardware. This means knowledge of the PCI internal bus system must be gained.
Most importantly, the developer must know exactly how to program the 1394 card itself for data transfer, and provide functionality to convert data (formatted as a frame) into a raw physical bit-stream for the card and 1394 bus, and similarly convert back to frames at the receiver.

The IEEE802 project committee further sub-divides the Data Link layer into the LLC and MAC layers, and is concerned only with the bottom two layers of the OSI model. The standards within IEEE802 refer specifically to Local Area Networks (LANs) and Metropolitan Area Networks (MANs). These two layers are described in the NT4 DDK5: “The LLC sublayer provides error-free transfer of data frames from one node to another. This sublayer is responsible for the establishing and terminating logical links, controlling frame flow, sequencing frames, acknowledging frames, and retransmitting unacknowledged frames. The LLC sublayer uses frame acknowledgement and retransmission to provide virtually error-free transmission over the link to the layers above. The MAC sublayer manages access to the network media, checks frame errors, and address recognition of received frames.”

Within Windows, the LLC (Logical Link Control) functions are provided by the transport driver, whereas the MAC functionality is assigned to the network interface card, and thus its (NDIS) device driver. The actual Windows network architecture is shown in Figure 4-3, where the Network Adapter, in this instance, can be thought of as a Texas Instruments 1394 card.

Figure 4-3 - Windows Network Architecture

These layers roughly correspond to the following layers of the OSI model:

• The Redirectors, the IFS (Installable File System) Manager and the Network Providers provide the Application, Presentation and Session layer functionality.
• The Transport Protocols, such as TCP/IP or IPX/SPX, correspond to the Transport and Network layers.
• NDIS and the network card drivers reflect the Data Link layer.
The network card driver sits specifically at the MAC layer, and receives information from the Transport layer, which is fully packaged in a framed format, ready for transmission on the Physical layer.

Detailed description of the upper three layers of the OSI model is not relevant to this document, because the project concerns the development of a component at the Data Link layer. If necessary, more information can be found in the OSI Model documentation6, and the model is also well described by Halsall7.

4.2 Windows Driver Architectures

The relationship between NDIS and the OSI model has been introduced, which is interesting from a learning perspective, however it does not fully describe the actual implementation of the driver structure in Windows. These device drivers are also layered, most noticeably in the Universal driver/Mini driver architecture. This section introduces this general architecture, and goes on to present the architecture used within NDIS.

4.2.1 Universal Driver Architecture

The structure of device drivers has changed since Windows 3.1, where the drivers were complex to develop, and were required to contain many services which in reality were common across many such drivers. The introduction of the Universal/Mini driver architecture in Windows95, and its extension in Windows98, has since simplified driver development. This concept defers the common processing tasks to the operating system, leaving the driver to take care of only device-specific code. See Figure 4-4 for a representation.

Figure 4-4 - Universal/Mini Driver Architecture

The Universal driver contains most of the code necessary for a particular class of device, such as a printer, to communicate with the operating system. The Mini-driver contains only the device-specific code, which would otherwise be unknown to the Universal driver. Note from Figure 4-4 that a Mini-driver is not always necessary, if the device is simple enough and abides by a common standard.
This is illustrated by the Unimodem driver, which is a Universal driver for modems, and can directly work with any modem which supports the standard AT commands8.

The only Mini-drivers that were implemented in Windows95 were the Small Computer System Interface (SCSI) and networking drivers. With the advent of Windows98, this architecture has been extended to include many of the newer technologies, such as USB, the 1394 bus, digital audio, DVD players, still imaging, and video capture. Unfortunately this Windows98 1394 support does not make provisions for its use within NDIS; it can only be used by higher level applications to control devices on the bus. In order to use 1394 within NDIS and allow networking protocols to travel over the bus, a new device driver had to be developed by this project, as it has to work at a low level that is abstracted by the Windows98 support. This is unfortunate, and is because NDIS is far removed from the rest of the operating system, and drivers are not allowed to coexist in both environments. NDIS is effectively an autonomous entity with no direct operating system calls, the reasons for which are outlined in Section 4.4.

For information on driver architectures which predate the Universal driver, the Windows95 DDK9 should be consulted.

4.2.2 NDIS Architecture

Like the Mini-driver concept, NDIS versions later than 3.1 implement an architecture that separates device-specific functionality from the rest of the network system. These device drivers are called Miniports, and they are supported by the NDIS subsystem, known as the “wrapper”. This structure is represented in a simple form in Figure 4-5, where the wrapper is shown as ‘NDIS’ and ‘Ndis.vxd’. This provides a neat interface between the device drivers and the protocol drivers. Thus the protocol drivers, such as TCP/IP, can use a well-defined function set in order to carry out functions such as sending a packet over the network media.
Thanks to this interface, the protocol drivers are unaware of the intricacies of the network interface card (NIC) driver, and can function normally regardless of which device driver is lying beneath the interface.

Figure 4-5 - Architecture for NDIS 3.1 or later

Note that there are two device driver possibilities shown in Figure 4-5:

• Miniport – Simple, small device-specific driver, where common functionality is provided by the wrapper.
• VxD Driver – Does not benefit from the extra services provided by the wrapper. These drivers are larger and more complex as the extra code must be provided within the device driver. More commonly referred to as a Legacy Full driver.

NDIS VxD drivers are described by the NT 4.0 DDK10: “Legacy full NIC drivers required driver developers to write a large amount of code to deal with issues that are common across all NDIS drivers. About fifty percent of the code written for a full NIC driver is common to all NDIS drivers, while the other half is specific to the hardware. Writing a full NIC driver places the full burden of coding multiprocessor support and other more difficult operating system programming issues on the NIC driver developer.”

A VxD is a Windows95/98 Virtual Device Driver, where ‘x’ represents the device being controlled. For example a VDD is a Virtual Display Device. VxDs are more complex to develop than Miniports, and usually require some coding in assembly language. They are OS-dependent, hence they are only supported by Windows 95 and 98, and not by Windows NT. The VxD driver is a massive topic, and if required, a complete description can be found in the Windows95 DDK9.

Legacy NDIS drivers are no longer supported by Microsoft, and became defunct after NDIS 3.1. Coupled with this, the entire VxD model has been removed from Windows releases following Windows98.
Miniports are now the recommended implementation to use for NIC drivers, and these have a number of benefits:

• They support newer versions of NDIS, and can thus take advantage of any new features after NDIS 3.1.
• Only hardware-specific code must be written, and they are therefore smaller and less complex to write.
• They are OS-independent, hence it should be possible to compile the same source on any version of Windows, without requiring any changes.

This issue of working on many operating systems is key to the Miniport concept, and would appear to be where the ‘port’ suffix derives from. The main hindrance to portability is which version of NDIS the Miniport was designed for. For example, Windows95 and NT 3.51 only support up to NDIS 3.1, and thus Miniports from later versions may not work if the newer extensions are used. Table 4-1 shows the highest version of NDIS supported by different versions of Windows:

Windows version    Highest NDIS version
WFW3.1             NDIS 2.0.1
WFW3.11            NDIS 3.0
Win95              NDIS 3.1
NT 3.51            NDIS 3.1
Win95b             NDIS 4.0
NT 4.0             NDIS 4.0
Win98              NDIS 5.0
NT 5.0             NDIS 5.0

Table 4-1 - NDIS Support in Windows

Each version of NDIS is backwards-compatible, thus Windows98 provides support for NDIS 2.x up to NDIS 5.0. Note that at the time of writing, NT 5.0 is only in the Beta testing stage, and the actual NDIS implementation may be updated. Win95b refers to Windows95 OSR2, which was never released as an “off-the-shelf” product, and was only available with OEM machines. The original Windows95 release, however, can be upgraded to support NDIS 4.0, and this process is described in APPENDIX A. It was decided that NDIS 4.0 be used within the project, and thus this update has to be performed.

It is also conceivable that operating systems other than Windows could implement the NDIS environment. Miniports should be relatively easy to port to the new system, as long as both the operating system and the miniport conform to the NDIS specification.
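The portability constraint described above can be expressed as a simple lookup against Table 4-1. The following Python is only an illustrative sketch: the names `HIGHEST_NDIS` and `can_host` are hypothetical, and the rule that a Miniport loads whenever its target NDIS version does not exceed the host's highest version is a simplification of the backwards-compatibility statement in the text.

```python
# Highest NDIS version per Windows release, transcribed from Table 4-1.
# Versions are tuples so that they compare numerically, field by field.
HIGHEST_NDIS = {
    "WFW3.1": (2, 0, 1),
    "WFW3.11": (3, 0),
    "Win95": (3, 1),     # can be upgraded to NDIS 4.0 (see APPENDIX A)
    "NT 3.51": (3, 1),
    "Win95b": (4, 0),
    "NT 4.0": (4, 0),
    "Win98": (5, 0),
    "NT 5.0": (5, 0),    # Beta at the time of writing, so subject to change
}


def can_host(windows_release, miniport_ndis_version):
    """Return True if the release supports the NDIS version the Miniport targets.

    Assumes full backwards compatibility: a host supporting NDIS x.y is
    taken to support every earlier version as well.
    """
    return miniport_ndis_version <= HIGHEST_NDIS[windows_release]
```

Under these assumptions, an NDIS 4.0 Miniport such as the one developed in this project is accepted by NT 4.0 and Windows98, but rejected by an original Windows95 installation that has not been upgraded.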
4.3 Dynamic Libraries

One of the major difficulties in designing and developing an NDIS Miniport is understanding the file-type. The primary source of information on NDIS is the Windows NT 4.0 Device Driver Kit (DDK) [2], which is a very good reference once development is under way, but is not suitable for use as a tutorial for the novice device driver developer. It therefore makes no attempt to explain the file type which should be produced by the linker. This situation is compounded where the target system is Windows95, because the NT DDK does not provide information on developing Windows95 NDIS drivers. Reciprocally, the NDIS section of the Windows95 DDK [9] merely directs the reader to the NT DDK for further information. Although this general lack of information made the development of this project difficult, the material in this document should remedy the situation for those undertaking similar projects in the future. The compilation process for a Windows95 NDIS Miniport is not documented in either operating system’s DDK, but Section 5.3 describes the method that must be used.

The Miniport driver is basically a 32-bit protected mode DLL (Windows Dynamic Link Library). DLLs can be called dynamically to provide functionality to other processes. The term ‘dynamic’ is used to distinguish such libraries from those that provide functions which are included in an application’s binary file when the application is compiled and linked. Thus with dynamic libraries, the actual code in the library is not included in the distributed binary file, and instead is provided whenever the application requires it. For further information on DLLs, consult the Windows98 Resource Kit [3].

The major difference between normal DLLs and Miniports (as well as VxDs) is that DLLs run at a less privileged processor level, ring-3, the same level at which applications run.
Device drivers, however, must run at privilege level 0, or ring-0, which allows them access to low-level system resources, memory etc. Miniports run in a single 32-bit flat model address space, and of course have ready access to virtual memory. They also have somewhat indirect access to physical memory addresses, which can be accomplished by page-locking. This ensures that a virtual memory range is always mapped to a given physical address range, and virtual addresses can then be used to access physical memory on a hardware device.

Although not specified by the DDK, the file extension assigned to Miniports is .SYS, not to be confused with 16-bit MS-DOS drivers. The actual file format is the Portable Executable (PE) format, which is common across all Windows platforms since Windows 3.1. This format is used not only for Miniports, but for all Windows executables, such as applications or DLLs. The linker in any Windows development platform should create files in this format, and thus the intricacies of the format need not be learned. To give an idea of the format, however, a PE file is laid out in a very similar fashion to how the process will actually look once loaded into virtual memory. Thanks to this structure, the Windows loader need not perform a great deal of processing to run the file. If more information is required, Peering Inside the PE [11] provides a very good tutorial.

4.4 NDIS Interfaces

The interfaces involved in NDIS have now been introduced, and this section provides some more depth on the subject. Although in Figure 4-5 it appears that the NDIS subsystem sits only between the Miniport and the Protocol drivers, in actual fact NDIS provides functionality on both “sides” of the Miniport. The functions on the Protocol side are termed ‘upper-edge’ functions, and those towards the hardware ‘lower-edge’. As the Miniport is a type of DLL, it ‘exports’ functions on its upper-edge, and it ‘calls’ functions on its lower-edge.
That is, it provides functions to the Protocols, which can be called dynamically, and it utilises lower-edge (NDIS library) functions to talk to the card. NDIS is the intermediary between both sides, and thus no direct communication occurs. This is illustrated in Figure 4-6, where the “NetCard” can be thought of as a 1394 card. From this diagram it is easier to see where the term ‘wrapper’ comes from.

Figure 4-6 - NDIS Wrapper

Note that the Native Media Aware Protocol and the Intermediate driver are only present in special situations, but have been included for completeness. An NDIS Intermediate driver can be used to process packets before they reach the Miniport, for example to perform a security check, or frame conversion. A Native Media Aware Protocol allows you to develop your own protocol, which talks directly to the Miniport, and may be useful for unsupported network-media types where normal protocols, such as TCP/IP, are not wanted.

To explain the NDIS system, an example process might be TCP/IP sending a packet to a distant machine on the 1394 bus. The process should occur as follows:

1. The protocol driver constructs a packet (IP datagram), and packages it into a media-specific frame (1394 packet).
2. The protocol driver calls NDIS to pass the frame down to the Miniport. It does this using the NDIS library function NdisSend, and NDIS passes the packet on to the Miniport’s Send function (for sake of argument MiniportSend).
3. MiniportSend is called in the Miniport, which takes the packet and places it onto the 1394 physical medium. Talking to the hardware is accomplished using the NDIS library functions, for example NdisWritePortUchar for port I/O.
4. Any status, such as a failure, is indicated back up to the Protocol via the NDIS wrapper.

It can be seen then, that both the Protocol driver and the Miniport are isolated not only from each other, but from the hardware and the rest of the operating system.
Any functions that are called by an NDIS driver are provided by the NDIS wrapper. In this respect, the developer is limited to a particular function set, and cannot link in the usual code libraries. However, the NDIS library provides a great deal of functionality, which should be sufficient for the needs of a driver. This abstraction from the operating system is key to cross-platform portability. No operating system specific function calls should be made, which means that any operating system can implement NDIS by mapping the NDIS calls onto its own proprietary calls.

An important point to understand is that the protocol packages the frames for the Miniport, and thus must be aware of the media type in use. The example process above showed that the protocol should package the information into a 1394 packet; in practice, however, the Windows protocols do not support 1394 packet formats. The common types are supported, such as Ethernet and Token Ring, and therefore an Ethernet Miniport, for example, need not process the frame, as header information such as the destination and source addresses is already present in the frame. The fact that only certain media types are supported by the protocols causes problems when new network types are to be used, a situation very apparent in the development of this project. The simple way round this is to accept a supported frame-type from the protocols, and convert this to a 1394 packet before sending. Section 5.2.6 discusses the possible ways to implement such a system.

Miniports have to provide certain mandatory functions on their upper-edge, such as those used for sending packets and querying the network status. These functions, as well as the NDIS library functions, are fully specified in the NT DDK [2], and this should be consulted if more research is required.
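The four-step send process described in this section can be modelled in a few lines to show how the wrapper mediates every call. This is a toy model only, not driver code: the classes `FakeCard`, `Wrapper` and `Miniport`, the function `protocol_send`, the `b"HDR"` placeholder framing and the string status values are all hypothetical stand-ins, and the methods merely play the roles that NdisSend, MiniportSend and NdisWritePortUchar play in the text.

```python
class FakeCard:
    """Stand-in for the network card: records every byte written to it."""

    def __init__(self):
        self.written = bytearray()


class Wrapper:
    """Toy NDIS wrapper: the only path between protocol, Miniport and card."""

    def __init__(self, card):
        self.card = card
        self.miniport = None

    def register_miniport(self, miniport):
        self.miniport = miniport

    def send(self, frame):
        # Upper edge: plays the role of NdisSend (step 2) and returns the
        # Miniport's status to the protocol (step 4).
        return self.miniport.miniport_send(self, frame)

    def write_port_uchar(self, value):
        # Lower edge: plays the role of a port I/O helper (step 3).
        self.card.written.append(value)


class Miniport:
    """Toy Miniport: touches the card only through wrapper functions."""

    def miniport_send(self, wrapper, frame):
        if not frame:
            return "FAILURE"
        for byte in frame:
            wrapper.write_port_uchar(byte)
        return "SUCCESS"


def protocol_send(wrapper, payload):
    # Step 1: the protocol frames the data itself before calling the wrapper.
    frame = b"HDR" + payload  # placeholder framing, not a real header
    return wrapper.send(frame)


if __name__ == "__main__":
    card = FakeCard()
    wrapper = Wrapper(card)
    wrapper.register_miniport(Miniport())
    print(protocol_send(wrapper, b"ip-datagram"), bytes(card.written))
```

Note that neither `protocol_send` nor `Miniport` holds a reference to the other or to the card; both see only the wrapper, which is the isolation property the text describes.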
This chapter is intended only as an introduction to the structure of NDIS, and although subsequent chapters introduce more in-depth topics, the NT DDK should be consulted for a full understanding.

References

[1] Dhawan S. (1995), Networking Device Drivers, Van Nostrand Reinhold, p94
[2] Microsoft (1996), Windows NT Version 4.0 Device Driver Kit, Microsoft Corporation
[3] Microsoft (1998), Windows 98 Resource Kit, Microsoft Corporation
[4] Dhawan S. (1995), Networking Device Drivers, Van Nostrand Reinhold, p13
[5] Microsoft (1996), NT4 DDK: Network Drivers Design Guide Part 1, Microsoft Corporation
[6] ISO (1984), Basic Reference Model for Open Systems Interconnection, ISO:7498
[7] Halsall F. (1996), Data Communications, Computer Networks and Open Systems, 4th Ed, Addison-Wesley
[8] Buchanan W. (1999), PC Interfacing, Communications and Windows Programming, Addison-Wesley, p648
[9] Microsoft (1996), Windows 95 Device Driver Kit, Microsoft Corporation
[10] Microsoft (1996), NT4 DDK: Network Drivers Design Guide Part 2, Microsoft Corporation, Section 1.3
[11] Pietrek M. (1994), Peering Inside the PE: A Tour of the Win32 Portable Executable File Format, Microsoft Systems Journal (March 1994)

5. DRIVER DESIGN

5.1 Networking & IEEE-1394

IEEE-1394 has already been introduced as a bus standard in Chapter 3, which discussed the various capabilities of this new technology. It detailed how the current usage of 1394 has been focussed on interoperability. This project presents a unique opportunity to use the 1394 Bus in an entirely new way. The following section outlines the many decisions which had to be made on how the 1394 Bus could best be utilised as a computer networking medium, followed by a description of how the final driver design concept was determined.

1394 can operate as its own protocol, without carrying any higher-level protocols, such as TCP/IP, over it. Thus it can be used as a complete bus for interoperability, and would tend not to be compared with the OSI model.
However, it may be useful to draw comparisons with the OSI model when using 1394 for carrying networking protocols over the medium. To take the example of carrying TCP/IP traffic, this could fit into the OSI model as shown in Figure 5-1. There are some bus management functions that, if present, may sit higher up in the model, but to keep the model simple they shall be omitted.

Figure 5-1 - 1394 and the OSI Model

5.1.1 Asynchronous/Isochronous Traffic

To use the 1394 Bus as a computer networking medium, some design decisions needed to be made on which characteristics of the 1394 Bus are useful and which can be discarded. Perhaps the decision with the greatest impact was the choice of purely asynchronous traffic. Isochronous traffic was a major goal in the development of 1394, and stands out as one of its main features. However, that does not mean that it should be used in every 1394 platform, and by using only the asynchronous capabilities we can still take advantage of high-speed communications, and all of the other features of 1394. As we have seen, asynchronous is the conventional transmit/acknowledge transfer to an explicit address, and isochronous is a broadcast, unacknowledged service for guaranteed bandwidth applications. Isochronous traffic was left unimplemented for the following reasons:

• It cannot be used for sending data to an explicit node on the Bus. Instead it broadcasts to all nodes listening to a certain channel.
• It is for guaranteed bandwidth applications, and the current Windows networking implementation, for example the TCP/IP protocols, makes no provision for attaching a bandwidth requirement, or quality of service (QoS), to a packet that must be sent.
• The project must be implemented in a reasonable time-scale.

5.1.2 Broadcast Networks

As discussed later in this chapter, it was decided that the eventual product should work in a similar fashion to Ethernet, which provides only asynchronous capabilities.
Explicit addressing is very important, because packets to be sent are normally sent to an explicit node on the bus, except for such broadcast events as ARP (Address Resolution Protocol within TCP/IP), and for multicast sends. Such networks are termed broadcast networks, described by Mackenzie (1998) [1]:

“In a broadcast communication network, the subnet has no switching nodes, and all hosts share a single physical medium, which may be a cable, a fibre or free-space. A transmission from one host is broadcast to all others, in the sense that it is placed in the medium, where it becomes visible to the receiving apparatus at each possible destination. Each message must carry some form of destination identifier or address as part of its protocol control information. A possible destination only recognises that it is the intended target by examining the address in the message and identifying it as its own.”

Asynchronous transfers on the 1394 Bus conform to the above definition, as each node acts as a repeater, but with the added complexity of addressing within each node. Not only does the asynchronous packet identify the destination node, it also identifies an offset within the node’s memory space. This is useful for controlling external, dumb devices and can also be used to set various 1394 operational registers at each node. This concept is fully described in the Control and Status Register (CSR) Architecture for Microcomputer Buses specification [2]. The result of conformance to this architecture is that each device on the 1394 Bus can be effectively locally mapped or memory-mapped to the other devices on the Bus.

The implications of CSR in a computer networking environment were considered and it was decided that the memory offset be ignored, in order to work in a similar fashion to Ethernet. Only the 16-bit node ID would be used to identify the destination machine, and the other 48 bits of the CSR address can be ignored as they are not required for network addressing.
To standardise this, it was decided that all 48 bits would be set to zeroes in each packet. Alternatively, future implementations may find some other use for these redundant bits.

1394 does not inherently make any provision for multicast asynchronous sends, and as such it was decided that multicast be unsupported in the implementation. The implications of this are not greatly important, but are also difficult to measure. How much of an adverse effect this would have will vary widely between different networks. It is envisaged that the majority of traffic on the eventual networks will be unicast, however.

If multicast transfers were deemed to be a solid requirement, it would be possible to implement them using isochronous channels. Each multicast group would be identified by a unique 1394 isochronous channel number. To do this, some work must be carried out in developing a new protocol to communicate the channel identifications to each member of the multicast group. Alternatively it may be possible to set aside a group of 1394 Bus IDs, each to be used for one multicast group; however, the Texas Instruments development cards provided for use with this project only make provision for listening for one Bus ID. There are workarounds then, but generally they would cause problems on unicast sends, or would require some extra management, all of which begin to stray from the 1394 specification.

Broadcast asynchronous transactions are specified in 1394-1995 [3], which states that for destination addresses containing entirely binary 1’s “All nodes on the local bus shall recognize the packet”. However, during discussions with David Wooten of the 1394 Trade Association, it became apparent that there are problems with broadcast sends. This is due to poor early implementations of 1394 link silicon. Both broadcast and multicast are to be properly supported in future 1394 standards using “asynchronous streams”.
The p1394a draft specification [4] discusses these:

“An advantage of an asynchronous stream is that broadcast and multicast applications that do not have guaranteed latency requirements may be supported on Serial Bus without the allocation of a valuable resource, bandwidth. An additional advantage is that asynchronous streams may be easily filtered by contemporary hardware.”

It is unfortunate then, that the development cards for use with the project conform to the earlier standard 1394-1995, and as such, some other methods of achieving broadcast and multicast must be found.

The implications of the lack of broadcast far outweigh those of the lack of multicast. As mentioned above, TCP/IP uses broadcast for ARP, which enables the protocol suite to identify the MAC address related to a specific IP address. With 1394-1995, some method had to be developed to enable broadcast to occur, in order to properly implement TCP/IP. The design decision and some alternatives for implementing this are identified in Section 5.2.10. For a discussion of ARP, see Tanenbaum (1996) [5].

5.1.3 Bus Management

There are two bus management entities that can be implemented on the 1394 Bus. They can reside on any node, and do not have to be on the same node. The isochronous resource manager provides services such as guaranteeing adequate bandwidth and allocating channel numbers, whereas the bus manager provides services such as maintaining topology maps. Clause 3.8 of the 1394-1995 specification [6] states that “a valid combination is the absence of any bus manager entity in which case no isochronous traffic is allowed.” Given that no isochronous traffic is required by the design, it is deemed acceptable to omit both entities, and function as an unmanaged bus. The 1394-1995 specification also details the fact that cycle masters are only required for isochronous traffic, and thus a cycle master and slaves are not required by this design.
Further information on cycle masters and bus management can be found in Clause 8 of the 1394-1995 specification [7].

5.2 IEEE-1394 & NDIS

NDIS and 1394 have, until now, been two entirely separate technologies, and nothing was found to suggest that a relationship between the two had been implemented in the past. This presented a particularly fulfilling project objective, where there was scope to entirely shape the product, and no need to conform to previous methods of implementation. To this end, it was possible to investigate a number of different routes, before arriving at the final design choice. Perhaps the decision with the greatest impact was the type of device driver to implement. There are four possibilities:

• VxD – Full legacy NIC driver
• Miniport NIC driver
• Intermediate Driver / TI API
• Intermediate Driver / WDM Driver

5.2.1 Legacy VxD Driver

Legacy drivers were introduced in Section 4.2.2. They are “full” in that they must deal with a number of complex operating system issues, as well as the normal hardware issues related to the network interface card (NIC). They are described in the NT DDK [8]:

“Introduced with the initial release of Windows NT, full NIC drivers require the device driver writer to program at a more basic level, dealing with kernel-mode issues of multiprocessor support and processor and thread synchronization. Full NIC drivers require driver developers to write a large amount of code to deal with issues that are common across all NDIS drivers.”

It is, of course, desirable to the developer that more issues are taken care of by the operating system. It may, however, be that control over this extra functionality is required in the driver, but this scenario seems unlikely given that Microsoft have officially stopped supporting legacy drivers. Certainly in this case, control of these features is not a design requirement, and would serve only to add unnecessary overhead to the workload.
Another very important consideration with legacy NIC drivers is that they are written specifically for a particular operating system. That is, a Windows95 legacy NIC driver could not simply be recompiled on NT. The amount of work involved in porting to a different operating system would differ between drivers, but complete cross-compatibility is a useful feature.

Finally, legacy NIC drivers are not supported in NDIS 4.0 or above, thus the extensions provided by later versions are unavailable.

5.2.2 Miniport NIC Driver

Miniport drivers overcome the problems faced by legacy NIC drivers that were outlined above. The NT DDK [8] says the following of Miniports:

“Windows NT supports miniport NIC drivers to allow developers to write only the code that is specific to the network hardware, merging the common functions into additional services provided by the NDIS library. Miniport NIC drivers are smaller and faster, requiring much less work to write. Miniport drivers are able to defer handling of many issues to the NDIS library.”

Thus Miniports immediately appear to be an attractive target type. Less functionality has to be provided by the developer, and they are smaller, and thus have a reduced footprint in system memory. Also, during fault-finding, faults are easier to find when there is less code. Most important of all, however, the project can enjoy a shorter development time; other than the features that are now taken care of by the NDIS library, the code is essentially very similar in a Miniport.

5.2.3 Intermediate Driver / TI API

Miniports and legacy NIC drivers must contain hardware-specific code, in order that the hardware be controlled. This means that this code has to be written by the developer, and the intricacies of the hardware must be learned in order to do this. However, an API (Application Programming Interface) is provided with the Texas Instruments 1394 cards.
This API exports a library function-set which can be used to control the 1394 cards. Functions are provided to program all operations on the devices. The API is a VxD device driver, called pcilynx.vxd. It is only supported in Windows95/98; no equivalent is provided for NT.

Figure 5-2 - Intermediate Driver Structure

It is possible within NDIS to implement a structure similar to Figure 5-2. Here the intermediate driver would be the output from this project, and it sits between the protocol drivers and the TI-supplied API driver. Just as with a Miniport/legacy driver, the intermediate driver accepts commands on its upper-edge; however, instead of directly controlling the hardware, it communicates with the API to perform the hardware functionality. In effect, it is masquerading as a Miniport to the protocol drivers.

The attractiveness of this implementation lies in being able to omit the hardware-specific code, leaving it up to the API. One or two fairly simple function calls could be used to perform otherwise complex operations on the hardware. This structure was afforded a great deal of consideration, and a number of problems were identified.

The intermediate driver, by specification, is intended for protocol-level functionality, perhaps to perform tests on packets before they are passed to the Miniport below, or to change the frame format before frames are sent. The proposed structure deviates from the NDIS specification, where the hardware functionality is always performed by an NDIS driver. It effectively makes a call to a driver outside NDIS, which undermines the wrapper concept. The NT DDK does not even recognise that calls outside NDIS are possible, but experimentation has proved that they are. For example, the Windows95 DbgPrint technique, documented in Section 5.5.1, makes a call to VxD wrapper services that are supposedly unavailable to NDIS drivers. Although it is wrong to make such calls, it may be the only way to perform a certain action.
In the case of DbgPrint, however, and possibly universally, such a call makes a driver OS-dependent. This is not particularly important for a debug technique that will not be present in release-level code; however, OS-independence is important in core functionality in release versions. OS-independence is in fact impossible while the TI API is only available for Windows95/98, although it is conceivable that an NT API will be released eventually.

Apart from not conforming to NDIS, there are other issues to consider in developing an intermediate driver. No control can be gained, internally to the API, unless the source code is available. Texas Instruments were contacted regarding the possibility of obtaining this source code, but it was only available at a price, and a decision was made that the benefits of obtaining this would not justify the cost. Thus one cannot be sure of the methods used within the API, or of the efficiency of those functions required by the project design. Considering that the project was rather unusual, it was deemed appropriate that full control be available over the hardware and 1394 Bus.

At a higher level, this method introduces some cross-communication between drivers, and this might produce inefficiencies, which are undesirable when working with high-speed communications. Coupled with this, the synchronisation and multi-processor issues mentioned with regard to legacy NIC drivers may not be implemented as required by NDIS, and interrupt-handling, usually taken care of by NDIS, becomes confused.

Finally, the API is provided as a universal access method to the hardware. A lot of this functionality may not be required by the design, for example isochronous transfer, in which case there is redundant code consuming resources.

5.2.4 Intermediate Driver / WDM 1394 Bus Class Driver

Windows98 and the forthcoming Windows NT5 also provide driver support for 1394 buses.
This support is entirely removed from NDIS, and the intended use is to control remote devices, such as video cameras, on the 1394 Bus. The support is via a WDM (Windows Driver Model) driver, which works in a similar fashion to the Texas Instruments API. It is conceivable that an intermediate driver could be written to sit in the NDIS structure, and the pcilynx.vxd API would be replaced with this bus class driver.

This method would suffer from the same problems as the TI API method, as it is basically doing the same thing. Furthermore, WDM drivers are not supported by Windows95 or Windows NT4/3.51, thus the time spent on developing such a driver would have to be supplemented with work on support in older Windows versions. Although it is good to support the new models such as WDM, it presents problems when older versions of Windows need to be supported. Given that this project primarily requires a Windows95 driver, this was deemed an unsuitable development route.

This issue introduces an interesting dilemma faced by device driver developers in general, when there are a number of driver models to follow. They must make trade-offs depending on which operating systems are to be supported, and often would dismiss new and possibly superior models, to ensure compatibility across platforms.

The fact that these new drivers are available with WDM may be useful for the normal (interoperability) operation of the 1394 bus; however, it must be stressed that they do not provide networking support. The result of this project would still be used in the new operating systems when networking is required.

5.2.5 Driver-Type Design Choice

Although both intermediate driver methods reduce the amount of code that needs to be developed, the problems described above were deemed to outweigh this advantage.
Thus the choice lay between a Miniport and a legacy NIC driver, and given that no disadvantages of Miniports over legacy drivers were identified above, the final design choice lay with Miniports. Miniports are a progression from legacy drivers, and are the Microsoft-advised method. This decision was also agreed with the project supervisor. The intended NDIS structure is shown in Figure 5-3.

Figure 5-3 - Driver Position within NDIS

5.2.6 Ethernet Emulation

The fact that NDIS supports a restricted set of media types presents an interesting problem when developing for a new media type such as IEEE-1394. Any NIC driver must tell the NDIS library which medium it supports. With NDIS 4.0, the following media types are supported:

• Ethernet (802.3)
• Token Ring (802.5)
• FDDI
• LocalTalk
• ARCNET
• WAN (point-to-point and WAN cards)
• Wireless

Given that 1394 is not currently recognised by Microsoft as a network type, and the fact that it is such a new technology, it is not supported by any available version of NDIS. This is not an ideal situation, but it is not insurmountable. The 1394 Miniport can “pretend” to the upper layers that it is one of the supported types. The medium to masquerade as is not extremely important, and Ethernet was chosen for the following reasons:

• The developer has previous knowledge of Ethernet, and understands the basics of transmission and frame formatting.
• It is a fairly basic medium, with simple concepts for addressing destination nodes etc.
• It should be easy to translate 1394 to encompass these concepts.
• It should be supported by most of the protocol drivers, which, for example, ARCNET is not.
• Frames can contain a fairly large amount of data (up to 1500 bytes), thus there should be no need to send more 1394 packets than necessary. If the data length in each frame was limited to a smaller value than 1394 could support, the abilities of 1394 would not be utilised to the maximum. See Table 5-1 for an indication of 1394 packet sizes.
The NT DDK [9] suggests that “ATM media should implement LAN-emulation functions within themselves, and report their medium type as Ethernet 802.3 or 802.5”. Although this is referring to ATM, it may be taken as general advice to support one of these two types for any non-native media.

NDIS protocol drivers output frames correctly formatted for the chosen medium. Thus if a 1394 Miniport chose to emulate Ethernet, it would receive packets to be sent onto the 1394 Bus, with some Ethernet framing data already present. The actual fields present are the Destination Address, Source Address and Data Length fields. It is normal that an Ethernet NIC would add the rest of the fields on transmission (Preamble, Start Delimiter, FCS and Delay). Thus an Ethernet Miniport does not have to take care of any framing issues; it merely passes the frame on to the NIC for a send, or passes it straight up to the protocol drivers on a receive. For further information on the Ethernet frame format, consult Buchanan (1997) [10].

The actual operation of the 1394 Miniport then, would be to accept an Ethernet frame to be sent, and encapsulate it in a 1394 packet, before placing it on the 1394 Bus. This process is depicted in Figure 5-4.

Figure 5-4 - Ethernet/1394 Send Process

The resultant 1394 packet would contain the whole Ethernet frame, surrounded by the 1394 framing information. Figure 5-5 shows a very basic view of the resultant 1394 packet. The 1394 fields are representative of the packet formatting information. In 1394 language, the Ethernet frame would constitute the data payload. The format of these 1394 packets is described in the 1394-1995 specification [11].

Figure 5-5 - 1394 Encapsulated Ethernet Frame

Including the Ethernet framing fields is effectively including redundant information. The addressing and data length fields take up exactly 14 bytes, and ideally these would be stripped.
The problem is that they would have to be reconstructed at the far end before being passed back up to the protocol drivers. This may be fairly simple using the information in the 1394 packet header, but this would take up valuable processor time, and may slow down packet reception. Given that 14 bytes is not a massive overhead, the information has been deemed small enough not to pose a problem.

Other implementation possibilities are:

• Intermediate Driver
• Rewrite protocol drivers

An intermediate driver could be written to sit between the protocol drivers and the Miniport. For sends it would convert Ethernet frames from the protocols into 1394 packets, and perform the reverse on receives. Unfortunately, there will be processing and resource overheads in introducing an extra driver layer, which would slow the system down. Also, this intermediate driver would perform such a small amount of processing on the packet that the need for separating it from the Miniport is questionable.

Rewriting the protocol drivers is a very drastic option, and would be a rare choice for supporting new media types. All of these drivers, such as for TCP/IP or IPX/SPX, would have to be changed if they were required over the 1394 Bus. They would then output correctly formatted 1394 packets, which could be placed immediately onto the bus by the Miniport. Target customers may not want to replace their protocol drivers when they upgrade to 1394 hardware, and if they were using proprietary protocols, these would no longer work.

Ideally, 1394 should become an NDIS-supported medium, at which time Microsoft would release 1394-ready versions of the protocol drivers. An Internet Engineering Task Force (IETF) Working Group is in fact working on an IP over 1394 standard [12], which will hopefully be adopted by Microsoft within its IP driver, such that 1394-formatted packets will be passed down to the Miniport.
IPover1394, however, is to support only p1394a hardware, and to support the other protocols, such as IPX/SPX, the Miniport would still have to report itself as a Windows-native media type. Both the intermediate and protocol driver options still require a Miniport to be developed, and provide little help to the Miniport. Therefore, the time and costs involved in developing these new drivers are deemed to far outweigh the advantages.

5.2.7 1394 Transaction Type

There are two types of transaction on a 1394 Bus that perform an asynchronous send. These are:

• Write request, data quadlet
• Write request, data block

Both of these actions allow data to be sent to an explicit address on an explicit destination node. Both consist of a header and a payload in which the data to be sent is placed. A quadlet write can transfer a 32-bit data payload, whereas a block write can transfer one or more quadlets of data. Block writes are limited to a maximum payload size, depending on the data rate. The maximums in a cable environment are given by Table 5-1.

Data rate                 Maximum payload size (bytes)
S100 (Cable base rate)    512
S200                      1024
S400                      2048

Table 5-1 - Maximum Payload Sizes

The minimum size of an Ethernet frame that might be passed down from the protocol drivers is equal to the minimum length of data added to the header size. As discussed in Section 5.2.6, the header information takes up 14 bytes for Destination Address, Source Address and Data Length. The minimum number of data bytes is 46, up to a maximum of 1500 bytes. Thus the minimum frame size that would need to be contained in a 1394 packet is: 14 + 46 = 60 bytes. A quadlet write (4 bytes) would therefore not be sufficient to transmit the smallest possible Ethernet frame, and it was decided that block writes be used for all frame transmission. The largest possible Ethernet frame would contain 1514 bytes. This could not be transmitted at S200, at which the maximum payload is 1024 bytes.
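The size constraints above can be expressed as a small check. This is only a sketch in standard C: the `bus_speed` type and the function names are hypothetical and introduced for illustration, while the numeric limits are those of Table 5-1 and the Ethernet header and data sizes quoted in the text.

```c
#include <stddef.h>

/* Sizes quoted in the text: 14-byte Ethernet header (destination,
 * source, length), 46..1500 data bytes, 4-byte quadlet. */
enum {
    ETH_HEADER_BYTES   = 14,
    ETH_MIN_DATA_BYTES = 46,
    ETH_MAX_DATA_BYTES = 1500,
    QUADLET_BYTES      = 4
};

/* Hypothetical speed codes, used by this sketch only. */
typedef enum { SPEED_S100, SPEED_S200, SPEED_S400 } bus_speed;

/* Maximum block write payload in bytes, as listed in Table 5-1. */
static size_t max_block_payload(bus_speed speed)
{
    switch (speed) {
    case SPEED_S100: return 512;
    case SPEED_S200: return 1024;
    case SPEED_S400: return 2048;
    }
    return 0; /* unknown speed: nothing can be sent */
}

/* Returns 1 if an Ethernet frame of frame_bytes fits in the payload
 * of a single block write at the given speed, otherwise 0. */
static int frame_fits(size_t frame_bytes, bus_speed speed)
{
    return frame_bytes <= max_block_payload(speed);
}
```

With these definitions, the smallest frame (14 + 46 = 60 bytes) fits at every rate but exceeds a quadlet, while the largest frame (14 + 1500 = 1514 bytes) fits only at S400.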
Given that the project development cards are only capable of up to S200, there are three possibilities:

• Notify NDIS/protocols that the maximum frame size is 1024 bytes.
• Split larger frames up into two smaller frames.
• Upgrade to S400-capable cards.

The first option would be the simplest to implement, and should not have an adverse effect on the network. Thus the final design choice here lay with asynchronous block writes for all frame transmission, up to a maximum Ethernet frame size of 1024 bytes.

Note that the actual 1394 block write packet also contains a header, which is not counted in the limits shown in Table 5-1. The header size is 4 quadlets, thus the actual maximum packet size at S200 would be: 1024 + 16 = 1040 bytes. There are also 2 quadlets which provide a CRC for the header and the payload, bringing the transmitted packet up to 1048 bytes. The block write packet format is shown in Figure 3-3; for detailed information on the 1394 packet structures, consult the 1394-1995 specification [11].

5.2.8 MAC Addresses

One problem with acting like an Ethernet NIC is that NDIS expects each NIC to have a unique 6-byte MAC address. This is used by the protocol drivers to construct a correctly addressed Ethernet frame on a send. Thus the destination and source address fields will contain the MAC addresses of the destination and source nodes. Thanks to this, a real Ethernet adapter can pick up frames addressed to itself, and the Miniport need not know anything about which node the frame is destined for.

1394 Buses use a different addressing scheme, which is only 2 bytes long, shown in Figure 5-6. The first 10 bits identify the bus number, and the final 6 bits identify a particular device on that bus. These IDs are dynamic, and can change after any bus reset. Thus it would be unsuitable to notify protocols of this information, as it can change whenever a new device is connected to or removed from the bus, or after any other action that causes a bus reset.
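As an illustration of the 2-byte node ID layout just described, the following sketch splits an ID into its two parts. It assumes that the "first 10 bits" are the most significant bits of the 16-bit value; the function names are hypothetical.

```c
#include <stdint.h>

/* 16-bit 1394 node ID: upper 10 bits = bus number,
 * lower 6 bits = device (physical) ID on that bus. */
static uint16_t node_id_bus(uint16_t node_id)
{
    return (uint16_t)(node_id >> 6);
}

static uint16_t node_id_phys(uint16_t node_id)
{
    return (uint16_t)(node_id & 0x3F);
}

/* Build a node ID from its parts; out-of-range bits are masked off. */
static uint16_t node_id_make(uint16_t bus, uint16_t phys)
{
    return (uint16_t)(((bus & 0x3FF) << 6) | (phys & 0x3F));
}
```

For example, bus number 0x3FF with device 5 packs to 0xFFC5. Because the device part is reassigned on every bus reset, a value like this cannot serve as a stable address for the protocol drivers.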
Figure 5-6 - 1394 Node ID Format

Given that the Windows protocol drivers assume that an address is static, some means of providing each 1394 NIC with a unique static ID is required. That ID should also be 6 bytes long, in order to comply with Ethernet MAC addresses. When each 1394 Miniport registers itself with NDIS, it is queried for the NIC’s MAC address, and it should pass back some unique address. How that address should be calculated or assigned is a point of consideration. It could be assigned by the network administrator, in much the same way as they must assign unique IP addresses. Alternatively, it could be calculated internally on installation, or on loading the Miniport. There are a few possibilities for obtaining a unique MAC address, some of which are listed below:

• Based on the time at installation, or on first loading the Miniport.
• Based on some unique information from the Windows registry, possibly the Windows serial number.
• Based on the 1394 card serial number.

The immediately appealing option is to use the 1394 serial number, as it is the most likely route to a unique 6-byte number. The serial number is stored in a serial EEPROM on the PCI development cards, and is 2 bytes long. The remaining 4 bytes can be made up of almost anything, although resultant MAC addresses to avoid are:

• All ones (FF:FF:FF:FF:FF:FF in hexadecimal) - reserved for broadcast sends.
• Those with a 1 in the individual/group bit, which is the first bit transmitted (the least significant bit of the first byte) - reserved for multicast.

Further information on MAC addresses and reserved values can be found in Fundamentals of Ethernet Technology [13].

Reading the serial number is not as simple as reading from the PCILynx register space, which is the normal method for programming the development cards. Sample code [14] for doing so is provided with the cards, and the format of the EEPROM is described in the Lynxsoft guide [15].
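One way the serial-number option might look in code is sketched below. The 4-byte prefix, the byte order of the serial number and the function names are assumptions made for illustration; the EEPROM read itself is not shown. The multicast test uses the group bit as defined for IEEE 802 addresses, which is the least significant bit of the first byte in the usual byte-wise notation.

```c
#include <stdint.h>
#include <string.h>

#define MAC_BYTES 6

/* Returns 1 if mac can be used as a station (unicast) address:
 * it must not be the broadcast address and must not have the
 * group (multicast) bit set. */
static int mac_is_valid_unicast(const uint8_t mac[MAC_BYTES])
{
    static const uint8_t broadcast[MAC_BYTES] =
        { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF };

    if (memcmp(mac, broadcast, MAC_BYTES) == 0)
        return 0;
    if (mac[0] & 0x01)          /* group bit of the first byte */
        return 0;
    return 1;
}

/* Build a 6-byte MAC from a 4-byte prefix (hypothetical, chosen by
 * the implementer) followed by the 2-byte card serial number,
 * high byte first (an assumption of this sketch). */
static void mac_from_serial(const uint8_t prefix[4], uint16_t serial,
                            uint8_t mac[MAC_BYTES])
{
    memcpy(mac, prefix, 4);
    mac[4] = (uint8_t)(serial >> 8);
    mac[5] = (uint8_t)(serial & 0xFF);
}
```

As an example, the purely illustrative prefix 02:00:13:94 with serial number 0x1234 gives 02:00:13:94:12:34, which passes the check. A prefix whose first byte is odd would be rejected as multicast.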
Note that if other manufacturers’ 1394 cards are eventually supported by such Miniports, and are to co-exist with this Miniport, then a suitable algorithm must be chosen to create MAC addresses such that it cannot result in conflicting addresses. It is envisaged that initially MAC addresses will be chosen by an administrator, until such a time as the above EEPROM method is implemented.

5.2.9 MAC/Node ID Translation

The previous section discussed the allocation of a unique MAC address for each 1394 NIC. This may take care of the upper edge of the Miniport, in that it will have information for NDIS regarding its MAC address, but it does not provide a method for translating these MAC addresses to a 1394 node ID, like that shown in Figure 5-6. As it stands, the protocol drivers would be passing down frames addressed to certain MAC addresses, and the Miniport would not know which 1394 node each MAC address referred to. Thus it could not place the frame on the 1394 bus.

In simple terms, the Miniport has to do the following when it is passed an Ethernet frame:

1. Check the destination MAC address of the frame passed to it.
2. Translate this to a 1394 node ID.
3. Build a 1394 packet addressed to the correct 1394 node ID, with the Ethernet frame as the data payload.

The simple answer would be to somehow derive the 1394 node ID from the MAC address; however, this is not possible in a real implementation, as the node IDs are not constant and can change after a bus reset. MAC addresses must remain static in order not to confuse the upper layer protocols, and therefore cannot also change after a bus reset. This is a complex problem, and one that was given a great deal of consideration.

Each Miniport must, in some way, maintain knowledge of the translations of 1394 Node IDs to MAC addresses. To gain this knowledge, some mechanism had to be developed to enable the Miniports to build up a database of MAC to 1394 address translations.
It was decided that, on loading or after a bus reset, each Miniport would notify all other nodes of its MAC address and its 1394 ID. Thus each Miniport would receive the details from other nodes, and end up with a complete table. It could then use this table to cross-reference MAC addresses whenever a packet needs to be sent. This is an unusual action for a Miniport to perform, because Miniports are usually “dumb” layers, which merely pass frames onto the medium, or pick them off and send them up to the protocols. The functionality described herein brings the Miniport into the realms of addressing, usually taken care of by the network layer, as described by the OSI Model [16].

For each node to notify the other nodes on the 1394 Bus of its MAC address, a protocol had to be developed to carry this information. Ideally a simple broadcast mechanism would be used, where one broadcast packet is received by all nodes, and the MAC/1394 address information is stored within the Miniports. Unfortunately, due to the problems with 1394-1995 broadcasts, another method had to be developed. Three possible options were identified that could be used for communicating with every node:

• Isochronous delivery.
• Reiteratively send one packet to each child and parent.
• Send one asynchronous packet to each node.

Isochronous delivery could be utilised by programming every node to listen on a certain isochronous channel, over which they would send their MAC/1394 address details in small packets. In theory this appears a simple mechanism; however, on consideration, it was decided that implementing isochronous capabilities on the bus would bring with it many new configuration problems. Also, the allocation of bandwidth, which could otherwise be used for normal asynchronous transfer, would mean a reduction in data rates. There are many issues to take care of if isochronous delivery is required, including setting up an isochronous resource manager entity.
Given the project time-scale, it was decided that the project should remain an asynchronous-only bus.

The second option would make quite efficient use of bus bandwidth. A node wishing to broadcast to all other nodes could send one packet to each of its “children” and its parent. In turn, these nodes would repeat the packet on all of their ports, and so on. This way the packet would be efficiently propagated through the network to all nodes. This would be quite difficult to implement, and in doing so would consume a fair amount of project time; however, it is noted as a possible future update.

Using the third option, of sending many asynchronous packets, is not immediately attractive; however, it was decided that it is an effective method, and is easy to implement. There would be a sudden influx of these packets after every bus reset, but the impact of this can be reduced by using the smallest packets possible. The following discussion will show that the effect on bus bandwidth is minimal.

The Miniports will do the following in order to implement this:

1. Detect bus resets.
2. On detection, build a small packet that contains at least the local MAC address and the local 1394 ID.
3. Send this packet to each of the other nodes in turn (the Miniport knows what other nodes are on the bus due to the self ID process).

In order for each Miniport to build up the table it must:

1. Check incoming packets to establish whether they are these special packets, or if they should be passed up to the protocols.
2. If it detects one of these packets, it should not pass it up to the protocols, and should instead read the MAC/1394 addresses within, and store them in local memory.
3. Once it has intercepted a packet from all nodes, it will have a complete table of translations.

After this process, when a normal frame is passed down to be sent on the bus, the Miniport must first:

1. Read the first 6 bytes of the frame to determine the destination MAC address.
2. Look up the table to find out which 1394 ID this refers to.
3. Construct a 1394 packet addressed to the correct destination 1394 Node ID.

On receipt of a 1394 packet, a Miniport need not, of course, perform any address translation. It must only check that the packet is not one of the small configuration packets outlined above, as it is important that these packets are processed within the Miniport and not passed up to the protocols. If a received packet is not a special packet, the Miniport passes it up, unprocessed.

The format of these packets was considered, in order to find a method of minimising their size. It was decided that 1394 asynchronous quadlet writes be used, as these are the smallest asynchronous packets. Also, if all other normal traffic were to be block writes, then these special packets could be easily identified by type, without even reading the information within the packet.

Figure 5-7 - Asynchronous Write Quadlet Payload

It was already decided that the destination offset in 1394 packets be ignored, and could be set to all zeroes. Fortunately the “destination_offset” field is exactly 6 bytes, and thus would be perfect for storing the 6-byte MAC address. The 1394 address is always contained in the source ID field, therefore the packet already contains the necessary information for other Miniports to build their translation tables. The format of these quadlet write packets is shown in Figure 5-7, where both pieces of information required can be seen to be wholly contained in the packet. The “tl” field, Transaction Label, can be used to further identify these packets by a given number, such as all ones. This would only be required if there are other occasions when quadlet writes will be used; otherwise they can be identified by the “tcode”, Transaction Code, field, which will contain the code for quadlet writes. The “quadlet_data” field is not required by this implementation, but is reserved for future usage.
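The table-building and lookup steps described above could be sketched as follows. This is not the project's module code: the type and function names are hypothetical, the table is a plain array searched linearly, and the caller is assumed to have already extracted the MAC address (from the destination offset field) and the source ID from a received quadlet write. It is also assumed that the table is cleared on a bus reset, since node IDs may change at that point.

```c
#include <stdint.h>
#include <string.h>

#define MAC_BYTES 6
#define MAX_PEERS 62   /* 63 nodes per bus, minus the local node */

/* One MAC-to-node-ID translation (6 + 2 bytes of data). */
typedef struct {
    uint8_t  mac[MAC_BYTES];
    uint16_t node_id;
} xlat_entry;

typedef struct {
    xlat_entry entries[MAX_PEERS];
    int        count;
} xlat_table;

/* Forget all translations, e.g. after a bus reset. */
static void xlat_reset(xlat_table *t)
{
    t->count = 0;
}

/* Record the MAC/node ID pair carried by an announcement packet.
 * An existing entry for the same MAC is updated in place.
 * Returns 0 on success, -1 if the table is full. */
static int xlat_learn(xlat_table *t, const uint8_t mac[MAC_BYTES],
                      uint16_t source_id)
{
    int i;

    for (i = 0; i < t->count; i++) {
        if (memcmp(t->entries[i].mac, mac, MAC_BYTES) == 0) {
            t->entries[i].node_id = source_id;
            return 0;
        }
    }
    if (t->count >= MAX_PEERS)
        return -1;
    memcpy(t->entries[t->count].mac, mac, MAC_BYTES);
    t->entries[t->count].node_id = source_id;
    t->count++;
    return 0;
}

/* Translate the destination MAC (first 6 bytes of the Ethernet
 * frame) to a node ID. Returns 1 and fills *node_id_out if the
 * address is known, otherwise returns 0. */
static int xlat_lookup(const xlat_table *t, const uint8_t *frame,
                       uint16_t *node_id_out)
{
    int i;

    for (i = 0; i < t->count; i++) {
        if (memcmp(t->entries[i].mac, frame, MAC_BYTES) == 0) {
            *node_id_out = t->entries[i].node_id;
            return 1;
        }
    }
    return 0;
}
```

A linear search is used because the table never holds more than 62 entries; a send path that finds no match would have to decide whether to drop the frame or wait for the table to fill, which this sketch leaves to the caller.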
It may be that later versions require extra information regarding the capabilities of nodes to be stored in the table. Information on the other fields in quadlet write packets can be found in Clause 6 of the 1394-1995 specification [17].

To investigate the effect that these packets will have on the bus traffic, we can calculate a worst-case scenario. The maximum number of nodes per bus is 63. Assuming there are 63 nodes present, each wishing to send a quadlet write to the other 62 nodes, the number of packets which would be placed on the bus is: 63 * 62 = 3906 packets. Given that each quadlet write packet takes up 5 quadlets (20 bytes), the number of bytes this would consume is: 3906 * 20 = 78,120 bytes. This is not a massive amount of information, and at a rate of 200Mbps (25 megabytes per second) could be transferred in: 78,120 / 25,000,000 = 0.0031248 seconds.

Given that bus resets should occur very rarely, normally only when a new device is connected to the bus, this should not produce any noticeable effect on network performance. This may cause a few problems on a network such as Ethernet, given the collision detection mechanism it employs; however, 1394 uses a fair arbitration system, which would guarantee fair access to the bus. Should this still be regarded as adverse, it is possible to reduce the overhead from packet headers by assigning one “master” to accumulate 62 packets from the other nodes, and transfer the whole table in one block write to each node, or to use the parent/children method.

The storage of the table will have some effect on the footprint of the Miniport within memory. Each node will take up 8 bytes in the table (6 bytes for MAC address and 2 for 1394 ID). Therefore, to contain a table of 62 cross-references would require: 62 * 8 = 496 bytes. Given the memory available on machines today, this is deemed not to make a noticeable impact on Windows operation.
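The worst-case figures can be reproduced with a few lines of arithmetic. The sketch below uses hypothetical function names and the sizes quoted in the text (20 bytes per quadlet write packet, 8 bytes per table entry); it ignores arbitration gaps and acknowledgements, so the time is a lower bound on raw transfer time only.

```c
enum {
    QUADLET_WRITE_BYTES = 20,  /* 5 quadlets per announcement packet */
    TABLE_ENTRY_BYTES   = 8    /* 6-byte MAC + 2-byte node ID */
};

/* Every node sends one packet to every other node. */
static unsigned long announce_packets(unsigned nodes)
{
    if (nodes == 0)
        return 0;
    return (unsigned long)nodes * (nodes - 1);
}

static unsigned long announce_bytes(unsigned nodes)
{
    return announce_packets(nodes) * QUADLET_WRITE_BYTES;
}

/* Raw transfer time in seconds at the given throughput. */
static double announce_seconds(unsigned nodes, double bytes_per_second)
{
    return (double)announce_bytes(nodes) / bytes_per_second;
}

/* Memory needed to hold one entry for every other node. */
static unsigned long table_bytes(unsigned nodes)
{
    if (nodes == 0)
        return 0;
    return (unsigned long)(nodes - 1) * TABLE_ENTRY_BYTES;
}
```

For a full bus of 63 nodes this gives 3906 packets and 78,120 bytes, about 3.1 ms at 25,000,000 bytes per second, and a 496-byte table.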
Other than these methods, the only other viable option is to upgrade all devices on the bus to be p1394a compliant, which can deal with asynchronous streams. Unfortunately this is not only costly, but rather unforgiving in that the network will then not provide support for any 1394-1995 cards whatsoever. Later implementations may provide support for isochronous traffic or the parent/children method, but to keep the project fairly simple in its initial stages, the asynchronous copy-to-all method will be used.

5.2.10 Broadcast and ARP

The previous section detailed how to implement unicast sends based on Ethernet MAC addresses. Such transactions are generally the most common, and certainly the most important send types on an Ethernet-type network. There are cases when broadcast sends are required, however, such as with ARP, and the Miniport must be able to deal with this in some way. This section outlines how broadcast will be implemented.

As outlined in Section 5.2.8, broadcast Ethernet sends are identified by a MAC address where all 48 bits are binary 1’s. Thus, when the Miniport queries the frame’s destination address, as it always should do for a unicast send (see previous section), it must also recognise a broadcast address, and take some extra action. Rather than using the lookup table to find the 1394 address, it must instead implement some form of broadcast. The possibilities have already been discussed in some detail, and it is assumed that the method will be as below:

1. Miniport detects a broadcast destination address.
2. Miniport repeats the packet to all nodes on the 1394 Bus. Rather than analyse self ID information, it will be quicker to use the table to determine which nodes are on the bus.

Using this method, the Miniport does not have to hard-code any support for particular types of broadcast transmission, such as ARP requests. An alternative method, which involved hard-coding ARP support, was also considered and subsequently discarded:

1. Intercept ARP requests.
2. Use the table to respond automatically within the Miniport. This would involve constructing a fake ARP response packet and passing it straight back up to IP, without using the 1394 Bus at all.

Although this method does not involve the 1394 bus, and is therefore a very quick mechanism, a number of problems were identified:

• Support for only one protocol (IP) is hard-coded.
• The table must also contain the IP addresses in order to be able to respond to ARP requests. This would require the Miniport to gain knowledge of its IP address, which normally it does not know.
• The table would increase in size by 4 bytes for each node in the table.

Due to these problems, it was decided that the original implementation be used, that is, repeating broadcast packets to each node.

This concludes the main project decisions made on how best to integrate 1394 with the NDIS system, and is the basis from which the development was specified.

5.3 Hardware Control

The PCILynx contains a register space, which is used to program the card. This space is subdivided into quadlets (32 bits), each of which pertains to different information and operations. Some of these quadlets are read-only, and some also provide write access. As with all PCI devices, the PCI configuration space can also be read to obtain information regarding the addresses of these registers and other PCI-related information, such as the IRQ in use. This configuration area is depicted in Figure 5-8.

Figure 5-8 - PCI Configuration Control and Status Registers

The Base Address quadlets are used to determine the physical addresses of registers on the card. Base Address 0 is the most important for programming the PCILynx chip, whereas the others pertain to RAM and ports that are local to the card, and are not used in this project. Communication with these registers in NDIS is achieved by mapping them onto a local (Windows) virtual address space.
In effect the physical card register space is then also accessible using Windows-accessible virtual addressing. Thus the address stored at Base Address 0 is also given a virtual address through an NDIS library call, NdisMMapIoSpace, and each register is then programmed by writing to or reading from an offset added to this virtual address.

Transmission on the 1394 bus is achieved by constructing correctly formatted 1394 packets in PCI-accessible memory. Each packet is supplemented with a Packet Control List (PCL), which is a proprietary mechanism for programming the DMA busmaster. These PCLs inform the busmaster of which operation must be performed, such as an asynchronous transmit, and point to the memory containing the 1394 packet. For a receive operation, this pointer will be to an empty buffer in PCI memory. Therefore when a packet is to be sent, it must be correctly constructed in contiguous PCI memory, and the address of this packet is then passed to the card, using the PCL structure.

As Windows implements virtual memory, the physical addresses of these packets must be obtained, as the PCILynx does not support virtual addresses. In NDIS this means that a page-locked, shared memory range must be allocated, using the NDIS library call NdisMAllocateSharedMemory, which returns both a physical and virtual address for the same memory range. Windows and NDIS operations are then performed using the virtual addresses, whereas the card itself is passed the respective physical addresses, and this memory is never paged out.

Figure 5-9 - PCILynx Data Transfer

Figure 5-9 illustrates the process of transferring data between the bus and PCI memory. The busmaster copies 1394 packets between PCI memory and on-board FIFO memory. On transmits, the busmaster copies packets into one of the transmit FIFOs (isochronous or asynchronous), after which the LLC calculates and appends the CRCs and copies the packets onto the bus.
On receives, the packet CRCs are checked and, if correct, the packets are copied into the General Receive FIFO (GRF). If a receive PCL has been set up, then the busmaster copies the data from the GRF into PCI memory.

Busmaster DMA is considered the fastest mechanism for PCI devices to use, because it works on its own, while freeing up the computer’s processor to perform other tasks. It is well described by Goldman (1998) [18]:

“Only the bus-mastering DMA data transfer technique leaves the system CPU alone to process other applications. In bus-mastering DMA, the CPU on the network adapter card manages the movement of data directly into the PC’s RAM without interruption of the system CPU by taking control of the PC’s expansion bus. PCI bus-based (devices) also exhibit low utilization of the main CPU thanks to the intelligence of the PCI bus itself.”

A full description of design decisions, such as memory range requirements, is not included here, as all hardware control is described in the Module Specifications of Section 6.3. This section serves only as an introduction to a very complex hardware system, and if a complete understanding is required, the PCILynx specification [19] should be consulted.

5.4 Development Environment

The actual tools and methods used when developing Miniports were found to be some of the most confusing aspects of NDIS development. The intricacies of the functions within the NDIS library are well documented: the format of parameters passed to them, the return values and so on. There are, however, only sparse details of what is needed to get started. For this reason, a great deal of time was spent learning the development method before actual coding began. The fruits of this labour are presented here in an accessible format, and this document can in future be used in conjunction with the usual information in the development of future NDIS projects, and even as a general guide to Windows device driver projects.
Three products were required before NDIS Miniport development could start:

• Microsoft Visual C++ v5.0
• Microsoft Windows NT 4.0 Device Driver Kit (DDK)
• Microsoft Windows Software Development Kit (SDK) September 1998

These are available in various release versions, and it may be possible to use older or newer versions than were used in the development of this project. The language used for developing NDIS Miniports is ‘C’, and it may even be possible to use a ‘C’ compiler other than Visual C++, for example the Borland/Inprise software, but there may be slight implementation differences. Also the linker must be capable of producing ring-0 level applications (see Section 4.3 for details). Therefore it is advisable to use Microsoft products, as these are presumably the intended tools. All three products are provided with subscriptions to the Microsoft Developer Network (MSDN), although Visual C++ is also available as a retail product.

The Windows NT DDK should always be used for NDIS development, since the NDIS details have been removed from both the Windows95 and Windows98 DDKs. As this section details, it is possible to use the NT DDK for developing for these two environments. A new version of the NT DDK is released with every new version of NT. To find which version of the DDK you need, look up the version of NDIS in Table 4-1, in Section 4.2.2. Given that this project conforms to NDIS 4.0, the NT 4.0 DDK was used. Finally, the SDK version is not so important; it provides some tools and libraries that may prove useful, as well as further information on the Windows environment (although it does not contain any device driver level material).

Sample NDIS Miniport source code is provided as part of the DDK, and these drivers are ready for immediate compilation for use in NT. However, as touched upon in Section 4.3, there are no references whatsoever to compilation in Windows95.
This situation would be acceptable if the Windows95 DDK provided these details, but the only NDIS information is a suggestion to consult the NT DDK for the development of Miniports. Given that the primary target system was specified as Windows95, a method for development in Windows95 was an important objective.

The NT compilation process is taken care of by a utility known as BUILD, which works in conjunction with the development environment (generally assumed to be Visual C++). It takes an input configuration file, called a SOURCES file, which details the filenames of the source code, and the target type, in this case an NDIS Miniport. BUILD is given the rest of the configuration information by the DDK. Thus it should be easy to compile and test one of the example drivers.

Windows95, however, does not support BUILD, and an alternative method had to be used to compile Miniport source from this environment. In both cases Visual C++ is used to build the target driver file, but in Windows95 a ‘Makefile’ is normally used, which is very similar to a Unix Makefile. This is rather like a batch file, which is used to program the build process, and provide the correct parameters to the compiler and linker. The Visual C++ integrated development environment (IDE) can be used, but it is biased towards application development, and it is more difficult to exert absolute control over the target file. Although compiling and linking may seem a simple concept, the developer must provide a large number of parameters to these tools, in order to ensure that the target driver file is of the correct type, is formatted correctly and so on.

The Makefile used in developing this project is included in APPENDIX C. It can be used on a Windows95 or NT system to build Miniports for both environments. This is done by providing a command-line parameter. It is also possible to use either parameter to create a Windows98 Miniport.
There are many parameters to be passed to the compiler and linker, and these are all configured within the Makefile. For details of the format of such files, the Visual C++ online help documentation should be consulted. The actual compilation is done from the MS-DOS command prompt, and the DOS environment must be set with a number of variables, providing details of the location of the Visual C++, DDK and SDK binary files and libraries. The utility NMAKE is part of Visual C++, and uses the Makefile to build the target. The build process is depicted in Figure 5-10.

Figure 5-10 - NDIS Miniport Build Process

The fact that the sample Miniports could not initially be compiled for use in Windows95 proved to be a major obstacle to the project work, but once the Makefile technique was completed, the development could properly ensue. The DDK sample Miniports could be compiled, producing an output .SYS driver file. Because these samples were for particular network cards (both Ethernet and Token Ring), they could not be tested, since the cards were not available. However, the ability to build the target file was deemed a breakthrough, and development of a new Miniport for the IEEE-1394 cards was able to start.

5.5 Debugging

Device drivers, and thus NDIS Miniports, require a very particular debug environment because they run at processor privilege level 0, or ring 0, within Windows. Common debuggers, such as that which is integrated within Microsoft Visual C++, whilst useful for debugging user applications, cannot be used for debugging such device drivers. Instead, a kernel-mode debugger is required, such as WDEB386, which is included with the Windows95 DDK [20], or the commercially available NuMega SoftICE. This section describes the usage of these two packages, and provides an appraisal that can be used where a choice has to be made. Proprietary techniques, designed by the developer for use with the project, are also described.
5.5.1 Microsoft WDEB386

The Microsoft Windows System Debugger (WDEB386) can be used for debugging any software running within the Windows operating system. That is, it can be used for device drivers, dynamic link libraries (DLLs) and also applications. It is a very low-level debugger in that its command set is used to directly access hardware such as memory, I/O ports and processor registers. It closely resembles the debug program which was historically included with the MS-DOS operating system, and similarly, tracing through programs is done at assembly language level rather than source code level. This complicates things somewhat, as it is very difficult to relate the assembly language to the source code.

To debug a device driver, two computers are required; one acts as the test machine, which is running WDEB386, Windows and the driver under development, whilst the other is used for viewing the debugger output, and sending debug commands. The connection between the two machines is via a null-modem cable, and both computers will require a spare serial port. The machine which displays the debug output simply requires a terminal package to be loaded, not necessarily within Windows; however, Windows contains the hypertrm package, which is perfectly good for this use. The null-modem cable does not require handshaking lines, and must simply be connected as in Figure 5-11. The cable’s pin assignments are described in Table 5-2.

Figure 5-11 - Null-Modem Cable

Name        Pin
Receive     2
Transmit    3
Ground      5

Table 5-2 - Cable Connectors

The installation of WDEB386 also installs a number of VxDs, which replace some of the standard Windows files with a debug version of the same file. These provide support for debug functions which can be coded in driver source code, and also initiate extended output of information at the terminal. Once this environment is installed, debugging can proceed by executing WDEB386 instead of loading Windows as normal.
The booting of Windows should be interrupted, and WDEB386 should be started in its place, which in turn will load Windows for you. In this way, the debugger surrounds the operating system, rather than being loaded from within Windows. This is especially important considering device drivers are normally loaded before Windows has finished the full boot process.

Since drivers can generally be coded in a high-level language, such as ‘C’, the information provided by WDEB386 may not mean a great deal to most developers. The developer usually will not know what the contents of certain processor registers should be, and so on. To this end, making the best use of this environment requires placing the following commands within the code:

• Breakpoints
• Debugger output strings

Breakpoints provide a point at which the debugger will regain control of the operating system, and commands can be entered from the terminal. This is useful in getting information on the flow of a driver, as strategically placed breakpoints can show you exactly where your code is stumbling in the case of software that isn’t working. Typical application debuggers allow you to step through your source code line by line until a problem area is highlighted; however, this is not possible without using one of these two commands.

Debug output strings are usually far more useful than breakpoints, as contents of variables, program flow and so on can be output to the debug terminal. In this way the terminal can be viewed for vital information while the driver is processing. An example of terminal output using these strings might be:

MYDRIVER: Memory allocated successfully [2048 Bytes]
MYDRIVER: Reading serial number from device registers [12-34-56-78]

The two debug commands are provided by the debug-version NDIS wrapper driver, and can be called using DbgBreakPoint() and DbgPrint() respectively.
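To illustrate the style of tagged output shown above, the sketch below formats a debug line into a buffer using only the standard C library. It is a user-mode stand-in, not driver code: the function name and the fixed "MYDRIVER: " tag are hypothetical, and in a real Miniport the resulting text would be handed to DbgPrint() or an equivalent service rather than built this way.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define DRIVER_TAG "MYDRIVER: "

/* Write DRIVER_TAG followed by the printf-style message into out.
 * Returns the number of characters written (excluding the
 * terminating NUL), or -1 if the line does not fit. */
static int format_debug_line(char *out, size_t out_size,
                             const char *fmt, ...)
{
    size_t  tag_len = strlen(DRIVER_TAG);
    va_list args;
    int     body_len;

    if (tag_len >= out_size)
        return -1;
    memcpy(out, DRIVER_TAG, tag_len);

    va_start(args, fmt);
    body_len = vsnprintf(out + tag_len, out_size - tag_len, fmt, args);
    va_end(args);

    if (body_len < 0 || (size_t)body_len >= out_size - tag_len)
        return -1;
    return (int)tag_len + body_len;
}
```

Called with the format "Memory allocated successfully [%u Bytes]" and the value 2048, this produces the first example line above. Keeping a common tag on every line makes it easy to pick one driver's messages out of the terminal output.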
When developing NDIS 4.0 Miniports for Windows95, however, no debug version of the NDIS 4.0 driver is available, so alternative methods must be used; simple workarounds were written by the developer to aid in the debugging of this project. Breakpoints can be implemented using assembly calls to int 3 or int 1, and VxD services can be used to implement the debug output strings (see myDbgPrintf in Section 6.3.1). Neither of these alternatives is allowable on the NT operating system, and thus the source code should provide for different compilations depending on the target operating system. This can be done using compiler directives such as:

    #ifdef WIN95
    /* W95 Specific Code */
    #else
    /* WinNT Specific Code */
    #endif

5.5.2 SoftICE

This package is a very powerful alternative to WDEB386, and most notably, provides source-level debugging. It loads in much the same way as WDEB386, as it must be loaded before Windows and sits in the background. It can be loaded and used on one machine, where a hot-key is used to switch between Windows and the debugger window. Once in the debugger, the driver source code is displayed, and can be stepped through as with most debuggers; where the source code files are unavailable, it steps through at assembly level. The aforementioned breakpoint and output string functions are available, and it is also possible to view the values stored in variables. Since the source code and symbol files are loaded, it is easy to jump to a certain function using g functionName, where functionName is the name of your ‘C’ (or other) function. To do this in WDEB386 would require that you know the actual address of the function.

5.5.3 Which Debugger to Use

Both debuggers are useful for different forms of testing. So-called ‘white-box’ testing, where the code within functions is known and tested, is best implemented using SoftICE, as the code is available for display and can be stepped through.
‘Black-box’ testing, however, may be easier using WDEB386, where the code is not known, but it is simple to watch which functions are performing their job, without complicated step-throughs. For more information regarding black and white box testing, see Sommerville (1992) [21].

Where the operating system used for debugging is Windows NT, there is a further tool which replaces WDEB386, called WinDbg. It requires a serial link between two machines, both of which must be running Windows NT. However, since the primary target system for the miniport driver is Windows95, further discussion is outwith this document, and can be found in the Windows NT DDK [22]. Note that for full driver compatibility between operating systems, these tests must also be performed.

5.6 Project Planning

A time plan was prepared early on in the project [23], which outlined the anticipated stages of the project, and the dates by which they should be achieved. This eventually had to be revised, when the complexity of the project became more apparent. The problem outlined in Section 5.3, regarding the undocumented compilation of an NDIS driver for Windows95, required the postponement of the starting date of the coding; however, research was ongoing during this time, and it was used to design the low-level hardware control method. A new set of development stages was identified, as shown below:

• Development & successful compilation of a skeleton NDIS Miniport
• Achievement of basic communication with and control of the TI hardware
• Implementation of Plug and Play
• Integration into Windows Control Panel & Network applet
• Achievement of Self ID configuration
• Transmission between 1394 nodes
• Communication with protocol layer
• MAC/1394 Translation table mechanism
• Broadcast mechanism

During development and testing of the source code a version-numbering system was used. Starting with v0.001, the version was incremented by 0.001 after every major change.
In addition, a ‘changes’ file was maintained to document the changes, as well as any bugs that were found during testing. The entire source code for each version was archived, which proved very useful when changes introduced in earlier versions began causing adverse effects. Changes, bugs and bug-fixes were also documented in the project log book, accompanied by the relevant version number. Logging details in this manner increased productivity and the speed of source code development.

References

[1] Mackenzie L. (1998), Communications and Networks, McGraw-Hill, p112
[2] ISO/IEC 13213:1994, ANSI/IEEE Std 1212, 1994 Edition, Control and Status Register (CSR) Architecture for Microcomputer Buses, ISO/IEC
[3] IEEE Std 1394-1995, Standard for a High Performance Serial Bus, IEEE, Clause 6.2.4.2.1
[4] IEEE Project p1394a, Draft Specification for a High Performance Serial Bus (Supplement), IEEE
[5] Tanenbaum A. (1996), Computer Networks, 3rd Ed, Prentice-Hall London, pp420-3
[6] IEEE Std 1394-1995, Standard for a High Performance Serial Bus, IEEE, Clause 3.8
[7] IEEE Std 1394-1995, Standard for a High Performance Serial Bus, IEEE, Clause 8
[8] Microsoft (1996), NT4 DDK: Network Drivers Design Guide 1.2.1, Microsoft Corporation
[9] Microsoft (1996), NT4 DDK: Network Drivers Design Guide Part 2: 1.5, Microsoft Corporation
[10] Buchanan W. (1997), Advanced Data Communications and Networks, Chapman & Hall, pp447-8
[11] IEEE Std 1394-1995, Standard for a High Performance Serial Bus, IEEE, Clause 6.2
[12] Johansson, P. (1999), Internet Draft: IPv4 over IEEE 1394, IETF
[13] Intel, Fundamentals of Ethernet Technology, Intel Certification Course FN2
[14] Texas Instruments, 1394 Solutions CD-ROM, TI, filepath ‘hwsample\ic.c’
[15] Texas Instruments (1998), Lynxsoft 1394 Software Application Programmer User’s Guide - SLLU003 v2.2, TI, Chapter 6
[16] ISO (1984), Basic Reference Model for Open Systems Interconnection, ISO:7498
[17] IEEE Std 1394-1995, Standard for a High Performance Serial Bus, IEEE, Clause 6
[18] Goldman J. (1998), Applied Data Communications: A Business-Oriented Approach, John Wiley & Sons, pp203-5
[19] Texas Instruments (1997), PCILynx Functional Specification, Rev 1.2, TI
[20] Microsoft (1996), Windows 95 Device Driver Kit, Microsoft Corporation
[21] Sommerville I. (1992), Software Engineering, Addison-Wesley
[22] Microsoft (1996), Windows NT Version 4.0 Device Driver Kit, Microsoft Corporation
[23] 1394 Project Logbook 1, p1

6. FORMAL DESIGN SPECIFICATION

This chapter describes the project source code in full. Due to the complexity of the driver operations, all design requirements have been placed in the module specifications, rather than being supplemented with separate notes on the design process. It was decided that notes on how memory allocations were decided, and other low-level design decisions, would be best placed inside these specifications. Fundamental design issues have, however, already been introduced in Chapter 5.

A detailed specification of device driver characteristics, such as data types and function calling conventions, has not been included, as there are so many considerations and these are common across all driver projects. This chapter has to be supplemented with a study of the NT 4.0 DDK [1] for a full understanding.

6.1 Development Approach

Development of a new device driver does not proceed in the same way as normal application development. A large amount of code has to be written before a driver will even load. Thus initial development can be thought of as a rather blind process.
In this instance the developer had to thoroughly research the NDIS specification, and study the code samples, before starting on the new source code. After this, before any testing could occur, a “complete” driver had to be developed: that is, one in which all of the required exported functions were available, and a number of other issues had been taken care of. It is not possible to write one function and then go ahead and test it. First, the source code, and driver, must contain every required NDIS export function. These are listed in the module specification for this particular driver; however, the required functions depend on the implementation. Consult the NT DDK [1] for further details on such functions.

Although each function needs to be present, it does not necessarily have to contain any code. As long as a basic driver structure is there, NDIS will load the driver. Thus, source was developed which contained all of the required functions, although initially most of these contained no real code. Instead, they merely displayed some output to the debugger, to notify the developer that they had been called.

The first function to be called on driver loading, and thus the function which was attended to first, is the DriverEntry function. Secondly, the MiniportInitialize function is called to allocate resources such as memory and interrupts, and to set up the 1394 hardware. Once these two functions contained code, and some information was written for the ‘C’ include files, the driver was ready for initial testing.

Testing, however, cannot start until the driver is installed in Windows, a process which is described in APPENDIX B. Once the driver is installed in Windows, it can be tested by resetting the machine, and loading Windows through one of the kernel debuggers described in Section 5.5. This is a laborious process, as Windows must be reloaded after every change to the source.
Coupled with this, working at ring-0 gives privileged access to memory and resources, so page faults and Windows protection errors are quite commonplace, as well as difficult to debug.

6.2 File Map

The source code is spread across a number of files. This is for tidiness, to prevent unwieldy files, and to keep similar functionality within one source/object file. Table 6-1 describes the location of each function.

Filename      Functions
tilynx.c      DriverEntry, tilynxInitialize, tilynxSetupAdapter, tilynxFreeResources, tilynxHalt, tilynxReset, tilynxCheckForHang, myDbgPrintf
card.c        cardInitialize, setupReceivePcls, linkPcls
send.c        tilynxSend, setupTransmitPcls
oid.c         tilynxQueryInformation, tilynxSetInformation
interrup.c    tilynxEnableInterrupt, tilynxDisableInterrupt, tilynxHandleInterrupt, tilynxIsr, tilynxTransferData

Table 6-1 - Project File-Map

6.3 Module Specification

This section contains the specification for each module contained in the project source code. These have been designed and implemented by the developer, and the test results can be found in Section 7.1. Each module has been implemented as a ‘C’ function. The section is ordered as in the file-map shown in Table 6-1.

All references to ‘tilynx’ or ‘Miniport’ in the specifications refer to the device driver, while the terms ‘adapter’, ‘device’ or ‘NIC’ refer to the Texas Instruments 1394 PCILynx hardware. NDIS export functions that are named ‘MiniportXxxx’ in the NT DDK are named ‘tilynxXxxx’ here.

Due to the amount of source code (170 Kbytes), it has not been included in the Appendices, and has instead been placed on the companion CD-ROM, as has an archive of previous versions. The hardware specifications are also on this CD-ROM, and it will be necessary to reference these to fully understand the source code. Full details of the CD-ROM are in APPENDIX D. All source files and functions have headers explaining, in simple terms, the functionality they perform.
Also the source is commented throughout to help the reader gain an understanding. The format of the module specifications is shown in Table 6-2.

Name: Title of function.
Source: Source file location.
Parameters: Passed parameters. Preceded by ‘IN’ or ‘OUT’, which are not part of the parameter names. ‘IN’ refers to variables that are passed but not changed. ‘OUT’ refers to parameters that are passed as pointers, and can be changed (for returning data to the calling function). Variable types are not included, in order to reduce the complexity herein. They can easily be found in the source code, or in the NT DDK for NDIS exports.
Return Value: Value passed back by the ‘return’ statement. None refers to a void return declaration.
Call Type: NDIS Export: Required by and called by NDIS. These are never called by another function within the Miniport; they exist entirely autonomously, and communicate only with the NDIS wrapper. Internal Function: Specified by developer, called internally to driver.
Description: Detailed description of the functionality that must be performed by this function. It should be possible to read this and go on to write the function. This can also be used to supplement the source code, to aid understanding.

Table 6-2 - Module Specification Format

6.3.1 tilynx.c

Name: DriverEntry
Source: tilynx.c
Parameters:
IN: DriverObject - Created by the system.
IN: RegistryPath - Path to parameters for this driver.
Return Value: Returns the status of the operation (successful/unsuccessful).
Call Type: NDIS Export
Description: This is the primary initialisation routine for the TILynx driver. It is simply responsible for initialising the wrapper and registering the Miniport driver. All device drivers require this function, in order that they can be loaded in Windows. It is run only once, during initialisation.

The wrapper is initialised by passing the two input variables (DriverObject & RegistryPath) in a call to NdisMInitializeWrapper.
This returns a handle, which is used to refer to this driver when communicating with the wrapper (NDIS). Once the wrapper has been notified in this manner, it will look out for the miniport driver to register itself using NdisMRegisterMiniport.

Before calling NdisMRegisterMiniport, the DriverEntry function must fill in an NDIS_MINIPORT_CHARACTERISTICS structure. This structure is used to notify the NDIS library/wrapper of the miniport’s characteristics, such as the version of NDIS which it supports (v4.0), and the names of its NDIS-exported functions. In this way, the miniport’s functions may have any name; for example, its Send function can be named tilynxSend, mySend or any unused name, as long as it is registered with NDIS. The names of all TILynx functions are entirely described by this design specification.

This characteristics structure can be declared on the stack, because NDIS copies the information in the call to NdisMRegisterMiniport, and stores it in its own storage area.

Where either of these two calls fails, DriverEntry returns STATUS_UNSUCCESSFUL; otherwise STATUS_SUCCESS is returned.

Name: tilynxInitialize
Source: tilynx.c
Parameters:
OUT: OpenErrorStatus - Extra status bytes for token ring adapters.
OUT: SelectedMediumIndex - Index of the media type chosen.
IN: MediumArray - Array of media types for the driver to choose from.
IN: MediumArraySize - Number of entries in the array.
IN: MiniportAdapterHandle - Handle for passing to the wrapper when referring to this adapter.
IN: ConfigurationHandle - A handle to pass to NdisOpenConfiguration.
Return Value: Status of the operation.
Call Type: NDIS Export
Description: This is the MiniportInitialize function of the driver. It is a required function that sets up the NIC hardware (Texas Instruments PCI card) for operation, claims hardware resources in the registry, and allocates resources such as memory and an interrupt line.

First select the network medium which this driver supports.
An array of possible choices is passed in the MediumArray parameter. This array should be searched to find the index for Ethernet (NdisMedium802_3), and SelectedMediumIndex is set to the index of that entry in MediumArray. If Ethernet is not passed as a valid choice (i.e. not contained within MediumArray), then tilynxInitialize should return NDIS_STATUS_UNSUPPORTED_MEDIA.

Memory for the adapter descriptor block should be allocated in non-shared, virtual memory. This descriptor block is defined in tilynxsw.h as a structure of type TILYNX_ADAPTER, and stores information required by the driver. Allocation of memory is done using the NdisAllocateMemory library call. The amount of memory to allocate is equal to the size of a TILYNX_ADAPTER structure. This memory should be zeroed using NdisZeroMemory, since allocated memory initially contains indeterminate bit-values.

NdisOpenConfiguration is used to obtain a handle to the NIC’s parameters in the Windows registry. This handle can then be used by NdisReadConfiguration and NdisReadNetworkAddress to read values from the registry.

The Ethernet MAC address should be read from the registry using NdisReadNetworkAddress. This searches the stored parameters for the keyword “NetworkAddress”. The actual value is stored as a string. For example, the MAC address 1A-1A-1A-1A-1A-1A would be stored as the string value “1A1A1A1A1A1A”, not as a numerical value. Once the address has been read, it is stored in the adapter block for later use. The registry reference where this information is stored is HKLM\System\CurrentControlSet\Services\Class\Net\xxxx\NetworkAddress, where xxxx is a 4-digit number used to reference a particular NIC.

Next, NdisMSetAttributes must be called to inform the NDIS library/wrapper about significant features of the NIC. Specifically, NDIS must know that the NIC is a Busmaster DMA device, and that it is a PCI device.
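The medium search and the relationship between the registry string and the six address bytes, both described above, can be sketched in portable C as follows. The MEDIUM enumeration and both function names are stand-ins invented for this sketch, not the DDK definitions, and in a real miniport the string-to-byte conversion would be left to NdisReadNetworkAddress rather than written by hand.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the DDK medium enumeration (values are illustrative only). */
typedef enum { MEDIUM_802_3, MEDIUM_802_5, MEDIUM_FDDI } MEDIUM;

#define MAC_LENGTH 6

/* Search the array offered by NDIS for Ethernet. Returns 1 and stores the
 * index on success; returns 0 if Ethernet is not offered, in which case the
 * driver would report NDIS_STATUS_UNSUPPORTED_MEDIA. */
int selectEthernetMedium(const MEDIUM *mediumArray, unsigned arraySize,
                         unsigned *selectedIndex)
{
    unsigned i;

    assert(mediumArray != NULL && selectedIndex != NULL);
    for (i = 0; i < arraySize; i++) {
        if (mediumArray[i] == MEDIUM_802_3) {
            *selectedIndex = i;
            return 1;
        }
    }
    return 0;
}

/* Value of one hexadecimal digit, or -1 if the character is not a hex digit. */
static int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = (char)toupper((unsigned char)c);
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Convert a 12-character string such as "1A1A1A1A1A1A" into six bytes.
 * Returns 1 on success, 0 if the string is malformed. */
int parseMacString(const char *text, unsigned char mac[MAC_LENGTH])
{
    int i;

    if (strlen(text) != 2 * MAC_LENGTH) return 0;
    for (i = 0; i < MAC_LENGTH; i++) {
        int hi = hexValue(text[2 * i]);
        int lo = hexValue(text[2 * i + 1]);
        if (hi < 0 || lo < 0) return 0;
        mac[i] = (unsigned char)((hi << 4) | lo);
    }
    return 1;
}
```

Returning a flag rather than the index itself keeps the "not offered" case explicit, which mirrors the early return with NDIS_STATUS_UNSUPPORTED_MEDIA.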
NdisMSetAttributes must be called before a number of other library functions, otherwise they will fail, including:

• NdisMPciAssignResources
• NdisMAllocateMapRegisters
• NdisMAllocateSharedMemory
• NdisMMapIoSpace
• Ndis…RegisterXxx (e.g. NdisReadRegisterUlong)
• NdisMRegisterInterrupt

Following NdisMSetAttributes, if the miniport is running on Windows NT4.0, the PCI slot in which the NIC is placed must be determined. If the system is Windows 95/98, then the operating system’s Plug and Play capabilities take care of this. Therefore, where the miniport is used in NT, the slot number must be stored in the registry (on installation), and this can be read using NdisReadConfiguration.

The slot number information is then passed as a parameter in a call to NdisMPciAssignResources. This library function returns the adapter’s bus-relative resources in a structure of type NDIS_RESOURCE_LIST, which is a typedef of the CM_PARTIAL_RESOURCE_LIST structure type. Information on this structure type can be found in the DDK include file, NTDDK.H. The actual information which is needed for allocating resources is the interrupt number (IRQ) and the base addresses of the device memory. These are identified in the structure by type values of:

• CmResourceTypeInterrupt
• CmResourceTypeMemory

Three memory base addresses should be returned, with their lengths. These refer to the three register areas on the device, as described in the PCILynx specification [2]. The memory and interrupt information should then be stored in the adapter descriptor block (TILYNX_ADAPTER structure), to be used later in allocating local resources.

Note that where the environment is Windows 95/98, the slot number parameter is ignored in the call to NdisMPciAssignResources. If the driver is run in these environments, the parameter is set to -1, a nonsense value. Instead, each PCI slot is probed until the NIC is found.
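The scan of the returned resource list for the interrupt and the three memory ranges might look like the sketch below. The RESOURCE and ADAPTER_RESOURCES structures are simplified, hypothetical stand-ins for the DDK's partial resource descriptors and for the relevant fields of TILYNX_ADAPTER; the field names and layout are assumptions made so that the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a DDK resource descriptor (hypothetical layout). */
typedef enum { RES_INTERRUPT, RES_MEMORY, RES_PORT } RES_TYPE;

typedef struct {
    RES_TYPE      type;
    unsigned long start;   /* IRQ number, or base physical address */
    unsigned long length;  /* byte length for memory ranges, unused for IRQs */
} RESOURCE;

#define TILYNX_MEM_RANGES 3   /* three register areas on the PCILynx */

/* Hypothetical subset of the adapter descriptor block. */
typedef struct {
    unsigned long irq;
    int           irqFound;
    unsigned long memBase[TILYNX_MEM_RANGES];
    unsigned long memLength[TILYNX_MEM_RANGES];
    unsigned      memCount;
} ADAPTER_RESOURCES;

/* Walk the list, recording the interrupt and up to three memory ranges in
 * the order they are returned. Returns 1 only if the interrupt and all
 * three memory ranges were found. */
int collectResources(const RESOURCE *list, unsigned count, ADAPTER_RESOURCES *out)
{
    unsigned i;

    assert(list != NULL && out != NULL);
    out->irq = 0;
    out->irqFound = 0;
    out->memCount = 0;

    for (i = 0; i < count; i++) {
        switch (list[i].type) {
        case RES_INTERRUPT:
            out->irq = list[i].start;
            out->irqFound = 1;
            break;
        case RES_MEMORY:
            if (out->memCount < TILYNX_MEM_RANGES) {
                out->memBase[out->memCount] = list[i].start;
                out->memLength[out->memCount] = list[i].length;
                out->memCount++;
            }
            break;
        default:
            break;  /* other resource types are not needed by this driver */
        }
    }
    return out->irqFound && out->memCount == TILYNX_MEM_RANGES;
}
```

Keeping the memory ranges in the order they are returned matters here, because the first range (index 0) is the one that holds the internal registers.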
The information used to identify the correct card and slot is taken from the registry value “AdapterCFID”. This is a vendor-assigned device identification, concatenated with a PCISIG-assigned vendor ID. This should be stored in the registry during installation, as the string value “8000104C” for the project PCILynx devices. This probing feature is undocumented in the NT4.0 DDK, which does not list the slot number parameter as optional.

All of the required registry information has now been read, and the registry should be closed using NdisCloseConfiguration. This library function has only one parameter, the handle returned by NdisOpenConfiguration.

Lastly, the internal function tilynxSetupAdapter must be called to complete the initialisation process. This function allocates the necessary memory, registers the interrupt and performs hardware writes to set up the NIC for operation. The initialise process has thus been split up into tilynxInitialize and tilynxSetupAdapter in order that tilynxInitialize is not unwieldy. Refer to the tilynxSetupAdapter module specification for a full description of this function.

If any errors were reported during tilynxInitialize, tilynxFreeResources is called to free up any allocations (such as the memory for the adapter descriptor block), and tilynxInitialize returns NDIS_STATUS_FAILURE. This is also the case if tilynxSetupAdapter does not return NDIS_STATUS_SUCCESS. This provides a mechanism for cleanly unloading the driver when it is having problems loading.

Name: tilynxSetupAdapter
Source: tilynx.c
Parameters:
IN: Adapter - Pointer to the TILYNX_ADAPTER descriptor block.
Return Value: Status of the operation.
Call Type: Internal Function
Description: Called during driver loading/initialisation to allocate shared memory for transfers, register the interrupt, and start the card. Also maps the NIC’s register memory space into local virtual addresses.
Specifically:

• Maps the NIC's registers to a mem-space so that it can be programmed & read.
• Allocates map registers so that send-buffers can be mapped to shared memory, which the card is able to access.
• Allocates shared memory for received data from card.
• Allocates shared memory for the transmit staging buffer.
• Allocates memory for the transmit & receive PCLs.
• Allocates memory for the dummy transmit & receive PCLs.
• Registers the interrupt with the wrapper (NDIS library).
• Initializes the card by writing to its registers.
• Sets up the receive PCLs and buffer space, and links these to a DMA channel so that receives may begin.

NdisMMapIoSpace is used to map the NIC’s registers to a virtual memory range. Only Base Address 0 of the NIC needs to be mapped, because this refers to the internal registers. This is the first address range that will be returned by NdisMPciAssignResources, and is thus stored as index 0 in the mem range array in the adapter block (Adapter->MemBaseAddress[0]). The other two ranges returned refer to registers for programming local bus resources (local RAM, AUX and ZOOM Video), and these are currently not used by the miniport, so do not need to be mapped. NdisMMapIoSpace returns a pointer to a base virtual address, which can be used when this memory space needs to be accessed. To access an offset within this mapped range, the offset is simply added to the base virtual address.

Next some map registers are allocated. These are used to map buffers passed to the miniport’s send function into physical addresses. Send buffers are passed down from the protocol drivers, but only exist in virtual memory. Thus, map registers are used to store the virtual to physical mappings. To actually start a mapping, the NDIS function NdisMStartBufferPhysicalMapping is used. If each buffer is to constitute one packet, then enough registers are needed to map the maximum number of packets. This is equal to the maximum number of transmit PCLs.
NdisMAllocateMapRegisters is used to allocate these registers. The parameters to be passed to it are:

1. Handle of the adapter (in adapter block)
2. ‘0’ for Non-ISA bus
3. TRUE for 32-bit DMA operations
4. Number of map registers needed (MAX_XMIT_PCLS)
5. Max size of a mapping (TILYNX_MAX_PACKET_SIZE)

Next shared memory is allocated for the receive & transmit buffers. These must be in shared memory so that they have physical addresses, which are downloadable to the card. The NIC cannot understand/access virtual addresses. For receives, the DMA copies the received data straight into the receive buffer, and for transmits the data to be sent is placed into the transmit buffer. Each packet is stored in this memory in stages of TILYNX_MAX_PACKET_SIZE, so a packet will never overflow into the next packet’s buffer space. Thus, to reference the beginning of each packet or PCL, an offset is added to the base physical address:

Adapter->ReceiveMemPhysical + (PCLNum * TILYNX_MAX_PACKET_SIZE)

For example, to access PCL 0:

Adapter->ReceiveMemPhysical + (0 * TILYNX_MAX_PACKET_SIZE)

Shared memory is allocated using NdisMAllocateSharedMemory, passing the variable names where the virtual & physical addresses are to be stored. The size of the buffer should be TILYNX_MAX_PACKET_SIZE * MAX_XMIT_PCLS for the transmit buffer, and TILYNX_MAX_PACKET_SIZE * MAX_RECV_PCLS for the receive buffer. Memory should be allocated in cached memory, since non-cached memory is a scarce system resource and is less likely to be available.

Next shared memory is allocated for the transmit & receive PCL structures. These must also be stored in shared memory, because they are accessed by the NIC. The same library function, NdisMAllocateSharedMemory, is used, and this is called once for each PCL. Thus it is called MAX_XMIT_PCLS times for the transmit PCLs, and MAX_RECV_PCLS times for the receive PCLs. The size of each allocation is always the size of a PCL structure, which is obtained using sizeof(PCL_S).
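The buffer sizing and per-packet offset arithmetic described above reduces to two small calculations, sketched here. The numeric values given to TILYNX_MAX_PACKET_SIZE, MAX_XMIT_PCLS and MAX_RECV_PCLS are illustrative assumptions only (the real values are defined in the driver headers), and the function names are hypothetical.

```c
#include <assert.h>

/* Illustrative values only; the real constants live in the driver headers. */
#define TILYNX_MAX_PACKET_SIZE 2048UL
#define MAX_XMIT_PCLS          8UL
#define MAX_RECV_PCLS          16UL

/* Total bytes of shared memory needed for a buffer holding pclCount packets. */
unsigned long bufferSize(unsigned long pclCount)
{
    return TILYNX_MAX_PACKET_SIZE * pclCount;
}

/* Address of the slot belonging to PCL number pclNum, given the base address
 * of the buffer. Each slot is a full maximum-size packet, so one packet can
 * never run into the next packet's space. */
unsigned long packetSlotAddress(unsigned long base, unsigned long pclNum,
                                unsigned long pclCount)
{
    assert(pclNum < pclCount);  /* slot must lie inside the buffer */
    return base + pclNum * TILYNX_MAX_PACKET_SIZE;
}
```

With these example values the receive buffer occupies 32768 bytes, and the slot for PCL 3 begins 6144 bytes (0x1800) above the base address.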
The virtual, physical and buffer addresses for each PCL are stored in an array of PCL_LIST_S structures, defined in tilynxsw.h. There is one array for the receive PCLs, and one for the transmit PCLs.

As well as allocating memory for these PCL structures, shared memory must be allocated for the dummy PCLs used to start a PCL queue. Thus two more memory ranges are allocated, for the transmit and receive dummy PCLs, each of size sizeof(PCL_S).

Next the interrupt is registered with Windows. The previous call to NdisMPciAssignResources during tilynxInitialize returns the information on the IRQ number, and this can be used in a call to the NDIS library function NdisMRegisterInterrupt. The parameters passed are:

1. Pointer to an NDIS_MINIPORT_INTERRUPT structure for returned information
2. Interrupt vector (IRQ number)
3. Interrupt level (should be the same as parameter 2)
4. Shared interrupt with another device = FALSE
5. RequestISR = FALSE
6. InterruptMode = NdisInterruptLatched

RequestISR is set to FALSE to notify the wrapper (NDIS) that the miniport’s HandleInterrupt function should be called on an interrupt, not the Isr function. An Isr function must still be exported by the driver, as this is called if an interrupt occurs during Initialize. InterruptMode is set such that interrupts are triggered by a transition from low to high on the interrupt line.

Lastly, the miniport’s cardInitialize is called. This performs some writes to the hardware to control its operation, and sets it up ready to receive and send data. It sets up the receive PCLs to receive data, via a call to another internal function. For a complete description of cardInitialize, see Section 6.3.2.

All memory allocated within tilynxSetupAdapter is zeroed using NdisZeroMemory, since on allocation the memory will contain indeterminate data.

If there was a problem at any point during tilynxSetupAdapter, NDIS_STATUS_FAILURE is returned to the calling function, which should free up any resources allocated by this function.
Otherwise NDIS_STATUS_SUCCESS is returned.

Name: tilynxFreeResources
Source: tilynx.c
Parameters:
IN: Adapter - Pointer to the TILYNX_ADAPTER descriptor block.
Return Value: None.
Call Type: Internal Function
Description: Frees all resources that have been allocated for use by the adapter. This is called internally by other driver functions when there has been an error in loading the miniport, and all resources must be deallocated for use by other system applications or drivers. It is also called when the miniport is to be unloaded, for example on system shutdown.

First all normal allocations are checked, in case they have not been allocated yet. If they have not been allocated, then they need not be deallocated, and a call to free the resource would have uncertain consequences. This is a safety mechanism, in case this function is called early in the initialisation sequence, in which case not all resources will have been allocated yet.

Note that the order of deallocation is designed such that later allocations are freed first. This is because later allocations may depend on earlier allocations. There follows a list of all resources which are allocated during normal driver operation (latest first):

• Receive memory buffer: Adapter->ReceiveMemVirtual
• Transmit memory buffer: Adapter->TransmitMemVirtual
• Dummy Receive PCL mem: Adapter->DummyReceivePclVirtual
• Dummy Transmit PCL mem: Adapter->DummyTransmitPclVirtual
• Transmit PCLs mem: Adapter->TransmitPclList[]
• Receive PCLs mem: Adapter->ReceivePclList[]
• Interrupt (IRQ): Adapter->Interrupt
• Map registers
• Mapped IO Space for NIC registers: Adapter->tilynxRegisters
• Adapter descriptor block mem: Adapter (Not shared)

NDIS library functions which are available for freeing resources are:

• NdisMFreeSharedMemory - for freeing the memory shared by the miniport and the hardware.
• NdisMDeregisterInterrupt - for deregistering the interrupt associated with the device.
• NdisMFreeMapRegisters - for freeing the map registers used to map virtual to physical memory.
• NdisMUnmapIoSpace - for unmapping the IO space used to map device memory to system virtual memory.
• NdisFreeMemory - for freeing any memory which was allocated, but not shared with the device.

Name: tilynxHalt
Source: tilynx.c
Parameters:
IN: MiniportAdapterContext - Handle of Adapter descriptor block.
Return Value: None.
Call Type: NDIS Export
Description: This function is called by NDIS when the driver is being removed, usually on system shutdown. It should stop the card and free all resources allocated during initialisation.

The card can be stopped by blocking any interrupts, and stopping any received data:

• Call the compiler macro cardBlockInterrupts.
• Stop receives by writing an invalid address to the receive DMA (Channel 0) current PCL register. This can be accomplished using the compiler macro TILYNX_WRITE_REG. An invalid address is any number with a 1 in the LSB (Least Significant Bit).

The freeing of resources is done using a call to the tilynxFreeResources function. The use of one function for all freeing of resources makes the code more readable, and more adaptable to changes in the allocated resources.

No status is returned by this function.

Name: tilynxReset
Source: tilynx.c
Parameters:
OUT: AddressingReset - If SetInformation should be redone.
IN: MiniportAdapterContext - Handle of Adapter descriptor block.
Return Value: Status of operation.
Call Type: NDIS Export
Description: Is called by NDIS if it thinks the hardware has hung. Simply performs a software reset by writing to the MPCI_MISCTRL_REG register. AddressingReset can be ignored, as the reset does not upset the card’s configuration. Always returns NDIS_STATUS_SUCCESS.

Name: tilynxCheckForHang
Source: tilynx.c
Parameters:
IN: MiniportAdapterContext - Handle of Adapter descriptor block.
Return Value: Boolean.
Call Type: NDIS Export
Description: Is called by NDIS to check that the device has not stopped/hung. In this implementation, the device should not hang. Therefore, always returns FALSE (device hasn’t hung).

Name: myDbgPrintf
Source: tilynx.c
Parameters: Same format as printf (IN: *pszfmt).
Return Value: None.
Call Type: Internal Function
Description: Outputs information to the debug terminal. There is no debug NDIS 4.0 subsystem (NDIS.VXD), therefore the usual DbgPrint function is not supported when developing NDIS 4.0 Miniports for Windows95. If the target system is Windows NT, DbgPrint is supported as usual.

This function should use a normal DbgPrint call for NT Miniports, and for Windows95 should use VxD wrapper services for the debug output. There must be a distinction, because NT will crash if a call to VxD wrapper services is made.

For NT, the following code can be used:

    __asm jmp DbgPrint

For Windows95:

    __asm jmp LCODE__Debug_Printf_Service

LCODE__Debug_Printf_Service is the Windows95 native DbgPrint function, and is provided in library VXDWRAPS.CLB of the Windows95 DDK. Thus this file must be linked during the build process.

6.3.2 card.c

Name: cardInitialize
Source: card.c
Parameters:
IN: Adapter - Pointer to the TILYNX_ADAPTER block.
Return Value: Status - Status of operation.
Call Type: Internal Function
Description: This is called by tilynxSetupAdapter, during tilynxInitialize, to perform configuration of the hardware.

Configuration is achieved by writing to the register-space of the PCILynx chip. The register-space should already have been mapped to a virtual address space, using NdisMMapIoSpace. The NDIS library provides the following functions for reading and writing this mapped space:

• NdisReadRegisterXxxx
• NdisWriteRegisterXxxx

where ‘Xxxx’ can be ‘Ushort’, ‘Uchar’ or ‘Ulong’, depending on the amount of memory to access. The PCILynx chip registers are stored in contiguous 4-byte (quadlet) stages, which on the target system equate to a Ulong.
The compiler macros TILYNX_READ_REG and TILYNX_WRITE_REG were written to simplify the calls to these register functions, and can be found in the source include file tilynxhw.h. The offset of each register from the base address is defined with a compiler ‘#define’ statement in tilynxhw.h, and these names are used here to identify each register. The values that must be written to these registers are also defined in tilynxhw.h, to simplify the code.

Precise definitions of each register, and its function, can be found in the PCILynx specification. This should be used in conjunction with this module specification to determine the actual bits to be set in each register.

The following hardware configuration must be performed during cardInitialize:

Perform a PCI software reset, by writing to the PCI Miscellaneous Control register (MPCI_MISCTRL_REG).

Set the FIFO sizes by writing to the FIFO_SIZES register. This divides the 256-quadlet buffer into ATF/ITF/GRF. No isochronous functionality is required, thus the FIFO should be split equally between ATF and GRF.
• ATF = 128 quadlets
• GRF = 128 quadlets
• ITF = 0 quadlets

Set the transmit FIFO thresholds by writing to the FIFO_XMT_THRESHOLD register. The DMA will not start processing until this amount of data has been written to one of the transmit FIFOs.
• ATF = 32 quadlets (this can be tweaked as required)
• ITF = 256 quadlets (this threshold will never be reached)

Reset the overrun/underrun counters for ATF/ITF/GRF by writing all zeroes to the LLC_FIFO_OVRFL register. These will be incremented when overruns or underruns occur.

Check the PCILynx revision ID by reading from MPCI_REV_ID_REG in the PCI configuration space. If the revision is below revision A (<0x02), some extra bits must be set in MPCI_MISCTRL_REG. This is documented in the PCILynx specification [3], which states that PCI slave posted writes and PCI slave burst should be enabled to ensure PCI Spec 2.1 compliance.
Set up PCLs, DMA and memory to receive data from the 1394 bus. Due to the complexity of this step, and so that it may be reused outside of initialisation, this code has been placed in a separate function, setupReceivePcls. It should be passed the receive type as a parameter (PKT_RCV or PKT_RCV_AND_UPDATE). Extra flags can also be passed, for example to implement a circular PCL queue; this is not required at present, and has been provided for future implementations.

Enable the card interrupts, using the compiler macro cardUnblockInterrupts. This has been implemented as a macro because it is needed in many places in the code during driver operation. Therefore, if a different interrupt mask is required, it only needs to be changed in one place.

Set retry count to 5 and retry delay to 0 Iso Intervals. This is achieved by writing 0x0005 to the DMA_BUSY_RETRY register. Note that to have a retry delay other than zero, a cycle timer must be present on the bus. That is not the case for this implementation, since the cycle timer is only used for isochronous functionality.

Next the 1394 Bus/Physical IDs must be determined and written to the card. These are discovered by forcing a bus reset (compiler macro BusReset), and reading the physical ID from PHY register 0, which is written during the self ID process. There must be a wait of 400 microseconds between the bus reset and the reading of PHY0, which can be achieved using NdisMSleep(400). To read the physical registers, write to LLC_PHY_REGS with the address of the register to read (0) and the READ bit, then read LLC_PHY_REGS. Store the physical ID in the adapter descriptor block, and force bus ID = 0. Finally, the 16-bit ID must be written to LLC_NODE_ADDR.

Set the LLC_CONTROL register to enable asynchronous transmits and receives. The RCV_COMP_VALID bit should also be set, which enables the reception of packets.
This bit ensures that the 1394 Bus/Physical node ID matches the destination of the incoming packet, and also uses the receive comparators to establish whether a received packet should be accepted. By not setting the isochronous transmit/receive bits, isochronous transfer is disallowed. It is also important to note that the asynchronous transmit/receive bits are reset to 0 on a bus reset, thus the interrupt handlers should set these bits again.

This completes the hardware configuration, and the card should now be ready to receive data. It is also ready for a queue of transmit PCLs to be linked to a DMA channel and sent.

At present NDIS_STATUS_SUCCESS is always returned, however a status return has been implemented to provide for future code where failures may be possible. This is specifically provided for the setupReceivePcls call, since writing to register space never returns a failure.

Name: setupReceivePcls
Source: card.c
Parameters: IN: Adapter – Pointer to the TILYNX_ADAPTER block
            IN: Command – PKT_RCV or PKT_RCV_AND_UPDATE
            IN: flags – For example to create a circular PCL queue
Return Value: None.
Call Type: Internal Function
Description:
Sets up the DMA asynchronous receive buffers: constructs PCLs to receive asynchronous packets into a ring of memory buffers, each of size TILYNX_MAX_PACKET_SIZE. After each packet is received into a buffer, the PCL generates an interrupt so that tilynxHandleInterrupt can be called to process the received packet.

The DMA channel to be used for receives is fixed as channel 0. No other channels can be used for asynchronous transfer due to complications with 1394 bus retries [4].

The number of PCLs in the queue should equal MAX_RECV_PCLS (a global define in tilynxsw.h). Each PCL in this queue must be set up with the correct bits set in its structure. PCLs are defined by the structure type PCL_S, also defined in tilynxsw.h.

The queue should start with a dummy PCL, which only contains information in its nextPCL and errorPCL fields.
All other fields are ignored, and the DMA moves straight onto the PCL at the address stored in nextPCL. The last PCL in the queue should have an invalid address stored in its nextPCL/errorPCL fields, unless a circular queue is required. An invalid address is an address containing a 1 in the LSB.

Precise definitions of each bit in the PCL structure can be found in the PCILynx specification [4]. The following field values must be set in the PCLs. All other bits should be set to zero.

Data Buffer0 Control & Byte Count Quadlet:
    transferCount = TILYNX_MAX_PACKET_SIZE – This sets the size of the available receive buffer for each PCL.
    isochronousMode = 0 – Asynchronous on this channel
    multipleIsochronous = 0 – Not required (isochronous only)
    transmitSpeed = 0 – Not required (transmits only)
    bigEndian = 1 – Send buffer in Big Endian (for 1394 Bus compatibility)
    waitForStatus = 0 – Pipeline transmits to improve throughput
    lastBuffer = 1 – This is the last buffer of the PCL (no Buffer1+)
    doneInterrupt = 1 – Generate an interrupt when DMA completes PCL
    waitSelect = 0 – No wait: continue execution
    command = Command – Passed as a parameter (PKT_RCV/UPDATE)

Data Buffer0 Address:
    Set to the physical address into which packets should be received

If the flags passed contain CHAN_CIRCULAR, point nextPCL/errorPCL of the last PCL in the queue back to the first PCL.

Once the dummy PCL has been set to point to the first PCL in the queue, the physical address of the dummy can be passed to the linkPcls function, in order to link the queue to the DMA channel.

After linkPcls has been called, the function exits (with no return status).

Name: linkPcls
Source: card.c
Parameters: IN: Adapter – Pointer to TILYNX_ADAPTER block
            IN: Command – PKT_XMT/PKT_RCV/PKT_RCV_AND_UPDATE
            IN: dummyPhysicalAddress – First PCL in queue (dummy)
            IN: dmaChannel – DMA Channel to link the PCL queue to
Return Value: None.
Call Type: Internal Function
Description:
Will take a completed PCL chain and link it to a DMA channel.
Provision has been made for choosing the channel, however by design, all asynchronous transfers are to be linked to channel 0, due to limitations of the card. This fourth parameter should allow for future implementations of isochronous traffic.

The PCL chain must have been allocated in physical memory. The chain must have a dummy PCL that is used to point to the actual PCL chain that will be used to transfer the packets to memory.

The DMA waits for a valid PCL pointer to be written to the Current Packet Control List Address register, and for the CHA_ENA and LINK bits to be set in the DMA control register. Note: a valid address means a 0 in bit 0. The address written is the dummyPhysicalAddress parameter, which should already be a valid value. All PCLs in the queue must have valid nextPCL addresses, except for the last PCL if that is to halt DMA processing.

Given a valid PCL address, the DMA looks at it and fetches the nextPCL address from this PCL. That is because this first PCL is just a dummy PCL, as described in the PCILynx specification [5]. The DMA makes this nextPCL address the current PCL address, and begins execution. If the address is invalid, the DMA is halted and the LINK and BUSY bits are cleared in the DMA status register. A DMA Halted interrupt is then generated and the channel becomes inactive.

This function tests the Command parameter, to check if this PCL queue is for PKT_XMT, PKT_RCV or PKT_RCV_AND_UPDATE. Both receive types share the same code; they are differentiated by the command values in the PCL queue.

If PKT_XMT:

Write the physical address of the first PCL in the queue (dummy) into the DMA 0 current PCL register, DMA_CHAN_0_CURR_PCL.

Set the CHA_ENA and LINK bits in the DMA Channel 0 Control register, DMA_CHAN_0_CTRL. The rest of this register is read-only and will be updated to contain the commands etc. in the PCL.

Flush the ATF (Asynchronous Transmit FIFO), in case the data has not reached the ATF threshold.
This flush can be removed if the packets are known to be frequent or large enough. It is achieved by writing to the FIFO_TEST register.

After the queue has been transmitted, if the last PCL is to halt processing, the channel will be halted, and CHA_ENA and LINK must be set again in the DMA Control register if the channel is to be used again.

This completes the transmit code.

If PKT_RCV or PKT_RCV_AND_UPDATE:

For receives the dummy physical address must be set as for PKT_XMT, and the channel must also be enabled. However, the receive comparators should also be set before the channel is enabled. This sets up a system to accept or decline received packets, based on certain fields in the header. Each DMA channel has two comparators, each of which has a value register and a mask. The mask is used to determine how many bits of the value register are to be checked.

Write the 1394 Bus/Physical IDs to the DMA channel 0 Word 0 Value register, DMA_CHAN_0_WORD0_VALU. The Bus and Physical IDs are stored in the adapter descriptor block after the self ID process.

Write to the DMA0 Word0 Mask register, DMA_CHAN_0_WORD0_MASK. This should be set to check all bits of the Bus and Physical ID. It should also be set to check that received packets have an asynchronous transaction code. This mask can be achieved by setting bits 0xFFFF00A0.

Set the DMA0 Word1 Value register, DMA_CHAN_0_WORD1_VALU, to not compare the source ID on incoming packets. This is achieved by writing all zeroes.

Write to the DMA0 Word1 Mask register, DMA_CHAN_0_WORD1_MASK, to enable the channel, send acknowledgements, match broadcast etc. and accept self-ID packets. These bits are defined in tilynxhw.h and can be OR’ed together to achieve the value to be written to the register:
    CMP_W1_ENA_CH_COMPARE – Enable the channel for receives.
    WRITE_REQ_ACK_SEL – Send acknowledgements on a receive.
    MATCH_BUS_AND_NODE – Accept packets with correct 1394 ID.
    MATCH_3FF_AND_NODE – Accept Local Bus, correct physical ID.
    MATCH_BUS_AND_3F – Accept correct bus, Broadcast physical ID.
    MATCH_BROADCAST – Accept Local Bus, Broadcast physical ID.

Don’t set SELF_ID_ENA, as this enables the catching of self ID packets, and reception into PCLs. Instead, the self ID information should be read from the PHY registers after a bus reset.

This completes the receive code.

6.3.3 send.c

Name: tilynxSend
Source: send.c
Parameters: IN: MiniportAdapterContext – Handle of Adapter descriptor block
            IN: Packet – Packet descriptor for frame to be sent
            IN: Flags – Optional
Return Value: Status of operation.
Call Type: NDIS Export
Description:
This is called by NDIS to send a packet on the 1394 bus. As required by design, it must, in outline:
• Check destination MAC address of frame passed to it.
• Translate MAC to 1394 ID (using lookup table).
• Build a 1394 block payload packet header.
• Copy the passed frame/buffer after the header.
• Pass to setupTransmitPcls.

NdisQueryPacket is used to find the first buffer descriptor contained by the packet. NdisQueryBuffer is then used to find the (virtual) address and length of this buffer.

A 1394 async write block payload packet header is created at the beginning of the transmit memory. This should be set with the source ID, the correct transaction code for block writes (1), and no destination offset. The data length field should be set to the buffer length returned by NdisQueryBuffer (the length of the frame to be sent).

The buffer to be sent should then be copied into the transmit memory, directly after the 1394 packet header. The starting address of the buffer has been returned by NdisQueryBuffer.

The destination 1394 ID must now be set in the packet header. This must be obtained by translating the buffer’s destination MAC address to a 1394 ID. The destination MAC is found in the first 6 bytes of the buffer. tilynxSend then uses this to look up the translation table, stored in its Adapter descriptor block, as required by the design.
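The MAC-to-1394-ID translation described above can be sketched as a small fixed-size table with a linear search. This is an illustrative sketch only: the names sketch_map_t, sketch_node_id, sketch_map_add, sketch_map_lookup and sketch_is_broadcast are hypothetical and do not come from the project source. It assumes the standard IEEE-1394 layout of the 16-bit node ID (a 10-bit bus ID followed by a 6-bit physical ID), and a limit of 63 addressable nodes per bus.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAC_LEN    6u
#define SKETCH_MAX_NODES  63u   /* physical IDs 0..62; 63 is the broadcast ID */

/* One translation entry: Ethernet-style MAC address to 16-bit 1394 node ID. */
typedef struct {
    uint8_t  mac[SKETCH_MAC_LEN];
    uint16_t node_id;
} sketch_map_entry_t;

typedef struct {
    sketch_map_entry_t entry[SKETCH_MAX_NODES];
    size_t             count;
} sketch_map_t;

/* Combine a 10-bit bus ID and a 6-bit physical ID into a 16-bit node ID. */
uint16_t sketch_node_id(uint16_t bus_id, uint8_t phys_id)
{
    return (uint16_t)(((bus_id & 0x3FFu) << 6) | (phys_id & 0x3Fu));
}

/* Append an entry; returns 0 on success, -1 if the table is full. */
int sketch_map_add(sketch_map_t *m, const uint8_t *mac, uint16_t node_id)
{
    assert(m != NULL && mac != NULL);
    if (m->count >= SKETCH_MAX_NODES)
        return -1;
    memcpy(m->entry[m->count].mac, mac, SKETCH_MAC_LEN);
    m->entry[m->count].node_id = node_id;
    m->count++;
    return 0;
}

/* The destination MAC is the first 6 bytes of the frame; all ones = broadcast. */
int sketch_is_broadcast(const uint8_t *frame)
{
    static const uint8_t all_ones[SKETCH_MAC_LEN] =
        {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};
    return memcmp(frame, all_ones, SKETCH_MAC_LEN) == 0;
}

/* Returns 1 and stores the node ID if the destination MAC is known, else 0. */
int sketch_map_lookup(const sketch_map_t *m, const uint8_t *frame,
                      uint16_t *node_id_out)
{
    size_t i;
    assert(m != NULL && frame != NULL && node_id_out != NULL);
    for (i = 0; i < m->count; i++) {
        if (memcmp(m->entry[i].mac, frame, SKETCH_MAC_LEN) == 0) {
            *node_id_out = m->entry[i].node_id;
            return 1;
        }
    }
    return 0;
}
```

A linear search is adequate here because a single 1394 bus holds at most 63 nodes; a caller would test sketch_is_broadcast first and, for a broadcast frame, iterate over every entry instead of performing a lookup.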
Next, the shared memory should be updated with NdisUpdateSharedMemory, to make sure it is current when copied by the hardware.

Finally, setupTransmitPcls is called, passing the address of the transmit staging buffer, as well as the length of the buffer (equal to the 1394 header length + the frame length).

Note that if the destination MAC contains a binary 1 in all 48 bits (hex FF:FF:FF:FF:FF:FF), this is a broadcast transmission, and thus tilynxSend should copy the same packet to all nodes in its lookup table. If the lookup table has not been built (unlikely, as this happens very quickly), it should use the self ID information it has received.

Only returns NDIS_STATUS_SUCCESS.

Name: setupTransmitPcls
Source: send.c
Parameters: IN: Adapter – Pointer to the TILYNX_ADAPTER block
            IN: BufferLength – Length of buffer to be sent
            IN: BufferPhysicalAddress – Physical address of transmit buffer
Return Value: None.
Call Type: Internal Function
Description:
Takes a passed buffer to be transmitted, and constructs a PCL queue to link to a DMA channel. The buffer should be correctly formatted as a 1394/FIFO packet with header and payload.

For now only single-packet sends are implemented, thus only one PCL is necessary in the queue (as well as a dummy). If multiple-packet sends are required, the code can easily be changed to reflect the way setupReceivePcls works.

Precise definitions of each bit in the PCL structure can be found in the PCILynx specification [4]. The following field values should be set in the PCLs. All other bits should be zero.

Dummy PCL layout:
    nextPCL = errorPCL = (Physical address of first PCL in queue)
    All other bits are ignored.

PCL0 layout:
    nextPCL = errorPCL = (Invalid address, e.g. 0x1)

    Data Buffer0 Control & Byte Count Quadlet:
    transferCount = BufferLength – Passed as a parameter
    isochronousMode = 0 – Asynchronous for this channel/PCL
    multipleIsochronous = 0 – Can be anything, isochronous only
    transmitSpeed = 1 – Transmit speed, 1 = 200Mbps/S200
    bigEndian = 1 – Big Endian for 1394 Bus compatibility
    waitForStatus = 0 – Don't wait: pipeline asynchronous transmits
    lastBuffer = 1 – This is the last buffer
    doneInterrupt = 1 – Post an interrupt when status is completed for PCL
    waitSelect = 0 – Don't wait to continue execution of PCL
    command = 2 – Binary 0010 = Command: XMT memory to 1394 FIFO

    Data Buffer0 Address:
    Physical address of the data buffer (BufferPhysicalAddress was passed as a parameter).

Once the dummy and transmit PCLs are constructed, they are in a queue: Dummy -> PCL0 -> Halt Processing.

The dummy PCL should now be linked to a DMA channel. The design requires that channel 0 be used, thus linkPcls is called, passing PKT_XMT as the command, with channel = 0, and passing the physical address of the dummy PCL. After linkPcls finishes, the function is complete and no status needs to be returned.

6.3.4 oid.c

Name: tilynxQueryInformation
Source: oid.c
Parameters: IN: MiniportAdapterContext – Handle of Adapter descriptor block
            IN: Oid – The NDIS_OID to process
            IN: InformationBuffer – Stores result of the query
            IN: InformationBufferLength – Number of bytes left in buffer
            OUT: BytesWritten – Number of bytes written to InformationBuffer
            OUT: BytesNeeded – Contains bytes needed if not enough room in InformationBuffer
Return Value: Status of operation
Call Type: NDIS Export
Description:
Is called by NDIS/protocols to query the OID (Object Identifier) information relative to the adapter. This information includes such details as the MAC address, media type, maximum number of packets supported at once, etc.
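A query handler of this kind typically switches on the OID, checks the caller's buffer length, and only then copies the result. The following is a minimal, self-contained sketch of that pattern rather than the project's code: the status and OID enumerations, the type sketch_oid_adapter_t and the function sketch_query are all hypothetical stand-ins (the enumeration values are not the real NDIS constants), and only three representative queries are handled, using the S200 values quoted in this specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the real NDIS status/OID values. */
typedef enum {
    SKETCH_STATUS_SUCCESS,
    SKETCH_STATUS_INVALID_LENGTH,
    SKETCH_STATUS_INVALID_OID
} sketch_status_t;

typedef enum {
    SKETCH_OID_MAXIMUM_LOOKAHEAD = 1,
    SKETCH_OID_LINK_SPEED,
    SKETCH_OID_CURRENT_ADDRESS
} sketch_oid_t;

typedef struct {
    uint8_t mac[6];   /* MAC address held in the adapter descriptor block */
} sketch_oid_adapter_t;

sketch_status_t sketch_query(const sketch_oid_adapter_t *a, sketch_oid_t oid,
                             void *buf, uint32_t buf_len,
                             uint32_t *written, uint32_t *needed)
{
    uint32_t    value32 = 0;
    const void *src;
    uint32_t    len;

    assert(a != NULL && buf != NULL && written != NULL && needed != NULL);
    *written = 0;
    *needed  = 0;

    /* Step 1: select the requested information. */
    switch (oid) {
    case SKETCH_OID_MAXIMUM_LOOKAHEAD:
        value32 = 1024u;              /* maximum payload at S200 */
        src = &value32; len = sizeof value32;
        break;
    case SKETCH_OID_LINK_SPEED:
        value32 = 2000000u;           /* value quoted for S200 */
        src = &value32; len = sizeof value32;
        break;
    case SKETCH_OID_CURRENT_ADDRESS:
        src = a->mac; len = sizeof a->mac;
        break;
    default:
        return SKETCH_STATUS_INVALID_OID;
    }

    /* Step 2: if the buffer is too small, report how many bytes are needed. */
    if (buf_len < len) {
        *needed = len;
        return SKETCH_STATUS_INVALID_LENGTH;
    }

    /* Step 3: copy the result and report the number of bytes written. */
    memcpy(buf, src, len);
    *written = len;
    return SKETCH_STATUS_SUCCESS;
}
```

Performing the length check once, after the switch, keeps each case down to selecting a source and a length, so adding a further OID does not duplicate the error handling.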
Algorithm:
• switch(oid)
• get requested information
• copy into buffer
• if not enough room in buffer, request a larger buffer

Supported OIDs, which must be ready for querying, and the results they should return, are:

OID_GEN_MAC_OPTIONS = (NDIS_MAC_OPTION_TRANSFERS_NOT_PEND | NDIS_MAC_OPTION_RECEIVE_SERIALIZED | NDIS_MAC_OPTION_COPY_LOOKAHEAD_DATA | NDIS_MAC_OPTION_NO_LOOPBACK)
OID_GEN_SUPPORTED_LIST = List of supported OIDs
OID_GEN_HARDWARE_STATUS = NdisHardwareStatusReady
OID_GEN_MEDIA_SUPPORTED = OID_GEN_MEDIA_IN_USE = Ethernet
OID_GEN_MAXIMUM_LOOKAHEAD = Max payload (1024 for S200)
OID_GEN_MAXIMUM_FRAME_SIZE = Max payload – Eth header size = 1024 – 14 = 1010 (at S200)
OID_GEN_MAXIMUM_SEND_PACKETS = 1 (only single packet sends)
OID_GEN_MAXIMUM_TOTAL_SIZE = Max 1394 payload size (e.g. 1024 for S200)
OID_GEN_LINK_SPEED = 2,000,000 (for S200; the value is in units of 100 bps)
OID_GEN_TRANSMIT_BUFFER_SPACE = Max packet size = Payload + Header = 1024 + 16 = 1040 (for S200)
OID_GEN_RECEIVE_BUFFER_SPACE = Max packet size
OID_GEN_TRANSMIT_BLOCK_SIZE = Max packet size (one block = one buffer, by design)
OID_GEN_RECEIVE_BLOCK_SIZE = Max packet size
OID_GEN_VENDOR_ID = Read from PCI config space (e.g. 0x8000104C for TI cards)
OID_GEN_VENDOR_DESCRIPTION = “Texas Instruments 1394 PCILynx Card.\0”
OID_GEN_VENDOR_DRIVER_VERSION = Project version number (e.g. 0.038)
OID_GEN_DRIVER_VERSION = NDIS version number (4.0)
OID_GEN_MEDIA_CONNECT_STATUS = NdisMediaStateConnected
OID_802_3_PERMANENT_ADDRESS = OID_802_3_CURRENT_ADDRESS = MAC address stored in Adapter descriptor block
OID_802_3_MAXIMUM_LIST_SIZE = Max multicast list size = 0 (not used)

If none of these OIDs was passed, return NDIS_STATUS_INVALID_OID.
If the buffer is too small, return NDIS_STATUS_INVALID_LENGTH and pass the number of bytes needed.
Otherwise, copy into the buffer and return NDIS_STATUS_SUCCESS.

Name: tilynxSetInformation
Source: oid.c
Parameters: IN: MiniportAdapterContext – Handle of Adapter descriptor block
            IN: Oid – OID code to identify the set operation on the driver
            IN: InformationBuffer – Buffer containing data to be used in the set
            IN: InformationBufferLength – Number of bytes in InformationBuffer
            OUT: BytesRead – To return number of bytes read from buffer
            OUT: BytesNeeded – Returns if additional bytes needed for this OID
Return Value: Status of the set operation
Call Type: NDIS Export
Description:
Handles a set operation for a single OID.

Algorithm:
• verify length
• switch(oid)
• process request

OIDs:

OID_GEN_CURRENT_PACKET_FILTER
    Return unsupported if the set is for any of:
        NDIS_PACKET_TYPE_SOURCE_ROUTING
        NDIS_PACKET_TYPE_SMT
        NDIS_PACKET_TYPE_MAC_FRAME
        NDIS_PACKET_TYPE_ALL_FUNCTIONAL
        NDIS_PACKET_TYPE_GROUP
    Otherwise set the packet type in the Adapter descriptor block.

OID_GEN_CURRENT_LOOKAHEAD
    Copy into the Adapter descriptor block.

If the OID passed was neither of these, return NDIS_STATUS_INVALID_OID.

6.3.5 interrup.c

Name: tilynxEnableInterrupt
Source: interrup.c
Parameters: IN: MiniportAdapterContext – Handle of Adapter descriptor block
Return Value: None.
Call Type: NDIS Export
Description:
Is called by NDIS to enable interrupts on the card. This function only calls the compiler macro cardUnblockInterrupts. This ensures that if the interrupt mask is to be altered, the code need only be changed in one place. This is usually called after tilynxHandleInterrupt.

Name: tilynxDisableInterrupt
Source: interrup.c
Parameters: IN: MiniportAdapterContext – Handle of Adapter descriptor block
Return Value: None.
Call Type: NDIS Export
Description:
Is called by NDIS to disable interrupts on the card. This function only calls the compiler macro cardBlockInterrupts. This makes for more readable code, and allows for easy code alteration in the future.
It is also included as an obvious pair to the macro cardUnblockInterrupts. This is usually called before tilynxHandleInterrupt.

Name: tilynxHandleInterrupt
Source: interrup.c
Parameters: IN: MiniportAdapterContext – Handle of Adapter block.
Return Value: None.
Call Type: NDIS Export
Description:
Interrupt handler during normal driver operation (after tilynxInitialize). It should acknowledge interrupts when they occur, otherwise they will stay set, and the machine will enter an infinite loop (calling tilynxHandleInterrupt).

There are two interrupt registers, PCI and LLC. The LLC register holds very hardware-specific interrupt status bits, such as incoming header CRC errors. The logical OR of the LLC bits is fed into one bit of the PCI register, thus if any one LLC bit is set, it will cause a PCI interrupt. Details on this can be found in the PCILynx specification [6]. All bits of the PCI and LLC registers can be enabled or disabled, depending on the requirement. Only a select few are enabled, otherwise the card would be continually causing interrupts, and slowing the system down.

To acknowledge an interrupt, a binary 1 should be written to its status bit.

tilynxHandleInterrupt must first read the PCI register, to determine where the interrupt is coming from.

If it is a (PCI) DMA0_PCL interrupt, then a packet has been completed on DMA0. This may be a receive or a transmit. On receives, the data should be indicated up to the protocols, in the lookahead buffer, using an EthIndicateReceive function call. First however, the DMA0 status register should be read to determine if the transfer was successful. Finally the interrupt must be acknowledged by writing a 1 to its status bit.

If it is a (PCI) DMA0_HLT interrupt, then DMA channel 0 has been halted. This is usually caused by a PCL queue which halts at the end of the queue. The channel must be reenabled by writing to the DMA0 control register. The interrupt should be acknowledged.

If the LLC_INT bit is set, then there is an LLC interrupt.
The LLC_INT register should be read, and checked for the following bits:

BUS_RESET – A bus reset has occurred. Acknowledge, and reenable asynchronous receives/transmits by writing to the LLC_CONTROL register (bus resets clear these bits). Also, the self ID process after a bus reset assigns the node a new 1394 ID, and this should be read from the PHY0 register. There must be a pause of 400 microseconds after a reset before this is read, however NdisMSleep will not work at this IRQ level. Thus the reading of PHY0 should be deferred until processed by a function running at a different level, or until sufficient time has passed. See the HDR_ERR bit.

GRF_OFLOW – An overflow has occurred in the General Receive FIFO. Acknowledge.

ATF_UFLOW – An underflow has occurred in the Asynchronous Transmit FIFO. Acknowledge.

RXDATA_RDY – Received packet status bit. Acknowledge. This should not really be turned on, as any proper receives are notified in the PCI register, and it can occur very often.

AT_STUCK – Stuck on an asynchronous transmit. Acknowledge.

HDR_ERR – Incoming header CRC error. This also occurs when physical layer packets are received, such as self ID packets. This is because they don’t use the same CRC process: they are just one quadlet of information followed by a quadlet which is the inverse of the first quadlet. This interrupt can be used as the point at which to read the PHY0 register for the 1394 ID, as the last time it occurs will be after the last self ID packet has been received.

Finally the LLC_INT bit should be acknowledged in the PCI register.

Interrupt handlers are useful when debugging, for displaying information about received packets and register status. They can also be used to dump the contents of receive buffers.

Name: tilynxIsr
Source: interrup.c
Parameters: OUT: InterruptRecognized – TRUE if interrupt was from this card
            OUT: QueueDpc – TRUE if a DPC should be queued
            IN: Context – Pointer to Adapter object
Return Value: None.
Call Type: NDIS Export
Description:
This is an interrupt handler, however it is only called if an interrupt occurs during tilynxInitialize. This is because the Miniport has been designed so that NDIS handles and identifies the interrupt, then calls tilynxHandleInterrupt. This is done by setting requestIsr = FALSE on the call to NdisMRegisterInterrupt.

This function should only be entered when a bus reset occurs, as this is performed during initialisation, thus tilynxIsr does not process packet receives. It must still, however, check for and acknowledge the same interrupts as tilynxHandleInterrupt. This is for robustness, as one can never be entirely sure what might cause the interrupt. If a rogue interrupt were to enter tilynxIsr and were not acknowledged, the machine would enter an infinite loop.

Therefore see tilynxHandleInterrupt for the interrupt checking code. However, there are extra functions which must be performed by tilynxIsr, as follows:

Since NDIS has not identified the interrupt, tilynxIsr must check whether the interrupt has actually come from this card. It does this by reading the PCI interrupt register. If INT_PEND is set, then the interrupt has indeed come from this card, and InterruptRecognized should be set to TRUE; otherwise it should be set to FALSE.

It must also block interrupts (using cardBlockInterrupts) before doing any processing. tilynxDisableInterrupt is called before tilynxHandleInterrupt, therefore tilynxHandleInterrupt need not do this. Equally, after the function is complete, it should call cardUnblockInterrupts, to reenable interrupts on the card.

Finally, set QueueDpc = FALSE, as all processing is performed inside tilynxIsr.

No value is returned.

Name: tilynxTransferData
Source: interrup.c
Parameters: OUT: Packet – To store and pass up the received data.
            OUT: BytesTransferred – Number of bytes in packet
            IN: MiniportAdapterContext – Handle of Adapter descriptor block
            IN: MiniportReceiveContext – Handle for particular receive data
            IN: ByteOffset – Offset in data to start copying from
            IN: BytesToTransfer – Number of bytes to copy
Return Value: Status of operation.
Call Type: NDIS Export
Description:
This is called by NDIS when not all of the received data was copied in the lookahead buffer. This is a translation from the protocol call NdisTransferData.

By design, all data will be copied in the lookahead buffer. Therefore this function is left empty.

Only returns NDIS_STATUS_SUCCESS.

For completeness and robustness, this should eventually be coded to copy the received data into a proper packet descriptor, when time is available to do so.

6.4 Test Process

It was decided that testing be split up into two phases:
• Module testing of the individual ‘C’ functions
• Functional testing, based on the development stages listed in Section 5.6.

Testing can be carried out by including strategically placed debug output strings throughout the code. The debug terminal screen then displays these strings line-by-line, indicating the progress of execution. Contents of variables should also be displayed using these strings, to ensure that data is being processed correctly.

Listing all of these debug commands within this document would be unworkable, however they have been kept within the source code, which should be checked in conjunction with each module’s specification. Note that release versions will not include the debug commands, therefore this will not have an adverse effect on the Miniport’s size.
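One common way to keep debug strings in the source while excluding them from release builds is a trace macro that expands to nothing when debugging is switched off. The sketch below illustrates the idea in portable C99 and is not the project's myDbgPrintf: the names SKETCH_DEBUG, SKETCH_TRACE, sketch_dbg_printf, sketch_last_message and sketch_needs_compat_bits are hypothetical, and the output goes to stderr rather than to a kernel debugger. The traced variable is the revision check from cardInitialize (revisions below 0x02 need extra bits set).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical build switch: define SKETCH_DEBUG as 0 for a release build. */
#ifndef SKETCH_DEBUG
#define SKETCH_DEBUG 1
#endif

/* The last formatted message is kept so the output can be inspected. */
char sketch_last_message[128];

/* printf-style debug output; stderr stands in for the debug terminal. */
void sketch_dbg_printf(const char *fmt, ...)
{
    va_list args;
    assert(fmt != NULL);
    va_start(args, fmt);
    vsnprintf(sketch_last_message, sizeof sketch_last_message, fmt, args);
    va_end(args);
    fprintf(stderr, "%s\n", sketch_last_message);
}

#if SKETCH_DEBUG
#define SKETCH_TRACE(...) sketch_dbg_printf(__VA_ARGS__)
#else
#define SKETCH_TRACE(...) ((void)0)   /* compiles to nothing in release */
#endif

/* Example use: trace the contents of a variable during a revision check. */
int sketch_needs_compat_bits(unsigned revision)
{
    int needs = revision < 0x02u;
    SKETCH_TRACE("revision=0x%02X needs_compat_bits=%d", revision, needs);
    return needs;
}
```

Because the macro removes the call and its format string entirely when SKETCH_DEBUG is 0, the trace statements can stay in the source without adding to the size of a release binary.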
Function tracking is important and can be implemented using debug strings such as:
    “Entering function: tilynxInitialize”
    “Leaving function: tilynxInitialize”

Given that functions are called dynamically by the NDIS wrapper, there are often gaps in execution where one function has finished, and NDIS is performing other operations before the Miniport is called again at a different function. In this respect, the Miniport cannot be tracked through a number of functions at a time. Determining what is happening whilst the Miniport is not executing is extremely difficult. The only method of watching what NDIS is doing is to trace through assembly language line-by-line, which unfortunately gives away little of what is going on. The NDIS wrapper does not output any debug strings to notify the terminal of what processing is currently occurring.

Rather than repeat the actual tests that should be performed, they are listed in the Test Results section in Chapter 7.

All tests are to be carried out using Microsoft WDEB386, as source-level debugging, such as that provided by SoftICE, is most useful for debugging during development, rather than final tests.

Finally, Microsoft have provided a tool-set for testing NDIS Miniport drivers, known as the Windows Hardware Quality Labs (WHQL) NDIS Tester. This should also be used to test the Miniport, and can provide useful information on how well the driver interfaces with the NDIS library. It is also possible to use this program to send test communication between two machines running the NDIS Tester, which of course tests the hardware interface as well.
References
[1] Microsoft, Windows Version 4.0 Device Driver Kit, Microsoft Corporation
[2] Texas Instruments (1997), PCILynx Functional Specification, Rev 1.2, TI
[3] Texas Instruments (1997), PCILynx Functional Specification, Rev 1.2, TI, 6.2.11
[4] Texas Instruments (1997), PCILynx Functional Specification, Rev 1.2, TI, 5.3.2
[5] Texas Instruments (1997), PCILynx Functional Specification, Rev 1.2, TI, p37
[6] Texas Instruments (1997), PCILynx Functional Specification, Rev 1.2, TI, 5.3.1.7

7. TEST RESULTS

This chapter details the results of tests carried out on the project Miniport. These have been carried out as detailed in Section 6.4, using both the WDEB386 debugger and the Windows Hardware Quality Labs (WHQL) test tools. In all cases, two diverse test systems were used to obtain the results, as shown in Table 7-1.

Test System 1                       Test System 2
Windows95 updated to NDIS 4.0       Windows95 OSR2
Pentium 166MHz                      Pentium II-400MHz
TI 1394 PCILynx adapter             TI 1394 PCILynx adapter
Table 7-1 - Test Systems

7.1 Module Tests

This section details the tests of the individual modules developed within the Miniport, and described by the module specifications in Section 6.3. They are performed by installing the Miniport in Windows, and running the WDEB386 debugger. Two machines are required, one running the Miniport, and one viewing the debugger output, as described in Section 5.5.

Error-checking has been implemented throughout the source code, such that the Miniport will always return a fail where problems occur. Notification of the error is also displayed to the debugger terminal.

Only one PC was tested at a time, not connected to the other PC over the 1394 bus. This is because these module tests are only to test the source, and identify any errors in the path through the source. If there is a return value, modules should return NDIS_STATUS_SUCCESS. Actual functional tests, such as communication between nodes, will be tested later.
Due to the nature of drivers, these modules are not called by a “main” function, therefore the tests proceeded by causing situations where the function would be called by NDIS. The developer cannot simply dictate the module that should be run next.

The format of these results is as shown in Table 7-2.

PASSED/FAILED   Function Name
A summary of the functionality performed by the module. This does not describe the entire functionality performed, but identifies some of the important operations.
Table 7-2 - Module Test Results Format

The following results were obtained:

PASSED   DriverEntry
Returns NDIS_STATUS_SUCCESS. The Miniport is properly registered with the NDIS wrapper, identifying that its characteristics are deemed acceptable for loading.

PASSED   tilynxInitialize
Returns NDIS_STATUS_SUCCESS. The Miniport reported itself as 802.3 media type, which was accepted by NDIS. The card has been identified by NDIS, through Plug and Play, and NDIS has reported the memory ranges and IRQ line that the card uses. The call to tilynxSetupAdapter has succeeded in setting up the hardware and all other resource allocation.

PASSED   tilynxSetupAdapter
Returns NDIS_STATUS_SUCCESS. Allocations have been successful. These allocations include mapping the register space in order that the card can be read, allocating memory for receive/transmit and registering the interrupt. The call to cardInitialize was successful, which configured the 1394 hardware. A ring of receive memory has been set up, and the card is ready to receive data.

PASSED   tilynxFreeResources
Tested by shutting down Windows, thus unloading the driver. Is called, and deallocates resources, producing no error messages.

PASSED   tilynxHalt
Is also tested by unloading the driver. This calls tilynxFreeResources, which has been tested and passed.

PASSED   tilynxReset
Returns NDIS_STATUS_SUCCESS. Very simple function, which writes one bit to hardware.

PASSED   tilynxCheckForHang
Always returns FALSE.

PASSED   myDbgPrintf
Outputs to the debugger terminal. Is already deemed to work, as it is used during tests.

PASSED   cardInitialize
Returns NDIS_STATUS_SUCCESS. Inputs settings to the 1394 hardware register-space. Successfully causes a bus reset, and obtains a unique 1394 ID. Successfully sets up a ring of receive memory.

PASSED   setupReceivePcls
Successfully sets up a PCL queue for receives. Calls linkPcls to link the queue to the card.

PASSED   linkPcls
Successfully links the PCL to a DMA channel on the 1394 hardware. Debug output displays the PCLs correctly, as required.

PENDING  tilynxSend
Due to a problem with protocol communication, this has yet to be tested. However, a similar replacement send function has been created and tested. See Section 7.3.3.

PASSED   setupTransmitPcls
Although not called through a protocol send as yet, this has been tested via a test send function. PCLs are correctly created, and displayed to the debug terminal during the linkPcls function.

PASSED   tilynxQueryInformation
Returns NDIS_STATUS_SUCCESS. On loading, NDIS queries a number of OIDs, which are returned successfully.

PASSED   tilynxSetInformation
Returns NDIS_STATUS_SUCCESS. After OIDs have been queried, NDIS attempts to set various characteristics through this function. These details are stored, and displayed to the debugger correctly.

PASSED   tilynxEnableInterrupt
Interrupts are enabled on the card. This is established by adding a device to the 1394 bus. A bus reset interrupt is generated.

PASSED   tilynxDisableInterrupt
Interrupts are disabled on the card. This is established by adding a device to the 1394 bus. A bus reset interrupt is not generated.

PASSED   tilynxHandleInterrupt
Interrupts are acknowledged. A large amount of card register information is displayed to the debugger. This can be induced by causing a bus reset.

PASSED   tilynxIsr
Interrupts are acknowledged. A large amount of card register information is displayed to the debugger.
This is automatically induced during cardInitialize which forces a bus reset. PASSED tilynxTransferData Merely outputs a debugger message. 90 7.2 Functional Tests These tests have been specified by the developer, in order that the main functionality required of the Miniport can be tested. They check the more general functionality required of the driver, rather than strict module tests. These might be described as system tests, insofar as a dynamic library can be tested as a whole system. 7.2.1 Resource Allocation/Deallocation All resources were allocated correctly on the test systems, and no failures were notified. Also these resources are correctly deallocated when the driver is unloaded or halted. 7.2.2 Hardware Communication Control of the TI 1394 cards has been achieved. The register-space on the card can be read and written to, thus programming the hardware for operations. Simple tests were performed for read and write access: § § Read the very first quadlet of the register space, which contains the IEEE-assigned vendor and device ID. For the project hardware, this ID is 0x8000104C. Write to the LLC physical (PHY) registers to cause a 1394 bus reset. If the bus reset interrupt is serviced, then the write was successful. More extensive tests are not necessary, as the rest of the functional tests require that the hardware is under control. 7.3 Plug and Play This has been implemented successfully, and no configuration has to be performed either on the physical card, or in Windows. If the TI 1394 card’s memory allocation or interrupt (IRQ) line changes, this is detected on driver loading, and the Plug and Play settings are automatically used. This was tested by introducing new PCI hardware to the system, changing the PCI slot, and removing other PCI hardware. The slot number is not fixed, thus when a card moves it is automatically searched for and found. 
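Both the register read test of Section 7.2.2 and Plug and Play detection depend on the same identification quadlet. As a rough illustration only, the C sketch below splits 0x8000104C into its two halves, assuming the usual PCI layout of vendor ID in the low 16 bits and device ID in the high 16 bits. The function names are hypothetical and do not appear in the project source.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed layout: vendor ID in the low 16 bits, device ID in the high 16 bits. */
static uint16_t vendor_id(uint32_t id_quadlet)
{
    return (uint16_t)(id_quadlet & 0xFFFFu);
}

static uint16_t device_id(uint32_t id_quadlet)
{
    return (uint16_t)(id_quadlet >> 16);
}

/* Returns 1 if the quadlet matches the project hardware ID (0x8000104C), else 0. */
static int is_pcilynx(uint32_t id_quadlet)
{
    return vendor_id(id_quadlet) == 0x104Cu && device_id(id_quadlet) == 0x8000u;
}

/* Formats the ID in the PCI\VEN_xxxx&DEV_xxxx style; returns snprintf's result. */
static int format_hw_id(char *buf, size_t buf_len, uint32_t id_quadlet)
{
    return snprintf(buf, buf_len, "PCI\\VEN_%04X&DEV_%04X",
                    (unsigned)vendor_id(id_quadlet),
                    (unsigned)device_id(id_quadlet));
}
```

Under that assumption the quadlet yields vendor 0x104C and device 0x8000, the same pair that the installation script in Appendix B matches as PCI\VEN_104C&DEV_8000.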
Note that Plug and Play is not supported in NT4, thus when a card is installed, the slot number must also be written as a registry entry by the installation script. The IRQ and memory resources, however, do not need to be set manually.

7.3.1 Integration into Windows Control Panel

The Miniport has been successfully integrated into the Windows device manager, such that it can be viewed in the Control Panel, and also configured in the Network properties applet. It is identified in the device manager by a unique name as part of the Network device section, as shown in Figure 7-1.

Figure 7-1 - Windows Device Manager

The Miniport, and the 1394 bus network, can be set up by system administrators using the Network properties applet, as shown in Figure 7-2. From here the administrator can bind the chosen protocol drivers, such as TCP/IP or IPX/SPX, to the 1394 hardware.

Figure 7-2 - Network Properties Applet

7.3.2 Self ID Configuration

The Self ID capabilities of the Miniport and bus were tested, to ensure that each node selects a unique 1394 ID for itself. This was tested by continually plugging and unplugging the nodes on the bus, to induce bus resets. The interrupt service routine subsequently displayed the new 1394 ID, which proved to be unique in every case.

7.3.3 Transmission Between 1394 Nodes

Test packets were set up and successfully transmitted between two machines running the Miniport driver. Two transaction types were tested, as they are the required types identified in the design:

- Asynchronous Write with quadlet payload – required for building the 1394/MAC translation table.
- Asynchronous Write with block payload – required for sending the data passed down by the protocol drivers.

A test system was implemented within the Miniport for this purpose. One machine is set as the sender, and the other is set as the receiver. This distinction is made using the Control Panel network applet, with a parameter called TestSend.
If this parameter is set to 1, this machine will send a packet to the other node.

Quadlet writes were tested by constructing 1394 packets with a quadlet payload of 0xDEADBEEF. A ‘C’ function named createTestSendQuad was developed for this purpose, and can be found in send.c in the source code [1].

Block writes were tested similarly, by constructing 1394 packets with a block payload. This time, the payload consisted of 0xDEADBEEF, repeated throughout the payload. Data sizes from one quadlet to 256 quadlets (the S200 limit) were tested. A test function named createTestSendPacket was created, which can also be found in send.c.

In both cases, the tests were performed in both directions, by using each PC in turn as the sender, and addressing the packet to the receiver node’s 1394 ID. The test packets were successfully received at the receiver node. The packets were viewed by dumping the receive memory to the debug terminal whenever a packet received interrupt is generated. The memory dump could be seen to be exactly the same as the sent packet.

Packet addressing was tested by sending packets addressed to a 1394 ID that is not on the bus. It was found that these packets were not picked up by either node, thus explicit addressing was deemed to work properly. Bus resets were also induced to force a 1394 ID change, after which packets could still be received. Therefore the receive comparators have been properly set, and the Miniport will continue to work after ID changes.

These results are very important, and signify that proper 1394 communication has been achieved. Thus the Miniport has successfully acquired full control over the hardware and the 1394 bus, and is ready to place frames passed down by the protocols into the block payload.

7.3.4 Communication with Protocol Layer

Communication between the Miniport and the protocol drivers above uses the NDIS wrapper as an intermediary, however for simplicity we will assume there is a direct link herein.
This communication usually takes the form of either status queries (object identifiers) or actual packet transfer. Status functionality was found to be implemented successfully, however a problem was identified with other protocol communication.

Object Identifiers (OIDs) are well documented in the NT 4.0 DDK [2], but are basically used by the protocol drivers to ask for certain information from the Miniport, such as the MAC address, or the largest acceptable packet size. On loading the Miniport, a number of OID queries and sets are performed by the protocol drivers, and these complete successfully. The information queried and set is output to the debug terminal to give an idea of what information the protocol is interested in.

However, once the OID stage is complete, a bug was found that causes a Page Fault or Windows protection error. This proved to be a major stumbling block in the project development, and is as yet unsolved. The problem in finding the cause of this is that it is occurring outwith the Miniport, and thus there is no information available in the debugger as to what the system is currently doing. If the problem had been within the Miniport, the source could have been traced using SoftICE until the Page Fault occurred, however it is being thrown up by some other driver or application.

Tracing through assembly language with SoftICE indicates that this problem is occurring after a call to NdisSend, which should eventually be passed on to tilynxSend. However, tilynxSend is never entered, therefore the page fault must be occurring within NDIS. A great deal of time has been spent in fixing this, however nothing has been found yet due to the lack of debug information. It is envisaged that the problem can be fixed in a matter of days or possibly up to two weeks, as fault-finding had to be postponed for the preparation of this document.
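The OID exchange described in this section can be pictured as a simple dispatch on the identifier being queried. The sketch below is a self-contained illustration in C and is not the project's tilynxQueryInformation: the OID names and values, the adapter structure and the status codes are all hypothetical stand-ins for the definitions supplied by the DDK headers.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical status codes standing in for the NDIS ones. */
typedef enum { Q_SUCCESS = 0, Q_NOT_SUPPORTED = 1, Q_BUFFER_TOO_SHORT = 2 } qstatus_t;

/* Hypothetical identifiers; real OID values are defined by the DDK. */
enum { SKETCH_OID_MAC_ADDRESS = 1, SKETCH_OID_MAX_FRAME_SIZE = 2 };

/* Minimal adapter state that a protocol driver might ask about. */
typedef struct {
    uint8_t  mac[6];
    uint32_t max_frame_size;
} adapter_t;

/* Copies the requested item into buf; *written gets the bytes copied or needed. */
static qstatus_t query_information(const adapter_t *a, int oid,
                                   void *buf, size_t buf_len, size_t *written)
{
    const void *src;
    size_t len;

    switch (oid) {
    case SKETCH_OID_MAC_ADDRESS:
        src = a->mac;
        len = sizeof a->mac;
        break;
    case SKETCH_OID_MAX_FRAME_SIZE:
        src = &a->max_frame_size;
        len = sizeof a->max_frame_size;
        break;
    default:
        return Q_NOT_SUPPORTED;      /* unknown OID: tell the caller so */
    }

    *written = len;
    if (buf_len < len)
        return Q_BUFFER_TOO_SHORT;   /* caller can retry with a larger buffer */

    memcpy(buf, src, len);
    return Q_SUCCESS;
}
```

The answers a handler of this kind gives are also a lever over protocol behaviour, since a protocol driver decides what to do next from what the Miniport reports.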
In the meantime, the problem can be bypassed by providing certain OID information to the protocol drivers, in order to cause the protocols not to send any data. For instance, by having the Miniport report that it does not support broadcast transmits, TCP/IP will not send any data, however the driver still loads and is bound to the TCP/IP drivers. This is useful, but ultimately the communication with the protocols must go past this current stage.

The Miniport was tested by binding with TCP/IP, IPX/SPX and NetBEUI, producing the same results. The problem is common across all of these protocols.

7.3.5 MAC/1394 Translation Table Mechanism

The code for creating and checking the translation table has been provisionally created, and should be ready for such a time when the protocol drivers pass a frame down. The code for sending the small packets has been based on the createTestSendQuad function used for testing quadlet writes, and therefore should be working. Until the Page Fault problem is bypassed, this functionality cannot be tested, due to the fact that no packets are passed down from the protocol drivers.

7.3.6 Broadcast Mechanism

Code for implementing the broadcast mechanism outlined in Section 5.2.10 has been provisionally written, and can be found in the tilynxSend function. Given that only two cards have been provided for use with the project, it will be difficult to test this functionality fully. However, the code is in place and ready for such a time when the Page Fault is fixed.

7.4 Driver Resource Usage

Some tests were performed on the Miniport to find information on resource usage. These were performed on a release version compile, thus the debug commands are not present. The results are shown in Table 7-3.
File Size                            26,080 Kbytes
SoftICE-reported memory footprint    26 Kbytes
Virtual memory allocated             1,104 Bytes
Page-locked memory allocated         4,448 Bytes
Total allocated memory               5,552 Bytes
Table 7-3 - Resource Usage

It can be seen that the dynamically allocated memory, for various uses within the Miniport, amounts to only 5,552 bytes, and 1,104 bytes of this can be paged out.

7.5 WHQL Test Results

Finally, the WHQL NDIS Tester was used to verify the Miniport. The full detailed test logs can be found on the companion CD-ROM [3], however a summary is included here in Table 7-4.

Test Performed              Total Variations   Passed   Failed
Master client test                         5        5        0
Open/close adapter tests                  45       45        0
Query OID information                     28       28        0
Set lookahead size                        58       58        0
Set multicast address                      2        1        1
Set packet filter                        320      160      160
Query OID statistics                      33       33        0
Totals                                   491      330      161
Table 7-4 - WHQL Results

The send and receive WHQL tests have not been performed, due to the page fault problem when starting send transactions. The 161 failures that have been identified by the tests are related to multicast support. This is because the Miniport is provisionally reporting that multicast is not supported, in order to bypass the page fault on Windows loading. In eventual implementations, the Miniport will report that multicast is supported, at which time all of the 161 variations will pass.

References
[1] Companion CD-ROM, path ‘\source\’
[2] Microsoft (1996), Windows NT Version 4.0 Device Driver Kit: Network Drivers Design Guide, Microsoft Corporation, 2.4.6
[3] Companion CD-ROM, path ‘\whql\’

8. DISCUSSION AND CONCLUSIONS

This final chapter takes stock of the achievements of the project, and discusses the implications both now and in the future.

8.1 Evaluation of Test Results

Given the results documented in Chapter 7, it can be seen that a great deal has been achieved in the time available. Taking first the required stages of development which were identified in Section 5.6, the current project status is shown in Table 8-1.
Development Stage                              Status
Skeleton Miniport                              Achieved.
Basic communication with hardware              Achieved.
Plug and Play                                  Achieved.
Integration into Windows and Control Panel     Achieved.
Self ID process                                Achieved.
Transmission between 1394 nodes                Achieved.
Protocol layer communication                   Pending.
MAC/1394 Translation                           Pending / Ready
Broadcast mechanism                            Pending / Ready
Table 8-1 - Stage Completion Status

A Miniport has been developed, which was a difficult process in itself, and it performs a great deal of functionality. A major step was to obtain full control over the TI 1394 hardware, and packets can be sent between nodes. However, the NDIS problem has temporarily halted actual packet transfer between the protocol drivers at two nodes. Finally, proprietary and unique methods have been conceived by the developer, to achieve a working relationship between 1394 and computer networking. Although these have not yet been tested, provisional code has been written, and the Miniport should be quite ready for such a time when the NDIS problem is overcome.

To look next at the Requirements Specification of Chapter 2, these goals were of a higher level than the actual implementation, and would be more apparent to the user. These requirements were as follows:

8.1.1 Operating System

The operating system decision lay with Windows95, and this has been the path followed throughout the project. The Miniport is Windows95-ready, both with the first release, and the later OSR2 release. However, during development care was taken to ensure compatibility with other Windows versions, and the Miniport should work in Windows98 and in NT from v4.0 onwards. These have not been tested yet, as no machines were readily available, however they will be tested in the near future. It is a possibility that the problem with NDIS may be non-existent on these systems, as NDIS behaves differently in different versions.
Secondly, NT and Windows98 provide more debug information for NDIS drivers than Windows95, and it is envisaged that it will be easier to identify the cause of the problem with either of these operating systems.

8.1.2 Hardware

Control has been gained over the BTL-supplied hardware.

8.1.3 Transport Protocol

Although not yet fully working, the Miniport has been developed as protocol-independent, thus any future Windows protocol chosen by BTL will be supported. The Miniport can be bound to all Windows protocols, and communicates its characteristics to these drivers properly. However, the NDIS problem has halted further binding with these protocols. The aforementioned tests on Windows98 and NT should provide a route to the resolution of this.

8.1.4 Software

The required System Administrator interactions have been achieved. The Miniport is Plug and Play ready, and can be installed and set up in Windows as with any other regular network device. The user-level interactions, however, cannot be tested, due to the NDIS problem between the protocol and the Miniport.

8.1.5 System Resources

Resources have been consciously kept to a minimum, such that the Miniport makes a very small footprint in Windows (see Section 7.4).

8.1.6 Speeds

The Miniport is adaptable to support any current 1394-specified bandwidth, and can also support all future bandwidth changes.

8.1.7 Schedule

As noted in the requirements specification, the schedule has extended slightly beyond that of April 1999. As the problem is a bug, and it is believed that the code is otherwise complete, the new completion date should be within two weeks.

8.1.8 Budget

The entire project has been completed at no cost to the customer, BTL, other than the loan of one computer and two 1394 devices. The developer has also spent nothing in achieving these results, other than development time.
8.2 Evaluation of 1394 and Networking

No direct measurements of a 1394 network’s capabilities have been taken as yet, and these will be available as soon as the protocol drivers are able to pass packets down to the Miniport. However some observations can be made based on the work carried out, and these should stand in an eventual 1394 network. Actual tests will be useful to check that there are no slight differences to the envisaged data rates, but these should be small if any, as the developer has endeavoured to learn the characteristics of the 1394 bus, and has taken all of this into account.

A key point regarding 1394 and networks is the data rate: the test system will be capable of 200Mbps, and up to 400Mbps if the latest TI hardware is used. Secondly, 1394 uses a fair arbitration mechanism that will have a very positive effect on the network traffic and actual data rates. To compare this to Ethernet, most adapters for this media are capable of 10 or 100Mbps data rates, and use the collision detection mechanism for bus access. Due to collision detection, when bus traffic on an Ethernet network is high, there can be severe consequences to the data rates achieved. 1394, however, avoids this with its fair arbitration, and should be able to cope well with any amount of traffic, up to the available bandwidth of 200 or 400Mbps, and beyond.

Although 200Mbps is double the rate achievable with normal Ethernet hardware, it may not be a significant enough improvement for companies to overhaul their existing networks. At present this would be costly, as 1394 adapters are relatively expensive, but 1394 hardware manufacturers promise cheap interfaces in future, and if adopted by Intel as seems likely, we may see 1394-ready motherboards in our PCs quite soon. Thus there would be no cost in upgrading networks to 1394 from legacy media, and the benefits could be achieved immediately.
Coupled with this, given the eventual results from this project, existing software will continue to work when the network is changed, thanks to the NDIS concept.

Implications for the home user are slightly different, and consumers may be more inclined to use the built-in 1394 plug that they are already using to connect to their stereo and so on. They would have no reason to stick to old media, as they do not have a large installation base of legacy hardware.

1394 is not entirely advantageous for networking however, as lengths between nodes are currently limited to 4.5m. It may be that an environment is sufficiently dispersed such that longer distances are required. There are two mitigating arguments: repeaters can be used to extend the distance between nodes, and perhaps more attractively, work is ongoing to increase these lengths. Without these improvements there may be problems with the adoption of 1394 as a networking medium, and the limitation must be addressed. Secondly, the number of nodes per bus is limited to 63, and this may not fulfil requirements in an office, however if bus bridges are sufficiently cheap this should not pose such a problem. Coupled with this, there are arguments for limiting the number of nodes per bus, such as not consuming too much bandwidth by overloading the network.

The method in which this project has implemented 1394 networking should not cause any adverse effect. Two issues are the encapsulation of Ethernet frames, and the broadcast method. Ethernet encapsulation means that there are an extra 14 bytes per packet. With packet data payloads of 1024 bytes at S200, the extra 14 bytes should not adversely affect the bandwidth consumption. If Ethernet frames were consistently small, this may have some effect if there is high bus traffic. In a busy network, the 14 extra bytes might be useful if freed up for less redundant data, however smaller packets could be easily copied together into larger packets to reduce the overhead.
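The encapsulation cost discussed above can be put into rough numbers. The short C sketch below is illustrative only: overhead_percent is a hypothetical helper, and the 64-byte case is an assumed example of a small packet rather than a figure measured in the project.

```c
/* Share of a packet payload taken up by a fixed header, as a percentage.
   'header_bytes' is counted as part of 'total_bytes'. Returns 0 for an empty packet. */
static double overhead_percent(unsigned header_bytes, unsigned total_bytes)
{
    if (total_bytes == 0)
        return 0.0;
    return 100.0 * header_bytes / total_bytes;
}
```

Under these assumptions, overhead_percent(14, 1024) comes to roughly 1.4%, while overhead_percent(14, 64) comes to roughly 21.9%, which is why consistently small frames are the case of concern.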
A way round this problem would be to have native 1394 support in the Windows NDIS protocol drivers, an area which will hopefully be investigated for future versions of Windows.

The broadcast method, including the building of the MAC/1394 translation cache, has been investigated and the effect on bus traffic has been shown to be very small. However, if this is still regarded as detrimental, the other possibilities for broadcast outlined in Section 5.2.9 can be investigated, that is using either isochronous traffic or more intelligent asynchronous delivery paths.

Finally, and this applies to any 1394 network implementation, there is the overhead from the 1394 packet header and CRCs. The asynchronous block write header contains 16 bytes, with two CRCs of 4 bytes each. Thus each data payload is accompanied by 24 bytes of 1394 protocol information. To compare this to an Ethernet frame, data is accompanied by 26 bytes of preamble, header and CRC, or 38 bytes if the 96-bit inter-frame delay is also counted. The delay between 1394 packets is determined by a ‘gap time’, and this differs depending on node proximity, whereas Ethernet implements a worst-case delay for each frame. Without measuring a physical implementation, it is not possible to predict the delay between 1394 packets, however it can be seen that the 1394 packet header is likely to consume less bandwidth than the Ethernet frame information.

It may be useful to also investigate and compare with other network media, however Ethernet is a good basis for a comparison, and the real proof of 1394’s suitability will come in the eventual 1394 network.

8.3 Commercial Implications

The finished device driver will be an important achievement, and will bring a useful new application to the home user and the office. Given that British Telecom Research Labs (BTL) sponsored the project, the implications for them must be considered.
The LAN team at BTL specified the project as a feasibility study of the 1394 bus, and will be able to perform a thorough analysis based on the results. They have the ability to recommend BT-wide adoption of 1394 if deemed appropriate, and ultimately this could result in the encouragement of adoption by BT’s many international customers. The possible deployment of 1394 is immeasurable, and BT have the strength to at least partly dictate this, and it is projects such as this which are needed for BT to make such informed decisions. Although some of the discussion within this chapter has mentioned improvements which are forecast for the future of 1394, such as increased transmission distances, that does not mean that it should not be investigated until such a time when these improvements are made. BT, as a major international player in data communications, must be at the forefront of promising new technologies and if 1394 does take off commercially in the future, BT will be ahead of the game. The current status of the project should provide the LAN team with information for digestion, and when a working implementation is achieved, they will be able to perform a full study, using the techniques they use for other network media. Finally, although it would be difficult to sell the Miniport as a product to the public, it may be that device and motherboard manufacturers that utilise the TI chipset would be interested in adopting the software. Consumers would tend not to consider a device driver as something for purchase, and instead would expect that the hardware manufacturers or operating systems provide the necessary support for their hardware at no cost. The Miniport would need further development and modification for this to happen, but given the wide range of devices based on the PCILynx chip, there would be many potential customers. 
8.4 The Next Stage

The most pressing requirement for further work is currently the elimination of the aforementioned NDIS page fault problem. The next step in combating this would be to start testing with NT or Windows98. Technical support from Microsoft might also be useful, for which a support contract would have to be purchased.

That aside, some areas for future investigation have been identified. None of these are necessary to the project; they are merely included as ideas for the future. These are as follows:

- Implementation of bus management – A primary reason for this would be to make possible different speed capabilities on one bus. Through management different speeds can co-exist, without reducing the entire bus to the lowest speed.
- Isochronous – It may be useful to implement isochronous capabilities if quality of service (QoS) is required. This would require that the protocol drivers attach a QoS requirement to packets.
- Broadcast – The current method was chosen mostly for simplicity. It could be improved by using the discussed parent-child method or by implementing isochronous transfers.
- Tweaking – Software settings such as buffer sizes, FIFO thresholds or cache alignment could be fine-tuned to gain the best performance.
- Full analysis – Packages are available which could fully test the network capabilities, data rates, and performance under stress. Performance should also be tested using different protocols, IPX/SPX etc, and under different traffic characteristics or with many nodes on the bus.
- IPover1394 – This could be supported when the standard eventually becomes available. It will only, however, work with p1394a hardware, and will limit the protocol to only IP. There is something to be said for conforming to standards, however it would require a great deal of consideration given these limitations.
- Investigation of 1394 internetworks.
- Adaptation for other 1394 hardware – This should be painless due to the nature of the Miniport, especially given that a great deal of hardware is based on TI chips.
- Linux – An invitation has been received from the founder of the Linux 1394 project, to implement their Texas Instruments hardware support.

8.5 Conclusions

The project has proved to be very interesting, and a great deal has been learned in the process of development. The experience gained in NDIS and 1394 is valuable as these subjects are not covered in University coursework, and were not covered during the work placement at BTL. The conception of proprietary techniques for implementing otherwise non-existent procedures was found particularly challenging and fulfilling, as was the prospect of commercial adoption.

Although the project required skills in many different areas of technology, all work was carried out independently and successfully managed by the developer, and the resultant product has brought together all of these technologies. No funding or technical assistance was required of the customer, for what turned out to be a far more complex project than originally envisaged by BTL. The one outstanding bug should be fixed very soon, and although it is expected that the developer will complete the project, the documentation is sufficiently full that a new developer can pick up from here. Finally, the project management skills gained have ensured the progress of the work in the short time available, and will be very useful for future work.

APPENDIX A - Windows95 and NDIS 4.0

If support for NDIS v4.0 Miniports is required in Windows95, the NDIS wrapper must be updated to version 4.0, as Windows95 comes with a v3.1 NDIS wrapper as standard. This process is not documented by Microsoft, but is widely accepted by the NDIS developer community. This merely involves replacing the NDIS library system file NDIS.VXD with the NDIS 4.0 version.
The replacement can be easily achieved by installing the Microsoft DUN (Dialup Networking) update v1.3, available from the Microsoft web site, which contains the new NDIS.VXD. This is only necessary for the “off-the-shelf” release of Windows95, as NDIS 4.0 is included in later updates, starting from Windows95 OSR2. Alternatively the NDIS.VXD file included in OSR2 can be copied to the older Windows95 system.

APPENDIX B - Miniport Installation in Windows95

The installation of the project Miniport is much simplified by the Plug and Play capability. The process occurs as follows:

- With the PC switched off, install the TI 1394 hardware into a spare PCI slot.
- Boot the PC into Windows95. Windows95 discovers the hardware and asks for a disk containing the device driver.
- Insert the CD-ROM containing the Miniport (tilynx.sys) and the installation script (tilynx.inf). There are two versions, debug and release, in separate directories.
- The Miniport is installed, as well as any necessary Windows system files for Windows networking.
- The device should now appear in Control Panel Device Manager.
- Set up the network using the Network applet of Control Panel. Bind the required protocol and set up the parameters such as IP address for TCP/IP.

No hardware information has to be configured by the user, as this is taken care of by Plug and Play. This process is common for all network device drivers, and as such should fit into any company administration procedures.

This is achieved through the installation script, tilynx.inf, developed specifically for this project, which sets parameters such as the requirement of NDIS v4.0, and writes the necessary registry entries. It allows for any protocol to be bound to the TI hardware, identified as an attractive project capability.
The contents of this installation script follow: ; Windows 95 Install script for Texas Instruments PCILynx 1394 Network Interface Card ; Requires NDIS 4.0 thus MS DUN update 1.3 or OSR2 must be installed ; Kelvin Lawson 4/99 <[email protected]> [Version] Signature=$CHICAGO$ Class=Net Provider=%MS% LayoutFile=LAYOUT.INF [DestinationDirs] DefaultDestDir=11 CopyDriverFiles =11 [Manufacturer] %TI%=TI [TI] %DeviceName%=TI.Device,PCI\VEN_104C&DEV_8000 [TI.Device] AddReg=tilynx.ndi.reg,tilynx2.ndi.reg CopyFiles=CopyDriverFiles Reboot 103 [CopyDriverFiles] tilynx.sys [tilynx.ndi.reg] HKR,Ndi,DeviceID,,"PCI\VEN_104C&DEV_8000" HKR,,AdapterCFID,,8000104C [tilynx2.ndi.reg] HKR,,DevLoader,,*ndis HKR,,DeviceVxDs,,tilynx.sys HKR,,EnumPropPages,,"netdi.dll,EnumPropPages" HKR,,NetworkAddress,,"A3A3A3A3A3A3" HKR,NDIS,LogDriverName,,"TILYNX" HKR,NDIS,MajorNdisVersion,1,04 HKR,NDIS,MinorNdisVersion,1,00 HKR,Ndi\Interfaces,DefUpper,,"ndis3" HKR,Ndi\Interfaces,DefLower,,"ethernet" HKR,Ndi\Interfaces,UpperRange,,"ndis3" HKR,Ndi\Interfaces,LowerRange,,"ethernet" HKR,Ndi\Install,ndis3,,"tilynx.ndis3" HKR,Ndi,CardType,,"PCI" ; For test purposes - if 1 this will send a test packet ; After installation this parameter can be changed in the Network applet. HKR,Ndi\params\TestSend,ParamDesc,,"TestSend" HKR,Ndi\params\TestSend,default,,0 HKR,Ndi\params\TestSend,min,,0 HKR,Ndi\params\TestSend,max,,1 HKR,Ndi\params\TestSend,step,,1 HKR,Ndi\params\TestSend,base,,10 HKR,Ndi\params\TestSend,type,,int HKR,NDI\params\TestSend,flag,1,20,00,00,00 ; Note: An NT 4.0 .INF script must add registry entries for the PCI SlotNumber. ; This is not necessary in Win9x or Win2K thanks to plug and play. 

[ControlFlags]
CopyFilesOnly=PCI\VEN_104C&DEV_8000

[SourceDisksFiles]
tilynx.sys=1
tilynx.inf=1

[Strings]
MS="Microsoft"
TI="Texas Instruments"
DeviceName="TI 1394 TSB12LV21 PCILynx Network Interface Card"

APPENDIX C - Makefile for NDIS Miniport Compilation

This Makefile can be used to compile the project Miniport for use in Windows 95, NT or 98. The command line options shown select the target operating system and a debug or release build. For Windows 98 either platform option can be used; however, for Plug and Play support WIN95_PLATFORM should be specified. The file is a configuration file for the NMAKE utility, part of Microsoft Visual C++.

#############################################################################
#
# Modified: Kelvin Lawson 2/12/98
#
# Standalone makefile for NDIS .SYS driver
#
# Usage:
# nmake [WINNT_PLATFORM=1 | WIN95_PLATFORM=1] [RELEASE=1]
#
# For nodebug build, add RELEASE=1
# Default is WIN95_PLATFORM=1, debug.
# For extra C defines, add C_DEFINES=-D....
#
# !!! No .h.c dependencies! add makedep or rebuild manually
#
#############################################################################

#---------------------------------------------------------
!if !defined(WINNT_PLATFORM) && !defined(WIN95_PLATFORM)
!message "** BUILDING 95 version by default"
WIN95_PLATFORM=1
!endif

!if !defined(RELEASE)
!message "** CHECKED (debug) build **"
CKFREE=CHECKED
DBG=1
!else
!message "** RELEASE (nodebug) build **"
CKFREE=FREE
DBG=0
!endif

# running on nt or w9x?
!if defined(NUMBER_OF_PROCESSORS)
NULL=
CMD=cmd.exe
!else
NULL=nul
CMD=command.com
!endif

#---------------------------------------------------------
DRNAME=TILYNX

#--- Directories - replace by yours!
DDKPATH=d:\DDKnt40
W95DDK=d:\DDK95
MSVCDIR=c:\progra~1\devstu~1\vc
ICEDIR=c:\siw95

BASEDIR=$(DDKPATH)
NDISINC=$(DDKPATH)\SRC\NETWORK\INC
INC95=$(W95DDK)\inc32
MSVCINC=$(MSVCDIR)\include
DDKINC=$(DDKPATH)\INC

#---------------------------------------------------------
CL=$(MSVCDIR)\bin\cl
RC=c:\mssdk\bin\rc
LINK=$(MSVCDIR)\bin\link
NMSYM32=$(ICEDIR)\nmsym /translate:source,package

NTDDKLIB=$(DDKPATH)\lib\i386\$(CKFREE)
VXDWRAPLIB=$(W95DDK)\lib\vxdwraps.clb

# Modules ----------------------------------------------------------
MODLIST=\
tilynx.obj interrup.obj card.obj send.obj oid.obj tilynx.res

# W95/NT specific:
W95MOD=
NTMOD=

#---------------------------------------------------------
!if defined(WINNT_PLATFORM)
!MESSAGE "****** BUILDING NT version" $(C_DEFINES)
TARGETPATH=.\OUTNT
MODLIST=$(MODLIST)$(NTMOD)
NT95PLATF=-DWINNT_PLATFORM=1
!else
!if defined(WIN95_PLATFORM)
!message "****** BUILDING W95 version" $(C_DEFINES)
TARGETPATH=.\OUT95
MODLIST=$(MODLIST)$(W95MOD)
NT95PLATF=-DWIN95_PLATFORM=1
!endif
!endif

#---------------------------------------------------------
!if $(DBG)
TARGDIR=$(TARGETPATH)D
OBJ=objd
!else
TARGDIR=$(TARGETPATH)
OBJ=obj
!endif

OBJL1=$(MODLIST:[=^$(OBJ^)\)
OBJL2=$(OBJL1:]=.obj )
OBJLIST=$(OBJL2)

#Write and include external file to substitute $(OBJ) in OBJLIST ...
!if [ $(CMD) /c echo > $(TEMP)\mktmp1.tmp OBJLIST=$(OBJLIST) ]
!endif
!include $(TEMP)\mktmp1.tmp

#--- Compile options ------
# -FI: forces #include file
!if $(DBG)
!message "DEBUG build"
CFLAGS=-c -nologo -nologo \
 -I. -I$(DDKINC) -I$(NDISINC) -I$(INC95) \
 -I$(MSVCINC) -FI$(DDKINC)\warning.h -D_X86_=1 \
 -Di386=1 -DSTD_CALL -DCONDITION_HANDLING=1 -DNT_UP=1 -DIS_32=1 \
 -DNT_INST=0 -DWIN32=100 -D_NT1X_=100 -DWINNT=1 -D_WIN32_WINNT=0x0400 -DWIN32_LEAN_AND_MEAN=1 \
 -DDBG=1 -DDEVL=1 -DNDEBUG -DFPO=0 \
 -D_DLL=1 -D_IDWBUILD -DRDRDBG -DSRVDBG \
 -DNDIS_MINIPORT_DRIVER -DBRZWLAN $(NT95PLATF) \
 /c /Zel /Zp8 /Gy -cbstring /W3 /WX /Gz /QIfdiv- /QIf /Gi- /Gm- /GX- /GR- /GF -Z7 /Od /Oi /Oy-

LFLAGS=-map:$(OBJ)\$(DRNAME).map -debug:notmapped,FULL -debugtype:both -PDB:NONE
RCFLAGS2 = -DDBG=1 $(C_DEFINES)
!else
!message "RELEASE build"
CFLAGS=-c -nologo -nologo \
 -I. -I$(DDKINC) -I$(NDISINC) -I$(INC95) \
 -I$(MSVCINC) -FI$(DDKINC)\warning.h -D_X86_=1 \
 -Di386=1 -DSTD_CALL -DCONDITION_HANDLING=1 -DNT_UP=1 -DIS_32=1 \
 -DNT_INST=0 -DWIN32=100 -D_NT1X_=100 -DWINNT=1 -D_WIN32_WINNT=0x0400 -DWIN32_LEAN_AND_MEAN=1 \
 -DDBG=0 -DDEVL=1 -DNDEBUG -DFPO=0 \
 -D_DLL=1 -D_IDWBUILD -DRDRDBG -DSRVDBG \
 -DNDIS_MINIPORT_DRIVER -DBRZWLAN $(NT95PLATF) \
 /c /Zel /Zp8 /Gy -cbstring /W3 /WX /Gz /QIfdiv- /QIf /Gi- /Gm- /GX- /GR- /GF -Z7 /Oi /Oy- /Ob1 /Ot

#LFLAGS=-map:$(OBJ)\$(DRNAME).map -debug:notmapped,FULL -debugtype:both -PDB:NONE
# This strips dbg info!
LFLAGS=-map:$(OBJ)\$(DRNAME).map -debugtype:coff -PDB:$(OBJ)\$(DRNAME).pdb
RCFLAGS2 = -DDBG=0
!endif

# ---------- TARGETS -----------
ALL: $(OBJ) $(TARGDIR) $(TARGDIR)\$(DRNAME).SYS

$(TARGDIR):
    if not exist $(TARGDIR)\$(NULL) md $(TARGDIR)

$(OBJ):
    if not exist $(OBJ)\$(NULL) md $(OBJ)

#------- LNK.RSP --------------
$(OBJ)\lnk.rsp:
    echo Writing: <<$(OBJ)\lnk.rsp
$(LFLAGS)
$(OBJLIST)
-machine:i386
-verbose
-MERGE:_PAGE=PAGE
-MERGE:_TEXT=.text
-MERGE:.rdata=.text
-SECTION:INIT,d
-OPT:REF
-FULLBUILD
-INCREMENTAL:NO
-FORCE:MULTIPLE
-RELEASE
-version:4.00
-osversion:4.00
-subsystem:native,4.00
-optidata
-driver
-align:0x20
-base:0x10000
-entry:DriverEntry@8
-IGNORE:4001,4037,4039,4065,4070,4078,4087,4089,4096
-NODEFAULTLIB
$(NTDDKLIB)\int64.lib
$(NTDDKLIB)\ntoskrnl.lib
$(NTDDKLIB)\hal.lib
$(NTDDKLIB)\ndis.lib
$(VXDWRAPLIB)
<<KEEP

#--- This forces recompile if cc.rsp changes (MS NMAKE feature)
$(OBJ)\*.obj: $(OBJ)\cc.rsp

# ----------Link the driver:
$(TARGDIR)\$(DRNAME).SYS: $(OBJLIST) $(OBJ)\lnk.rsp
    $(LINK) -out:$@ @$(OBJ)\lnk.rsp
    @if not exist $@ echo *** $@ not built! ***

#--------- CLEANUP ------
CLEAN:
    echo Deleting built files
    - del *.obj
    - del $(OBJ)\*.obj
    - del $(OBJ)\*.rsp
    - del $(OBJ)\$(DRNAME).*

#-- SoftIce .NMS file: -------
$(TARGDIR)\$(DRNAME).nms: $(TARGDIR)\$(DRNAME).sys
    - $(NMSYM32) $?

#-------END ------------

APPENDIX D - Companion CD-ROM

The following directories are on the CD-ROM.

Directory    Contents
source       Full project source code
archive      Source code history
docs         Relevant specifications
whql         WHQL NDIS tester results
debug        Debug build Miniport and installer
release      Release build Miniport and installer

Installation of the Miniport is easy using this CD-ROM. When Windows asks for a disk containing the device driver, insert the CD and point it to the \debug or \release directory, depending on which version is to be installed. The rest is automatic.

BIBLIOGRAPHY

The following documents are useful as background information on all aspects of the project work. No sources were found, however, that provide an introduction and instruction guide to NDIS development, which this document has aimed to supply.

Armitage G.J. and Adams K.M. (1995), How Efficient is IP over ATM anyway?, IEEE Network, vol 9, Jan/Feb, pp18-26
Bennatan E.M. (1995), Software Project Management - A Practitioner's Approach, 2nd Ed, McGraw-Hill
Bertsekas D. and Gallager R. (1992), Data Networks, 2nd Ed, Prentice-Hall
Borrill P.L. (1991), Microprocessor Bus Structures and Standards, IEEE Micro Magazine, vol 1, pp 84-95, Feb 1981
Bradley T. and Brown C. (1992), RFC:1293 - Inverse Address Resolution Protocol, IETF
Buchanan W. (1997), Advanced Data Communications and Networks, Chapman & Hall
Buchanan W. (1999), PC Interfacing, Communications and Windows Programming, Addison-Wesley, p648
Conger S. (1994), The New Software Engineering, International Thomson Publishing
Custer H. (1992), Inside Windows NT, Microsoft Press
Day J.D. and Zimmermann H. (1983), The OSI reference model, Proceedings IEEE 71, pp1334-40
Deering S. (1989), RFC:1112 - Host Extensions for IP Multicasting, IETF
Dhawan S. (1995), Networking Device Drivers, Van Nostrand Reinhold
Fair L. and Pearson B. (1998), The Best Way to Implement 1394 Technology - Q&A With Intel's 1394 Experts, Intel Platform Solutions Archive
Goldman J. (1998), Applied Data Communications: A Business-Oriented Approach, John Wiley & Sons, pp203-5
Halsall F. (1996), Data Communications, Computer Networks and Open Systems, 4th Ed, Addison-Wesley
IEEE Std 1394-1995, Standard for a High Performance Serial Bus, IEEE
IEEE Project p1394a, Draft Specification for a High Performance Serial Bus (Supplement), IEEE
IEEE Project p1394b, Draft Standard for a High Performance Serial Bus (Supplement), IEEE
Intel, Fundamentals of Ethernet Technology, Intel Certification Course FN2
ISO (1984), Basic Reference Model for Open Systems Interconnection, ISO:7498
ISO/IEC 13213:1994, ANSI/IEEE Std 1212, 1994 Edition, Control and Status Register (CSR) Architecture for Microcomputer Buses, ISO/IEC
ISO/IEC 8802-2 (1990), ANSI/IEEE Std 802.2 Information Processing Systems - Local Area Networks - Part 2: Logical Link Control, ISO/IEC
ISO/IEC 8802-3 (1990), ANSI/IEEE Std 802.3 Information Processing Systems - Local Area Networks - Part 3: Carrier Sense Multiple Access with Collision Detection (CSMA/CD) Access Method and Physical Layer Specifications, ISO/IEC
James M. and Chapman K. (1989), Local Area Networks Architectures and Implementations, Prentice-Hall
Jennings R. (1997), Fire on the Wire: The IEEE 1394 High Performance Serial Bus, Adaptec Inc.
Johansson P. (1999), Internet Draft: IPv4 over IEEE 1394, IETF
Hoffman G. and Moore D. (1995), IEEE 1394: A Ubiquitous Bus, presented at Compcon'95 in San Francisco, 5 Mar 1995
Laubach M. (1998), RFC:2225 - Classical IP and ARP over ATM, IETF
Lewis G. (1999), FireWire - A Bus for all Systems?, Electronics World (Jan 99)
Mackenzie L. (1998), Communications and Networks, McGraw-Hill
Markley R. (1990), Data Communications and Interoperability, Prentice-Hall
Messmer H. (1997), The Indispensable PC Hardware Book, 3rd Ed, Addison-Wesley
Microsoft (1996), Windows 95 Device Driver Kit, Microsoft Corporation
Microsoft (1998), Windows 98 Resource Kit, Microsoft Corporation
Microsoft (1996), Windows NT Version 4.0 Device Driver Kit, Microsoft Corporation
Microsoft (1999), Plug and Play Design Specification for IEEE 1394, Microsoft Corporation
Microsoft, PE General Concepts, Microsoft Corporation
Norton D. (1992), Writing Windows Device Drivers, Addison-Wesley
PCISIG (1995), PCI Local Bus Specification Revision 2.1, PCI Special Interest Group
Pearson B. (1998), USB and 1394, Intel Platform Solutions Archive
Pietrek M. (1994), Peering Inside the PE: A Tour of the Win32 Portable Executable File Format, Microsoft Systems Journal (March 1994)
Plummer D. (1982), RFC:826 - An Ethernet Address Resolution Protocol - or - Converting Network Protocol Addresses to 48.bit Ethernet Address for Transmission on Ethernet Hardware, IETF
Postel J. and Reynolds J. (1988), RFC:1042 - A Standard for the Transmission of IP Datagrams over IEEE 802 Networks, IETF
Rose M. (1990), A Practical Perspective on OSI, Prentice-Hall
Schildt H. (1995), 'C' The Complete Reference, 3rd Ed, Osborne
Sommerville I. (1992), Software Engineering, Addison-Wesley
Tanenbaum A. (1996), Computer Networks, 3rd Ed, Prentice-Hall London, pp420-3
Tanenbaum A. (1990), Structured Computer Organization, Prentice-Hall
Texas Instruments (1998), 1394 Technical Overview, TI
Texas Instruments (1997), Data Transmission Seminar, TI
Texas Instruments (1998), Lynxsoft 1394 Software Application Programmer User's Guide - SLLU003 v2.2, TI, Chapter 6
Texas Instruments (1997), PCILynx Functional Specification, Rev 1.2, TI
Thielen D. and Woodruff B. (1993), Writing Windows Virtual Device Drivers, Addison-Wesley
Truong H. et al (1995), LAN Emulation on an ATM Network, IEEE Communications, vol 33, May, pp70-85
Wickelgren I.J. (1997), The Facts about FireWire, IEEE Journal "Spectrum" (April 1997), pp19-25