================================================================================
                   Covert channels through the looking glass
                              v1.0 - October 2005
                                Gray-World Team
                           http://www.gray-world.net
================================================================================

================================================================================
This paper was originally released at Hitchiker's World Issue #10 (have a look
at http://www.infosecwriters.com/hhworld/).
================================================================================

================================================================================
Copyright (c) 2005, Gray World Team.

Permission is granted to copy, distribute and/or modify this document under the
terms of the GNU Free Documentation License, Version 1.2 or any later version
published by the Free Software Foundation; with the Invariant Sections being
LIST THEIR TITLES, with the Front-Cover Texts being LIST, and with the
Back-Cover Texts being LIST.

You should have received a copy of the license with this document and it should
be present in the fdl.txt file. If you did not receive this file or if you
don't think this fdl.txt license is correct, have a look at the official
http://www.fsf.org/licenses/fdl.txt licence file.
================================================================================

=======
SUMMARY
=======

INTRODUCTION
1. COVERT CHANNELS THEORETICAL CONCEPTS
   1.1 Academic definition
   1.2 Covert channels types
   1.3 Covert channels parameters
   1.4 Covert channels and steganography
   1.5 Network covert channels
2. NETWORK COVERT CHANNELS IN A GRAY WORLD: WHY AND WHO ?
3. COMMUNICATION CONCEPTS FOR NETWORK COVERT CHANNELS IMPLEMENTATIONS
   3.1 Control / Data Channels
   3.2 Multiplexing / Demultiplexing
   3.3 Communication architectures
   3.4 Communication models
4. PLAYING THE GAME VS THE DETECTION TEAM
   4.1 Confusing the analyst with multiple sources and destination
   4.2 Keeping a low profile
   4.3 Live but let die
5. PRACTICAL IMPLEMENTATIONS
6. PRACTICAL IMPLEMENTATIONS: ACTIVE PORT FORWARDER
7. PRACTICAL IMPLEMENTATIONS: SKEEVE
8. PRACTICAL IMPLEMENTATIONS: MSNSHELL
9. PRACTICAL IMPLEMENTATIONS: SOCKSTER & TRAPSTER
WEBOGRAPHIE

================================================================================

============
INTRODUCTION
============

"Soon her eye fell on a little glass box that was lying under the table: she
opened it, and found in it a very small cake, on which the words "EAT ME" were
beautifully marked in currants. "Well, I'll eat it," said Alice, "and if it
makes me grow larger, I can reach the key; and if it makes me grow smaller, I
can creep under the door: so either way I'll get into the garden, and I don't
care which happens!" - Lewis Carroll, Alice In Wonderland.

We started out co-writing a short paper about network covert channels and ended
up with the one you are reading. Parts 1 to 4 cover concepts, ideas and food
for the mind; the following parts describe the toys we published, because the
mind has to play sometimes. Enjoy.

================================================================================

=======================================
1. COVERT CHANNELS THEORETICAL CONCEPTS
=======================================

A covert channel is a communication channel that is neither designed nor
intended to exist and that can be used to transfer information in a manner that
violates the existing security policy. It can be characterized by several
parameters such as noise, bandwidth and stealthiness.

However, the concept of "covert channel" is difficult to define precisely and
often leads to pedantic discussions about the meaning of "covert" (is it
covert, subliminal, hidden, stealth or do_we_really_care_at_all ?). So let's
review various academic definitions and concepts so that readers can become
familiar with the theory of covert channels and decide for themselves.

1.1 Academic definition ----------------------- The covert channel concept first seems to appear [NCSC_1993] in [Lampson1973] "a communication channel is covert if it is neither designed nor intended to transfer information at all.". A most common definition states that "a covert channel is a communication channel that allows a process to transfer information in a manner that violates the system's security policy". [DOD_1985], [NCSC_1993], [Rowland_1996]. //------------------------------------------------------------------------\\ [NCSC_1993] also gives specific definitions such as: Definition 2 - A communication channel is covert (e.g., indirect) if it is based on "transmission by storage into variables that describe resource states." Definition 3 - Covert channels "will be defined as those channels that are a result of resource allocation policies and resource management implementation." Definition 4 - Covert channels are those that "use entities not normally viewed as data objects to transfer information from one subject to another." and explains that "given that discretionary models cannot prevent the release of sensitive information through legitimate program activity, it is not meaningful to consider how these programs might release information illicitly by using covert channels." and states that "The dependency of covert channels on the (nondiscretionary) security policy models does not imply one can eliminate covert channels merely by changing the policy model. Certain covert channels will exist regardless of the type of nondiscretionary access control policy used." And for a funny note, look for the "hopefully all" string in [NCSC_1993]. \\------------------------------------------------------------------------// The [WIKIPEDIA_CC] definition tells that a covert channel "is a communication channel that does a writing-between-the-lines form of communication [... 
and] is parasitic to its host channel; it reduces bandwidth of the host channel
by reducing the signal-to-noise ratio in the host channel. [...] A covert
channel could be defined as a communication channel that transfers some kind of
information using a method originally not intended to transfer this kind of
information. [And finally] The term is used in the TCSEC specifically to refer
to ways of transferring information from a higher classification compartment to
a lower classification."

1.2 Covert channels types
-------------------------

[DOD_1985] presents two types of covert channels: covert storage channels and
covert timing channels. Moreover, it introduces the notion of measuring the
threat level of a covert channel by looking for covert channel mechanisms whose
bandwidth may exceed a certain number of bits per second.

//------------------------------------------------------------------------\\
"Covert storage channels include all vehicles that would allow the direct or
indirect writing of a storage location by one process and the direct or
indirect reading of it by another." [DOD_1985]

"Covert timing channels include all vehicles that would allow one process to
signal information to another process by modulating its own use of system
resources in such a way that the change in response time observed by the second
process would provide information." [DOD_1985]

And for a funny note, [NCSC_1993] states that "In practice, when covert channel
scenarios of use are constructed, a distinction between covert storage and
timing channels [...] is made even though theoretically no fundamental
distinction exists between them. [...] In this guide, we retain the distinction
between storage and timing channels exclusively for consistency with the
TCSEC."

[CC_Here2Stay_1994] describes that "a storage channel is a covert channel where
the output alphabet consists of different responses all taking the same time to
be transmitted.
A timing channel is a covert channel where the output alphabet is made up of different time values corresponding to the same response. A mixed channel is a combination of the two." \\------------------------------------------------------------------------// 1.3 Covert channels parameters ------------------------------ [NCSC_1993] introduces various distinct parameters to characterize covert channels: Noise, Bandwidth/Capacity, Synchronization and Aggregation. "A covert channel is noiseless if any bit transmitted by a sender is decoded correctly by the receiver with probability 1." [NCSC_1993] Bandwidth is used "to denote the rate at which information is transmitted through a channel. "In a covert channel context, bandwidth is given in bits/second" and "is also related to the notion of 'capacity' [..] maximum possible error-free information rate in bits per second." Note that these two parameters are linked because "error-correcting codes help change a noisy channel into a noiseless one" but "the resulting channel will have a lower bandwidth than the similar noise-free channel". [NCSC_1993] //------------------------------------------------------------------------\\ Also take care because "the bandwidth is a separate characteristic of a continuous channel and the capacity is in fact a function of the bandwidth ! We should not reinvent the wheel and use the standard terminology that already exists." [CC_Here2Stay_1994]. Ok ok, no problem :) Another concept interesting to mention (for history ?) is the Small Message Criterion concept stating that "If one has a very sensitive but short message then the capacity is not a sufficient measure of security. [...] When a covert channel exists in a system, the SMC will give guidelines for what will be tolerated in terms of covertly leaking a short covert message of length n bits in time 'tau' with fidelity of transmission p%. 
The SMC must be used in conjunction with capacity for a full security
analysis/validation of a system". [CC_Here2Stay_1994]
\\------------------------------------------------------------------------//

The synchronization relationship between the two ends of the communication
channel allows one end to notify the other that it has completed reading or
writing data. If sender and receiver exchange synchronization messages in both
directions, synchronization and data messages may be indistinguishable.
[NCSC_1993]

Messages sent or received by the ends of the communication channel may use
multiple data variables that can be used as groups to amortize the cost of
synchronization. Communication "channels may thus be aggregated serially, in
parallel, or in combinations of serial and parallel aggregation to yield
optimal (maximum) bandwidth." [NCSC_1993]

Other un-academic (?) parameters may be used to characterize covert channels:
latency and stealthiness. The Webster's Revised Unabridged Dictionary (1913)
states "stealthiness - the state, quality, or character of being stealthy;
stealth." and the Free On-line Dictionary of Computing states "latency - the
time it takes for a packet to cross a network connection, from sender to
receiver." [2]

Latency and stealthiness obviously depend on the previous parameters, and
tuning them seems more empirical (how to achieve the best latency/stealthiness
trade-off) than measurable (latency level: x%, stealthiness level: y%, luck
level: z%).

1.4 Covert channels and steganography
-------------------------------------

[InternetSteg_ActWard_2002] "[...] describes a subliminal channel as one where
hidden data piggybacks on an innocuous-looking legitimate communication. By
definition, steganographic carriers are subliminal channels since the
communication appears to be innocent but really has ulterior information
embedded below the threshold of perception".

//------------------------------------------------------------------------\\
[InternetSteg_ActWard_2002] also presents models and concepts that let
restricted environments use active wardens to block the creation of subliminal
channels allowing relatively high-bandwidth leakage of information.

It introduces a concept named "Minimal Requisite Fidelity" (MRF) that "defines
the degree of signal fidelity that is both acceptable to end users and
destructive to covert communications" and classifies carriers into two kinds:
unstructured (images, audio or anything needing human interpretation) and
structured (any "well-defined syntax and semantics" instances).

It states that "while there are several techniques currently in use that
reactively attempt to detect steganography in images, this is understandably an
impossible task to complete" and tells that for "unstructured carriers, the
limits to what can be changed [in order to remove opportunity to build covert
channels] are defined by fuzzy notions such as perception".

It also notes that digital watermarking sometimes puts more emphasis on
robustness than on secrecy, so that a watermark is proportionally easier to
detect. So good, isn't it ? The warden will detect a watermark, but will it try
to detect that this watermark itself is a carrier ?
\\------------------------------------------------------------------------//

[Embed_CC_TCPIP_2005] summarizes that "steganography can only be prevented by
detection, not by attempting to remove any hidden information [...]" ("passive
warden threat model") because being active would cost a warden too many
resources in many scenarios. A warden may be active at some low OSI layers, but
doing so at high OSI layers is hardly practical.

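
To make the active warden idea concrete for a structured carrier, here is a
minimal sketch in Python. It is not taken from [InternetSteg_ActWard_2002]: the
Packet type, its "reserved" field and the three helper functions are
hypothetical names chosen for illustration. The sender hides one bit per packet
in a field the protocol ignores, and the warden rewrites that field to a
canonical value, which keeps the fidelity end users need and destroys the
covert message.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Packet:
    payload: bytes      # legitimate data, must survive the warden untouched
    reserved: int = 0   # hypothetical header field that the protocol ignores


def embed(packets: List[Packet], bits: List[int]) -> List[Packet]:
    """Sender: hide one covert bit per packet in the unused field."""
    return [replace(p, reserved=b) for p, b in zip(packets, bits)]


def extract(packets: List[Packet]) -> List[int]:
    """Receiver: read the covert bits back from the unused field."""
    return [p.reserved for p in packets]


def active_warden(packets: List[Packet]) -> List[Packet]:
    """Warden: normalise every field whose value does not matter to users."""
    return [replace(p, reserved=0) for p in packets]


cover = [Packet(payload=bytes([i])) for i in range(8)]
secret = [1, 0, 1, 1, 0, 0, 1, 0]

stego = embed(cover, secret)
seen_without_warden = extract(stego)              # the secret gets through
seen_with_warden = extract(active_warden(stego))  # the secret is wiped
payloads_intact = (
    [p.payload for p in active_warden(stego)] == [p.payload for p in cover]
)
```

The warden never has to detect anything: it simply removes the degrees of
freedom it can afford to remove. The catch, as noted above, is the cost of
doing this for every field of every protocol it lets through.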
1.5 Network covert channels
---------------------------

For a little bit of history, [NCSC_1987] extends [DOD_1985] to "trusted network
systems and components" and cites Padlipsky (1978) and Girling (1987) for
literature references.

Resources for this topic are available in [Rowland_1996] which "details various
weaknesses in the TCP/IP protocol suite" that "allow an attacker to leverage
techniques in the form of covert channels to surreptitiously pass data in
otherwise benign packets.". [PractDH_2002] discusses methods to hide data in
the TCP/IP protocol suite and the [CC_TCPIP_Hdr_2002] presentation deals with
covert channels related to TCP and IP headers.

[Embed_CC_TCPIP_2005] studies "a number of previously proposed schemes for
embedding data within the TCP and IP protocol headers, thus creating a
steganographic covert channel". The paper shows "how the use of these schemes
can easily be detected by a passive warden" and proposes an "alternative method
for embedding data" inside the TCP ISN header field so that "a passive warden
cannot detect the use of this method without knowledge of a secret key, subject
to some realistic constraints."

================================================================================

=========================================================
2. NETWORK COVERT CHANNELS IN A GRAY WORLD: WHY AND WHO ?
=========================================================

The nature of a covert channel obviously depends on what it is to be used for.
For example, there is little point in using a high-bandwidth, low-latency
covert channel if we only need to send 1024 bytes in the next 24 hours, is
there ?

In other words, before selecting any communication channel for covert traffic,
we need to review our goals, look at the available communication channels and
decide which ones offer the best trade-off in terms of
bandwidth/latency/stealthiness.

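
The arithmetic behind the "1024 bytes in 24 hours" example can be sketched as
follows (Python). Everything in the candidate list is an assumption made up for
illustration: the channel names, their bandwidth and latency figures and the
arbitrary "stealth" score are not measurements. The point is only that the
required rate is tiny (8192 bits / 86400 s, under 0.1 bit per second), so the
selection can be driven by stealthiness rather than by bandwidth.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Channel:
    name: str
    bits_per_second: float   # usable covert bandwidth (assumed figure)
    latency_seconds: float   # typical delivery delay (assumed figure)
    stealth: int             # arbitrary score, higher means harder to notice


def required_rate(payload_bytes: int, deadline_seconds: float) -> float:
    """Average bit rate needed to move the payload before the deadline."""
    return payload_bytes * 8 / deadline_seconds


def pick_channel(channels: List[Channel], payload_bytes: int,
                 deadline_seconds: float) -> Optional[Channel]:
    """Keep the channels that are fast enough, then prefer the stealthiest."""
    need = required_rate(payload_bytes, deadline_seconds)
    usable = [c for c in channels
              if c.bits_per_second >= need
              and c.latency_seconds <= deadline_seconds]
    return max(usable, key=lambda c: c.stealth, default=None)


# Hypothetical candidates, fastest first.
CANDIDATES = [
    Channel("bulk HTTP tunnel", 1_000_000.0, 0.5, stealth=1),
    Channel("DNS queries", 100.0, 5.0, stealth=3),
    Channel("one header bit per hourly request", 0.0003, 3600.0, stealth=5),
]

rate = required_rate(1024, 24 * 3600)              # 8192 bits / 86400 s
choice = pick_channel(CANDIDATES, 1024, 24 * 3600)
```

With these made-up figures the slowest channel cannot meet the deadline, so
the stealthiest channel that is still fast enough is selected; the bulk tunnel
is only chosen when the payload or the deadline really requires it.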
Reaching an efficient trade-off when designing how to embed our channel means
answering what we want to do (i.e. who we are and why we want to achieve this
goal) and how we want to do it.

Answering the "who ?" question isn't as obvious as it would seem at first.
There may be as many different 'who' as there are people willing to design
their own covert channel implementation. You may be a legitimate insider who
wants to access external services without bothering with your local area
security policy (for a short period or longer), you may be some not so
legitimate person (thanks for patching my systems so that you're the only one
with access ;)) who wants to keep stealthy remote access to one or several
corporate or personal systems, you may be some fully legitimate person willing
to implement stealth communication methods between systems, or you may be
anyone who is tainted and not so sharp about definitions.

Do we need to answer the "why ?" ? You may want to tunnel some protocol, to
continuously download a very large amount of data (db snapshots ? ;)), or to
protect legitimate communication streams (honeynet ? or "Mr director, I admit
this system wasn't part of our honeynet but still it was a good idea to set up
that communication channel"). You may also answer that "why" by figuring out
some "what" ideas you can think about (we didn't say play, did we ?).

//------------------------------------------------------------------------\\
Stealth Commander

Suppose we have already compromised a host within a remote network. Depending
on our goals, we may want to use a uni-directional communication channel or
bi-directional communication channels. We can use a source -> destination
uni-directional channel if we only need to send commands to the compromised
host (Are you up, Attack, Stop attack). We can use a destination -> source
uni-directional channel if we only need to get data from the compromised host
(such as getting sniffer traces each day).
Or we can finally decide we want full control over the compromised host and
then use a bi-directional channel. For a concrete example of uni-directional
'Are You Up' covert channels, check the [WLAN_STEALTH_2005] paper, which
presents an application of the port knocking concept to WLAN environments.
Depending on these aspects, we can choose the most suitable communication
channel to carry our covert channel and focus on the best trade-off between the
bandwidth, latency and stealthiness parameters.

Battlefield preparation

Suppose we have already compromised a host within a remote network and that
the host is now compromising its neighbours. We don't really need to use any
communication channel while the battlefield is being prepared automatically
(remember all these worms crawling the Internet ?). Once this battlefield is
ready, we directly benefit from a multiple sources advantage, which is usable
between the compromised hosts themselves, between each compromised host and the
operator, and between the set of compromised hosts and the operator.

Enrolling unwitting soldiers

[Unwitting_2003] describes a solution to enrol unwitting end users in covert
channel communications. Each time someone browses an Internet website, the
remote server can use various fields of the HTTP protocol to carry specific
information that the user will forward to another server without knowing it.
The presented model claims to provide unobservability (i.e. "that an observer
cannot tell if messages are being sent or received at all"). Remember all these
vulnerabilities in client-side applications and all these "bad" remote servers
on the wild W3 ? Looks like the tip of the iceberg, doesn't it ?

Automated exit

Suppose we have compromised a host within a remote network. It is quite
difficult to know in advance which communication channels will be the best to
use. So let our host learn about its environment and then decide for itself
which covert channels to use.
\\------------------------------------------------------------------------//

The IT world is evolving day after day and, whatever one may say, it is
possible to [build|rent|buy] wide scale networks of resources these days. Since
preserving relative anonymity is quite easy too, the main problem is to define
the communication methods that will link the network components to each other.

The usual definition of a covert channel, as presented in 1.1, states that "a
covert channel is a communication channel that allows a process to transfer
information in a manner that violates the system's security policy". Trying to
adapt that academic definition is sometimes complicated. One can ask, for
example, "what's the system's security policy for an international botnet ?"

Being pedantic, maybe building stealth communication channels to link several
botnets is not building covert channels, because no communication channel
exists ? Because there's no system security policy to deal with ? Maybe network
covert channels didn't exist between 1973 and 1978, and perhaps we should thank
the NCSC for having extended the DOD TCSEC criteria related to covert channels
to networks in 1987, because no one was able to discuss network covert channel
concepts from the 1985 definitions.

You know what ? That would seem perfect, because if we can use stealth channels
(ok, we admit it, they're not covert channels..) it means that academic
research about covert channel detection will not bother us, will it ?

But wait a minute, the botnet owner has his own security policy and his own
communication channels, doesn't he ? If the owner gets caught because of [wiki
your favorite version: storage, DDOS, CC, ISN, suggestion ?], would the
communication channels be analyzed for covert channels ? After all, covert
channels are not about technical means but about information and about what
that information means... Always funnier to look at the puppet master, no ?
;)

================================================================================

=====================================================================
3. COMMUNICATION CONCEPTS FOR NETWORK COVERT CHANNELS IMPLEMENTATIONS
=====================================================================

So, if covert channels are communication channels that are neither designed nor
intended to exist, then we must design a way to embed our communication streams
inside authorized channels. The main question is how to do this.

Covert channels may be based on nearly all existing protocols, from low OSI
layer ones such as IP, TCP, UDP and ICMP to high OSI layer ones such as HTTP,
SMTP, etc. However, we can only use protocols authorized by the NACS (network
access control system) and we first have to decide which trade-off we will
accept between reliability and stealthiness.

3.1 Control / Data Channels
---------------------------

There is no academic definition of what a control channel has to be. We may
state that control channels carry the information required to handle the data
flows from one point to another: establishing communication flows and keeping
them up while taking care of the bandwidth, latency and stealthiness
parameters.

The control operations themselves are relatively short pieces of information
such as: open/close the data channel, increase/decrease bandwidth on a specific
channel, interrupt data communication, switch to another control channel type,
etc.

A specific initialization handshake procedure may sometimes be sufficient to
handle data channels for a certain amount of time without having to send
information over the control channel. Such a procedure may include various
parameters such as the type of compression/ciphering per data channel, or
advanced parameters such as (de)multiplexing on the fly or what to do when the
bandwidth/latency/stealthiness parameters reach a certain threshold.

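
Since control operations are short, a control message can be very small. The
following Python sketch shows one possible 4-byte encoding. The opcode values,
the field layout and the names ControlOp, encode and decode are hypothetical
choices made for this illustration; they do not describe the wire format of any
tool presented in this paper.

```python
import struct
from enum import IntEnum
from typing import Tuple


class ControlOp(IntEnum):
    # Hypothetical opcodes mirroring the operations listed above.
    OPEN_DATA = 1
    CLOSE_DATA = 2
    SET_BANDWIDTH = 3
    INTERRUPT = 4
    SWITCH_CONTROL = 5


# Network byte order: 1-byte opcode, 1-byte channel id, 2-byte argument.
_HEADER = struct.Struct("!BBH")


def encode(op: ControlOp, channel_id: int, argument: int = 0) -> bytes:
    """Build a fixed-size control message."""
    return _HEADER.pack(op, channel_id, argument)


def decode(message: bytes) -> Tuple[ControlOp, int, int]:
    """Parse a control message back into (operation, channel id, argument)."""
    opcode, channel_id, argument = _HEADER.unpack(message)
    return ControlOp(opcode), channel_id, argument


# Ask the peer to cap data channel 2 at 512 (units are left to the handshake).
msg = encode(ControlOp.SET_BANDWIDTH, channel_id=2, argument=512)
decoded = decode(msg)
```

A message this small can be carried by almost any carrier, which is why the
control channel design can ignore bandwidth and concentrate on stealthiness and
latency.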
The control channel(s) may be based on unilateral tunnel(s) (only sending
packets from outside to inside or vice versa) or on more sophisticated
configurations: asynchronous methods and various distinct protocols, long
sleeping delays and environment learning methods, etc.

As the control channel(s) have to be as stealthy as possible, any activity that
departs from standard behaviour (permanent sessions, generation of large
amounts of data/packets, etc.) should be avoided. And because we'll never send
large amounts of data through the control channel, we may forget about its
bandwidth parameter and focus on the stealthiness and latency ones.

--

The data channels are reliable communication channels that can be used to
transfer information from one side to another.

The design of the data channels may also focus on stealthiness if the bandwidth
requirements are not high, but if we know these channels will carry huge data
traffic peaks or flows, we will have to choose between a longer transfer time
and the risk of being detected.

3.2 Multiplexing / Demultiplexing
---------------------------------

We know that covert channels are based on communication channels that are
legitimate for the NACS. Therefore, it is (quite) often possible to use several
types of communication channels simultaneously to carry our covert channels
(each channel type having its own requirements in terms of
bandwidth/latency/stealthiness). We presented this notion as "Demultiplexing"
in [2] while it is presented as "Aggregation" in [NCSC_1993] (do we really care
whether this concept is named potato or potahto ? after all, we're discussing
hiding topics, no ? ;)).

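
The demultiplexing idea can be sketched in a few lines of Python. This is an
in-memory illustration only: the channel names, the round-robin policy and the
(sequence number, chunk) fragment format are assumptions made for the example,
and no network traffic is produced. One data stream is cut into numbered
chunks, the chunks are spread over several channel types, and the receiver
puts them back in order.

```python
from itertools import cycle
from typing import Dict, List, Tuple

Fragment = Tuple[int, bytes]   # (sequence number, chunk)


def demultiplex(data: bytes, channels: List[str],
                chunk_size: int) -> Dict[str, List[Fragment]]:
    """Spread numbered chunks of one stream over several channels."""
    queues: Dict[str, List[Fragment]] = {name: [] for name in channels}
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # Round robin; a real tool could pick channels randomly or per policy.
    for seq, (name, chunk) in enumerate(zip(cycle(channels), chunks)):
        queues[name].append((seq, chunk))
    return queues


def reassemble(queues: Dict[str, List[Fragment]]) -> bytes:
    """Receiver side: merge all fragments and order them by sequence number."""
    fragments = [frag for queue in queues.values() for frag in queue]
    return b"".join(chunk for _, chunk in sorted(fragments))


message = b"covert channels through the looking glass"
queues = demultiplex(message, ["CC_type1", "CC_type2", "CC_type3"], 8)
restored = reassemble(queues)
```

The sequence numbers are what make the scheme tolerant of channels with very
different latencies: fragments may arrive in any order and still be reassembled
correctly.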
//------------------------------------------------------------------------\\
      local network          |NACS|           Internet
                               ||
                ------------CC_type1------------
               /               ||               \
Application   /-------------CC_type2-------------\   Application
        \    /                 ||                 \    /
      Client <------------Data Channel------------> Server
             \                 ||                 /
              \-------------CC_type3-------------/
                               ||
\\------------------------------------------------------------------------//
[2]

For example, several communication channels over the HTTP, ICMP and SMTP
protocols may be used simultaneously in order to improve the stealthiness of
the communication control channel (the communication methods may change from
time to time, randomly, after an environment learning period, etc.).

--

But sometimes, multiple connection channels may puzzle NACS administrators. In
this situation, we consider multiplexing several Data and/or Control channels
over a single communication channel.

Note that doing so has to be considered case by case. We can lower the latency
of our channels if we use a high-bandwidth channel, but if we multiplex control
and data channels we would need to increase the "permanent" parameter, or risk
being uncovered because of abnormally high bandwidth usage.

//------------------------------------------------------------------------\\
      local network         |NACS|          Internet
                              ||
     Appl. 1                  ||                     Appl. 1
            \                 ||                    /
      Client <-------Multiplexed Channel--------> Server
            /                 ||                  |  \
     Appl. 2                  ||                  |   Appl. 2
                              ||                   \
                              ||                    Appl. 3
                              ||
\\------------------------------------------------------------------------//
[2]

3.3 Communication architectures
-------------------------------

Covert channel communication architectures should not be thought of as ordinary
ones, because their primary goal is to remain covert long enough for us to
accomplish our task. To do so, we may use standard client/server and peer to
peer architectures or design multi-level architectures (with levels of
legitimate/noising/nonexistent/etc intermediaries and/or destinations/sources -
see 4. PLAYING THE GAME VS THE DETECTION TEAM).

Client/Server architecture separates clients from servers. Their setup and
operating modes are quite different. Generally, the client(s) send requests to
the server(s), which compute and return a result. Such an architecture is
scalable as long as the client(s) (that is: a user, a process, a station, a set
of stations) have access to at least one server.

In a P2P architecture, all participants operate as both client and server,
whether these two modes are used simultaneously, one after another, only after
synchronization, or after/before a specific event. Such an architecture lets us
design covert channel communication modes between more than just two parties
without having to bother with dedicated servers.

Participants in the architecture (be it client/server or P2P) may, of course,
know nothing more than how to learn how to contact other participants. The
external communication core may thus be "transient" and dynamically change its
topology and access points.

3.4 Communication models
------------------------

Various communication models may be used for covert channels.

The simplest model is based on a single Point-to-Point connection. This
"Direct" model assumes that a server component is running on the external
network and that the client component opens a communication channel through the
NACS.

//------------------------------------------------------------------------\\
               1                  2
   CC Client <-------> NACS <----------> CC Server

   <___internal_network___> Internet <___external_networks___>
\\------------------------------------------------------------------------//

Implementing this model is simple but the possibilities are quite restricted.
Server and client can only execute what they were designed for. It is, however,
sometimes quite enough (see [AckCmd] for an example of remote shell).

--

In order to use the covert channel for different types of data, it is possible
to use a "Proxy" model.
Proxy components accept data streams from clients and servers and act as
intermediaries without caring about the kind of data they transmit. This model
makes it possible to handle various distinct applications while using one (or
several) types of covert channels:

//------------------------------------------------------------------------\\
SSH client   1                2            2               3    SSH server
Web browser<---> CC Client <------ NACS -----> CC proxy <-----> HTTP Daemon
IM client                                                       IM login

<__________internal_network___________> Internet <___external_networks___>
\\------------------------------------------------------------------------//

This "Proxy" model is the most popular, and almost every program implementing
tunnels or covert channels uses that scheme.

--

But sometimes the previous models are not satisfactory, because we want the
services protected by the NACS to open the communication channels themselves.
This "Reverse communication" model implies that the server components
themselves initiate the communication channels from the protected network to
the external one and then wait for requests.

This model applies to the client/server model:

//------------------------------------------------------------------------\\
                   2                      1
CC Client <------------------> NACS <---------> Reverse CC Server

<_external_networks_> Internet <________internal_network________>
\\------------------------------------------------------------------------//

and also applies to the proxy model:

//------------------------------------------------------------------------\\
            2           3           1      1                  4
SSH Client                                                      SSH server
Web browser<->CC_Client<->CC_server<->NACS<--Reverse_CC_proxy-->HTTP daemon
IM client                                                       IM login

<____external_networks____> Internet <__________internal_network__________>
\\------------------------------------------------------------------------//

================================================================================

=========================================
4. PLAYING THE GAME VS THE DETECTION TEAM
=========================================

The communication channel is up and ready to be used to forward data streams.
We may now focus on playing the game vs the detection team and their ability to
detect and/or interrupt our communication.

The rules of engagement for the detection team are theoretically quite simple.
They may try to detect specific thresholds being exceeded in the network or
transport layers (see [tcpstatflow]) or detect specific signatures of the tools
used to build the communication channel (see [any signature-based IDS]). They
may try to detect protocol anomalies generated by the tools (see [WebTap]) or
try to learn the "network behaviour" and then use statistical methods to
determine whether the observed data streams look more or less suspicious (see
[WebTap] and the [3] implementation).

The detection team may also use some Network Security Monitoring (NSM) model
along with standard IDS technologies. [Integ_NSM] and [NetSec_OpenSrc] describe
NSM which "involves to collect, analyze and increase indications and warnings
to detect and respond to intrusions". [NetSec_OpenSrc] states that "Within the
context of NSM indicators are outputs from products which are created by IDS
and are usually referred to as alerts. Trained people who may be referred to as
analysts should be engaged in interpreting intrusions. The interpretation of
indicators results in warnings. Warnings are human conclusions which indicate
to decision makers that a network may have been compromised."

Basically, the detection team will either try to automatically detect the
anomalies they can conceive and model, or try to store "enough" data for a
human being to search for and detect suspicious activities.

--

Understanding that the detection team needs rules is the key to that game.
They need to limit false positives (that is, alerting on suspicious-looking data streams when there is nothing but legitimate traffic) and they need to keep a low ratio of false negatives (not raising alerts for suspicious traffic). So if the detection team has no rule to detect our communication channel or has no way to set up the related rule (because it would be too expensive in terms of system resources, money, false positives, ...), then our communication channel may be safe.

Several methods can make it harder to detect that we are using a communication channel to embed our data streams. The first strategy is to confuse the analyst with multiple sources and destinations. A second type of method relies on learning and/or using the environment's behaviour in order to keep a standard profile. And at last, suppose that we don't care at all if someone detects something suspicious ? ;)

//------------------------------------------------------------------------\\
"I am getting rather tired of "everything over port 80" and calling everything a firewall this or firewall that. Getting into a world where you have a so called "firewall" for every type of service that goes over port 80 or you have to somehow try and manage to block it in your proxy while still trying to allow the rest is insane." [80_insane]
\\------------------------------------------------------------------------//

4.1 Confusing the analyst with multiple sources and destination
---------------------------------------------------------------

Several distinct models to multiply the number of destinations are presented in [1]. Some involve using multiple transit servers that accept packets of data before sending them to the final destinations, while another one introduces the notion of sending traffic to legitimate destinations each time the communication channel has to be used.
A last model presents a solution that uses legitimate third-party components to store and retrieve information (see [ErrnoJones] for an application of this model to the HTTP protocol and [DNSCC_UnpubPhrack] for an application to the DNS protocol).

It is also possible to multiply the number of sources. Using the same models, the distinct sources can be used alternately when sending data through the NACS, and/or some of them may only send legitimate data in order to fool the detection engine or to increase the volume of data it will have to inspect or store for further analysis.

Finally, it is possible to use nonexistent sources and/or destinations if the recipient is known to be on the path, or if the goal is to increase the confusing amount of traffic, or even to consider that the destinations themselves represent the information to transmit (the most obvious scheme being that a destination represents a bit of information).

4.2 Keeping a low profile
-------------------------

This model is presented in [2] as a method to learn which communication channel to use given its environment. As we know the detection team is playing the behavioral learning game (see [WebTap], [3]), there is no reason for us not to play the same game. The right communication channel may thus be based on various distinct protocols, chosen according to what the source is authorized to do.

Another way to keep a low profile is not to send any superfluous traffic at all and only surreptitiously alter part of legitimate data streams in order to send and receive information (see [PassiveCClinux]). This solution is, of course, more interesting if you own the intermediary devices between clients and servers so that your proprietary binary client code can send you what it is not supposed to send at all (example ? famous mail synchronisation from everywhere you are with your pda ;)).
This concept is well known in the cryptography research area: [RSA_CC], [BH_BkDoor_2005].

4.3 Live but let die
--------------------

This concept is presented in [2]. It is based on the fact that the covert channel may be built upon control and data channels, each type of channel having distinct needs in terms of bandwidth, latency and stealthiness. As the data channel needs bandwidth and latency parameters that may exceed specific detection thresholds, it is quite feasible to think about a solution that will try its best to cover and keep the control channel alive (even if it means keeping it silent and waiting for better times) while not caring at all about losing data channels.

Another application is the case where it does not matter that someone detects the communication channel, since by then it is too late (a keylogger sending passwords, see [IcmpKeylog], for example).

//------------------------------------------------------------------------\\
Note that [NCSC_1993] states that "Transient covert channels are those which transfer a fixed amount of data and then cease to exist. Normally, bandwidth and capacity calculations apply only to channels that are sustainable indefinitely. Thus, it would seem transient channels are an irrelevant threat."
\\------------------------------------------------------------------------//

================================================================================

============================
5. PRACTICAL IMPLEMENTATIONS
============================

Designing your covert channel implementation is the most interesting moment. Just free your mind before starting 'cause now you may do anything you want and nobody will stop you (and hopefully not the NACS ;)). You may use various protocol types like IP, ICMP, TCP, HTTP, IRC, DNS, RTSP, etc. and various program types like full client/server or p2p models, kernel modules, CGI programs, self-replicating applications, injectors into running processes or installed applications, etc.
And should you ask "Why should I implement anything ?", we may quote J.B., who said so rightly about rootkit detection: we "had already defeated this detection mechanism before your released [the detection engine]. See I knew you or someone was going to do this [...] It is kinda non-climactic to create "solutions" for problems that don't exist yet [...]" [Rootkits_discussion].

================================================================================

==========================================================
6. PRACTICAL IMPLEMENTATIONS: ACTIVE PORT FORWARDER [apf]
==========================================================

6.1 Description
---------------

Active port forwarder is a software tool that implements several reverse tunneling techniques (RTT). It is designed for people without an external IP who want to make some services available on the Internet. The application is divided into two parts: afserver is placed on the machine with a publicly accessible address, and afclient is placed on the machine behind a firewall or masquerade. When the tunnel between the two APF parts is established, all the connections received by the afserver are forwarded via the afclient to the proper destination. The whole communication is secured by the use of SSL. Larger chunks of data are compressed with Zlib.

However, APF is not intended to hide its presence. The priority is to achieve high bandwidth and reasonably small latency. Moreover, users are not starved: the bandwidth is distributed fairly between them.

6.2 Implemented Techniques
--------------------------

6.2.1 Direct tcp connections

A direct tcp connection is used to create a permanent data/control channel which, with flow control and packet buffering, provides good performance and reasonably small latency. This type of channel is rather easily detectable, because long-lived connections are rare.
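As a rough illustration of how one permanent channel can carry several forwarded connections, the Python sketch below prefixes each chunk of data with a connection id and a length. The frame layout and the names pack_frame and unpack_frames are assumptions made for this example; they are not APF's actual wire format.

```python
import struct

# Hypothetical frame header: 2-byte connection id, 4-byte payload length,
# both big-endian. This layout is an assumption, not APF's real protocol.
HEADER = struct.Struct("!HI")


def pack_frame(conn_id: int, payload: bytes) -> bytes:
    """Prefix a payload with its connection id and length."""
    return HEADER.pack(conn_id, len(payload)) + payload


def unpack_frames(buffer: bytes):
    """Split a byte buffer into complete frames.

    Returns (frames, rest): frames is a list of (conn_id, payload) tuples,
    rest holds the bytes of a trailing frame that is still incomplete.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        conn_id, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet, wait for more data
        frames.append((conn_id, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]


if __name__ == "__main__":
    # Two user connections interleaved on the same channel,
    # followed by one stray byte of a frame that has not arrived yet.
    stream = pack_frame(1, b"SSH-2.0-client") + pack_frame(2, b"GET / HTTP/1.0")
    frames, rest = unpack_frames(stream + b"\x00")
    print(frames, rest)
```

The receiving side keeps the returned rest and prepends it to the next read, so a frame cut in two by the transport is reassembled on the following call.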
Suppose we want to make our sshd server publicly available and the default behaviour of APF satisfies us. The whole procedure is very simple. On the remote host we have to type:

//------------------------------------------------------------------------\\
user@remotehost> afserver
\\------------------------------------------------------------------------//

And on the local machine:

//------------------------------------------------------------------------\\
user@localmachine> afclient -n remotehost -p 22
\\------------------------------------------------------------------------//

After this, all the connections to remotehost:50127 will be forwarded to localmachine:22.

6.2.2 HTTP/HTTPS proxies

When we can't use direct tcp connections due to the local network security policy, we can try to get around the limitations by using HTTP/HTTPS proxies. Active port forwarder can encapsulate the messages into valid proxy queries and HTTP server answers. Moreover, an afserver waiting for HTTP packets can still accept direct tcp connections.

Suppose we want to make our sshd server publicly available through an HTTP proxy (located on httpproxy:8080). The default behaviour of APF once again satisfies us. The whole procedure is only slightly more complicated. On the remote host we have to type:

//------------------------------------------------------------------------\\
user@remotehost> afserver -P
\\------------------------------------------------------------------------//

And on the local machine:

//------------------------------------------------------------------------\\
user@localmachine> afclient -n remotehost -p 22 -P httpproxy
\\------------------------------------------------------------------------//

After this, all the connections to remotehost:50127 will be forwarded to localmachine:22.

================================================================================

==============================================
7. PRACTICAL IMPLEMENTATIONS: SKEEVE [skeeve]
==============================================

7.1 Description
---------------

Skeeve is a software tool that can easily create an ICMP tunnel between two computers, which may be located in different networks and separated by a firewall. The ICMP tunnel is based on the use of a Bounce server (the method relies upon basic IP address spoofing).

7.2 Implemented techniques
--------------------------

Skeeve Client accepts TCP connections and works as a converter for the IP header (changing the protocol field from TCP to ICMP echo_request|reply and making some other slight modifications). Skeeve Server does the reverse procedure and restores the original IP header settings. Both parts are implemented in one 'C' program as a Loadable Kernel Module. The same scheme is used for the reverse data. Only a few conditions apply:

- Bounce Server must be able to communicate with both the client and server
- Bounce Server has to accept IP packets with spoofed IP address
- Bounce Server has to accept ICMP echo_request|reply packets

//------------------------------------------------------------------------\\
 TCP Client(s)                                         TCP Server(s)
       |           (1)                      |  (2)           |
+--------------+ ------> +----------------+ | -----> +----------------+
|Skeeve Client |         | Bounce Server  | |        | Skeeve Server  |
+--------------+ <------ +----------------+ | <----- +----------------+
Internal network                DMZ         |         External network
                                          NACS

1) Client sends:
   IP_SRC  - IP of Skeeve Server (spoofed address)
   IP_DEST - IP of Bounce Server
   ICMP->ECHO_REQUEST

2) Bounce Server catches the ICMP ECHO_REQUEST message and answers with:
   IP_SRC  - IP of Bounce Server
   IP_DEST - IP of Skeeve Server
   ICMP->ECHO_REPLY
\\------------------------------------------------------------------------//

7.3 Usage
---------

Skeeve is easy to use.
For example, suppose we want to get access to the external WWW server. First of all, we need to define some parameters in the skeeve.c file:

//------------------------------------------------------------------------\\
...
#define PORT 80
#define CLIENT_IP "192.168.1.55"
#define BOUNCE_IP "192.168.1.1"
#define TARGET_IP "192.168.1.251"
...

PORT      - which port we will listen to
CLIENT_IP - it's our ip
BOUNCE_IP - IP of Bounce Server
TARGET_IP - Set IP of our server.
\\------------------------------------------------------------------------//

When the parameters are set we compile Skeeve as an LKM:

//------------------------------------------------------------------------\\
gcc -c skeeve.c -I /usr/src/linux/include
\\------------------------------------------------------------------------//

Then we just load the module:

//------------------------------------------------------------------------\\
On the Client side: insmod skeeve.o type=client dev={eth0 | ...}
On the Server side: insmod skeeve.o type=server dev={eth0 | ...}
\\------------------------------------------------------------------------//

Look at the kernel messages in /var/log/messages and that's all. Then just connect to the server on port 80 and do anything you want :)

================================================================================

==================================================
8. PRACTICAL IMPLEMENTATIONS: MSNSHELL [msnshell]
==================================================

8.1 Description
---------------

MsnShell is a covert channel tunneling tool. With it, you can remotely control a Linux computer behind a firewall. It consists of a single executable, the MsnShell server daemon, which encapsulates shell commands in the MSN protocol. MsnShell can not only work through a firewall but also pierce an HTTP proxy.

Computers are often located behind firewalls which deny many malicious connections. Therefore these computers are expected to be relatively safe from the external network.
But an MSN Messenger connection from the internal network is usually allowed and is made through a gateway or an HTTP proxy which allows internal computers to access the Internet via HTTP.

The MsnShell key features are:

1. Give SSH/FTP access from any box located in the internal network to external boxes;
2. Encapsulate SSH/FTP commands and results in the MSN protocol;
3. Can also work with an HTTP proxy;
4. Multiple simultaneous accesses.

================================================================================

===================================================================
9. PRACTICAL IMPLEMENTATIONS: SOCKSTER & TRAPSTER [sock|trap/ster]
===================================================================

9.1 Description
---------------

Sockster and Trapster are two components of a tunneling framework which can be used to bypass NACS or to build a pentester environment. In the current implementation nearly all tcp-based protocols like smtp, pop, vnc, rdp and ssh can be tunneled by the system.

The system uses tunneling plugins for different connections so the 'administrator' can choose the best channel available for stealthiness or throughput for each tunnel endpoint. Currently it's possible to use http, ftp and dns (udp) as tunnel protocols. Each tunneling plugin presents the same functionality, so the end user will not notice if the connection goes over a dns tunnel at one time and over an ftp tunnel the next.

The tunneling plugins take care of encryption and stealthiness. With this approach it is not necessary for Trapster or Sockster to implement such functions. This is very useful for pentesting because the engineer can create his own plugins with or without encryption. For creating detection engines this is also a big advantage because it is possible to emulate other tunneling tools or normal unencrypted traffic too.
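To make the idea of interchangeable tunneling plugins more concrete, here is a minimal Python sketch of a common plugin interface with two toy carriers. Everything in it (the TunnelPlugin class, the encode/decode methods, the fake POST prefix and the tunnel.example suffix) is hypothetical and only illustrates the design; it is not the real Sockster/Trapster plugin API, whose plugins are daemons or perl modules.

```python
import base64
from abc import ABC, abstractmethod


class TunnelPlugin(ABC):
    """Hypothetical common interface; not the real Sockster/Trapster API."""

    name = "abstract"

    @abstractmethod
    def encode(self, data: bytes) -> bytes:
        """Wrap raw tunnel data into something shaped like the carrier."""

    @abstractmethod
    def decode(self, message: bytes) -> bytes:
        """Recover the raw tunnel data from a carrier message."""


class HttpLikePlugin(TunnelPlugin):
    """Toy carrier: data travels base64-encoded after a fake POST line."""

    name = "http"
    PREFIX = b"POST /upload HTTP/1.0\r\n\r\n"

    def encode(self, data):
        return self.PREFIX + base64.b64encode(data)

    def decode(self, message):
        return base64.b64decode(message[len(self.PREFIX):])


class DnsLikePlugin(TunnelPlugin):
    """Toy carrier: data travels hex-encoded in labels of a fake hostname."""

    name = "dns"
    SUFFIX = b".tunnel.example"

    def encode(self, data):
        hexed = data.hex().encode()
        # DNS labels are limited to 63 bytes, so split the hex string.
        labels = [hexed[i:i + 62] for i in range(0, len(hexed), 62)]
        return b".".join(labels) + self.SUFFIX

    def decode(self, message):
        labels = message[:-len(self.SUFFIX)].split(b".")
        return bytes.fromhex(b"".join(labels).decode())


def relay(plugin: TunnelPlugin, data: bytes) -> bytes:
    """Pass data through a plugin and back, as the two tunnel ends would."""
    return plugin.decode(plugin.encode(data))


if __name__ == "__main__":
    for plugin in (HttpLikePlugin(), DnsLikePlugin()):
        print(plugin.name, plugin.encode(b"hello"), relay(plugin, b"hello"))
```

The caller only ever sees relay() returning the original bytes, whichever carrier is plugged in. Encryption, timing and size limits (a real DNS name is capped at 255 bytes) would live inside each plugin.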
The most important advantage over other tools is that it is not necessary to create a static configuration for each connection to different endpoints through the tunnel, because the system can act as a transparent proxy even for protocols which are not designed for proxying. Other tools need a static mapping of client ip:port to destination:port to build a tunnel. Mostly they are only used by one internal endpoint (the pc the internal tunneling endpoint runs on). With the framework presented here the internal tunneling endpoint can be used by all clients in the local subnet. The client doesn't have to be configured to use this 'proxy', which is often not possible anyway (e.g. for pop3). Also, in big environments this is not desired because of the manpower necessary for updates.

To fulfil this functionality the tunneling framework emulates a tcp stack, grabs all configured network traffic and sends it over the tunnel to the desired endpoint. The reverse traffic is then mapped correctly so that the client application doesn't notice anything. This functionality is implemented as a plugin too (called 'IO plugin') so the 'administrator' can build his own plugins for the framework.

A Socks proxy plugin and an http proxy plugin are currently under development. These plugins will emulate a Socks and an http proxy so every http-proxy-ready application can use the tunnel directly without the need for the generic IO plugin, which currently cannot be used for localhost. This kind of plugin is the entry point for one side of the tunnel - mainly the side which is not reachable from another network.

To get the current state of Sockster and Trapster it is possible to dump the internal session structures as html via two cgi applications. With these CGIs the 'administrator' can look at the sessions currently active or finished and get some additional information about the services themselves.
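As an illustration of such a status dump, the Python sketch below renders a list of sessions as an HTML table. The Session fields and the sessions_to_html name are hypothetical: the real CGIs and the internal session structures are not described in this paper.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Session:
    # Hypothetical fields; the real session structures are not shown here.
    client: str
    destination: str
    plugin: str
    state: str  # for example "active" or "finished"


def sessions_to_html(sessions):
    """Render a list of sessions as a minimal HTML table."""
    rows = ["<tr><th>client</th><th>destination</th>"
            "<th>plugin</th><th>state</th></tr>"]
    for s in sessions:
        cells = (s.client, s.destination, s.plugin, s.state)
        # Escape every value: names and addresses come from the network.
        rows.append("<tr>" + "".join(f"<td>{escape(c)}</td>" for c in cells)
                    + "</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


if __name__ == "__main__":
    print(sessions_to_html([
        Session("10.0.0.5:1025", "mail.example:110", "dns", "active"),
        Session("10.0.0.9:1040", "irc.example:6667", "http", "finished"),
    ]))
```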
9.2 Sockster
------------

In collaboration with Trapster, Sockster builds a protocol-independent proxy. Sockster is one part of the 'tunnel' and is located mainly on the 'free' internet (network 'I') but can also run in a secured network (network 'A') where direct connections from the network where Trapster runs (network 'B') are denied for the desired protocols. One Sockster can be the 'free' tunneling endpoint for more than one Trapster, so many Trapsters can use the 'service'.

The main application Sockster was designed for is connections from network 'B' into network 'A' or 'I'. This can be used for 'bad' things like reading and sending private mails or using instant messaging from an internal network.

The second application for Sockster is to receive session requests for connections from systems in network 'A' or 'I' to systems in network 'B'. Sockster will send these requests to the correct Trapster and, if all goes well, receive a request from Trapster for a listening socket for the client application after a few seconds. The port number of the listener is sent to the requester application (TunOpenConn.pl for example). The user can then connect to the listening port with his client application. Sockster will then forward all data from the client socket to the associated Trapster. Using this way the Client can access all ips and services on network 'A' from network 'B'. This can be used for bypassing NACS and firewalls which normally prevent the connection from an external ip. It's also possible to port scan an internal ip with this functionality.

The tunneling plugins for Sockster are normally daemons which communicate with Sockster via shared memory, pipes or sockets. This allows many different plugins to be used at the same time and thus allows connections to the same Sockster with different protocols at the same time. Currently there is no IO plugin structure for Sockster but a cgi which can be used for requesting connections from the 'B' to 'A'.
This CGI will send the request to Sockster and get the answers back from Sockster via IPC. This is a very useful and fast way to request and establish a new connection to a system inside a firewalled network.

9.3 Trapster
------------

Trapster is the other part necessary for the tunneling framework. Trapster builds the tunneling endpoint in network 'B' and manages all connections in the local network. To get the data from the local clients it uses IO plugins like the libpcap/libnet plugin.

The IO and tunneling plugins for Trapster are mainly perl modules. They communicate with Trapster via direct requests without the need for IPC. Currently it's only possible to use one IO and one tunneling plugin at the same time.

Mirroring Sockster, Trapster has two main functions: creating sessions and multiplexing data from local clients into the tunnel, and establishing sessions and proxying data from remote clients via Sockster. The second functionality is that Sockster receives requests from Trapster for new connections from network 'A' or 'I' to 'B'. Sockster will then establish the connection to the desired endpoint (internal ip:port) and create a session for the new tunneled connection.

9.4 "Framework"
---------------

Both central services are not designed to be stealthy and covert. This is the task of the plugins. Sockster and Trapster will queue the data so that the plugins can use some 'magic' sending algorithm to be stealthy or noisy. The 'administrator' has to choose the right plugin for the environment at this time. But it's also possible to write new plugins which learn from the local environment and choose the right algorithm themselves.

In summary, this framework is aimed at people wanting to adapt a very personal tunnel rather than at people wanting production-ready software with a graphical setup.

In addition to the framework there are some tools for testing purposes, like a tcp stress tester with which the 'administrator' can test a network link.
The graphical application lets the 'administrator' choose what bandwidth to use. With this tool it is very easy to check how well a tunneling solution scales with more traffic and when the sending algorithm is not stealthy anymore because of high load.

================================================================================

===========
WEBOGRAPHIE
===========

[1]: http://gray-world.net/projects/papers/covert_paper.txt
[2]: http://gray-world.net/projects/papers/rtt.txt
[3]: http://tunnel.gotdns.org (Tdetect)

[apf]: http://gray-world.net/pr_af.shtml
[skeeve]: http://gray-world.net/poc_skeeve.shtml
[msnshell]: http://gray-world.net/pr_msnshell.shtml
[sock|trap/ster]: http://tunnel.gotdns.org

---

[80_insane]:
  - http://archives.neohapsis.com/archives/dailydave/2005-q3/0050.html

[AckCmd]:
  - http://gray-world.net/tools/ackcmd.zip

[BH_BkDoor_2005]: Building Robust Backdoors in Secret Symmetric Ciphers (2005)
  - A.L. Young
  - http://www.blackhat.com/presentations/bh-usa-05/bh-us-05-young-update.pdf

[CC_Here2Stay_1994]: Covert Channels - Here to Stay? (1994)
  - Ira S. Moskowitz, Myong H. Kang
  - http://gray-world.net/papers/moskowitz94covert.pdf

[CC_TCPIP_Hdr_2002]: Covert Channels in TCP/IP Headers (2002)
  - Drew Hintz
  - http://guh.nu/projects/cc/covertchan_files/frame.htm

[DNSCC_UnpubPhrack]: DNS Covert Channels and Bouncing Techniques (2005?)
  - Anonymous
  - http://gray-world.net/board/index?PID=2192

[DOD_1985]: Department of Defense Trusted Computer System Evaluation Criteria 5200.28-STD (1985)
  - DoD standard
  - http://gray-world.net/papers/5200.28-STD.html

[Embed_CC_TCPIP_2005]: Embedding Covert Channels into TCP/IP (2005)
  - S.J. Murdoch, S. Lewis
  - http://gray-world.net/papers/ih05coverttcp.pdf

[ErrnoJones]: Legitimate Sites as Covert Channels - An Extension to the Concept of Reverse HTTP Tunnels (?)
  - Errno Jones
  - http://www.gray-world.net/papers/lsacc.txt

[IcmpKeylog]: Remote Windows Kernel Exploitation - Step Into the Ring 0 (2005)
  - Barnaby Jack
  - http://www.eeye.com/~data/publish/whitepapers/research/OT20050205.FILE.pdf

[Integ_NSM]: Integrating the Network Security Monitoring Model (2004)
  - Richard Bejtlich
  - http://www.taosecurity.com/sysadmin_apr_04.pdf

[InternetSteg_ActWard_2002]: Eliminating Steganography in Internet Traffic with Active Wardens (2002)
  - G. Fisk, M. Fisk, C. Papadopoulos, J. Neil
  - http://gray-world.net/papers/ih02.pdf

[Lampson_1973]: A Note on the Confinement Problem (1973)
  - Butler W. Lampson
  - http://gray-world.net/papers/lampson73note.pdf

[NCSC_1987]: Extension to 5200.28-STD to trusted network systems and components. (1987)
  - National Computer Security Center
  - http://gray-world.net/papers/NCSC-TG-005.html.gz

[NCSC_1993]: A Guide to Understanding Covert Channel Analysis of Trusted Systems (1993)
  - National Computer Security Center
  - http://gray-world.net/papers/aguidetocc.txt

[NetSec_OpenSrc]: Network Security - An Open-Source Approach (2005)
  - Blain R. Jones
  - http://www.infosecwriters.com/texts.php?op=display&id=321

[PassiveCClinux]: The Implementation of Passive Covert Channels in the Linux Kernel (2004)
  - Joanna Rutkowska
  - http://gray-world.net/papers/passive-covert-channels-linux.pdf

[PractDH_2002]: Practical Data Hiding in TCP/IP (2002)
  - K. Ahsan, D. Kundur
  - http://gray-world.net/papers/acm02.pdf

[Rootkits_discussion]:
  - http://archives.neohapsis.com/archives/dailydave/2005-q2/0291.html

[Rowland_1996]: Covert Channels in the TCP/IP Protocol Suite (1996)
  - Craig H. Rowland
  - http://gray-world.net/papers/ccintcpip.txt

[RSA_CC]: What are covert channels?
  - http://www.rsasecurity.com/rsalabs/node.asp?id=2351

[Unwitting_2003]: New covert channels in HTTP: adding unwitting Web browsers to anonymity sets (2003)
  - M. Bauer
  - http://google.that.paper

[WIKIPEDIA_CC]: Covert channel Wikipedia definition
  - http://en.wikipedia.org/wiki/Covert_channel

[WLAN_STEALTH_2005]: WLAN and Stealth Issues (2005)
  - L. Oudot
  - http://www.blackhat.com/presentations/bh-europe-05/BH_EU_05-Oudot/BH_EU_05-Oudot.pdf