Monogame – Working with Touch

*NOTE – This may work on other touch-capable platforms as well, but at the moment it has only been tested on Android.

One fundamental thing to understand about touch and gestures is that the code needs to be structured around the idea of continuous input. Any touch recognition code should therefore live somewhere within the main game loop; exactly where depends on how you elect to parse input. The reason is that touch processing is, by nature, a sequence of touch events fired in succession, with only a few exceptions to this rule.

The other aspect to understand clearly is nomenclature mapping. Unless you’re already familiar with the Monogame or Xamarin frameworks, what you think you want to achieve in your design may not be what either framework calls it in its API. So be prepared to step outside your normal wheelhouse and, perhaps, discard some long-standing assumptions about what a touch or gesture really is.

First, you need to decide whether you’re going to handle touches or gestures. Don’t get too caught up on thinking that a touch is a single press, because this will cause you some unnecessary grief later. If your program leans toward touches, there’s usually no configuration required other than checking for a touch-capable input device and adjusting your engine accordingly (the one exception would be touch-to-cursor mappings, which are outside the scope of this article). Gesture handling, conversely, requires some configuration before it can be used, and this is where the Monogame documentation, as well as the majority of the information available on various forums, falls fatally short. We will focus on this once we reach gesture handling, which follows a look at touch handling.

The existence of touch-capable hardware can be queried through the TouchPanelCapabilities object, obtained from the TouchPanel static object as such:

bool touchAvailable = TouchPanel.GetCapabilities ().IsConnected;

The IsConnected property is a boolean that indicates the presence of a touch-capable device. Obviously, the absence of one suggests either that the device is intermittently available, or that other sources of input are necessary. In this case, neither the touch nor gesture APIs would be valid.

Concerning touches, the current state of the touch panel holds a collection, called TouchCollection, of TouchLocation objects. Each of these corresponds to a single touch event generated by the user when a press is determined to have occurred on the sensitive surface. Multiple points of contact yield multiple TouchLocation instances within the TouchCollection: one finger generates one instance, two fingers generate two instances, and so on. This is best observed by debugging a program that uses this technique and paying attention to the Id property of the TouchLocation instance(s). To properly ascertain the state of the touch panel, we’ll need to do the following:

  1. Obtain the current state of the touch panel
  2. Determine if there are any TouchLocation instances in the TouchCollection collection
  3. If there are, take relative action based on any of the properties of the current TouchLocation instance

This can be accomplished using the following general template:

TouchCollection tc = TouchPanel.GetState ();
foreach (TouchLocation tl in tc) {
 // Do something here
}

Fortunately, TouchLocation instances are very simple to use. There are only four properties that are of any practical significance, and one utility function for trying to determine the previous location (which, at least during my testing, wasn’t useful for anything). They are as follows:

  • Id – An integer that uniquely identifies a single TouchLocation instance within a TouchCollection. It’s unknown how its potential values are determined, or if orphaned values are subject to recycling after the host TouchLocation has fallen out of scope.
  • Position – A Vector2 that provides the X and Y coordinate serving as the source for the event. Effectively, this is the location on the sensitive surface where contact was made.
  • Pressure – A floating-point number indicating the pressure of the touch. I’m unclear on how exactly this works, since my tests always reported a zero in this property. The only conclusions I can come up with here are either that my device doesn’t support touch sensitivity of this kind, or I missed a configuration step to enable this functionality.
  • State – An instance of TouchLocationState that determines what kind of touch we’re dealing with. This property can have one of four values:
    • Invalid – The press has been somehow deemed invalid; I’ve never seen this occur in testing, and am left to think that either I failed to satisfy conditions to make this occur, or that it’s simply a dump case for an exceptional condition.
    • Moved – Seen when a press has either remained in the same position or has moved. This is a very important state as it has a great deal of practical application, so do try to keep it in mind.
    • Pressed – Observed when a point of contact on the sensitive surface is newly minted. This is only fired once, and will not be seen again, regardless of whether the contact remains afterward. This could potentially be a source of confusion for a number of programmers.
    • Released – Likely the last state that a TouchLocation would be in before falling out of scope, it will be fired when the source of contact which generated this particular instance is no longer available.

Having said all of that, you now have enough information to properly utilise touches within your programs. As was stated before, simply ensure that your code is contained within the main game loop somewhere, since the data will likely change per frame, or at least as often as it can. An example of how to implement code based off this logic will be illustrated at the end of this discussion.

Gestures, insofar as Monogame is concerned, are a bit of a hassle to work with, mostly due to the lack of upstream documentation. We will seek to correct this imbalance, if not for the official documentation, then at least for programmers who wish to learn more about them.

As was stated previously, although Monogame performs a considerable amount of work to make available the touch and gesture API for programs deployed on capable hardware, gestures are left a bit open-ended by comparison to their touch counterparts. What this means is that as a programmer, you’re required to provide some configuration before the gesture API can be utilised. Here, we assume that you’re only interested in gestures, and not touch-to-cursor mappings, hence will only discuss the former.

Before proceeding, some basic information should be given as to the nature of gestures, and how they’re procedurally handled.

Although there is some very minor overlap with how a touch and gesture are expressed, they are two discrete entities. Gestures can be composed of one or more points of contact, and it’s expected that the locations of these contacts will change, in a particular way, over an indeterminate amount of time (ergonomic measurements would likely dispute this generalisation, but it is for this conversation a generalisation, and not a scientific claim). The ways in which these contacts change, or at least the resultant shape the change yields, as well as the number of contacts involved in the measurement, hints at the particular kind of gesture. In other words, a single point of contact that has a gradual change in its X-axis, be it positive or negative, which would yield a ray (or a series of small rays), is generally considered to be a horizontal drag gesture. When considering the same principles but looking to the Y-axis instead, we now find ourselves dealing with a vertical drag gesture. Pinch and zoom gestures typically involve two or more points of contact that move near concertedly away from or toward each other along the same logical axis. Perhaps paradoxically, at least when considering a contrast between touches and gestures, taps, double taps, and long-presses are registered as gestures as well; these are more concerned with the sustainment of a single point of contact relative to the time from when it was first recognised.

From a stock perspective, Monogame provides eleven types of gestures, referred to as GestureTypes. These types effectively determine how gesture detection is performed (it’s unclear if the GestureType framework can be extended to facilitate custom gesture types, but this is a considerably advanced topic which will not be discussed here). However, Monogame will not automatically read the touch panel for gestural input. Instead, it needs to be told which kinds of gestures to detect, and that is the programmer’s job. In any non-looping block of code, preferably during the initialisation routines, you’ll need to set EnabledGestures, which is a property of the TouchPanel static object. Multiple gestures can be enabled by OR’ing two or more of the types together in the assignment statement. For example, if I wanted to parse for both a HorizontalDrag and a DragComplete gesture, I would write the following statement:

TouchPanel.EnabledGestures = GestureType.HorizontalDrag | GestureType.DragComplete;

Once this is complete, you’ll have done enough to get Monogame to start playing nice with at least these two kinds.

Parsing gestural input is, in essence, much like parsing touch input, though the process differs in a few minor ways. To start, we must first determine whether there are any gestures to read. If we do not, attempts to read directly from the gesture store will generate fatal exceptions. Fortunately, the TouchPanel static object provides a boolean property called IsGestureAvailable, which informs clients of the availability of queued gesture data. If there is data, we read it as a sample, which is packaged into the GestureSample type. As with the TouchLocation object, the GestureSample object contains several properties that are of practical interest to the programmer, especially when making contextual decisions that respond to this kind of input. GestureSamples include the following properties:

  • Delta – A Vector2 instance which provides the delta, or difference, data for the first touch point in the gesture. This will change over time, and will always be relative to the coordinates where the touch was first recognised.
  • Position – A Vector2 instance which contains the current coordinates of the first touch point in the gesture.
  • Timestamp – A TimeSpan instance that indicates the time when the gesture was first recognised.
  • GestureType – A GestureType instance that indicates what type of gesture was determined based off several criteria.

Additionally, GestureSample contains properties called Delta2 and Position2, which are used to track a second point of contact that’s being measured as part of the current gesture. What this implies is that insofar as the stock gestures are concerned, Monogame will only be able to handle gestures where no more than two points of contact are involved.

My advice here is to experiment with the data through debugging until you’re comfortable with how these gestures are read, because there are some nuances with how the data continuously polls respective to different gesture kinds. For example, a HorizontalDrag gesture will, while the drag is occurring, constantly emit the HorizontalDrag signal until the contact source is released, terminating the gesture. At this point, if one is checking for the DragComplete signal as well, releasing the contact source will cause the touch panel to emit the DragComplete signal.


To determine if a single press has been made:

TouchCollection tc = TouchPanel.GetState ();
foreach (TouchLocation tl in tc) {
 if (TouchLocationState.Pressed == tl.State) {
  // Execute your domain-specific code here
 }
}

To determine if a press was made, and has been held in the same position for an arbitrary period of time:

TouchCollection tc = TouchPanel.GetState ();
foreach (TouchLocation tl in tc) {
 if (TouchLocationState.Moved == tl.State) {
  // Execute your domain-specific code here
 }
}

To track the position of a horizontal drag:

(1) During game initialisation:

TouchPanel.EnabledGestures = GestureType.HorizontalDrag;

(2) During game loop:

while (TouchPanel.IsGestureAvailable) {
 GestureSample gs = TouchPanel.ReadGesture ();
 if (GestureType.HorizontalDrag == gs.GestureType) {
  // Execute your domain-specific code here
 }
}

PhysicsFS/PhysFS++ Tutorial

This is a follow-up from a promise that I made in my tutorial video on designing an asset manager using SFML. That video can be found here.

This tutorial is centred around PhysFS++, a C++ wrapper for PhysicsFS. For the sake of the tutorial, it’s assumed that the reader is familiar with PhysicsFS and what it’s capable of. Described here is a workflow for simple use of the library. Anything more specific is beyond the scope and will have to be worked out by the reader on their own.

Lastly of note, PhysFS++ encapsulates PhysicsFS calls in a PhysFS namespace. The functions are global within this namespace and there are only a few classes that are provided. PhysFS++ further encapsulates key PhysicsFS constructs, notably those corresponding to archived files, into said classes that are derivatives of STL stream classes (a huge boon).

Like some other libraries, PhysicsFS requires explicit initialisation before it can be used. This is done through a function named init. It takes one argument of type const char*, which is meant to receive the program’s invocation path as passed through the much loved argv (that is, argv[0]). However, on most platforms this can be a null pointer, which is handy if you’re not expecting to handle terminal invocations. Thus, a very general call to init can be performed as such:

PhysFS::init (nullptr);

Right off the bat we have to mention a caveat here. From PhysFS++, there’s no way to verify that PhysicsFS initialised successfully. This is at odds with the upstream library, which relies on the tried-and-true zero on failure, non-zero on success return values as sanity checks. Furthermore, while PhysFS++ implements C++ exceptions, the init function doesn’t throw any at all. Effectively, what one is left with is an unchecked call to init. Because one needs to compile PhysFS++ from source, it is possible to modify the code, as I have done, to add a check to this call.
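
To give a rough idea of what such a check could look like, here is a minimal sketch that turns a zero-on-failure return value into an exception. It is not the modification I made to PhysFS++ itself. The names checked_call, always_ok and always_fail are made up for illustration, and the function pointer is only a stand-in for an upstream C call that follows the zero-or-nonzero convention.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an upstream C call that returns
// non-zero on success and zero on failure.
using c_status_fn = int (*)(const char*);

// Calls fn(arg) and converts a zero return value into an exception.
inline void checked_call(c_status_fn fn, const char* arg, const std::string& what) {
    assert(fn != nullptr);  // precondition: there must be something to call
    if (fn(arg) == 0) {
        throw std::runtime_error(what + " failed");
    }
}

// Demo stubs (hypothetical) that mimic the upstream return convention.
inline int always_ok(const char*) { return 1; }
inline int always_fail(const char*) { return 0; }
```

A call such as checked_call (always_fail, nullptr, "init") throws a std::runtime_error, whereas the same call with always_ok returns quietly. The same pattern would apply to any other wrapped call whose return value gets dropped.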

Once init has successfully completed, the next step is to mount an archive file into the virtual filesystem created by init. This is performed with the mount function. mount expects three arguments: the archive file on disk, a string specifying a mount point in the virtual filesystem, and a boolean which appends the mount point to the search path. One should place the archive file in the working directory for the binary file of their executable so PhysFS++ can see it. The second argument can be an empty string which would force root mounting, and the third can be true. Thus, for our example, if we assume we have an archive file named on our disk in the binary’s working directory, we can issue a call to mount as such:

PhysFS::mount ("", "", 1);

Unfortunately, as was the case with init, mount wraps an upstream function that follows the zero-or-nonzero return value paradigm, and PhysFS++ drops that return value outright without throwing an exception in its place. Thus, if mount fails to mount the archive for whatever reason, it’ll be a little difficult to ascertain why; plan accordingly.

Assuming mount has returned properly, one can start to work with the files that are contained in the archive. At this point, you should begin working in the mentality of filesystem calls. It’s possible to have archive files with complex directory layouts which would require one to perform recursive searches. That being said, everything should be considered a file – even a directory. An analysis of the PhysicsFS and PhysFS++ APIs will provide for you the full breadth of your available capabilities so for the sake of brevity, only a select number of those will be touched here.

Enumeration of files in a directory can be performed with the enumerateFiles function. It takes as an argument a string which indicates the directory to use as a root for the enumeration. As a return it provides the caller with a vector of strings holding the names of the files. For ease of use, a call to this to enumerate the files in root can be performed as such:

auto rootfiles = PhysFS::enumerateFiles ("/");

To actually work with a file in the filesystem, one needs to use the PHYSFS_file ADT. This upstream ADT is encapsulated by one of three classes: PhysFS::ifstream, PhysFS::ofstream, or PhysFS::fstream. Each of these is fantastic in that it wraps a PHYSFS_file ADT in a standard stream, which means that one can now perform established stream operations on the file itself. Each class has its own use, following the standard library’s naming: ifstream for reading files, ofstream for writing to files, and fstream for bidirectional file handling. In any case, any of these classes can be instantiated with a string argument that contains a fully qualified pathname to a file in the virtual filesystem that one wants to work with. For example, given that our virtual filesystem has a file named goofycats.png located at the path /textures/cats, we could open it for reading as such:

PhysFS::ifstream gc ("/textures/cats/goofycats.png");

Again, the underlying upstream call to open the file goes unchecked.

What you do from here is largely subjective relative to your program’s context. Say, for example (and this is actually pretty specific to my own use case), that you have a series of files of various formats in an archive that are being used in a video game program. The multimedia library you’re using should have ADTs that represent types of assets such as textures, fonts, etc. One could do some work to transpose the raw data of these streams into the aforementioned ADTs for use in the multimedia library. The following snippet of code, borrowed from one of my own projects, illustrates this. A PhysFS::fstream instance named f is created with a certain file. Next, a char array named d is instantiated with the size of f. The read function of f is called to transpose the bytes from f into d. Then, d is used to instantiate an instance of sf::Music from SFML contained in a std::unordered_map:


This covers most of the basic and general needs of users of PhysicsFS. Should you require anything else from it, you’d do well to read up on the Doxygen documentation for PhysicsFS or the source for PhysFS++. Lastly, when you’re done with the library, you’ll need to call a deinitialisation function called deinit:

PhysFS::deinit ();
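
Since init and deinit have to be paired, one option is to tie them to the lifetime of an object so the teardown can’t be forgotten. The following is only a sketch of that idea: ScopedSession is a made-up name, and the two callbacks are placeholders for whatever setup and teardown calls you need.

```cpp
#include <functional>
#include <utility>

// Runs `on_enter` at construction and `on_exit` at destruction.
// If `on_enter` throws, the destructor never runs, so `on_exit` is skipped.
class ScopedSession {
public:
    ScopedSession(const std::function<void()>& on_enter, std::function<void()> on_exit)
        : on_exit_(std::move(on_exit)) {
        if (on_enter) on_enter();
    }

    ~ScopedSession() {
        if (on_exit_) on_exit_();
    }

    // Non-copyable, so the teardown runs exactly once.
    ScopedSession(const ScopedSession&) = delete;
    ScopedSession& operator=(const ScopedSession&) = delete;

private:
    std::function<void()> on_exit_;
};
```

With PhysFS++, the two callbacks would presumably be small lambdas wrapping PhysFS::init and PhysFS::deinit, with the ScopedSession instance living near the top of main.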

Clicky Game – Development Journal 0001

Because I have no life, and because I really suck at having good ideas, the concept of Clicky Game was born.

Now ordinarily I wouldn’t bat two shits at this thing after having written something like that, but I wanted to use this as an opportunity to really dig deep back into C++ and use some of the features in C++11. I just want an excuse to use lambdas liberally. 🙂

I’m doing a few interesting things with this project. First of all, I’m writing it in Visual Studio. Yes, I know, blasphemy. The reason for this though is because my new job throws me into a DevOps-type position where I’m going to have to be writing code for integration into an ERP Package we use. Unfortunately they only provide an API in C# and since we’re a majority Microsoft Shop, I need a MSVS Primer. This would be it.

Sticking with the badassery that is MSVS, I’m trying out Team Foundation Server for VCS. This is drastically different from my traditional use of Git. So far, I hate it. It’s OVERKILL for a single dev such that it’s almost not quite suitable for a single dev. But we’re going to stick with it.

So far, I’ve had two solid days of coding (about 12 hours a day) and I’ve gotten into this interesting area where I’m very heavily extending the SFML Framework that I’m using for the major library. It’s actually a really good library. It’s just missing a lot of things that I need. For example, there is no native construct for handling 2D animation. I find this to be a little odd since it’s primarily a 2D library. But that’s sort of cool since it lets you roll your own implementation. It also doesn’t have any native constructs for UI. This leaves me in a bit of a predicament since there’s no real solid UI library that integrates with SFML (guichan is strictly for SDL/Allegro) so now my current task has been to create a bit of a library that does this.

Doing all of this seemingly extra work is actually pretty cool since I’m able to use modern coding features like named lambdas for event handling (by use of std::function wrapper for class member). As another challenge, all allocations are dynamic keeping as little overhead on the stack as possible. So far, this seems to have a tremendous performance benefit but, as you could imagine, it’s requiring A LOT of double checking for things like new/delete pairing, reference/dereference members for appropriate address/value access syntax, test assertions on members prior to use in expressions even in places where it may be safe to assume that the memory should be successfully allocated and the respective member instantiated, thread safety, cross-member asset access, etc… A lot of really interesting technical hurdles.
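
To give a flavour of the named-lambda approach, here’s a stripped-down sketch. It is not the actual Clicky Game code: Button, on and fire are names invented for this example, and nothing here comes from SFML.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical widget that stores one callback per event name.
class Button {
public:
    using Handler = std::function<void()>;

    // Registers a callback under an event name, replacing any previous one.
    void on(const std::string& event, Handler handler) {
        handlers_[event] = std::move(handler);
    }

    // Invokes the callback for `event`; returns false if none is registered.
    bool fire(const std::string& event) const {
        auto it = handlers_.find(event);
        if (it == handlers_.end() || !it->second) return false;
        it->second();
        return true;
    }

private:
    std::unordered_map<std::string, Handler> handlers_;
};
```

A named lambda such as auto onClick = [&clicks] { ++clicks; }; can then be passed to on ("click", onClick) and triggered later with fire ("click"). Because the lambda captures by reference, whatever it captures has to outlive the button.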

I think I’m going to do a little bit of a piece on this for the next installment of “Something with Greg” which is shaping up to be the programmer episode.