Gigi Labs

Please follow Gigi Labs for the latest articles.

Friday, September 27, 2013

C#: Understanding Recursion with Directory Traversal

Greetings, and welcome to this brand new article at Programmer's Ranch! :)

In this article, we're going to talk about recursion. This technique is often considered an alternative to iteration (i.e. loops), and is useful for a wide variety of situations ranging from computing factorials to clearing empty areas in Minesweeper:


Since factorials are boring, and Minesweeper is a bit complex for this easy tutorial, we're going to look at the filesystem in order to learn about recursion. For example, take a look at the folder for the solution I just created in SharpDevelop:


See, the filesystem is actually a tree data structure. Each folder can contain other files and folders, which can in turn contain other files and folders, and so on. It isn't easy to work with things like trees using loops, but with recursion it just comes natural. Let's see how.

After creating a new console application, add the following at the top, which we need to interact with the filesystem:

using System.IO;

The first thing we want to do is get the current directory where the executable will be running. We do this by using Directory.GetCurrentDirectory(). Let's try that out:

            String currentDir = Directory.GetCurrentDirectory();
            Console.WriteLine(currentDir);
            Console.ReadKey(true);

...and here's what we get:


Now, we want to navigate up to the first CsRecursion folder, which is the solution folder. From there we'll be able to list the contents of all the subfolder. To do this, we create an instance of DirectoryInfo:

            DirectoryInfo dir = new DirectoryInfo(currentDir);

This allows us to get to the parent folder:

            dir = dir.Parent.Parent.Parent;

...and this is what we have so far:


Right, now about listing the folder and subfolder contents. Let's add a method to do that:

        public static void ListContents(DirectoryInfo dir)
        {
           
        }

In this method, we first want to list all the files in that folder. We can do this using DirectoryInfo.GetFiles(), or else using the static Directory.GetFiles() method, which is easier and works directly with file paths (Strings):

            foreach (String file in Directory.GetFiles(dir.FullName))
                Console.WriteLine(file);

Okay, now all we need is to do the same thing for all subfolders. It turns out that our ListContents() method can actually call itself, and pass in each subdirectory as a parameter:

            foreach (DirectoryInfo subdir in dir.GetDirectories())
                ListContents(subdir);

When a method calls itself, it's called recursion. In this case we say we are recursing into subdirectories.

Let's just change Main() so that it calls our ListContents() method:

        public static void Main(string[] args)
        {
            String currentDir = Directory.GetCurrentDirectory();
            DirectoryInfo dir = new DirectoryInfo(currentDir);
            dir = dir.Parent.Parent.Parent;
            ListContents(dir);
           
            Console.ReadKey(true);
        }

...and voilĂ :


As you can see, there's a very small amount of code, and recursion is a perfect fit for this kind of thing, because the same method can work on folders at different levels of the filesystem.

There's an important concept about recursion you need to be aware of, that might not be so evident in this example: the stopping condition. If a method calls itself and has no way to stop calling itself, you get a sort of infinite loop which actually ends in a stack overflow (in short, there's a limit to the number of times a method can call another method). Therefore, a recursive function always needs a way to stop calling itself.

If you're doing factorials, recursion stops when n=1. If you're computing a Fibonacci sequence, n=0 and n=1 are the stopping conditions. In our case, recursion stops when a folder has no further subfolders. In Minesweeper, recursion stops when there are no more adjacent blank squares (you either hit an edge or a number).

Anyhow, as we have seen in this article, recursion is a great technique to use when your data has divergent paths (most notably when dealing with trees). There are a few other interesting applications of recursion that I'll probably be writing about in the future, so stay tuned! :)

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.