c-sharp

Modeling Battleship in C# - Components and Setup

NOTE: This is Part 2 of a three-part series demonstrating how we might model the classic game Battleship as a C# program. Part 1 is over here. You might want to use the sample project over on GitHub to follow along with this post. Also, check out my other posts in the Modeling Practice series!

In the first part of this series we discussed how to play a game of Battleship and what kinds of components and strategies we would need to use. With those in place, we can begin modeling the game. Let's build some objects!

Coordinates

The first and most basic object we are going to model is the Coordinates object, which represents a location on a board that can be fired at.

public class Coordinates  
{
    public int Row { get; set; }
    public int Column { get; set; }

    public Coordinates(int row, int column)
    {
        Row = row;
        Column = column;
    }
}

You might be wondering why those properties Row and Column are not a part of a different model, e.g. the Panel model that we're about to define. This is because whenever a shot is fired, the person firing the shot does so by calling out coordinates, and so this class will not only represent coordinates on the game and firing boards, but also coordinates that are under attack.

More Modeling Practice:

(NOTE: In the game, rows are given letter designations, e.g. "A", "B", etc. Here, we'll be using integers, as it makes several calculations easier).

OccupationType

For any given panel, there a few possibilities as to what can be on that panel:

  • If a ship is on the panel, then the panel is occupied. Two ships cannot occupy the same panel.
  • If a shot was fired at that panel, then either a hit or a miss was recorded on that panel.
  • If there's nothing on that panel, the panel is said to be empty.

To represent all of these statuses, I created an enumeration called OccupationType:

public enum OccupationType  
{
    [Description("o")]
    Empty,

    [Description("B")]
    Battleship,

    [Description("C")]
    Cruiser,

    [Description("D")]
    Destroyer,

    [Description("S")]
    Submarine,

    [Description("A")]
    Carrier,

    [Description("X")]
    Hit,

    [Description("M")]
    Miss
}

The Description attribute records the display character used for each of these statuses. We'll see a lot of those characters when we show how to play a game in the next part of this series.

Panel

The next object we need represents a single space on the game boards. I've taken to calling this space a Panel.

public class Panel  
{
    public OccupationType OccupationType { get; set; }
    public Coordinates Coordinates { get; set; }

    public Panel(int row, int column)
    {
        Coordinates = new Coordinates(row, column);
        OccupationType = OccupationType.Empty;
    }

    public string Status
    {
        get
        {
            return OccupationType.GetAttributeOfType<DescriptionAttribute>().Description;
        }
    }

    public bool IsOccupied
    {
        get
        {
            return OccupationType == OccupationType.Battleship
                || OccupationType == OccupationType.Destroyer
                || OccupationType == OccupationType.Cruiser
                || OccupationType == OccupationType.Submarine
                || OccupationType == OccupationType.Carrier;
        }
    }

    public bool IsRandomAvailable
    {
        get
        {
            return (Coordinates.Row % 2 == 0 && Coordinates.Column % 2 == 0)
                || (Coordinates.Row % 2 == 1 && Coordinates.Column % 2 == 1);
        }
    }
}

We should make special note of the IsRandomAvailable property. Remember from the previous part of this series that when we are firing random shots, we don't need to target every panel, but rather every other panel, like so:

IsRandomAvailable helps us implement that strategy. It designates every panel where both row and column coordinates are odd, or both coordinates are even, as being available for a "random" shot selection.

Finally, note the IsOccupied property. We'll be using that property in a later part to determine where to place the ships.

Ships

Speaking of the ships, let's define their base class now.

public abstract class Ship  
{
    public string Name { get; set; }
    public int Width { get; set; }
    public int Hits { get; set; }
    public OccupationType OccupationType { get; set; }
    public bool IsSunk
    {
        get
        {
            return Hits >= Width;
        }
    }
}

The only real trick to this class is the IsSunk property, which merely returns true if the number of hits the ship has sustained is greater than or equal to its width.

Let's also define five additional classes, one for each kind of ship.

public class Destroyer : Ship  
{
    public Destroyer()
    {
        Name = "Destroyer";
        Width = 2;
        OccupationType = OccupationType.Destroyer;
    }
}

public class Submarine : Ship  
{
    public Submarine()
    {
        Name = "Submarine";
        Width = 3;
        OccupationType = OccupationType.Submarine;
    }
}

public class Cruiser : Ship  
{
    public Cruiser()
    {
        Name = "Cruiser";
        Width = 3;
        OccupationType = OccupationType.Cruiser;
    }
}

public class Battleship : Ship  
{
    public Battleship()
    {
        Name = "Battleship";
        Width = 4;
        OccupationType = OccupationType.Battleship;
    }
}

public class Carrier : Ship  
{
    public Carrier()
    {
        Name = "Aircraft Carrier";
        Width = 5;
        OccupationType = OccupationType.Carrier;
    }
}

Each player will instantiate one of each kind of ship in order to play a game.

Game Board

Each player will also need an instance of class GameBoard, which tracks where that player's ships are placed and where their opponent's shots have been fired.

When you get right down to it, a GameBoard is really just a collection of Panel objects that we defined earlier.

public class GameBoard  
{
    public List<Panel> Panels { get; set; }

    public GameBoard()
    {
        Panels = new List<Panel>();
        for (int i = 1; i <= 10; i++)
        {
            for (int j = 1; j <= 10; j++)
            {
                Panels.Add(new Panel(i, j));
            }
        }
    }
}

Firing Board

In addition to the GameBoard, we also need a special kind of GameBoard called FiringBoard, which tracks each players shots and whether they were hits or misses.

public class FiringBoard : GameBoard  
{
    public List<Coordinates> GetOpenRandomPanels() { }

    public List<Coordinates> GetHitNeighbors() { }

    public List<Panel> GetNeighbors(Coordinates coordinates) { }
}

We will define each of those methods in the next (and final) part of this series.

Player

Now we can write up our Player class. Each player will need a collection of ships, an instance of GameBoard, an instance of FiringBoard, and a flag to show whether or not they have lost the game:

public class Player  
{
    public string Name { get; set; }
    public GameBoard GameBoard { get; set; }
    public FiringBoard FiringBoard { get; set; }
    public List<Ship> Ships { get; set; }
    public bool HasLost
    {
        get
        {
            return Ships.All(x => x.IsSunk);
        }
    }

    public Player(string name)
    {
        Name = name;
        Ships = new List<Ship>()
        {
            new Destroyer(),
            new Submarine(),
            new Cruiser(),
            new Battleship(),
            new Carrier()
        };
        GameBoard = new GameBoard();
        FiringBoard = new FiringBoard();
    }
}

The Player class also has a ton of methods which we define in Part 3.

Game

Finally, we need a Game class. This is because, in the final part of this series, we're going to run a bunch of games to see if this system gives any inherent bias to one of the Player objects.

public class Game  
{
    public Player Player1 { get; set; }
    public Player Player2 { get; set; }

    public Game() { }

    public void PlayRound() { }

    public void PlayToEnd() { }
}

Our first objective is achieved: we've created the classes necessary to play a game of Battleship. Now, let's work though how to set up a game.

Setting Up the Game

To start, let's think about what a Player would need to do, once s/he has all their pieces, to set up a game of Battleship. S/he needs to:

  • Place his/her ships on the GameBoard.
  • That's it!

So, okay, there's not a whole lot of setup involved in a game of Battleship. However, there is some, so in this section we're going to implement the code which places a Player's ships, as well as output what their boards look like.

Ship Placement

There are a lot of articles out there that purport to help you win a game of Battleship each time you play (and many of them correspond with the release of that god-awful movie), but for this practice we're not going to bother with more advanced strategies since our goal is not to win games, but to understand the game itself better by modeling it.

In short: our ship placement will be effectively random.

But it cannot be truly random, since two ships cannot occupy the same panel. Therefore we must implement a placement algorithm which places each ship on the board but ensures that each ship does not occupy the same Panel as any other ship.

More Modeling Practice:

Here's the rundown of that algorithm:

  1. For each ship we have left to place:
    1. Pick a random panel which is currently unoccupied.
    2. Select an orientation (horizontal or vertical) at random.
    3. Attempt to place the ship on the proposed panels. If any of those panels are already occupied, or are outside the boundaries of the game board, start over from 1.

Given that the total number of panels (100) is much greater than the space we need to occupy (2 + 3 + 3 + 4 + 5 = 16), this is actually relatively efficient, but not perfect.

Let's start coding up that algorithm, using the Player class we defined in Part 2. We'll create a new method PlaceShips in the Player class and define it like so, and use a random number generator that I stole from StackOverflow:

public void PlaceShips()  
{
    Random rand = new Random(Guid.NewGuid().GetHashCode());
    foreach (var ship in Ships)
    {
        //Select a random row/column combination, then select a random orientation.
        //If none of the proposed panels are occupied, place the ship
        //Do this for all ships

        bool isOpen = true;
        while (isOpen)
        {
            //Next() has the second parameter be exclusive, while the first parameter is inclusive.
            var startcolumn = rand.Next(1,11); 
            var startrow = rand.Next(1, 11);
            int endrow = startrow, endcolumn = startcolumn;
            var orientation = rand.Next(1, 101) % 2; //0 for Horizontal

            List<int> panelNumbers = new List<int>();
            if (orientation == 0)
            {
                for (int i = 1; i < ship.Width; i++)
                {
                    endrow++;
                }
            }
            else
            {
                for (int i = 1; i < ship.Width; i++)
                {
                    endcolumn++;
                }
            }

            //We cannot place ships beyond the boundaries of the board
            if(endrow > 10 || endcolumn > 10)
            {
                isOpen = true;
                continue; //Restart the while loop to select a new random panel
            }

            //Check if specified panels are occupied
            var affectedPanels = GameBoard.Panels.Range(startrow, startcolumn, endrow, endcolumn);
            if(affectedPanels.Any(x=>x.IsOccupied))
            {
                isOpen = true;
                continue;
            }

            foreach(var panel in affectedPanels)
            {
                panel.OccupationType = ship.OccupationType;
            }
            isOpen = false;
        }
    }
}

You may have noticed the following call in the above method:

var affectedPanels = GameBoard.Panels.Range(startrow, startcolumn, endrow, endcolumn);  

Range() is an extension method we defined for this project, and looks like this:

public static class PanelExtensions  
{
    public static List<Panel> Range(this List<Panel> panels, int startRow, int startColumn, int endRow, int endColumn)
    {
        return panels.Where(x => x.Coordinates.Row >= startRow 
                                    && x.Coordinates.Column >= startColumn 
                                    && x.Coordinates.Row <= endRow 
                                    && x.Coordinates.Column <= endColumn).ToList();
    }
}

As you can see, Range() just gives all the panels which are in the square defined by the passed-in row and column coordinates (and is inclusive of those panels).

Show the Boards

The method PlaceShips places each ship on the Player's board. But how can we tell where the ships are? Let's implement another method in the Player class, called OutputBoards:

public void OutputBoards()  
{
    Console.WriteLine(Name);
    Console.WriteLine("Own Board:                          Firing Board:");
    for(int row = 1; row <= 10; row++)
    {
        for(int ownColumn = 1; ownColumn <= 10; ownColumn++)
        {
            Console.Write(GameBoard.Panels.At(row, ownColumn).Status + " ");
        }
        Console.Write("                ");
        for (int firingColumn = 1; firingColumn <= 10; firingColumn++)
        {
            Console.Write(FiringBoard.Panels.At(row, firingColumn).Status + " ");
        }
        Console.WriteLine(Environment.NewLine);
    }
    Console.WriteLine(Environment.NewLine);
}

This method outputs the current boards to the command line. Running a sample application and calling this method, we get the following output:

Shows the placement of Player 1's ships

Shows the placement of Player 2's ships

Summary

In this part, we:

  • Created the components needed to play a game of Battleship.
  • Created an algorithm to allow our Player objects to place their Ships on the board.
  • Created a method to display the current GameBoard and FiringBoard for each player.

Our game is now ready to play! But... how do we do so? That's coming up in Part 3 of Modeling Battleship in C#!

Don't forget to check out the GitHub repository for this series!

Happy Modeling!

Modeling Battleship in C# - Introduction and Strategies

NOTE: This is Part 1 of a three-part series demonstrating how we might model the classic game Battleship as a C# program. You might want to use the sample project over on GitHub to follow along with this post. Also, check out my other posts in the Modeling Practice series!

In software development, often we programmers are asked to take large, complex issues and break them down into smaller, more manageable chunks in order to solve any given problem. I find that this, as with many things, becomes easier the more you practice it, and so this blog has a series of posts called Modeling Practice in which we take large, complex problems and model them into working software applications.

In my case, I love games, so each of the previous entrants in this series have been popular, classic games (Candy Land, Minesweeper, UNO). That tradition continues here, and this time the board game we'll be modeling is the classic naval battle game Battleship.

A picture of the game box, showing two children playing the game and placing red and white pegs on the boards.

My boys (who I've written about before) are now old enough that they can play this game themselves, and so they've been killing hours trying to sink each other's ships.

That's the Modeling Practice we're going to do this time: we're going to model a game of Battleship from start to finish, including how our players will behave. So, let's get started!

What is Battleship?

For those of you who might not have played Battleship before, here's how it works. Each player gets a 10-by-10 grid on which to place five ships: the eponymous Battleship, as well as an Aircraft Carrier, a Cruiser, a Submarine, and a Destroyer. The ships have differing lengths, and larger ships can take more hits. Players cannot see the opposing player's game board.

More Modeling Practice:

Players also have a blank firing board from which they can call out shots. On each player's turn, they call out a panel (by using the panel coordinates, e.g. "A5" which means row A, column 5) on their opponent's board. The opponent will then tell them if that shot is a hit or a miss. If it's a hit, the player marks that panel with a red peg; if it is a miss, the player marks that panel with a white peg. It then becomes the other player's turn to call a shot.

When a ship is sunk, the player who owned that ship should call out what ship it was, so the other player can take note. Finally, when one player loses all five of his/her ships, that player loses.

Image is Sailors play "Battleship" aboard a carrier, found on Wikimedia. In this game, the player who owned the left board would have lost.

The game itself was known at least as far back as the 1890s, but it wasn't until 1967 that Mattel produced the peg-and-board version that most people have seen today. It is that version (and its official rules) that we will use as part of our modeling practice.

Image is You sunk my battleship!, found on Flickr and used under license.

Now, let's get started modeling! First, we need to figure out the components of the game.

Components of the Game

In order to play a game of Battleship, our system will need to be able to model the following components:

  • Coordinates: The most basic unit in the game. Represents a row and column location where a shot can be fired (and where a ship may or may not exist).
  • Panels: The individual pieces of the board that can be fired at.
  • Game Board: The board on which players place their ships and their opponent's shots.
  • Firing Board: The board on which players place their own shots and their results (hit or miss).
  • Ships: The five kinds of ships that the game uses.

All of that is fine and good, but if our system expects to be able to actually play a game, we're going to have to figure out the strategy involved.

Potential Strategies

Here's a sample Battleship game board:

There are two different strategies we'll need to model:

  1. How to place the ships AND
  2. How to determine what shots to fire.

Fortunately (or maybe unfortunately) the first strategy is terribly simple: place the ships randomly. The reason is that since your opponent will be firing at random for much of the game, there's no real strategy needed here.

The real interesting strategy is this one: how can we know where to fire our shots so as to sink our opponent's ships as quickly as possible? One possibility is that, just like placing the ships randomly, we also just fire randomly. This will eventually sink all the opponent's ships, but there is also a better way, and it involves combining two distinct strategies.

First, when selecting where to fire a shot, we don't need to pick from every possible panel. Instead, we only need to pick from every other panel, like so:

Because the smallest ship in the game (the Destroyer) is still two panels long, this strategy ensures that we will eventually hit each ship at least once.

But what about when we actually score a hit? At that point, we should only target adjacent panels, so as to ensure that we will sink that ship:

These are called "searching" shots in my system, and we only stop doing searching shots when we sink a ship.

By using these two strategies in tandem, we ensure that we can sink the opponent's ships in the shortest possible time (without using something like a probability algorithm, which more advanced solvers would do).

Summary

Here's all of the strategies we've discovered so far:

  1. Placement of ships is random; no better strategy available.
  2. Shot selection is partly random (every other panel) until a hit is scored.
  3. Once a hit is scored, we use "searching" shots to eventually sink that ship.
  4. The game ends when one player has lost all their ships.

In the next part of this series, we will begin our implementation by defining the components for our game, including the players, ships, coordinates, and so on. We'll also set up a game to be played by having our players place their ships.

Don't forget to check out the sample project over on GitHub!

Happy Modeling!

Mapping DataTables and DataRows to Objects in C# and .NET

My group regularly uses DataSet, DataTable, and DataRow objects in many of our apps.

(What? Don't look at me like that. These apps are old.)

Anyway, we're simultaneously trying to implement good C# and object-oriented programming principles while maintaining these old apps, so we often end up having to map data from a data set to a C# object. We did this enough times that I and a coworker (we'll call her Marlena) decided to sit down and just make up a new mapping system for use with these DataTable and DataRow objects.

As always with my code-based posts, there's a GitHub project with a full working example app, so check that out too!

One Jump Ahead

So here's a basic problem with mapping from DataSet, DataTable, and DataRow objects: we don't know at compile time what columns and tables exist in the set, so mapping solutions like AutoMapper won't work for this scenario. Our mapping system will have to assume what columns exist. But, in order to make it more reusable, we will make the mapping system return default values for any values which it does not locate.

There's also another, more complex problem: the databases we are acquiring our data from use many different column names to represent the same data. 20 years of different maintainers and little in the way of cohesive naming standards will do that do a database. So, if we needed a person's first name, the different databases might use:

  • first_name
  • firstName
  • fname
  • name_first

This, as might be imagined, makes mapping anything rather difficult. So our system will also need to be able to map from many different column names.

Finally, this system wouldn't be worth much if it couldn't handle collections of objects as well as single objects, so we'll need to allow for that as well.

So, in short, our system needs to:

  1. Map from DataTable and DataRow to objects.
  2. Map from multiple different column names.
  3. Handle mapping to a collection of objects as well as a single object.

We'll need several pieces to accomplish this. But before we can even start building the mapping system, we must first acquire some sample data.

Mine, Mine, Mine

We're going to create some DataSet objects that we can test our system against. In the real world, you would use an actual database, but here (for simplicity's sake) we're just going to manually create some DataSet objects. Here's a sample class which will create two DataSet objects, Priests and Ranchers, each of which use different column names for the same data:

public static class DataSetGenerator  
{
    public static DataSet Priests()
    {
        DataTable priestsDataTable = new DataTable();
        priestsDataTable.Columns.Add(new DataColumn()
        {
            ColumnName = "first_name",
            DataType = typeof(string)
        });
        priestsDataTable.Columns.Add(new DataColumn()
        {
            ColumnName = "last_name",
            DataType = typeof(string)
        });
        priestsDataTable.Columns.Add(new DataColumn()
        {
            ColumnName = "dob",
            DataType = typeof(DateTime)
        });
        priestsDataTable.Columns.Add(new DataColumn()
        {
            ColumnName = "job_title",
            DataType = typeof(string)
        });
        priestsDataTable.Columns.Add(new DataColumn()
        {
            ColumnName = "taken_name",
            DataType = typeof(string)
        });
        priestsDataTable.Columns.Add(new DataColumn()
        {
            ColumnName = "is_american",
            DataType = typeof(string)
        });

        priestsDataTable.Rows.Add(new object[] { "Lenny", "Belardo", new DateTime(1971, 3, 24), "Pontiff", "Pius XIII", "yes" });
        priestsDataTable.Rows.Add(new object[] { "Angelo", "Voiello", new DateTime(1952, 11, 18), "Cardinal Secretary of State", "", "no" });
        priestsDataTable.Rows.Add(new object[] { "Michael", "Spencer", new DateTime(1942, 5, 12), "Archbishop of New York", "", "yes" });
        priestsDataTable.Rows.Add(new object[] { "Sofia", "(Unknown)", new DateTime(1974, 7, 2), "Director of Marketing", "", "no" });
        priestsDataTable.Rows.Add(new object[] { "Bernardo", "Gutierrez", new DateTime(1966, 9, 16), "Master of Ceremonies", "", "no" });

        DataSet priestsDataSet = new DataSet();
        priestsDataSet.Tables.Add(priestsDataTable);

        return priestsDataSet;
    }

    public static DataSet Ranchers()
    {
        DataTable ranchersTable = new DataTable();
        ranchersTable.Columns.Add(new DataColumn()
        {
            ColumnName = "firstName",
            DataType = typeof(string)
        });
        ranchersTable.Columns.Add(new DataColumn()
        {
            ColumnName = "lastName",
            DataType = typeof(string)
        });
        ranchersTable.Columns.Add(new DataColumn()
        {
            ColumnName = "dateOfBirth",
            DataType = typeof(DateTime)
        });
        ranchersTable.Columns.Add(new DataColumn()
        {
            ColumnName = "jobTitle",
            DataType = typeof(string)
        });
        ranchersTable.Columns.Add(new DataColumn()
        {
            ColumnName = "nickName",
            DataType = typeof(string)
        });
        ranchersTable.Columns.Add(new DataColumn()
        {
            ColumnName = "isAmerican",
            DataType = typeof(string)
        });

        ranchersTable.Rows.Add(new object[] { "Colt", "Bennett", new DateTime(1987, 1, 15), "Ranchhand", "", "y" });
        ranchersTable.Rows.Add(new object[] { "Jameson", "Bennett", new DateTime(1984, 10, 10), "Ranchhand", "Rooster", "y" });
        ranchersTable.Rows.Add(new object[] { "Beau", "Bennett", new DateTime(1944, 8, 9), "Rancher", "", "y" });
        ranchersTable.Rows.Add(new object[] { "Margaret", "Bennett", new DateTime(1974, 7, 2), "Bar Owner", "Maggie", "y" });
        ranchersTable.Rows.Add(new object[] { "Abigail", "Phillips", new DateTime(1987, 4, 24), "Teacher", "Abby", "y" });

        DataSet ranchersDataSet = new DataSet();
        ranchersDataSet.Tables.Add(ranchersTable);

        return ranchersDataSet;
    }
}

We'll test our system against this sample data.

Something There

Now we can build our actual mapping solution. First off, we need a way to decide what column names map to object properties. It was Marlena's idea to keep those things together, and so we came up with a class called DataNamesAttribute that looks like this:

[AttributeUsage(AttributeTargets.Property)]
public class DataNamesAttribute : Attribute  
{
    protected List<string> _valueNames { get; set; }

    public List<string> ValueNames
    {
        get
        {
            return _valueNames;
        }
        set
        {
            _valueNames = value;
        }
    }

    public DataNamesAttribute()
    {
        _valueNames = new List<string>();
    }

    public DataNamesAttribute(params string[] valueNames)
    {
        _valueNames = valueNames.ToList();
    }
}

This attribute can then be used (in fact, can only be used, due to the AttributeUsage(AttributeTargets.Property) declaration) on properties of other classes. Let's say we're going to map to a Person class. We would use DataNamesAttribute like so:

public class Person  
{
    [DataNames("first_name", "firstName")]
    public string FirstName { get; set; }

    [DataNames("last_name", "lastName")]
    public string LastName { get; set; }

    [DataNames("dob", "dateOfBirth")]
    public DateTime DateOfBirth { get; set; }

    [DataNames("job_title", "jobTitle")]
    public string JobTitle { get; set; }

    [DataNames("taken_name", "nickName")]
    public string TakenName { get; set; }

    [DataNames("is_american", "isAmerican")]
    public bool IsAmerican { get; set; }
}

Now that we know where the data needs to end up, let's start mapping out the mapper (heh).

Reflection

Our mapper class will be a generic class so that we can map from DataTable or DataRow objects to any kind of object. We'll need two methods to get different kinds of data:

public class DataNamesMapper<TEntity> where TEntity : class, new()  
{
    public TEntity Map(DataRow row) { ... }
    public IEnumerable<TEntity> Map(DataTable table) { ... }
}

Let's start with the Map(DataRow row) method. We need to do three things:

  1. Figure out what columns exist in this row.
  2. Determine if the TEntity we are mapping to has any properties with the same name as any of the columns (aka the Data Names) AND
  3. Map the value from the DataRow to the TEntity.

Here's how we do this, using just a bit of reflection:

public TEntity Map(DataRow row)  
{
    //Step 1 - Get the Column Names
    var columnNames = row.Table.Columns
                               .Cast<DataColumn>()
                               .Select(x => x.ColumnName)
                               .ToList();

    //Step 2 - Get the Property Data Names
    var properties = (typeof(TEntity)).GetProperties()
                                      .Where(x => x.GetCustomAttributes(typeof(DataNamesAttribute), true).Any())
                                      .ToList();

    //Step 3 - Map the data
    TEntity entity = new TEntity();
    foreach (var prop in properties)
    {
        PropertyMapHelper.Map(typeof(TEntity), row, prop, entity);
    }

    return entity;
}

Of course, we also need to handle the other method, the one where we can get a collection of TEntity:

public IEnumerable<TEntity> Map(DataTable table)  
{
    //Step 1 - Get the Column Names
    var columnNames = table.Columns.Cast<DataColumn>().Select(x => x.ColumnName).ToList();

    //Step 2 - Get the Property Data Names
    var properties = (typeof(TEntity)).GetProperties()
                                        .Where(x => x.GetCustomAttributes(typeof(DataNamesAttribute), true).Any())
                                        .ToList();

    //Step 3 - Map the data
    List<TEntity> entities = new List<TEntity>();
    foreach (DataRow row in table.Rows)
    {
        TEntity entity = new TEntity();
        foreach (var prop in properties)
        {
            PropertyMapHelper.Map(typeof(TEntity), row, prop, entity);
        }
        entities.Add(entity);
    }

    return entities;
}

You might be wondering just what the heck the PropertyMapHelper class is. If you are, you might also be about to regret it.

Dig a Little Deeper

The PropertyMapHelper, as suggested by the name, maps values to different primitive types (int, string, DateTime, etc.). Here's that Map() method we saw earlier:

public static void Map(Type type, DataRow row, PropertyInfo prop, object entity)  
{
    List<string> columnNames = AttributeHelper.GetDataNames(type, prop.Name);

    foreach (var columnName in columnNames)
    {
        if (!String.IsNullOrWhiteSpace(columnName) && row.Table.Columns.Contains(columnName))
        {
            var propertyValue = row[columnName];
            if (propertyValue != DBNull.Value)
            {
                ParsePrimitive(prop, entity, row[columnName]);
                break;
            }
        }
    }
}

There are two pieces in this method that we haven't defined yet: the AttributeHelper class and the ParsePrimitive() method. AttributeHelper is a rather simple class that merely gets the list of column names from the DataNamesAttribute:

public static List<string> GetDataNames(Type type, string propertyName)  
{
    var property = type.GetProperty(propertyName).GetCustomAttributes(false).Where(x => x.GetType().Name == "DataNamesAttribute").FirstOrDefault();
    if (property != null)
    {
        return ((DataNamesAttribute)property).ValueNames;
    }
    return new List<string>();
}

The other we need to define in ParsePrimitive(), which as its name suggests will parse the values into primitive types. Essentially what this class does is assign a value to a passed-in property reference (represented by the PropertyInfo class). I'm not going to post the full code on this post (you can see it over on GitHub), so here's a snippet of what this method does:

private static void ParsePrimitive(PropertyInfo prop, object entity, object value)  
{
    if (prop.PropertyType == typeof(string))
    {
        prop.SetValue(entity, value.ToString().Trim(), null);
    }
    else if (prop.PropertyType == typeof(int) || prop.PropertyType == typeof(int?))
    {
        if (value == null)
        {
            prop.SetValue(entity, null, null);
        }
        else
        {
            prop.SetValue(entity, int.Parse(value.ToString()), null);
        }
    }
    ...
}

That's the bottom of the rabbit hole, as it were. Now, we can use the DataSet objects we created earlier and our mapping system to see if we can map this data correctly.

Two Worlds

Here's a quick program that can test our new mapping system:

class Program  
{
    static void Main(string[] args)
    {
        var priestsDataSet = DataSetGenerator.Priests();
        DataNamesMapper<Person> mapper = new DataNamesMapper<Person>();
        List<Person> persons = mapper.Map(priestsDataSet.Tables[0]).ToList();

        var ranchersDataSet = DataSetGenerator.Ranchers();
        persons.AddRange(mapper.Map(ranchersDataSet.Tables[0]));

        foreach (var person in persons)
        {
            Console.WriteLine("First Name: " + person.FirstName + ", Last Name: " + person.LastName
                                + ", Date of Birth: " + person.DateOfBirth.ToShortDateString()
                                + ", Job Title: " + person.JobTitle + ", Nickname: " + person.TakenName
                                + ", Is American: " + person.IsAmerican);
        }

        Console.ReadLine();
    }
}

When we run this app (which you can do too), we will get the following output:

Which is exactly what we want!

(I mean, really, did you expect me to blog about something that didn't work?)

Go the Distance

It concerns me that this system is overly complicated, and I'd happily take suggestions on how to make it more straightforward. While I do like how all we need to do is place the DataNamesAttribute on the correct properties and then call an instance of DataNamesMapper<T>, I feel like the whole thing could be easier somehow. Believe it or not, this version is actually simpler than the one we're actually using in our internal apps.

Also, check out the sample project over on GitHub, fork it, test it, whatever. If it helped you out, or if you can improve it, let me know in the comments!

Finally, extra special bonus points will go to anyone who can figure out a) what the hell those odd section titles are about and b) where I got the sample data from.

Happy Coding!

Decimal vs Double and Other Tips About Number Types in .NET

I've been coding with .NET for a long time. In all of that time, I haven't really had a need to figure out the nitty-gritty differences between float and double, or between decimal and pretty much any other type. I've just used them as I see fit, and hope that's how they were meant to be used.

Until recently, anyway. Recently, as I was attending the AngleBrackets conference, and one of the coolest parts of attending that conference is getting to be in an in-depth workshop. My particular workshop was called I Will Make You A Better C# Programmer by Kathleen Dollard, and my reaction was thus:

One of the most interesting things I learned at Kathleen's session was that the .NET number types don't always behave the way I think they do. In this post, I'm going to walk through a few (a VERY few) of Kathleen's examples and try to explain why .NET has so many different number types and what they are each for. Come along as I (with her code) attempt to show what the number types are, and what they are used for!

The Number Types in .NET

Let's start with a review of the more common number types in .NET. Here's a few of the basic types:

  • Int16 (aka short): A signed integer with 16 bits (2 bytes) of space available.
  • Int32 (aka int): A signed integer with 32 bits (4 bytes) of space available.
  • Int64 (aka long): A signed integer with 64 bits (8 bytes) of space available.
  • Single (aka float): A 32-bit floating point number.
  • Double (aka double): A 64-bit floating-point number.
  • Decimal (aka decimal): A 128-bit floating-point number with a higher precision and a smaller range than Single or Double.

There's an interesting thing to point out when comparing double and decimal: the range of double is ±5.0 × 10−324 to ±1.7 × 10308, while the range of decimal is (-7.9 x 1028 to 7.9 x 1028) / (100 to 28). In other words, the range of double is several times larger than the range of decimal. The reason for this is that they are used for quite different things.

Precision vs Accuracy

One of the concepts that's really important to discuss when dealing with .NET number types is that of precision vs. accuracy. To make matters more complicated, there are actually two different definitions of precision, one of which I will call arithmetic precision.

  • Precision refers to the closeness of two or more measurements to each other. If you measure something five times and get exactly 4.321 each time, your measurement can be said to be very precise.
  • Accuracy refers to the closeness of a value to standard or known value. If you measure something, and find it's weight to be 4.7kg, but the known value for that object is 10kg, your measurement is not very accurate.
  • Arithmetic precision refers to the number of digits used to represent a number (e.g. how many numbers after the decimal are used). The number 9.87 is less arithmetically precise than the number 9.87654332.

We always need mathematical operations in a computer system to be accurate; we cannot ever expect 4 x 6 = 32. Further, we also need these calculations to be precise using the common term; 4 x 6 must always be precisely 24 no matter how many times we make that calculation. However, the extent to which we want our systems to be either arithmetically precise has a direct impact on the performance of those system.

If we lose some arithmetic precision, we gain performance. The reverse is also true: if we need values to be arithmetically precise, we will spend more time calculating those values. Forgetting this can lead to incredible performance problems, problems which can be solved by using the correct type for the correct problem. These kinds of issues are most clearly shown during Test 3 later in this post.

Why Is Int The Default?

Here's something I've always wondered. If you take this line of code:

var number = 5;  

Why is the type of number always an int? Why not a short, since that takes up less space? Or maybe a long, since that will represent nearly any integer we could possibly use?

Turns out, the answer is, as it often is, performance. .NET optimizes your code to run in a 32-bit architecture, which means that any operations involving 32-bit integers will by definition be more performant than either 16-bit or 64-bit operations. I expect that this will change as we move toward a 64-bit architecture being standard, but for now, 32-bit integers are the most performant option.

Testing the Number Types

One of the ways we can start to see the inherent differences between the above types is by trying to use them in calculations. We're going to see three different tests, each of which will reveal a little more about how .NET uses each of these types.

Test 1: Division

Let's start with a basic example using division. Consider the following code:

private void DivisionTest1()  
{
    int maxDiscountPercent = 30;
    int markupPercent = 20;
    Single niceFactor = 30;
    double discount = maxDiscountPercent * (markupPercent / niceFactor);
    Console.WriteLine("Discount (double): ${0:R}", discount);
}

private void DivisionTest2()  
{
    byte maxDiscountPercent = 30;
    int markupPercent = 20;
    int niceFactor = 30;
    int discount = maxDiscountPercent * (markupPercent / niceFactor);
    Console.WriteLine("Discount (int): ${0}", discount);
}

Note that the only thing that's really different about these two methods are the types of the local variables.

Now here's the question: what will the discount be in each of these methods?

If you said that they'll both be $20, you're missing something very important.

The problem line is this one, from DivisionTest2():

int discount = maxDiscountPercent * (markupPercent / niceFactor);  

Here's the problem: because markupPercent is declared as an int (which in turn makes it an Int32), when you divide an int by another int, the result will be an int, even when we would logically expect it to be something like a double. .NET does this by truncating the result, so because 20 / 30 = 0.6666667, what you get back is 0 (and anything times 0 is 0).

In short, the discount for DivisionTest1 is the expected $20, but the discount for DivisionTest2 is $0, and the only difference between them is what types are used. That's quite a difference, no?

Test 2 - Double Addition

Now we get to see something really weird, and it involves the concept of arithmetic precision from earlier. Here's the next method:

public void DoubleAddition()  
{
    Double x = .1;
    Double result = 10 * x;
    Double result2 = x + x + x + x + x + x + x + x + x + x;

    Console.WriteLine("{0} - {1}", result, result2);
    Console.WriteLine("{0:R} - {1:R}", result, result2);
}

Just by reading this code, we expect result and result2 to be the same: multiplying .1 x 10 should equal .1 + .1 + .1 + .1 + .1 + .1 + .1 + .1 + .1 + .1.

But there's another trick here, and that's the usage of the "{O:R}" string formatter. That's called the round-trip formatter, and it tells .NET to display all parts of this number to its maximum arithmetic precision.

If we run this method, what does the output look like?

By using the round-trip formatter, we see that the multiplication result ended up being exactly 1, but the addition result was off from 1 by a miniscule (but still potentially significant) amount. Question is: why does it do this?

In most systems, a number like 0.1 cannot be accurately represented using binary. There will be some form of arithmetic precision error when using a number such as this. Generally, said arithmetic precision error is not noticeable when doing mathematical operations, but the more operations you perform, the more noticeable the error is. The reason we see the error above is because for the multiplication portion, we only performed one operation, but for the addition portion, we performed ten, and thus caused the arithmetic precision error to compound each time.

Test 3 - Decimal vs Double Performance

Now we get to see something really interesting. I'm often approached by new .NET programmers with a question like the following: why should we use decimal over double and vice-versa? This test pretty clearly spells out when and why you should use these two types.

Here's the sample code:

private int iterations = 100000000;

private void DoubleTest()  
{
    Stopwatch watch = new Stopwatch();
    watch.Start();
    Double z = 0;
    for (int i = 0; i < iterations; i++)
    {
        Double x = i;
        Double y = x * i;
        z += y;
    }
    watch.Stop();
    Console.WriteLine("Double: " + watch.ElapsedTicks);
    Console.WriteLine(z);
}

private void DecimalTest()  
{
    Stopwatch watch = new Stopwatch();
    watch.Start();
    Decimal z = 0;
    for (int i = 0; i < iterations; i++)
    {
        Decimal x = i;
        Decimal y = x * i;
        z += y;
    }
    watch.Stop();
    Console.WriteLine("Decimal: " + watch.ElapsedTicks);
    Console.WriteLine(z);
}

For each of these types, we are doing a series of operations (100 million of them) and seeing how many ticks it takes for the double operation to execute vs how many ticks it takes for the decimal operations to execute. The answer is startling:

The operations involving double take 790836 ticks, while the operations involving decimal take a whopping 16728386 ticks. In other words, the decimal operations take 21 times longer to execute than the double operations. (If you run the sample project, you'll notice that the decimal operations take visibly longer than the double ones).

But why? Why does double take so much less time than decimal?

For one thing, double uses base-2 math, while decimal uses base-10 math. Base-2 math is much quicker for computers to calculate.

Further, what double is concerned with is performance, while what decimal is concerned with is precision. When using double, you are accepting a known trade-off: you won't be super precise in your calculations, but you will get an acceptable answer quickly. Whereas with decimal, precision is built into its type: it's meant to be used for money calculations, and guess how many people would riot if those weren't accurate down to the 23rd place after the decimal point.

In short, double and decimal are two totally separate types for a reason: one is for speed, and the other is for precision. Make sure you use the appropriate one at the appropriate time.

Summary

As can be expected from such a long-lived framework, .NET has several number types to help you with your calculations, ranging from simple integers to complex currency-based values. As always, it's important to use the correct tool for the job:

  • Use double for non-integer math where the most precise answer isn't necessary.
  • Use decimal for non-integer math where precision is needed (e.g. money and currency).
  • Use int by default for any integer-based operations that can use that type, as it will be more performant than short or long.

Don't forget to check out the sample project over on GitHub!

Are there any other pitfalls or improvements we should be aware of? Feel free to sound off in the comments!

Happy Coding!

Huge thanks to Kathleen Dollard (@kathleendollard) for providing the code samples and her invaluable insight into how to effectively explain what's going on in these samples. Check out her Pluralsight course for more!