
Between Two Stacks: The Consequences of a Data-Less Decision

We've been having an ongoing debate in our team about what architecture to use to implement our new enterprise-level application. There are two possible solutions, one familiar, one fast, but we can't seem to reach a conclusion as to which to use. A lack of applicable data is forcing us to make this key decision on intuition and guesswork, and I can't help but wonder how else we might be able to decide which path to take.

A nighttime long-exposure photo of lights from speeding cars on a seaside highway, leaving bright colored lines in their wake Speed lights 2 from Flickr, used under license

Familiarity vs Performance

Our new teammate Jerry, my boss Frank, and I have been kicking around ways to ensure that this new service will be blazing fast and thoroughly scalable, since much of our company's infrastructure will depend on it. Specifically, we're trying to determine the best (read: fastest) way of accessing the information in this system's database, since we believe that the number of reads will be orders of magnitude larger than the number of writes. It was partly for this reason that I benchmarked the performance of Entity Framework vs Dapper vs ADO.NET.

Throughout all of this, Jerry, Frank, and I have collectively tried to determine which assortment of technologies will allow the system to be both blazing fast and scalable, as well as not too different from what we already know. This, as you might imagine, is more difficult than we thought it would be.

There are two possible architectures we have bandied about. The first one is the one my group is most familiar with: the Microsoft stack of SQL Server, Entity Framework, and ASP.NET Web API. We build almost all of our other apps using this stack, and so development time would be much quicker if we use this setup.

The second possible architecture involves a less-familiar but theoretically more-performant stack: Redis, Dapper, RabbitMQ, and Web API, implemented using the Command Query Responsibility Segregation (CQRS) pattern. In theory, this architecture would allow the system to be more redundant, more scalable, more performant, more testable, more everything (at least according to Jerry). Problem is, with the exception of Web API, nobody on my team has ever developed any products using these technologies or patterns.

So, since we lack the experience to make an educated guess as to which technology stack is "better", we wanted to use metrics to help us make a more informed decision. That, unfortunately for us, proved to be impossible.

Blindfolded Decision-Making

There's an implicit assumption in the desire to use data to make a decision, and that is that said data exists.

Our thought process went like this: if we can determine the amount of load this system will need to handle, we could make a better decision on which architecture to use (moderate load = familiar stack, heavy load = performance stack). Say we choose to go with the full MS-stack (SQL Server, Entity Framework, Web API), which many (including both Jerry and Frank) have argued will be less optimized and less performant than the theoretically-optimized stack (Redis, Dapper, RabbitMQ, Web API). In an absolute sense, we will be picking the slower option. Do we care? Even if it is the slower of the two options, would it be fast enough for our purposes?

We have no data, no metrics, no information of any kind that can give us an idea of what our load expectation will be. There's no infrastructure in place, no repository of statistics and metrics that we can review, parse, and draw conclusions from. How can we make a decision as to which architecture to use if we don't have any pertinent data?

It's a Catch-22. We need the metrics to choose the best architecture, but we need to actually implement the damn thing in order to get metrics, and implementation requires us to select an architecture. In the best case, the metrics could reveal a clear path for us to venture down. In the worst case, well, we'd be in the same situation we are now, having to make an important decision while blindfolded due to lack of supporting data.

So how will we break this impasse? We're just gonna have to pick one.

There's no other choice left to us; we'll need to pick which stack we think is best for now, implement it, and improve it later as we start to collect metrics. Given this, it seems likely that we'll go with the performance-optimized stack, since we know that will provide us scalability and responsiveness benefits into the future.

Still, though, I have to wonder if the metrics we needed might have clearly shown us the path we should go down. Without evidence, the decision being debated is one that will be made out of hope, not proof. For now, we'll just have to hope that we will choose correctly.

Have you ever encountered a decision like this, where the "best" solution wasn't clear and the methods by which you could determine which solution was better didn't exist or weren't thorough enough? How did you pick a solution? Let me know in the comments!

Ten Commandments For Naming Your Code

There are only two hard things in Computer Science: cache invalidation and naming things.

-- Phil Karlton

Naming things is hard.

A pie chart, titled 'Programmer’s Hardest Tasks', with 'Naming things' taking 49% of the chart
Image taken from How to Name Things, used under license

As developers, we spend a lot of time and effort trying to name things appropriately. This can cause us no small amount of frustration, as the ability to name things properly requires abilities (a mastery of your primary spoken language, a larger vocabulary, etc.) that aren't necessarily "part of the job" for everyday developers.

Still, good naming is a critical skill. Code is useless if no one else can read it, and so we must name our variables and functions, our methods and resources appropriately so someone (probably future you) can find and understand them again later.

Because naming is so difficult, it helps to have a particular set of rules to follow, even if you can't always abide by all of them. So, I present to you the totally not made up Ten Commandments For Naming Your Code!

Ten Commandments For Naming Your Code

  1. Thou shalt be specific.
  2. Thou shalt not use unnecessary words.
  3. Thou shalt not use abbreviations.
  4. Thou shalt use the code's primary human language.
  5. Thou shalt not make up words.
  6. Thou shalt not include type.
  7. Thou shalt only use non-obvious words if the meaning is obvious.
  8. Thou shalt prefer active voice.
  9. Thou shalt use consistent syntax.
  10. Thou shalt break these rules if necessary.

Let's investigate each of these commandments to better understand what they mean.

Thou shalt be specific

The entire purpose of naming is to impart meaning. Any programmer of any skill level should be able to read our code and understand what the objects represent, even if they don't yet know how everything fits together.

We cannot convey the correct meaning without being specific. Take this string for example:

string name = "Jack";  

What kind of name is this? A first name, a last name, a middle name, something else? Does it make a difference? A person who has no experience with this code will not know what precisely this variable means. If this string is, in fact, something more specific than just a name, then we should add more specific wording:

string monkeyName = "Jack"; //We named the monkey Jack!  

Thou shalt not use unnecessary words

Don't go overboard on being specific, though. There's no reason to include words that don't directly add meaning to the name; they're just noise. Consider:

double interestRateForDatabaseStorageAndPaymentCalculations = 0.05;  

Well, that sure is specific. But will interest rate ever not be used for "database storage" and "payment calculations"? Most of those words could be considered extra, so we could reduce the name down to this:

double interestRate = 0.05;  

By trimming this name down to only the words that carry meaning, we've made it easier for another programmer to see at a glance what this value represents. The more they understand, the more quickly they can start contributing to the code themselves.

To be fair, using the first two commandments together provides a fine line for anybody to walk. Only experience and failure will let us inch closer to the ideal of being specific without being long-winded.

Thou shalt not use abbreviations

There is no guarantee of whether another programmer will understand what a particular abbreviation means, but there is a guarantee that at least one person won't. Consider:

int dpaTotal = 2000;  

What does dpaTotal stand for? We have no idea, and unless we're intimately familiar with this code base already, we are unlikely to uncover what this abbreviation actually means just from reading the nearby lines. Instead, ditch the abbreviation and write out the words that the abbreviation is hiding:

int downPaymentTotal = 2000;  

Now we have a much better grasp on what is going on here: this is a down payment amount, and we don't need to go digging further into the code to understand what information this variable holds.

Thou shalt use the code's primary human language

If the rest of the codebase is written in English, write your code in English. If the codebase is written in Farsi, in Zulu, in Wingdings or however your team communicates with each other, use that same primary language. There's no reason to have Spanish words show up in a codebase written in English; no debemos que ser bilingüe para leer el código (roughly: we shouldn't have to be bilingual to read the code).

Thou shalt not make up words

Don't do this:

var grafutie = "10M-12F"; //What's this mean?  

Making up words forces developers who are new to the codebase to ask someone what the word means, and nobody wants to waste time trying to figure out what flagranvoci serit hansaio gerotaman.

Thou shalt not include type

Also known as: don't use Hungarian Notation.

Adding a type or prefix of any kind (especially if it is abbreviated) generally adds to the length of the name without also imparting additional meaning. In many languages (and IDEs, and environments, and so on) the type will be obvious from the usage or declaration, so including type in our names is just adjExtra nounNoise prnounWe vrbMust advMentally vrbFilter nounOut.
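As a small illustration, compare the same calculation written with and without type prefixes. This is only a sketch; NamingDemo and every variable name in it are made up for this example and don't come from any real codebase.

```csharp
using System;

public static class NamingDemo
{
    public static decimal TotalWithPrefixes()
    {
        // Hungarian-style: each prefix repeats what the declaration already says.
        int intItemCount = 3;
        decimal decUnitPrice = 19.99m;
        return intItemCount * decUnitPrice;
    }

    public static decimal TotalWithoutPrefixes()
    {
        // Same logic; the declared types still tell us everything the prefixes did.
        int itemCount = 3;
        decimal unitPrice = 19.99m;
        return itemCount * unitPrice;
    }
}
```

Both methods return the same total; the second one just has less to read past.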

Thou shalt only use non-obvious words if the meaning is obvious

This is perfectly fine:

for(int x = 99; x >= 0; x--)  

In this case, x is only being used as an iterator, and has no further meaning. A fellow programmer will most likely read this line and understand what x is being used for.

Of course, if x actually did have a more important meaning than just being an iterator, we should name it appropriately:

for(int bottlesOfBeerOnTheWall = 99; bottlesOfBeerOnTheWall >= 0; bottlesOfBeerOnTheWall--) //Take one down, pass it around!  

Thou shalt prefer active voice

In a sentence written using active voice, the thing doing the action is the subject of that sentence. Active voice shows that an action is being done by an object. An example might look like this:

public void Transmogrify()  

The word Transmogrify is written in active voice, showing that this method will transmogrify something. This is opposed to passive voice, which looks like this:

public void GetTransmogrification()  

In this snippet, GetTransmogrification is passive; it does not convey exactly how it will get the transmogrification. There will be times when we need passive voice, but usually, since our code will be doing the action, we write our code using the corresponding active voice.

Thou shalt use consistent syntax

Regardless of which syntax rules are in use in your project, use the same ones.

If all variables start with an underscore, our variables should start with an underscore. If all classes contain the word Class, ours should as well. Even if the existing syntax violates some of the other commandments, we should still use it, because being consistent is more important than trying to impose rules on a system that isn't using them already.

If you are in a position to change or influence the syntax rules you are using so that they are more consistent (or more sane), do it. Otherwise, suck it up and write your code so that it looks like all the other code in the project. A codebase should be written in a single, consistent style, no matter what (or how stupid) it is.

Thou shalt break these rules if necessary

You don't have to listen to me. There's no substitute for critical thinking. No set of guidelines will cover every possibility, and these are no exception. Follow these "commandments" whenever feasible, but if you must break them, do so and don't fret about it.

Summary

Naming is hard, yes, but it is not impossible. By putting a little extra effort into devising good names, we improve our code's readability immensely. Trust me, the time spent thinking about good names is worth it. Future you, and any other developers that inevitably will end up maintaining your code, will thank you for it.

Need more info? Check out Robert C Martin's book Clean Code, which is the definitive code craftsmanship handbook.

Do you know of any naming guidelines that I may have missed, or that have helped you solve your naming issues? Are any of these "commandments" more important than the others? Let me know in the comments!

Happy Coding!

Best Practices: Fight Code Ambiguity with Enumerations

I've written before about the idea that code needs to have a reason to exist. Right alongside that idea is another I frequently find myself having to be reminded of: code must have a clear, explicit meaning.

Let me clarify what I mean by that (irony alert!). I think that a reason to exist and meaning are two distinct ideas. In my mind, having meaning gives the code purpose and importance (much like it does for humans). Funny thing is, many coders don't put enough effort into revealing the meaning of their code to their fellow developers. The absence of meaning leads to one of the most dangerous enemies a programmer will ever have to face: a monster known as Ambiguity.

How do you slay this beast? There are many avenues to take, and one of them is to wield a weapon known to us programmers as Enumerations.

Ambiguity: the Programmer's Terror

Consider the following line of code:

var value = DownloadFile("FileName.pdf", true);  

After reading this line, we immediately have a problem: we don't know what true actually means! We know (or can reasonably assume) what the DownloadFile() method is supposed to do, and the first parameter looks like a filename, but what in the world does that second parameter represent? We have no idea, and now we have to dig down further to find out. Digging into this particular method won't take much time, but if we're having to do it on tens or hundreds of different methods in a given day it gets exhausting rather quickly.

Let us also consider this scenario:

var success = UploadFile("FileName.pdf", content, 1, 5, 12);  

That's even worse than the previous one! What do 1, 5, and 12 mean? Size, number of pages, type of file? We can't determine that from just reading this snippet. Worse, anybody who comes along and needs to make changes to this code is less likely to actually follow through, since they'll probably have to know what the values mean in order to feel comfortable making changes and they won't have easy access to that knowledge. Not only does this snippet have awful readability, it indirectly hampers developers' ability to make the code better.

The monster Ambiguity has reared its ugly head in these two examples, spitting fire and promising doom. So what do we do about it? We pick up the mighty sword known as Enumerations and use it to cut the head off of the beast.

The Best Defense is a Good Enum

Enumerations (aka Enums) are a construct in .NET used to provide a value with a name. By naming our Enums properly and clearly, we can give meaning to otherwise meaningless values.

An example Enumeration might look like this:

public enum HistoryType  
{
    Standard,
    Detailed,
    Exempt
}

Note that Enumerations in .NET, by default, start at value 0 and increment. So for HistoryType, Standard has value 0, Detailed has value 1, and Exempt has value 2. If we want to set specific values, we can do so like this:

public enum HistoryType  
{
    Standard = 1,
    Detailed = 4,
    Exempt = 9
}
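
To see those underlying values for yourself, you can cast each member to int. The following is a minimal sketch; the type names DefaultHistoryType, ExplicitHistoryType, and EnumValueDemo are invented here so that both versions of the enum can live side by side.

```csharp
using System;

// Default numbering: 0, 1, 2
public enum DefaultHistoryType { Standard, Detailed, Exempt }

// Explicit numbering, matching the values used above
public enum ExplicitHistoryType { Standard = 1, Detailed = 4, Exempt = 9 }

public static class EnumValueDemo
{
    public static void PrintValues()
    {
        Console.WriteLine((int)DefaultHistoryType.Detailed);   // prints 1
        Console.WriteLine((int)ExplicitHistoryType.Detailed);  // prints 4

        // Casting a number back to the enum gives us the matching name.
        Console.WriteLine((ExplicitHistoryType)9);             // prints Exempt
    }
}
```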

If we wanted to update the code snippets we saw earlier to use Enumerations, we might do so like this:

var value = DownloadFile("FileName.pdf", FileDownloadMethod.NewTab);

var value2 = UploadFile("FileName.pdf", content, HasDescription.Yes, HistoryType.Exempt, EmailTarget.Executives);  

I'm a huge fan of Enumerations, and I think these examples nicely illustrate why. Since we've refactored the two bad examples from earlier into using Enums, we can now tell exactly what they do without needing to dive any further. We've saved ourselves precious time by just taking a bit more time to clearly explain what these values mean.
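
The snippets above don't show how FileDownloadMethod or DownloadFile are declared, so here is one guess at a possible shape. The SameTab member, the FileService class, the string return type, and the method body are all assumptions made purely for illustration.

```csharp
using System;

// Hypothetical enum: replaces the bare `true` from the earlier call.
public enum FileDownloadMethod
{
    SameTab,
    NewTab
}

public static class FileService
{
    // Hypothetical stand-in for DownloadFile; it only describes what it would do.
    public static string DownloadFile(string fileName, FileDownloadMethod method)
    {
        switch (method)
        {
            case FileDownloadMethod.NewTab:
                return "Opening " + fileName + " in a new tab";
            case FileDownloadMethod.SameTab:
                return "Opening " + fileName + " in the current tab";
            default:
                throw new ArgumentOutOfRangeException(nameof(method));
        }
    }
}
```

A call such as FileService.DownloadFile("FileName.pdf", FileDownloadMethod.NewTab) now explains itself at the call site, and if a third download option ever shows up we can add a member to the enum instead of bolting on another boolean.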

There are two situations in which using Enumerations is almost always the best option:

  1. When you would otherwise be passing a boolean parameter to a function.
  2. When you would otherwise be using a magic number.

There's no reason to let Ambiguity rule your code when the sword of Enumerations can be wielded so easily. Use it wisely, and you too shall be endowed with the skills to fell this mighty beast.

Or, at least, write cleaner code. Whatever works for you.

Happy Coding!

Using POST-REDIRECT-GET in ASP.NET MVC

Anybody who's been on the internet for more than five seconds has encountered one of those browser dialogs asking whether to resubmit a form.

I'm a fan of getting rid of anything that interferes with the user experience, and these dialogs certainly get in the way. There's a pattern we can implement, called POST-REDIRECT-GET, that will eliminate these dialogs. Let's see what that pattern is, and how we can implement it in a simple ASP.NET MVC application.

What is PRG?

POST-REDIRECT-GET is a pattern that says a POST action should always REDIRECT to a GET action. This pattern is meant to provide a more intuitive interface for users, specifically by reducing the number of duplicate form submissions.

The Normal Way

Here are the code files we'll use for a regular POST scenario:

ViewModels/Home/AddUserVM.cs
using System;
using System.ComponentModel;
using System.ComponentModel.DataAnnotations;

public class AddUserVM  
{
    [DisplayName("First Name:")]
    [Required(ErrorMessage = "Please enter a first name.")]
    public string FirstName { get; set; }

    [DisplayName("Last Name:")]
    [Required(ErrorMessage = "Please enter a last name.")]
    public string LastName { get; set; }

    [DisplayName("Date of Birth:")]
    [DataType(DataType.Date)]
    //Note: "MM" is month; lowercase "mm" would format minutes
    [DisplayFormat(DataFormatString = "{0:yyyy-MM-dd}", ApplyFormatInEditMode = true)]
    [Required(ErrorMessage = "Please select a date of birth.")]
    public DateTime DateOfBirth { get; set; }
}
Controllers/HomeController.cs
public class HomeController : Controller  
{
    [HttpGet]
    public ActionResult Normal()
    {
        AddUserVM model = new AddUserVM();
        return View(model);
    }

    [HttpPost]
    public ActionResult Normal(AddUserVM model)
    {
        if(!ModelState.IsValid)
        {
            return View(model);
        }

        return RedirectToAction("Index");
    }
}
Views/Home/Normal.cshtml
@model StrictPRGDemo.ViewModels.Home.AddUserVM

<h2>Add a User (Normal)</h2>

@using(Html.BeginForm())
{
    <div>
        <div>
            @Html.LabelFor(x => x.FirstName)
            @Html.TextBoxFor(x => x.FirstName)
            @Html.ValidationMessageFor(x => x.FirstName)
        </div>
        <div>
            @Html.LabelFor(x => x.LastName)
            @Html.TextBoxFor(x => x.LastName)
            @Html.ValidationMessageFor(x => x.LastName)
        </div>
        <div>
            @Html.LabelFor(x => x.DateOfBirth)
            @Html.EditorFor(x => x.DateOfBirth)
            @Html.ValidationMessageFor(x => x.DateOfBirth)
        </div>
        <div>
            <input type="submit" value="Save" />
        </div>
    </div>
}

Our rendered page looks like this:

If we immediately click Save, our validation fires:

But what happens if we refresh the page? We get the browser's duplicate submission warning:

Let's get rid of that, using PRG.

The PRG Way

PRG says that all POSTs need to redirect to a GET action, which sounds easy enough. But if we try this in a naive solution, where instead of returning the View(model) when validation fails we just redirect back to the GET action, the validation messages and input values will not appear.

Here's the problem: those validation messages and input values are stored in the ModelState, which gets recreated when moving between actions. We need a way to save the model state somewhere that we can access it later.

Lucky for us, there's a data structure called TempData. TempData allows data to exist for the current request and the next one, and then the data gets deleted. Sounds like that'll fill our needs, don't it?
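
To make that lifetime concrete, here is a toy model of a read-once store. It is not the real TempData class, which the framework manages for us across requests; ReadOnceStore, Put, Take, and ReadOnceDemo are names invented for this sketch, and the "requests" are just method calls.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for TempData: NOT the real ASP.NET MVC type.
public class ReadOnceStore
{
    private readonly Dictionary<string, object> _items = new Dictionary<string, object>();

    public void Put(string key, object value)
    {
        _items[key] = value;
    }

    // Returns the stored value and removes it, so a later read finds nothing.
    public object Take(string key)
    {
        object value;
        if (_items.TryGetValue(key, out value))
        {
            _items.Remove(key);
            return value;
        }
        return null;
    }
}

public static class ReadOnceDemo
{
    public static void Run()
    {
        var store = new ReadOnceStore();

        // "Request 1" (the POST): save something before redirecting.
        store.Put("errors", "Please enter a first name.");

        // "Request 2" (the GET we redirected to): the value is still there.
        Console.WriteLine(store.Take("errors") ?? "(nothing)");

        // "Request 3" (a refresh): the value is gone.
        Console.WriteLine(store.Take("errors") ?? "(nothing)");
    }
}
```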

In fact, six years ago, Kazi Mansur Rashid wrote a blog post that laid out exactly how we can use TempData to store the ModelState, and even better, how we can wrap that action in a set of attributes.

Here are those attributes:

Attributes/ModelStateTransfer.cs
using System.Web.Mvc;

public abstract class ModelStateTransfer : ActionFilterAttribute  
{
    protected static readonly string Key = typeof(ModelStateTransfer).FullName;
}

public class ExportModelStateAttribute : ModelStateTransfer  
{
    public override void OnActionExecuted(ActionExecutedContext filterContext)
    {
        //Only export when ModelState is not valid
        if (!filterContext.Controller.ViewData.ModelState.IsValid)
        {
            //Export if we are redirecting
            if ((filterContext.Result is RedirectResult) || (filterContext.Result is RedirectToRouteResult))
            {
                filterContext.Controller.TempData[Key] = filterContext.Controller.ViewData.ModelState;
            }
        }

        base.OnActionExecuted(filterContext);
    }
}

public class ImportModelStateAttribute : ModelStateTransfer  
{
    public override void OnActionExecuted(ActionExecutedContext filterContext)
    {
        ModelStateDictionary modelState = filterContext.Controller.TempData[Key] as ModelStateDictionary;

        if (modelState != null)
        {
            //Only Import if we are viewing
            if (filterContext.Result is ViewResult)
            {
                filterContext.Controller.ViewData.ModelState.Merge(modelState);
            }
            else
            {
                //Otherwise remove it.
                filterContext.Controller.TempData.Remove(Key);
            }
        }

        base.OnActionExecuted(filterContext);
    }
}

This creates two attributes: ExportModelState and ImportModelState.

ExportModelState takes the current ModelState and stores it in TempData, and ImportModelState reads from TempData and merges the found ModelState (if it exists) into the new one. Simple, right?

In fact, in usage, it's even simpler:

Controllers/HomeController.cs
[HttpGet]
[ImportModelState]
public ActionResult Strict()  
{
    AddUserVM model = new AddUserVM();
    return View(model);
}

[HttpPost]
[ExportModelState]
public ActionResult Strict(AddUserVM model)  
{
    if (!ModelState.IsValid)
    {
        return RedirectToAction("Strict");
    }
    return RedirectToAction("Index");
}

Know what's even better? Our ViewModel and View don't have to change at all!

Now, let's imagine we're back at this point:

If we refreshed the page, in the normal scenario we would have gotten the duplicate submission warning, but in our PRG scenario, we get this:

Look at that! We don't get an error, and the page actually refreshed itself rather than trying to resubmit the values. Best of all, the user's experience is that the page did exactly what the user told it to do. It's a little more work than the normal way, but IMO it makes it just that much nicer for our users.

It's just that easy! Check out the sample project on GitHub and let me know what you think of this technique in the comments.