Microsoft.Data - It’s not as evil as you think

I wanted to jump in with my $0.02 (Canadian ;)) on Microsoft.Data.  David Fowler, fellow ASP.Net team member, posted about it earlier today, and the response has been active :).  The message behind most of these responses has been that it encourages bad practices among novice developers.  I think there’s an important point that’s being missed here: no matter how hard we work, as professional developers, to create clean architectures and abstractions, there’s a whole world of novice developers who just don’t care about that.  They want to write code now and be done with it.  This is the audience targeted by Microsoft.Data, and the WebMatrix product as a whole.  If novice developers come to the Microsoft platform and see these complicated architectures (which, don’t get me wrong, have massive benefits for professional development), they simply won’t adopt them; they’ll just head to a different platform which allows them to get their job done quickly.

Microsoft.Data is, in my opinion, actually a step in the right direction.  For example, let’s say I’m a novice developer, and I go out and look for samples and documentation in order to piece together a simple product list from my database.  As long as we do our work right (and we plan to), the documentation will lead me to write something like this:

@{ var products = db.Query("SELECT * FROM Products WHERE CategoryId = @0", categoryId); }

@foreach(var product in products) {
    <li>@product.Name</li>
}

It works, and it’s SQL Injection safe.  Note that although David didn’t blog directly about this, Microsoft.Data fully supports parameterized SQL and it actually supports it better than traditional ADO.Net (note that I don’t have to fiddle with SqlParameter objects).
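For comparison, here is a rough sketch of what building the same parameterized query looks like with traditional ADO.Net.  The class and method names are made up for illustration, and the command would still need a connection and a reader loop before any rows come back:

```csharp
using System.Data.SqlClient;

// Hypothetical helper, shown only to contrast with the db.Query call above.
public static class TraditionalAdoNet
{
    public static SqlCommand BuildProductQuery(int categoryId)
    {
        var command = new SqlCommand(
            "SELECT * FROM Products WHERE CategoryId = @categoryId");

        // Every placeholder needs its own explicitly named parameter object.
        command.Parameters.Add(new SqlParameter("@categoryId", categoryId));
        return command;
    }
}
```

With Microsoft.Data the value is simply passed as an extra argument and matched to "@0" by position.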

Now, I’m learning more about proper architecture, and I decide to switch to an ORM and use Linq for my queries.  A quick one-liner change and I’m all set:

@{ var products = db.Products.Where(p => p.CategoryId == categoryId); }

@foreach(var product in products) {
    <li>@product.Name</li>
}

[NOTE: WebMatrix doesn’t include a true ORM, as there are plenty of good options out there.  This is just a sample of how one could ramp up from Microsoft.Data to more powerful ORMs.]

And if I want to go full-tilt and use ASP.Net MVC, I can move my data access code into a Controller (or even deeper into my architecture) and then a bit of copy-pasting gets me to:

public ActionResult Products(int categoryId) {
    var products = db.Products.Where(p => p.CategoryId == categoryId);
    return View(products);
}

@foreach(var product in Model) {
    <li>@product.Name</li>
}

From there I'm free to keep refactoring things behind further abstractions (Repository patterns, etc.). I've started simple and refactored as necessary to improve the architecture. In every step, I've been able to take a lot of the existing code with me, which traditional ADO.Net and other platforms (like PHP) don't make quite as easy.

The simple fact is this: The audience we’re targeting is already using inline SQL, they are perfectly happy to keep doing so and they are not interested in clean abstractions (to the point of finding them complex and unnecessary).  Microsoft.Data, and the entire ASP.Net Web Pages framework (the inline page model used in WebMatrix), is an attempt to provide a simple model for web development that provides on-ramps to guide users towards best practices.

Inside Razor - Part 3 - Templates

One of the features of Razor which hasn’t been discussed a lot is the Inline Template feature.  Razor includes the ability to provide an inline Razor template as an argument to a method.  At the moment, this is only used by the Grid helper in ASP.Net Web Pages (and Scott Guthrie showed it back in his original blog post).  However, we don't have much documentation on how to create your own templated helpers yet, so I figured I'd talk a bit about it.

First, let’s take a look at what code is generated when we use an inline template.  I’ve written a sample templated helper called “Repeat” which just repeats the content of the template a specified number of times (we'll take a look at the implementation later).  The page that uses this helper looks something like this:

<!DOCTYPE html>
<html>
     <head>
         <title>Repeat Helper Demo</title>
     </head>
     <body>
         <p>Repeat Helper</p>
         <ul>
             @Repeat(10, @<li>List Item</li>);
         </ul>
     </body>
</html>

And when we run it, we get the following output:

[Screenshot of the rendered HTML]

Let’s take a look at the implementation of “Repeat”.  I wrote it inline in the page in this case, but you could just as easily write it in a static class in App_Code and reference it that way.  In Razor, the “@functions” block lets you write code that will be injected as-is into the body of the generated class.

@using System.Text;
@functions {
    public static IHtmlString Repeat(int times, Func<int, object> template) {
        StringBuilder builder = new StringBuilder();
        for(int i = 0; i < times; i++) {
            builder.Append(template(i));
        }
        return new HtmlString(builder.ToString());
    }
}

So, the template is being passed in as a Func<int, object> and when we invoke it, we get back the result of running the template.  But, if you look at line 6 of that listing (the builder.Append call), you’ll notice we’re passing in an argument to the template function.  Let’s take a look at the C# that is generated when we write a call to Repeat.  Here’s the generated code to match the Razor code in the first code listing:

this.Write(Repeat(10,item => new Microsoft.WebPages.Helpers.HelperResult(__writer => {
     @__writer.Write(" ");
     @__writer.Write("<li>List Item</li>");
})));

It’s a little complex looking, but essentially, what’s happening is that we’re writing out a lambda which accepts a single parameter called “item” (the type of which is determined by the method you’re passing it to).  When that lambda runs, we construct and return a HelperResult.  HelperResult is a class defined in the ASP.Net Web Pages framework, and it’s essentially a wrapper around yet another delegate which writes text to a TextWriter.  Think of it as a mini Razor template: when you invoke the delegate, it writes the content of the template.  The advantage of wrapping the delegate up in the HelperResult class is that we can treat it just like a string in most places, since it overrides ToString to return the result of executing the template.
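To make the wrapper idea concrete, here is a minimal sketch of a class in the same spirit.  MiniHelperResult and the demo class are hypothetical stand-ins written for this post, not the framework's actual HelperResult:

```csharp
using System;
using System.IO;

// Hypothetical stand-in for HelperResult: wraps a delegate that writes to a TextWriter.
public class MiniHelperResult
{
    private readonly Action<TextWriter> _writeAction;

    public MiniHelperResult(Action<TextWriter> writeAction)
    {
        if (writeAction == null) throw new ArgumentNullException("writeAction");
        _writeAction = writeAction;
    }

    // Runs the wrapped template into a buffer, so callers can treat the result like a string.
    public override string ToString()
    {
        using (var writer = new StringWriter())
        {
            _writeAction(writer);
            return writer.ToString();
        }
    }
}

public static class MiniHelperResultDemo
{
    // Mirrors the shape of the generated lambda: item => new HelperResult(writer => ...).
    public static string RenderItem(int number)
    {
        Func<int, object> template = item => new MiniHelperResult(writer =>
        {
            writer.Write("<li>List Item #");
            writer.Write(item);
            writer.Write("</li>");
        });
        return template(number).ToString();
    }
}
```

Because StringBuilder.Append(object) calls ToString on its argument, a helper like Repeat can append the result of each template call without knowing anything about the wrapper.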

The “item” parameter is used in helpers like the Grid helper to provide the current data item to the template so that it can be used.  The Repeat helper passes in the iteration number as this parameter, which we can access from within the template by using “@item”.  For example:

@Repeat(10, @<li>List Item #@item</li>);

Which will render:

[Screenshot of the rendered list, with each item showing its iteration number]

In summary, if you want to use Razor templates in your helper methods, just add a parameter with the type Func<?, object> where ? can be any type you want.  As an interesting little exercise, try converting the Repeat helper to take an IEnumerable<T> and pass each item of that enumerable to the template, rendering the result.
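If you want something to compare your attempt against, here is one possible shape for that exercise.  It is only a sketch: the names TemplateHelpers and RepeatEach are my own, and it returns a plain string so that it compiles without System.Web (inside a Razor page you would wrap the result in an HtmlString, as Repeat does):

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper class; not part of ASP.Net Web Pages.
public static class TemplateHelpers
{
    // Runs the template once per item and concatenates the results.
    public static string RepeatEach<T>(IEnumerable<T> items, Func<T, object> template)
    {
        var builder = new StringBuilder();
        foreach (T item in items)
        {
            // Append(object) calls ToString() on whatever the template returned.
            builder.Append(template(item));
        }
        return builder.ToString();
    }
}
```

The type argument of Func<T, object> is what decides the type of “item” inside the template, so passing a list of products would give the template a product to work with.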

I've uploaded the Razor file containing my sample helper here: RepeatHelper.cshtml.txt (.56 KB) (Note: To avoid issues with file types on my hoster, it has a .txt extension, just remove that and you're good to go!)

Please feel free to ask questions in the comments, or by emailing me at andrew AT andrewnurse DOT net.  I'm also on twitter at @anurse and I keep an eye on the "razor" tag on StackOverflow so there's no shortage of ways to ask!

Using the Razor parser outside of ASP.Net

When Scott Guthrie originally blogged about Razor, he mentioned that it was fully hostable outside of ASP.Net.  The engine itself is not quite as detached from System.Web as we’d like, but it’s close and we’re going to get it way closer in the next release.

Having said that, you can still host Razor outside of the ASP.Net pipeline with the current beta! It’s a little trickier, and you do technically need to reference System.Web.  I’ve written a sample console app that I’m attaching to this post called “rzrc” which takes in a .cshtml or .vbhtml Razor file and runs it through the parser and code generator to produce a .cs or .vb file.  I’ll walk through the main logic here and go over what each section does.

However, I was not the first to do this! Full credit for that goes to Gustavo Machado, who wrote an excellent post in which he used Reflector to work out how to run the Razor parser and code generator.  Well done Gustavo!  There are a few things that this version does that Gustavo’s doesn’t, such as cleaning up the Web-related stuff in the generated code and selecting the language based on the Code Language, but he basically hit it spot on!

The first thing my console app does is get the input file name, extract the extension and look up what Razor Code Language it uses.  This is done using the CodeLanguageService class, which is part of the Razor APIs:

CodeLanguageService languageService = CodeLanguageService.GetServiceByExtension(extension);
if (languageService == null) {
    Console.WriteLine("{0} is not a Razor code language", extension);
    return;
}

Then, we fire up the parser and the code generator.  A CodeLanguageService is basically a factory for constructing a Code Parser, which parses the code blocks after an “@”, and a matching Code Generator, which writes the final C# or VB class.

InlinePageParser parser = new InlinePageParser(languageService.CreateCodeParser(), new HtmlMarkupParser());
CodeGenerator codeGenerator = 
    languageService.CreateCodeGeneratorParserListener(className,
                                                        rootNamespaceName: "Template", 
                                                        applicationTypeName: "object", 
                                                        inputFileName, 
                                                        baseClass: "System.Object");

When you run the Razor Parser, you must provide it with an object implementing IParserConsumer.  This interface has callbacks which the parser will call when it encounters various Razor constructs (more details on the Razor parse tree later).  CodeGenerator implements this interface and responds to these callbacks by generating code.  However, it does nothing with the errors, so in the console app, I’ve written a very simple IParserConsumer called CustomParserConsumer which wraps the code generator and outputs errors to the console.  I won’t put the code here, but it’s in the sample, so take a look there if you’re interested.

Now that we’ve got all the objects we need, we can actually run the parser over the input:

CustomParserConsumer consumer = new CustomParserConsumer() { CodeGenerator = codeGenerator };
using (StreamReader reader = new StreamReader(inputFileName)) {
    parser.Parse(reader, consumer);
}

Once Parse returns, the Code Generator will have built a CodeDOM tree representing the generated code during the callbacks, so we know that our code is ready to go.  Right now, the Code Generator adds in some web specific things.  For example, when we constructed the Code Generator above, we gave it an “applicationTypeName” which (in a web context) is the type name of the class defined in Global.asax, if there is one.  Since we are trying to generate a template that isn’t related to the web, we can get the CodeDOM tree from the Code Generator and remove these things.

codeGenerator.GeneratedCode.Namespaces[0].Types[0].Members.RemoveAt(0);
codeGenerator.GeneratedCode.Namespaces[0].Types[0].BaseTypes.Clear();
codeGenerator.GeneratedCode.Namespaces[0].Imports.Clear();

Finally, we use the CodeDOM to write the code to a C# or VB class file (provider is a CodeDomProvider from System.CodeDom.Compiler):

using (StreamWriter writer = new StreamWriter(outputFile)) {
    provider.GenerateCodeFromCompileUnit(codeGenerator.GeneratedCode, writer, new System.CodeDom.Compiler.CodeGeneratorOptions());
}
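The snippet above assumes a “provider” already exists.  As a rough sketch of where it could come from, the following picks a CodeDOM provider based on the Razor file extension and writes a tiny hand-built compile unit to a string.  The class name, method names and the empty “Sample” type are invented for illustration; the CodeDOM types themselves are the standard ones from System.CodeDom, Microsoft.CSharp and Microsoft.VisualBasic:

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;
using Microsoft.VisualBasic;

// Hypothetical helper; rzrc itself may choose its provider differently.
public static class CodeDomSketch
{
    // Maps a Razor file extension to the matching CodeDOM provider.
    public static CodeDomProvider CreateProvider(string extension)
    {
        switch (extension.ToLowerInvariant())
        {
            case ".cshtml": return new CSharpCodeProvider();
            case ".vbhtml": return new VBCodeProvider();
            default: throw new ArgumentException("Not a Razor extension: " + extension);
        }
    }

    // Builds a minimal compile unit (one namespace, one empty class) and renders it as source.
    public static string GenerateSample(CodeDomProvider provider)
    {
        var unit = new CodeCompileUnit();
        var ns = new CodeNamespace("Template");
        ns.Types.Add(new CodeTypeDeclaration("Sample"));
        unit.Namespaces.Add(ns);

        using (var writer = new StringWriter())
        {
            provider.GenerateCodeFromCompileUnit(unit, writer, new CodeGeneratorOptions());
            return writer.ToString();
        }
    }
}
```

The same compile unit can be rendered as C# or VB just by swapping the provider, which is why the code generator builds a CodeDOM tree rather than writing source text directly.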

And we’re done!  This is definitely more complicated than we’d like, but there are plans to simplify this API significantly in future releases.  For the most part, all we’ve done is left the methods our ASP.Net Build Provider uses open and accessible.  I wouldn’t bet on these APIs staying around too long, but any API changes from here on should be simplifications.  For now though, check out the sample I’ve attached and play around!  Note that you must have WebMatrix installed to use the sample. 

I’ve put some comments in which start with “EXT” which contain tips on how to extend this code to your own use.  Please feel free to take this code and use it absolutely anywhere you want!  Let me know how you’re using Razor by either tweeting me at @anurse or emailing me at andrew AT andrewnurse DOT net.

Download the console app here: rzrc.zip (3.46 KB)

Inside Razor - Part 2 - Expressions

This is part 2 of my Inside Razor series.  Read Part 1 here.

In my previous post, I glossed over one line in my sample:

<li>@p.Name ($@p.Price)</li>

Well, it's finally time to get into how this is parsed!  So, to recap from last time, when we see the “<li>” here, we know that we are parsing a block of markup which ends at the “</li>”.  The markup parser scans forward until it finds the end tag, but before it reaches it, it sees an “@”.  So, just as with “@foreach”, it switches to the code parser.

This is where things get a bit different.  The C# code parser looks at that first identifier: “p” and checks its internal list of C# keywords.  Of course, “p” is not a C# keyword, so the C# code parser enters “Implicit Expression” mode.  The algorithm for parsing implicit expressions is something like the following:

  1. First, Read an identifier
  2. Is the next character a “(“ or “[“?
    • Yes - Read to the matching “)” or “]”, then Go To 2
    • No – Continue to 3
  3. Is the next character a “.”?
    • Yes – Continue to 4
    • No – End of Expression
  4. Is the character AFTER the “.” a valid start character for a C# identifier?
    • Yes – Read the “.” and Go To 1
    • No – DO NOT Read the “.”, and End the Expression
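The steps above can be sketched as a small scanner.  This is an illustration under simplifying assumptions, not the real Razor code parser: the names are hypothetical, identifiers are limited to letters, digits and underscores, brackets are matched by simple depth counting, and string or character literals containing brackets are not handled:

```csharp
using System;

// Hypothetical sketch of the implicit expression steps; not the actual Razor implementation.
public static class ImplicitExpressionSketch
{
    // 'text' is everything after the "@". Returns the prefix that counts as the expression.
    public static string Read(string text)
    {
        int pos = 0;
        while (true)
        {
            // Step 1: read an identifier (give up if there isn't one).
            if (pos >= text.Length || !IsIdentifierStart(text[pos])) return text.Substring(0, pos);
            pos++;
            while (pos < text.Length && IsIdentifierPart(text[pos])) pos++;

            // Step 2: read any number of "(...)" or "[...]" groups.
            while (pos < text.Length && (text[pos] == '(' || text[pos] == '['))
            {
                int end = SkipGroup(text, pos);
                if (end < 0) return text.Substring(0, pos); // unbalanced: stop before the group
                pos = end;
            }

            // Steps 3 and 4: only read a "." when an identifier can follow it.
            if (pos + 1 < text.Length && text[pos] == '.' && IsIdentifierStart(text[pos + 1]))
            {
                pos++;      // read the "." and go back to step 1
                continue;
            }
            return text.Substring(0, pos); // end of expression; a trailing "." is left unread
        }
    }

    // Returns the index just past the bracket that closes the group starting at 'start', or -1.
    private static int SkipGroup(string text, int start)
    {
        int depth = 0;
        for (int i = start; i < text.Length; i++)
        {
            char c = text[i];
            if (c == '(' || c == '[') depth++;
            else if (c == ')' || c == ']')
            {
                depth--;
                if (depth == 0) return i + 1;
            }
        }
        return -1;
    }

    private static bool IsIdentifierStart(char c) { return char.IsLetter(c) || c == '_'; }
    private static bool IsIdentifierPart(char c) { return char.IsLetterOrDigit(c) || c == '_'; }
}
```

Calling Read with the text that follows an “@” returns only the characters the steps would accept, and everything after that is left for the markup parser.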

The high-level overview of this algorithm is that an implicit expression is an identifier, followed by any number of method calls (“()”), indexing expressions (“[]”) and member access expressions (“.”).  And, whitespace is not allowed (except for within “()” or “[]”).  So for example, these are all valid implicit expressions in Razor:

@p.Name
@p.Name.ToString()
@p.Name.ToString()[6 - 2]
@p.Name.Replace("ASPX", "Razor")[i++]

However, the following are not valid, and the second section (after the arrow, “==>”) is the only part that would be considered part of the expression by Razor:

@1 + 1 ==> @
@p++ ==> @p
@p    .   Name ==> @p
@p.Name.Length - 1 ==> @p.Name.Length

This is why we have another syntax for expressions: “@(…)”.  This syntax allows anything you want within the “()”.  So, you can write all of the previous examples using that syntax as an escape-hatch:

@(1 + 1) 
@(p++) 
@(p    .   Name) 
@(p.Name.Length - 1)

Once we’ve identified the expression, we pass it along to our code generator.  When generating the code for “@foreach () { … }”, we just dump that code into the generated C# class as-is, but when we identify an expression (either implicit or explicit) we do something a little different.  You probably noticed that unlike ASPX, there is only one control construct, “@”; there is no “@=” to distinguish code that we run from expressions whose value we render.  This is where some of the magic of Razor comes in.  If we see “@foreach” for example, we know that “foreach” is a C# keyword, so that block is written as a statement to be executed.  When we see “@p.Name” or “@(1 + 1)”, we know that they are expressions, so after executing them, we render the result.  So basically:

  • @if, @switch, @try, @foreach, @for, etc. are equivalent to “<% %>”
  • @p.Name, @(p++), @(1 + 1), etc. are equivalent to “<%: %>”

Another side note is that expressions are equivalent to “<%:” and NOT “<%=”.  We made a decision in Razor that HTML encoding should be the default, and that if you want to write unencoded strings, you can use the IHtmlString interface that has been blogged about before.
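As a rough illustration of “encode by default, unless the value opts out”, here is a small sketch.  IRawHtml and RawHtml are hypothetical stand-ins for IHtmlString and HtmlString so that the example compiles without System.Web, and EncodingWriter is not the method the generated page class actually calls:

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical marker interface, playing the role of IHtmlString.
public interface IRawHtml
{
    string ToHtml();
}

// Hypothetical wrapper, playing the role of HtmlString.
public sealed class RawHtml : IRawHtml
{
    private readonly string _html;
    public RawHtml(string html) { _html = html; }
    public string ToHtml() { return _html; }
}

public static class EncodingWriter
{
    // Writes the value of an expression: encoded unless it is marked as raw HTML.
    public static void Write(TextWriter output, object value)
    {
        if (value == null) return;

        var raw = value as IRawHtml;
        if (raw != null)
        {
            output.Write(raw.ToHtml());                          // already HTML: write as-is
        }
        else
        {
            output.Write(WebUtility.HtmlEncode(value.ToString())); // everything else is encoded
        }
    }
}
```

Under this scheme a plain string containing “<b>” comes out with the angle brackets escaped, while the same text wrapped in the raw type is written unchanged.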

So, with all that background, we can quickly jump back to our initial sample:

<li>@p.Name ($@p.Price)</li>

When we see “@p.Name” we identify that as an expression, but the space before the “(“ stops us from interpreting it as a method call.  Then “ ($” are all markup and when we see the “@”, we interpret “@p.Price” as an expression and stop at the “)”.

So there’s a quick overview of how Razor identifies and parses expressions.  In my next post I’m going to discuss hosting the Razor parser outside of ASP.Net.  As before, please feel free to leave comments if you have questions, or send me a tweet (@anurse) or an email (andrew AT andrewnurse DOT net).

Quick Update - Microsoft WebMatrix Beta released

Scott Guthrie just announced the first beta release of Microsoft WebMatrix.  I'll leave you to check out his blog post to find out more.

This is also your first chance to try out Razor.  We haven't released the MVC View Engine for Razor, but WebMatrix includes ASP.Net Web Pages, a simple page model that uses Razor syntax.  After installing WebMatrix, just create a new site, drop a CSHTML file in it, put some code in and go!

I'll post more details later, but for now, check out the post and play with the bits!

Inside Razor – Part 1 – Recursive Ping-Pong

This is the first of my blog posts about the parser for the new ASP.Net Razor syntax.  We’ve been working on this parser for a while now, and I want to share some of how it works with my readers!

The Razor parser is very different from the existing ASPX parser.  In fact, the ASPX parser is implemented almost entirely with Regular Expressions, because it is a very simple language to parse.  The Razor parser is actually separated into three components: 1) A Markup parser which has a basic understanding of HTML syntax, 2) A Code parser which has a basic understanding of either C# or VB and 3) A central orchestrator which understands how the two mix together.  Note that when I say “basic understanding” I mean basic: we’re not talking about full-fledged C# and HTML parsers here.  I’ve joked with people on the team that we should call them “Markup Understander” or “Code Comprehender” instead :).

So the Razor parser has three “actors”: The Core Parser, the Markup Parser and the Code Parser.  All three work together to parse a Razor document.  Now, let’s take a Razor file and do a full summary of the parsing procedure using these actors.  We’ll use the sample that I used last time:

<ul>
    @foreach(var p in Model.Products) {
    <li>@p.Name ($@p.Price)</li>
    }
</ul>

Ok, now we start at the top. The Razor parser is essentially in one of three states at any time during the parsing: Parsing a Markup Document, Parsing a Markup Block or Parsing a Code Block.  The first two are handled by the Markup Parser, and the last is handled by the Code Parser.  So, when the Core Parser is fired up for the first time, it calls into the Markup Parser and asks it to parse a Markup Document and return the result.  Now the parser is in the Markup Document state.  In this state, it simply scans forward to the next “@” character; it doesn’t care about tags or other HTML concepts, just “@”.  When it reaches an “@”, it makes a decision: “Is this a switch to code, or is it an email address?”  This decision is basically done by looking just before and just after the “@” to see if they are valid email characters.  This is the default convention, but there are escape sequences to force it to be treated as a switch to code.
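Here is a tiny sketch of that decision.  It rests on two assumptions of mine that the real parser may not share: “valid email characters” is taken to mean letters and digits only, and an “@” counts as part of an email address only when both neighbours qualify.  The names are hypothetical:

```csharp
using System;

// Hypothetical sketch of the "code or email address?" check.
public static class TransitionSketch
{
    // True if the "@" at atIndex should be treated as a switch to code.
    public static bool IsCodeTransition(string text, int atIndex)
    {
        bool emailCharBefore = atIndex > 0 && IsEmailChar(text[atIndex - 1]);
        bool emailCharAfter = atIndex + 1 < text.Length && IsEmailChar(text[atIndex + 1]);

        // Only an "@" sandwiched between email characters is left alone as markup.
        return !(emailCharBefore && emailCharAfter);
    }

    // Assumption: letters and digits stand in for "valid email characters".
    private static bool IsEmailChar(char c)
    {
        return char.IsLetterOrDigit(c);
    }
}
```

An “@” preceded by whitespace or punctuation is treated as code under this check, while one surrounded by letters, as in an address, is left as markup.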

In this case, when we see our first “@”, it is preceded by whitespace, which is not valid in an email address.  So, we now know we are switching to code.  The Markup Parser calls into the Code Parser and asks it to parse a Code Block.  A Block, in terms of the Razor Parser, is basically a single chunk of Code or Markup with a clear start and end sequence.  So, the ‘foreach’ statement here is an example of a Code Block.  It starts at the “f” character and ends at the “}” character.  The Code Parser knows enough about C# to know this, so it starts parsing the code.  The Code Parser does some very simple tracking of C# statements, so when it gets to the “<li>” it knows it’s at the start of a C# statement.  “<li>” is not something you can put at the start of a C# statement, so the Code Parser knows that this is the start of a nested Markup Block.  So, it calls back into the Markup Parser, to have it parse a block of HTML.  This creates a sort of recursive ping-pong game between the Code and Markup parsers.  We start in Markup, then call into Code, then call into Markup and so on before finally returning back up this whole chain.  At the moment, the call stack in the parser looks something like this:

  • HtmlMarkupParser.ParseDocument()
    • CSharpCodeParser.ParseBlock()
      • HtmlMarkupParser.ParseBlock()

(Obviously, I am leaving out a lot of helper methods :)).

This highlights a fundamental difference between ASPX and Razor.  In an ASPX file, you can think of Code and Markup as two parallel streams.  You write some Markup, then you jump over and write some code, then you jump back and write some Markup, and so on.  A Razor file is like a tree.  You write some Markup, and then put some Code inside that Markup, then put some Markup inside that Code, and so on.

So, we’ve just called into the Markup Parser to parse a block of Markup, this block starts at “<li>” and ends at the matching “</li>”.  Until that matching “</li>”, we won’t consider the Markup Block finished.  So even if you had a “}” somewhere inside the “<li>” it wouldn’t terminate the “foreach”, because we haven’t come far enough up the stack yet.

While parsing the “<li>”, the Markup Parser sees more “@” characters, which means even more calls into the Code Parser. And so the call stack grows:

  • HtmlMarkupParser.ParseDocument()
    • CSharpCodeParser.ParseBlock()
      • HtmlMarkupParser.ParseBlock()
        • CSharpCodeParser.ParseBlock()

I’ll go into detail on how these blocks are terminated later, because it is a little complicated, but eventually we finish these code blocks and we’re back in the “<li>” block.  Then, we see “</li>” so we finish that block and pop back up to the “foreach” block.  The “}” terminates that block, so we go back up to the top of our stack again: the Markup Document.  Then we read until the end of the file, not finding any more “@” characters.  And we’re done!  We’ve parsed the entire file!
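To show the shape of that recursion, here is a toy model that records each block as it is entered.  It is heavily simplified and every name in it is hypothetical: it only handles one “keyword (...) { ... }” statement form, treats any “<” inside the braces as the start of a tag, does not handle nested tags with the same name, and ends an expression at the first character that is not a letter, digit, “.” or “_”:

```csharp
using System;
using System.Collections.Generic;

// Toy model of the hand-off between the Markup and Code parsers; not the real Razor parser.
public class PingPongSketch
{
    private readonly string _text;
    private int _pos;

    // One entry per block entered, indented by nesting depth.
    public readonly List<string> Trace = new List<string>();

    public PingPongSketch(string text) { _text = text; }

    // Markup Document: only "@" matters here, tags are ignored.
    public void ParseDocument()
    {
        Enter(0, "Markup Document");
        while (_pos < _text.Length)
        {
            if (_text[_pos++] == '@') ParseStatementBlock(1);
        }
    }

    // Code Block for a statement such as foreach: runs from the keyword to the closing "}".
    private void ParseStatementBlock(int depth)
    {
        Enter(depth, "Code Block");
        while (_pos < _text.Length && _text[_pos] != '{') _pos++;
        _pos++; // consume "{"
        while (_pos < _text.Length)
        {
            if (_text[_pos] == '}') { _pos++; return; }
            if (_text[_pos] == '<') ParseMarkupBlock(depth + 1); // hand back to markup
            else _pos++;
        }
    }

    // Markup Block: runs from "<tag>" to the matching "</tag>".
    private void ParseMarkupBlock(int depth)
    {
        Enter(depth, "Markup Block");
        int nameEnd = _text.IndexOf('>', _pos);
        if (nameEnd < 0) { _pos = _text.Length; return; }
        string closeTag = "</" + _text.Substring(_pos + 1, nameEnd - _pos - 1) + ">";
        _pos = nameEnd + 1;
        while (_pos < _text.Length)
        {
            if (string.CompareOrdinal(_text, _pos, closeTag, 0, closeTag.Length) == 0)
            {
                _pos += closeTag.Length;
                return;
            }
            if (_text[_pos++] == '@') ParseExpression(depth + 1); // hand over to code again
        }
    }

    // Code Block for an expression such as p.Name.
    private void ParseExpression(int depth)
    {
        Enter(depth, "Code Block (expression)");
        while (_pos < _text.Length &&
               (char.IsLetterOrDigit(_text[_pos]) || _text[_pos] == '.' || _text[_pos] == '_'))
        {
            _pos++;
        }
    }

    private void Enter(int depth, string name)
    {
        Trace.Add(new string(' ', depth * 2) + name);
    }
}
```

Running ParseDocument over the sample from this post fills Trace with the nesting described above: the document, the foreach block inside it, the “<li>” block inside that, and one expression block for each “@” inside the “<li>”.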

I hope that’s made the general structure of the parsing algorithm somewhat more clear.  The key take-away here is to avoid thinking of Code and Markup as separate streams and think of them as constructs you nest inside each other.  Our next topic will be Implicit Expressions, which is the logic that allows us to detect what parts of “@p.Name ($@p.Price)” are code, and what are markup.  I’ll give you a hint, we took some inspiration from PowerShell here ;).

Please post any questions or comments in the comments section or email me at “andrew AT andrewnurse DOT net”!

Introducing Razor – A New View Engine for ASP.Net

UPDATE: Fixed broken examples (I hope :)).

Earlier this morning, Scott Guthrie blogged about a new View Engine we’re developing for ASP.Net.  As many of my readers know, I joined the ASP.Net team back in October 2009, and I’m really excited to finally be able to share what I have been working on for the past 8 months.  When I joined Microsoft, I was shown some early prototypes for this new syntax and over the course of the next 8 months, we developed it into the Beta we’re going to be releasing very soon.

Writing the parser for Razor has essentially been my job for the last 8 months, so I’d like to describe a little about some of the design ideas that went in to it as well as some of the interesting ways we implemented things.  This is the first in a few blog posts I’ll do about the Razor syntax as well as the Parser design.

Razor syntax is designed around one primary goal: Make code and markup flow together with as little interference from control characters as possible.  For example, let’s take the following ASPX:

<ul>
    <% foreach(var p in Model.Products) { %>
    <li><%= p.Name %> ($<%= p.Price %>)</li>
    <% } %>
</ul>

Now, let's boil it down to the parts that we actually care about, removing all of the extra ASPX control characters:

<ul>
    foreach(var p in Model.Products) {
    <li>p.Name ($p.Price)</li>
    }
</ul>

Obviously there isn't enough data here to unambiguously determine what's code and what's markup. When we were designing Razor, we started from here and added as little as we could to make it absolutely clear what is code and what is markup. We wanted Razor pages to be Code+Markup, with as little extra stuff as possible. We even used that goal as inspiration for the file extension for C# and VB Razor pages: cshtml and vbhtml.

So, using the C# Razor syntax, the above example becomes:

<ul>
    @foreach(var p in Model.Products) {
    <li>@p.Name ($@p.Price)</li>
    }
</ul>

If you ask me, that's pretty darn close to the previous example. Razor takes advantage of a deep knowledge of C# (or VB) and HTML syntaxes to infer a lot about what you intended to write. Let’s take this sample and break it down chunk by chunk to see how Razor parses this document.

<ul>

When Razor starts parsing a document, anything goes until we see an "@". So this line just gets classified as Markup and we move on to the next:

@foreach(var p in Model.Products) {

Here's where things get interesting. Now, Razor has found an "@". The "@" character is the magic character in Razor. One character, not 5 “<%=%>”, and we let the parser figure out the rest. If you skip ahead a bit, you’ll notice that there's nothing that indicates the end of the block of code (like the "%>" sequence in ASPX). Rather than having its own syntax for delimiting code blocks, Razor tries to add as little as possible and just uses the syntax of the underlying language to determine when the code block is finished. In this case, Razor knows that a C# foreach statement is contained within "{" and "}" characters, so when you reach the end of the foreach block, it will go back to markup:

<li>@p.Name ($@p.Price)</li>

Now, things get even more interesting. Didn't I just say we were in Code until the ending of the foreach? This looks a lot like Markup, and we’re still inside the foreach! This is another case where Razor is using the syntax of the underlying language to infer your intent. We know that after the "{" C# is expecting some kind of statement. But, instead of a statement, we see an HTML tag "<li>", so Razor infers that you intended to switch back to Markup. So we've essentially got a stack of 3 contexts: When we started out we were in Markup, then we saw @foreach so we went to Code, now we've seen <li> so we're back in Markup. At the closing </li> tag, we know you've finished the inner Markup block, so we go back to the body of the foreach.

}

Then we see the end of the foreach block, so we go back to the top-level Markup context.

</ul>

And we continue parsing markup until the next "@", or the end of the file. You may have noticed I skipped over a bit in the middle of the <li> tag. I'll save the details of that for my next post, but the essential logic is the same: "@" starts code, and we use C# syntax to tell us when that code block is finished.

It's great to finally be able to share the result of our hard work with everyone. We've worked really hard to try to create a really clean syntax for mixing code and markup and I know I’d love to hear your feedback. Post in the comments on my blog, send me a tweet at "@anurse" or send me email at "andrew AT andrewnurse DOT net".

And I know you're all probably eagerly awaiting a chance to try this out. Don’t worry, we'll have a public beta soon that you can try out!